Commit Graph

55 Commits

  • Python: Improve PR template and breaking-change label automation (#6473)
    * Improve PR template and breaking-change label automation
    
    - Add a structured "Related Issue" section using GitHub closing keywords
    - Add a Review Guide prompt (major changes, impact, reviewer focus) with a
      note that the focus item is for human reviewers only
    - Add checklist items for issue linkage / no duplicate PRs and invert the
      breaking-change item (checked = not breaking)
    - Extend label-title-prefix to prepend [BREAKING] when the "breaking change"
      label is added
    - Add label-breaking-change workflow to apply the "breaking change" label
      when a PR title contains [BREAKING]
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Add pull-requests agent skill with dotnet/python links
    
    - Add root .github/skills/pull-requests/SKILL.md covering PR description
      authoring (following the PR template) and the review-comment workflow
      (review -> plan -> user review -> implement -> reply to all -> resolve)
    - Symlink the skill from python/.github/skills and dotnet/.github/skills
    - Reference the skill from python/AGENTS.md and dotnet/AGENTS.md
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fold breaking-change labeling into label-pr workflow
    
    Move the title -> 'breaking change' label logic into the existing label-pr
    workflow (which already applies the python/.NET labels) and drop the separate
    label-breaking-change workflow.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address PR title prefix review feedback
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Pin patched MessagePack for .NET restore
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Revert MessagePack central pin
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Move title prefix tests out of tracked GitHub tests
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Exclude skill docs from CI path filters
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Match skill symlinks in CI path exclusions
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Exclude AGENTS docs from CI path filters
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Scope title-prefix normalization to a real prefix
    
    The normalization branch in addTitlePrefix matched ^Python (no colon), so
    titles like "Python samples improvements" or "Pythonic refactor" were treated
    as already-prefixed and only re-cased, never receiving the "Python: " prefix.
    Scope the match to ^<prefix>:\s* so only an actual existing prefix is
    normalized; otherwise the prefix is prepended. Same fix applies to the .NET
    prefix (e.g. ".NETStandard bump").
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: Add GitHub Copilot integration tests to CI workflows (#6346)
    Add a dedicated integration test job for the github_copilot package to both
    python-integration-tests.yml and python-merge-tests.yml.
    
    The job:
    - Runs 6 integration tests marked with @pytest.mark.integration
    - Uses COPILOT_GITHUB_TOKEN secret from the integration environment
    - Follows the same pattern as other provider integration jobs
    - Includes path filtering in merge-tests (github_copilot package + core changes)
    - Added to needs lists in report and check jobs
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • ci: pin third-party GitHub Actions to commit SHAs (#5972)
    Replaces every floating tag in our workflow and composite action files
    with an immutable 40-character commit SHA, keeping the original `# vX`
    comment so Dependabot can still propose version bumps. 186 occurrences
    across 25 workflows and 2 composite actions.
    
    Also widens the github-actions Dependabot entry to use the plural
    `directories` key with `/.github/actions/*` so composite actions under
    `.github/actions/<name>/action.yml` are kept up to date. Previously
    Dependabot only scanned `.github/workflows` and the repo-root
    `action.yml`, leaving our `python-setup` and `sample-validation-setup`
    composite actions unmaintained.
  • Python: Reduce flaky integration tests and improve CI signal quality (#5454)
    * Enable Ollama integration tests in CI and rename report to Integration Test Report
    
    - Install Ollama, cache models (qwen2.5:0.5b + nomic-embed-text), and start
      server in the Misc integration job for both workflow files
    - Set OLLAMA_MODEL and OLLAMA_EMBEDDING_MODEL env vars so the 5 Ollama tests
      are no longer skipped
    - Rename Flaky Test Report to Integration Test Report throughout (job names,
      artifact names, cache keys, file names, script titles/docstrings)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Bump Ollama model to qwen2.5:1.5b for better instruction following
    
    The 0.5b model was too small to reliably follow simple prompts like
    'Say Hello World', causing test assertion failures. The 1.5b model
    follows instructions more reliably while still being small enough
    for fast CI pulls (~1GB).
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Re-enable reliable streaming integration tests
    
    Remove the hard skip on test_03_reliable_streaming tests that was
    temporarily disabled for instability investigation. CI infrastructure
    (Azurite, DTS emulator, Redis, func CLI) is already in place.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Re-enable skipped Functions/DurableTask tests and bump timeout to 480s
    
    - Remove hard skips from 4 tests in test_11_workflow_parallel.py
    - Remove hard skip from test_conditional_branching in test_06_dt_multi_agent_orchestration_conditionals.py
    - Increase pytest --timeout from 360 to 480 for Functions+DurableTask CI job
    - Updated in both python-merge-tests.yml and python-integration-tests.yml
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Re-skip failing Functions/DurableTask tests with specific root causes
    
    - test_11_workflow_parallel (4 tests): xdist worker crashes during execution
    - test_conditional_branching: orchestration fails with RuntimeError, not a timeout
    - Keep 480s timeout bump for remaining Functions tests
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix auth routing in samples 06/11: api_key -> credential for Azure OpenAI
    
    Both samples passed a bearer token provider via api_key= which caused the
    client to route to api.openai.com instead of Azure OpenAI, resulting in
    401 Unauthorized. Changed to credential= which correctly triggers Azure
    routing and picks up AZURE_OPENAI_ENDPOINT from the environment.
    
    - samples/azure_functions/11_workflow_parallel/function_app.py: 1 fix
    - samples/durabletask/06_multi_agent_orchestration_conditionals/worker.py: 2 fixes
    - Re-enable 4 parallel workflow tests and 1 conditional branching test
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Re-skip parallel workflow tests: xdist worker distribution issue
    
    The 4 parallel workflow tests crash because xdist worksteal distributes
    them across separate workers, each spawning its own func process against
    shared emulators. Auth fix (api_key->credential) was valid and stays.
    test_conditional_branching now passes with the auth fix.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix E501 line-too-long in azurefunctions parallel test skip reasons
    
    Wrap skip reason strings to stay within 120 char line limit.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Add retry logic and port-conflict fix for Ollama CI setup
    
    - Kill any auto-started Ollama before launching serve (fixes port
      conflict: 'address already in use')
    - Retry ollama pull up to 3 times with 15s backoff (fixes 429 rate
      limit failures)
    - Applied to both python-merge-tests.yml and python-integration-tests.yml
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix flaky integration tests and re-enable skipped tests
    
    - Foundry agent: add allow_preview=True to custom client test
    - Foundry hosting: raise max_output_tokens 50->200, add temperature,
      relax assertion in test_temperature_and_max_tokens
    - Foundry embedding: update skip reason with root cause (endpoint mismatch)
    - OpenAI file search: fix vector store indexing race condition by polling
      file_counts before querying; fix get_streaming_response -> get_response(stream=True)
    - Azure OpenAI file search: remove skip (transient 500 resolved)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Remove temperature from foundry hosting test (unsupported by CI model)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Stabilize Ollama tool call integration tests with no-arg function
    
    Use a no-argument greet() function instead of hello_world(arg1) for
    integration tests. The 1.5B model in CI is unreliable at generating
    correct tool call arguments, causing 'Argument parsing failed' errors.
    A no-arg function eliminates this flakiness entirely.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Increase reliable streaming test timeouts from 30s to 60s
    
    The LLM call through Azure OpenAI + Redis streaming pipeline can exceed
    30s in CI due to cold starts or throttling. Raise to 60s to reduce
    flaky timeouts while still bounded by pytest's 120s per-test limit.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Re-enable workflow parallel tests with xdist_group marker
    
    The tests were skipped because xdist distributes module tests across
    workers, each spawning their own func process (port conflicts). Adding
    xdist_group forces all tests in this module onto a single worker so
    the module-scoped function_app_for_test fixture works correctly.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Revert "Re-enable workflow parallel tests with xdist_group marker"
    
    This reverts commit 455c28da62.
    
    * Rename flaky_report to integration_test_report and add try/finally cleanup
    
    - Rename scripts/flaky_report/ to scripts/integration_test_report/ to
      reflect expanded scope beyond flaky-test detection
    - Update workflow references in both CI files
    - Wrap file search integration tests in try/finally to ensure vector
      store cleanup runs even on test failure or timeout
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix Ollama pull failure propagation and Azure OpenAI vector store readiness
    
    - Ollama CI: fail the step immediately if model pull fails after 3
      retries instead of silently proceeding to tests
    - Azure OpenAI file search: add the same vector-store readiness polling
      that was applied to the non-Azure OpenAI tests, preventing eventual
      consistency race conditions
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * remove load_dotenv from test file
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: Update hosting agent samples + fixes (#5485)
    * Update foundry hosting samples
    
    * Add file data type support
    
    * Fix file content and add more tests
    
    * Fix README
    
    * Address comments
    
    * Fix int tests
    
    * remove temp
  • Python: Flaky test report (#5342)
    * Add flaky test trend reporting to CI workflows
    
    Parse JUnit XML (pytest.xml) from each integration test job and
    aggregate results into a markdown trend report showing per-test
    pass/fail/skip status across the last 5 runs.
    
    Changes:
    - Add python/scripts/flaky_report/ package (JUnit XML parser + trend
      report generator following the sample_validation pattern)
    - Add upload-artifact steps to all 6 integration test jobs in both
      python-merge-tests.yml and python-integration-tests.yml
    - Add python-flaky-test-report aggregation job with history caching
    - Add --junitxml=pytest.xml to integration-tests.yml jobs (already
      present in merge-tests.yml)
    - Fix Cosmos job --junitxml path (use absolute path since uv run
      --directory changes cwd)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix flaky report: handle missing test results gracefully
    
    - Guard against missing reports directory in load_current_run()
    - Only run report job when at least one integration test job completed
      (skip when all jobs are skipped, e.g. on pull_request events)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address PR review: fix provider names and if-expression precedence
    
    - Use explicit provider name mapping in _derive_provider() so OpenAI
      renders correctly instead of 'Openai'
    - Fix operator precedence in workflow if-expressions by wrapping
      success/failure checks in parentheses
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Add File column and xfail detection to flaky test report
    
    - Add File column showing module name (e.g., test_openai_chat_client)
      to disambiguate tests with the same function name across files
    - Detect pytest xfail tests in JUnit XML (type=pytest.xfail) and
      show them with a distinct warning emoji instead of skip emoji
    - Update legend to include xfail explanation
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Add Foundry embedding env vars to merge-tests workflow
    
    Sync the Foundry integration job in python-merge-tests.yml with
    python-integration-tests.yml by adding FOUNDRY_MODELS_ENDPOINT,
    FOUNDRY_MODELS_API_KEY, FOUNDRY_EMBEDDING_MODEL, and
    FOUNDRY_IMAGE_EMBEDDING_MODEL. Once the repo variables/secrets
    are configured, the embedding integration test will run in CI.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix File column showing class name instead of module name
    
    When a test is inside a class, pytest writes the classname as e.g.
    'pkg.test_file.TestClass'. The previous rsplit logic extracted
    'TestClass' instead of 'test_file'. Now detect uppercase-starting
    segments as class names and use the preceding segment instead.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address PR review: UTC timestamps, XML error handling, summary fix, docstring
    
    - Use datetime.now(timezone.utc) for accurate UTC timestamps
    - Catch ET.ParseError per-file so corrupt XML doesn't crash the report
    - Remove separate 'error' key from summary (errors folded into 'failed')
    - Fix _short_name docstring to show actual dotted classname::name format
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: Add Hyperlight CodeAct package and docs (#5185)
    * initial work on code_mode
    
    * updated samples
    
    * updates to codeact
    
    * udpated codeact
    
    * Draft CodeAct ADR and sample updates
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * initial implementation and adr and feature
    
    * Python: Limit Hyperlight wasm backend to Python <3.14
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Python: Fix CI for Hyperlight CodeAct PR
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Python: Run Hyperlight integration when available
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Python: Address Hyperlight review feedback
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Python: Simplify Hyperlight file mount inputs
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Python: Accept Path host paths in Hyperlight mounts
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Python: Fix Hyperlight mount typing for CI
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * temp run integration test
    
    * Python: Strengthen Hyperlight real sandbox tests
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * added additional tests
    
    * Python: Simplify Hyperlight CodeAct API
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * set tests as non-integration
    
    * Retry Hyperlight allowed-domain registration
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Gate Hyperlight integration tests by runtime support
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix Hyperlight skip test on Python 3.14
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Delay Hyperlight runtime probe until test execution
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Relax Hyperlight Windows integration stdout assertion
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Scan Hyperlight output directory for artifacts
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Retry Hyperlight output artifact collection
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Harden Hyperlight integration output assertions
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Retry Hyperlight read-back check in integration test
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Simplify Hyperlight integration write assertion
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Avoid pathlib in Hyperlight integration sandbox
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Use socket network check in Hyperlight sandbox
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Replace blocked Azure AI Search blog link
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Clarify Hyperlight guest stdlib limits
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Use _socket in Hyperlight integration sandbox
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Handle Hyperlight mounted file paths
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Broaden Hyperlight sandbox path fallbacks
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Search Hyperlight guest mounts recursively
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Split Hyperlight mount coverage
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Split Hyperlight live network tests
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix Hyperlight file-write test on Windows
    
    Enable the sandbox filesystem by providing a workspace_root so
    /output is mounted. Remove os.path.exists assertion (unsupported
    in WASM guest) and fix Content data assertion to use .uri.
    Skip the network integration test on Windows where the WASM
    sandbox lacks the encodings.idna codec.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address PR review: ADR intro, manual wiring sample, doc clarifications
    
    - Add CodeAct introduction section to ADR for unfamiliar readers
    - Clarify 'less runtime efficient' con with specific overhead description
    - Add note in Python impl doc clarifying ADR vs impl doc split
    - Explain why before_run hooks must be per-run (CRUD, concurrency, approval)
    - Rename code_interpreter variable to codeact in E2E sample
    - Add manual static wiring sample (codeact_manual_wiring.py)
    - Add 'when to use which pattern' guidance to samples README
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address PR #5185 review comments and add .NET CodeAct design doc
    
    - Fix async callback: _make_sandbox_callback returns sync wrapper with
      thread + asyncio.run() bridge (was broken with real Wasm FFI)
    - Fix stale output: clear output_dir before each sandbox.run() call
    - Fix blocking event loop: _run_code now async with asyncio.to_thread()
    - Revert _agents.py options['tools'] injection (unnecessary; provider
      uses context.extend_tools())
    - Revert SessionContext.options docstring back to read-only
    - Add real-sandbox test fixtures (shared/restored/fresh)
    - Add 8 new real-sandbox tests for callback round-trip, stale output,
      event loop non-blocking, basic execution, stdout/stderr, errors,
      snapshot/restore, and tool registration
    - Add comprehensive .NET HyperlightCodeActProvider design document
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Update hyperlight README with code snippets and remove Public API section
    
    Replace bare export list with Quick Start code examples covering the
    context provider, standalone tool, manual static wiring, and file
    mounts / network access patterns.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: bump misc-integration retry delay to 30s (#5293)
    The misc-integration job (Anthropic, Ollama, MCP) frequently fails on merge to main when the upstream MCP server (e.g. learn.microsoft.com/api/mcp) returns a transient rate-limit error. The previous 5s retry delay is too short to ride out the upstream backoff window, so all retries fail and the merge queue is blocked. Bumping to 30s gives the upstream a chance to recover before pytest-retry re-runs the test.
  • Python: Stop emitting duplicate reasoning content from OpenAI response.reasoning_text.done and response.reasoning_summary_text.done events (#5162)
    * Fix reasoning text done events duplicating streamed delta content (#5157)
    
    The OpenAI Responses API sends both reasoning_text.delta (incremental
    chunks) and reasoning_text.done (full accumulated text) events. The
    chat client was emitting Content for both, causing ag-ui to append the
    full done text onto already-accumulated delta text, producing
    duplicated reasoning output.
    
    Stop emitting Content for reasoning_text.done and
    reasoning_summary_text.done events, matching how output_text.done is
    already handled (not emitted). The deltas contain all the content;
    the done event is redundant.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix(openai): emit reasoning done content as fallback when no deltas observed (#5157)
    
    Address PR review feedback:
    - Track item_ids that received reasoning deltas via seen_reasoning_delta_item_ids set
    - Emit content from done events only when no deltas were received for the
      item_id, preventing silent content loss on stream resumption
    - Add comment documenting code_interpreter done event asymmetry
    - Replace redundant ag-ui test with deduplication-focused test
    - Add integration test for delta+done sequence in OpenAI chat client tests
    - Add fallback path tests for done events without preceding deltas
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address review feedback for #5157: Python: [Bug]: "type": "response.reasoning_text.delta" and "response.reasoning_text.done" both get exposed as "text_reasoning"
    
    * Fix AG-UI reasoning streaming to use proper Start/End pattern (#5157)
    
    _emit_text_reasoning now follows the same streaming pattern as _emit_text:
    - Emits ReasoningStartEvent/ReasoningMessageStartEvent only on the first
      delta for a given message_id
    - Emits only ReasoningMessageContentEvent for subsequent deltas
    - Defers ReasoningMessageEndEvent/ReasoningEndEvent until
      _close_reasoning_block is called (on content type switch or end-of-run)
    
    This produces the correct protocol pattern:
      ReasoningStartEvent
        ReasoningMessageStartEvent
        ReasoningMessageContentEvent(delta1)
        ReasoningMessageContentEvent(delta2)
        ReasoningMessageEndEvent
      ReasoningEndEvent
    
    Instead of wrapping every delta in a full Start→End sequence.
    
    Backward compatibility is preserved: calling _emit_text_reasoning without
    a flow argument still produces the full sequence per call.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix import ordering lint error in AG-UI test file (#5157)
    
    Move inline import of TextMessageContentEvent to the top-level import
    block and ensure alphabetical ordering to satisfy ruff I001 rule.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix mypy error: rename loop variable to avoid type conflict with WorkflowEvent
    
    The 'event' variable was already typed as WorkflowEvent[Any] from the
    async for loop at line 590. Reusing it in the _close_reasoning_block
    loop (which returns list[BaseEvent]) caused an incompatible assignment
    error. Renamed to 'reasoning_evt' to avoid the conflict.
    
    Fixes #5162
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address review feedback for #5157: review comment fixes
    
    * narrow test result reporting to explicit pytest JUnit XML
    
    * Fix test args
    
    * Fix pytest-results-action in merge workflow and remove committed test artifacts
    
    Apply the same JUnit XML fix from python-tests.yml to python-merge-tests.yml:
    add --junitxml=pytest.xml to all test commands and narrow the results action
    path from ./python/**.xml to ./python/pytest.xml. Also remove accidentally
    committed pytest.xml and python-coverage.xml and add them to .gitignore.
    
    ---------
    
    Co-authored-by: Copilot <copilot@github.com>
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: [BREAKING] Python: move Azure AI embeddings to Foundry (#5056)
    * renamed AzureAIINferenceEmbeddings and lazy load azure-cosmos and env var rename
    
    * updated coverage
    
    * fix readme
  • Python: [BREAKING] Standardize model selection on model (#4999)
    * Refactor Anthropic model option and provider clients
    
    Rename the Anthropic client model option from model_id to model, add provider-specific Anthropic wrappers for Foundry, Bedrock, and Vertex, and expose them through the Anthropic, Foundry, Amazon, and Google namespaces. Update core option handling, docs, samples, and tests accordingly.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix Anthropic skills sample typing
    
    Cast the Anthropic beta client to Any in the skills sample so the pre-commit sample pyright check no longer fails on beta skills and files endpoints that are not exposed by the current SDK stubs.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * undo sample mypy
    
    * Retry CI after transient external failures
    
    Retrigger PR validation after an unrelated Copilot review workflow SAML failure and a transient external tau2 git fetch failure in the Windows Python test setup.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address review feedback on model option merging
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address Anthropic compatibility review feedback
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * moved all to `model`
    
    * fixes for azure ai search
    
    * Python: standardize remaining sample env var names
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Python: fix foundry-local pyright compatibility
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * updated env vars in cicd
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: [BREAKING] Remove deprecated Python OpenAI/Azure AI surfaces (#4990)
    * [BREAKING] Remove deprecated Python OpenAI/Azure AI surfaces
    
    Also clean up follow-on docs, environment guidance, package metadata, and lab test stability.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix deleted semantic-kernel sample links
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address PR review feedback
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * improve foundry language
    
    * Fix A2A Foundry sample regression
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: [BREAKING] Remove deprecated kwargs compatibility paths (#4858)
    * [BREAKING] Remove deprecated kwargs compatibility paths
    
    Remove the deprecated kwargs compatibility shims across core agents, clients, tools, middleware, and telemetry.
    
    Keep workflow kwargs behavior intact in this branch and follow up separately in #4850.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix PR CI fallout for kwargs removal
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address PR review feedback
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * updates
    
    * Fix Azure AI CI fallout
    
    Remove the stale _get_current_conversation_id override from the Azure AI client after the OpenAI base helper was deleted.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fixed new classes
    
    * Fix Assistants deprecated import gating
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix integration replay regressions
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Switch multi-agent hosting samples to Azure chat completions
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Simplify Azure multi-agent sample config
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • [BREAKING] Python: fix OpenAI Azure routing and provider samples (#4925)
    * Python: fix OpenAI Azure routing and provider samples
    
    Prefer OpenAI when OPENAI_API_KEY is present unless Azure is explicitly requested. Clarify constructor docs, keep deprecated Azure wrappers compatible with stricter settings validation, and refresh the provider samples and tests to use the current client patterns.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix bandit
    
    * Python: align OpenAI embedding Azure routing
    
    Extend the shared OpenAI-vs-Azure routing and credential behavior to the embedding client, add Azure embedding regression coverage, and refresh the embedding samples to use the generic client path.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Python: fix embedding client pyright check
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Python: thin OpenAI embedding wrapper
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Python: document embedding overload routing
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Python: fix callable OpenAI key routing
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Python: fix Azure credential routing tests
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Python: address OpenAI review feedback
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Python: narrow Azure routing markers
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Python: refine OpenAI model fallback order
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Python: narrow Azure deployment docs
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Python: remove embedding routing wording
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Python: run embedding Azure integration tests
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * changed variable name
    
    * Python: expand OpenAI package README
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * clarified readme
    
    * Python: fix Azure OpenAI integration setup
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Python: correct Azure integration env mapping
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * updated code to fix int tests
    
    * test updates
    
    * test fix
    
    * fix test setup
    
    * updates to tests and setup
    
    * remove openai assistants int tests
    
    * improvements in int tests
    
    * fix env var
    
    * fix env vars
    
    * fix azure responses test
    
    * trigger actions
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: [BREAKING] Python: Provider-leading client design & OpenAI package extraction (#4818)
    * Python: Provider-leading client design & OpenAI package extraction
    
    Major refactoring of the Python Agent Framework client architecture:
    
    - Extract OpenAI clients into new `agent-framework-openai` package
    - Core package no longer depends on openai, azure-identity, azure-ai-projects
    - Rename clients for discoverability: OpenAIResponsesClient → OpenAIChatClient,
      OpenAIChatClient → OpenAIChatCompletionClient
    - Unify `model_id`/`deployment_name`/`model_deployment_name` → `model` param
    - New FoundryChatClient for Azure AI Foundry Responses API
    - New FoundryAgent/FoundryAgentClient for connecting to pre-configured Foundry agents
    - Remove OpenAIBase/OpenAIConfigMixin from non-deprecated client MRO
    - Deprecate AzureOpenAI* clients, AzureAIClient, OpenAIAssistantsClient
    - Reorganize samples: azure_openai+azure_ai+azure_ai_agent → azure/
    - ADR-0020: Provider-Leading Client Design
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: missing Agent imports in samples, .model_id → .model in foundry_local sample
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: CI failures — mypy errors, coverage targets, sample imports
    
    - azure-ai mypy: add type ignores for TypedDict total=, model arg, forward ref
    - Coverage: replace core.azure/openai targets with openai package target
    - project_provider: add type annotation for opts dict
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: populate openai .pyi stub, fix broken README links, coverage targets
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fixes
    
    * updated observabilitty
    
    * reset azure init.pyi
    
    * fix errors
    
    * updated adr number
    
    * fix foundry local
    
    * fixed not renamed docstrings and comments, and added deprecated markers to old classes
    
    * fix tests and pyprojects
    
    * fix test vars
    
    * updated function tests
    
    * update durable
    
    * updated test setup for functions
    
    * Fix Foundry auth in workflow samples
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Stabilize Python integration workflows
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Update hosting samples for Foundry
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Trigger full CI rerun
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Trigger CI rerun again
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * trigger rerun
    
    * trigger rerun
    
    * fix for litellm
    
    * undo durabletask changes
    
    * Move Foundry APIs into foundry namespace
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix Foundry pyproject formatting
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Split provider samples by Foundry surface
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Restore hosting sample requirements
    
    Also fix the Foundry Local sample link after the provider sample move.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * updated tests
    
    * udpated foundry integration tests
    
    * removed dist from azurefunctions tests
    
    * Use separate Foundry clients for concurrent agents
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix client setup in azfunc and durable
    
    * disabled two tests
    
    * updated setup for some function and durable tests
    
    * improved azure openai setup with new clients
    
    * ignore deprecated
    
    * fixes
    
    * skip 11
    
    * remove openai assistants int tests
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: Simplify Python Poe tasks and unify package selectors (#4722)
    * updated automation tasks and commands, with alias for the time being
    
    * Restore aggregate test exclusions
    
    Preserve the legacy all-tests scope for test --all by excluding lab and devui from the default aggregate sweep, while still allowing explicit package selection. Also ignore hidden/generated test directories such as .mypy_cache during aggregate discovery.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * updated versions in pre-commit
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • [BREAKING] Python: Update github-copilot-sdk integration to use ToolInvocation/ToolResult types (#4551)
    * Update github_copilot package for github-copilot-sdk>=0.1.32 (#4549)
    
    - Update requires-python from >=3.10 to >=3.11
    - Remove Python 3.10 classifier
    - Update mypy python_version to 3.11
    - Update dependency to github-copilot-sdk>=0.1.32
    - Fix ToolResult API: use snake_case kwargs (text_result_for_llm,
      result_type) instead of camelCase (textResultForLlm, resultType)
    - Update test assertions to use attribute access on ToolResult
    - Add ToolResult type assertions to tool handler tests
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix tests to use ToolInvocation dataclass instead of plain dict (#4549)
    
    Update test_github_copilot_agent.py to pass ToolInvocation objects to tool
    handlers instead of plain dicts, matching the github-copilot-sdk>=0.1.32 API
    where ToolInvocation is a dataclass with an .arguments attribute.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Add regression tests for ToolInvocation contract (#4549)
    
    Add tests to lock in the new ToolInvocation-based calling convention:
    - test_tool_handler_rejects_raw_dict_invocation: verifies passing a raw
      dict (old calling convention) raises TypeError/AttributeError
    - test_tool_handler_with_empty_arguments: verifies ToolInvocation with
      empty arguments works correctly for no-arg tools
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Revert requires-python to >=3.10 to avoid breaking CI (#4549)
    
    The repo CI runs with Python 3.10 (uv sync --all-packages) and all other
    packages require >=3.10. Raising this package to >=3.11 would break the
    shared install flow. The SDK dependency version constraint (>=0.1.32) will
    enforce any Python version requirement from the SDK itself.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix min Python version for github_copilot package to >=3.11
    
    github-copilot-sdk>=0.1.32 requires Python>=3.11, which conflicts
    with the package's declared >=3.10 minimum, breaking uv sync.
    
    * Bump py version for GH workflows to 3.11, exclude GHCP sdk from 3.10 items
    
    * Fix uv command
    
    * Fixes
    
    * Update samples
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: Add Azure Cosmos history provider package (#4271)
    * Created cosmos history provider
    
    * add marker
    
    * Python: address Cosmos PR feedback
    
    - address provider/test/sample review feedback and cleanup typing
    - add cosmos integration test coverage and skip gating
    - add dedicated cosmos emulator jobs to python merge/integration workflows
    - switch cosmos workflow execution to package poe integration-tests task
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Python: handle empty Cosmos session id
    
    - replace default partition fallback for empty session_id
    - log warning and generate GUID when session_id is empty
    - update unit tests to validate GUID fallback behavior
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix sample
    
    * fix cross partition query
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: updated integration tests and guidance (#4181)
    * updated integration tests and guidance
    
    * fixed merge test
    
    * updated integration tests
    
    * fix: remove duplicate --dist loadfile flag from pytest-xdist config
    
    Only one --dist mode can be active at a time; the second value silently
    overrides the first. Keep --dist worksteal (dynamic load balancing) and
    remove the redundant --dist loadfile from all workflow files and
    pyproject.toml configs.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * docs: add keep-in-sync notes for merge and integration test workflows
    
    Both python-merge-tests.yml and python-integration-tests.yml share the
    same parallel job structure. Added sync reminders in workflow file
    comments, the python-testing SKILL.md, and CODING_STANDARD.md.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * refactor: remove RUN_INTEGRATION_TESTS flag
    
    Integration test gating now uses two mechanisms:
    - `@pytest.mark.integration` for test selection via `-m` filtering
    - `skip_if_*_disabled` for credential/service availability checks
    
    The RUN_INTEGRATION_TESTS env var was redundant since the marker handles
    selection and the skip decorators already check for actual credentials.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: sync missing env vars from merge-tests to integration-tests
    
    Add OPENAI_EMBEDDINGS_MODEL_ID and AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME
    to python-integration-tests.yml to match python-merge-tests.yml.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: remove remaining RUN_INTEGRATION_TESTS from embedding tests and docs
    
    Missed test_openai_embedding_client.py and vector-stores README in the
    earlier cleanup.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * set functions tests to 3.10
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: feat(python): Add embedding abstractions and OpenAI implementation (Phase 1) (#4153)
    * feat(python): Add embedding abstractions and OpenAI implementation (Phase 1)
    
    This PR contains two parts:
    
    1. **Overall migration plan** for porting vector stores and embeddings from
       Semantic Kernel to Agent Framework (docs/features/vector-stores-and-embeddings/README.md)
       covering all 10 phases from core abstractions through connectors and TextSearch.
    
    2. **Phase 1 implementation** — core embedding abstractions and OpenAI/Azure OpenAI
       embedding clients:
    
       Core types (_types.py):
       - EmbeddingGenerationOptions TypedDict (total=False)
       - Embedding[EmbeddingT] generic class with model_id, dimensions, created_at
       - GeneratedEmbeddings[EmbeddingT, EmbeddingOptionsT] list container with options, usage
       - EmbeddingInputT (default str) and EmbeddingT (default list[float]) TypeVars
    
       Protocol + base class (_clients.py):
       - SupportsGetEmbeddings protocol — Generic[EmbeddingInputT, EmbeddingT, OptionsContraT]
       - BaseEmbeddingClient ABC — Generic[EmbeddingInputT, EmbeddingT, OptionsCoT]
    
       Telemetry (observability.py):
       - EmbeddingTelemetryLayer with gen_ai.operation.name = "embeddings"
    
       OpenAI implementation (openai/_embedding_client.py):
       - RawOpenAIEmbeddingClient, OpenAIEmbeddingClient, OpenAIEmbeddingOptions
       - Uses _ensure_client() factory pattern
    
       Azure OpenAI implementation (azure/_embedding_client.py):
       - AzureOpenAIEmbeddingClient following AzureOpenAIChatClient pattern
       - Supports API key, Entra ID credentials, env var configuration
    
       Tests:
       - 47 unit tests for types, protocol, base class, OpenAI, and Azure clients
       - 6 integration tests (gated behind RUN_INTEGRATION_TESTS + credentials)
    
       Samples:
       - samples/02-agents/embeddings/openai_embeddings.py
       - samples/02-agents/embeddings/azure_openai_embeddings.py
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: Add AzureOpenAIEmbeddingClient to azure __init__.pyi stub
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * ci: Add embedding env vars to Python integration tests
    
    Map OPENAI_EMBEDDING_MODEL_ID and AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME
    from GitHub vars to the integration test environment.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: Handle base64 encoding_format in OpenAI embedding client
    
    When encoding_format='base64' is used, the OpenAI API returns base64-encoded
    floats instead of a JSON array. Decode these automatically to list[float]
    so the return type stays consistent regardless of encoding format.
    
    Also adds a unit test for base64 decoding and fixes minor docstring/import issues.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: Only record INPUT_TOKENS for embedding telemetry
    
    Embeddings have no output/completion tokens. Remove OUTPUT_TOKENS recording
    which was double-counting prompt_tokens via the total_tokens fallback.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: Resolve mypy variance error and lint warning
    
    Use contravariant/covariant TypeVars for SupportsGetEmbeddings Protocol.
    Combine nested if into single statement in telemetry layer.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: Make EmbeddingCoT invariant for mypy compatibility
    
    GeneratedEmbeddings is invariant in its type param, so the Protocol
    TypeVar cannot be covariant.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: Address PR review - empty values guard, service_url for telemetry
    
    - Add early return for empty values in get_embeddings to avoid unnecessary API calls
    - Add service_url() method to RawOpenAIEmbeddingClient for proper telemetry endpoint reporting
    - Add test for empty values behavior
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Python: Fix OpenAI chat client compatibility with third-party endpoints and OTel 0.4.14 (#4161)
    
    * Fix system message content sent as list instead of string
    
    Some OpenAI-compatible endpoints (e.g. NVIDIA NIM) reject system messages
    when content is a list of content parts. This change flattens system and
    developer message content to a plain string in the Chat Completions client.
    
    Fixes https://github.com/microsoft/agent-framework/issues/1407
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix compatibility with opentelemetry-semantic-conventions-ai 0.4.14
    
    Version 0.4.14 removed several LLM_* attributes from SpanAttributes
    (LLM_SYSTEM, LLM_REQUEST_MODEL, LLM_RESPONSE_MODEL, LLM_REQUEST_MAX_TOKENS,
    LLM_REQUEST_TEMPERATURE, LLM_REQUEST_TOP_P, LLM_TOKEN_TYPE).
    
    Move these to the OtelAttr enum with their well-known gen_ai.* string values
    and update all references in observability.py and tests.
    
    Fixes https://github.com/microsoft/agent-framework/issues/4160
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Flatten text-only message content to string for all roles
    
    Extend the system/developer fix to all message roles. Text-only content
    lists are now post-processed into plain strings, while multimodal content
    (text + images/audio) remains as a list. This fixes compatibility with
    OpenAI-like endpoints that cannot deserialize list content (e.g. Foundry
    Local's Neutron backend).
    
    Partially fixes https://github.com/microsoft/agent-framework/issues/4084
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix streaming text lost when usage data in same chunk
    
    Some providers (e.g. Gemini) include both usage data and text content
    in the same streaming chunk. The early return on chunk.usage caused
    text and tool call parsing to be skipped entirely. Remove the early
    return and process usage alongside text/tool calls.
    
    Fixes https://github.com/microsoft/agent-framework/issues/3434
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix mypy errors in _chat_client.py
    
    Rename shadowed variable 'args' in system/developer branch to 'sys_args'
    and rename loop variable 'content' to 'msg_content' to avoid type conflict.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * reorder imports
    
    * fix: Use OtelAttr.REQUEST_MODEL instead of removed SpanAttributes.LLM_REQUEST_MODEL
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * docs: Add score_threshold to vector store plan
    
    Reference SK .NET PR #13501 for score threshold filtering semantics.
    Include score_threshold in SearchOptions from Phase 3.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * docs: Add reference to roji's SK .NET MEVD work for SQL connectors
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: Clear env vars in construction tests to avoid CI leakage
    
    Tests for missing API key / model ID now use monkeypatch.delenv to ensure
    env vars from the integration test environment don't prevent the expected
    ValueError from being raised.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Updated GitHub action for manual integration tests (#4147)
    * Updated merge test permissions
    
    * Removed repo check
    
    * Added fetch from main for comparison
    
    * Updated path detection logic
    
    * Small updates
    
    * Reverted file rename
    
    * Created dedicated workflows for integration tests
    
    * Small fix for Python
    
    * Small fixes
    
    * Small update
    
    * Small update
    
    * Added tests check for Python
  • Added new GitHub action for manual integration test run based on PR (#4135)
    * Added new GitHub action for manual integration test run based on PR
    
    * Addressed comments
    
    * Added branch name as input
    
    * Small improvements
  • Python: [BREAKING] Moved to a single get_response and run API (#3379)
    * WIP
    
    * big update to new ResponseStream model
    
    * fixed tests and typing
    
    * fixed tests and typing
    
    * fixed tools typevar import
    
    * fix
    
    * mypy fix
    
    * mypy fixes and some cleanup
    
    * fix missing quoted names
    
    * and client
    
    * fix  imports agui
    
    * fix anthropic override
    
    * fix agui
    
    * fix ag ui
    
    * fix import
    
    * fix anthropic types
    
    * fix mypy
    
    * refactoring
    
    * updated typing
    
    * fix 3.11
    
    * fixes
    
    * redid layering of chat clients and agents
    
    * redid layering of chat clients and agents
    
    * Fix lint, type, and test issues after rebase
    
    - Add @overload decorators to AgentProtocol.run() for type compatibility
    - Add missing docstring params (middleware, function_invocation_configuration)
    - Fix TODO format (TD002) by adding author tags
    - Fix broken observability tests from upstream:
      - Replace non-existent use_instrumentation with direct instantiation
      - Replace non-existent use_agent_instrumentation with AgentTelemetryLayer mixin
      - Fix get_streaming_response to use get_response(stream=True)
      - Add AgentInitializationError import
      - Update streaming exception tests to match actual behavior
    
    * Fix AgentExecutionException import error in test_agents.py
    
    - Replace non-existent AgentExecutionException with AgentRunException
    
    * Fix test import and asyncio deprecation issues
    
    - Add 'tests' to pythonpath in ag-ui pyproject.toml for utils_test_ag_ui import
    - Replace deprecated asyncio.get_event_loop().run_until_complete with asyncio.run
    
    * Fix azure-ai test failures
    
    - Update _prepare_options patching to use correct class path
    - Fix test_to_azure_ai_agent_tools_web_search_missing_connection to clear env vars
    
    * Convert ag-ui utils_test_ag_ui.py to conftest.py
    
    - Move test utilities to conftest.py for proper pytest discovery
    - Update all test imports to use conftest instead of utils_test_ag_ui
    - Remove old utils_test_ag_ui.py file
    - Revert pythonpath change in pyproject.toml
    
    * fix: use relative imports for ag-ui test utilities
    
    * fix agui
    
    * Rename Bare*Client to Raw*Client and BaseChatClient
    
    - Renamed BareChatClient to BaseChatClient (abstract base class)
    - Renamed BareOpenAIChatClient to RawOpenAIChatClient
    - Renamed BareOpenAIResponsesClient to RawOpenAIResponsesClient
    - Renamed BareAzureAIClient to RawAzureAIClient
    - Added warning docstrings to Raw* classes about layer ordering
    - Updated README in samples/getting_started/agents/custom with layer docs
    - Added test for span ordering with function calling
    
    * Fix layer ordering: FunctionInvocationLayer before ChatTelemetryLayer
    
    This ensures each inner LLM call gets its own telemetry span, resulting in
    the correct span sequence: chat -> execute_tool -> chat
    
    Updated all production clients and test mocks to use correct ordering:
    - ChatMiddlewareLayer (first)
    - FunctionInvocationLayer (second)
    - ChatTelemetryLayer (third)
    - BaseChatClient/Raw...Client (fourth)
    
    * Remove run_stream usage
    
    * Fix conversation_id propagation
    
    * Python: Add BaseAgent implementation for Claude Agent SDK (#3509)
    
    * Added ClaudeAgent implementation
    
    * Updated streaming logic
    
    * Small updates
    
    * Small update
    
    * Fixes
    
    * Small fix
    
    * Naming improvements
    
    * Updated imports
    
    * Addressed comments
    
    * Updated package versions
    
    * Update Claude agent connector layering
    
    * fix test and plugin
    
    * Store function middleware in invocation layer
    
    * Fix telemetry streaming and ag-ui tests
    
    * Remove legacy ag-ui tests folder
    
    * updates
    
    * Remove terminate flag from FunctionInvocationContext, use MiddlewareTermination instead
    
    - Remove terminate attribute from FunctionInvocationContext
    - Add result attribute to MiddlewareTermination to carry function results
    - FunctionMiddlewarePipeline.execute() now lets MiddlewareTermination propagate
    - _auto_invoke_function captures context.result in exception before re-raising
    - _try_execute_function_calls catches MiddlewareTermination and sets should_terminate
    - Fix handoff middleware to append to chat_client.function_middleware directly
    - Update tests to use raise MiddlewareTermination instead of context.terminate
    - Add middleware flow documentation in samples/concepts/tools/README.md
    - Fix ag-ui to use FunctionMiddlewarePipeline instead of removed create_function_middleware_pipeline
    
    * fix: remove references to removed terminate flag in purview tests, add type ignore
    
    * fix: move _test_utils.py from package to test folder
    
    * fix: call get_final_response() to trigger context provider notification in streaming test
    
    * fix: correct broken links in tools README
    
    * docs: clarify default middleware behavior in summary table
    
    * fix: ensure inner stream result hooks are called when using map()/from_awaitable()
    
    * Fix mypy type errors
    
    * Address PR review comments on observability.py
    
    - Remove TODO comment about unconsumed streams, add explanatory note instead
    - Remove redundant _close_span cleanup hook (already called in _finalize_stream)
    - Clarify behavior: cleanup hooks run after stream iteration, if stream is not
      consumed the span remains open until garbage collected
    
    * Remove gen_ai.client.operation.duration from span attributes
    
    Duration is a metrics-only attribute per OpenTelemetry semantic conventions.
    It should be recorded to the histogram but not set as a span attribute.
    
    * Remove duration from _get_response_attributes, pass directly to _capture_response
    
    Duration is a metrics-only attribute. It's now passed directly to _capture_response
    instead of being included in the attributes dict that gets set on the span.
    
    * Remove redundant _close_span cleanup hook in AgentTelemetryLayer
    
    _finalize_stream already calls _close_span() in its finally block,
    so adding it as a separate cleanup hook is redundant.
    
    * Use weakref.finalize to close span when stream is garbage collected
    
    If a user creates a streaming response but never consumes it, the cleanup
    hooks won't run. Now we register a weak reference finalizer that will close
    the span when the stream object is garbage collected, ensuring spans don't
    leak in this scenario.
    
    * Fix _get_finalizers_from_stream to use _result_hooks attribute
    
    Renamed function to _get_result_hooks_from_stream and fixed it to
    look for the _result_hooks attribute which is the correct name in
    ResponseStream class.
    
    * Add missing asyncio import in test_request_info_mixin.py
    
    * Fix leftover merge conflict marker in image_generation sample
    
    * Update integration tests
    
    * Fix integration tests: increase max_iterations from 1 to 2
    
    Tests with tool_choice options require at least 2 iterations:
    1. First iteration to get function call and execute the tool
    2. Second iteration to get the final text response
    
    With max_iterations=1, streaming tests would return early with only
    the function call/result but no final text content.
    
    * Fix duplicate function call error in conversation-based APIs
    
    When using conversation_id (for Responses/Assistants APIs), the server
    already has the function call message from the previous response. We
    should only send the new function result message, not all messages
    including the function call which would cause a duplicate ID error.
    
    Fix: When conversation_id is set, only send the last message (the tool
    result) instead of all response.messages.
    
    * Add regression test for conversation_id propagation between tool iterations
    
    Port test from PR #3664 with updates for new streaming API pattern.
    Tests that conversation_id is properly updated in options dict during
    function invocation loop iterations.
    
    * Fix tool_choice=required to return after tool execution
    
    When tool_choice is 'required', the user's intent is to force exactly one
    tool call. After the tool executes, return immediately with the function
    call and result - don't continue to call the model again.
    
    This fixes integration tests that were failing with empty text responses
    because with tool_choice=required, the model would keep returning function
    calls instead of text.
    
    Also adds regression tests for:
    - conversation_id propagation between tool iterations (from PR #3664)
    - tool_choice=required returns after tool execution
    
    * Document tool_choice behavior in tools README
    
    - Add table explaining tool_choice values (auto, none, required)
    - Explain why tool_choice=required returns immediately after tool execution
    - Add code example showing the difference between required and auto
    - Update flow diagram to show the early return path for tool_choice=required
    
    * Fix tool_choice=None behavior - don't default to 'auto'
    
    Remove the hardcoded default of 'auto' for tool_choice in ChatAgent init.
    When tool_choice is not specified (None), it will now not be sent to the
    API, allowing the API's default behavior to be used.
    
    Users who want tool_choice='auto' can still explicitly set it either in
    default_options or at runtime.
    
    Fixes #3585
    
    * Fix tool_choice=none should not remove tools
    
    In OpenAI Assistants client, tools were not being sent when
    tool_choice='none'. This was incorrect - tool_choice='none' means
    the model won't call tools, but tools should still be available
    in the request (they may be used later in the conversation).
    
    Fixes #3585
    
    * Add test for tool_choice=none preserving tools
    
    Adds a regression test to ensure that when tool_choice='none' is set but
    tools are provided, the tools are still sent to the API. This verifies
    the fix for #3585.
    
    * Fix tool_choice=none should not remove tools in all clients
    
    Apply the same fix to OpenAI Responses client and Azure AI client:
    - OpenAI Responses: Remove else block that popped tool_choice/parallel_tool_calls
    - Azure AI: Remove tool_choice != 'none' check when adding tools
    
    When tool_choice='none', the model won't call tools, but tools should
    still be sent to the API so they're available for future turns.
    
    Also update README to clarify tool_choice=required supports multiple tools.
    
    Fixes #3585
    
    * Keep tool_choice even when tools is None
    
    Move tool_choice processing outside of the 'if tools' block in OpenAI
    Responses client so tool_choice is sent to the API even when no tools
    are provided.
    
    * Update test to match new parallel_tool_calls behavior
    
    Changed test_prepare_options_removes_parallel_tool_calls_when_no_tools to
    test_prepare_options_preserves_parallel_tool_calls_when_no_tools to reflect
    that parallel_tool_calls is now preserved even when no tools are present,
    consistent with the tool_choice behavior.
    
    * Fix ChatMessage API and Role enum usage after rebase
    
    - Update ChatMessage instantiation to use keyword args (role=, text=, contents=)
    - Fix Role enum comparisons to use .value for string comparison
    - Add created_at to AgentResponse in error handling
    - Fix AgentResponse.from_updates -> from_agent_run_response_updates
    - Fix DurableAgentStateMessage.from_chat_message to convert Role enum to string
    - Add Role import where needed
    
    * Fix additional ChatMessage API and method name changes
    
    - Fix ChatMessage usage in workflow files (use text= instead of contents= for strings)
    - Fix AgentResponse.from_updates -> from_agent_run_response_updates in workflow files
    - Fix test files for ChatMessage and Role enum usage
    
    * Fix remaining ChatMessage API usage in test files
    
    * Fix more ChatMessage and Role API changes in source and test files
    
    - Fix ChatMessage in _magentic.py replan method
    - Fix Role enum comparison in test assertions
    - Fix remaining test files with old ChatMessage syntax
    
    * Fix ChatMessage and Role API changes across packages
    
    - Add Role import where missing
    - Fix ChatMessage signature: positional args to keyword args (role=, text=, contents=)
    - Fix Role enum comparisons: .role.value instead of .role string
    - Fix FinishReason enum usage in ag-ui event converters
    - Rename AgentResponse.from_updates to from_agent_run_response_updates in ag-ui
    
    Fixes API compatibility after Types API Review improvements merge
    
    * Fix ChatMessage and Role API changes in github_copilot tests
    
    * Fix ChatMessage and Role API changes in redis and github_copilot packages
    
    - Fix redis provider: Role enum comparison using .value
    - Fix redis tests: ChatMessage signature and Role comparisons
    - Fix github_copilot tests: ChatMessage signature and Role comparisons
    - Update docstring examples in redis chat message store
    
    * Fix ChatMessage and Role API changes in devui package
    
    - Fix executor: ChatMessage signature change
    - Fix conversations: Role enum to string conversion in two places
    - Fix tests: ChatMessage signatures and Role comparisons
    
    * Fix ChatMessage and Role API changes in a2a and lab packages
    
    - Fix a2a tests: Role comparisons and ChatMessage signatures
    - Fix lab tau2 source: Role enum comparison in flip_messages, log_messages, sliding_window
    - Fix lab tau2 tests: ChatMessage signatures and Role comparisons
    
    * Remove duplicate test files from ag-ui/tests (tests are in ag_ui_tests)
    
    * Fix ChatMessage and Role API changes across packages
    
    After rebasing on upstream/main which merged PR #3647 (Types API Review
    improvements), fix all packages to use the new API:
    
    - ChatMessage: Use keyword args (role=, text=, contents=) instead of
      positional args
    - Role: Compare using .value attribute since it's now an enum
    
    Packages fixed:
    - ag-ui: Fixed Role value extraction bugs in _message_adapters.py
    - anthropic: Fixed ChatMessage and Role comparisons in tests
    - azure-ai: Fixed Role comparison in _client.py
    - azure-ai-search: Fixed ChatMessage and Role in source/tests
    - bedrock: Fixed ChatMessage signatures in tests
    - chatkit: Fixed ChatMessage and Role in source/tests
    - copilotstudio: Fixed ChatMessage and Role in tests
    - declarative: Fixed ChatMessage in _executors_agents.py
    - mem0: Fixed ChatMessage and Role in source/tests
    - purview: Fixed ChatMessage in source/tests
    
    * Fix mypy errors for ChatMessage and Role API changes
    
    - durabletask: Use str() fallback in role value extraction
    - core: Fix ChatMessage in _orchestrator_helpers.py to use keyword args
    - core: Add type ignore for _conversation_state.py contents deserialization
    - ag-ui: Fix type ignore comments (call-overload instead of arg-type)
    - azure-ai-search: Fix get_role_value type hint to accept Any
    - lab: Move get_role_value to module level with Any type hint
    
    * Improve CI test timeout configuration
    
    - Increase job timeout from 10 to 15 minutes
    - Reduce per-test timeout to 60s (was 900s/300s)
    - Add --timeout_method thread for better timeout handling
    - Add --timeout-verbose to see which tests are slow
    - Reduce retries from 3 to 2 and delay from 10s to 5s
    
    This ensures individual test timeouts are shorter than the job
    timeout, providing better visibility when tests hang.
    
    With 60s timeout and 2 retries, worst case per test is ~180s.
    
    * Fix ChatMessage API usage in docstrings and source
    
    - Fix ChatMessage positional args in docstrings: _serialization.py, _threads.py, _middleware.py
    - Fix ChatMessage in tau2 runner.py
    - Fix role comparison in _orchestrator_helpers.py to use .value
    - Fix role comparison in _group_chat.py docstring example
    - Fix role assertions in test_durable_entities.py to use .value
    
    * Revert tool_choice/parallel_tool_calls changes - must be removed when no tools
    
    OpenAI API requires tool_choice and parallel_tool_calls to only be
    present when tools are specified. Restored the logic that removes
    these options when there are no tools.
    
    - Restored check in _chat_client.py to remove tool_choice and
      parallel_tool_calls when no tools present
    - Restored same logic in _responses_client.py
    - Reverted test to expect the correct behavior
    
    * fixed issue in tests
    
    * fix: resolve merge conflict markers in ag-ui tests
    
    * fix: restructure ag-ui tests and fix Role/FinishReason to use string types
    
    * fix: streaming function invocation and middleware termination
    
    - Refactor streaming function invocation to use get_final_response() on inner streams
    - Fix MiddlewareTermination to accept result parameter for passing results
    - Fix _AutoHandoffMiddleware to use MiddlewareTermination instead of context.terminate
    - Fix AgentMiddlewareLayer.run() to properly forward function/chat middleware
    - Remove duplicate middleware registration in AgentMiddlewareLayer.__init__
    - Fix exception handling in _auto_invoke_function to properly capture termination
    - Fix mypy errors in core package
    - Update tests to use stream=True parameter for unified run API
    
    * fix all tests command
    
    * Refactor integration tests to use pytest fixtures
    
    - Merge testutils.py into conftest.py for azurefunctions integration tests
    - Merge dt_testutils.py into conftest.py for durabletask integration tests
    - Convert all integration tests to use fixtures instead of direct imports
      (fixes ModuleNotFoundError with --import-mode=importlib)
    - Add sample_helper fixture for azurefunctions tests
    - Add agent_client_factory and orchestration_helper fixtures for durabletask
    - Integration tests now skip with descriptive messages when services unavailable
    - Restructure devui tests into tests/devui/ with proper conftest.py
    - Add test organization guidelines to CODING_STANDARD.md
    - Remove __init__.py from test directories per pytest best practices
    
    * Fix pytest_collection_modifyitems to only skip integration tests
    
    The hook was skipping all tests in the test session, not just
    integration tests. Now it only skips items in the integration_tests
    directory.
    
    * Fix mem0 tests failing on Python 3.13
    
    Use patch.object on the imported module instead of @patch with string
    path to ensure the mock takes effect regardless of import timing.
    
    * fix mem0
    
    * another attempt for mem0
    
    * fix for mem0
    
    * fix mem0
    
    * Increase worker initialization wait time in durabletask tests
    
    Increase from 2 to 8 seconds to allow time for:
    - Python startup and module imports
    - Azure OpenAI client creation
    - Agent registration with DTS worker
    - Worker connection to DTS
    
    This helps prevent test failures in CI where the first tests may run
    before the worker is fully ready to process requests.
    
    * Fix streaming test to use ResponseStream with finalizer
    
    The _consume_stream method now expects a ResponseStream that can provide
    a final AgentResponse via get_final_response(). Update the test to use
    ResponseStream with AgentResponse.from_updates as the finalizer.
    
    * Fix MockToolCallingAgent to use new ResponseStream API and update samples
    
    * small updates to run_stream to run
    
    * fix sub workflow
    
    * temp fix for az func test
    
    ---------
    
    Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>
  • Python: [BREAKING] changed AIFunction to FunctionTool and @ai_function to @tool (#3413)
    * changed AIFunction to FunctionTool and @ai_function to @tool
    
    * test and mypy fixes
    
    * mypy fix
    
    * switch function tool to always_require
    
    * fix noop
    
    * fix github copilot imports
    
    * test fixes
    
    * fix ollama test
    
    * fixes for tests
    
    * fix tests
    
    * reverted change to always_require and extended timeout
    
    * fix test
  • Python: [BREAKING]: Introducing Options as TypedDict and Generic (#3140)
    * WIP typeddict for options
    
    * updated all clients and ChatAgents
    
    * updated everything
    
    * added ADR
    
    * fix mypy
    
    * proper typevar imports
    
    * fixed import
    
    * fixed other imports
    
    * slight update in the sample
    
    * updated from feedback
    
    * fixes
    
    * fixed missing covariants and test fixes
    
    * fixed typing
    
    * updated anthropic thinking config
    
    * ruff fixes
    
    * fixed int tests
    
    * fix tests and mypy
    
    * updated integration tests
    
    * updated docstring and test fix
    
    * improved options handling in obser
    
    * mypy fix
    
    * updated a host of integration tests
    
    * fix tests
    
    * bedrock fix
  • Python: latency improvements (#3014)
    * latency improvements
    
    * fixed mypy, added coding standards and instructions
    
    * slight logic improvement
  • Bump actions/checkout from 5 to 6 (#2404)
    Bumps [actions/checkout](https://github.com/actions/checkout) from 5 to 6.
    - [Release notes](https://github.com/actions/checkout/releases)
    - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
    - [Commits](https://github.com/actions/checkout/compare/v5...v6)
    
    ---
    updated-dependencies:
    - dependency-name: actions/checkout
      dependency-version: '6'
      dependency-type: direct:production
      update-type: version-update:semver-major
    ...
    
    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    Co-authored-by: Chris <66376200+crickman@users.noreply.github.com>
  • .NET: Python: Azure Functions feature branch (#1916)
    * Python: Add Scaffolding for Durable AzureFunctions package to Agent Framework (#1823)
    
    * Add scafolding
    
    * update readme
    
    * add code owners and label
    
    * update owners
    
    * .NET: Durable extension: initial src and unit tests (#1900)
    
    * Python: Add Durable Agent Wrapper code (#1913)
    
    * add initial changes
    
    * Move code and add single sample
    
    * Update logger
    
    * Remove unused code
    
    * address PR comments
    
    * cleanup code and address comments
    
    ---------
    
    Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>
    
    * Azure Functions .NET samples (#1939)
    
    * Python: Add Unit tests for Azurefunctions package (#1976)
    
    * Add Unit tests for Azurefunctions
    
    * remove duplicate import
    
    * .NET: [Feature Branch] Migrate state schema updates and support for agents as MCP tools (#1979)
    
    * Python: Add more samples for Azure Functions (#1980)
    
    * Move all samples
    
    * fix comments
    
    * remove dead lines
    
    * Make samples simpler
    
    * .NET: [Feature Branch] Durable Task extension integration tests (#2017)
    
    * .NET: [Feature Branch] Update OpenAI config for integration tests (#2063)
    
    * Python: Add Integration tests for AzureFunctions  (#2020)
    
    * Add Integration tests
    
    * Remove DTS extension
    
    * Apply suggestions from code review
    
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    
    * Apply suggestions from code review
    
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    
    * Add pyi file for type safety
    
    * Add samples in readme
    
    * Updated all readme instructions
    
    * Address comments
    
    * Update readmes
    
    * Fix requirements
    
    * Address comments
    
    ---------
    
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    
    * .NET: [Feature Branch] Update dotnet-build-and-test.yml to support integration tests (#2070)
    
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    
    * Fix DTS startup issue and improve logging (#2103)
    
    * .NET: [Feature Branch] Introduce Azure OpenAI config for .NET pipeline (#2106)
    
    Also fixes an issue where we were trying to start docker containers for integration tests on Windows, which doesn't work.
    
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    
    * Fix uv.lock after merge
    
    * Python: Add README for Azure Functions samples setup (#2100)
    
    * Add README for Azure Functions samples setup
    
    Added setup instructions for Azure Functions samples, including environment setup, virtual environment creation, and running samples.
    
    * Update python/samples/getting_started/azure_functions/README.md
    
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    
    * Apply suggestions from code review
    
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    
    * Apply suggestion from @Copilot
    
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    
    * Apply suggestions from code review
    
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    Co-authored-by: Laveesh Rohra <larohra@microsoft.com>
    
    * Fix or remove broken markdown file links (#2115)
    
    * .NET: [Feature Branch] Update HTTP API to be consistent across languages (#2118)
    
    * Python: Fix AzureFunctions Integration Tests (#2116)
    
    * Add Identity Auth to samples
    
    * Update python/samples/getting_started/azure_functions/README.md
    
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    
    * Update python/samples/getting_started/azure_functions/01_single_agent/function_app.py
    
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    
    * Update python/samples/getting_started/azure_functions/02_multi_agent/function_app.py
    
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    
    * Update python/samples/getting_started/azure_functions/06_multi_agent_orchestration_conditionals/README.md
    
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    
    * Python: Fix Http Schema (#2112)
    
    * Rename to threadid
    
    * Respond in plain text
    
    * Make snake-case
    
    * Add http prefix
    
    * rename to wait-for-response
    
    * Add query param check
    
    * address comments
    
    * .NET: Remove IsPackable=false in preparation for nuget release (#2142)
    
    * Python: Move `azurefunctions` to `azure` for import (#2141)
    
    * Move import to Azure
    
    * fix mypy
    
    * Update python/packages/azurefunctions/README.md
    
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    
    * Add missing types
    
    * Address comments
    
    ---------
    
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    
    * Update python/packages/azurefunctions/pyproject.toml
    
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    
    * Update python/packages/azurefunctions/agent_framework_azurefunctions/__init__.py
    
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    
    * Fix imports
    
    * Address PR feedback from westey-m (#2150)
    
    - Adds a link from the /dotnet/samples/README.md to /dotnet/samples/AzureFunctions
    - Make DurableAgentThread deserialization internal for future-proofing
    - Update JSON serialization logic to address recently discovered issues with source generator serialization
    
    * Address comments (#2160)
    
    ---------
    
    Co-authored-by: Laveesh Rohra <larohra@microsoft.com>
    Co-authored-by: Chris Gillum <cgillum@microsoft.com>
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    Co-authored-by: Anirudh Garg <anirudhg@microsoft.com>
  • Python: Introducing the Anthropic Client (#1819)
    * initial version of anthropic connector
    
    * updated implementation and added tests
    
    * fix type and readme
    
    * mypy fix and int tests enabled
    
    * add integration test setup
    
    * updated based on comments
    
    * improved function result handling
    
    * added extra unordered test
    
    * updated from review
    
    * fix tool choice handling
    
    * same fix for chat client
  • Python: Updated merge test jobs (#1578)
    * Updated merge test jobs
    
    * Small fix
  • Python: [BREAKING] Main to core (#983)
    * removed pydantic from types
    
    * fix assistants client
    
    * Remove Pydantic usage from workflow code.
    
    * updated lock and test fixes
    
    * moved main to core, and setup meta package
    
    * updated versions
    
    * updated lock
    
    * fixed agents dependency
    
    * added retry to merge tests
    
    ---------
    
    Co-authored-by: Evan Mattson <evan.mattson@microsoft.com>
  • Python: [Breaking] removed pydantic from types and workflows (#917)
    * removed pydantic from types
    
    * fix test
    
    * fix test
    
    * fix tests
    
    * fix assistants client
    
    * Remove Pydantic usage from workflow code.
    
    * updated pydantic removal
    
    * updated lock and test fixes
    
    * fix mypy
    
    * updated build system
    
    * updated chat client parsing
    
    * fix broken test
    
    ---------
    
    Co-authored-by: Evan Mattson <evan.mattson@microsoft.com>
  • Python: [BREAKING] updated structure and samples (#875)
    * updated structure and samples
    
    * updated names and removed cross tests
    
    * updated projects etc
    
    * updated tests
    
    * updated test
    
    * test fixes
    
    * removed devui for now
    
    * updated all-tests task
    
    * removed old style configs
    
    * remove coverage from tests
    
    * updated to unit tests with all-tests
    
    * updated foundry everywhere
    
    * fix azure ai tests
    
    * fix merge tests
    
    * fix mypy
  • Python: Add tau2 benchmark integration with comprehensive testing and documentation (#817)
    * first commit to tau2-bench
    
    * tau2-bench agent
    
    * tau2 agent
    
    * add condition
    
    * checkpoint
    
    * bug fix
    
    * add tests
    
    * fix tests
    
    * add comments
    
    * add comments
    
    * minor fix
    
    * fix
    
    * batch test script
    
    * .
    
    * init.bak -> init.py
    
    * fix mypy
    
    * update readme
    
    * fix env
    
    * remove temp files
    
    * setup tests
    
    * fix gaia tasks
    
    * fix tau2 tests
    
    * fix coverage
    
    * fix default version
    
    * update cookiecutter template
    
    ---------
    
    Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
  • Bump actions/github-script from 7 to 8 (#743)
    Bumps [actions/github-script](https://github.com/actions/github-script) from 7 to 8.
    - [Release notes](https://github.com/actions/github-script/releases)
    - [Commits](https://github.com/actions/github-script/compare/v7...v8)
    
    ---
    updated-dependencies:
    - dependency-name: actions/github-script
      dependency-version: '8'
      dependency-type: direct:production
      update-type: version-update:semver-major
    ...
    
    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
  • Python: api doc generation setup (#342)
    * api doc generation setup
    
    * remove old log file
    
    * improved check md function
    
    * update with sample code in docstring
    
    * updated script
    
    * docs update
    
    * docs update and action
    
    * removed all-extras
    
    * fixed sync command
    
    * moved install
    
    * moved action
    
    * renamed folder
    
    * fixed syntax
    
    * add python path
    
    * fix mypy and reused steps
    
    * updated merge test
    
    * undo change
    
    * slight update in poe commands
    
    * dev setup update
    
    * updated uvlock
  • Bump actions/checkout from 4 to 5 (#435)
    Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 5.
    - [Release notes](https://github.com/actions/checkout/releases)
    - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
    - [Commits](https://github.com/actions/checkout/compare/v4...v5)
    
    ---
    updated-dependencies:
    - dependency-name: actions/checkout
      dependency-version: '5'
      dependency-type: direct:production
      update-type: version-update:semver-major
    ...
    
    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    Co-authored-by: Chris <66376200+crickman@users.noreply.github.com>
  • Python: Introducing UserInputRequest and Response types and HostedMcpTool (#405)
    * initial work on User Approval (and hosted mcp to validate)
    
    * small update to the comments in the sample
    
    * enable local MCP tools in chatClient get methods
    
    * working streaming and improved setup
    
    * fix for pyright
    
    * updated create_approval -> create_response method
    
    * added tests
    
    * updated HostedMcpTool and addressed feedback
    
    * update type name
    
    * naming updates
    
    * small docstring update
    
    * mypy fix
    
    * fixes and updates
    
    * fixes for responses
    
    * fix int tests
    
    * removed broken tests
    
    * updated test running
    
    * removed specific content check on websearch
    
    * increased timeout
    
    * split slow foundry test
    
    * don't parallel run samples
    
    * add dist load to unit tests
    
    ---------
    
    Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
  • Python: Samples Integration Tests (#615)
    * Samples Tests
    
    * small fixes
    
    * job fix
    
    * telemetry dependency fix
    
    * job error fix
    
    * sorting provider specific tests
    
    * telemetry fixes
    
    * openai file search fix
    
    ---------
    
    Co-authored-by: Giles Odigwe <gilesodigwe@microsoft.com>
  • Python: Web search file search tools (#395)
    * Add web and file search tools
    
    * add tests
    
    * PR comments
    
    * Add tools support for chat and assistants clients
    
    * fix code checks
    
    * add tests for assistants client
    
    * Add samples
    
    * fix fn descriptions
    
    * Add openai responses model id to environment variables
    
    ---------
    
    Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>
  • Python: Foundry Chat Client unit tests to improve coverage (#423)
    * Add unit tests for Foundry Chat Client to improve coverage
    
    * Updated Azure OpenAI endpoint and tests timeout
    
    * Error fixes for Foundry Chat Client tests
    
    * Error fixes
    
    ---------
    
    Co-authored-by: Giles Odigwe <gilesodigwe@microsoft.com>
    Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>
  • Python: Introducing Local MCP Servers (#389)
    * mcp parts
    
    * mcp parts 2
    
    * removed structured output in favor of handling in chatresponse, mcp as AITool and running samples
    
    * updated naming
    
    * fixed test
  • Python: add better test coverage to individual tests, and all-tests task, gh … (#400)
    * add better test coverage to individual tests, and all-tests task, gh action to surface
    
    * remove cache location
    
    * test version-file
    
    * updated uv setup for consistency
    
    * mypy fix
    
    * update naming
    
    * temporarily removed mypy from workflow
  • Python: Azure Responses client (#311)
    * Azure Responses client
    
    * Fix a change made in the wrong place
    
    * allow api_version and token_endpoint to use env vars
    
    * Add getting started sample
    
    * add responses deployment name env var
    
    * update azure clients to use defaults for api_version and token_endpoint
    
    * make tests more reliable
    
    ---------
    
    Co-authored-by: Chris <66376200+crickman@users.noreply.github.com>
  • Python: Merge tests (#253)
    * updated var and secret names
    
    * updated ptyhon with conditional logic