Commit Graph

19 Commits

  • ci: pin third-party GitHub Actions to commit SHAs (#5972)
    Replaces every floating tag in our workflow and composite action files
    with an immutable 40-character commit SHA, keeping the original `# vX`
    comment so Dependabot can still propose version bumps. The change covers
    186 occurrences across 25 workflows and 2 composite actions.
    
    Also widens the github-actions Dependabot entry to use the plural
    `directories` key with `/.github/actions/*` so composite actions under
    `.github/actions/<name>/action.yml` are kept up to date. Previously
    Dependabot only scanned `.github/workflows` and the repo-root
    `action.yml`, leaving our `python-setup` and `sample-validation-setup`
    composite actions unmaintained.
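
    A minimal sketch of how the pinning rule described above could be checked.
    The helper name `find_unpinned`, the regular expressions, and the sample
    SHA are illustrative assumptions, not code from this commit.

```python
import re

# Matches `uses: owner/repo[/path]@ref` with an optional trailing `# comment`.
# Local actions such as `./.github/actions/python-setup` have no `@ref`,
# so they never match and are skipped.
USES_RE = re.compile(
    r"^\s*(?:-\s*)?uses:\s*(?P<action>[^@\s]+)@(?P<ref>\S+)"
    r"(?:\s*#\s*(?P<comment>.*))?\s*$"
)
SHA_RE = re.compile(r"[0-9a-f]{40}")


def find_unpinned(workflow_text: str) -> list[str]:
    """Return action references whose ref is not a 40-character commit SHA."""
    unpinned = []
    for line in workflow_text.splitlines():
        match = USES_RE.match(line)
        if match is None:
            continue
        if not SHA_RE.fullmatch(match["ref"]):
            unpinned.append(f'{match["action"]}@{match["ref"]}')
    return unpinned


# The SHA below is made up for illustration only.
SAMPLE = """\
steps:
  - uses: actions/checkout@v6
  - uses: actions/setup-python@0123456789abcdef0123456789abcdef01234567 # v5
  - uses: ./.github/actions/python-setup
"""

print(find_unpinned(SAMPLE))  # ['actions/checkout@v6']
```

    The pinned entry keeps its `# v5` comment, which is the part Dependabot
    reads when proposing a version bump.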
  • Python: Stop emitting duplicate reasoning content from OpenAI response.reasoning_text.done and response.reasoning_summary_text.done events (#5162)
    * Fix reasoning text done events duplicating streamed delta content (#5157)
    
    The OpenAI Responses API sends both reasoning_text.delta (incremental
    chunks) and reasoning_text.done (full accumulated text) events. The
    chat client was emitting Content for both, causing ag-ui to append the
    full done text onto already-accumulated delta text, producing
    duplicated reasoning output.
    
    Stop emitting Content for reasoning_text.done and
    reasoning_summary_text.done events, matching how output_text.done is
    already handled (not emitted). The deltas contain all the content;
    the done event is redundant.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix(openai): emit reasoning done content as fallback when no deltas observed (#5157)
    
    Address PR review feedback:
    - Track item_ids that received reasoning deltas via seen_reasoning_delta_item_ids set
    - Emit content from done events only when no deltas were received for the
      item_id, preventing silent content loss on stream resumption
    - Add comment documenting code_interpreter done event asymmetry
    - Replace redundant ag-ui test with deduplication-focused test
    - Add integration test for delta+done sequence in OpenAI chat client tests
    - Add fallback path tests for done events without preceding deltas
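
    The delta/done handling described above can be sketched roughly as
    follows. `StreamEvent` and `reasoning_contents` are hypothetical stand-ins
    for illustration; only the event type names and the
    `seen_reasoning_delta_item_ids` set come from the commit message.

```python
from dataclasses import dataclass

DELTA_TYPES = {
    "response.reasoning_text.delta",
    "response.reasoning_summary_text.delta",
}
DONE_TYPES = {
    "response.reasoning_text.done",
    "response.reasoning_summary_text.done",
}


@dataclass
class StreamEvent:
    """Hypothetical stand-in for a streamed Responses API event."""

    type: str
    item_id: str
    text: str


def reasoning_contents(events: list[StreamEvent]) -> list[str]:
    """Collect reasoning text, using done events only as a fallback."""
    seen_reasoning_delta_item_ids: set[str] = set()
    contents: list[str] = []
    for event in events:
        if event.type in DELTA_TYPES:
            # Deltas always carry new content; remember which item sent them.
            seen_reasoning_delta_item_ids.add(event.item_id)
            contents.append(event.text)
        elif event.type in DONE_TYPES:
            # The done text repeats the accumulated deltas, so emit it only
            # when no delta was observed for this item (e.g. after a resume).
            if event.item_id not in seen_reasoning_delta_item_ids:
                contents.append(event.text)
    return contents


events = [
    StreamEvent("response.reasoning_text.delta", "r1", "Think"),
    StreamEvent("response.reasoning_text.delta", "r1", "ing"),
    StreamEvent("response.reasoning_text.done", "r1", "Thinking"),
    StreamEvent("response.reasoning_text.done", "r2", "Resumed"),
]
print(reasoning_contents(events))  # ['Think', 'ing', 'Resumed']
```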
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address review feedback for #5157: Python: [Bug]: "type": "response.reasoning_text.delta" and "response.reasoning_text.done" both get exposed as "text_reasoning"
    
    * Fix AG-UI reasoning streaming to use proper Start/End pattern (#5157)
    
    _emit_text_reasoning now follows the same streaming pattern as _emit_text:
    - Emits ReasoningStartEvent/ReasoningMessageStartEvent only on the first
      delta for a given message_id
    - Emits only ReasoningMessageContentEvent for subsequent deltas
    - Defers ReasoningMessageEndEvent/ReasoningEndEvent until
      _close_reasoning_block is called (on content type switch or end-of-run)
    
    This produces the correct protocol pattern:
      ReasoningStartEvent
        ReasoningMessageStartEvent
        ReasoningMessageContentEvent(delta1)
        ReasoningMessageContentEvent(delta2)
        ReasoningMessageEndEvent
      ReasoningEndEvent
    
    Previously, every delta was wrapped in its own full Start→End sequence.
    
    Backward compatibility is preserved: calling _emit_text_reasoning without
    a flow argument still produces the full sequence per call.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix import ordering lint error in AG-UI test file (#5157)
    
    Move inline import of TextMessageContentEvent to the top-level import
    block and ensure alphabetical ordering to satisfy ruff I001 rule.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix mypy error: rename loop variable to avoid type conflict with WorkflowEvent
    
    The 'event' variable was already typed as WorkflowEvent[Any] from the
    async for loop at line 590. Reusing it in the _close_reasoning_block
    loop (which returns list[BaseEvent]) caused an incompatible assignment
    error. Renamed to 'reasoning_evt' to avoid the conflict.
    
    Fixes #5162
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address review feedback for #5157: review comment fixes
    
    * narrow test result reporting to explicit pytest JUnit XML
    
    * Fix test args
    
    * Fix pytest-results-action in merge workflow and remove committed test artifacts
    
    Apply the same JUnit XML fix from python-tests.yml to python-merge-tests.yml:
    add --junitxml=pytest.xml to all test commands and narrow the results action
    path from ./python/**.xml to ./python/pytest.xml. Also remove accidentally
    committed pytest.xml and python-coverage.xml and add them to .gitignore.
    
    ---------
    
    Co-authored-by: Copilot <copilot@github.com>
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: Simplify Python Poe tasks and unify package selectors (#4722)
    * updated automation tasks and commands, with alias for the time being
    
    * Restore aggregate test exclusions
    
    Preserve the legacy all-tests scope for test --all by excluding lab and devui from the default aggregate sweep, while still allowing explicit package selection. Also ignore hidden/generated test directories such as .mypy_cache during aggregate discovery.
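
    The selection rule above might look roughly like this. The function name
    `select_test_dirs` and the `<root>/<package>/tests` layout are assumptions
    made for illustration; the actual Poe task implementation is not shown in
    this commit message.

```python
import tempfile
from pathlib import Path

# Packages left out of the default aggregate sweep, per the commit message.
DEFAULT_EXCLUDED = frozenset({"lab", "devui"})


def select_test_dirs(packages_root, selected=None):
    """Return names of packages whose tests should run.

    With no explicit selection, lab and devui are skipped; an explicit
    selection can still include them. Hidden directories such as
    .mypy_cache are always ignored.
    """
    chosen = []
    for package_dir in sorted(Path(packages_root).iterdir()):
        if not package_dir.is_dir() or package_dir.name.startswith("."):
            continue
        if selected is not None:
            if package_dir.name not in selected:
                continue
        elif package_dir.name in DEFAULT_EXCLUDED:
            continue
        if (package_dir / "tests").is_dir():
            chosen.append(package_dir.name)
    return chosen


def demo():
    """Build a throwaway package tree and run both selection modes."""
    with tempfile.TemporaryDirectory() as root:
        for name in ("core", "lab", "devui", ".mypy_cache"):
            (Path(root) / name / "tests").mkdir(parents=True)
        return select_test_dirs(root), select_test_dirs(root, selected={"lab"})


print(demo())  # (['core'], ['lab'])
```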
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * updated versions in pre-commit
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • [BREAKING] Python: Update github-copilot-sdk integration to use ToolInvocation/ToolResult types (#4551)
    * Update github_copilot package for github-copilot-sdk>=0.1.32 (#4549)
    
    - Update requires-python from >=3.10 to >=3.11
    - Remove Python 3.10 classifier
    - Update mypy python_version to 3.11
    - Update dependency to github-copilot-sdk>=0.1.32
    - Fix ToolResult API: use snake_case kwargs (text_result_for_llm,
      result_type) instead of camelCase (textResultForLlm, resultType)
    - Update test assertions to use attribute access on ToolResult
    - Add ToolResult type assertions to tool handler tests
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix tests to use ToolInvocation dataclass instead of plain dict (#4549)
    
    Update test_github_copilot_agent.py to pass ToolInvocation objects to tool
    handlers instead of plain dicts, matching the github-copilot-sdk>=0.1.32 API
    where ToolInvocation is a dataclass with an .arguments attribute.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Add regression tests for ToolInvocation contract (#4549)
    
    Add tests to lock in the new ToolInvocation-based calling convention:
    - test_tool_handler_rejects_raw_dict_invocation: verifies passing a raw
      dict (old calling convention) raises TypeError/AttributeError
    - test_tool_handler_with_empty_arguments: verifies ToolInvocation with
      empty arguments works correctly for no-arg tools
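
    A minimal sketch of the calling convention these tests lock in. The two
    dataclasses are local stand-ins that mirror only what the commit message
    states (an `.arguments` attribute and the `text_result_for_llm` /
    `result_type` keyword names); the SDK's real classes may differ, and the
    `"success"` value and `greet_handler` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToolInvocation:
    """Stand-in for the SDK type: arguments are reached via attribute access."""

    arguments: dict[str, Any] = field(default_factory=dict)


@dataclass
class ToolResult:
    """Stand-in for the SDK type, using the snake_case field names."""

    text_result_for_llm: str
    result_type: str


def greet_handler(invocation: ToolInvocation) -> ToolResult:
    # Attribute access fails on a plain dict, which is what the
    # raw-dict regression test relies on.
    name = invocation.arguments.get("name", "world")
    return ToolResult(text_result_for_llm=f"Hello, {name}!", result_type="success")


result = greet_handler(ToolInvocation(arguments={"name": "Ada"}))
print(result.text_result_for_llm)  # Hello, Ada!

# A no-argument tool still works with an empty ToolInvocation.
print(greet_handler(ToolInvocation()).text_result_for_llm)  # Hello, world!

# The old calling convention (a raw dict) is rejected.
try:
    greet_handler({"arguments": {"name": "Ada"}})
except AttributeError:
    print("raw dict rejected")
```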
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Revert requires-python to >=3.10 to avoid breaking CI (#4549)
    
    The repo CI runs with Python 3.10 (uv sync --all-packages) and all other
    packages require >=3.10. Raising this package to >=3.11 would break the
    shared install flow. The SDK dependency version constraint (>=0.1.32) will
    enforce any Python version requirement from the SDK itself.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix min Python version for github_copilot package to >=3.11
    
    github-copilot-sdk>=0.1.32 requires Python>=3.11, which conflicts
    with the package's declared >=3.10 minimum, breaking uv sync.
    
    * Bump py version for GH workflows to 3.11, exclude GHCP sdk from 3.10 items
    
    * Fix uv command
    
    * Fixes
    
    * Update samples
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Bump actions/checkout from 5 to 6 (#2404)
    Bumps [actions/checkout](https://github.com/actions/checkout) from 5 to 6.
    - [Release notes](https://github.com/actions/checkout/releases)
    - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
    - [Commits](https://github.com/actions/checkout/compare/v5...v6)
    
    ---
    updated-dependencies:
    - dependency-name: actions/checkout
      dependency-version: '6'
      dependency-type: direct:production
      update-type: version-update:semver-major
    ...
    
    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    Co-authored-by: Chris <66376200+crickman@users.noreply.github.com>
  • Python: add support for Python 3.14 (#1904)
    * add tests for py3.14 and add classifier
    
    * remove macos
    
    * allow openai v2
  • Python: [BREAKING] parameter naming and other fixes (#1255)
    * parameter naming and other fixes
    
    * fix test
    
    * fix azure openai responses decorator ordering
    
    * fix test
    
    * fix mypy
    
    * fixes in options handling
    
    * fix tests
    
    * final fixes
    
    * exclude macos tests
    
    * fix model param
  • Python: Introducing AI Function approval (#1131)
    * support for local function approval
    
    * small fix
    
    * fix mypy
    
    * added bigger test scenarios for function calling and approvals
    
    * updated lock
    
    * updated return message for rejection
    
    * fix test
    
    * updated function result content handling
  • Python: skip macos-latest in gatekeeper (#989)
    * update
    
    * tmp change to test ci
    
    * Update
    
    * Update
    
    * update
    
    * Just update the matrix
  • Python: [BREAKING] updated structure and samples (#875)
    * updated structure and samples
    
    * updated names and removed cross tests
    
    * updated projects etc
    
    * updated tests
    
    * updated test
    
    * test fixes
    
    * removed devui for now
    
    * updated all-tests task
    
    * removed old style configs
    
    * remove coverage from tests
    
    * updated to unit tests with all-tests
    
    * updated foundry everywhere
    
    * fix azure ai tests
    
    * fix merge tests
    
    * fix mypy
  • Python: Add tau2 benchmark integration with comprehensive testing and documentation (#817)
    * first commit to tau2-bench
    
    * tau2-bench agent
    
    * tau2 agent
    
    * add condition
    
    * checkpoint
    
    * bug fix
    
    * add tests
    
    * fix tests
    
    * add comments
    
    * add comments
    
    * minor fix
    
    * fix
    
    * batch test script
    
    * .
    
    * init.bak -> init.py
    
    * fix mypy
    
    * update readme
    
    * fix env
    
    * remove temp files
    
    * setup tests
    
    * fix gaia tasks
    
    * fix tau2 tests
    
    * fix coverage
    
    * fix default version
    
    * update cookiecutter template
    
    ---------
    
    Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
  • Python: [BREAKING] Move workflow to main package (#767)
    * Move workflow to main package
    
    * Remove workflow specific unit test config
    
    * Remove workflow-specific version info
    
    * Revert unintended telemetry changes
    
    * Removed the obsolete packages/workflow/tests target
    
    * Rename dir workflow to _workflow
    
    * Fix test imports
  • Python: api doc generation setup (#342)
    * api doc generation setup
    
    * remove old log file
    
    * improved check md function
    
    * update with sample code in docstring
    
    * updated script
    
    * docs update
    
    * docs update and action
    
    * removed all-extras
    
    * fixed sync command
    
    * moved install
    
    * moved action
    
    * renamed folder
    
    * fixed syntax
    
    * add python path
    
    * fix mypy and reused steps
    
    * updated merge test
    
    * undo change
    
    * slight update in poe commands
    
    * dev setup update
    
    * updated uvlock
  • Bump actions/checkout from 4 to 5 (#435)
    Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 5.
    - [Release notes](https://github.com/actions/checkout/releases)
    - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
    - [Commits](https://github.com/actions/checkout/compare/v4...v5)
    
    ---
    updated-dependencies:
    - dependency-name: actions/checkout
      dependency-version: '5'
      dependency-type: direct:production
      update-type: version-update:semver-major
    ...
    
    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    Co-authored-by: Chris <66376200+crickman@users.noreply.github.com>
  • Python: Introducing UserInputRequest and Response types and HostedMcpTool (#405)
    * initial work on User Approval (and hosted mcp to validate)
    
    * small update to the comments in the sample
    
    * enable local MCP tools in chatClient get methods
    
    * working streaming and improved setup
    
    * fix for pyright
    
    * updated create_approval -> create_response method
    
    * added tests
    
    * updated HostedMcpTool and addressed feedback
    
    * update type name
    
    * naming updates
    
    * small docstring update
    
    * mypy fix
    
    * fixes and updates
    
    * fixes for responses
    
    * fix int tests
    
    * removed broken tests
    
    * updated test running
    
    * removed specific content check on websearch
    
    * increased timeout
    
    * split slow foundry test
    
    * don't run samples in parallel
    
    * add dist load to unit tests
    
    ---------
    
    Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
  • Python: Update getting started with workflows sample structure and README (#653)
    * Update getting started with workflows sample structure and README
    
    * Small updates
    
    * Adjust getting started samples. Fix agent executor bug. Add workflow tests to unit test file.
    
    * Fix resource links
  • Python: add better test coverage to individual tests, and all-tests task, gh … (#400)
    * add better test coverage to individual tests, and all-tests task, gh action to surface
    
    * remove cache location
    
    * test version-file
    
    * updated uv setup for consistency
    
    * mypy fix
    
    * update naming
    
    * temporarily removed mypy from workflow
  • updated builds and release pipeline setup (without pypi publish) (#219)
    * updated builds and release pipeline setup (without pypi publish)
    
    * add label prefix action
    
    * updated uv versions
    
    * final uv updates
    
    * moved uv version semver into variable
  • Python: move all tests under tests and initial work on int tests (#206)
    * move all tests under tests and initial work on int tests
    
    * added updated tests setup and merge tests
    
    * without failing step
    
    * fixed upload
    
    * updated file names for coverage
    
    * reenable surface tests
    
    * removed package matrix
    
    * simplified variables
    
    * correct path
    
    * removed mistake
    
    * fix mistake in path
    
    * fix path
    
    * windows specific env set
    
    * updated merge tests
    
    * slight update in marker
    
    * added run integration tests settings
    
    * updated setup, moved foundry int tests and updated merge test