Commit Graph

9 Commits

  • Python: [BREAKING] Standardize orchestration terminal outputs as AgentResponse (#5301)
    * Fix orchestration outputs so as_agent() returns the final answer only. Align other orchestration outputs
    
    * Fix orchestration output issues from review comments
    
    1. Sample cleanup: Remove commented-out FoundryChatClient block and update
       prerequisites to reference OPENAI_CHAT_MODEL_ID instead of FOUNDRY_* vars.
    
    2. Sequential approval output: Change _EndWithConversation.end_with_agent_executor_response
       from a no-op sink to yield response.agent_response. When the last participant is
       AgentApprovalExecutor (via with_request_info), _EndWithConversation is the output
       executor so the yield produces the terminal answer. When the last participant is a
       regular AgentExecutor, _EndWithConversation is not in output_executors so the yield
       is silently filtered out.
    
    3. Forward data events through WorkflowExecutor: _process_workflow_result now also
       forwards 'data' events from sub-workflows so that emit_intermediate_data=True on
       AgentExecutor works correctly when wrapped in AgentApprovalExecutor.
    
    4. Concurrent docstring: Update _AggregateAgentConversations docstring to say
       'deterministic participant order' instead of 'completion order'.
    
    5. Add test_concurrent_intermediate_outputs_emits_data_events verifying that
       ConcurrentBuilder(intermediate_outputs=True) emits per-participant data events
       alongside the single aggregated output event.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Add tests for sequential workflow with_request_info and intermediate_outputs (#5301)
    
    Address PR review comments 2, 3, and 5:
    
    - Add test_sequential_request_info_last_participant_emits_output:
      Verifies that when the last participant is wrapped via with_request_info()
      (AgentApprovalExecutor), the workflow still emits a terminal output after
      approval, exercising the _EndWithConversation.end_with_agent_executor_response
      fallback path.
    
    - Add test_sequential_request_info_with_intermediate_outputs_emits_data_events:
      Verifies that emit_intermediate_data=True works correctly through
      AgentApprovalExecutor wrapping—WorkflowExecutor._process_result already
      forwards data events from sub-workflows, so intermediate agent responses
      surface as data events in the parent workflow.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix pyright type errors from AgentResponse output refactor (#5301)
    
    Update cast() calls in _group_chat.py and _magentic.py to use
    WorkflowContext[Never, AgentResponse] instead of the old
    WorkflowContext[Never, list[Message]], matching the updated method
    signatures in _base_group_chat_orchestrator.py.
    
    Fix _sequential.py _EndWithConversation.end_with_agent_executor_response
    to declare WorkflowContext[Any, AgentResponse] so yield_output accepts
    AgentResponse[None].
    
    Fix _workflow_executor.py data event forwarding to handle nullable
    executor_id.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix pyright reportUnknownVariableType in _agent.py (#5301)
    
    Extract event.data into a typed local variable before the isinstance
    check to avoid pyright narrowing it to AgentResponse[Unknown].
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix pyright reportMissingImports for orjson in file history samples (#5301)
    
    Add pyright: ignore[reportMissingImports] to orjson imports that are
    already guarded by try/except ImportError, matching the existing pattern
    used elsewhere in the samples.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address review feedback for #5301: review comment fixes
    
    * Address review feedback for #5301: review comment fixes
    
    * Revert sequential_workflow_as_agent sample to FoundryChatClient
    
    Reverts the mistaken switch from FoundryChatClient to OpenAIChatClient
    in the sequential workflow as agent sample.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address ultrareview feedback: emit_data_events rename + WorkflowAgent reasoning conversion
    
    Layered on top of the prior review-feedback work in this branch.
    
    Renames:
    - AgentExecutor.emit_intermediate_data -> emit_data_events (mechanical
      rename; orchestration semantics live at the orchestration layer, not
      the general-purpose executor). Forwarded through MagenticAgentExecutor,
      AgentApprovalExecutor, and all orchestration call sites.
    - HandoffAgentExecutor._check_terminate_and_yield -> _should_terminate
      (pure predicate; no longer yields anything). HandoffBuilder docstring
      rewritten to describe the new per-agent AgentResponse output contract.
    
    WorkflowAgent reasoning-content conversion:
    - Add _rewrite_text_to_reasoning(contents) and _msg_as_reasoning(msg)
      helpers; the as_agent() path now reframes text content from data events
      as text_reasoning Content blocks before merging into the AgentResponse.
    - Consumers iterate msg.contents and branch on content.type — same path
      they already use for Claude thinking and OpenAI reasoning. No new
      field on Message/AgentResponse/WorkflowEvent.
    - Streaming branch constructs fresh AgentResponseUpdate instances instead
      of mutating shared payloads (regression test added).
    - Helper _msg_maybe_reasoning consolidates the conditional rewrite at
      three call sites in the non-streaming conversion.
    
    Tests:
    - TestWorkflowAgentReasoningHelpers + TestWorkflowAgentDataEventReasoningConversion
      add 9 new tests covering helpers, non-streaming, streaming, mixed content,
      already-reasoning passthrough, and mutation-safety regression.
    - Updated test_sequential_as_agent_with_intermediate_outputs_includes_chain
      to assert text_reasoning content for intermediate agents.
    
    * Fix pyright: widen event.data to Any to avoid partial-unknown narrowing
    
    The streaming conversion path narrowed event.data via isinstance against
    generic AgentResponse, producing AgentResponse[Unknown] and tripping
    reportUnknownVariableType/reportUnknownMemberType. Binding data: Any
    before the check keeps runtime behavior identical while restoring a fully
    known type for downstream access.
    
    * Clean up design
    
    * Scope to agent output semantics only
    
    * yield AgentResponseUpdate streaming, AgentResponse non-streaming
    
    * Fix mypy/pyright: widen cast types at GroupChat callsites
    
    Eight callsites in _group_chat.py still cast to WorkflowContext[Never,
    AgentResponse] but the base orchestrator methods now accept the wider
    WorkflowContext[Never, AgentResponse | AgentResponseUpdate] (mode-aware
    yields). W_OutT is invariant, so the narrower cast is not assignable.
    Magentic was widened in the same commit; this catches the GroupChat
    callsites that were missed.
    
    * Python: skip flaky Foundry / Foundry Hosting integration tests (#5553)
    
    These two integration tests have been failing in the merge queue across
    multiple unrelated PRs (5301, 5531). Both are marked `@pytest.mark.flaky`
    with 3 retries, but all attempts fail back-to-back. Skipping both with a
    reason pointing to #5553 so they can be fixed properly without continuing
    to block unrelated merges.
    
    - packages/foundry_hosting/tests/test_responses_int.py::TestOptions::test_temperature_and_max_tokens
    - packages/foundry/tests/foundry/test_foundry_embedding_client.py::TestFoundryEmbeddingIntegration::test_text_embedding_live
    
    Also includes a one-line uv.lock specifier-ordering normalization
    auto-applied by the poe-check pre-commit hook.
    
    ---------
    
    Co-authored-by: Copilot <copilot@github.com>
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: [BREAKING] update to v1.0.0 (#5062)
    * updates to final deprecated pieces and versions
    
    * fix mypy
    
    * fix readme links
  • [BREAKING] Python: Checkpoint refactor: encode/decode, checkpoint format, etc (#3744)
    * WIP: Checkpoint refactor: encode/decode, checkpoint format, etc
    
    * WIP: Remove workflow ID in checkpoints
    
    * Refactor checkpointing
    
    * Add get_latest tests
    
    * Increase test coverage
    
    * Fix formatting
    
    * Fix unit tests
    
    * Fix samples
    
    * fix unit tests
    
    * fix pipeline
    
    * Copilot comments
    
    * Fix tests
    
    * Fix more tests
    
    * Address comments part 1
    
    * Address comments part 2
    
    * Comments
  • Python: [BREAKING] Simplify API: ChatAgent -> Agent, ChatMessage -> Message (#3747)
    * [BREAKING] Rename ChatAgent -> Agent, ChatMessage -> Message, ChatClientProtocol -> SupportsChatGetResponse
    
    Simplify the public API by removing redundant 'Chat' prefix from core types:
    - ChatAgent -> Agent
    - RawChatAgent -> RawAgent
    - ChatMessage -> Message
    - ChatClientProtocol -> SupportsChatGetResponse
    
    Also renamed internal WorkflowMessage (was Message in _runner_context) to avoid collision.
    
    No backward compatibility aliases - this is a clean breaking change.
    
    * [BREAKING] Rename Agent chat_client parameter to client
    
    * Fix rebase issues: WorkflowMessage references and broken markdown links
    
    * Fix formatting and lint issues from code quality checks
    
    * Fix import ordering in workflow sample files
    
    * fixed rebase
    
    * Fix test failures: use WorkflowMessage and A2AMessage after ChatMessage→Message rename
    
    - Replace Message(data=..., source_id=...) with WorkflowMessage(...) in workflow tests
    - Fix isinstance check in A2A agent to use A2AMessage instead of Message
    - Fix import in test_workflow_observability.py (Message→WorkflowMessage)
    
    * Fix lint, fmt, and sample errors after ChatMessage→Message rename
    
    - Auto-fix 70+ ruff lint issues across samples (ChatMessage→Message refs)
    - Fix HostedVectorStoreContent→Content.from_hosted_vector_store in file search sample
    - Fix _normalize_messages→normalize_messages in custom agent sample
    - Fix context.terminate→raise MiddlewareTermination in middleware samples
    - Fix with_update_hook→with_transform_hook in override middleware sample
    - Add TOptions_co import back to custom_chat_client sample
    - Add noqa for FastAPI File() default in chatkit sample
    - Fix B023 loop variable capture in weather agent sample
    
    * fix: update Agent constructor calls from chat_client to client in declaration-only tool tests
    
    * fix: add register_cleanup to devui lazy-loading proxy and type stub
    
    * fixed tests and updated new pieces
    
    * fix agui typevar
    
    * fix merge errors
    
    * fix merge conflicts
    
    * fiux merge
    
    * Remove unused links
    
    ---------
    
    Co-authored-by: Evan Mattson <evan.mattson@microsoft.com>
  • [BREAKING] Python: Remove workflow register factory methods. Update tests and samples (#3781)
    * Remove workflow register factory methods. Update tests and samples
    
    * Address Copilot feedback
  • [BREAKING] Python: Move single-config fluent methods to constructor parameters (#3693)
    * Move single-config fluent methods to constructor parameters
    
    * Updates
    
    * Adjust magentic and group chat
  • [BREAKING] Python: Refactor workflow events to unified discriminated union pattern (#3690)
    * Refactor events
    
    * Merge main
    
    * Fixes
    
    * Cleanup
    
    * Update samples and tests
    
    * Remove unused imports
    
    * PR feedback
    
    * Merge main. Add properties for events to help typing
    
    * Formatting
    
    * Cleanup
    
    * use builtins.type to avoid shadowing by WorkflowEvent.type attribute
    
    * Final improvements
  • Python: [BREAKING] Moved to a single get_response and run API (#3379)
    * WIP
    
    * big update to new ResponseStream model
    
    * fixed tests and typing
    
    * fixed tests and typing
    
    * fixed tools typevar import
    
    * fix
    
    * mypy fix
    
    * mypy fixes and some cleanup
    
    * fix missing quoted names
    
    * and client
    
    * fix  imports agui
    
    * fix anthropic override
    
    * fix agui
    
    * fix ag ui
    
    * fix import
    
    * fix anthropic types
    
    * fix mypy
    
    * refactoring
    
    * updated typing
    
    * fix 3.11
    
    * fixes
    
    * redid layering of chat clients and agents
    
    * redid layering of chat clients and agents
    
    * Fix lint, type, and test issues after rebase
    
    - Add @overload decorators to AgentProtocol.run() for type compatibility
    - Add missing docstring params (middleware, function_invocation_configuration)
    - Fix TODO format (TD002) by adding author tags
    - Fix broken observability tests from upstream:
      - Replace non-existent use_instrumentation with direct instantiation
      - Replace non-existent use_agent_instrumentation with AgentTelemetryLayer mixin
      - Fix get_streaming_response to use get_response(stream=True)
      - Add AgentInitializationError import
      - Update streaming exception tests to match actual behavior
    
    * Fix AgentExecutionException import error in test_agents.py
    
    - Replace non-existent AgentExecutionException with AgentRunException
    
    * Fix test import and asyncio deprecation issues
    
    - Add 'tests' to pythonpath in ag-ui pyproject.toml for utils_test_ag_ui import
    - Replace deprecated asyncio.get_event_loop().run_until_complete with asyncio.run
    
    * Fix azure-ai test failures
    
    - Update _prepare_options patching to use correct class path
    - Fix test_to_azure_ai_agent_tools_web_search_missing_connection to clear env vars
    
    * Convert ag-ui utils_test_ag_ui.py to conftest.py
    
    - Move test utilities to conftest.py for proper pytest discovery
    - Update all test imports to use conftest instead of utils_test_ag_ui
    - Remove old utils_test_ag_ui.py file
    - Revert pythonpath change in pyproject.toml
    
    * fix: use relative imports for ag-ui test utilities
    
    * fix agui
    
    * Rename Bare*Client to Raw*Client and BaseChatClient
    
    - Renamed BareChatClient to BaseChatClient (abstract base class)
    - Renamed BareOpenAIChatClient to RawOpenAIChatClient
    - Renamed BareOpenAIResponsesClient to RawOpenAIResponsesClient
    - Renamed BareAzureAIClient to RawAzureAIClient
    - Added warning docstrings to Raw* classes about layer ordering
    - Updated README in samples/getting_started/agents/custom with layer docs
    - Added test for span ordering with function calling
    
    * Fix layer ordering: FunctionInvocationLayer before ChatTelemetryLayer
    
    This ensures each inner LLM call gets its own telemetry span, resulting in
    the correct span sequence: chat -> execute_tool -> chat
    
    Updated all production clients and test mocks to use correct ordering:
    - ChatMiddlewareLayer (first)
    - FunctionInvocationLayer (second)
    - ChatTelemetryLayer (third)
    - BaseChatClient/Raw...Client (fourth)
    
    * Remove run_stream usage
    
    * Fix conversation_id propagation
    
    * Python: Add BaseAgent implementation for Claude Agent SDK (#3509)
    
    * Added ClaudeAgent implementation
    
    * Updated streaming logic
    
    * Small updates
    
    * Small update
    
    * Fixes
    
    * Small fix
    
    * Naming improvements
    
    * Updated imports
    
    * Addressed comments
    
    * Updated package versions
    
    * Update Claude agent connector layering
    
    * fix test and plugin
    
    * Store function middleware in invocation layer
    
    * Fix telemetry streaming and ag-ui tests
    
    * Remove legacy ag-ui tests folder
    
    * updates
    
    * Remove terminate flag from FunctionInvocationContext, use MiddlewareTermination instead
    
    - Remove terminate attribute from FunctionInvocationContext
    - Add result attribute to MiddlewareTermination to carry function results
    - FunctionMiddlewarePipeline.execute() now lets MiddlewareTermination propagate
    - _auto_invoke_function captures context.result in exception before re-raising
    - _try_execute_function_calls catches MiddlewareTermination and sets should_terminate
    - Fix handoff middleware to append to chat_client.function_middleware directly
    - Update tests to use raise MiddlewareTermination instead of context.terminate
    - Add middleware flow documentation in samples/concepts/tools/README.md
    - Fix ag-ui to use FunctionMiddlewarePipeline instead of removed create_function_middleware_pipeline
    
    * fix: remove references to removed terminate flag in purview tests, add type ignore
    
    * fix: move _test_utils.py from package to test folder
    
    * fix: call get_final_response() to trigger context provider notification in streaming test
    
    * fix: correct broken links in tools README
    
    * docs: clarify default middleware behavior in summary table
    
    * fix: ensure inner stream result hooks are called when using map()/from_awaitable()
    
    * Fix mypy type errors
    
    * Address PR review comments on observability.py
    
    - Remove TODO comment about unconsumed streams, add explanatory note instead
    - Remove redundant _close_span cleanup hook (already called in _finalize_stream)
    - Clarify behavior: cleanup hooks run after stream iteration, if stream is not
      consumed the span remains open until garbage collected
    
    * Remove gen_ai.client.operation.duration from span attributes
    
    Duration is a metrics-only attribute per OpenTelemetry semantic conventions.
    It should be recorded to the histogram but not set as a span attribute.
    
    * Remove duration from _get_response_attributes, pass directly to _capture_response
    
    Duration is a metrics-only attribute. It's now passed directly to _capture_response
    instead of being included in the attributes dict that gets set on the span.
    
    * Remove redundant _close_span cleanup hook in AgentTelemetryLayer
    
    _finalize_stream already calls _close_span() in its finally block,
    so adding it as a separate cleanup hook is redundant.
    
    * Use weakref.finalize to close span when stream is garbage collected
    
    If a user creates a streaming response but never consumes it, the cleanup
    hooks won't run. Now we register a weak reference finalizer that will close
    the span when the stream object is garbage collected, ensuring spans don't
    leak in this scenario.
    
    * Fix _get_finalizers_from_stream to use _result_hooks attribute
    
    Renamed function to _get_result_hooks_from_stream and fixed it to
    look for the _result_hooks attribute which is the correct name in
    ResponseStream class.
    
    * Add missing asyncio import in test_request_info_mixin.py
    
    * Fix leftover merge conflict marker in image_generation sample
    
    * Update integration tests
    
    * Fix integration tests: increase max_iterations from 1 to 2
    
    Tests with tool_choice options require at least 2 iterations:
    1. First iteration to get function call and execute the tool
    2. Second iteration to get the final text response
    
    With max_iterations=1, streaming tests would return early with only
    the function call/result but no final text content.
    
    * Fix duplicate function call error in conversation-based APIs
    
    When using conversation_id (for Responses/Assistants APIs), the server
    already has the function call message from the previous response. We
    should only send the new function result message, not all messages
    including the function call which would cause a duplicate ID error.
    
    Fix: When conversation_id is set, only send the last message (the tool
    result) instead of all response.messages.
    
    * Add regression test for conversation_id propagation between tool iterations
    
    Port test from PR #3664 with updates for new streaming API pattern.
    Tests that conversation_id is properly updated in options dict during
    function invocation loop iterations.
    
    * Fix tool_choice=required to return after tool execution
    
    When tool_choice is 'required', the user's intent is to force exactly one
    tool call. After the tool executes, return immediately with the function
    call and result - don't continue to call the model again.
    
    This fixes integration tests that were failing with empty text responses
    because with tool_choice=required, the model would keep returning function
    calls instead of text.
    
    Also adds regression tests for:
    - conversation_id propagation between tool iterations (from PR #3664)
    - tool_choice=required returns after tool execution
    
    * Document tool_choice behavior in tools README
    
    - Add table explaining tool_choice values (auto, none, required)
    - Explain why tool_choice=required returns immediately after tool execution
    - Add code example showing the difference between required and auto
    - Update flow diagram to show the early return path for tool_choice=required
    
    * Fix tool_choice=None behavior - don't default to 'auto'
    
    Remove the hardcoded default of 'auto' for tool_choice in ChatAgent init.
    When tool_choice is not specified (None), it will now not be sent to the
    API, allowing the API's default behavior to be used.
    
    Users who want tool_choice='auto' can still explicitly set it either in
    default_options or at runtime.
    
    Fixes #3585
    
    * Fix tool_choice=none should not remove tools
    
    In OpenAI Assistants client, tools were not being sent when
    tool_choice='none'. This was incorrect - tool_choice='none' means
    the model won't call tools, but tools should still be available
    in the request (they may be used later in the conversation).
    
    Fixes #3585
    
    * Add test for tool_choice=none preserving tools
    
    Adds a regression test to ensure that when tool_choice='none' is set but
    tools are provided, the tools are still sent to the API. This verifies
    the fix for #3585.
    
    * Fix tool_choice=none should not remove tools in all clients
    
    Apply the same fix to OpenAI Responses client and Azure AI client:
    - OpenAI Responses: Remove else block that popped tool_choice/parallel_tool_calls
    - Azure AI: Remove tool_choice != 'none' check when adding tools
    
    When tool_choice='none', the model won't call tools, but tools should
    still be sent to the API so they're available for future turns.
    
    Also update README to clarify tool_choice=required supports multiple tools.
    
    Fixes #3585
    
    * Keep tool_choice even when tools is None
    
    Move tool_choice processing outside of the 'if tools' block in OpenAI
    Responses client so tool_choice is sent to the API even when no tools
    are provided.
    
    * Update test to match new parallel_tool_calls behavior
    
    Changed test_prepare_options_removes_parallel_tool_calls_when_no_tools to
    test_prepare_options_preserves_parallel_tool_calls_when_no_tools to reflect
    that parallel_tool_calls is now preserved even when no tools are present,
    consistent with the tool_choice behavior.
    
    * Fix ChatMessage API and Role enum usage after rebase
    
    - Update ChatMessage instantiation to use keyword args (role=, text=, contents=)
    - Fix Role enum comparisons to use .value for string comparison
    - Add created_at to AgentResponse in error handling
    - Fix AgentResponse.from_updates -> from_agent_run_response_updates
    - Fix DurableAgentStateMessage.from_chat_message to convert Role enum to string
    - Add Role import where needed
    
    * Fix additional ChatMessage API and method name changes
    
    - Fix ChatMessage usage in workflow files (use text= instead of contents= for strings)
    - Fix AgentResponse.from_updates -> from_agent_run_response_updates in workflow files
    - Fix test files for ChatMessage and Role enum usage
    
    * Fix remaining ChatMessage API usage in test files
    
    * Fix more ChatMessage and Role API changes in source and test files
    
    - Fix ChatMessage in _magentic.py replan method
    - Fix Role enum comparison in test assertions
    - Fix remaining test files with old ChatMessage syntax
    
    * Fix ChatMessage and Role API changes across packages
    
    - Add Role import where missing
    - Fix ChatMessage signature: positional args to keyword args (role=, text=, contents=)
    - Fix Role enum comparisons: .role.value instead of .role string
    - Fix FinishReason enum usage in ag-ui event converters
    - Rename AgentResponse.from_updates to from_agent_run_response_updates in ag-ui
    
    Fixes API compatibility after Types API Review improvements merge
    
    * Fix ChatMessage and Role API changes in github_copilot tests
    
    * Fix ChatMessage and Role API changes in redis and github_copilot packages
    
    - Fix redis provider: Role enum comparison using .value
    - Fix redis tests: ChatMessage signature and Role comparisons
    - Fix github_copilot tests: ChatMessage signature and Role comparisons
    - Update docstring examples in redis chat message store
    
    * Fix ChatMessage and Role API changes in devui package
    
    - Fix executor: ChatMessage signature change
    - Fix conversations: Role enum to string conversion in two places
    - Fix tests: ChatMessage signatures and Role comparisons
    
    * Fix ChatMessage and Role API changes in a2a and lab packages
    
    - Fix a2a tests: Role comparisons and ChatMessage signatures
    - Fix lab tau2 source: Role enum comparison in flip_messages, log_messages, sliding_window
    - Fix lab tau2 tests: ChatMessage signatures and Role comparisons
    
    * Remove duplicate test files from ag-ui/tests (tests are in ag_ui_tests)
    
    * Fix ChatMessage and Role API changes across packages
    
    After rebasing on upstream/main which merged PR #3647 (Types API Review
    improvements), fix all packages to use the new API:
    
    - ChatMessage: Use keyword args (role=, text=, contents=) instead of
      positional args
    - Role: Compare using .value attribute since it's now an enum
    
    Packages fixed:
    - ag-ui: Fixed Role value extraction bugs in _message_adapters.py
    - anthropic: Fixed ChatMessage and Role comparisons in tests
    - azure-ai: Fixed Role comparison in _client.py
    - azure-ai-search: Fixed ChatMessage and Role in source/tests
    - bedrock: Fixed ChatMessage signatures in tests
    - chatkit: Fixed ChatMessage and Role in source/tests
    - copilotstudio: Fixed ChatMessage and Role in tests
    - declarative: Fixed ChatMessage in _executors_agents.py
    - mem0: Fixed ChatMessage and Role in source/tests
    - purview: Fixed ChatMessage in source/tests
    
    * Fix mypy errors for ChatMessage and Role API changes
    
    - durabletask: Use str() fallback in role value extraction
    - core: Fix ChatMessage in _orchestrator_helpers.py to use keyword args
    - core: Add type ignore for _conversation_state.py contents deserialization
    - ag-ui: Fix type ignore comments (call-overload instead of arg-type)
    - azure-ai-search: Fix get_role_value type hint to accept Any
    - lab: Move get_role_value to module level with Any type hint
    
    * Improve CI test timeout configuration
    
    - Increase job timeout from 10 to 15 minutes
    - Reduce per-test timeout to 60s (was 900s/300s)
    - Add --timeout_method thread for better timeout handling
    - Add --timeout-verbose to see which tests are slow
    - Reduce retries from 3 to 2 and delay from 10s to 5s
    
    This ensures individual test timeouts are shorter than the job
    timeout, providing better visibility when tests hang.
    
    With 60s timeout and 2 retries, worst case per test is ~180s.
    
    * Fix ChatMessage API usage in docstrings and source
    
    - Fix ChatMessage positional args in docstrings: _serialization.py, _threads.py, _middleware.py
    - Fix ChatMessage in tau2 runner.py
    - Fix role comparison in _orchestrator_helpers.py to use .value
    - Fix role comparison in _group_chat.py docstring example
    - Fix role assertions in test_durable_entities.py to use .value
    
    * Revert tool_choice/parallel_tool_calls changes - must be removed when no tools
    
    OpenAI API requires tool_choice and parallel_tool_calls to only be
    present when tools are specified. Restored the logic that removes
    these options when there are no tools.
    
    - Restored check in _chat_client.py to remove tool_choice and
      parallel_tool_calls when no tools present
    - Restored same logic in _responses_client.py
    - Reverted test to expect the correct behavior
    
    * fixed issue in tests
    
    * fix: resolve merge conflict markers in ag-ui tests
    
    * fix: restructure ag-ui tests and fix Role/FinishReason to use string types
    
    * fix: streaming function invocation and middleware termination
    
    - Refactor streaming function invocation to use get_final_response() on inner streams
    - Fix MiddlewareTermination to accept result parameter for passing results
    - Fix _AutoHandoffMiddleware to use MiddlewareTermination instead of context.terminate
    - Fix AgentMiddlewareLayer.run() to properly forward function/chat middleware
    - Remove duplicate middleware registration in AgentMiddlewareLayer.__init__
    - Fix exception handling in _auto_invoke_function to properly capture termination
    - Fix mypy errors in core package
    - Update tests to use stream=True parameter for unified run API
    
    * fix all tests command
    
    * Refactor integration tests to use pytest fixtures
    
    - Merge testutils.py into conftest.py for azurefunctions integration tests
    - Merge dt_testutils.py into conftest.py for durabletask integration tests
    - Convert all integration tests to use fixtures instead of direct imports
      (fixes ModuleNotFoundError with --import-mode=importlib)
    - Add sample_helper fixture for azurefunctions tests
    - Add agent_client_factory and orchestration_helper fixtures for durabletask
    - Integration tests now skip with descriptive messages when services unavailable
    - Restructure devui tests into tests/devui/ with proper conftest.py
    - Add test organization guidelines to CODING_STANDARD.md
    - Remove __init__.py from test directories per pytest best practices
    
    * Fix pytest_collection_modifyitems to only skip integration tests
    
    The hook was skipping all tests in the test session, not just
    integration tests. Now it only skips items in the integration_tests
    directory.
    
    * Fix mem0 tests failing on Python 3.13
    
    Use patch.object on the imported module instead of @patch with string
    path to ensure the mock takes effect regardless of import timing.
    
    * fix mem0
    
    * another attempt for mem0
    
    * fix for mem0
    
    * fix mem0
    
    * Increase worker initialization wait time in durabletask tests
    
    Increase from 2 to 8 seconds to allow time for:
    - Python startup and module imports
    - Azure OpenAI client creation
    - Agent registration with DTS worker
    - Worker connection to DTS
    
    This helps prevent test failures in CI where the first tests may run
    before the worker is fully ready to process requests.
    
    * Fix streaming test to use ResponseStream with finalizer
    
    The _consume_stream method now expects a ResponseStream that can provide
    a final AgentResponse via get_final_response(). Update the test to use
    ResponseStream with AgentResponse.from_updates as the finalizer.
    
    * Fix MockToolCallingAgent to use new ResponseStream API and update samples
    
    * small updates to run_stream to run
    
    * fix sub workflow
    
    * temp fix for az func test
    
    ---------
    
    Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>
  • [BREAKING] Python: Move orchestrations to dedicated package (#3685)
    * Move orchestrations to dedicated package
    
    * Merge main
    
    * Fix markdown links
    
    * Fix links