39 Commits

  • Python: [Breaking] Additional bug fix for declarative workflows (#6489)
    * Fix declarative object parsing bug
    
    * Remove unnecessary comment
    
    * Address PR comments
    
    * Address PR comments.
    
    * Fix CI failures.
    
    * declarative action approval bugfix
    
    * Address PR comments
    
    * Inlined single use variables.
  • Python: Refactor workflow as agent pending request handling (#6259)
    * WIP: Refactor Workflow as agent pending request handling
    
    * WIP: debugging empty message bug
    
    * Working: Workflow as agent with function approval
    
    * Address Copilot comments
    
    * Fix mypy
    
    * Address comments and fix pipeline
    
    * Request info non function approval now becomes function call
    
    * Revert uv.lock
    
    * Fix mypy
    
    * Bump min version of azure-ai-project
    
    * Remove RequestInfoFunctionArgs
    
    * fix tests
    
    * Fix failing tests
    
    * Fix sample
  • Python: Promote agent-framework-declarative package to RC (#6256)
    * Promote agent-framework-declarative package to RC
    
    * Update missed package status file.
  • Python: [Breaking] Remove Python-only declarative actions and rename alias kinds to C# canonical names (#6126)
    * Remove Python-only declarative actions and rename alias kinds to C# canonical names
    
    * Address PR comments.
    
    * Address PR comments.
    
    * Reduce verbose and duplicate output from sample workflow.
  • Python: Add Python parity sample for invoking Foundry Toolbox tools from declarative workflows (#5933)
    * Add Python parity sample for invoking Foundry Toolbox tools from declarative workflows
    
    * Python: address PR review on declarative toolbox sample
    
    Two security fixes for PR #5933:
    
    1. Add safe_mode flag to WorkflowFactory (default True) mirroring
       AgentFactory. Gates =Env.* exposure inside DeclarativeWorkflowState
       PowerFx symbols via _safe_mode_context, so workflow YAML loaded from
       untrusted sources no longer leaks the host's full os.environ snapshot
       into PowerFx evaluation. The flag is also forwarded to the
       internally-constructed AgentFactory so inline agent definitions
       follow the same policy.
    
    2. Pin the invoke_foundry_toolbox_mcp sample's _client_provider to the
       resolved toolbox endpoint. The bearer-authenticated httpx client is
       now only returned when MCPToolInvocation.server_url matches the
       toolbox URL case-insensitively; any other URL gets None (the default
       unauthenticated path), preventing the Foundry AAD bearer token from
       being attached to a mis-configured or injected server URL. Mirrors
       the .NET sample's httpClientProvider guard.
    
    The sample is updated to opt in to safe_mode=False because its YAML
    intentionally uses =Env.FOUNDRY_TOOLBOX_* to keep configuration in env
    vars under the developer's control.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix pyright issues.
    
    * Addressed PR comments.
    
    * Fix CI pipelines.
    
    * Resolve PR comments
    
    * Revamped sample to address PR comments.
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • [BREAKING] Python: Enable instrumentation by default (#5865)
    * Enable instrumentation by default
    
    * Update samples
    
    * Optimization when span is not recording
    
    * Address Copilot comments
    
    * Revert uv.lock
    
    * Add warning
    
    * Formatting
    
    * Fix mypy
    
    * Add disable_instrumentation() with sticky user-intent semantics
    
    Add a public disable_instrumentation() entry point so users can explicitly opt
    out of Agent Framework telemetry, with a sticky-disable flag that makes the
    user's intent "leading" — no framework code path (foundry's
    configure_azure_monitor, configure_otel_providers, enable_instrumentation,
    enable_sensitive_telemetry, or direct OBSERVABILITY_SETTINGS.enable_*
    writes) can re-enable instrumentation until the user explicitly clears the
    disable with enable_instrumentation(force=True) /
    enable_sensitive_telemetry(force=True).
    
    Also addresses the two remaining unresolved review threads on the PR:
    1. test_observability_settings_defaults_instrumentation_true pins the new
       "ENABLE_INSTRUMENTATION defaults to True when env unset" behavior.
    2. test_enable_instrumentation_reads_env_sensitive_data restores coverage
       for the post-import load_dotenv() fallback path.
    
    Implementation:
    - ObservabilitySettings.enable_instrumentation / enable_sensitive_data become
      properties backed by _enable_*. While _user_disabled is True, the getters
      return False and the setters drop True writes (defense in depth so third-
      party writes can't subvert the disable).
    - Public is_user_disabled read-only property lets integrations (e.g. foundry's
      configure_azure_monitor) cheaply check the disable state without poking at
      privates.
    - enable_instrumentation() and enable_sensitive_telemetry() short-circuit with
      an info log when disabled; gain a force=True kwarg that clears the disable.
    - configure_otel_providers() still creates providers / exporters / views so a
      later force-enable can use them, but logs an info message when called while
      disabled.
    - Foundry's FoundryChatClient.configure_azure_monitor and
      FoundryAgent.configure_azure_monitor early-return when the user has
      disabled, so Azure Monitor's global providers aren't installed unnecessarily.
    
    Tests: 11 new tests covering default-on, env re-read at call time, sticky
    behavior against each re-enable surface (enable_instrumentation,
    enable_sensitive_telemetry, configure_otel_providers, direct attribute
    writes), force=True override, re-arming the disable, and the __all__ export.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * docs: document disable_instrumentation() and force=True paths
    
    Add a "Disabling instrumentation" section to the observability sample README
    that walks through:
    
    - The distinction between the ENABLE_INSTRUMENTATION env var (initial,
      non-sticky) and disable_instrumentation() (process-wide, sticky).
    - Why the sticky semantics matter: framework integrations like
      FoundryChatClient.configure_azure_monitor() can call
      enable_instrumentation() as part of their setup, and the user's opt-out
      needs to win.
    - All five surfaces guarded by the sticky disable (property reads, public
      enable functions, configure_otel_providers, direct attribute writes,
      is_user_disabled-aware integrations).
    - The force=True escape hatch on both enable_instrumentation() and
      enable_sensitive_telemetry().
    - How third-party integrations should consult OBSERVABILITY_SETTINGS.is_user_disabled.
    - The limits of the disable (does not tear down existing providers /
      in-flight spans / third-party instrumentation, does not persist across
      processes).
    
    Cross-links the new section from the ENABLE_INSTRUMENTATION row in the env
    vars table.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * docs: soften disable_instrumentation() overclaim about telemetry guarantees
    
    Replace 'no telemetry will be emitted no matter what' (which is too strong,
    since callers can still pass force=True or mutate private attributes) with
    language framing the disable as a user-intent contract that library and
    framework code is expected to honor: the framework actively short-circuits
    the public enable paths, force=True and private-attribute writes are
    acknowledged as out-of-contract escape hatches that integrations should
    not use on the user's behalf.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * docs: correct observability Dependencies section
    
    - opentelemetry-sdk is no longer a hard dependency; it is lazily imported by
      create_resource(), create_metric_views(), and configure_otel_providers()
      with a clear ImportError when missing. Day-to-day instrumentation works
      with opentelemetry-api alone provided some other component configures the
      global OpenTelemetry providers (Azure Monitor, an APM agent, application
      bootstrap, etc.).
    - opentelemetry-semantic-conventions-ai is no longer used anywhere in the
      source; remove it from the listed dependencies.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * docs: replace stale observability migration guide with current PR's only relevant migration
    
    The old guide documented the move away from setup_observability(otlp_endpoint=...)
    which was an earlier-release API change unrelated to this PR and stale enough that
    it's more confusing than helpful at this point. Replace it with a short note on the
    single migration this PR introduces: callers of
    enable_instrumentation(enable_sensitive_data=True) should switch to
    enable_sensitive_telemetry(). Cross-link to the Disabling instrumentation section
    for the rare 'force on without enabling sensitive data' use case where
    enable_instrumentation() still applies.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: Improve the handling of intermediate outputs for workflows and orchestrations (#5623)
    * Improve the handling of intermediate outputs for workflows and orchestrations
    
    * Address PR review feedback on intermediate output forwarding
    
    - Switch workflow.as_agent() forwarding to an explicit allowlist of {output,
      intermediate, data, request_info} so orchestration-internal events
      (group_chat, handoff_sent, magentic_orchestrator) stay inside the workflow
      instead of leaking into agent responses via str(data) coercion.
    - Stop raising on intermediate AgentResponseUpdate in non-streaming run();
      surface the partial as a Message with text_reasoning content. The defensive
      raise still applies to terminal output events, where Update payloads would
      corrupt message ordering.
    - Extend the DevUI workflow-event mapper so intermediate yields wrapping
      plain strings, Messages, and list[Message] render as visible output items
      instead of generic completed-trace events.
    - Add orchestration coverage for GroupChat, Handoff, and Magentic builders
      (default vs intermediate_outputs=True; structural where end-to-end is heavy).
    
    * Lift output-designation policy into a value type
    
    Replace the ``Workflow._output_executors`` list and the
    ``RunnerContext.should_label_as_intermediate`` Protocol method with a single
    immutable ``OutputDesignation`` value type owned by ``Workflow``. Thread the
    designation as a parameter through the existing call chain (Runner ->
    EdgeRunner -> Executor -> WorkflowContext) so ``yield_output`` consults the
    threaded snapshot directly rather than calling back into the runner context.
    
    Removes the ``InProcRunnerContext._workflow`` back-reference and the
    ``WorkflowBuilder.build()`` assignment that wired it up. Adds the public
    predicate ``Workflow.is_terminal_executor(executor_id)`` for external
    observers; ``OutputDesignation`` itself stays package-internal.
    
    Key decisions
    - ``OutputDesignation.designated`` is ``frozenset[str] | None`` -- ``None``
      preserves legacy "every yield is type='output'" behavior, any frozenset
      (including empty) opts into strict mode. The ``DeprecationWarning`` for
      legacy mode at build time is unchanged.
    - ``output_designation`` is an optional parameter on ``Runner``,
      ``EdgeRunner.send_message``, ``EdgeRunner._execute_on_target``,
      ``Executor.execute``, ``Executor._create_context_for_handler``, and
      ``WorkflowContext.__init__``. Each defaults to legacy ``OutputDesignation()``
      so direct callers (Azure Functions ``CapturingRunnerContext``,
      ``test_runner`` recording fixtures) keep working without ceremony.
    - The workflow-level filter in ``_run_core`` reads ``self._output_designation``
      live, preserving today's semantics where mutating the designation after
      build still affects subsequent runs (used by two existing tests).
    - ``Workflow.to_dict()`` continues to emit ``"output_executors":
      list[str] | None`` (sorted from the frozenset). Checkpoint format unchanged.
    
    Files changed
    - _workflow.py: add ``OutputDesignation`` dataclass; replace
      ``_output_executors`` with ``_output_designation``; add
      ``is_terminal_executor``; delete ``_should_yield_output_event``.
    - _runner_context.py: drop ``should_label_as_intermediate`` Protocol method
      and ``InProcRunnerContext`` impl; drop ``_workflow`` back-reference.
    - _workflow_builder.py: remove ``context._workflow = workflow`` assignment.
    - _runner.py, _edge_runner.py, _executor.py, _workflow_context.py: thread
      ``output_designation`` parameter through the call chain.
    - tests/workflow/test_output_designation.py (new): three-state coverage of
      the value type plus the public predicate delegation.
    - tests/workflow/test_workflow_builder.py, test_validation.py,
      test_workflow.py, test_runner.py and
      orchestrations/tests/test_orchestration_intermediate_vs_terminal.py:
      switch probes from ``_output_executors`` set checks to
      ``get_output_executors`` / ``is_terminal_executor``; update two
      post-build mutation tests to set ``_output_designation`` instead.
    
    Verification
    - core/tests/workflow/, orchestrations/tests/, azurefunctions/tests/:
      1119 passed, 42 skipped, 2 xfailed.
    - ``uv run poe lint``: clean.
    - ``uv run poe typing``: only the pre-existing
      ``_AGENT_FORWARDED_EVENT_TYPES`` pyright warning from 394bcd607 remains.
    
    Notes for next iteration
    - The builder's own ``_output_executors`` attribute (``list[Executor |
      SupportsAgentRun]``) is intentionally untouched; the issue scoped the
      rename to the workflow attribute.
    - Adjacent review candidates (twin ``WorkflowAgent`` translators,
      ``_AGENT_FORWARDED_EVENT_TYPES`` kind classifier,
      ``_event_origin_context`` ContextVar removal, ``WorkflowEvent`` ADT
      split, legacy-mode removal) remain out of scope.
    
    * Add explicit workflow output designation
    
    Key decisions
    
    - Extend the internal OutputDesignation value type from terminal-only membership to output/intermediate/hidden classification. Legacy mode remains outputs=None, so workflows built without output_executors or intermediate_executors still label every yield_output as type='output'.
    
    - WorkflowBuilder now accepts intermediate_executors. Providing either designation enters explicit mode; output executors emit output, intermediate executors emit intermediate, and unlisted yield_output payloads are hidden from caller-facing events while remaining in executor_completed data.
    
    - Empty explicit designation, duplicate entries, overlaps, unknown executors, and designated executors without workflow output annotations fail build validation. Existing orchestration builders pass intermediate-capable participants through intermediate_executors to preserve current intermediate_outputs behavior until participant-oriented designation lands.
    
    Files changed
    
    - packages/core/agent_framework/_workflows/_workflow.py, _workflow_builder.py, _workflow_context.py, _validation.py, _events.py
    
    - packages/core/tests/workflow/test_output_designation.py, test_output_executors_contract.py, test_strict_mode_event_labeling.py, test_validation.py, test_workflow.py, test_workflow_agent_intermediate.py
    
    - packages/orchestrations/agent_framework_orchestrations/_sequential.py, _concurrent.py, _group_chat.py, _magentic.py
    
    - packages/core/AGENTS.md
    
    Verification
    
    - uv run pytest packages/core/tests/workflow packages/orchestrations/tests packages/devui/tests/devui/test_mapper.py -q
    
    - uv run pytest packages/azurefunctions/tests -q
    
    - uv run poe lint
    
    - uv run poe typing fails only on pre-existing packages/core/agent_framework/_workflows/_agent.py _AGENT_FORWARDED_EVENT_TYPES private-use pyright error.
    
    Notes for next iteration
    
    - issues/03-core-workflow-explicit-designation.md was moved to issues/done but issues/ remains untracked and intentionally excluded from this commit.
    
    - Slice 4 should tighten workflow.as_agent() mapping for hidden emissions and streaming-only update payloads; Slice 5 should replace orchestration intermediate_outputs with participant-oriented designation.
    
    * Tighten workflow-as-agent output mapping
    
    Key decisions
    
    - Treat AgentResponseUpdate as a streaming-only payload across the workflow.as_agent() adapter, so non-streaming agent runs now reject both terminal output and intermediate workflow events carrying updates.
    - Keep streaming classification behavior explicit: terminal update payloads remain normal text content, while intermediate update payloads are rewritten to text_reasoning content.
    - Add explicit-mode coverage proving hidden yield_output emissions do not appear in non-streaming AgentResponse messages or streaming AgentResponseUpdate chunks.
    
    Files changed
    
    - packages/core/agent_framework/_workflows/_agent.py
    - packages/core/tests/workflow/test_workflow_agent_intermediate.py
    
    Verification
    
    - uv run pytest packages/core/tests/workflow/test_workflow_agent_intermediate.py -q
    - uv run pytest packages/core/tests/workflow/test_workflow_agent.py packages/core/tests/workflow/test_workflow_agent_intermediate.py -q
    - uv run pytest packages/core/tests/workflow packages/orchestrations/tests packages/devui/tests/devui/test_mapper.py -q
    - uv run poe lint
    - uv run poe typing fails only on the pre-existing packages/core/agent_framework/_workflows/_agent.py _AGENT_FORWARDED_EVENT_TYPES private-use pyright error.
    
    Blockers or notes for next iteration
    
    - issues/04-workflow-as-agent-output-mapping.md was moved to issues/done/ but issues/ remains untracked and intentionally excluded from this commit.
    - Slice 5 should replace orchestration intermediate_outputs with participant-oriented designation.
    
    * Add orchestration participant output designation
    
    Key decisions
    
    - Replace orchestration intermediate_outputs with participant-oriented output_participants and intermediate_participants across Sequential, Concurrent, GroupChat, Magentic, and Handoff builders.
    - Keep synthetic final executors terminal by default for Concurrent, GroupChat, and Magentic; keep Sequential's final participant terminal by default; keep Handoff participants terminal by default.
    - Centralize participant designation validation for empty explicit designation, duplicates, overlaps, and unknown participants, then map validated participants to workflow output/intermediate executors.
    
    Files changed
    
    - packages/orchestrations/agent_framework_orchestrations/_participant_designation.py
    - packages/orchestrations/agent_framework_orchestrations/_sequential.py
    - packages/orchestrations/agent_framework_orchestrations/_concurrent.py
    - packages/orchestrations/agent_framework_orchestrations/_group_chat.py
    - packages/orchestrations/agent_framework_orchestrations/_magentic.py
    - packages/orchestrations/agent_framework_orchestrations/_handoff.py
    - packages/orchestrations/tests/test_orchestration_intermediate_vs_terminal.py
    - packages/orchestrations/tests/test_magentic.py
    
    Blockers or notes for next iteration
    
    - issues/05-orchestration-participant-designation.md was moved to issues/done/ but issues/ remains untracked and intentionally excluded from this commit.
    - Slice 7 should migrate samples and docs away from intermediate_outputs to the new participant designation API.
    - uv run poe typing still fails only on the pre-existing packages/core/agent_framework/_workflows/_agent.py _AGENT_FORWARDED_EVENT_TYPES private-use pyright error.
    
    * Migrate samples to explicit output designation
    
    Key decisions
    
    - Replace sample usage of the removed orchestration intermediate_outputs boolean with participant-oriented intermediate_participants designation.
    - Update raw workflow guidance to show output_executors together with intermediate_executors, and document that unlisted yields are hidden in explicit designation mode.
    - Keep orchestration final outputs terminal while streaming designated participant responses as intermediate progress, including workflow.as_agent() samples where intermediates map to text_reasoning content.
    - Refresh workflow and orchestration README guidance plus the changelog reference so public docs no longer point users at intermediate_outputs.
    
    Files changed
    
    - CHANGELOG.md
    - packages/orchestrations/README.md
    - samples/README.md
    - samples/03-workflows/README.md
    - samples/03-workflows/control-flow/intermediate_vs_terminal_outputs.py
    - samples/03-workflows/orchestrations/README.md
    - samples/03-workflows/orchestrations/group_chat_agent_manager.py
    - samples/03-workflows/orchestrations/group_chat_philosophical_debate.py
    - samples/03-workflows/orchestrations/group_chat_simple_selector.py
    - samples/03-workflows/orchestrations/magentic.py
    - samples/03-workflows/orchestrations/magentic_human_plan_review.py
    - samples/03-workflows/orchestrations/sequential_chain_only_agent_responses.py
    - samples/03-workflows/agents/group_chat_workflow_as_agent.py
    - samples/03-workflows/agents/magentic_workflow_as_agent.py
    - samples/03-workflows/agents/sequential_workflow_as_agent.py
    - samples/semantic-kernel-migration/orchestrations/group_chat.py
    - samples/semantic-kernel-migration/orchestrations/magentic.py
    
    Blockers or notes for next iteration
    
    - issues/07-samples-and-docs-explicit-output-designation.md was moved to issues/done/ but issues/ remains untracked and intentionally excluded from this commit.
    - issues/06-devui-intermediate-event-rendering.md remains present and appears already satisfied by existing DevUI mapper/tests from the prior implementation slice.
    - PRD-explicit-workflow-output-designation.md remains untracked and intentionally excluded from this commit.
    
    * Render DevUI intermediate workflow outputs
    
    Key decisions
    
    - Preserve workflow output designation metadata on visible DevUI output messages and text deltas so intermediate/data emissions remain distinguishable from terminal output.
    - Render intermediate workflow message items in the execution timeline using executor metadata, while excluding them from the final workflow result aggregation.
    - Keep terminal output message rendering unchanged and retain legacy data events on the intermediate compatibility path.
    
    Files changed
    
    - packages/devui/agent_framework_devui/_mapper.py
    - packages/devui/frontend/src/components/features/workflow/execution-timeline.tsx
    - packages/devui/frontend/src/components/features/workflow/workflow-view.tsx
    - packages/devui/frontend/src/types/openai.ts
    - packages/devui/tests/devui/test_mapper.py
    
    Blockers or notes for next iteration
    
    - issues/06-devui-intermediate-event-rendering.md was moved to issues/done/ but issues/ remains untracked and intentionally excluded from this commit.
    - PRD-explicit-workflow-output-designation.md remains untracked and intentionally excluded from this commit.
    - uv run poe typing still fails only on the pre-existing packages/core/agent_framework/_workflows/_agent.py _AGENT_FORWARDED_EVENT_TYPES private-use pyright error.
    
    * Fix mypy
    
    * Clarify orchestration participant output config
    
    * Rename participant output kwargs for clarity
    
    output_participants -> final_output_from, intermediate_participants ->
    intermediate_output_from. The old names read like categories of
    participant; the new names make it clear the kwarg designates which
    participants' outputs surface as final vs. intermediate events.
    
    * Rename core workflow output kwargs with deprecation shim
    
    Adds final_output_from / intermediate_output_from as canonical kwargs on
    Workflow and WorkflowBuilder. Old output_executors / intermediate_executors
    kwargs continue to work but emit DeprecationWarning via a shared coalesce
    helper that also rejects supplying both. Wire-format keys in to_dict()
    stay as output_executors / intermediate_executors so checkpoint
    compatibility is preserved.
    
    Internal call sites in orchestrations and samples updated to the new
    names so users following sample code learn the canonical vocabulary;
    legacy callers still work with a one-shot warning.
    
    * Suppress pyright reportPrivateUsage on cross-module sentinel import
    
    * Update docstrings
    
    * Propagate sub-workflow intermediate outputs, fix handoff/sequential intermediate-only designation, and shore up tests, sample, and docstrings around the intermediate output contract.
    
    * Add canonical workflow output_from selection
    
    Key decisions:\n- Make output_from the canonical workflow-output allow-list and keep output_executors/final_output_from as deprecated compatibility aliases.\n- Treat empty output_from/intermediate_output_from lists as explicit selections and keep validation responsible for empty, duplicate, overlap, and unknown selections.\n- Remove the branch-only public intermediate_executors WorkflowBuilder kwarg while preserving legacy wire keys in to_dict().\n\nFiles changed:\n- packages/core/agent_framework/_workflows/_workflow.py\n- packages/core/agent_framework/_workflows/_workflow_builder.py\n- packages/core/agent_framework/_workflows/_workflow_context.py\n- packages/core/agent_framework/_workflows/_agent.py\n- packages/core/agent_framework/_workflows/_agent_executor.py\n- packages/core/tests/workflow/* output-selection coverage updates\n- packages/core/AGENTS.md\n- issues/done/001-canonical-list-based-output-selection.md\n\nBlockers/notes:\n- Orchestration builders still pass final_output_from internally; follow-up issue 004 should migrate them to output_from.\n- Legacy omitted-selection behavior and explicit all/all_other literals are left for issues 002 and 003.
    
    * Add explicit all workflow output selection
    
    Key decisions:
    - Treat output_from='all' as an explicit workflow-output selection sentinel and expand it at build time to executors with declared workflow output types.
    - Keep omitted output selections in legacy all-output mode with a deprecation warning that names output_from and intermediate_output_from and points to output_from='all'.
    - Reject intermediate_output_from='all' at construction because the all-output literal is output-only for this issue.
    
    Files changed:
    - packages/core/agent_framework/_workflows/_workflow_builder.py
    - packages/core/tests/workflow/test_output_executors_contract.py
    - issues/done/002-explicit-all-output-and-legacy-migration.md
    
    Blockers/notes:
    - all_other intermediate-output selection remains for issue 003.
    - Workflow-as-agent/orchestration parity remains for issue 004.
    
    * Add all-other intermediate output selection
    
    Key decisions:
    - Treat intermediate_output_from='all_other' as an explicit intermediate-output selection sentinel and expand it at build time after the workflow graph is complete.
    - Expand all_other to output-capable executors not selected by output_from; omitted or empty output_from selects no workflow outputs, while output_from='all' leaves an empty intermediate selection.
    - Keep output_from='all_other' invalid so all_other remains intermediate-output-only and runtime classification still receives concrete executor-id sets.
    
    Files changed:
    - packages/core/agent_framework/_workflows/_workflow_builder.py
    - packages/core/tests/workflow/test_output_executors_contract.py
    - issues/done/003-all-other-intermediate-output-selection.md
    
    Blockers/notes:
    - Workflow-as-agent and orchestration parity remains for issue 004.
    - Full documentation updates remain for issue 005.
    
    * Add orchestration output selection parity
    
    Key decisions:
    - Expose output_from on sequential, concurrent, group chat, handoff, and magentic builders while keeping final_output_from as a deprecated compatibility alias.
    - Resolve orchestration participant selections through the same explicit rules as workflows: output_from='all', intermediate_output_from='all_other', hidden unselected participant payloads, and overlap/duplicate/unknown/invalid-literal validation.
    - Continue preserving documented orchestration defaults by always designating each pattern's terminal internal executor where applicable.
    
    Files changed:
    - packages/orchestrations/agent_framework_orchestrations/_participant_output_config.py
    - packages/orchestrations/agent_framework_orchestrations/_sequential.py
    - packages/orchestrations/agent_framework_orchestrations/_concurrent.py
    - packages/orchestrations/agent_framework_orchestrations/_group_chat.py
    - packages/orchestrations/agent_framework_orchestrations/_handoff.py
    - packages/orchestrations/agent_framework_orchestrations/_magentic.py
    - packages/orchestrations/agent_framework_orchestrations/_orchestration_request_info.py
    - packages/orchestrations/tests/test_orchestration_intermediate_vs_terminal.py
    - issues/done/004-workflow-as-agent-and-orchestration-parity.md
    
    Blockers/notes:
    - Full documentation and sample migration wording remains for issue 005.
    - Existing tests that intentionally use final_output_from now emit the new deprecation warning.
    
    * Document workflow output selection contract
    
    Key decisions:
    - Use Workflow Output and Intermediate Output as the developer-facing terms for selected caller-facing emissions.
    - Document output_from and intermediate_output_from as the canonical API, with output_from as an allow-list and unselected payloads hidden unless explicitly selected as intermediate.
    - Add scenario and invalid-selection tables for workflow and orchestration docs, including legacy omission warnings, output_from='all', intermediate_output_from='all_other', list selections, invalid literals, overlap, duplicates, unknown selections, and empty explicit selections.
    - Migrate samples away from final_output_from and output_executors except where compatibility aliases are explicitly documented.
    
    Files changed:
    - packages/core/AGENTS.md
    - packages/orchestrations/README.md
    - packages/orchestrations/agent_framework_orchestrations/_handoff.py
    - packages/orchestrations/agent_framework_orchestrations/_sequential.py
    - samples/03-workflows/README.md
    - samples/03-workflows/control-flow/intermediate_vs_terminal_outputs.py
    - samples/03-workflows/human-in-the-loop/agents_with_approval_requests.py
    - samples/03-workflows/orchestrations/README.md
    - samples/04-hosting/foundry-hosted-agents/responses/05_workflows/main.py
    - scripts/sample_validation/create_dynamic_workflow_executor.py
    - issues/done/005-document-output-selection-contract.md
    
    Blockers/notes:
    - Direct full Ruff on scripts/sample_validation/create_dynamic_workflow_executor.py still reports pre-existing docstring/print/line-length issues outside this docs migration; syntax-focused checks for changed files pass.
    - No remaining AFK issue files are present under issues/.
    
    * Latest updates
    
    * Typing fixes
    
    * Cleanup
  • Python: Add Python parity for InvokeMcpTool in declarative workflow (#5630)
    * Add Python parity for HttpRequestAction in declarative workflow
    
    * Ran pyupgrade and pright to fix CI issues
    
    * Fix conversation ID dot parsing for http executor
    
    * Removed unnecessary export command
    
    * Initial implementation of invoke mcp tool in python
    
    * Update sample to support require approval to be toggled by environment variable.
    
    * Fix cache and PR comments
    
    * Update python/samples/03-workflows/declarative/invoke_mcp_tool/main.py
    
    Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>
  • docs: fix outdated @ai_function reference to @tool in workflows README (#5622)
    The @ai_function decorator was renamed to @tool in release 
    python-1.0.0b260128 (PR #3413) as a breaking change.
    
    Line 58 of python/samples/03-workflows/README.md still referenced 
    the old @ai_function name, causing users to hit:
    ImportError: cannot import name 'AIFunction'
    
    Changes made:
    - Fixed @ai_function to @tool on line 58 only
    - No formatting or whitespace changes
  • Python: Add Python parity for HttpRequestAction in declarative workflow (#5599)
    * Add Python parity for HttpRequestAction in declarative workflow
    
    * Ran pyupgrade and pright to fix CI issues
    
    * Fix conversation ID dot parsing for http executor
    
    * Removed unnecessary export command
  • Python: [BREAKING] Standardize orchestration terminal outputs as AgentResponse (#5301)
    * Fix orchestration outputs so as_agent() returns the final answer only. Align other orchestration outputs
    
    * Fix orchestration output issues from review comments
    
    1. Sample cleanup: Remove commented-out FoundryChatClient block and update
       prerequisites to reference OPENAI_CHAT_MODEL_ID instead of FOUNDRY_* vars.
    
    2. Sequential approval output: Change _EndWithConversation.end_with_agent_executor_response
       from a no-op sink to yield response.agent_response. When the last participant is
       AgentApprovalExecutor (via with_request_info), _EndWithConversation is the output
       executor so the yield produces the terminal answer. When the last participant is a
       regular AgentExecutor, _EndWithConversation is not in output_executors so the yield
       is silently filtered out.
    
    3. Forward data events through WorkflowExecutor: _process_workflow_result now also
       forwards 'data' events from sub-workflows so that emit_intermediate_data=True on
       AgentExecutor works correctly when wrapped in AgentApprovalExecutor.
    
    4. Concurrent docstring: Update _AggregateAgentConversations docstring to say
       'deterministic participant order' instead of 'completion order'.
    
    5. Add test_concurrent_intermediate_outputs_emits_data_events verifying that
       ConcurrentBuilder(intermediate_outputs=True) emits per-participant data events
       alongside the single aggregated output event.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Add tests for sequential workflow with_request_info and intermediate_outputs (#5301)
    
    Address PR review comments 2, 3, and 5:
    
    - Add test_sequential_request_info_last_participant_emits_output:
      Verifies that when the last participant is wrapped via with_request_info()
      (AgentApprovalExecutor), the workflow still emits a terminal output after
      approval, exercising the _EndWithConversation.end_with_agent_executor_response
      fallback path.
    
    - Add test_sequential_request_info_with_intermediate_outputs_emits_data_events:
      Verifies that emit_intermediate_data=True works correctly through
      AgentApprovalExecutor wrapping—WorkflowExecutor._process_result already
      forwards data events from sub-workflows, so intermediate agent responses
      surface as data events in the parent workflow.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix pyright type errors from AgentResponse output refactor (#5301)
    
    Update cast() calls in _group_chat.py and _magentic.py to use
    WorkflowContext[Never, AgentResponse] instead of the old
    WorkflowContext[Never, list[Message]], matching the updated method
    signatures in _base_group_chat_orchestrator.py.
    
    Fix _sequential.py _EndWithConversation.end_with_agent_executor_response
    to declare WorkflowContext[Any, AgentResponse] so yield_output accepts
    AgentResponse[None].
    
    Fix _workflow_executor.py data event forwarding to handle nullable
    executor_id.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix pyright reportUnknownVariableType in _agent.py (#5301)
    
    Extract event.data into a typed local variable before the isinstance
    check to avoid pyright narrowing it to AgentResponse[Unknown].
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix pyright reportMissingImports for orjson in file history samples (#5301)
    
    Add pyright: ignore[reportMissingImports] to orjson imports that are
    already guarded by try/except ImportError, matching the existing pattern
    used elsewhere in the samples.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address review feedback for #5301: review comment fixes
    
    * Address review feedback for #5301: review comment fixes
    
    * Revert sequential_workflow_as_agent sample to FoundryChatClient
    
    Reverts the mistaken switch from FoundryChatClient to OpenAIChatClient
    in the sequential workflow as agent sample.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address ultrareview feedback: emit_data_events rename + WorkflowAgent reasoning conversion
    
    Layered on top of the prior review-feedback work in this branch.
    
    Renames:
    - AgentExecutor.emit_intermediate_data -> emit_data_events (mechanical
      rename; orchestration semantics live at the orchestration layer, not
      the general-purpose executor). Forwarded through MagenticAgentExecutor,
      AgentApprovalExecutor, and all orchestration call sites.
    - HandoffAgentExecutor._check_terminate_and_yield -> _should_terminate
      (pure predicate; no longer yields anything). HandoffBuilder docstring
      rewritten to describe the new per-agent AgentResponse output contract.
    
    WorkflowAgent reasoning-content conversion:
    - Add _rewrite_text_to_reasoning(contents) and _msg_as_reasoning(msg)
      helpers; the as_agent() path now reframes text content from data events
      as text_reasoning Content blocks before merging into the AgentResponse.
    - Consumers iterate msg.contents and branch on content.type — same path
      they already use for Claude thinking and OpenAI reasoning. No new
      field on Message/AgentResponse/WorkflowEvent.
    - Streaming branch constructs fresh AgentResponseUpdate instances instead
      of mutating shared payloads (regression test added).
    - Helper _msg_maybe_reasoning consolidates the conditional rewrite at
      three call sites in the non-streaming conversion.
    
    Tests:
    - TestWorkflowAgentReasoningHelpers + TestWorkflowAgentDataEventReasoningConversion
      add 9 new tests covering helpers, non-streaming, streaming, mixed content,
      already-reasoning passthrough, and mutation-safety regression.
    - Updated test_sequential_as_agent_with_intermediate_outputs_includes_chain
      to assert text_reasoning content for intermediate agents.
    
    * Fix pyright: widen event.data to Any to avoid partial-unknown narrowing
    
    The streaming conversion path narrowed event.data via isinstance against
    generic AgentResponse, producing AgentResponse[Unknown] and tripping
    reportUnknownVariableType/reportUnknownMemberType. Binding data: Any
    before the check keeps runtime behavior identical while restoring a fully
    known type for downstream access.
    
    * Clean up design
    
    * Scope to agent output semantics only
    
    * yield AgentResponseUpdate streaming, AgentResponse non-streaming
    
    * Fix mypy/pyright: widen cast types at GroupChat callsites
    
    Eight callsites in _group_chat.py still cast to WorkflowContext[Never,
    AgentResponse] but the base orchestrator methods now accept the wider
    WorkflowContext[Never, AgentResponse | AgentResponseUpdate] (mode-aware
    yields). W_OutT is invariant, so the narrower cast is not assignable.
    Magentic was widened in the same commit; this catches the GroupChat
    callsites that were missed.
    
    * Python: skip flaky Foundry / Foundry Hosting integration tests (#5553)
    
    These two integration tests have been failing in the merge queue across
    multiple unrelated PRs (5301, 5531). Both are marked `@pytest.mark.flaky`
    with 3 retries, but all attempts fail back-to-back. Skipping both with a
    reason pointing to #5553 so they can be fixed properly without continuing
    to block unrelated merges.
    
    - packages/foundry_hosting/tests/test_responses_int.py::TestOptions::test_temperature_and_max_tokens
    - packages/foundry/tests/foundry/test_foundry_embedding_client.py::TestFoundryEmbeddingIntegration::test_text_embedding_live
    
    Also includes a one-line uv.lock specifier-ordering normalization
    auto-applied by the poe-check pre-commit hook.
    
    ---------
    
    Co-authored-by: Copilot <copilot@github.com>
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: (core): Add functional workflow API (#4238)
    * Add functional workflow api
    
    * cleanup
    
    * More cleanup
    
    * address copilot feedback
    
    * Address PR feedbacK
    
    * updates
    
    * PR feedback
    
    * Address review comments on functional workflow samples
    
    - Swap 05/06 get-started samples: agent workflow first (motivates
      why workflows exist), simple text workflow second
    - Rename text_pipeline → text_workflow, poem_pipeline → poem_workflow
    - Add @step to agent workflow sample (05) to demonstrate caching
    - Switch agent samples to AzureOpenAIResponsesClient with Foundry
    - Remove .as_agent() from agent_integration.py to focus on the key
      difference between inline agent calls vs @step-cached calls
    - Add commented-out Agent.run example in hitl_review.py
    - Add clarifying comment in _functional.py that event streaming is
      buffered (not true per-token streaming)
    - Add naive_group_chat.py functional sample: round-robin group chat
      as a plain Python loop
    - Update READMEs to reflect new file names and group chat sample
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix pyright type errors
    
    * Address PR review comments on functional workflow API
    
    1. Allow request_info inside @step: Auto-inject RunContext into step
       functions that declare a RunContext parameter (by type or name 'ctx'),
       and expose get_run_context() for programmatic access.
    
    2. Handle None responses: Log a warning when a response value is None,
       and document the behavior in request_info docstring.
    
    3. Add executor_bypassed event type: Replace executor_invoked +
       executor_completed with a single executor_bypassed event when a step
       replays from cache, making cached vs live execution explicit.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Add regression tests for PR review comments on functional workflow API
    
    The three review comments (request_info in @step, None response handling,
    executor_bypassed event type) were already addressed in 7da7db4e. This
    commit adds cross-cutting regression tests that exercise the interactions
    between these features:
    
    - HITL in step with caching: preceding step bypassed on resume
    - Full checkpoint lifecycle with HITL step (interrupt -> resume -> restore)
    - None response inside step-level request_info logs warning
    - WorkflowInterrupted from step does not emit executor_failed
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address PR #4238 review comments on functional workflow API
    
    Comment 1 (request_info in @step): Already supported. Added comment in
    StepWrapper.__call__ explaining why WorkflowInterrupted (BaseException)
    safely bypasses the except Exception handler.
    
    Comment 2 (None response): Added docstring to _get_response clarifying
    the (found, value) return tuple semantics and None handling.
    
    Comment 3 (bypass event type): executor_bypassed is already a dedicated
    event type in WorkflowEventType. Updated comment at the bypass site to
    make the deliberate event type choice explicit.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Add experimental API warnings to functional workflow module
    
    Mark all public classes and decorators (workflow, step, RunContext,
    FunctionalWorkflow, StepWrapper, FunctionalWorkflowAgent) as
    experimental and subject to change or removal.
    
    * Address PR #4238 review comments from @eavanvalkenburg
    
    - RunContext docstring leads with purpose (opt-in handle for HITL,
      custom events, state) so readers importing it from the public surface
      understand its role before the mechanics (#2993513452).
    - Rename `06_first_functional_workflow.py` to
      `06_functional_workflow_basics.py`; the previous filename was
      confusing since it followed `05_functional_workflow_with_agents.py`
      (#2993531979).
    - Simplify `05_functional_workflow_with_agents.py` to call agents
      directly without a @step wrapper; the step-vs-no-step contrast lives
      in `03-workflows/functional/agent_integration.py`, keeping the
      get-started sample minimal (#2993525532).
    - Switch functional samples to `FoundryChatClient` for consistency with
      the rest of 01-get-started and 03-workflows (follow-up on #2876988570).
    - Use walrus in `hitl_review.py` final-state assertion (#2993572182).
    - Add expected-output block to `basic_streaming_pipeline.py` (#2993557609).
    - Clarify in `parallel_pipeline.py` that `@step` composes with
      `asyncio.gather` (#2993597282).
    - `naive_group_chat.py` threads `list[Message]` between turns instead
      of stringifying the transcript, preserving role/authorship (#2993583231).
    
    Drive-by: pre-commit hook sorts an unrelated import block in
    `samples/04-hosting/foundry-hosted-agents/responses/02_local_tools/main.py`.
    
    * Fix 10 functional-workflow API bugs from /ultrareview pass
    
    - bug_001: `ctx.request_info()` without an explicit `request_id` now derives
      a deterministic `auto::<index>` id from the call-counter, so HITL resume
      works correctly on the documented default path.  A uuid was regenerated on
      every replay, making resume impossible.
    
    - bug_002: `StepWrapper.__call__` no longer deepcopies arguments on the
      cache-hit replay branch.  The copy is only performed on the live-execution
      path (for the event log) and falls back to the original mapping if deepcopy
      fails, so steps whose args aren't deepcopyable (locks, sockets, sessions)
      can still resume from checkpoint.
    
    - bug_007: `_set_responses` now prunes each resolved `request_id` from
      `_pending_requests`, and the cache-hit branch in `request_info` does the
      same.  Previously, answered requests were re-serialized into every
      subsequent checkpoint and the final checkpoint falsely claimed pending
      requests even after the workflow completed.
    
    - bug_008: `_compute_signature_hash` now mixes the function's `co_code` and
      `co_names` into the checkpoint signature, so changes to the workflow body
      invalidate older checkpoints even when steps are accessed via module /
      class attributes (which `_discover_step_names` can't see statically).
      `RunContext._record_observed_step` records observed step names for
      diagnostics.
    
    - bug_010: `FunctionalWorkflow.run()` docstring corrected — says "at least
      one of message/responses/checkpoint_id" and explicitly notes `responses`
      may be combined with `checkpoint_id` (the validator already allowed this).
    
    - bug_013: `FunctionalWorkflowAgent` now surfaces `request_info` events as
      `FunctionApprovalRequestContent` items (mirroring graph `WorkflowAgent`),
      threads `responses=` and `checkpoint_id=` through to the underlying
      workflow, and exposes `pending_requests`.  Previously `.as_agent()`
      returned empty `AgentResponse` for HITL workflows — effectively unusable.
    
    - bug_014: `FunctionalWorkflow` now clears `_last_message`,
      `_last_step_cache`, and `_last_pending_request_ids` on clean completion.
      `run()` validates that `responses=` keys intersect the currently-pending
      request set (or raises with a clear error) instead of silently replaying
      against stale singleton state from a prior run.
    
    - bug_015: `FunctionalWorkflow.as_agent` signature now matches graph
      `Workflow.as_agent`: accepts `name`, `description`, `context_providers`,
      and `**kwargs`.  `FunctionalWorkflowAgent` stores the overrides.
    
    - bug_017: `RunContext.set_state` raises `ValueError` for underscore-
      prefixed keys (the framework's `_step_cache` / `_original_message` keys
      would silently clobber user state on checkpoint save and user
      underscore-prefixed state was dropped on restore).  Docstring documents
      the reserved prefix.
    
    - merged_bug_003: Workflow function arity is validated at decoration time.
      Multiple non-ctx parameters raise `ValueError` immediately (previously
      every arg past the first was silently dropped at call time).  Passing a
      non-None `message` to a ctx-only workflow raises `ValueError` instead of
      silently discarding the message.
    
    Test coverage: +18 regression tests covering every fix.  Full workflow
    suite now 766 passed, 1 skipped, 2 xfailed; full core suite 2338 passed.
    
    * Deslop functional.py fix commit
    
    - Remove dead instrumentation added in the prior commit that was never
      consumed: `RunContext._observed_step_names`,
      `RunContext._record_observed_step`, `FunctionalWorkflow._runtime_step_names`,
      and `FunctionalWorkflowAgent._extra_kwargs`.  The signature hash relies on
      `co_code` alone, which covers the attribute-access case without the
      collection-scaffolding.
    - Trim over-explanatory comments that restated what the code does or what
      it no longer does.  Keep only the comments that answer "why" for the
      non-obvious bits (deterministic id contract, defensive deepcopy, stale
      replay guard).
    - Compress the `_compute_signature_hash` and FunctionalWorkflow `__init__`
      block docstrings without losing the user-facing reasoning.
    
    Net -49 lines.  Regression lock preserved (766 passed, 1 skipped, 2 xfailed).
    
    * Fix functional workflow review feedback
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    Co-authored-by: Copilot <copilot@github.com>
  • Python: Add second approval-required tool (set_stop_loss) to concurrent_builder_tool_approval sample (#4875)
    * Add set_stop_loss tool to concurrent_builder_tool_approval sample
    
    Add a second approval-gated tool (set_stop_loss) to the concurrent workflow
    tool approval sample to demonstrate handling approval requests for different
    tools in the same concurrent workflow.
    
    Changes:
    - Add set_stop_loss(symbol, stop_price) with approval_mode='always_require'
    - Include new tool in both agents' tool lists
    - Update agent instructions and prompt to encourage stop-loss usage
    - Update docstring to reflect two approval-gated tools
    - Update sample output to show mixed approval requests
    
    Fixes #4874
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Print tool name and arguments in concurrent sample's process_event_stream (#4874)
    
    Align process_event_stream in concurrent_builder_tool_approval.py to print
    the tool name and arguments when collecting approval requests, matching the
    sample output comment and the sequential_builder_tool_approval.py pattern.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Add None-guard for function_call access in tool approval sample (#4874)
    
    Add explicit None-checks before accessing function_call.name and
    function_call.arguments in concurrent_builder_tool_approval.py. The
    function_call field is typed Content | None, so direct attribute access
    without a guard could raise AttributeError and required type: ignore
    comments. The None-guard is consistent with the pattern used in
    _agent_run.py and removes the suppression comments.
    
    Also add a regression test verifying that function_call defaults to None
    and that the None-guard pattern is safe.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Apply same function_call None-guard to sibling tool-approval samples (#4874)
    
    Apply the same fix to sequential_builder_tool_approval.py and
    group_chat_builder_tool_approval.py, which had the identical pattern
    of accessing function_call.name/arguments without a None-guard.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <copilot@github.com>
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: Stop emitting duplicate reasoning content from OpenAI response.reasoning_text.done and response.reasoning_summary_text.done events (#5162)
    * Fix reasoning text done events duplicating streamed delta content (#5157)
    
    The OpenAI Responses API sends both reasoning_text.delta (incremental
    chunks) and reasoning_text.done (full accumulated text) events. The
    chat client was emitting Content for both, causing ag-ui to append the
    full done text onto already-accumulated delta text, producing
    duplicated reasoning output.
    
    Stop emitting Content for reasoning_text.done and
    reasoning_summary_text.done events, matching how output_text.done is
    already handled (not emitted). The deltas contain all the content;
    the done event is redundant.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix(openai): emit reasoning done content as fallback when no deltas observed (#5157)
    
    Address PR review feedback:
    - Track item_ids that received reasoning deltas via seen_reasoning_delta_item_ids set
    - Emit content from done events only when no deltas were received for the
      item_id, preventing silent content loss on stream resumption
    - Add comment documenting code_interpreter done event asymmetry
    - Replace redundant ag-ui test with deduplication-focused test
    - Add integration test for delta+done sequence in OpenAI chat client tests
    - Add fallback path tests for done events without preceding deltas
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address review feedback for #5157: Python: [Bug]: "type": "response.reasoning_text.delta" and "response.reasoning_text.done" both get exposed as "text_reasoning"
    
    * Fix AG-UI reasoning streaming to use proper Start/End pattern (#5157)
    
    _emit_text_reasoning now follows the same streaming pattern as _emit_text:
    - Emits ReasoningStartEvent/ReasoningMessageStartEvent only on the first
      delta for a given message_id
    - Emits only ReasoningMessageContentEvent for subsequent deltas
    - Defers ReasoningMessageEndEvent/ReasoningEndEvent until
      _close_reasoning_block is called (on content type switch or end-of-run)
    
    This produces the correct protocol pattern:
      ReasoningStartEvent
        ReasoningMessageStartEvent
        ReasoningMessageContentEvent(delta1)
        ReasoningMessageContentEvent(delta2)
        ReasoningMessageEndEvent
      ReasoningEndEvent
    
    Instead of wrapping every delta in a full Start→End sequence.
    
    Backward compatibility is preserved: calling _emit_text_reasoning without
    a flow argument still produces the full sequence per call.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix import ordering lint error in AG-UI test file (#5157)
    
    Move inline import of TextMessageContentEvent to the top-level import
    block and ensure alphabetical ordering to satisfy ruff I001 rule.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix mypy error: rename loop variable to avoid type conflict with WorkflowEvent
    
    The 'event' variable was already typed as WorkflowEvent[Any] from the
    async for loop at line 590. Reusing it in the _close_reasoning_block
    loop (which returns list[BaseEvent]) caused an incompatible assignment
    error. Renamed to 'reasoning_evt' to avoid the conflict.
    
    Fixes #5162
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address review feedback for #5157: review comment fixes
    
    * narrow test result reporting to explicit pytest JUnit XML
    
    * Fix test args
    
    * Fix pytest-results-action in merge workflow and remove committed test artifacts
    
    Apply the same JUnit XML fix from python-tests.yml to python-merge-tests.yml:
    add --junitxml=pytest.xml to all test commands and narrow the results action
    path from ./python/**.xml to ./python/pytest.xml. Also remove accidentally
    committed pytest.xml and python-coverage.xml and add them to .gitignore.
    
    ---------
    
    Co-authored-by: Copilot <copilot@github.com>
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: Add Cosmos DB NoSQL Checkpoint Storage for Python Workflows (#4916)
    * Add CosmosCheckpointStorage for Python workflow checkpointing
    
    Add native Cosmos DB NoSQL support for workflow checkpoint storage in the
    Python agent-framework-azure-cosmos package, achieving parity with the
    existing .NET CosmosCheckpointStore.
    
    New files:
    - _checkpoint_storage.py: CosmosCheckpointStorage implementing the
      CheckpointStorage protocol with 6 methods (save, load, list_checkpoints,
      delete, get_latest, list_checkpoint_ids)
    - test_cosmos_checkpoint_storage.py: Unit and integration tests
    - workflow_checkpointing.py: Sample demonstrating Cosmos DB-backed
      workflow checkpoint/resume
    
    Auth support:
    - Managed identity / RBAC via Azure credential objects
      (DefaultAzureCredential, ManagedIdentityCredential, etc.)
    - Key-based auth via account key string or AZURE_COSMOS_KEY env var
    - Pre-created CosmosClient or ContainerProxy
    
    Key design decisions:
    - Partition key: /workflow_name for efficient per-workflow queries
    - Serialization: Reuses encode/decode_checkpoint_value for full Python
      object fidelity (hybrid JSON + pickle approach)
    - Container auto-creation via create_container_if_not_exists
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Adding cosmos checkpointer
    
    * Resolving comments
    
    * Fixing builds
    
    * Adding sample for history provider and checkpoint storage
    
    * Resolving comments
    
    * fixing builds
    
    * Resolving comments
    
    ---------
    
    Co-authored-by: Aayush Kataria <aayushkataria@Aayushs-MacBook-Pro-2.local>
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    Co-authored-by: Evan Mattson <35585003+moonbox3@users.noreply.github.com>
  • Python: [BREAKING] update to v1.0.0 (#5062)
    * updates to final deprecated pieces and versions
    
    * fix mypy
    
    * fix readme links
  • Python: [BREAKING] Python: move Azure AI embeddings to Foundry (#5056)
    * renamed AzureAIINferenceEmbeddings and lazy load azure-cosmos and env var rename
    
    * updated coverage
    
    * fix readme
  • [BREAKING] Python: Refactor workflows kwargs (#5010)
    * Refactor workflows kwargs usage
    
    * Update sample
    
    * Add tests
    
    * Update samples
    
    * Fix formatting
    
    * Comments
    
    * Comments 2
    
    * Comments 3
    
    * Fix test and typing
  • Python: Move workflow-samples and agent-samples under declarative-agents directory (#5011)
    * Move workflow-samples and agent-samples under declarative-agents and update all references
    
    Agent-Logs-Url: https://github.com/microsoft/agent-framework/sessions/f70f7d19-9256-4eec-b7db-28007d74440c
    
    Co-authored-by: sphenry <6749825+sphenry@users.noreply.github.com>
    
    * Fix relative paths in README files inside moved directories
    
    Agent-Logs-Url: https://github.com/microsoft/agent-framework/sessions/f70f7d19-9256-4eec-b7db-28007d74440c
    
    Co-authored-by: sphenry <6749825+sphenry@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
    Co-authored-by: sphenry <6749825+sphenry@users.noreply.github.com>
    Co-authored-by: Shawn Henry <shahen@microsoft.com>
  • Python: [BREAKING] Remove deprecated Python OpenAI/Azure AI surfaces (#4990)
    * [BREAKING] Remove deprecated Python OpenAI/Azure AI surfaces
    
    Also clean up follow-on docs, environment guidance, package metadata, and lab test stability.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix deleted semantic-kernel sample links
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address PR review feedback
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * improve foundry language
    
    * Fix A2A Foundry sample regression
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: Foundry Evals integration for Python (#4750)
    * Foundry Evals integration for Python
    
    Merged and refactored eval module per Eduard's PR review:
    
    - Merge _eval.py + _local_eval.py into single _evaluation.py
    - Convert EvalItem from dataclass to regular class
    - Rename to_dict() to to_eval_data()
    - Convert _AgentEvalData to TypedDict
    - Simplify check system: unified async pattern with isawaitable
    - Parallelize checks and evaluators with asyncio.gather
    - Add all/any mode to tool_called_check
    - Fix bool(passed) truthy bug in _coerce_result
    - Remove deprecated function_evaluator/async_function_evaluator aliases
    - Remove _MinimalAgent, tighten evaluate_agent signature
    - Set self.name in __init__ (LocalEvaluator, FoundryEvals)
    - Limit FoundryEvals to AsyncOpenAI only
    - Type project_client as AIProjectClient
    - Remove NotImplementedError continuous eval code
    - Add evaluation samples in 02-agents/ and 03-workflows/
    - Update all imports and tests (167 passing)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: resolve mypy redundant-cast errors while keeping pyright happy
    
    Use cast(list[Any], x) with type: ignore[redundant-cast] comments to
    satisfy both mypy (which considers casting Any redundant) and pyright
    strict mode (which needs explicit casts to narrow Unknown types).
    
    Also fix evaluator decorator check_name type annotation to be
    explicitly str, resolving mypy str|Any|None mismatch.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: CI failures — pyupgrade, evaluator overloads, sample API, reset attr
    
    - Apply pyupgrade: Sequence from collections.abc, remove forward-ref quotes
    - Add @overload signatures to evaluator() for proper @evaluator usage
    - Fix evaluate_workflow sample to use WorkflowBuilder(start_executor=) API
    - Fix _workflow.py executor.reset() to use getattr pattern for pyright
    - Remove unused EvalResults forward-ref string in default_factory lambda
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: skip gRPC-dependent observability test
    
    The test_configure_otel_providers_with_env_file_and_vs_code_port test
    triggers gRPC OTLP exporter creation, but the grpc dependency is
    optional and not installed by default. Add skipif decorator matching
    the pattern used by all other gRPC exporter tests in the same file.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: add nosec B101 for bandit assert check
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * style: align eval samples with repo conventions
    
    - Move module docstrings before imports (after copyright header)
    - Add -> None return type to all main() and helper functions
    - Fix line-too-long in multiturn sample conversation data
    - Add Workflow import for typed return in all_patterns_sample
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address PR review feedback: async fixes, sample bugs, deprecation warnings
    
    - Simplify _ensure_async_result to direct await (async-only clients)
    - Replace get_event_loop() with get_running_loop()
    - Narrow _fetch_output_items exception handling to specific types
    - Add warning log when _filter_tool_evaluators falls back to defaults
    - Add DeprecationWarning to options alias in Agent.__init__
    - Add DeprecationWarning to evaluate_response()
    - Rename raw key to _raw_arguments in convert_message fallback
    - Fix evaluate_agent_sample.py: replace evals.select() with FoundryEvals()
    - Fix evaluate_multiturn_sample.py: use Message/Content/FunctionTool types
    - Fix evaluate_workflow_sample.py: replace evals.select() with FoundryEvals()
    - Update test mocks to use AsyncMock for awaited API calls
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Add test coverage for review feedback items
    
    - Add num_repetitions=2 positive test verifying 2×items and 4 agent calls
    - Add _poll_eval_run tests: timeout, failed, and canceled paths
    - Add evaluate_traces tests: validation error, response_ids path, trace_ids path
    - Add evaluate_foundry_target happy-path test with target/query verification
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix ruff ISC004 lint error and apply formatter
    
    - Wrap implicit string concatenation in parens in evaluate_multiturn_sample.py
    - Apply ruff formatter to 6 other files with minor formatting drift
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Remove core type changes (extracted to fix/workflow-stale-session branch)
    
    Reverts changes to _agents.py, _agent_executor.py, and _workflow.py
    back to upstream/main. These fixes are now in a separate PR.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address PR review round 2: bugs, tests, and architecture
    
    Code fixes:
    - Fix _normalize_queries inverted condition (single query now replicates
      to match expected_count)
    - Fix substring match bug: 'end' in 'backend' matched; use exact set
      lookup for executor ID filtering
    - Fix used_available_tools sample: tool_definitions→tools param, use
      FunctionTool attribute access instead of dict .get()
    - Add None-check in _resolve_openai_client for misconfigured project
    - Add Returns section to evaluate_workflow docstring
    - Cache inspect.signature in @evaluator wrapper (avoid per-item reflection)
    
    Architecture:
    - Extract _evaluate_via_responses as module-level helper; evaluate_traces
      now calls it directly instead of creating a FoundryEvals instance
    - Move Foundry-specific typed-content conversion out of core to_eval_data;
      core now returns plain role/content dicts, FoundryEvals applies
      AgentEvalConverter in _evaluate_via_dataset
    
    Tests:
    - evaluate_response() deprecation warning emission and delegation
    - num_repetitions > 1 with expected_output and expected_tool_calls
    - Mock output_items.list in test_evaluate_calls_evals_api
    - Update to_eval_data assertions for plain-dict format
    - Unknown param error now raised at @evaluator decoration time
    
    Skipped (separate PR): executor reset loop, xfail removal, options alias
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix CI: revert test_full_conversation, fix pyright errors
    
    - Revert test_full_conversation.py to upstream/main (the session
      preservation test was incorrectly changed to assert clearing)
    - Fix pyright reportUnnecessaryComparison on get_openai_client() None
      check by adding ignore comment
    - Fix pyright reportPrivateUsage: add public EvalItem.split_messages()
      method and use it in FoundryEvals._evaluate_via_dataset instead of
      accessing private _split_conversation
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address PR review round 3: reliability, test gaps, cleanup
    
    - Add try/except guard for non-numeric score in _coerce_result
    - Add poll_interval minimum bound (0.1s) to prevent tight loops
    - Add runtime async client check in _resolve_openai_client
    - Remove _ensure_async_result wrapper (10 call sites → direct await)
    - Better error message when queries provided without agent
    - Import-time asserts for evaluator set consistency
    - Remove 28 redundant @pytest.mark.asyncio decorators
    - Add doc note about _raw_arguments sensitive data
    - Tests: tool_called_check mode=any, _normalize_queries branches,
      _extract_result_counts paths, _extract_per_evaluator, bare check
      via evaluate_agent, output_items assertion, modulo wrapping,
      async client check, queries-without-agent error
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix CI: ruff S101 assert, pyright and mypy arg-type errors
    
    - Replace module-level assert with if/raise for evaluator set
      consistency checks (ruff S101 disallows bare assert)
    - Add type: ignore[arg-type] and pyright: ignore[reportArgumentType]
      on OpenAI SDK evals API calls that pass dicts where typed params
      are expected (SDK accepts dicts at runtime)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address PR review round 4: bugs, reliability, test fixes
    
    - Fix all_passed ignoring parent result_counts when sub_results present
    - Fix _extract_tool_calls: parse string arguments via json.loads before
      falling back to None (real LLM responses use string arguments)
    - Sanitize _raw_arguments to '[unparseable]' to avoid leaking sensitive
      tool-call data to external evaluation services
    - Add NOTE comment on to_eval_data message serialization dropping
      non-text content (tool calls, results)
    - Eliminate double conversation split in _evaluate_via_dataset: build
      JSONL dicts directly from split_messages + AgentEvalConverter
    - Raise poll_interval floor from 0.1s to 1.0s to prevent rate-limit
      exhaustion
    - Fix MagicMock(name=...) bug in test: sets display name not .name attr
    - Fix mock_output_item.sample: use MagicMock object instead of dict so
      _fetch_output_items exercises error/usage/input/output extraction
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address PR review round 5: reliability, docs, test coverage
    
    Code fixes:
    - Move import-time RuntimeError checks to unit tests (avoids breaking
      imports for all users on developer set-drift mistake)
    - _filter_tool_evaluators now raises ValueError when all evaluators
      require tools but no items have tools (was silently substituting)
    - Add poll_interval upper bound (60s) to prevent single-iteration sleep
    - Log exc_info=True in _fetch_output_items for debugging API changes
    - Fix evaluate() docstring: remove claim about Responses API optimization
    - Validate target dict has 'type' key in evaluate_foundry_target
    - Document to_eval_data() limitation: non-text content is omitted
    
    Tests:
    - TestEvaluatorSetConsistency: verify _AGENT/_TOOL subsets of _BUILTIN
    - TestEvaluateTracesAgentId: agent_id-only path with lookback_hours
    - TestFilterToolEvaluatorsRaises: ValueError on all-tool no-items
    - TestEvaluateFoundryTargetValidation: target without 'type' key
    - Assert items==[] on failed/canceled poll results
    - Mock output_items.list in response_ids test for full flow
    - TestAllPassedSubResults: result_counts=None + sub_results delegation
      and parent failures override sub_results
    - TestBuildOverallItemEmpty: empty workflow outputs returns None
    
    Skipped r5-07 (_raw_arguments length hint): marginal debugging value,
    could leak content size information.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix error message: evaluate_responses() → evaluate_traces(response_ids=...)
    
    The referenced function doesn't exist; the correct API is
    evaluate_traces(response_ids=...) from the azure-ai package.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Remove dead to_eval_data() method, fix docstring claims
    
    - Remove to_eval_data() from EvalItem (dead code after r4-05 JSONL refactor)
    - Migrate 15 tests from to_eval_data() to split_messages()
    - Update sample to use split_messages() + Message properties
    - Remove unimplemented Responses API optimization docstring claim
    - Update split_messages() docstring to not reference removed method
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Reduce default eval timeout from 600s to 180s (3 minutes)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Remove dead _evaluate_via_responses method from FoundryEvals
    
    The method was never called — evaluate() uses _evaluate_via_dataset,
    and evaluate_traces() calls _evaluate_via_responses_impl directly.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Revert unrelated formatting changes to get-started samples
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix pyright: remove phantom FoundryMemoryProvider import, apply ruff format
    
    - Remove import of non-existent _foundry_memory_provider module
      (incorrectly kept during rebase conflict resolution)
    - Apply ruff formatter to test_local_eval.py and get-started samples
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix eval samples: use FoundryChatClient for Agent()
    
    The upstream provider-leading client refactor (#4818) made client=
    a required parameter on Agent(). Update the three getting-started
    eval samples to use FoundryChatClient with FOUNDRY_PROJECT_ENDPOINT,
    matching the standard pattern from 01-get-started samples.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Simplify self-reflection sample using FoundryEvals
    
    Replace ~80 lines of manual OpenAI evals API code (create_eval,
    run_eval, manual polling, raw JSONL params) with FoundryEvals:
    
    - evaluate_groundedness() uses FoundryEvals.evaluate() with EvalItem
    - Remove create_openai_client(), create_eval(), run_eval() functions
    - Remove openai SDK type imports (DataSourceConfigCustom, etc.)
    - run_self_reflection_batch creates FoundryEvals instance once,
      reuses it for all iterations across all prompts
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Update eval samples to FoundryChatClient and FOUNDRY_PROJECT_ENDPOINT
    
    - Migrate all foundry_evals samples from AzureOpenAIResponsesClient to FoundryChatClient
    - Update env var from AZURE_AI_PROJECT_ENDPOINT to FOUNDRY_PROJECT_ENDPOINT
    - Use AzureCliCredential consistently across all samples
    - Fix README.md: correct function names (evaluate_dataset -> FoundryEvals.evaluate, evaluate_responses -> evaluate_traces)
    - Update self_reflection .env.example and README.md
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix lint errors in eval samples (E501, ASYNC240, formatting)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Remove evaluate_all_patterns_sample.py (redundant with focused samples)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix async credential mismatch: use azure.identity.aio for async AIProjectClient
    
    AIProjectClient from azure.ai.projects.aio requires an async credential.
    Switch all foundry_evals samples from azure.identity.AzureCliCredential
    to azure.identity.aio.AzureCliCredential. Also pass project_client to
    FoundryChatClient instead of duplicating endpoint+credential.
    
    Close credential in self_reflection sample to avoid resource leak.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Revert test_observability.py to upstream/main (not our test)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address moonbox3 review: sphinx docstrings, pagination, isinstance check
    
    - Convert all Example:: / Typical usage:: code blocks to .. code-block:: python
      format matching codebase convention (both _evaluation.py and _foundry_evals.py)
    - Add async pagination in _fetch_output_items via async for (handles large result sets)
    - Replace hasattr(__aenter__) with isinstance(client, AsyncOpenAI) in _resolve_openai_client
    - Move AsyncOpenAI import from TYPE_CHECKING to runtime (needed for isinstance)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix test failures and address remaining moonbox3 review comments
    
    - Fix tests: use MagicMock(spec=AsyncOpenAI) for project_client mocks
      (isinstance check now requires proper type, not duck-typing)
    - Fix tests: replace mock_page.__iter__ with _AsyncPage helper for async for
    - Fix evaluate_response: auto-extract queries from response messages when
      query is not provided (previously always raised ValueError)
    - Add debug logging when skipping internal _-prefixed executor IDs
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address Tao's PR review comments on Foundry Evals
    
    - T1: Add comment explaining builtin.* pass-through in _resolve_evaluator
    - T2: Add comment referencing OpenAI evals API for testing_criteria dict
    - T3: Document Mustache-style {{item.*}} template placeholders
    - T4: Document poll loop 60s sleep upper bound rationale
    - T5: Narrow run type to RunRetrieveResponse, use typed field access
      instead of vars()/getattr dance in _extract_result_counts and
      _extract_per_evaluator; use run.error and run.report_url directly
    - T6: Clarify openai_client docstring re: Azure Foundry endpoint
    - T8: Remove misleading empty expected_tool_calls from sample
    - Update tests to match real SDK PerTestingCriteriaResult shape
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Remove unnecessary Any union from run type annotations
    
    RunRetrieveResponse is the correct type — no backward compat needed
    for a brand new feature.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Accept FoundryChatClient instead of raw AsyncOpenAI
    
    FoundryEvals now takes client: FoundryChatClient as its primary
    parameter instead of openai_client: AsyncOpenAI.  The builtin.*
    evaluators require a Foundry endpoint, so the type should reflect that.
    
    - FoundryEvals.__init__: client: FoundryChatClient replaces openai_client
    - evaluate_traces / evaluate_foundry_target: same change
    - _resolve_openai_client: extracts .client from FoundryChatClient
    - project_client fallback retained for standalone functions
    - All samples updated to construct FoundryChatClient and pass as client=
    - Tests updated (openai_client= → client=)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Remove implicit 60s upper bound on poll interval
    
    If a developer sets a higher poll_interval, respect it. Only clamp
    to remaining time and enforce a 1s minimum for rate-limit protection.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Remove 1s floor on poll interval — let the developer control it
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Update python/samples/05-end-to-end/evaluation/foundry_evals/.env.example
    
    Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>
    
    * Update python/samples/02-agents/evaluation/evaluate_agent.py
    
    Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>
    
    * Address eavanvalkenburg review (round 2) on Python eval PR
    
    - Rename model_deployment -> model across FoundryEvals and all samples
    - Make model param optional, resolves from client.model
    - Convert EvalResults from dataclass to regular class
    - Remove deprecated evaluate_response() function
    - Refactor splitters: BUILT_IN_SPLITTERS dict + standalone functions
    - Change per_turn_items from classmethod to staticmethod
    - Simplify EvalCheck type alias to use Awaitable[CheckResult]
    - Remove errored property from EvalResults
    - Remove default value from Evaluator protocol eval_name
    - Rename assert_passed -> raise_for_status, add EvalNotPassedError
    - Type agent param as SupportsAgentRun | None
    - Fix Arguments docstring
    - Update __init__.py exports
    - Update all tests and samples
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Move FoundryEvals to foundry package, split tool eval sample
    
    - Move _foundry_evals.py from azure-ai to foundry package
    - Move test_foundry_evals.py to foundry/tests/
    - Update lazy re-exports in agent_framework.foundry namespace
    - Update .pyi type stubs
    - All samples now import from agent_framework.foundry
    - Split tool-call evaluation into evaluate_tool_calls_sample.py
    - Fix all_passed to check errored count from result_counts
    - Fix raise_for_status to include errored item details
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Auto-create FoundryChatClient from env vars when no client provided
    
    FoundryEvals() now works zero-config when FOUNDRY_PROJECT_ENDPOINT and
    FOUNDRY_MODEL environment variables are set. Auto-creates a FoundryChatClient
    under the hood, matching the established env var pattern.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix pyright errors: remove dead _normalize_queries, suppress EvalAPIError check
    
    - Remove unused _normalize_queries function and its tests
    - Add pyright ignore for EvalAPIError None check (defensive guard)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Support multimodal image content in eval pipeline
    
    Add image (data/uri) content handling to AgentEvalConverter.convert_message()
    so that Content.from_data() and Content.from_uri() image payloads are
    preserved as input_image parts in the Foundry evaluator format.
    
    - Handle Content type='data' and type='uri' → emit input_image parts
    - Add 6 unit tests for image content through convert_message/convert_messages
    - Add integration test verifying images flow through EvalItem → JSONL path
    - Add evaluate_multimodal.py sample demonstrating local image eval
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address remaining review comments
    
    - Fix project_client docstring to say async-only (not sync/async)
    - Add builtin evaluator name validation warning in _resolve_evaluator
    - Replace getattr with typed attribute access in _poll_eval_run,
      _extract_result_counts, _extract_per_evaluator, _fetch_output_items
    - Remove cast import from _foundry_evals (no longer needed)
    - Tighten _coerce_result: honour explicit 'passed' when both 'score'
      and 'passed' are present; remove performative cast
    - Fix self_reflection sample: add env file existence check
    - Fix traces sample: correct Pattern 2 section label
    - Update all Foundry eval samples to FoundryChatClient + FOUNDRY_MODEL
      (remove AIProjectClient + AZURE_AI_MODEL_DEPLOYMENT_NAME pattern)
    - Add eval_name and OpenAI client docs to FoundryEvals docstring
    - Update test mocks to match typed SDK objects (_MockResultCounts)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix ruff lint errors (E501, SIM108, SIM102)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix pyright errors: type-narrow dict to dict[str, Any], add ignore comments
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Replace ConversationSplitter type alias with Protocol
    
    ConversationSplitter is now a runtime-checkable Protocol with a named
    'conversation' parameter, making the expected signature self-documenting.
    
    ConversationSplit enum members gain a __call__ method so they satisfy
    the protocol directly -- ConversationSplit.LAST_TURN(conversation) works.
    
    This simplifies _split_conversation from an isinstance dispatch to a
    single split(conversation) call.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Standardize on AZURE_AI_MODEL_DEPLOYMENT_NAME and fix Unicode in samples
    
    - Replace FOUNDRY_MODEL with AZURE_AI_MODEL_DEPLOYMENT_NAME in all
      eval samples to match repo convention
    - Replace Unicode symbols with ASCII equivalents in all eval sample
      print statements to avoid cp1252 encoding errors on Windows
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Update python/samples/03-workflows/evaluation/evaluate_workflow.py
    
    Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>
    
    * Apply suggestions from code review
    
    Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>
    
    * Rename ADR 0020 to 0023 (foundry evals integration)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: alliscode <bentho@microsoft.com>
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>
  • Python: Fix samples (#4980)
    * First samples 1st batch
    
    * Fix sample paths
    
    * Fix workflow samples
    
    * Fix workflow dependency
    
    * Correct env vars
    
    * Increase idle timeout
    
    * Fix workflows HIL sample
    
    * Fix more workflow samples
  • Python: [BREAKING] Python: Provider-leading client design & OpenAI package extraction (#4818)
    * Python: Provider-leading client design & OpenAI package extraction
    
    Major refactoring of the Python Agent Framework client architecture:
    
    - Extract OpenAI clients into new `agent-framework-openai` package
    - Core package no longer depends on openai, azure-identity, azure-ai-projects
    - Rename clients for discoverability: OpenAIResponsesClient → OpenAIChatClient,
      OpenAIChatClient → OpenAIChatCompletionClient
    - Unify `model_id`/`deployment_name`/`model_deployment_name` → `model` param
    - New FoundryChatClient for Azure AI Foundry Responses API
    - New FoundryAgent/FoundryAgentClient for connecting to pre-configured Foundry agents
    - Remove OpenAIBase/OpenAIConfigMixin from non-deprecated client MRO
    - Deprecate AzureOpenAI* clients, AzureAIClient, OpenAIAssistantsClient
    - Reorganize samples: azure_openai+azure_ai+azure_ai_agent → azure/
    - ADR-0020: Provider-Leading Client Design
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: missing Agent imports in samples, .model_id → .model in foundry_local sample
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: CI failures — mypy errors, coverage targets, sample imports
    
    - azure-ai mypy: add type ignores for TypedDict total=, model arg, forward ref
    - Coverage: replace core.azure/openai targets with openai package target
    - project_provider: add type annotation for opts dict
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: populate openai .pyi stub, fix broken README links, coverage targets
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fixes
    
    * updated observabilitty
    
    * reset azure init.pyi
    
    * fix errors
    
    * updated adr number
    
    * fix foundry local
    
    * fixed not renamed docstrings and comments, and added deprecated markers to old classes
    
    * fix tests and pyprojects
    
    * fix test vars
    
    * updated function tests
    
    * update durable
    
    * updated test setup for functions
    
    * Fix Foundry auth in workflow samples
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Stabilize Python integration workflows
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Update hosting samples for Foundry
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Trigger full CI rerun
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Trigger CI rerun again
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * trigger rerun
    
    * trigger rerun
    
    * fix for litellm
    
    * undo durabletask changes
    
    * Move Foundry APIs into foundry namespace
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix Foundry pyproject formatting
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Split provider samples by Foundry surface
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Restore hosting sample requirements
    
    Also fix the Foundry Local sample link after the provider sample move.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * updated tests
    
    * udpated foundry integration tests
    
    * removed dist from azurefunctions tests
    
    * Use separate Foundry clients for concurrent agents
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix client setup in azfunc and durable
    
    * disabled two tests
    
    * updated setup for some function and durable tests
    
    * improved azure openai setup with new clients
    
    * ignore deprecated
    
    * fixes
    
    * skip 11
    
    * remove openai assistants int tests
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • [BREAKING] Python: Add context mode to AgentExecutor (#4668)
    * Add context mode to AgentExecutor
    
    * Fix unit tests
    
    * Address comments
    
    * Address comments
    
    * REvise context mode and add tests
    
    * Add chain config to sequential builder
    
    * Add sample
    
    * Fix pipeline
    
    * Address comments
    
    * Address comments
  • Python: fix(python): Use AgentResponse.value instead of model_validate_json in HITL sample (#4405)
    * fix(python): use AgentResponse.value instead of model_validate_json in HITL sample
    
    Since the agent is configured with response_format=GuessOutput, the
    AgentResponse already provides .value with the parsed Pydantic model.
    Using .value is more idiomatic and avoids redundant JSON parsing.
    
    Fixes #4396
    
    * fix: add safety guard for AgentResponse.value being None
    
    Address Copilot review feedback: .value is optional and may be None
    if response_format isn't propagated through the streaming path.
    Add an explicit None check with a clear error message.
  • Python: Update workflow orchestration samples to use AzureOpenAIResponsesClient (#4285)
    * Update workflow orchestration samples to use AzureOpenAIResponsesClient
    
    * Fix broken link
  • [BREAKING] Python: Add InvokeFunctionTool action for declarative workflows (#3716)
    * add(declarative): Declarative workflow InvokeFunctionTool feature
    
    * Cleanup
    
    * Address PR feedback
    
    * Remove InvokeTool kind, consolidate to InvokeFunctionTool
    
    * Fix sample locations
    
    * pin azure-ai-projects to 2.0.0b3 due to breaking changes
  • Python: Add CreateConversationExecutor, fix input routing, remove unused handler layer (#4159)
    * Fixed declarative deep research sample
    
    * Small fix
    
    * Resolved comment
    
    * Add CreateConversationExecutor, fix input routing, remove unused handler layer
    
    * Address Copilot feedback
    
    * Fix System.ConversationId
    
    ---------
    
    Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>
  • Python: fix reasoning model workflow handoff and history serialization (#4083)
    * fix: strip function_call and text_reasoning from cross-agent workflow handoff
    
    When a reasoning model (e.g. gpt-5-mini) runs as Agent 1 in a workflow, its
    response includes text_reasoning items (with server-scoped IDs like rs_XXXX)
    and function_call items. Forwarding these to Agent 2 in a fresh conversation
    caused API errors because the reasoning/call IDs are scoped to the original
    stored response context.
    
    Changes:
    - Strip 'function_call', 'text_reasoning', 'function_approval_request', and
      'function_approval_response' from handoff messages in _agent_executor.py
    - Keep 'function_result' so the actual tool output content is preserved for
      the next agent's context
    - Update unit tests to reflect that function_result messages survive handoff
      (messages grow from 2→3: user, tool(result), assistant(summary))
    - Fix incorrect test assertions in test_function_invocation_stop_clears_*
      that assumed the client layer updates session.service_session_id
    - Also fixed _extract_function_calls to search all messages with call_id
      deduplication, and the error-limit stop path to submit function_call_output
      items before halting (via tool_choice=none cleanup call)
    
    Relates to: https://github.com/microsoft/agent-framework/issues/4047
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: reasoning model workflow handoff and history serialization
    
    Fixes multiple related issues when using reasoning models (gpt-5-mini,
    gpt-5.2) in multi-agent workflows that chain agents via from_response
    or replay full conversation history via AgentExecutorRequest.
    
    ## Reasoning items always emitted on output_item.added
    
    When a reasoning model produces encrypted or hidden reasoning (no
    visible text), the Responses API still fires a reasoning output item
    without any reasoning_text.delta events. Previously no text_reasoning
    Content was emitted in that case, making it invisible to downstream
    logic. Both the non-streaming (_parse_response_from_openai) and
    streaming (output_item.added) paths now always emit at least one
    text_reasoning Content — with empty text if no content is available —
    so co-occurrence detection and serialization guards work reliably.
    
    ## Reasoning items only serialized when paired with a function_call
    
    The Responses API only accepts reasoning items in input when they
    directly preceded a function_call in the original response. Sending a
    reasoning item that preceded a text response (no tool call) causes:
      "reasoning was provided without its required following item"
    _prepare_message_for_openai now checks has_function_call per message
    and skips text_reasoning serialization when there is no accompanying
    function_call.
    
    ## summary field is an array, not an object
    
    The reasoning item summary field sent to the Responses API must be an
    array of objects ([{"type": "summary_text", "text": ...}]), not a
    single object. Fixed _prepare_content_for_openai accordingly.
    
    ## service_session_id cleared when explicit history is provided
    
    When a workflow coordinator replays a full conversation (including
    function calls from a previous agent run) back to an executor via
    AgentExecutorRequest or from_response, the executor's session still
    held a service_session_id (previous_response_id) from the prior run.
    The API then received the same function-call items twice — once from
    previous_response_id (server-stored) and once from the explicit input —
    causing: "Duplicate item found with id fc_...".
    
    AgentExecutor.run (when should_respond=True) and from_response now
    reset self._session.service_session_id = None before running so that
    explicit input is the sole source of conversation context.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * small improvements in text reasoning
    
    * refactor: add reset_service_session to AgentExecutorRequest for explicit history replay
    
    Replace the implicit 'always clear service_session_id when should_respond=True'
    with an explicit opt-in field on AgentExecutorRequest.
    
    The old approach used should_respond=True as a proxy for 'full history replay',
    but that conflates two distinct intents:
    - Orchestrations group chat sends should_respond=True with an empty/single-message
      list (not a full replay) — unnecessarily clearing service_session_id.
    - HITL / feedback coordinators send the full prior conversation and truly need
      a fresh service session ID to avoid duplicate-item API errors.
    
    Changes:
    - Add AgentExecutorRequest.reset_service_session: bool = False
    - AgentExecutor.run only clears service_session_id when this flag is True
    - AgentExecutor.from_response unchanged (always clears; always full conversation)
    - Set reset_service_session=True in all full-history-replay call sites:
      agents_with_HITL.py, azure_chat_agents_tool_calls_with_feedback.py,
      autogen-migration round-robin coordinator, tau2 runner
    - Update _FullHistoryReplayCoordinator test helper to pass the flag
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * comment update
    
    * fixes from feedback
    
    * fix test
    
    * reverted changes to agent executor
    
    * fix: remove reset_service_session from tau2 runner
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * two other reverts
    
    * fix sample
    
    ---------
    
    Co-authored-by: Giles Odigwe <79032838+giles17@users.noreply.github.com>
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: Add load_dotenv() to samples for .env file support (#4043)
    * Initial plan
    
    * Add load_dotenv() to 303 Python samples for environment variable loading
    
    Co-authored-by: TaoChenOSU <12570346+TaoChenOSU@users.noreply.github.com>
    
    * Update SAMPLE_GUIDELINES.md to document load_dotenv() requirement
    
    Co-authored-by: TaoChenOSU <12570346+TaoChenOSU@users.noreply.github.com>
    
    * Update samples README.md to document .env file usage
    
    Co-authored-by: TaoChenOSU <12570346+TaoChenOSU@users.noreply.github.com>
    
    * Run ruff format on all changed sample files
    
    Co-authored-by: TaoChenOSU <12570346+TaoChenOSU@users.noreply.github.com>
    
    * Clarify load_dotenv() usage in README - local dev vs production
    
    Co-authored-by: TaoChenOSU <12570346+TaoChenOSU@users.noreply.github.com>
    
    * Remove deprecated getting_started folder as requested
    
    Co-authored-by: TaoChenOSU <12570346+TaoChenOSU@users.noreply.github.com>
    
    * Document env_file_path parameter for per-client configuration
    
    Co-authored-by: TaoChenOSU <12570346+TaoChenOSU@users.noreply.github.com>
    
    * Merge main branch to resolve conflicts
    
    Co-authored-by: TaoChenOSU <12570346+TaoChenOSU@users.noreply.github.com>
    
    * Fix run_evaluation.py file that was empty in merge commit
    
    Co-authored-by: TaoChenOSU <12570346+TaoChenOSU@users.noreply.github.com>
    
    * Remove dotnet changes from merge - out of scope for this PR
    
    Co-authored-by: TaoChenOSU <12570346+TaoChenOSU@users.noreply.github.com>
    
    * Remove package and test changes from merge - only sample changes needed
    
    Co-authored-by: TaoChenOSU <12570346+TaoChenOSU@users.noreply.github.com>
    
    * Remove test_func_utils.py - only sample changes needed
    
    Co-authored-by: TaoChenOSU <12570346+TaoChenOSU@users.noreply.github.com>
    
    * Revert sample files not in original changeset - keep only load_dotenv additions
    
    Co-authored-by: TaoChenOSU <12570346+TaoChenOSU@users.noreply.github.com>
    
    * Move load_dotenv() outside snippet tag in 06_host_your_agent.py
    
    Co-authored-by: TaoChenOSU <12570346+TaoChenOSU@users.noreply.github.com>
    
    * Fix comment placement - move load_dotenv before code comments
    
    Co-authored-by: TaoChenOSU <12570346+TaoChenOSU@users.noreply.github.com>
    
    * Fix load_dotenv() placement across all samples - after docstring, before code comments
    
    Co-authored-by: TaoChenOSU <12570346+TaoChenOSU@users.noreply.github.com>
    
    * Merge latest main branch with load_dotenv changes
    
    Co-authored-by: TaoChenOSU <12570346+TaoChenOSU@users.noreply.github.com>
    
    * Remove non-sample changes from merge - keep only load_dotenv additions
    
    Co-authored-by: TaoChenOSU <12570346+TaoChenOSU@users.noreply.github.com>
    
    * Revert non-load_dotenv sample changes from merge
    
    Co-authored-by: TaoChenOSU <12570346+TaoChenOSU@users.noreply.github.com>
    
    * Fix run_evaluation.py - use main's improved version (file already had load_dotenv)
    
    Co-authored-by: TaoChenOSU <12570346+TaoChenOSU@users.noreply.github.com>
    
    * Manual update
    
    * Manual update 2
    
    * Fix Role usage and load_dotenv placement per PR review feedback
    
    Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>
    
    * Fix Role usage - use string literals not enum attributes
    
    Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>
    
    * Fix SAMPLE_GUIDELINES.md example - load_dotenv before docstring per guidance
    
    Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>
    
    * Move load_dotenv() before docstrings in all samples per SAMPLE_GUIDELINES ordering
    
    Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>
    
    * Address PR review: rename files, fix placement, add session usage, remove note
    
    Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>
    
    * Update Redis README to reference renamed file redis_history_provider.py
    
    Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
    Co-authored-by: TaoChenOSU <12570346+TaoChenOSU@users.noreply.github.com>
    Co-authored-by: Tao Chen <taochen@microsoft.com>
    Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>
    Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>
  • Python: Fix workflow samples for bugbash: part 1 (#4055)
    * Fix workflow samples for bugbash: part 1
    
    * Fix mypy
    
    * Fix tests
  • Python: [BREAKING] Scope provider state by source_id and standardize source IDs (#3995)
    * Initial plan
    
    * Add FoundryMemoryProvider and tests
    
    Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>
    
    * Add sample and documentation for FoundryMemoryProvider
    
    Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>
    
    * Address code review feedback for FoundryMemoryProvider
    
    Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>
    
    * Address PR review comments: Add DEFAULT_SOURCE_ID, use logging.getLogger, move state to session.state
    
    Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>
    
    * Fix Foundry memory ItemParam usage and exports
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Refactor provider hook state and standardize source IDs
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Support endpoint-based Foundry memory init
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix core README workflows link
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * updated implementation and sample
    
    * Split out Foundry memory provider changes
    
    Remove FoundryMemoryProvider implementation/tests/sample plus export and docs mentions from this branch so only non-Foundry changes remain.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Trigger CI rerun for PR #3995
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
    Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: Remove duplicate samples (#3899)
    * Remove duplicate samples
    
    * Correct paths
    
    * Update readme
    
    * Update readme
    
    * Fix ruff
    
    ---------
    
    Co-authored-by: Evan Mattson <35585003+moonbox3@users.noreply.github.com>
  • Python: Fix streamed workflow agent continuation context by finalizing AgentExecutor streams (#3882)
    * Fix streamed workflow agent continuation context by finalizing AgentExecutor streams
    
    * Fix stream handling
    
    * Fixes
    
    * Fix DevUI and tests
  • Python: [BREAKING] PR2 — Wire context provider pipeline, remove old types, update all consumers (#3850)
    * PR2: Wire context provider pipeline and update all internal consumers
    
    - Replace AgentThread with AgentSession across all packages
    - Replace ContextProvider with BaseContextProvider across all packages
    - Replace context_provider param with context_providers (Sequence)
    - Replace thread= with session= in run() signatures
    - Replace get_new_thread() with create_session()
    - Add get_session(service_session_id) to agent interface
    - DurableAgentThread -> DurableAgentSession
    - Remove _notify_thread_of_new_messages from WorkflowAgent
    - Wire before_run/after_run context provider pipeline in RawAgent
    - Auto-inject InMemoryHistoryProvider when no providers configured
    
    * fix: update all tests for context provider pipeline, fix lazy-loaders, remove old test files
    
    * refactor: update all sample files for context provider pipeline (AgentThread→AgentSession, ContextProvider→BaseContextProvider)
    
    * fix: update remaining ag-ui references (client docstring, getting_started sample)
    
    * fix: make get_session service_session_id keyword-only to avoid confusion with session_id
    
    * refactor: rename _RunContext.thread_messages to session_messages
    
    * refactor: remove _threads.py, _memory.py, and old provider files; migrate devui to use plain message lists
    
    * rename: remove _new_ prefix from test files
    
    * refactor: rewrite SlidingWindowChatMessageStore as SlidingWindowHistoryProvider(InMemoryHistoryProvider)
    
    * fix: read full history from session state directly instead of reaching into provider internals
    
    * fix: update stale .pyi stubs, sample imports, and README references for new provider types
    
    * fix: remove stale message_store, _notify_thread_of_new_messages, and session_id.key references in samples
    
    * refactor: merge context_providers and sessions sample folders into sessions, remove aggregate_context_provider
    
    * refactor: UserInfoMemory stores state in session.state instead of instance attributes
    
    * feat: add Pydantic BaseModel support to session state serialization
    
    Pydantic models stored in session.state are now automatically serialized
    via model_dump() and restored via model_validate() during to_dict()/from_dict()
    round-trips. Models are auto-registered on first serialization; use
    register_state_type() for cold-start deserialization.
    
    Also export register_state_type as a public API.
    
    * fix mem0
    
    * Update sample README links and descriptions for session terminology
    
    - Replace 'thread' with 'session' in sample descriptions across all READMEs
    - Update file links for renamed samples (mem0_sessions, redis_sessions, etc.)
    - Fix Threads section → Sessions section in main samples/README.md
    - Update tools, middleware, workflows, durabletask, azure_functions READMEs
    - Update architecture diagrams in concepts/tools/README.md
    - Update migration guides (autogen, semantic-kernel)
    
    * Fix broken Redis README link to renamed sample
    
    * Fix Mem0 OSS client search: pass scoping params as direct kwargs
    
    AsyncMemory (OSS) expects user_id/agent_id/run_id as direct kwargs,
    while AsyncMemoryClient (Platform) expects them in a filters dict.
    Adds tests for both client types.
    
    Port of fix from #3844 to new Mem0ContextProvider.
    
    * Fix rebase issues: restore missing _conversation_state.py and checkpoint decode logic
    
    - Add back _conversation_state.py (encode/decode_chat_messages) lost in rebase
    - Fix on_checkpoint_restore to decode cache/conversation with decode_chat_messages
    - Fix on_checkpoint_restore to use decode_checkpoint_value for pending requests
    - Add tests/workflow/__init__.py for relative import support
    - Fix test_agent_executor checkpoint selection (checkpoints[1] not superstep)
    
    * Add STORES_BY_DEFAULT ClassVar to skip redundant InMemoryHistoryProvider injection
    
    Chat clients that store history server-side by default (OpenAI Responses API,
    Azure AI Agent) now declare STORES_BY_DEFAULT = True. The agent checks this
    during auto-injection and skips InMemoryHistoryProvider unless the user
    explicitly sets store=False.
    
    * Fix broken markdown links in azure_ai and redis READMEs
    
    * Fix getting-started samples to use session API instead of removed thread/ContextProvider API
    
    * updates to workflow as agent
    
    * fix group chat import
    
    * Rename Thread→Session throughout, fix service_session_id propagation, remove stale AGUIThread
    
    - Fix: Propagate conversation_id from ChatResponse back to session.service_session_id
      in both streaming and non-streaming paths in _agents.py
    - Rename AgentThreadException → AgentSessionException
    - Remove stale AGUIThread from ag_ui lazy-loader
    - Rename use_service_thread → use_service_session in ag-ui package
    - Rename test functions from *_thread_* to *_session_*
    - Rename sample files from *_thread* to *_session*
    - Update docstrings and comments: thread → session
    - Update _mcp.py kwargs filter: add 'session' alongside 'thread'
    - Fix ContinuationToken docstring example: thread=thread → session=session
    - Fix _clients.py docstring: 'Agent threads' → 'Agent sessions'
    
    * Fix broken markdown links after thread→session file renames
    
    * fix azure ai test
  • Python: restructure: Python samples into progressive 01-05 layout (#3862)
    * restructure: Python samples into progressive 01-05 layout
    
    - 01-get-started/: 6 numbered steps (hello agent → hosting)
    - 02-agents/: all agent concept samples (tools, middleware, providers, etc.)
    - 03-workflows/: ALL existing workflow samples preserved as-is
    - 04-hosting/: azure-functions, durabletask, a2a
    - 05-end-to-end/: demos, evaluation, hosted agents
    - Old files moved to _to_delete/ for review
    - Added AGENTS.md with structure documentation
    - autogen-migration/ and semantic-kernel-migration/ preserved at root
    
    * fix: switch to AzureOpenAI Foundry, fix CI failures
    
    - Switch all 01-get-started samples to AzureOpenAIResponsesClient with
      Azure AI Foundry project endpoint (AZURE_AI_PROJECT_ENDPOINT +
      AZURE_OPENAI_RESPONSES_DEPLOYMENT_NAME + AzureCliCredential)
    - Add _to_delete/ and 05-end-to-end/ to pyrightconfig.samples.json excludes
    - Fix test paths in packages/ that referenced old getting_started/ dirs:
      durabletask conftest + streaming test, azurefunctions conftest,
      devui conftest + capture_messages + openai_sdk_integration
    - Fix workflow_as_agent_human_in_the_loop.py import (sibling import)
    - Update hosting READMEs and tool comment paths
    - Replace root README.md with new structure overview
    - Update AGENTS.md to document Azure OpenAI Foundry as default provider
    
    * cleanup: remove _to_delete folder, copy resource files to active dirs
    
    All files in _to_delete/ were either:
    - Exact duplicates of files in the new structure (240 files)
    - Same file with only comment path updates (100 files)
    - One import-fix diff (workflow_as_agent_human_in_the_loop.py)
    - One superseded minimal_sample.py
    
    Resource files (sample.pdf, countries.json, employees.pdf, weather.json)
    copied to 02-agents/sample_assets/ and 02-agents/resources/ since active
    samples reference them.
    
    * fix: address PR review comments, centralize resources, remove root duplicates
    
    - Fix type annotation in 04_memory.py (string union -> proper types)
    - Fix old sample paths in observability files
    - Fix grammar/spelling in observability samples
    - Move sample_assets/ and resources/ to shared/ folder
    - Remove 8 duplicate observability files from 02-agents root
    - Update resource path references in multimodal_input and provider samples
    
    * fix: update broken links from old getting_started paths to new structure
    
    - Update relative paths in READMEs: getting_started/ → 01-get-started/,
      02-agents/, 03-workflows/, 04-hosting/, 05-end-to-end/
    - Fix absolute GitHub URLs in package READMEs
    - Fix broken link in ollama package README
    
    * fix: convert absolute GitHub URLs to relative paths for link checker
    
    Absolute URLs to python/samples/ on main branch 404 until PR merges.
    Converted to relative paths that linkspector can verify locally.
    
    * fix: update link for handoff sample moved to orchestrations/
    
    * fix: update chatkit-integration README path from demos/ to 05-end-to-end/
    
    * fix: update broken links in orchestrations README to match flat directory structure