Commit Graph

8 Commits

  • Python: Add opt-in AG-UI thread snapshot persistence and hydration (#6471)
    * feat(ag-ui): add thread snapshot store primitives
    
    Key decisions:\n- Introduce an AGUIThreadSnapshot model limited to replayable messages, optional Shared State, and optional interrupt state.\n- Define AGUIThreadSnapshotStore as an async protocol keyed by explicit Snapshot Scope and AG-UI Thread id.\n- Add InMemoryAGUIThreadSnapshotStore as memory-only, latest-only, bounded local/demo/test storage; no file-backed store is introduced.\n- Require snapshot_scope_resolver whenever an endpoint is configured with a snapshot store, including pre-wrapped runners, so thread ids are not authorization boundaries.\n\nFiles changed:\n- packages/ag-ui/agent_framework_ag_ui/_snapshots.py\n- packages/ag-ui/agent_framework_ag_ui/__init__.py\n- packages/ag-ui/agent_framework_ag_ui/_agent.py\n- packages/ag-ui/agent_framework_ag_ui/_workflow.py\n- packages/ag-ui/agent_framework_ag_ui/_endpoint.py\n- packages/core/agent_framework/ag_ui/__init__.py\n- packages/core/agent_framework/ag_ui/__init__.pyi\n- packages/ag-ui/tests/ag_ui/test_snapshots.py\n- packages/ag-ui/tests/ag_ui/test_endpoint.py\n- packages/ag-ui/tests/ag_ui/test_public_exports.py\n- packages/ag-ui/AGENTS.md\n\nVerification:\n- uv run pytest packages/ag-ui/tests/ag_ui/test_snapshots.py packages/ag-ui/tests/ag_ui/test_public_exports.py packages/ag-ui/tests/ag_ui/test_endpoint.py::test_endpoint_requires_snapshot_scope_resolver_when_store_configured packages/ag-ui/tests/ag_ui/test_endpoint.py::test_endpoint_accepts_snapshot_store_with_scope_resolver -q\n- uv run pytest packages/ag-ui/tests/ag_ui/test_endpoint.py::test_endpoint_requires_snapshot_scope_resolver_when_store_configured packages/ag-ui/tests/ag_ui/test_endpoint.py::test_endpoint_requires_snapshot_scope_resolver_when_wrapped_runner_has_store packages/ag-ui/tests/ag_ui/test_endpoint.py::test_endpoint_accepts_snapshot_store_with_scope_resolver -q\n- uv run poe syntax -P ag-ui -C\n- uv run poe pyright -P ag-ui\n- uv run poe syntax -P core -C\n- uv run poe pyright -P core\n- uv run poe typing -P ag-ui\n- uv run poe typing -P core\n- uv run poe test -P ag-ui\n- uv run poe check -P ag-ui\n- git diff --check\n- git diff --cached --check\n\nBlockers / next iteration:\n- No blockers. Next slice can use the store contract to capture and hydrate agent snapshots.\n- uv repeatedly refreshed azure-ai-projects in uv.lock during local runs; reverted the generated lockfile churn because this change does not alter dependencies.\n- The poe-check commit hook was skipped after manual verification because it reformatted unrelated core MCP files outside this task.
    
    * feat(ag-ui): hydrate agent threads from snapshots
    
    Key decisions:
    - Resolve Snapshot Scope per endpoint request and pass it to the AG-UI runner only when snapshot storage is active.
    - Treat empty messages with no resume payload as an agent Hydrate Request when a scoped snapshot store is configured, replaying stored Shared State and message snapshots without invoking the wrapped agent.
    - Save the latest replayable agent message snapshot and Shared State at normal completion under Snapshot Scope plus AG-UI Thread id; no durable or file-backed store is introduced.
    
    Files changed:
    - packages/ag-ui/agent_framework_ag_ui/_agent_run.py
    - packages/ag-ui/agent_framework_ag_ui/_endpoint.py
    - packages/ag-ui/agent_framework_ag_ui/_snapshots.py
    - packages/ag-ui/tests/ag_ui/test_endpoint.py
    
    Verification:
    - uv run pytest packages/ag-ui/tests/ag_ui/test_endpoint.py::test_agent_endpoint_hydrates_stored_thread_snapshot_without_invoking_agent -q
    - uv run pytest packages/ag-ui/tests/ag_ui/test_endpoint.py::test_agent_endpoint_hydrates_stored_thread_snapshot_without_invoking_agent packages/ag-ui/tests/ag_ui/test_endpoint.py::test_agent_endpoint_hydrates_snapshots_by_scope_and_thread -q
    - uv run pytest packages/ag-ui/tests/ag_ui/test_endpoint.py::test_endpoint_empty_messages packages/ag-ui/tests/ag_ui/test_endpoint.py::test_agent_endpoint_hydrates_stored_thread_snapshot_without_invoking_agent packages/ag-ui/tests/ag_ui/test_endpoint.py::test_agent_endpoint_hydrates_snapshots_by_scope_and_thread -q
    - uv run poe syntax -P ag-ui -C
    - uv run poe pyright -P ag-ui
    - uv run poe typing -P ag-ui
    - uv run poe test -P ag-ui
    - uv run poe check -P ag-ui
    - git diff --check
    - git diff --cached --check
    
    Blockers / next iteration:
    - No blockers. Next slice can reconstruct normal new-user agent turns from stored snapshots.
    - uv repeatedly refreshed azure-ai-projects in uv.lock during local runs; reverted the generated lockfile churn because this change does not alter dependencies.
    - The poe-check commit hook was skipped after manual verification because it refreshed unrelated uv.lock dependency resolution.
    
    * feat(ag-ui): reconstruct agent turns from snapshots
    
    Key decisions:
    - Load scoped thread snapshots for non-hydrate agent requests only when snapshot storage is active and no resume payload is present.
    - Rebuild prior AG-UI history from stored snapshot messages, preserving the incoming new user suffix and treating stored snapshot content as authoritative over conflicting prior client history.
    - Merge stored Shared State with request state overrides before schema defaults and existing state-context injection.
    
    Files changed:
    - packages/ag-ui/agent_framework_ag_ui/_agent_run.py
    - packages/ag-ui/tests/ag_ui/test_endpoint.py
    
    Verification:
    - uv run pytest packages/ag-ui/tests/ag_ui/test_endpoint.py::test_agent_endpoint_prepends_stored_snapshot_for_new_user_turn -q
    - uv run pytest packages/ag-ui/tests/ag_ui/test_endpoint.py::test_agent_endpoint_deduplicates_full_history_and_merges_fresh_state -q
    - uv run pytest packages/ag-ui/tests/ag_ui/test_endpoint.py::test_endpoint_empty_messages packages/ag-ui/tests/ag_ui/test_endpoint.py::test_agent_endpoint_hydrates_stored_thread_snapshot_without_invoking_agent packages/ag-ui/tests/ag_ui/test_endpoint.py::test_agent_endpoint_hydrates_snapshots_by_scope_and_thread packages/ag-ui/tests/ag_ui/test_endpoint.py::test_agent_endpoint_prepends_stored_snapshot_for_new_user_turn packages/ag-ui/tests/ag_ui/test_endpoint.py::test_agent_endpoint_deduplicates_full_history_and_merges_fresh_state -q
    - uv run pytest packages/ag-ui/tests/ag_ui/test_endpoint.py -q
    - uv run poe syntax -P ag-ui -C
    - uv run poe pyright -P ag-ui
    - uv run poe test -P ag-ui
    - uv run poe check -P ag-ui
    - uv run poe typing -P ag-ui
    - git diff --check
    - git diff --cached --check
    
    Blockers / next iteration:
    - No blockers. Next slice can enable workflow AG-UI Thread Snapshot persistence and hydration.
    - uv repeatedly refreshed azure-ai-projects in uv.lock during local runs; reverted the generated lockfile churn because this change does not alter dependencies.
    - The poe-check commit hook was skipped after manual verification because it refreshes unrelated uv.lock dependency resolution.
    
    * feat(ag-ui): hydrate workflow threads from snapshots
    
    Key decisions:
    - Handle workflow Hydrate Requests before resolving or invoking the wrapped workflow when snapshot storage and Snapshot Scope are active.
    - Capture only replayable workflow protocol data: workflow-emitted state snapshots, workflow-emitted message snapshots, and synthesized messages from text/tool output.
    - Keep workflow snapshot capture inactive without configured persistence, and skip saving snapshots when the workflow stream emits RUN_ERROR.
    
    Files changed:
    - packages/ag-ui/agent_framework_ag_ui/_workflow.py
    - packages/ag-ui/tests/ag_ui/test_endpoint.py
    
    Verification:
    - uv run pytest packages/ag-ui/tests/ag_ui/test_endpoint.py::test_workflow_endpoint_hydrates_emitted_snapshots_without_invoking_workflow packages/ag-ui/tests/ag_ui/test_endpoint.py::test_workflow_endpoint_hydrates_synthesized_text_and_tool_snapshot -q
    - uv run pytest packages/ag-ui/tests/ag_ui/test_endpoint.py -q
    - uv run pytest packages/ag-ui/tests/ag_ui/golden/test_scenario_workflow.py -q
    - uv run poe syntax -P ag-ui -C
    - uv run poe pyright -P ag-ui
    - uv run poe test -P ag-ui
    - uv run poe typing -P ag-ui
    - uv run poe check -P ag-ui
    - git diff --check
    - git diff --cached --check
    
    Blockers / next iteration:
    - No blockers. Next slice can preserve interruption state and protect snapshots on errors across agent and workflow endpoints.
    - uv repeatedly refreshed azure-ai-projects in uv.lock during local runs; reverted the generated lockfile churn because this change does not alter dependencies.
    - The poe-check commit hook was skipped after manual verification because it refreshes unrelated uv.lock dependency resolution.
    
    * feat(ag-ui): preserve interrupted thread snapshots
    
    Key decisions:
    - Capture workflow RUN_FINISHED interrupt metadata in replayable AG-UI Thread Snapshots so Hydrate Requests can restore pending workflow actions without invoking or resuming the workflow.
    - Keep failed agent and workflow runs from replacing the last good snapshot; RUN_ERROR streams leave the previous snapshot available for hydration.
    - Verify interruption hydration through endpoint-level AG-UI streams for both agent and workflow wrappers, including Shared State replay and no wrapped runner invocation.
    
    Files changed:
    - packages/ag-ui/agent_framework_ag_ui/_workflow.py
    - packages/ag-ui/tests/ag_ui/test_endpoint.py
    
    Verification:
    - uv run pytest packages/ag-ui/tests/ag_ui/test_endpoint.py::test_workflow_endpoint_hydrates_interrupted_thread_without_invoking_workflow -q
    - uv run pytest packages/ag-ui/tests/ag_ui/test_endpoint.py::test_agent_endpoint_hydrates_interrupted_thread_without_invoking_agent packages/ag-ui/tests/ag_ui/test_endpoint.py::test_agent_endpoint_run_error_does_not_overwrite_previous_snapshot packages/ag-ui/tests/ag_ui/test_endpoint.py::test_workflow_endpoint_hydrates_interrupted_thread_without_invoking_workflow packages/ag-ui/tests/ag_ui/test_endpoint.py::test_workflow_endpoint_run_error_does_not_overwrite_previous_snapshot -q
    - uv run pytest packages/ag-ui/tests/ag_ui/test_endpoint.py -q
    - uv run pytest packages/ag-ui/tests/ag_ui/golden/test_scenario_workflow.py -q
    - uv run poe syntax -P ag-ui -C
    - uv run poe pyright -P ag-ui
    - uv run poe test -P ag-ui
    - uv run poe typing -P ag-ui
    - uv run poe check -P ag-ui
    - git diff --check
    - git diff --cached --check
    
    Blockers / next iteration:
    - No blockers. Next slice can document AG-UI Thread Snapshot security and usage.
    - uv repeatedly refreshed azure-ai-projects in uv.lock during local runs; reverted the generated lockfile churn because this change does not alter dependencies.
    - The poe-check commit hook was skipped after manual verification because it refreshes unrelated uv.lock dependency resolution.
    
    * docs(ag-ui): document thread snapshot security
    
    Key decisions:
    - Document AG-UI Thread Snapshot persistence as opt-in and disabled unless a snapshot_store is configured.
    - Place Snapshot Scope guidance next to endpoint authentication guidance, making clear that AG-UI Thread ids identify threads but do not authorize snapshot access.
    - Describe built-in storage as in-memory only, process-local, latest-only, and not durable production storage; durable stores remain app-owned implementations of AGUIThreadSnapshotStore.
    - Call out snapshot confidentiality impact and that no file-backed AG-UI snapshot store is provided.
    
    Files changed:
    - packages/ag-ui/README.md
    
    Verification:
    - uv run python scripts/check_md_code_blocks.py packages/ag-ui/README.md --no-glob
    - git diff --check
    - git diff --cached --check
    - commit hook without SKIP ran changed-package lint/format and AG-UI README markdown-code-lint successfully before stopping because uv.lock was modified
    - uv run poe markdown-code-lint (failed due existing unrelated packages/mistral/README.md missing agent_framework_mistral import resolution; changed AG-UI README blocks passed)
    
    Blockers / next iteration:
    - No blockers. Local issue/PRD planning artifacts remain uncommitted.
    - uv refreshed azure-ai-projects in uv.lock during markdown lint and the commit hook; reverted the generated lockfile churn because this documentation change does not alter dependencies.
    - The poe-check commit hook was skipped after manual verification because it refreshes unrelated uv.lock dependency resolution.
    
    * fix(ag-ui): harden thread snapshot persistence edge cases
    
    - Persist the completed confirm_changes turn with interrupt=None so hydration
      no longer replays a stale pending interrupt after the user responds; resume
      requests prepend stored history so the persisted thread is not truncated.
    - Defer endpoint default_state application to the runners when snapshot
      persistence is active, filling only keys missing from both the stored
      snapshot state and the request state so defaults never reset persisted
      Shared State.
    - Always fold the turn's output into the persisted messages snapshot even when
      the outbound MESSAGES_SNAPSHOT event is suppressed for predictive tools
      without confirmation.
    - Load the stored snapshot on workflow follow-up turns, reconstruct full
      thread history into the run input, and seed the snapshot builder with merged
      state so saving a new turn no longer replaces prior history.
    - Move snapshot message reconstruction helpers to _run_common for reuse by the
      workflow runner; load stored agent snapshots on resume turns for state merge.
    - Add endpoint regression tests for all four scenarios.
    
    * fix(ag-ui): protect snapshot history on resume and harden suffix trust
    
    - Prepend stored thread history when persisting snapshots for resume runs on
      both the agent and workflow paths, so a resumed interrupt no longer
      overwrites the stored thread with just the resume turn's output.
    - Filter the incoming message suffix during thread reconstruction: only user
      turns and tool results answering backend-issued tool calls (stored tool
      calls or pending interrupts) may extend authoritative history. Client-forged
      assistant and tool messages are dropped and logged instead of being
      persisted and replayed.
    - Close the workflow snapshot builder's tool-call group when a tool result or
      text message lands, so synthesized transcripts keep tool results adjacent to
      their tool_calls message and stay valid as provider replay history.
    - Export DEFAULT_MAX_THREAD_SNAPSHOTS from agent_framework_ag_ui and expose
      SnapshotScopeResolver through the core ag_ui facade and stub.
    - Add regression tests for agent and workflow resume history preservation,
      forged suffix rejection, builder tool-call grouping, and the export surface.
    
    * fix(ag-ui): tolerate snapshot save failures and scope workflow cache
    
    - Wrap snapshot_store.save() on both the agent and workflow paths so a
      transient store failure (timeout, connection refused) is logged instead of
      propagating. Previously a failing save converted an already-streamed
      successful run into RUN_ERROR, and on the workflow path emitted RUN_ERROR
      after RUN_FINISHED, violating the single-terminal-event invariant. The
      previous snapshot stays available for hydration.
    - Key the workflow_factory instance cache by (snapshot_scope, thread_id). The
      Snapshot Scope is the authorization boundary, so the same thread id under
      different scopes no longer shares an in-memory workflow instance.
      clear_thread_workflow accepts an optional snapshot_scope and clears all
      scopes for the thread when omitted.
    - Add tests: save-failure tolerance for agent and workflow endpoints,
      scope-isolated workflow cache, async snapshot_scope_resolver support, and
      in-memory store key validation errors.
    
    * fix(ci): ignore all dotnet.microsoft.com links in linkspector
    
    The existing ignore pattern only matched https://dotnet.microsoft.com/download,
    but Microsoft sites insert a locale segment between host and path
    (e.g. /en-us/download/dotnet/10.0), so localized links slip past the pattern
    and get checked. dotnet.microsoft.com bot-blocks CI link checkers with
    intermittent 403s across the whole site, which fails markdown-link-check on
    unrelated pull requests since linkspector scans the entire repository.
    
    Ignore the domain wholesale, matching how platform.openai.com is already
    handled for the same reason. A 403 from bot blocking is indistinguishable
    from a removed page, so the checker cannot produce a meaningful signal for
    this domain either way.
    
    * ag-ui: simplify raw_messages assignment and drop OrderedDict
    
    - Replace list(cast(...)) with a typed annotation for raw_messages
      (_agent_run.py:866) per review suggestion
    - Replace OrderedDict with a plain dict in InMemoryAGUIThreadSnapshotStore
      (_snapshots.py:136); regular dicts are insertion-order-safe since
      Python 3.7, so OrderedDict is unnecessary. Update _evict_oldest to use
      next(iter(...)) for FIFO removal instead of popitem(last=False).
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address review feedback for #2458: review comment fixes
    
    ---------
    
    Co-authored-by: Copilot <copilot@github.com>
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: feat(foundry): add to_prompt_agent / deploy_as_prompt_agent (experimental) (#5959)
    * feat(foundry): add experimental to_prompt_agent converter
    
    Adds `to_prompt_agent(agent)`, an experimental converter
    (`ExperimentalFeature.TO_PROMPT_AGENT`) that turns an Agent Framework
    `Agent` into a Foundry `PromptAgentDefinition` ready to publish via
    `AIProjectClient.agents.create_version(...)`.
    
    Behaviour:
    
    * `agent.client` must be a `FoundryChatClient` (or subclass); otherwise
      `TypeError` is raised. The model deployment name is lifted from the
      bound client so the same Agent definition used for local runs can be
      published as a hosted prompt agent without restating the model.
    * Foundry SDK tool instances (from `FoundryChatClient.get_*_tool()`) are
      passed through unchanged. AF `FunctionTool`s (and `@tool`-decorated
      callables) are emitted as Foundry `FunctionTool` declarations.
    * Local AF MCP tools cannot be expressed in a `PromptAgentDefinition`;
      the converter raises `ValueError` and points at
      `FoundryChatClient.get_mcp_tool()` for hosted MCP servers.
    * The converter walks both `agent.default_options["tools"]` and
      `agent.mcp_tools` because `normalize_tools()` splits local MCP off
      into its own list.
    
    Re-exported through the `agent_framework.foundry` lazy-loading namespace
    (updates both `__init__.py` and the `__init__.pyi` type stub).
    
    Adds a portable-agent sample showing the same `Agent` driven through
    both `agent.run(...)` and `to_prompt_agent(agent)`, and a README section
    covering the new converter.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * chore(samples): remove snippet tags from portable agent sample
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * chore(samples): inline FoundryChatClient and enable prompt-agent publish
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * chore(samples): drop async credential context manager
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * docs(foundry): trim README to_prompt_agent example to publish-only flow
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * docs(foundry): note FoundryAgent runs @tool callables for deployed prompt agents
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix(foundry): address review comments on to_prompt_agent converter
    
    * Construct `PromptAgentDefinition` `Tool` from a dict via `**tool_item`
      unpacking rather than the positional Mapping constructor \u2014 cleaner and
      matches the typical Pydantic / Azure SDK pattern.
    * Drop the redundant `isinstance(mcp_tool, MCPTool)` guard in
      `_convert_tools`; the parameter is already typed `Iterable[MCPTool]` so
      the second `raise` was unreachable. The remaining single `raise`
      fires for every entry as intended.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix(foundry): match Agent.__init__ model resolution in to_prompt_agent
    
    * Read the model from `agent.default_options.get("model")` first,
      falling back to `agent.client.model`. This mirrors the order
      `Agent.__init__` uses (`_agents.py:740`) when assembling
      default_options, so the model the agent runs with is the same model
      the converter publishes \u2014 e.g. when the caller passes
      `default_options={"model": "..."}` to override the bound client.
    * Updated the missing-model error message to point at both the client
      and the default_options paths.
    * Added tests:
      * tool-only agent with no `instructions` produces a definition
        where `instructions` is `None` and is omitted from the dict
        payload (`Agent.__init__` strips None values from default_options
        before storing them).
      * `default_options['model']` wins over the bound client's model.
      * Fallback to client.model when default_options has no model.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * feat(foundry): add deploy_as_prompt_agent helper + samples
    
    Adds `deploy_as_prompt_agent(agent)`, a convenience wrapper around
    `to_prompt_agent` that reuses the bound FoundryChatClient's project
    client to call `project_client.agents.create_version(...)`. Defaults
    `agent_name` / `description` from `agent.name` / `agent.description`
    so the Agent stays the single source of truth.
    
    * Exposed from `agent_framework_foundry` and the lazy-loading
      `agent_framework.foundry` namespace (including the .pyi stub).
    * Marked experimental with the existing
      `ExperimentalFeature.TO_PROMPT_AGENT` tag.
    * Tests cover the happy path, name/description defaulting, explicit
      override, no-name error, metadata + description forwarding, extra
      kwargs passthrough, and the experimental metadata.
    
    Samples:
    * Renamed the existing sample to `creating_prompt_agents.py`, drops
      'portable' wording, presents `deploy_as_prompt_agent` first as the
      recommended path and `to_prompt_agent` + `AIProjectClient` as the
      two-step alternative, and adds a cleanup step that deletes the
      published agent so re-runs stay idempotent.
    * New `using_prompt_agents.py` shows the end-to-end loop: deploy the
      agent, connect to it with `FoundryAgent` passing the same local
      `@tool` callable, run a query against the deployed prompt agent,
      then clean up.
    
    README updated to introduce `deploy_as_prompt_agent` as the
    recommended path and link to both runnable samples.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix(foundry): restore missing-model ValueError in to_prompt_agent
    
    The check was accidentally dropped while reworking docstrings in the
    previous commit. Test `test_to_prompt_agent_rejects_missing_model`
    exercises this path and was failing on CI as a result.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * refactor(foundry): rename deploy_as_prompt_agent -> create_prompt_agent
    
    Renames the helper across the foundry package, core lazy-loader stubs,
    tests, README and samples. The new name better matches the action
    performed (a prompt-agent definition is created in Foundry) and is
    consistent with the surrounding ''create_*'' API surface.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * refactor(foundry): drop create_prompt_agent, enrich to_prompt_agent params
    
    Remove the create_prompt_agent helper and consolidate on to_prompt_agent.
    Expose every PromptAgentDefinition parameter that has either an Agent
    Framework equivalent (sourced from default_options) or no equivalent
    (accepted as a keyword argument).
    
    * default_options-sourced (with kwarg overrides):
      temperature, top_p, string tool_choice
    * kwarg-only Foundry knobs:
      reasoning, text, structured_inputs, rai_config, ToolChoiceParam tool_choice
    
    Precedence is always: explicit keyword > default_options entry > unset.
    
    Tests cover every path (defaults, default_options, kwargs, kwarg override).
    Samples and README rewritten around the enriched to_prompt_agent.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * refactor(foundry): single source of truth for prompt-agent options
    
    Stop duplicating the generation-parameter surface between FoundryChatOptions
    and to_prompt_agent. Translate every field with an Agent Framework equivalent
    (temperature, top_p, tool_choice, reasoning, response_format/text/verbosity)
    from agent.default_options via a new RawFoundryChatClient helper
    _prepare_prompt_agent_options. Only Foundry-specific fields with no AF
    equivalent — structured_inputs and rai_config — remain as keyword arguments
    on to_prompt_agent.
    
    - tool_choice is dropped when there are no tools (mirrors _prepare_options
      semantics and avoids polluting tool-less prompt agents with Agent.__init__'s
      'auto' default).
    - response_format Pydantic models route through
      openai.lib._parsing._responses.type_to_text_format_param; dict shapes go
      through the existing _prepare_response_and_text_format helper.
    - default_options is not mutated; text dict is defensively copied.
    
    Tests, README, and creating_prompt_agents.py sample updated to reflect the
    new single-source model.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * docs(foundry): consolidate prompt-agent sample
    
    Drop creating_prompt_agents.py (the publish-only variant) and rename
    using_prompt_agents.py to foundry_prompt_agents.py so the single sample
    covers the full convert -> publish -> connect -> run loop. Update the
    README link list accordingly.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * docs(foundry): run local Agent + deployed agent in same sample
    
    Add an agent.run() call against the local Agent before publishing, then run
    the deployed prompt agent on the same query. Expand the docstring with a
    compare-and-contrast covering runtime/latency, configurability, and
    persistence/sharing differences between the two execution paths.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * test(foundry): cover conflicting response_format + text.format in to_prompt_agent
    
    Exercises the ValueError path when a Pydantic response_format would overwrite
    an explicit text.format mapping with a different shape. Lifts _chat_client.py
    coverage from 89% to 90%.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * refactor(foundry): move _prepare_prompt_agent_options into _to_prompt_agent
    
    Lift the translation helper off RawFoundryChatClient and into the
    _to_prompt_agent module as a module-private function that takes the client
    as its first argument. The chat client no longer needs to carry a method
    whose only consumer is the prompt-agent converter, while still serving as
    the source of the request-path helper (_prepare_response_and_text_format)
    that the converter reuses for dict-shaped response_format values.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * docs(python): codify GA terminology + post-run docs review
    
    Add two pieces of guidance to python/AGENTS.md:
    
    * Terminology - reserve 'GA' for hosted services; use 'released' or 'stable'
      for Agent Framework code/features to match the feature-lifecycle stages.
    * Maintaining Documentation - review AGENTS.md and skills at the end of every
      run and update any guidance the conversation made stale; before adding a
      new principle, ask the user to confirm it should be captured.
    
    Also pulls in a docstring fix in foundry_prompt_agents.py that swaps the
    stray 'GA' for 'released', applying the new terminology rule.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * address PR review: strict=True default, Tool._deserialize dispatch, sample cleanup safety
    
    - FunctionTool published as strict=True so the server-side schema validation
      matches what the local FoundryAgent(tools=[same_callable]) dispatcher
      enforces. AF FunctionTool has no 'strict' attribute, so the safer default
      is used uniformly instead of silently downgrading to a permissive contract.
    - _validate_mapping_tool now dispatches through ProjectsTool._deserialize so
      dict-shaped tools rehydrate to the concrete subclass (FunctionTool,
      WebSearchTool, ...) via the 'type' discriminator instead of returning a
      generic Tool. Added a test that asserts isinstance(WebSearchTool) and a
      new test for the function-typed dict path.
    - foundry_prompt_agents.py sample now wraps credential + project client in
      async with and the create_version / run flow in try/finally so a failure
      on connect or run still deletes the published prompt agent rather than
      leaving an orphaned, billable resource in the user's Foundry project.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix(ci): correct linkspector ignorePattern typo (./pulls -> ./pull)
    
    GitHub PR URLs use the singular segment /pull/N (compare to /issues/N
    for issues). The existing './pulls' ignore pattern never matched
    anything as a result, so legitimately stale PR links (e.g. PRs deleted
    from forks) surface as linkspector failures on unrelated PRs.
    
    This is the same convention the './issues' rule above already follows.
    Fixes the markdown-link-check failure on a dangling link in
    dotnet/src/Microsoft.Agents.AI.DurableTask/CHANGELOG.md.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • .NET: Foundry Evals integration for .NET (#4914)
    * Foundry Evals integration for .NET
    
    - Core evaluation framework: EvalItem, LocalEvaluator, FunctionEvaluator, EvalChecks
    - IAgentEvaluator interface with MeaiEvaluatorAdapter bridge
    - AgentEvaluationExtensions for agent.EvaluateAsync() overloads
    - FoundryEvals wrapping MEAI quality/safety evaluators
    - ConversationSplitters (LastTurn, Full) and IConversationSplitter
    - EvalItem.PerTurnItems() for multi-turn decomposition
    - HasImageContent for multimodal content detection
    - WorkflowEvaluationExtensions for per-agent workflow evaluation
    - 7 eval samples mirroring Python parity:
      02-agents/Evaluation: SimpleEval, ExpectedOutputs, Multimodal
      03-workflows/Evaluation: WorkflowEval
      05-end-to-end/Evaluation: FoundryQuality, MixedProviders, ConversationSplits
    - Comprehensive unit tests (1958 passing)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Rewrite FoundryEvals to use real Foundry Evals API
    
    Replace MEAI evaluator shim with actual OpenAI EvaluationClient protocol
    methods. FoundryEvals now creates eval definitions, submits runs, polls
    for completion, and fetches per-item results server-side.
    
    - New constructor: FoundryEvals(AIProjectClient, model, evaluators)
    - Add FoundryEvalConverter for MEAI ChatMessage -> Foundry JSON format
    - Add EvalId, RunId, ReportUrl to AgentEvaluationResults
    - All 20 built-in evaluator constants now work (agent, tool, quality, safety)
    - Remove Microsoft.Extensions.AI.Evaluation.Quality/Safety dependencies
    - Update all samples for new constructor (no more ChatConfiguration)
    - Replace BuildEvaluators tests with ResolveEvaluator tests
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Add response output to CustomEvals and ExpectedOutputs samples
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address review: pagination, validation, error handling, tests
    
    FoundryEvals fixes:
    - Add pagination for output items (has_more/after cursor)
    - Add guard clauses for pollIntervalSeconds/timeoutSeconds <= 0
    - Fix double TryGetProperty for passed field parsing
    - Throw on all-tool-evaluators with no tool definitions
    - Fix XML doc (default 300s, not 180s)
    
    New tests (30 added, 1989 total):
    - EvalChecks: NonEmpty, ContainsExpected (pass/fail/skip/case),
      HasImageContent, ToolCallsPresent
    - FoundryEvalConverter: ConvertMessage (text, image, function call,
      function results fan-out, empty fallback, mixed content),
      ConvertEvalItem, BuildTestingCriteria (quality/agent/tool/groundedness
      data mappings), BuildItemSchema
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix review: null-refs, Data.ToString() bug, ContainsExpected, add tests
    
    - Fix NullReferenceException in sample Response display (pattern matching)
    - Fix WorkflowEvaluationExtensions Data?.ToString() producing type names
      instead of message text (pattern-match ChatMessage/AgentResponse/list)
    - Change EvalChecks.ContainsExpected to return Passed=false when no
      ExpectedOutput (was silently passing, masking misconfiguration)
    - Add EvalItem constructor tests with LastTurn/Full/null splitters
    - Add FoundryEvalConverter.ConvertMessage DataContent (base64 image) test
    - Add ExtractAgentData tests with ChatMessage, list, and AgentResponse data
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix review: conversation fidelity, eval caching, fallback tests
    
    - WorkflowEvaluationExtensions: preserve full response messages (tool calls,
      intermediate) instead of synthetic 2-message conversation. Cast completed
      Data to AgentResponse and use Messages when available, fallback to text.
    - FoundryEvals: cache evalId per schema shape (hasContext, hasTools) so
      subsequent EvaluateAsync calls create runs under the same eval definition.
    - MeaiEvaluatorAdapter: code already correctly passes queryMessages (not full
      conversation) to IEvaluator — no change needed, verified by inspection.
    - Add tests: AgentResponse full messages preservation, unknown object
      ToString() fallback for ExtractAgentData.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Rename AzureAI→Foundry: move eval files, update references
    
    - Move FoundryEvals.cs and FoundryEvalConverter.cs from
      Microsoft.Agents.AI.AzureAI to Microsoft.Agents.AI.Foundry
    - Update namespace from AzureAI to Foundry in both files
    - Add explicit usings required by Foundry project (no implicit usings)
    - Move FoundryEvalConverter tests to Foundry.UnitTests project
      (avoids ReplacingRedactor type conflict from dual project refs)
    - Update all sample csproj references and using statements
    - Remove Foundry project reference from AI UnitTests
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * PR review round 4: wire up tool extraction, remove eval cache, fix null safety
    
    - BuildEvalItem: extract tools from agent via GetService<ChatOptions>() into EvalItem.Tools (Python parity)
    - FoundryEvals: remove eval ID cache - each call creates fresh definition (matches Python behavior)
    - FoundryEvals: replace null-forgiving operators with descriptive InvalidOperationException
    - MixedProviders sample: remove unnecessary explicit PackageReferences (transitively provided)
    - FoundryEvalConverter: document that tool results take precedence over text content
    - Add LocalEvaluator zero-checks test documenting 0 metrics = failed behavior
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Python-dotnet parity: 9 feature gaps filled
    
    New checks:
    - ToolCallArgsMatch() — verify tool call names + argument subset match
    - ToolCalledCheck(ToolCalledMode.Any, ...) — match any of the specified tools
    - ToolCalledMode enum (All/Any)
    
    FoundryEvals enhancements:
    - Default evaluators now [Relevance, Coherence, TaskAdherence] (was Relevance, Coherence)
    - Auto-add ToolCallAccuracy when items have tool definitions
    - EvaluateTracesAsync — evaluate by response_ids, trace_ids, or agent_id
    - EvaluateFoundryTargetAsync — evaluate deployed Foundry targets
    
    Result type enrichment:
    - AgentEvaluationResults: added Status, Error, PerEvaluator, DetailedItems
    - New EvalItemResult/EvalScoreResult/PerEvaluatorResult types
    - FoundryEvals populates all new fields from API responses
    
    Workflow fix:
    - Skip internal executors (_*, input-conversation, end-conversation, end)
    
    Tests: 8 new tests covering ToolCallArgsMatch, ToolCalledMode.Any, internal executor filtering
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Add MeaiEvaluatorAdapter and PerTurnItems edge case tests
    
    - 3 tests for MeaiEvaluatorAdapter: query message forwarding, synthetic
      response fallback, multiple items aggregation
    - 3 tests for EvalItem.PerTurnItems: empty conversation, no user messages,
      system+assistant only
    - StubEvaluator and StubChatClient test helpers
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Blocking link check for outdated package in DevUI.
    
    * Replace Dictionary<string, object> payloads with typed wire models
    
    Introduce internal FoundryEvalWireModels.cs with compile-time-safe types
    for the OpenAI Evals API wire format. The OpenAI .NET SDK (2.9.1) only
    provides protocol-level methods with BinaryContent/ClientResult — no
    typed request models. These internal models replace scattered dictionary
    literals with [JsonPropertyName]-annotated classes, giving:
    
    - Compile-time safety (typos become build errors)
    - Single point of change when the API evolves
    - IntelliSense discoverability
    - Cleaner serialization via JsonPolymorphic for content items
    
    Models: WireContentItem hierarchy (text, image, tool_call, tool_result),
    WireMessage, WireEvalItemPayload, WireTestingCriterion, WireItemSchema,
    WireCreateEvalRequest, WireCreateRunRequest, and data source variants.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Skip metric when Foundry returns neither score nor passed
    
    When an evaluator returns no score and no passed value, the previous
    code created BooleanMetric(name, false), which falsely failed items
    via ItemPassed. Now we skip the MEAI metric entirely for indeterminate
    results — the raw data remains available in DetailedItems for diagnostics.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address PR #4914 review comments: fix tool evaluator bug and add tests
    
    - Fix duplicate ToolCallAccuracy: resolve evaluator names before checking
      against ToolEvaluators set (Comment 2)
    - Make FilterToolEvaluators internal for testability; add tests for the
      ArgumentException edge case when all evaluators are tool-type (Comment 3)
    - Add CancellationToken test for LocalEvaluator (Comment 4)
    - Add EvaluateAsync integration test on Run with sequential workflow and
      per-agent SubResults verification (Comment 5)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address Peter's review comments on PR #4914
    
    - Add trailing newline to Evaluation_FoundryQuality.csproj (Comment 6)
    - Make evaluator name lookups case-insensitive: switch BuiltinEvaluators,
      ToolEvaluators, AgentEvaluators, and ResolveEvaluator's StartsWith check
      from Ordinal to OrdinalIgnoreCase (Comment 7)
    - Add Trace.TraceWarning when Foundry returns fewer results than submitted
      items, indicating expected vs actual count before padding (Comment 8)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Add Microsoft.Extensions.AI.Evaluation packages to Directory.Packages.props
    
    These were removed in #5269 as unused, but are needed by the Foundry
    and core evaluation integration added in this PR.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: alliscode <bentho@microsoft.com>
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • .NET: Add foundry extension samples for python and dotnet (#4359)
    * Add foundry extension samples for python and dotnet
    
    * Align foundry extension samples with existing hosted agent patterns
    
    - Fix Python multiagent indentation bug (from_agent_framework ran in both modes)
    - Remove hardcoded personal endpoint from appsettings.Development.json
    - Rename .NET folders/projects to PascalCase (FoundryMultiAgent, FoundrySingleAgent)
    - Upgrade .NET multiagent from net9.0 to net10.0
    - Add ManagePackageVersionsCentrally=false and analyzer blocks to .csproj files
    - Replace wildcard package versions with fixed versions
    - Use alpine Docker images and standard build pattern
    - Align agent.yaml structure (template nesting, displayName, resources, authors)
    - Convert .NET multiagent from namespace/class to top-level statements
    - Add run-requests.http for multiagent sample
    - Fix Python requirements.txt (remove dev deps, add agent-framework)
    - Add proper copyright headers
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Align foundry samples: fix builds, upgrade AgentServer to beta.8
    
    - Fix TargetFrameworks (plural) to override inherited net472 from Directory.Build.props
    - Upgrade Azure.AI.AgentServer.AgentFramework to 1.0.0-beta.8 (latest)
    - Bump OpenTelemetry packages to 1.12.0 (required by beta.8)
    - Fix Roslynator/format errors (imports ordering, BOM, sealed record, target-typed new)
    - Verified with docker dotnet format (matching CI pipeline)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Refactor hosted samples to use AIProjectClient.CreateAIAgentAsync
    
    Replace PersistentAgentsClient and manual AzureOpenAIClient setup with
    AIProjectClient.CreateAIAgentAsync() from Microsoft.Agents.AI.AzureAI.
    
    - FoundryMultiAgent: Remove Azure.AI.Agents.Persistent, use CreateAIAgentAsync
      for Writer and Reviewer agents with cleanup in finally block
    - FoundrySingleAgent: Remove manual GetConnection/AzureOpenAIClient chain,
      use CreateAIAgentAsync with hotel search tool
    - Update csproj: add Microsoft.Agents.AI.AzureAI, remove unused packages
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Update READMEs to reflect AIProjectClient.CreateAIAgentAsync usage
    
    - Reference Microsoft.Agents.AI.AzureAI and Microsoft.Agents.AI.Workflows packages
    - Add Azure AI Developer role requirement for agents/write data action
    - Replace PersistentAgentsClient references
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Add HostedAgents READMEs and Foundry samples to solution
    
    - Create dotnet/samples/05-end-to-end/HostedAgents/README.md with sample index
    - Create python/samples/05-end-to-end/hosted_agents/README.md with sample index
    - Add FoundryMultiAgent and FoundrySingleAgent to agent-framework-dotnet.slnx
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix Python linting: reorder imports before load_dotenv, remove trailing whitespace
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Update uv.lock to match latest package versions
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix trailing whitespace in foundry_single_agent agent.yaml
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Exclude dotnet.microsoft.com from link checker
    
    This domain intermittently times out in CI, causing flaky markdown
    link check failures unrelated to PR changes.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Align env vars to AZURE_AI_PROJECT_ENDPOINT and default model to gpt-4o-mini
    
    Addresses PR review feedback:
    - Rename PROJECT_ENDPOINT to AZURE_AI_PROJECT_ENDPOINT across all
      Foundry samples (dotnet + python) to match existing samples
    - Change default model from gpt-4.1-mini to gpt-4o-mini consistently
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Skip flaky test CreatesWorkflowEndToEndActivities_WithCorrectName_DefaultAsync
    
    Tracked in #4398
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Remove Python foundry samples from PR scope
    
    Python hosted agent samples need further alignment with the azure-ai
    package conventions. Removing from this PR to ship .NET samples first.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Narrow linkspector exclusion to dotnet.microsoft.com/download only
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Leo Yao <leoyao@Leos-MacBook-Pro.local>
    Co-authored-by: Roger Barreto <19890735+rogerbarreto@users.noreply.github.com>
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • .NET: Add DevUI package for .NET (#1603)
    * Implement DevUI
    
    * Review feedback
    
    * Fix build
  • Python: bump version and update changelog (#1277)
    * bump version and update changelog
    
    * undo lab package version bump
    
    * add changelog to exclude pattern for links
    
    * fix for ignore path
    
    * exclude file
  • .NET: Add link inspector (#1062)
    * Add link inspector
    
    * Comment out excludedirs while it's empty
    
    * Fix broken links
    
    * More links fixes
    
    * Push further fixes
    
    * Fix more links