agent-framework

Build(deps): Bump python-multipart from 0.0.26 to 0.0.27 in /python

Bumps [python-multipart](https://github.com/Kludex/python-multipart) from 0.0.26 to 0.0.27.
- [Release notes](https://github.com/Kludex/python-multipart/releases)
- [Changelog](https://github.com/Kludex/python-multipart/blob/main/CHANGELOG.md)
- [Commits](https://github.com/Kludex/python-multipart/compare/0.0.26...0.0.27)

---
updated-dependencies:
- dependency-name: python-multipart
  dependency-version: 0.0.27
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

dependabot[bot] · 2026-06-09 06:01:18 +00:00

6b2ff3ed7a

Python: Harness console for python (#6312 )

* Add initial harness console for python

* Add textual to project

* Add planning and approval flows with list selector

* Address PR comments

* Fix list selection bug

* Fix PR #6312 round 2 review comments

- Escape untrusted agent text with rich.markup.escape() in observers
  (text_output, planning_output, reasoning_display) to prevent markup injection
- Remove non-functional 'Always approve' choices from tool_approval.py
  (framework lacks CreateAlwaysApproveToolResponse support)
- Remove textual from root pyproject.toml dev deps (sample-specific)
- Add PEP 723 inline script metadata to harness_research.py
- Narrow except Exception to except NoMatches in list_selection.py

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix build error

* Fix build errors

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

westey · 2026-06-09 05:48:35 +00:00

bad05a2bdc

Python: Fix per-service-call history persistence with server-storing clients (#6310 )

* Fix per-service-call history persistence with server-storing clients

When an Agent set require_per_service_call_history_persistence=True together
with a HistoryProvider, and the chat client stored history server-side by
default (e.g. OpenAIChatClient, STORES_BY_DEFAULT=True), the external history
provider was silently never persisted.

Unify persistence on the per-service-call middleware: when the flag is set and
a HistoryProvider exists, the middleware is always installed and owns
persistence. service_stores_history now only selects middleware behavior:
- service does not store: load providers and drive the function loop with a
  local sentinel conversation id, or
- service stores: skip loading (the service owns history) and persist each
  service call while the real conversation id flows through.

Also rationalize chat-options handling in _prepare_run_context:
- _merge_options now skips None overrides and strips remaining None values, so
  an unset `store` is never forwarded and the service decides its own default.
- Resolve `store` and `conversation_id` once from a single combined view
  (effective_options) instead of probing both default and runtime dicts; the
  auto-injection and per-service-call resolution now agree on conversation_id.

Fixes #5798

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Correct as_agent() docstring: persistence is per service call, not once per run

Address PR review: when the client stores history server-side, the
per-service-call middleware still persists after each model call; only
provider loading is skipped. The previous "persist once per run()" wording
contradicted the implementation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review: docs, missing-conversation-id warning, and tests

- Clarify that require_per_service_call_history_persistence is a no-op when no
  HistoryProvider is present (docstrings in _agents.py and _clients.py).
- Warn on every service call when the client stores history server-side but
  returns no conversation_id, so the (uncommon) loss of cross-turn resumability
  cannot fail silently.
- Add tests: storing client + existing conversation_id does not raise and the id
  propagates; two runs on the same session keep persisting with a stable
  service_session_id and no provider loading; storing-without-conversation-id
  warns per call.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-06-09 05:47:57 +00:00

7e0767a0a0

.NET: [BREAKING] Migrate .NET GitHub Copilot SDK to v1.0.0 (#6381 )

* Migrate .NET GitHub Copilot SDK from 1.0.0-beta.2 to 1.0.0

- Update namespace from GitHub.Copilot.SDK to GitHub.Copilot
- Replace PermissionRequestResult/PermissionRequestResultKind with PermissionDecision
- Remove ConnectionState check (StartAsync is now idempotent)
- Rename ConfigDir to ConfigDirectory
- Use SessionConfig.Clone() for CopySessionConfig
- Update Tools type from List<AIFunction> to List<AIFunctionDeclaration>
- Rename UserMessageAttachmentFile to AttachmentFile
- Update usage data types (CacheWriteTokens: long, Duration: TimeSpan)
- Add GHCP001 NoWarn for experimental SDK APIs (matches framework convention)
- Specify type argument on CopilotSession.On<SessionEvent>()

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix formatting: remove unused using directive

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Skip AzureFunctions SamplesValidation tests pending func tools fix

Azure Functions Core Tools v4 can no longer auto-detect the worker
runtime in CI (local.settings.json is gitignored). All 7 active
SamplesValidation tests fail with 'Worker runtime cannot be None'.

Tracked by: https://github.com/microsoft/agent-framework/issues/6402

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Skip additional failing integration tests in CI

WorkflowSamplesValidation (5 tests): same func tools issue as #6402.
WorkflowConsoleAppSamplesValidation (4 tests): KeyNotFoundException
during workflow execution, tracked by #6404.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Giles Odigwe · 2026-06-08 22:34:05 +00:00

af772997af

.NET: Add approval bypassing to harness as the default (#6387 )

* Add approval bypassing to harness as a default

* Add tests

* Address PR comments.

westey · 2026-06-08 17:50:41 +00:00

b343625c1f

Match AG-UI approval responses to requested arguments (#6376 )

Evan Mattson · 2026-06-08 16:33:16 +00:00

9bc7b27813

.NET: [BREAKING] Fix hosting bugs (#6388 )

* Fix hosting bugs

* Address PR comments

westey · 2026-06-08 16:17:54 +00:00

6a2efeae7c

Python: fix(mem0): isolate entity retrieval and correct app_id payload (#6242 )

* fix(mem0): parallel memory retrieval logic and strict type compliance

* fix(mem0): align parallel retrieval types for pyright and mypy

* fix(mem0): handle asyncio.CancelledError in search response and update test description

* fix(mem0): improve error handling for asyncio.CancelledError and update test names for clarity

* fix(mem0): improve retrieval response handling

Vedant Sonani · 2026-06-08 13:50:23 +00:00

6169df04cb

.NET: Fix single-column value unwrap in declarative workflow (#6367 )

* Fix single-column value unwrap in declarative workflow

* Added more tests

Peter Ibekwe · 2026-06-08 11:37:12 +00:00

331201294b

fix: preserve foreach record values (#6208 )

Yufeng He · 2026-06-05 22:01:59 +00:00

fa9e086576

Python: feat(python): Add MCP client OTel spans per GenAI semantic conventions (#6349 )

* feat(python): Add MCP client OTel spans per GenAI semantic conventions

Implement MCP client spans per the OTel GenAI Semantic Conventions for MCP
(https://opentelemetry.io/docs/specs/semconv/gen-ai/mcp/#client).

Operations instrumented:
- initialize: CLIENT span capturing MCP session setup
- tools/list: CLIENT span for tool listing (per-page)
- prompts/list: CLIENT span for prompt listing (per-page)
- tools/call: CLIENT span (nested under execute_tool when called via FunctionTool)
- prompts/get: CLIENT span

Span attributes follow the MCP semantic conventions:
- Required: mcp.method.name
- Conditional: error.type, gen_ai.tool.name, gen_ai.prompt.name
- Recommended: gen_ai.operation.name, mcp.protocol.version, mcp.session.id,
  network.transport, server.address, server.port

Transport-specific attributes per subclass:
- MCPStdioTool: network.transport=pipe
- MCPStreamableHTTPTool: network.transport=tcp, network.protocol.name=http
- MCPWebsocketTool: network.transport=tcp, network.protocol.name=websocket

All span creation gated behind OBSERVABILITY_SETTINGS.ENABLED.

Closes #3624
Closes #4697

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: simplify MCP spans — remove enrichment logic and protocol version caching

- Always create nested CLIENT spans for tools/call instead of enriching
  the parent execute_tool span
- Remove _ACTIVE_TOOL_EXECUTION_SPAN contextvar (no longer needed)
- Remove enrich_span_with_mcp_attributes() helper
- Remove _otel_error_type preservation in FunctionTool.invoke()
- Remove _mcp_protocol_version instance variable; protocol version is
  only set on the initialize span where it is available

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Refine copilot solution

* fix: enable automatic exception recording on MCP spans

Remove record_exception=False and set_status_on_exception=False from
create_mcp_client_span. Let OTel handle exception recording and status
setting automatically. The manual set_mcp_span_error calls for tools/call
still correctly set error.type (which OTel's automatic handling doesn't
touch), so tool_error is preserved.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Reduce number of lines

* Add comment to sample

* test: address PR review comments on MCP observability tests

- Fix initialize test to call mocked session.initialize() and read
  protocolVersion from the result instead of hardcoding it
- Add tools/call McpError error-path test
- Add prompts/get McpError error-path test

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix export error

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Tao Chen · 2026-06-05 19:23:01 +00:00

dcc218dbac

.NET: [BREAKING] Add auto-approval rules (heuristics) to ToolApprovalAgent (#6335 )

* Add support for approving tools via heuristic rules

* Address PR comments

* Address PR comments

* Apply suggestion from @SergeyMenshykh

Co-authored-by: SergeyMenshykh <68852919+SergeyMenshykh@users.noreply.github.com>

---------

Co-authored-by: SergeyMenshykh <68852919+SergeyMenshykh@users.noreply.github.com>

westey · 2026-06-05 18:43:07 +01:00

6bd2cfec03

.NET: Allow storage of auto-approved functions (#4950 )

* Allow storage of auto-approved functions

* Address PR comments

westey · 2026-06-05 18:42:21 +01:00

ab8ba8fc61

Python: Refactor workflow as agent pending request handling (#6259 )

* WIP: Refactor Workflow as agent pending request handling

* WIP: debugging empty message bug

* Working: Workflow as agent with function approval

* Address Copilot comments

* Fix mypy

* Address comments and fix pipeline

* Request info non function approval now becomes function call

* Revert uv.lock

* Fix mypy

* Bump min version of azure-ai-project

* Remove RequestInfoFunctionArgs

* fix tests

* Fix failing tests

* Fix sample

Tao Chen · 2026-06-05 17:23:19 +00:00

9cafd7e58b

Python (fix:gemini): make Gemini honor declarative outputSchema, not just JSON mode (#5893 )

* fix(gemini): preserve schema response_format

* fix(gemini): satisfy pyright strict in response schema extraction

Cast Any-narrowed mappings to Mapping[str, Any] in the structured-output
schema helpers so pyright strict no longer reports partially-unknown
member, argument, and variable types. Pass response_format["format"]
straight into the recursive extractor, which already guards non-mapping
inputs. No behavior change.

* fix(gemini): use Sequence[object] cast to satisfy both mypy and pyright

The Sequence[Any] cast pyright strict needs to know the loop element type
is reported as a redundant-cast by mypy, which already narrows the
isinstance branch to Sequence[Any]. Cast to Sequence[object] instead:
pyright gets a fully known element type and mypy no longer sees an
identical-type cast. No behavior change.

---------

Co-authored-by: Evan Mattson <evan.mattson@microsoft.com>

cooleryu · 2026-06-05 15:17:51 +00:00

d5335fbeae

Python: MCP long-running task support in Python (#6319 )

* MCP long-running task support in Python

* Fix pyupgrade and AGENTS.md reconnect description

- pyupgrade: drop forward-reference string annotations in _mcp.py (Python 3.10+ resolves them natively now that MCPTaskOptions is defined before use).

- AGENTS.md: align reconnect description with current behavior. Phase 1 (initial tools/call) does NOT retry on connection loss; raises 'connection lost; task state unknown' instead, so a server that accepted the request but lost the response cannot start the operation twice. Phase 2 (tasks/get / tasks/result) still reconnects once against the same task_id.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix bandit nosec marker for CI pipeline

* Address PR feedbacks

* Clarifiied comments and addressed more PR feedbacks.

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Peter Ibekwe · 2026-06-05 00:04:55 +00:00

bf4ad48cf2

Python: bump package versions for 1.8.0 release (#6351 )

- Released cohort (core, openai, foundry, root): 1.7.0 -> 1.8.0
- agent-framework-github-copilot: promote to RC (1.0.0rc1)
- agent-framework-orchestrations: rc2 -> rc3 (bug fix)
- Beta/alpha packages with changes: a2a, anthropic, azurefunctions, bedrock,
  foundry-hosting, mistral bumped to new date stamp (260604)
- Inter-package dependency bounds updated for changed packages
- CHANGELOG.md and PACKAGE_STATUS.md updated

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Giles Odigwe · 2026-06-04 23:03:24 +00:00

python-1.8.0 01fc518b29

Python: Add GitHub Copilot integration tests to CI workflows (#6346 )

Add a dedicated integration test job for the github_copilot package to both
python-integration-tests.yml and python-merge-tests.yml.

The job:
- Runs 6 integration tests marked with @pytest.mark.integration
- Uses COPILOT_GITHUB_TOKEN secret from the integration environment
- Follows the same pattern as other provider integration jobs
- Includes path filtering in merge-tests (github_copilot package + core changes)
- Added to needs lists in report and check jobs

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Giles Odigwe · 2026-06-04 22:06:26 +00:00

f3c3efed43

.NET: Bump ModelContextProtocol from 1.1.0 to 1.2.0 (#3956 ) (#6239 )

Co-authored-by: Neeraj Karamchandani <neerajkaramchandani@mac.mynetworksettings.com>

neerajkaram · 2026-06-04 21:51:15 +01:00

bbccb7c28c

Python: Fix toolbox consent flow in hosted agent (#6249 )

* Fix toolbox consent flow in hosted agent

* Resolve conflict

* Make unused tool as comment

* Fix tests

Tao Chen · 2026-06-04 20:28:59 +00:00

dbc312a78a

.NET: Restructure skill script schemas XML and remove resources from body (#6343 )

* Restore UTF-8 BOMs and fix BuildScriptSchemasBlock doc comment

- Restore UTF-8 BOM on all changed files to match repo convention
- Fix XML doc: <schema name=...> -> <schema script=...> to match emitted output

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review comments: fix doc remarks and rename tests

- Update script doc remarks to clarify only parameter schemas are included
- Fix grammar: 'arguments format' -> 'argument format'
- Rename misleading test methods to match actual assertions
- Clarify comment about removed wrapper element

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: SergeyMenshykh <SergeMenshikh@outlook.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

SergeyMenshykh · 2026-06-04 21:15:29 +01:00

bb9ed63a34

Python: Add timeout parameter to FoundryAgent to fix ConnectTimeout on multi-turn conversations (#6263 )

* Python: fix ConnectTimeout on multi-turn FoundryAgent conversations (#6241)

Expose a `timeout` parameter on `RawFoundryAgentChatClient`,
`_FoundryAgentChatClient`, `RawFoundryAgent`, `FoundryAgent`, and
`RawOpenAIChatClient` so callers can override the HTTP timeout used by
the underlying AsyncOpenAI client.

Root cause: `RawFoundryAgentChatClient.__init__` called
`project_client.get_openai_client()` without configuring any timeout,
inheriting the OpenAI SDK default of `httpx.Timeout(connect=5.0)`.
When connections are recycled between turns under load, the 5 s connect
timeout fires and surfaces as `openai.APITimeoutError`.

Fix:
- `load_openai_service_settings` (`_shared.py`): accept `timeout` and
  include it in `client_args` for all three `AsyncOpenAI`/
  `AsyncAzureOpenAI` construction paths.
- `RawOpenAIChatClient.__init__` (`_chat_client.py`): accept `timeout`
  and forward to `load_openai_service_settings`.
- `RawFoundryAgentChatClient.__init__` (`_agent.py`): accept `timeout`
  and set `openai_client.timeout = timeout` on the client returned by
  `get_openai_client()` before passing it to the base class.
- `_FoundryAgentChatClient`, `RawFoundryAgent`, `FoundryAgent`: accept
  and propagate `timeout` through the construction chain.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add timeout parameter to FoundryAgent and RawOpenAIChatClient

Expose a timeout parameter on RawFoundryAgentChatClient,
_FoundryAgentChatClient, RawFoundryAgent, FoundryAgent, and
RawOpenAIChatClient. When provided, the value is applied to the
underlying AsyncOpenAI client so that connect timeouts under load
or after connection recycling can be tuned by callers.

Previously, get_openai_client() was called without any timeout
override, so the SDK default of httpx.Timeout(connect=5.0) was
inherited and could fire on multi-turn conversations where the
underlying connection is recycled between turns.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Add `timeout` parameter to `FoundryAgent` to fix `ConnectTimeout` on multi-turn conversations

Fixes #6241

* fix(foundry): use with_options to avoid mutating shared OpenAI client timeout (#6241)

Replace direct assignment  with
 in
RawFoundryAgentChatClient.__init__.

The Azure AI Projects SDK caches and returns a shared AsyncOpenAI client
per AIProjectClient. Mutating its .timeout attribute leaked the override
to all other code paths sharing that client (other agents, user code).
with_options() returns a new client instance with the override applied,
leaving the original shared client untouched.

Update tests to assert with_options is called with the correct timeout
and that the original shared client's timeout attribute is not mutated.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(foundry): assert with_options return value flows to instance.client (#6241)

The four timeout propagation tests verified that with_options was called
but did not confirm that the returned (timeout-configured) client was
actually stored on the instance. A silent discard of the return value
would have left the tests green while the timeout had no effect.

Each test now captures the constructed instance and asserts:
  assert <instance>.client is openai_client_mock.with_options.return_value

Affected tests:
- test_raw_foundry_agent_chat_client_init_applies_timeout_to_openai_client
- test_raw_foundry_agent_chat_client_init_applies_timeout_with_preview_enabled
- test_foundry_agent_chat_client_init_propagates_timeout
- test_foundry_agent_init_propagates_timeout_to_openai_client

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Evan Mattson · 2026-06-04 18:25:18 +00:00

6b94315161

fix: drop hosted MCP calls when reasoning is stripped (#6210 )

Yufeng He · 2026-06-04 18:11:24 +00:00

bc0e65d716

Python: Fix spurious Magentic custom manager warning (#6261 )

* Fix magentic manager warning

* Use typing_extensions.Sentinel for _MISSING sentinel value

Replace the bare object() sentinel with typing_extensions.Sentinel per
PEP 661 (now final). Sentinel provides a proper name and repr
('<_MISSING>') and is the idiomatic approach going forward.

Refs #4306

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: correct Sentinel type annotation for max_stall_count param (#6261)

Use int | Sentinel for max_stall_count parameter type annotation instead
of int with cast(Any, _MISSING) to properly express that the parameter
can hold either an int or the _MISSING sentinel value. This fixes the
pyright reportUnnecessaryComparison errors caused by the types int and
Sentinel having no overlap.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Rename _MISSING sentinel to UNSET in orchestrations

The sentinel is user-visible as a default in public init signatures, so
use UNSET (no leading underscore) instead of the private _MISSING name.
Drop the now-unnecessary reportPrivateUsage ignores on the UNSET imports.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Evan Mattson · 2026-06-04 08:59:04 +00:00

4268080c20

Python: [BREAKING] Upgrade github-copilot-sdk to v1.0.0 (stable) (#6292 )

* Python: Upgrade github-copilot-sdk to v1.0.0 (stable)

Upgrade agent-framework-github-copilot from github-copilot-sdk 1.0.0b2 to the
stable 1.0.0 release, adapting to all breaking API changes.

Source changes (_agent.py):
- SubprocessConfig removed: use RuntimeConnection.for_stdio(path=...) +
  CopilotClient kwargs (connection, log_level, base_directory)
- Import paths: copilot.generated.session_events -> copilot.session_events
- Settings: copilot_home -> base_directory (env GITHUB_COPILOT_BASE_DIRECTORY)
- Default deny handler: PermissionDecisionUserNotAvailable() (from
  copilot.generated.rpc)

Test changes:
- Updated imports and client-construction assertions (kwargs-based)
- Permission handler tests use concrete decision types
  (PermissionDecisionApproveOnce, PermissionDecisionDeniedInteractivelyByUser)

Sample changes:
- Permission handlers use PermissionHandler.approve_all or sync
  approve_and_log pattern (v1.0.0 protocol v3 dispatch is incompatible
  with blocking input() in permission handlers)
- Function approval sample uses asyncio.to_thread for interactive prompts
- Simplified imports across all samples

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review: scope permission handlers, widen type, add test

- Shell sample: only approve kind='shell', deny others
- URL sample: only approve kind='url', deny others
- Use getattr() for kind-specific attributes to satisfy pyright
- Widen PermissionHandlerType to accept async handlers (matches SDK)
- Add test for _deny_all_permissions return value

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix validation script and strengthen test assertion

- Update scripts/sample_validation/create_dynamic_workflow_executor.py to
  use copilot.session_events imports and PermissionHandler.approve_all
- Assert isinstance(result, PermissionDecisionUserNotAvailable) instead of
  stringly-typed kind check

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add integration tests for GitHubCopilotAgent

Add 6 integration tests mirroring .NET coverage:
- Basic non-streaming response
- Streaming response
- Function tool invocation
- Session context (multi-turn)
- Session resume by ID
- Shell command execution

Tests require COPILOT_GITHUB_TOKEN env var (skipped otherwise).
Each test cleans up its Copilot session via delete_session.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Giles Odigwe · 2026-06-04 08:42:35 +00:00

fe08574a7c

Python: Fix compaction message-id collisions and tool-loop summary persistence (#6299 )

* Fix compaction message-id collisions and tool-loop summary persistence

Fixes two bugs in the compaction strategies:

- #5237: incremental group annotation assigned message ids by position
  within the re-annotated slice, so moving the re-annotation start back to
  a previous group start restarted ids at 0 and produced collisions
  (e.g. a user message reusing an assistant message's id), merging groups
  and causing tool-result compaction to wrongly exclude messages.
  group_messages/_ensure_message_ids now take an id_offset and guard
  against existing-id collisions; annotate_message_groups threads the
  slice start index through as the offset.

- #4991: the function-invocation loop copied the message list each
  iteration, so summaries inserted by compaction landed in a throwaway
  copy and were lost across tool-loop iterations (only the persistent
  excluded flags survived). _prepare_messages_for_model_call now compacts
  the list in place when messages is a list, so inserted summaries persist.

Adds regression tests (incremental id uniqueness, existing-id collision
avoidance, idempotency, and tool-loop summary persistence including
streaming and conversation-id modes).

Also adds a summarization.py sample demonstrating SummarizationStrategy
directly with a real client, and reworks advanced.py with tool-call
groups and a real summarizer.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Guard incremental message-id assignment against prefix-id collisions

Addresses PR review on #5237: _ensure_message_ids only guarded against
collisions within the re-annotated slice. A preexisting (e.g. user-supplied)
id in the preserved prefix could still be reassigned in the suffix when the
id was numerically out of position, merging groups across the re-annotation
boundary again.

group_messages/_ensure_message_ids now accept reserved_ids, and
annotate_message_groups passes the preserved prefix's ids so auto-assigned
suffix ids never collide across the full list. Adds a regression test
reproducing the out-of-position prefix-id collision.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-06-04 08:37:59 +00:00

f970a699d8

Python: run sync tools off the event loop (#5773 )

* fix: run sync tools off event loop

* chore: silence harness tool marker type check

Yufeng He · 2026-06-04 04:42:08 +00:00

f29bae8fbc

Fix Observability/WorkflowAsAnAgent sampl (#6316 )

Peter Ibekwe · 2026-06-03 23:52:50 +00:00

c3901a4ddd

Don't count dependabot prs as part of the limit (#6317 )

Evan Mattson · 2026-06-04 08:31:36 +09:00

ba617fc3b5

Updating dotnet package versions for 1.9 release (#6314 )

Co-authored-by: Ben Thomas <25218250+alliscode@users.noreply.github.com>

Ben Thomas · 2026-06-03 20:03:21 +00:00

dotnet-1.9.0 afa7834e2e

Python: Add MCP-based skills discovery (McpSkillsSource) (#6169 )

* Add MCP-based skills discovery (McpSkill, McpSkillsSource, McpSkillResource)

Implement Agent Skills discovery over MCP following the SEP-2640 convention:
- McpSkillsSource: reads skill://index.json to discover skills served by an MCP server
- McpSkill: lazily fetches SKILL.md content via resources/read on demand
- McpSkillResource: wraps MCP resource results (text and binary)
- Path traversal protection in get_resource for defense in depth
- Samples for Foundry Toolbox and standalone MCP skills server
- Comprehensive unit tests (514 lines)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review comments: rename to MCP* convention, fix error handling and samples

- Rename McpSkill/McpSkillResource/McpSkillsSource to MCPSkill/MCPSkillResource/MCPSkillsSource
- Add data-URI prefix stripping for blob resource decoding
- Let non-McpError exceptions propagate from get_resource()
- Fix contradictory test comment
- Use interactive input() in mcp_based_skill sample
- Remove misleading sample output block

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Restore debug logging for McpError in get_resource()

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Use AzureCliCredential in Foundry toolbox skills sample for consistency

Replace DefaultAzureCredential with AzureCliCredential to match the
credential convention used in all other samples.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Use MCPStreamableHTTPTool in MCP skills sample

Replace raw mcp library imports (ClientSession, streamable_http_client)
with the framework's MCPStreamableHTTPTool to keep MCP server connections
consistent regardless of whether skills are enabled.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Branch on McpError.error.code so only not-found errors return empty

Previously _try_read_index() and get_resource() swallowed every McpError
as 'no skills available', making auth failures, server crashes, and
connection drops indistinguishable from a server that simply has no
skills.

Now only two codes are treated as not-found:
- -32002 (MCP-spec Resource not found)
- -32601 (METHOD_NOT_FOUND — server lacks resources/read)

All other McpError codes and non-McpError exceptions propagate with a
warning log, surfacing real failures visibly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add tests for non-McpError and non-not-found error propagation in MCP skills

Cover the re-raise branch in MCPSkill.get_resource for plain
ConnectionError/TimeoutError, the generic McpError (code 0) propagation
on get_resource, and TimeoutError propagation in _try_read_index.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Revert "Use MCPStreamableHTTPTool in MCP skills sample"

This reverts commit f31ed0ded9.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Introduce MCP_SKILLS experimental feature for MCP skill classes

Add a separate MCP_SKILLS feature ID to ExperimentalFeature enum and
use it for MCPSkillResource, MCPSkill, and MCPSkillsSource, since their
promotion timeline is partly outside of our control.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

semenshi-m · 2026-06-03 18:09:50 +00:00

c6951c21f6

.NET: Bug fixes for AGUI hosting and workflows (#6311 )

* Add mcp tool execution fix

* Apply IsolationKeyScopedAgentSessionStore to MapAGUI by default if not yet set and improve comments in samples

* Address PR comments

* Fix formatting

westey · 2026-06-03 17:45:58 +00:00

a982428916

.NET: Add ILoggerFactory and IServiceProvider to HarnessAgent constructor (#6273 )

* Add ILoggerFactory and IServiceProvider to HarnessAgent constructor

Add optional ILoggerFactory and IServiceProvider parameters to the
HarnessAgent constructor and AsHarnessAgent extension method, passing
them to all downstream components that accept them:

- FunctionInvokingChatClient (via UseFunctionInvocation)
- CompactionProvider
- AgentSkillsProvider
- ChatClientAgent (via BuildAIAgent)
- AIAgentBuilder.Build()

Closes #6103

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Improve tests to verify ILoggerFactory and IServiceProvider propagation

- Add test verifying ILoggerFactory.CreateLogger() is called by
  downstream components (CompactionProvider, AgentSkillsProvider)
- Add test verifying IServiceProvider is queried during pipeline build

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

westey · 2026-06-03 09:09:39 +00:00

90a3e5de47

Python: progressive tool exposure via FunctionInvocationContext (#6233 )

* Python: progressive tool exposure via FunctionInvocationContext

Add first-class progressive tool exposure to the Python core function-calling
loop. Tools can now add or remove real FunctionTool schemas at runtime via the
injected FunctionInvocationContext, taking effect on the next iteration of the
loop.

- FunctionInvocationContext gains a live `tools` list plus experimental
  `add_tools()` / `remove_tools()` helpers (feature: PROGRESSIVE_TOOLS).
- The function-calling loop establishes a run-local, normalized tools list and
  threads it into the context at both invocation paths so mutations propagate.
- Add a sample (dynamic_tool_exposure.py) and a tools samples README, including
  a note that CodeAct providers (Monty/Hyperlight) use their own provider-level
  tool management instead.

Supersedes #3877.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Validate non-negative input in dynamic_tool_exposure sample tools

Address review feedback: factorial and fibonacci now return an error
message for negative n instead of producing incorrect results.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Make add_tools atomic and surface swallowed function errors

Address review feedback on progressive tool exposure:

- add_tools now validates the full batch against a throwaway copy before
  committing, so a duplicate-name clash partway through a sequence leaves
  the live tool list unchanged (all-or-nothing).
- _auto_invoke_function now logs a warning (with traceback) when a tool
  raises, so contract errors such as a duplicate-name ValueError from
  add_tools are debuggable without enabling include_detailed_errors.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Avoid retaining tracebacks when logging swallowed function errors

Logging with exc_info=exc fed the exception traceback to the logging
machinery, whose frame references created reference cycles collected
lazily by the cyclic GC. On Windows that could drop a hyperlight
WasmSandbox on a non-owning thread ("unsendable, dropped on another
thread"), crashing the xdist worker. Log a pre-formatted message with
the exception repr instead, so no traceback object is retained.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* added missing decorator

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-06-03 09:01:07 +00:00

49a6e433a3

Python: Promote agent-framework-declarative package to RC (#6256 )

* Promote agent-framework-declarative package to RC

* Update missed package status file.

Peter Ibekwe · 2026-06-02 19:30:05 +00:00

6086a74302

Python: Fix FoundryAgent stripping model from PromptAgent requests (#5526 )

* Fix FoundryAgent stripping model from PromptAgent requests

Move run_options.pop('model', None) inside the _uses_foundry_agent_session()
conditional so that model is only stripped for hosted agent sessions (where
the server manages the model) and preserved for PromptAgent requests that
require it in the Responses API call.

Fixes #5525

* test: add coverage for resp_* continuation preserving model

Adds test_raw_foundry_agent_chat_client_prepare_options_preserves_model_for_resp_continuation
to explicitly verify that HostedAgent v1 / v2-no-session paths (where conversation_id
starts with resp_) preserve model and previous_response_id without triggering the
hosted-session gate.

---------

Co-authored-by: Benke Qu <bequ@microsoft.com>
Co-authored-by: Evan Mattson <35585003+moonbox3@users.noreply.github.com>

Benke Qu · 2026-06-02 18:30:04 +00:00

fa8cfb7567

.NET: Promote Workflows.Declarative packages to stable versions (#6254 )

* Promote Workflows.Declarative packages to stable versions

* Address PR feedback: enable package validation on GA declarative packages

Both Workflows.Declarative and Workflows.Declarative.Mcp set IsReleased=true

but were disabling package validation, bypassing the repo's GA convention

(see dotnet/nuget/nuget-package.props which auto-enables validation when

IsReleased=true).

Re-enable validation by removing the local EnablePackageValidation=false

overrides and pointing PackageValidationBaselineVersion at 1.8.0-rc1 (the

latest published version of each package). This catches accidental breaking

changes between RC and the first GA. Future GAs should bump the baseline to

the previous GA version.

Verified locally: dotnet build -c Release on both projects runs

RunPackageValidation -> APICompat ran successfully without finding any

breaking changes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Update statement for the baseline validation.

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Peter Ibekwe · 2026-06-02 15:10:02 +00:00

6de4c24fdd

Python: Fix OTLP HTTP base-endpoint losing /v1/{signal} auto-append (#5913 )

* Python: Fix OTLP HTTP base-endpoint losing /v1/{signal} auto-append

Per the OTel spec, OTEL_EXPORTER_OTLP_ENDPOINT is a *base* URL for HTTP —
the SDK auto-appends /v1/traces, /v1/metrics, /v1/logs when it reads the
env var directly. Signal-specific endpoint env vars are *full* URLs used
verbatim.

_get_exporters_from_env read the base endpoint and forwarded it as the
constructor ``endpoint=`` argument, which the SDK always treats as a full
signal URL. As a result, with OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
and HTTP protocol, the exporter sent to http://localhost:4318 instead of
http://localhost:4318/v1/traces (and likewise for metrics/logs).

Replicate the spec's auto-append here when falling back to the base
endpoint under HTTP. gRPC behavior is unchanged.

* Python: Fix mypy type errors in OTLP endpoint assignment

Pre-declare traces_endpoint, metrics_endpoint, logs_endpoint as
str | None before the if/else block. Mypy inferred str from the
if-branch f-string assignments and then rejected the str | None
expressions in the else-branch as incompatible.

Dineshsuriya D · 2026-06-02 09:59:50 +00:00

a5f355e04a

.NET: Add Hosted-ToolboxMcpSkills sample (#6175 )

* .NET: Add Hosted-ToolboxMcpSkills sample

Adds a hosted Foundry Responses sample that discovers MCP-based skills from a Foundry Toolbox and makes them available to the agent via AgentSkillsProvider.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Align README and Program.cs default model to gpt-5

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Clarify MCP skills provider log to avoid implying eager discovery

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Drop redundant skills provider configured log

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add Foundry Toolbox Skills tag to manifest

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Simplify BearerTokenHandler by deriving from HttpClientHandler

Removes the need for an explicit InnerHandler. Enables CheckCertificateRevocationList to satisfy CA5399.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

semenshi-m · 2026-06-02 08:41:21 +00:00

0cf48923cd

ci: harden Python test coverage workflow (#5982 )

Improve input handling and token management in the Python test coverage
workflows.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Giles Odigwe · 2026-06-02 07:43:08 +00:00

cdc4809b8a

Python: Persist hosted MCP call/results as canonical mcp_call output (#6070 )

* Persist hosted MCP call/results as canonical mcp_call output

- Preserve hosted MCP call/result pairs as canonical mcp_call output items

- Coalesce MCP call + result in non-streaming conversion path

- Keep call-id alignment for MCP tool call tracking and output mapping

- Update tests and package metadata

* Fix missing Mapping import in hosted responses adapter

* Fix pyright unknown type in MCP output stringification

* Fix typing for MCP output sequence iteration

* Improve MCP output robustness and avoid eager flattening

* Bump foundry_hosting to b7 and update responses dependency to b7

* Restore foundry_hosting package version to 1.0.0a260521

* Refactor hosted MCP output parsing

Hameed Kunkanoor · 2026-06-02 07:30:36 +00:00

043208241a

fix: skip orphan anthropic thinking signatures (#5784 )

Yufeng He · 2026-06-02 00:48:42 +00:00

05ebb966cf

Fix open pr count check (#6255 )

Evan Mattson · 2026-06-02 09:09:36 +09:00

c83a944e85

Python: feat(bedrock): implement native structured output support via Converse API (#6052 )

* feat(bedrock): add structured output support via Converse API (Fixes #5966)

* fix(bedrock): improve unsupported model exception handling and schema parsing

* refactor(bedrock): use generic traversal for strict schema enforcement

* address Copilot review comments on structured output

* refine bedrock structured output: guard additionalProperties, TypeError check, docs + test

* fix(bedrock): widen response_format to Mapping and add missing test coverage

Thota Sai Karthik · 2026-06-01 23:30:19 +00:00

5d98beddf5

Python: feat(evals): Foundry Adaptive Evals integration (rubric-generation) (#6101 )

* Python: feat(evals): RubricScore type + EvalScoreResult.dimensions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: feat(foundry-evals): RubricDimension + GeneratedEvaluatorRef + accept in evaluators=

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: feat(evals): parse rubric_scores from output items + assertion helpers

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: feat(evals): BaseAgent.as_eval_source / Workflow.as_eval_source

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: feat(foundry-evals): EvalGenerationSource + generate_rubric helper

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: feat(foundry-evals): YAML config loader + sample

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: fix(evals): address PR review feedback

Addresses 4 Copilot review comments on PR #6101:

1. assert_dimension_score_at_least: drop the (not evaluator or found_any) guard so require_applicable=True correctly raises when the named evaluator produces no entries for the dimension. Adds TestRubricAssertions covering the regression.

2. GeneratedEvaluatorRef docstring: reword to describe actual behaviour (pinning recommended, not required) so it matches the dataclass default and FoundryEvals warning path.

3. _poll_generation_job: switch from asyncio.get_event_loop() to get_running_loop() and bound the per-iteration sleep by remaining time, matching _poll_eval_run.

4. generate_rubric: type category as Literal['quality','safety'] and validate at the entry point with a ValueError; drop the silent 'invalid -> quality' rewrite in _generation_job_to_ref. Adds a regression test.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: feat(foundry-evals): hosted-agent-aware rubric generation

* Auto-detect hosted Foundry agents in agent_as_eval_source: when the
  agent's chat_client exposes a string agent_name (the convention used
  by RawFoundryAgentChatClient for PromptAgents/HostedAgents), emit a
  type='agent' EvalGenerationSource so the service fetches instructions
  and tools from the agent registry instead of relying on the local
  wrapper (which holds neither for hosted agents).
* Add hosted_agent_version kwarg and a new agent_version field on
  EvalGenerationSource so PromptAgent runs can pin to a specific hosted
  version for reproducible rubric generation.
* Add force_prompt_source escape hatch to bypass auto-detection and
  always emit a rendered prompt dossier - useful when the local wrapper
  carries overrides the service-side agent doesnt see.
* Fix _to_sdk_source for dataset sources: SDK ctor takes name=/version=,
  not dataset_name=/dataset_version=. The mismatch would raise TypeError
  against the real azure-ai-projects 2.3.0a* SDK; only unmocked
  integration paths were affected.

Tests cover: auto-detection happy path, versionless hosted agent,
explicit hosted_agent_version forwarding, force_prompt_source override,
non-string chat_client attrs (MagicMock test doubles) not mis-detected,
agent_version forwarded through _to_sdk_source, and the corrected
dataset SDK kwarg names.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(foundry-evals): accept canonical dimension_scores key per docs

The published Foundry rubric-evaluator output (Microsoft Learn 'Rubric evaluators' reference) places per-dimension breakdowns under properties.dimension_scores, not properties.rubric_scores. The parser now tries dimension_scores first and falls back to rubric_scores for preview-build compatibility, and tolerates non-list payloads (e.g. MagicMock auto-attrs) by trying the next candidate when parsing yields zero entries.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(foundry-evals): add manual create_rubric_evaluator

Adds FoundryEvals.create_rubric_evaluator as the agent-framework surface over project_client.beta.evaluators.create_version. This is the manual counterpart to generate_rubric: callers supply RubricDimension instances (authored locally, ported from another framework, or hand-tuned) and we POST a RubricBasedEvaluatorDefinition. The service auto-attaches the non-editable residual dimension (general_quality for quality, general_policy_compliance for safety).

Per the Microsoft Learn 'Rubric evaluators' reference, the auto-generation path (create_generation_job) is primarily a portal/UI feature; external SDK clients with rich local agent context are better served by manual create_version. This keeps generate_rubric for users who want to round-trip through a Foundry-registered agent.

Validation up front: weight must be in [1,10], ids unique, descriptions non-empty, pass_threshold in [0,1]. The returned GeneratedEvaluatorRef is identical in shape to one obtained from generate_rubric, so downstream evaluators= lists work unchanged.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* samples(foundry-evals): manual rubric sample + namespace re-exports

Adds evaluate_with_manual_rubric_sample.py demonstrating the end-to-end dev scenario for FoundryEvals.create_rubric_evaluator: hand-author a list of RubricDimension, register via create_rubric_evaluator, then use the pinned GeneratedEvaluatorRef alongside built-in evaluators in an agent regression run.

Also re-exports RubricDimension, GeneratedEvaluatorRef, build_sources, and load_evals_config from agent_framework.foundry (both the lazy runtime shim and the type stub) so the rubric samples can import everything from a single namespace; the auto-generate sample was previously broken because the shim was missing build_sources / load_evals_config.

Updates the foundry-evals README with a chooser entry for the two rubric paths.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(foundry-evals): remove rubric creation flows; keep consumption only

Reframes agent-framework as a pure consumer of Foundry rubric evaluators: scoring against rubrics that already exist (authored in the Foundry portal or via the dedicated SDK / REST surface) instead of creating them from the SDK.

Removed creation surface area:

- FoundryEvals.generate_rubric (auto-generate path) and create_rubric_evaluator (manual path), plus all _GenerationSdkTypes / _ManualRubricSdkTypes / _to_sdk_dimensions / _coalesce_generation_sources / _to_sdk_source / _poll_generation_job / _generation_job_to_ref / _evaluator_version_to_ref / _get_beta_evaluators / _import_*_sdk_types helpers.

- EvalGenerationSource (the input source discriminator), RubricDimension (the input dimension type), agent_as_eval_source / workflow_as_eval_source / _detect_hosted_foundry_agent helpers, and the YAML-config loader (_evals_config.py with RubricGenerationSpec / RubricSourceSpec / parse_evals_config / load_evals_config / build_sources).

- BaseAgent.as_eval_source / Workflow.as_eval_source plus the _render_agent_dossier / _render_workflow_dossier helpers in core. These existed only to feed the now-removed generation pipeline.

- Samples evaluate_with_generated_rubric_sample.py, evaluate_with_manual_rubric_sample.py, and evaluators.yaml. Replaced with a short README section showing how to reference an existing rubric evaluator via GeneratedEvaluatorRef.

Kept (consumption surface):

- GeneratedEvaluatorRef, slimmed to (name, version, display_name). Still accepted alongside built-in evaluator strings in FoundryEvals(evaluators=[...]). Versionless refs still warn.

- RubricScore on EvalScoreResult.dimensions plus EvalResults.assert_dimension_score_at_least for per-dimension CI gates.

- _parse_dimension_entries / _extract_rubric_scores output parsing (both canonical dimension_scores and the legacy rubric_scores key).

Tests: 160/160 foundry unit tests and 71/71 core local-eval tests pass; pyright is clean across changed files. The pre-existing tests/core/test_telemetry.py::test_detect_hosted_fallback_import_error failure is unrelated and reproduces on the prior commit.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* samples(foundry-evals): add evaluate_with_rubric_sample

Adds a runnable end-to-end sample showing how to consume a pre-existing rubric evaluator created in Foundry: reference it with GeneratedEvaluatorRef(name, version), mix it with built-in evaluators in FoundryEvals, and gate CI with assert_dimension_score_at_least on a specific dimension.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(foundry-evals): satisfy mypy on _fetch_output_items

mypy infers OutputItemListResponse.sample as dict[str, object] | None while pyright correctly infers the typed Sample model. Cast to Any so both type checkers accept the attribute access pattern, rename the local to avoid shadowing the inner-loop sample binding, and drop the now-stale pyright suppressions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(foundry-evals): drop unpublished rubric-evaluators learn.microsoft.com link

The Adaptive Evals authoring docs are not yet published on Microsoft Learn, so the link 404s. Keep the descriptive text without the broken hyperlink; we can re-add it once the docs ship.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(foundry-evals): hoist repeated local imports to module top

Per code review feedback (eavanvalkenburg): the test file repeated 'from agent_framework_foundry._foundry_evals import ...' inside 22 test bodies and 'from agent_framework_foundry import GeneratedEvaluatorRef' inside 8 more. Move all of them to the existing top-level imports; the symbols are the same across tests and the local imports were redundant.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Ben Thomas <25218250+alliscode@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Ben Thomas · 2026-06-01 23:01:56 +00:00

e0d0ad16a0

Python: Fix core observability unsafe serialization of function-call arguments containing dataclass/framework objects (#6026 )

* fix: safely serialize function-call arguments in core observability

Apply make_json_safe() to content.arguments in _to_otel_part() before
building the otel message dict, so that dataclass/framework payloads
(e.g. workflow request_info events) do not cause a TypeError when
_capture_messages() calls json.dumps().

Lift make_json_safe() into agent_framework._serialization (no new
external deps — dataclasses/datetime only) so the core observability
path can use it without a dependency on the ag-ui adapter.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(core): safely serialize workflow request_info payloads in observability (#5733)

- Add make_json_safe() helper to recursively convert non-serializable objects
- Use make_json_safe() in _to_otel_part() for function_call arguments
- Fix CustomPayload test class to use @dataclass (resolves B903 lint error)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(serialization): guard callability and normalize dict keys in make_json_safe (#5733)

- Use callable(getattr(obj, method, None)) instead of hasattr() so that
  non-callable attributes named model_dump/to_dict/dict do not raise
  TypeError at runtime.
- Wrap each call in try/except TypeError to handle callables with
  mandatory arguments gracefully.
- Convert dict keys to str() so that non-string keys (e.g. datetime,
  int) cannot cause json.dumps to raise TypeError.
- Add regression tests for both scenarios.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address observability serialization review feedback

---------

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Evan Mattson · 2026-06-01 21:41:52 +00:00

f36096ce1a

.NET: Update hosted agents (#6243 )

* Updating to latest Foundry hosting packages.

* Re-applying .gitignore.

* Adding empty line at end of .gitignore

---------

Co-authored-by: Ben Thomas <25218250+alliscode@users.noreply.github.com>

Ben Thomas · 2026-06-01 21:27:29 +00:00

03e14ca187

.NET - Fix missing id on function_call_output in Foundry Hosting (#6246 )

* Fix missing id on function_call_output in Foundry Hosting

The Foundry storage layer was rejecting responses with
"ID cannot be null or empty (Parameter 'id')" because
function_call_output items emitted by OutputConverter had no id on
the wire.

OutputItemFunctionToolCallOutput's public ctor only sets CallId and
Output; Id is read-only and only the SDK's internal ctor populates
it. OutputItemBuilder<T>.ApplyAutoStamps fills ResponseId and
AgentReference but not Id, so the itemId passed to
AddOutputItem<T>(itemId) was used only for event sequencing and the
serialized item went out with id=null.

Switch to stream.OutputItemFunctionCallOutput(callId, output), the
SDK convenience method that uses the internal ctor and stamps the
id. Add a regression test asserting the added/done events carry a
non-empty matching Id.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* ci: free disk space and relocate NuGet cache on ubuntu runners

The ubuntu-latest dotnet-build/test jobs were hitting No space left on device because the runner image only ships ~14 GB free on /. The full multi-TFM build plus the dotnet pack + console-app install-check exhausts that easily.

Add a reusable composite action .github/actions/free-runner-disk-space that runs on Linux runners only and:

* removes pre-installed toolchains we never use here (Android SDK, GHC/Haskell, CodeQL, PyPy, Ruby, Go, boost, vcpkg, etc.), prunes docker images, and disables swap (reclaims ~25-30 GB on /)

* relocates the NuGet package cache to /mnt/nuget via NUGET_PACKAGES env, since /mnt has ~75 GB free on hosted runners

Wire the action into the four ubuntu-touching jobs in dotnet-build-and-test.yml (dotnet-build, dotnet-test, dotnet-foundry-hosted-it, dotnet-test-functions). The action self-guards with runner.os == 'Linux' so the matrix legs that run on windows are unaffected.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: alliscode <25218250+alliscode@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Ben Thomas · 2026-06-01 18:43:45 +00:00

b298113d15

Python: refresh dev dependencies and validate runtime bounds (#6238 )

Updates third-party dev dependencies across the Python workspace and
validates that all runtime dependency bounds still hold at both ends.

Dev dependency bumps (root, lab, declarative, durabletask):
- uv 0.11.6 -> 0.11.17, ruff 0.15.8 -> 0.15.15,
  pytest-asyncio 1.3.0 -> 1.4.0, mcp 1.27.0 -> 1.27.2,
  azure-monitor-opentelemetry 1.8.7 -> 1.8.8,
  poethepoet 0.42.1 -> 0.46.0, prek 0.3.9 -> 0.4.3,
  types-python-dateutil and types-PyYaml stub bumps.
- Transitive Dependabot items swept via lock: idna 3.11 -> 3.17,
  pip 26.0.1 -> 26.1.2.

Deliberately excluded:
- opentelemetry-sdk stays 1.40.0: azure-monitor-opentelemetry (incl.
  1.8.8) hard-pins opentelemetry-sdk==1.40.
- mypy stays 1.20.0 and pyright stays 1.1.408: the 2.1.0 / 1.1.409
  bumps introduce new diagnostics that fail type checking and need
  dedicated PRs.
- rich kept as a range: agentlightning (lab[lightning]) forces
  rich==13.9.4.

Code/formatting changes driven by the ruff upgrade:
- devui lifespan now uses try/finally so shutdown cleanup always runs
  (ruff RUF075).
- Removed unused TYPE_CHECKING imports in core and foundry flagged by
  ruff 0.15.15.
- Reapplied ruff 0.15.15 formatting to the files it changed.

Validation: validate-dependency-bounds-test "*" passes (31/31 lower +
31/31 upper); typing 62/62; lint 31/31; devui tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-06-01 17:53:56 +00:00

8091d052d8

Python: Add background agent support to harness agent (#6155 )

* Add background agent support to harness agent

* Address PR comments

westey · 2026-06-01 17:20:39 +00:00

52a8045bb6

2250 Commits