Commit Graph

1102 Commits

  • Python: Integrate shell tool into harness agent (#6451)
    * Integrate shell tool into AgentHarness
    
    * Validate shell_executor exposes as_function() with a clear TypeError
    
    Addresses PR review feedback: a public factory should fail fast with an
    actionable error rather than a cryptic AttributeError when an incompatible
    shell_executor is supplied. Validation happens upfront, regardless of whether
    the client supports shell tools.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Type shell harness params via TYPE_CHECKING import
    
    Addresses PR review feedback: type shell_executor and
    shell_environment_provider_options instead of Any, using a TYPE_CHECKING
    import from agent_framework_tools.shell. The import never executes at
    runtime, so there is no circular dependency, and the lazy runtime import of
    ShellEnvironmentProvider is retained. Since ShellExecutor is a protocol
    without as_function(), the validated getattr result is invoked directly.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: Add tool approval middleware (#6414)
    * Add Python tool approval middleware
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix tool approval restored state handling
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Gate hidden approvals on explicit approval responses
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Handle string inputs in approval replay scan
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Cover argument-scoped approval rules
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Refine tool approval state and budgets
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix tool approval PR CI failures
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Revert DevUI Aspire README link change
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: [Generated by SRE Agent] Fix MCP allowed_tools empty list handling (#6296)
    * Fix MCP allowed_tools empty list handling
    
    When allowed_tools is set to an empty list [], the falsy check
    'if not self.allowed_tools' incorrectly treats it as unconfigured
    (same as None), causing all tools to be exposed. Change to an
    explicit 'is None' check so that an empty list correctly results
    in no tools being allowed.
    
    Co-authored-by: Azure SRE Agent <noreply@microsoft.com>
    
    * Clarify allowed_tools docstring: None vs [] semantics
    
    Per Eduard's review on PR #6296: explicitly document that None exposes all tools and [] exposes none, across all four MCPTool / MCPStdioTool / MCPStreamableHTTPTool / MCPWebsocketTool docstrings.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * allowed_tools docstring: recommend load_tools=False for full disable
    
    Per Eduard's follow-up on PR #6296: `load_tools=False` is the cleaner idiom when you don't want to expose any tools. Reframe `allowed_tools=[]` in the docstring as a runtime guard / inspection-only path and cross-reference `load_tools`.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Azure SRE Agent <noreply@microsoft.com>
    Co-authored-by: Giles Odigwe <79032838+giles17@users.noreply.github.com>
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: HarnessAgent: Disable compaction when max tokens not provided (#6410)
    * HarnessAgent: Disable compaction when max tokens not provided
    
    * Fix regression.
    
    * Address PR comments
    
    * Require max_output_tokens to be positive
    
    Reject max_output_tokens=0 (must be positive), mirroring
    max_context_window_tokens. Addresses PR review feedback.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: Parse MCP CallToolResult.structuredContent field to prevent tool results returning None (#6421)
    * Parse structuredContent from MCP CallToolResult (#3313)
    
    The _parse_tool_result_from_mcp method only iterated over the content
    field from CallToolResult, ignoring the structuredContent field entirely.
    MCP servers that return JSON data via structuredContent (e.g., Power BI
    MCP) appeared to return None.
    
    Add handling for structuredContent: when present, serialize it as JSON
    text and append it to the result list. This preserves the data for the
    LLM while maintaining backward compatibility with existing behavior.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Python: Parse MCP CallToolResult.structuredContent field to prevent tool results returning None
    
    Fixes #3313
    
    * Address review feedback: add default=str to json.dumps and remove .checkpoints/
    
    - Add default=str to json.dumps for structuredContent serialization so
      non-JSON-serializable values (e.g. bytes) degrade gracefully instead
      of raising TypeError
    - Remove all .checkpoints/ runtime artifacts from the repository
    - Add **/.checkpoints/ to .gitignore to prevent future accidental commits
    - Add test for non-serializable structuredContent values
    
    Fixes #3313
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address review feedback for #3313: Python: MCP CallToolResult.structuredContent field is not parsed, causing tool results to return None
    
    ---------
    
    Co-authored-by: Copilot <copilot@github.com>
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: [BREAKING] Add sampling guardrails to MCP tools (#6413)
    * Add sampling guardrails to MCP tools
    
    Add approval, token, and request-count controls to the MCP sampling
    callback used when an MCPTool is configured with a chat client.
    
    - Add `sampling_approval_callback`, `sampling_max_tokens`, and
      `sampling_max_requests` parameters to `MCPTool` and its
      `MCPStdioTool`, `MCPStreamableHTTPTool`, and `MCPWebsocketTool`
      subclasses, positioned directly after `client`.
    - Gate each server-initiated `sampling/createMessage` request behind the
      approval callback, which denies by default when no callback is provided.
    - Clamp the requested `maxTokens` to `sampling_max_tokens` and enforce a
      per-session request count via `sampling_max_requests`.
    - Log incoming sampling requests at WARNING level (counts only).
    - Export `SamplingApprovalCallback` from the public API.
    - Add tests, a sample, and documentation updates.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Make sampling denial message context-aware
    
    Distinguish the deny-by-default case (no approval callback configured)
    from an explicit denial by a configured `sampling_approval_callback`, so
    the returned ErrorData message is accurate for callback-driven denials
    and exceptions.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: bump package versions for 1.8.1 release (#6420)
    * Python: bump package versions for 1.8.1 release
    
    * Python: bump agent-framework-foundry-hosting for 1.8.1 release
    
    * Python: bump ag-ui and azurefunctions for 1.8.1 release
    
    * Remove incorrect agent-framework-foundry changelog entry for #6259
    
    * Add [1.8.1] changelog compare link and update [Unreleased] base
    
    ---------
    
    Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
  • Purview: Parallelize PSPC cold-cache scope refresh (#5832)
    * Parallelize Purview PSPC cold cache path
    
    * Cache Purview payment-required state for scope refresh
    
    * Cache Purview payment-required state for scope refresh
    
    * Align Purview policy action dedupe and 402 caching
    
     Deduplicate combined policy actions by action and restriction action so restriction-only actions are preserved
    without duplicating identical entries. Cache tenant-level payment-required state from background scope refresh so
    subsequent calls short-circuit consistently.
    
    * .NET: Implement best-effort caching for background job scope retrieval and add unit tests for cache write failures
    
    * Purview - feat: Enhance ScopedContentProcessor to queue ContentActivityJob when no applicable scopes are found and update related tests
    
    * docs: Update purview package README and AGENTS documentation to reflect caching optimizations and policy enforcement scenarios
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: [Generated by SRE Agent] docs: clarify checkpoint storage security model and deserialization trust boundaries (#6295)
    * docs: clarify checkpoint storage security model and deserialization trust boundaries
    
    Add Security Model documentation sections to the checkpoint encoding and
    Azure Functions serialization modules explaining:
    - Checkpoint storage is a trusted data source requiring access controls
    - The RestrictedUnpickler allowlist is defense-in-depth, not a security boundary
    - Developer responsibilities for securing storage backends
    - Guidance on using allowed_types and strip_pickle_markers
    
    Co-authored-by: Azure SRE Agent <noreply@microsoft.com>
    
    * Apply suggestions from code review
    
    Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Azure SRE Agent <noreply@microsoft.com>
    Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
  • Python: fix: use getattr for non-OpenAI provider response compatibility (#6270)
    * fix: use getattr for non-OpenAI provider response compatibility
    
    Fixes #6234
    Fixes #6235
    
    Use getattr with None fallback for system_fingerprint and output
    attributes to prevent AttributeError when non-OpenAI providers
    return response objects without these fields.
    
    * fix: use typed variable for response output to satisfy pyright
    
    Fixes #6235
    
    Use getattr with None fallback for the output attribute, and assign
    to a typed list variable before the match statement to help pyright
    narrow the response item types correctly.
    
    * fix: rename response_outputs to avoid name collision with case-block variable
    
    Fixes #6235
    
    Rename outputs to response_outputs on line 1974 to avoid mypy error
    about conflicting variable names in the match statement's case blocks.
    Also use list[Any] for explicit generic type annotation.
    
    * fix: use cast(list[Any]) for response output to satisfy pyright
    
    Fixes #6235
    
    The getattr() call returns Unknown type which pyright cannot narrow
    in the match statement. Use an explicit cast to list[Any].
    
    * fix: use hasattr guard instead of getattr for response.output
    
    Fixes #6235
    
    Using hasattr(response, 'output') and then accessing response.output
    directly gives pyright enough type information to verify the match
    statement exhaustiveness. This avoids the cast(list[Any]) approach
    which pyright still flagged as partially unknown.
    
    * fix: use ternary operator for response_outputs assignment
    
    Replace if-else block with ternary expression to satisfy ruff SIM108 lint rule.
    This fixes the Package Checks (3.11) CI failure.
    
    * fix: use ternary with cast for ruff SIM108 and pyright type safety
    
    Replace if-else block with ternary expression using cast(list[Any], ...)
    to satisfy:
    - ruff SIM108 (use ternary instead of if-else)
    - ruff E501 (line length < 120)
    - pyright type narrowing (cast preserves type info lost in ternary)
    
    All local checks pass: ruff check, ruff format, pyright, 298 tests.
    
    * fix: replace hasattr+cast with try/except to preserve pyright types
    
    ---------
    
    Co-authored-by: Tao Chen <taochen@microsoft.com>
  • Python: Add Foundry Toolbox MCP skills hosted agent sample (#6363)
    * Add 12_foundry_toolbox_mcp_skills hosted agent sample
    
    Demonstrates using MCPSkillsSource with a Foundry Toolbox MCP endpoint
    to discover and serve skills via SkillsProvider (progressive disclosure).
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix env var reference in README and reuse local var in main.py
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Potential fix for pull request finding
    
    Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
    
    * Require AZURE_AI_MODEL_DEPLOYMENT_NAME and use placeholder in .env.example
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Document Toolbox MCP skills vs Foundry Skills in sample README
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Reference 12_foundry_toolbox_mcp_skills in parent README
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: SergeyMenshykh <SergeMenshikh@outlook.com>
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
  • Python: Filter MCP tool kwargs to declared params via allowlist (#6399)
    * Filter MCP tool kwargs to declared params via allowlist
    
    Previously MCPTool combined framework runtime kwargs (from
    FunctionInvocationContext.kwargs) with the LLM-supplied arguments and
    stripped only a hardcoded denylist of known framework keys before
    forwarding to the MCP server. Any new framework-injected kwarg leaked to
    the server unless the denylist was updated.
    
    Switch to an allowlist built from each tool's declared parameters
    (inputSchema.properties). Only declared params are forwarded; everything
    else is stripped. Add an `additional_tool_argument_names` constructor
    argument so users can opt extra names back in, globally (Sequence[str])
    and/or per remote tool name (Mapping with reserved "*" global key). The
    existing denylist is kept as a safety net for framework-named params a
    server declares in its schema; explicitly opted-in extras always win. The
    reserved _meta handling is unchanged.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address MCP allowlist review comments and fix reload arg loss
    
    - Fix pyright reportUnknownArgumentType in _load_tools (cast schema properties).
    - Register declared param names before the existing-tool skip guard so that
      tool-list reloads preserve the allowlist for already-loaded tools (previously
      unchanged tools silently dropped all declared args after a background reload).
    - Handle bare-string values in an additional_tool_argument_names mapping instead
      of iterating their characters.
    - Clarify the framework denylist comment: explicit extras override the denylist.
    - Make the extras-override-denylist test unambiguous (opt in a denylisted name).
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: feat(claude): bump claude-agent-sdk to 0.2.87 (#6248)
    * feat(claude): bump claude-agent-sdk to 0.2.87
    
    Upgrade claude-agent-sdk dependency from >=0.1.36,<0.1.49 to >=0.2.87,<0.3.
    
    Changes:
    - Bump version pin in pyproject.toml
    - Add 'xhigh' effort level to ClaudeAgentOptions (Opus 4.7 specific)
    - Expose new upstream SDK options: skills, session_id, task_budget,
      include_hook_events, strict_mcp_config, continue_conversation,
      fork_session
    - Add TaskBudget type import
    - Update uv.lock
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * chore: lower claude-agent-sdk floor to >=0.1.36
    
    Keep the lower bound at 0.1.36 since the 0.1→0.2 transition was additive
    and our code works on older versions as long as new options aren't used.
    This avoids forcing unnecessary upgrades on existing users.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: replace TaskBudget import with inline type for SDK compat
    
    TaskBudget was added in claude-agent-sdk 0.2.93 but does not exist in
    0.2.87. Use dict[str, int] inline type instead so type checking passes
    against 0.2.87. Lock file pinned to 0.2.87.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: Harness console for python (#6312)
    * Add initial harness console for python
    
    * Add textual to project
    
    * Add planning and approval flows with list selector
    
    * Address PR comments
    
    * Fix list selection bug
    
    * Fix PR #6312 round 2 review comments
    
    - Escape untrusted agent text with rich.markup.escape() in observers
      (text_output, planning_output, reasoning_display) to prevent markup injection
    - Remove non-functional 'Always approve' choices from tool_approval.py
      (framework lacks CreateAlwaysApproveToolResponse support)
    - Remove textual from root pyproject.toml dev deps (sample-specific)
    - Add PEP 723 inline script metadata to harness_research.py
    - Narrow except Exception to except NoMatches in list_selection.py
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix build error
    
    * Fix build errors
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: Fix per-service-call history persistence with server-storing clients (#6310)
    * Fix per-service-call history persistence with server-storing clients
    
    When an Agent set require_per_service_call_history_persistence=True together
    with a HistoryProvider, and the chat client stored history server-side by
    default (e.g. OpenAIChatClient, STORES_BY_DEFAULT=True), the external history
    provider was silently never persisted.
    
    Unify persistence on the per-service-call middleware: when the flag is set and
    a HistoryProvider exists, the middleware is always installed and owns
    persistence. service_stores_history now only selects middleware behavior:
    - service does not store: load providers and drive the function loop with a
      local sentinel conversation id, or
    - service stores: skip loading (the service owns history) and persist each
      service call while the real conversation id flows through.
    
    Also rationalize chat-options handling in _prepare_run_context:
    - _merge_options now skips None overrides and strips remaining None values, so
      an unset `store` is never forwarded and the service decides its own default.
    - Resolve `store` and `conversation_id` once from a single combined view
      (effective_options) instead of probing both default and runtime dicts; the
      auto-injection and per-service-call resolution now agree on conversation_id.
    
    Fixes #5798
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Correct as_agent() docstring: persistence is per service call, not once per run
    
    Address PR review: when the client stores history server-side, the
    per-service-call middleware still persists after each model call; only
    provider loading is skipped. The previous "persist once per run()" wording
    contradicted the implementation.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address PR review: docs, missing-conversation-id warning, and tests
    
    - Clarify that require_per_service_call_history_persistence is a no-op when no
      HistoryProvider is present (docstrings in _agents.py and _clients.py).
    - Warn on every service call when the client stores history server-side but
      returns no conversation_id, so the (uncommon) loss of cross-turn resumability
      cannot fail silently.
    - Add tests: storing client + existing conversation_id does not raise and the id
      propagates; two runs on the same session keep persisting with a stable
      service_session_id and no provider loading; storing-without-conversation-id
      warns per call.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: fix(mem0): isolate entity retrieval and correct app_id payload (#6242)
    * fix(mem0): parallel memory retrieval logic and strict type compliance
    
    * fix(mem0): align parallel retrieval types for pyright and mypy
    
    * fix(mem0): handle asyncio.CancelledError in search response and update test description
    
    * fix(mem0): improve error handling for asyncio.CancelledError and update test names for clarity
    
    * fix(mem0): improve retrieval response handling
  • Python: feat(python): Add MCP client OTel spans per GenAI semantic conventions (#6349)
    * feat(python): Add MCP client OTel spans per GenAI semantic conventions
    
    Implement MCP client spans per the OTel GenAI Semantic Conventions for MCP
    (https://opentelemetry.io/docs/specs/semconv/gen-ai/mcp/#client).
    
    Operations instrumented:
    - initialize: CLIENT span capturing MCP session setup
    - tools/list: CLIENT span for tool listing (per-page)
    - prompts/list: CLIENT span for prompt listing (per-page)
    - tools/call: CLIENT span (nested under execute_tool when called via FunctionTool)
    - prompts/get: CLIENT span
    
    Span attributes follow the MCP semantic conventions:
    - Required: mcp.method.name
    - Conditional: error.type, gen_ai.tool.name, gen_ai.prompt.name
    - Recommended: gen_ai.operation.name, mcp.protocol.version, mcp.session.id,
      network.transport, server.address, server.port
    
    Transport-specific attributes per subclass:
    - MCPStdioTool: network.transport=pipe
    - MCPStreamableHTTPTool: network.transport=tcp, network.protocol.name=http
    - MCPWebsocketTool: network.transport=tcp, network.protocol.name=websocket
    
    All span creation gated behind OBSERVABILITY_SETTINGS.ENABLED.
    
    Closes #3624
    Closes #4697
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * refactor: simplify MCP spans — remove enrichment logic and protocol version caching
    
    - Always create nested CLIENT spans for tools/call instead of enriching
      the parent execute_tool span
    - Remove _ACTIVE_TOOL_EXECUTION_SPAN contextvar (no longer needed)
    - Remove enrich_span_with_mcp_attributes() helper
    - Remove _otel_error_type preservation in FunctionTool.invoke()
    - Remove _mcp_protocol_version instance variable; protocol version is
      only set on the initialize span where it is available
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Refine copilot solution
    
    * fix: enable automatic exception recording on MCP spans
    
    Remove record_exception=False and set_status_on_exception=False from
    create_mcp_client_span. Let OTel handle exception recording and status
    setting automatically. The manual set_mcp_span_error calls for tools/call
    still correctly set error.type (which OTel's automatic handling doesn't
    touch), so tool_error is preserved.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Reduce number of lines
    
    * Add comment to sample
    
    * test: address PR review comments on MCP observability tests
    
    - Fix initialize test to call mocked session.initialize() and read
      protocolVersion from the result instead of hardcoding it
    - Add tools/call McpError error-path test
    - Add prompts/get McpError error-path test
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix export error
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: Refactor workflow as agent pending request handling (#6259)
    * WIP: Refactor Workflow as agent pending request handling
    
    * WIP: debugging empty message bug
    
    * Working: Workflow as agent with function approval
    
    * Address Copilot comments
    
    * Fix mypy
    
    * Address comments and fix pipeline
    
    * Request info non function approval now becomes function call
    
    * Revert uv.lock
    
    * Fix mypy
    
    * Bump min version of azure-ai-project
    
    * Remove RequestInfoFunctionArgs
    
    * fix tests
    
    * Fix failing tests
    
    * Fix sample
  • Python (fix:gemini): make Gemini honor declarative outputSchema, not just JSON mode (#5893)
    * fix(gemini): preserve schema response_format
    
    * fix(gemini): satisfy pyright strict in response schema extraction
    
    Cast Any-narrowed mappings to Mapping[str, Any] in the structured-output
    schema helpers so pyright strict no longer reports partially-unknown
    member, argument, and variable types. Pass response_format["format"]
    straight into the recursive extractor, which already guards non-mapping
    inputs. No behavior change.
    
    * fix(gemini): use Sequence[object] cast to satisfy both mypy and pyright
    
    The Sequence[Any] cast pyright strict needs to know the loop element type
    is reported as a redundant-cast by mypy, which already narrows the
    isinstance branch to Sequence[Any]. Cast to Sequence[object] instead:
    pyright gets a fully known element type and mypy no longer sees an
    identical-type cast. No behavior change.
    
    ---------
    
    Co-authored-by: Evan Mattson <evan.mattson@microsoft.com>
  • Python: MCP long-running task support in Python (#6319)
    * MCP long-running task support in Python
    
    * Fix pyupgrade and AGENTS.md reconnect description
    
    - pyupgrade: drop forward-reference string annotations in _mcp.py (Python 3.10+ resolves them natively now that MCPTaskOptions is defined before use).
    
    - AGENTS.md: align reconnect description with current behavior. Phase 1 (initial tools/call) does NOT retry on connection loss; raises 'connection lost; task state unknown' instead, so a server that accepted the request but lost the response cannot start the operation twice. Phase 2 (tasks/get / tasks/result) still reconnects once against the same task_id.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix bandit nosec marker for CI pipeline
    
    * Address PR feedbacks
    
    * Clarifiied comments and addressed more PR feedbacks.
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: bump package versions for 1.8.0 release (#6351)
    - Released cohort (core, openai, foundry, root): 1.7.0 -> 1.8.0
    - agent-framework-github-copilot: promote to RC (1.0.0rc1)
    - agent-framework-orchestrations: rc2 -> rc3 (bug fix)
    - Beta/alpha packages with changes: a2a, anthropic, azurefunctions, bedrock,
      foundry-hosting, mistral bumped to new date stamp (260604)
    - Inter-package dependency bounds updated for changed packages
    - CHANGELOG.md and PACKAGE_STATUS.md updated
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: Fix toolbox consent flow in hosted agent (#6249)
    * Fix toolbox consent flow in hosted agent
    
    * Resolve conflict
    
    * Make unused tool as comment
    
    * Fix tests
  • Python: Add timeout parameter to FoundryAgent to fix ConnectTimeout on multi-turn conversations (#6263)
    * Python: fix ConnectTimeout on multi-turn FoundryAgent conversations (#6241)
    
    Expose a `timeout` parameter on `RawFoundryAgentChatClient`,
    `_FoundryAgentChatClient`, `RawFoundryAgent`, `FoundryAgent`, and
    `RawOpenAIChatClient` so callers can override the HTTP timeout used by
    the underlying AsyncOpenAI client.
    
    Root cause: `RawFoundryAgentChatClient.__init__` called
    `project_client.get_openai_client()` without configuring any timeout,
    inheriting the OpenAI SDK default of `httpx.Timeout(connect=5.0)`.
    When connections are recycled between turns under load, the 5 s connect
    timeout fires and surfaces as `openai.APITimeoutError`.
    
    Fix:
    - `load_openai_service_settings` (`_shared.py`): accept `timeout` and
      include it in `client_args` for all three `AsyncOpenAI`/
      `AsyncAzureOpenAI` construction paths.
    - `RawOpenAIChatClient.__init__` (`_chat_client.py`): accept `timeout`
      and forward to `load_openai_service_settings`.
    - `RawFoundryAgentChatClient.__init__` (`_agent.py`): accept `timeout`
      and set `openai_client.timeout = timeout` on the client returned by
      `get_openai_client()` before passing it to the base class.
    - `_FoundryAgentChatClient`, `RawFoundryAgent`, `FoundryAgent`: accept
      and propagate `timeout` through the construction chain.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Add timeout parameter to FoundryAgent and RawOpenAIChatClient
    
    Expose a timeout parameter on RawFoundryAgentChatClient,
    _FoundryAgentChatClient, RawFoundryAgent, FoundryAgent, and
    RawOpenAIChatClient. When provided, the value is applied to the
    underlying AsyncOpenAI client so that connect timeouts under load
    or after connection recycling can be tuned by callers.
    
    Previously, get_openai_client() was called without any timeout
    override, so the SDK default of httpx.Timeout(connect=5.0) was
    inherited and could fire on multi-turn conversations where the
    underlying connection is recycled between turns.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Python: Add `timeout` parameter to `FoundryAgent` to fix `ConnectTimeout` on multi-turn conversations
    
    Fixes #6241
    
    * fix(foundry): use with_options to avoid mutating shared OpenAI client timeout (#6241)
    
    Replace direct assignment  with
     in
    RawFoundryAgentChatClient.__init__.
    
    The Azure AI Projects SDK caches and returns a shared AsyncOpenAI client
    per AIProjectClient. Mutating its .timeout attribute leaked the override
    to all other code paths sharing that client (other agents, user code).
    with_options() returns a new client instance with the override applied,
    leaving the original shared client untouched.
    
    Update tests to assert with_options is called with the correct timeout
    and that the original shared client's timeout attribute is not mutated.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * test(foundry): assert with_options return value flows to instance.client (#6241)
    
    The four timeout propagation tests verified that with_options was called
    but did not confirm that the returned (timeout-configured) client was
    actually stored on the instance. A silent discard of the return value
    would have left the tests green while the timeout had no effect.
    
    Each test now captures the constructed instance and asserts:
      assert <instance>.client is openai_client_mock.with_options.return_value
    
    Affected tests:
    - test_raw_foundry_agent_chat_client_init_applies_timeout_to_openai_client
    - test_raw_foundry_agent_chat_client_init_applies_timeout_with_preview_enabled
    - test_foundry_agent_chat_client_init_propagates_timeout
    - test_foundry_agent_init_propagates_timeout_to_openai_client
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <copilot@github.com>
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: Fix spurious Magentic custom manager warning (#6261)
    * Fix magentic manager warning
    
    * Use typing_extensions.Sentinel for _MISSING sentinel value
    
    Replace the bare object() sentinel with typing_extensions.Sentinel per
    PEP 661 (now final). Sentinel provides a proper name and repr
    ('<_MISSING>') and is the idiomatic approach going forward.
    
    Refs #4306
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: correct Sentinel type annotation for max_stall_count param (#6261)
    
    Use int | Sentinel for max_stall_count parameter type annotation instead
    of int with cast(Any, _MISSING) to properly express that the parameter
    can hold either an int or the _MISSING sentinel value. This fixes the
    pyright reportUnnecessaryComparison errors caused by the types int and
    Sentinel having no overlap.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Rename _MISSING sentinel to UNSET in orchestrations
    
    The sentinel is user-visible as a default in public init signatures, so
    use UNSET (no leading underscore) instead of the private _MISSING name.
    Drop the now-unnecessary reportPrivateUsage ignores on the UNSET imports.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <copilot@github.com>
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: [BREAKING] Upgrade github-copilot-sdk to v1.0.0 (stable) (#6292)
    * Python: Upgrade github-copilot-sdk to v1.0.0 (stable)
    
    Upgrade agent-framework-github-copilot from github-copilot-sdk 1.0.0b2 to the
    stable 1.0.0 release, adapting to all breaking API changes.
    
    Source changes (_agent.py):
    - SubprocessConfig removed: use RuntimeConnection.for_stdio(path=...) +
      CopilotClient kwargs (connection, log_level, base_directory)
    - Import paths: copilot.generated.session_events -> copilot.session_events
    - Settings: copilot_home -> base_directory (env GITHUB_COPILOT_BASE_DIRECTORY)
    - Default deny handler: PermissionDecisionUserNotAvailable() (from
      copilot.generated.rpc)
    
    Test changes:
    - Updated imports and client-construction assertions (kwargs-based)
    - Permission handler tests use concrete decision types
      (PermissionDecisionApproveOnce, PermissionDecisionDeniedInteractivelyByUser)
    
    Sample changes:
    - Permission handlers use PermissionHandler.approve_all or sync
      approve_and_log pattern (v1.0.0 protocol v3 dispatch is incompatible
      with blocking input() in permission handlers)
    - Function approval sample uses asyncio.to_thread for interactive prompts
    - Simplified imports across all samples
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address PR review: scope permission handlers, widen type, add test
    
    - Shell sample: only approve kind='shell', deny others
    - URL sample: only approve kind='url', deny others
    - Use getattr() for kind-specific attributes to satisfy pyright
    - Widen PermissionHandlerType to accept async handlers (matches SDK)
    - Add test for _deny_all_permissions return value
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix validation script and strengthen test assertion
    
    - Update scripts/sample_validation/create_dynamic_workflow_executor.py to
      use copilot.session_events imports and PermissionHandler.approve_all
    - Assert isinstance(result, PermissionDecisionUserNotAvailable) instead of
      stringly-typed kind check
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Add integration tests for GitHubCopilotAgent
    
    Add 6 integration tests mirroring .NET coverage:
    - Basic non-streaming response
    - Streaming response
    - Function tool invocation
    - Session context (multi-turn)
    - Session resume by ID
    - Shell command execution
    
    Tests require COPILOT_GITHUB_TOKEN env var (skipped otherwise).
    Each test cleans up its Copilot session via delete_session.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: Fix compaction message-id collisions and tool-loop summary persistence (#6299)
    * Fix compaction message-id collisions and tool-loop summary persistence
    
    Fixes two bugs in the compaction strategies:
    
    - #5237: incremental group annotation assigned message ids by position
      within the re-annotated slice, so moving the re-annotation start back to
      a previous group start restarted ids at 0 and produced collisions
      (e.g. a user message reusing an assistant message's id), merging groups
      and causing tool-result compaction to wrongly exclude messages.
      group_messages/_ensure_message_ids now take an id_offset and guard
      against existing-id collisions; annotate_message_groups threads the
      slice start index through as the offset.
    
    - #4991: the function-invocation loop copied the message list each
      iteration, so summaries inserted by compaction landed in a throwaway
      copy and were lost across tool-loop iterations (only the persistent
      excluded flags survived). _prepare_messages_for_model_call now compacts
      the list in place when messages is a list, so inserted summaries persist.
    
    Adds regression tests (incremental id uniqueness, existing-id collision
    avoidance, idempotency, and tool-loop summary persistence including
    streaming and conversation-id modes).
    
    Also adds a summarization.py sample demonstrating SummarizationStrategy
    directly with a real client, and reworks advanced.py with tool-call
    groups and a real summarizer.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Guard incremental message-id assignment against prefix-id collisions
    
    Addresses PR review on #5237: _ensure_message_ids only guarded against
    collisions within the re-annotated slice. A preexisting (e.g. user-supplied)
    id in the preserved prefix could still be reassigned in the suffix when the
    id was numerically out of position, merging groups across the re-annotation
    boundary again.
    
    group_messages/_ensure_message_ids now accept reserved_ids, and
    annotate_message_groups passes the preserved prefix's ids so auto-assigned
    suffix ids never collide across the full list. Adds a regression test
    reproducing the out-of-position prefix-id collision.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: run sync tools off the event loop (#5773)
    * fix: run sync tools off event loop
    
    * chore: silence harness tool marker type check
  • Python: Add MCP-based skills discovery (McpSkillsSource) (#6169)
    * Add MCP-based skills discovery (McpSkill, McpSkillsSource, McpSkillResource)
    
    Implement Agent Skills discovery over MCP following the SEP-2640 convention:
    - McpSkillsSource: reads skill://index.json to discover skills served by an MCP server
    - McpSkill: lazily fetches SKILL.md content via resources/read on demand
    - McpSkillResource: wraps MCP resource results (text and binary)
    - Path traversal protection in get_resource for defense in depth
    - Samples for Foundry Toolbox and standalone MCP skills server
    - Comprehensive unit tests (514 lines)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address PR review comments: rename to MCP* convention, fix error handling and samples
    
    - Rename McpSkill/McpSkillResource/McpSkillsSource to MCPSkill/MCPSkillResource/MCPSkillsSource
    - Add data-URI prefix stripping for blob resource decoding
    - Let non-McpError exceptions propagate from get_resource()
    - Fix contradictory test comment
    - Use interactive input() in mcp_based_skill sample
    - Remove misleading sample output block
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Restore debug logging for McpError in get_resource()
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Use AzureCliCredential in Foundry toolbox skills sample for consistency
    
    Replace DefaultAzureCredential with AzureCliCredential to match the
    credential convention used in all other samples.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Use MCPStreamableHTTPTool in MCP skills sample
    
    Replace raw mcp library imports (ClientSession, streamable_http_client)
    with the framework's MCPStreamableHTTPTool to keep MCP server connections
    consistent regardless of whether skills are enabled.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Branch on McpError.error.code so only not-found errors return empty
    
    Previously _try_read_index() and get_resource() swallowed every McpError
    as 'no skills available', making auth failures, server crashes, and
    connection drops indistinguishable from a server that simply has no
    skills.
    
    Now only two codes are treated as not-found:
    - -32002 (MCP-spec Resource not found)
    - -32601 (METHOD_NOT_FOUND — server lacks resources/read)
    
    All other McpError codes and non-McpError exceptions propagate with a
    warning log, surfacing real failures visibly.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Add tests for non-McpError and non-not-found error propagation in MCP skills
    
    Cover the re-raise branch in MCPSkill.get_resource for plain
    ConnectionError/TimeoutError, the generic McpError (code 0) propagation
    on get_resource, and TimeoutError propagation in _try_read_index.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Revert "Use MCPStreamableHTTPTool in MCP skills sample"
    
    This reverts commit f31ed0ded9.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Introduce MCP_SKILLS experimental feature for MCP skill classes
    
    Add a separate MCP_SKILLS feature ID to ExperimentalFeature enum and
    use it for MCPSkillResource, MCPSkill, and MCPSkillsSource, since their
    promotion timeline is partly outside of our control.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: progressive tool exposure via FunctionInvocationContext (#6233)
    * Python: progressive tool exposure via FunctionInvocationContext
    
    Add first-class progressive tool exposure to the Python core function-calling
    loop. Tools can now add or remove real FunctionTool schemas at runtime via the
    injected FunctionInvocationContext, taking effect on the next iteration of the
    loop.
    
    - FunctionInvocationContext gains a live `tools` list plus experimental
      `add_tools()` / `remove_tools()` helpers (feature: PROGRESSIVE_TOOLS).
    - The function-calling loop establishes a run-local, normalized tools list and
      threads it into the context at both invocation paths so mutations propagate.
    - Add a sample (dynamic_tool_exposure.py) and a tools samples README, including
      a note that CodeAct providers (Monty/Hyperlight) use their own provider-level
      tool management instead.
    
    Supersedes #3877.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Validate non-negative input in dynamic_tool_exposure sample tools
    
    Address review feedback: factorial and fibonacci now return an error
    message for negative n instead of producing incorrect results.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Make add_tools atomic and surface swallowed function errors
    
    Address review feedback on progressive tool exposure:
    
    - add_tools now validates the full batch against a throwaway copy before
      committing, so a duplicate-name clash partway through a sequence leaves
      the live tool list unchanged (all-or-nothing).
    - _auto_invoke_function now logs a warning (with traceback) when a tool
      raises, so contract errors such as a duplicate-name ValueError from
      add_tools are debuggable without enabling include_detailed_errors.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Avoid retaining tracebacks when logging swallowed function errors
    
    Logging with exc_info=exc fed the exception traceback to the logging
    machinery, whose frame references created reference cycles collected
    lazily by the cyclic GC. On Windows that could drop a hyperlight
    WasmSandbox on a non-owning thread ("unsendable, dropped on another
    thread"), crashing the xdist worker. Log a pre-formatted message with
    the exception repr instead, so no traceback object is retained.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * added missing decorator
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: Promote agent-framework-declarative package to RC (#6256)
    * Promote agent-framework-declarative package to RC
    
    * Update missed package status file.
  • Python: Fix FoundryAgent stripping model from PromptAgent requests (#5526)
    * Fix FoundryAgent stripping model from PromptAgent requests
    
    Move run_options.pop('model', None) inside the _uses_foundry_agent_session()
    conditional so that model is only stripped for hosted agent sessions (where
    the server manages the model) and preserved for PromptAgent requests that
    require it in the Responses API call.
    
    Fixes #5525
    
    * test: add coverage for resp_* continuation preserving model
    
    Adds test_raw_foundry_agent_chat_client_prepare_options_preserves_model_for_resp_continuation
    to explicitly verify that HostedAgent v1 / v2-no-session paths (where conversation_id
    starts with resp_) preserve model and previous_response_id without triggering the
    hosted-session gate.
    
    ---------
    
    Co-authored-by: Benke Qu <bequ@microsoft.com>
    Co-authored-by: Evan Mattson <35585003+moonbox3@users.noreply.github.com>
  • Python: Fix OTLP HTTP base-endpoint losing /v1/{signal} auto-append (#5913)
    * Python: Fix OTLP HTTP base-endpoint losing /v1/{signal} auto-append
    
    Per the OTel spec, OTEL_EXPORTER_OTLP_ENDPOINT is a *base* URL for HTTP —
    the SDK auto-appends /v1/traces, /v1/metrics, /v1/logs when it reads the
    env var directly. Signal-specific endpoint env vars are *full* URLs used
    verbatim.
    
    _get_exporters_from_env read the base endpoint and forwarded it as the
    constructor ``endpoint=`` argument, which the SDK always treats as a full
    signal URL. As a result, with OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
    and HTTP protocol, the exporter sent to http://localhost:4318 instead of
    http://localhost:4318/v1/traces (and likewise for metrics/logs).
    
    Replicate the spec's auto-append here when falling back to the base
    endpoint under HTTP. gRPC behavior is unchanged.
    
    * Python: Fix mypy type errors in OTLP endpoint assignment
    
    Pre-declare traces_endpoint, metrics_endpoint, logs_endpoint as
    str | None before the if/else block. Mypy inferred str from the
    if-branch f-string assignments and then rejected the str | None
    expressions in the else-branch as incompatible.
  • Python: Persist hosted MCP call/results as canonical mcp_call output (#6070)
    * Persist hosted MCP call/results as canonical mcp_call output
    
    - Preserve hosted MCP call/result pairs as canonical mcp_call output items
    
    - Coalesce MCP call + result in non-streaming conversion path
    
    - Keep call-id alignment for MCP tool call tracking and output mapping
    
    - Update tests and package metadata
    
    * Fix missing Mapping import in hosted responses adapter
    
    * Fix pyright unknown type in MCP output stringification
    
    * Fix typing for MCP output sequence iteration
    
    * Improve MCP output robustness and avoid eager flattening
    
    * Bump foundry_hosting to b7 and update responses dependency to b7
    
    * Restore foundry_hosting package version to 1.0.0a260521
    
    * Refactor hosted MCP output parsing