Commit Graph

9 Commits

  • [BREAKING] Python: Enable instrumentation by default (#5865)
    * Enable instrumentation by default
    
    * Update samples
    
    * Optimization when span is not recording
    
    * Address Copilot comments
    
    * Revert uv.lock
    
    * Add warning
    
    * Formatting
    
    * Fix mypy
    
    * Add disable_instrumentation() with sticky user-intent semantics
    
    Add a public disable_instrumentation() entry point so users can explicitly opt
    out of Agent Framework telemetry, with a sticky-disable flag that makes the
    user's intent "leading" — no framework code path (foundry's
    configure_azure_monitor, configure_otel_providers, enable_instrumentation,
    enable_sensitive_telemetry, or direct OBSERVABILITY_SETTINGS.enable_*
    writes) can re-enable instrumentation until the user explicitly clears the
    disable with enable_instrumentation(force=True) /
    enable_sensitive_telemetry(force=True).
    
    Also addresses the two remaining unresolved review threads on the PR:
    1. test_observability_settings_defaults_instrumentation_true pins the new
       "ENABLE_INSTRUMENTATION defaults to True when env unset" behavior.
    2. test_enable_instrumentation_reads_env_sensitive_data restores coverage
       for the post-import load_dotenv() fallback path.
    
    Implementation:
    - ObservabilitySettings.enable_instrumentation / enable_sensitive_data become
      properties backed by _enable_*. While _user_disabled is True, the getters
      return False and the setters drop True writes (defense in depth so third-
      party writes can't subvert the disable).
    - Public is_user_disabled read-only property lets integrations (e.g. foundry's
      configure_azure_monitor) cheaply check the disable state without poking at
      privates.
    - enable_instrumentation() and enable_sensitive_telemetry() short-circuit with
      an info log when disabled; gain a force=True kwarg that clears the disable.
    - configure_otel_providers() still creates providers / exporters / views so a
      later force-enable can use them, but logs an info message when called while
      disabled.
    - Foundry's FoundryChatClient.configure_azure_monitor and
      FoundryAgent.configure_azure_monitor early-return when the user has
      disabled, so Azure Monitor's global providers aren't installed unnecessarily.
    
    Tests: 11 new tests covering default-on, env re-read at call time, sticky
    behavior against each re-enable surface (enable_instrumentation,
    enable_sensitive_telemetry, configure_otel_providers, direct attribute
    writes), force=True override, re-arming the disable, and the __all__ export.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * docs: document disable_instrumentation() and force=True paths
    
    Add a "Disabling instrumentation" section to the observability sample README
    that walks through:
    
    - The distinction between the ENABLE_INSTRUMENTATION env var (initial,
      non-sticky) and disable_instrumentation() (process-wide, sticky).
    - Why the sticky semantics matter: framework integrations like
      FoundryChatClient.configure_azure_monitor() can call
      enable_instrumentation() as part of their setup, and the user's opt-out
      needs to win.
    - All five surfaces guarded by the sticky disable (property reads, public
      enable functions, configure_otel_providers, direct attribute writes,
      is_user_disabled-aware integrations).
    - The force=True escape hatch on both enable_instrumentation() and
      enable_sensitive_telemetry().
    - How third-party integrations should consult OBSERVABILITY_SETTINGS.is_user_disabled.
    - The limits of the disable (does not tear down existing providers /
      in-flight spans / third-party instrumentation, does not persist across
      processes).
    
    Cross-links the new section from the ENABLE_INSTRUMENTATION row in the env
    vars table.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * docs: soften disable_instrumentation() overclaim about telemetry guarantees
    
    Replace 'no telemetry will be emitted no matter what' (which is too strong,
    since callers can still pass force=True or mutate private attributes) with
    language framing the disable as a user-intent contract that library and
    framework code is expected to honor: the framework actively short-circuits
    the public enable paths, force=True and private-attribute writes are
    acknowledged as out-of-contract escape hatches that integrations should
    not use on the user's behalf.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * docs: correct observability Dependencies section
    
    - opentelemetry-sdk is no longer a hard dependency; it is lazily imported by
      create_resource(), create_metric_views(), and configure_otel_providers()
      with a clear ImportError when missing. Day-to-day instrumentation works
      with opentelemetry-api alone provided some other component configures the
      global OpenTelemetry providers (Azure Monitor, an APM agent, application
      bootstrap, etc.).
    - opentelemetry-semantic-conventions-ai is no longer used anywhere in the
      source; remove it from the listed dependencies.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * docs: replace stale observability migration guide with current PR's only relevant migration
    
    The old guide documented the move away from setup_observability(otlp_endpoint=...)
    which was an earlier-release API change unrelated to this PR and stale enough that
    it's more confusing than helpful at this point. Replace it with a short note on the
    single migration this PR introduces: callers of
    enable_instrumentation(enable_sensitive_data=True) should switch to
    enable_sensitive_telemetry(). Cross-link to the Disabling instrumentation section
    for the rare 'force on without enabling sensitive data' use case where
    enable_instrumentation() still applies.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: Document that W3C trace context injection does not apply to Foundry hosted/toolbox MCP tools (#5580)
    * docs: clarify MCP trace-context propagation scope for hosted/toolbox tools (#5547)
    
    Automatic W3C trace-context injection via params._meta applies only to
    MCP sessions opened by the agent process (MCPStreamableHTTPTool,
    MCPStdioTool, MCPWebsocketTool).  Hosted MCP tools
    (FoundryChatClient.get_mcp_tool) and toolbox-fetched tools
    (FoundryChatClient.get_toolbox) execute inside the Foundry agent service
    runtime; the framework never issues the tools/call for those and
    therefore cannot inject traceparent/tracestate.  The previous wording
    ("for all transports") implied coverage that does not exist.
    
    The updated section:
    - removes the inaccurate "for all transports" claim
    - adds a Scope paragraph naming the three client-opened transports that
      are covered
    - explicitly states that propagation across the agent-to-toolbox-to-MCP
      boundary is the responsibility of the Foundry service runtime
    - documents the workaround (use MCPStreamableHTTPTool directly) for
      users who need end-to-end distributed tracing today
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * docs: broaden MCP _meta scope note to cover all provider-managed transports (#5547)
    
    - List OpenAIChatClient.get_mcp_tool() and AnthropicClient.get_mcp_tool()
      alongside FoundryChatClient.get_mcp_tool() as hosted/provider-managed
      exceptions; restricting the carve-out to Foundry was misleading for
      readers using other providers
    - Fix get_toolbox() wording: use 'await client.get_toolbox(...)' and note
      that toolbox.tools is passed into Agent(tools=...) so it reads as an
      async instance method call, not a static/class method call
    - Add parenthetical '(or any other client-opened MCPTool subclass)' to
      future-proof the list of covered transports
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * docs: add GeminiChatClient to MCP scope note and add learn-site observability doc (#5547)
    
    - Add GeminiChatClient.get_mcp_tool(...) to the hosted/provider-managed
      list in the MCP trace propagation scope note; Gemini's get_mcp_tool()
      returns a types.Tool with an McpServer entry executed by the Gemini
      service runtime, so it belongs alongside FoundryChatClient,
      OpenAIChatClient, and AnthropicClient in that list.
    - Create docs/features/observability/README.md as the learn-site
      documentation surface for observability, covering telemetry setup and
      MCP trace propagation with the same scope note (including
      GeminiChatClient) so that both doc surfaces are consistent.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Remove unneeded observability docs README
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <copilot@github.com>
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: Foundry hosted agent V2 (#5379)
    * Python: Wrapper + Samples 1st (#5177)
    
    * Experiment
    
    * Update dependency and add non streaming
    
    * Add more samples
    
    * Rename samples
    
    * Add invocations
    
    * Comments 1
    
    * Comments 2
    
    * Comments 3
    
    * Improve README
    
    * Add local shell sample
    
    * WIP: Add eval and memory samples
    
    * Update user agent prefix
    
    * Update user agent prefix doc
    
    * Update dependency (#5215)
    
    * Add tests and more content types (#5235)
    
    * Add tests
    
    * fix tests and sample
    
    * Fix formatting
    
    * Remove function approval contents
    
    * Python: Refine samples and upgrade packages (#5261)
    
    * Refine samples and upgrade pacakges
    
    * Upgrade to a new package that fixes a bug
    
    * Update model env var
    
    * Move samples (#5281)
    
    * Python: Upgrade agentserver packages (#5284)
    
    * Upgrade agentserver packages
    
    * Fix new types
    
    * Python: Add special handling for workflows (#5298)
    
    * Add special handling for workflows
    
    * Address comments
    
    * Improve samples (#5372)
    
    * Python: Add more types (#5378)
    
    * Add more type supports
    
    * Upgrade packages
    
    * Remove TODOs in README
    
    * Fix README
    
    * Comments and mypy
    
    * User agent scoped
    
    * Fix README
    
    * Fix pre commit
    
    * Fix pre commit 2
    
    * Fix pre commit 3
    
    * Fix pre commit 4
    
    * Fix pre commit 5
    
    * Fix pre commit 6
    
    * Add azure-monitor-opentelemetry to dev deps
    
    Fixes Samples & Markdown CI failure. The PR's new transitive dep on
    azure-monitor-opentelemetry-exporter (via azure-ai-agentserver-core) makes
    pyright resolve the azure.monitor.opentelemetry namespace, flipping the
    check_md_code_blocks diagnostic for `configure_azure_monitor` from
    reportMissingImports (filtered) to reportAttributeAccessIssue (not filtered).
    Installing the umbrella azure-monitor-opentelemetry package in dev makes
    pyright resolve the symbol correctly, matching the install guidance the
    observability README already gives users.
    
    ---------
    
    Co-authored-by: Evan Mattson <evan.mattson@microsoft.com>
  • Python: [BREAKING] Python: Provider-leading client design & OpenAI package extraction (#4818)
    * Python: Provider-leading client design & OpenAI package extraction
    
    Major refactoring of the Python Agent Framework client architecture:
    
    - Extract OpenAI clients into new `agent-framework-openai` package
    - Core package no longer depends on openai, azure-identity, azure-ai-projects
    - Rename clients for discoverability: OpenAIResponsesClient → OpenAIChatClient,
      OpenAIChatClient → OpenAIChatCompletionClient
    - Unify `model_id`/`deployment_name`/`model_deployment_name` → `model` param
    - New FoundryChatClient for Azure AI Foundry Responses API
    - New FoundryAgent/FoundryAgentClient for connecting to pre-configured Foundry agents
    - Remove OpenAIBase/OpenAIConfigMixin from non-deprecated client MRO
    - Deprecate AzureOpenAI* clients, AzureAIClient, OpenAIAssistantsClient
    - Reorganize samples: azure_openai+azure_ai+azure_ai_agent → azure/
    - ADR-0020: Provider-Leading Client Design
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: missing Agent imports in samples, .model_id → .model in foundry_local sample
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: CI failures — mypy errors, coverage targets, sample imports
    
    - azure-ai mypy: add type ignores for TypedDict total=, model arg, forward ref
    - Coverage: replace core.azure/openai targets with openai package target
    - project_provider: add type annotation for opts dict
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: populate openai .pyi stub, fix broken README links, coverage targets
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fixes
    
    * updated observabilitty
    
    * reset azure init.pyi
    
    * fix errors
    
    * updated adr number
    
    * fix foundry local
    
    * fixed not renamed docstrings and comments, and added deprecated markers to old classes
    
    * fix tests and pyprojects
    
    * fix test vars
    
    * updated function tests
    
    * update durable
    
    * updated test setup for functions
    
    * Fix Foundry auth in workflow samples
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Stabilize Python integration workflows
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Update hosting samples for Foundry
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Trigger full CI rerun
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Trigger CI rerun again
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * trigger rerun
    
    * trigger rerun
    
    * fix for litellm
    
    * undo durabletask changes
    
    * Move Foundry APIs into foundry namespace
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix Foundry pyproject formatting
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Split provider samples by Foundry surface
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Restore hosting sample requirements
    
    Also fix the Foundry Local sample link after the provider sample move.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * updated tests
    
    * udpated foundry integration tests
    
    * removed dist from azurefunctions tests
    
    * Use separate Foundry clients for concurrent agents
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix client setup in azfunc and durable
    
    * disabled two tests
    
    * updated setup for some function and durable tests
    
    * improved azure openai setup with new clients
    
    * ignore deprecated
    
    * fixes
    
    * skip 11
    
    * remove openai assistants int tests
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: docs(observability): add Comet Opik setup example (#3940)
    * docs(observability): add Comet Opik setup example
    
    * Update README.md
  • Python: improve .env handling and observability samples (#4032)
    * Python: improve .env precedence and observability samples
    
    - Switch load_settings to explicit precedence: overrides -> explicit .env -> environment -> defaults\n- Raise when env_file_path is provided but missing\n- Update settings docs and tests for new behavior\n- Refresh observability samples and README guidance for env loading options\n\nCloses #3864\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fixed some imports
    
    * Fix load_settings CI regressions
    
    Allow explicit env_file_path values that exist but are not regular files (for example /dev/null) by checking path existence before dotenv parsing, and restore a dict accumulator with typed return cast to satisfy mypy.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Avoid implicit dotenv in observability
    
    Only load dotenv in observability helpers when env_file_path is explicitly provided, and remove test os.devnull workarounds that are no longer necessary.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • Python: feature: Inject OpenTelemetry trace context into MCP requests and update… (#3780)
    * feat: Inject OpenTelemetry trace context into MCP requests and update documentation
    
    * Update python/samples/getting_started/observability/README.md
    
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    
    * Update python/packages/core/tests/core/test_mcp.py
    
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    
    * refactor: move opentelemetry import to module level
    
    OpenTelemetry is a hard dependency of agent-framework-core (per
    pyproject.toml), so the try/except ImportError guard was dead code.
    Move the import to the top of the file to fail fast on missing
    dependencies instead of silently hiding installation issues.
    
    ---------
    
    Co-authored-by: Pete Roden <Pete.Roden@microsoft.com>
    Co-authored-by: Mark Wallace <127216156+markwallace-microsoft@users.noreply.github.com>
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
  • Python: [BREAKING] Fix #3613 chat/agent message typing alignment (#3920)
    * Fix #3613 message typing across chat and agents
    
    * Address #3613 review feedback and sample input style
    
    * refactor: use shared AgentRunMessages aliases (#3613)
    
    * refactor: rename agent run input aliases for #3613
    
    * samples: inline image content in run calls
    
    * core: export AgentRunInputs from package init
    
    * core: use explicit init re-exports without __all__
    
    * updated logging and inits
    
    * Fix core mypy export and samples XML note
    
    * Remove AgentRunInputsOrNone and dedupe loggers
    
    * Remove prepare_messages helper
    
    * fix integration tests
  • Python: restructure: Python samples into progressive 01-05 layout (#3862)
    * restructure: Python samples into progressive 01-05 layout
    
    - 01-get-started/: 6 numbered steps (hello agent → hosting)
    - 02-agents/: all agent concept samples (tools, middleware, providers, etc.)
    - 03-workflows/: ALL existing workflow samples preserved as-is
    - 04-hosting/: azure-functions, durabletask, a2a
    - 05-end-to-end/: demos, evaluation, hosted agents
    - Old files moved to _to_delete/ for review
    - Added AGENTS.md with structure documentation
    - autogen-migration/ and semantic-kernel-migration/ preserved at root
    
    * fix: switch to AzureOpenAI Foundry, fix CI failures
    
    - Switch all 01-get-started samples to AzureOpenAIResponsesClient with
      Azure AI Foundry project endpoint (AZURE_AI_PROJECT_ENDPOINT +
      AZURE_OPENAI_RESPONSES_DEPLOYMENT_NAME + AzureCliCredential)
    - Add _to_delete/ and 05-end-to-end/ to pyrightconfig.samples.json excludes
    - Fix test paths in packages/ that referenced old getting_started/ dirs:
      durabletask conftest + streaming test, azurefunctions conftest,
      devui conftest + capture_messages + openai_sdk_integration
    - Fix workflow_as_agent_human_in_the_loop.py import (sibling import)
    - Update hosting READMEs and tool comment paths
    - Replace root README.md with new structure overview
    - Update AGENTS.md to document Azure OpenAI Foundry as default provider
    
    * cleanup: remove _to_delete folder, copy resource files to active dirs
    
    All files in _to_delete/ were either:
    - Exact duplicates of files in the new structure (240 files)
    - Same file with only comment path updates (100 files)
    - One import-fix diff (workflow_as_agent_human_in_the_loop.py)
    - One superseded minimal_sample.py
    
    Resource files (sample.pdf, countries.json, employees.pdf, weather.json)
    copied to 02-agents/sample_assets/ and 02-agents/resources/ since active
    samples reference them.
    
    * fix: address PR review comments, centralize resources, remove root duplicates
    
    - Fix type annotation in 04_memory.py (string union -> proper types)
    - Fix old sample paths in observability files
    - Fix grammar/spelling in observability samples
    - Move sample_assets/ and resources/ to shared/ folder
    - Remove 8 duplicate observability files from 02-agents root
    - Update resource path references in multimodal_input and provider samples
    
    * fix: update broken links from old getting_started paths to new structure
    
    - Update relative paths in READMEs: getting_started/ → 01-get-started/,
      02-agents/, 03-workflows/, 04-hosting/, 05-end-to-end/
    - Fix absolute GitHub URLs in package READMEs
    - Fix broken link in ollama package README
    
    * fix: convert absolute GitHub URLs to relative paths for link checker
    
    Absolute URLs to python/samples/ on main branch 404 until PR merges.
    Converted to relative paths that linkspector can verify locally.
    
    * fix: update link for handoff sample moved to orchestrations/
    
    * fix: update chatkit-integration README path from demos/ to 05-end-to-end/
    
    * fix: update broken links in orchestrations README to match flat directory structure