Python Samples
This directory contains samples demonstrating the capabilities of Microsoft Agent Framework for Python.
Structure
| Folder | Description |
|---|---|
| 01-get-started/ | Progressive tutorial: hello agent → hosting |
| 02-agents/ | Deep-dive by concept: tools, middleware, providers, orchestrations |
| 03-workflows/ | Workflow patterns: sequential, concurrent, state, declarative, explicit output designation |
| 04-hosting/ | Deployment: Azure Functions, Durable Tasks, A2A |
| 05-end-to-end/ | Full applications, evaluation, demos |
Getting Started
Start with 01-get-started/ and work through the numbered files:
- 01_hello_agent.py — Create and run your first agent
- 02_add_tools.py — Add function tools with @tool
- 03_multi_turn.py — Multi-turn conversations with AgentSession
- 04_memory.py — Agent memory with ContextProvider
- 05_functional_workflow_with_agents.py — Call agents inside a functional workflow
- 06_functional_workflow_basics.py — Write a workflow as a plain async function
- 07_first_graph_workflow.py — Build a workflow with executors and edges
- 08_host_your_agent.py — Host your agent via Azure Functions
Prerequisites
pip install agent-framework
Environment Variables
Samples call load_dotenv() to automatically load environment variables from a .env file in the python/ directory. This is a convenience for local development and testing.
For local development, set up your environment using any of these methods:
Option 1: Using a .env file (recommended for local development):
- Copy .env.example to .env in the python/ directory: cp .env.example .env
- Edit .env and set your values (API keys, endpoints, etc.)
Option 2: Export environment variables directly:
export FOUNDRY_PROJECT_ENDPOINT="your-foundry-project-endpoint"
export FOUNDRY_MODEL="gpt-4o"
Option 3: Using env_file_path parameter (for per-client configuration):
All client classes (e.g., OpenAIChatClient, OpenAIChatCompletionClient) support an env_file_path parameter to load environment variables from a specific file:
from agent_framework.openai import OpenAIChatClient
# Load from a custom .env file
client = OpenAIChatClient(env_file_path="path/to/custom.env")
This allows different clients to use different configuration files if needed.
For the generic OpenAI clients (OpenAIChatClient and OpenAIChatCompletionClient), routing
precedence is:
- Explicit Azure inputs such as credential, azure_endpoint, or api_version
- OPENAI_API_KEY / explicit OpenAI API-key parameters
- Azure environment fallback such as AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_API_KEY
If you keep both OpenAI and Azure variables in your shell, the generic clients stay on OpenAI until you pass an explicit Azure input.
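The precedence above can be pictured as a small decision function. The following is a minimal sketch for illustration only, not the clients' actual implementation: `resolve_backend` and its parameters are hypothetical names, the Azure fallback check is simplified to the two variables named in the list, and raising `ValueError` when nothing is configured is an assumption.

```python
from __future__ import annotations

from collections.abc import Mapping


def resolve_backend(
    env: Mapping[str, str],
    *,
    credential: object | None = None,
    azure_endpoint: str | None = None,
    api_version: str | None = None,
    api_key: str | None = None,
) -> str:
    """Return "azure" or "openai" following the documented precedence (illustrative only)."""
    # 1. Any explicit Azure input wins.
    if credential is not None or azure_endpoint or api_version:
        return "azure"
    # 2. An OpenAI key, passed explicitly or found in the environment.
    if api_key or env.get("OPENAI_API_KEY"):
        return "openai"
    # 3. Fall back to Azure only when Azure variables are present (simplified check).
    if env.get("AZURE_OPENAI_ENDPOINT") or env.get("AZURE_OPENAI_API_KEY"):
        return "azure"
    # Assumption: with no configuration at all, fail loudly.
    raise ValueError("no OpenAI or Azure OpenAI configuration found")


shell_env = {
    "OPENAI_API_KEY": "sk-proj-...",
    "AZURE_OPENAI_ENDPOINT": "https://my-resource.openai.azure.com/",
    "AZURE_OPENAI_API_KEY": "sk-azure-...",
}
print(resolve_backend(shell_env))  # openai
print(resolve_backend(shell_env, azure_endpoint=shell_env["AZURE_OPENAI_ENDPOINT"]))  # azure
```

With both sets of variables present, the sketch stays on OpenAI until an explicit Azure input is passed, which mirrors the behaviour described in this section.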
For the getting-started samples, you'll need at minimum:
FOUNDRY_PROJECT_ENDPOINT="your-foundry-project-endpoint"
FOUNDRY_MODEL="gpt-4o"
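Before running a sample, it can help to confirm that both variables are actually set. The check below is a minimal sketch using only the standard library; `missing_env_vars` is a hypothetical helper written for this README and is not part of Agent Framework.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

# Names taken from the getting-started requirements above.
REQUIRED_VARS = ("FOUNDRY_PROJECT_ENDPOINT", "FOUNDRY_MODEL")


def missing_env_vars(
    names: tuple[str, ...] = REQUIRED_VARS,
    env: Mapping[str, str] | None = None,
) -> list[str]:
    """Return the names that are unset or blank in the given environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]


# Example with an explicit mapping instead of the real process environment.
example_env = {"FOUNDRY_PROJECT_ENDPOINT": "your-foundry-project-endpoint"}
print(missing_env_vars(env=example_env))  # ['FOUNDRY_MODEL']
```

Calling `missing_env_vars()` with no arguments checks the real process environment, so it can be run after `load_dotenv()` has had a chance to populate it.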
Consolidated sample env inventory
This is the single source of truth for package-level environment variables read by packages included by
agent-framework-core[all]. It intentionally excludes variables that are only read by standalone samples,
package sample folders, or tests. When package code adds, removes, or renames an environment variable,
update this table in the same change.
Example values below are illustrative. For entries not backed by a single public class, the class
column names the closest public surface, helper, or package-level initialization point that reads the
variable.
| package | class/module | env var | example value |
|---|---|---|---|
| agent-framework-anthropic | AnthropicClient | ANTHROPIC_API_KEY | sk-ant-api03-... |
| agent-framework-anthropic | AnthropicClient | ANTHROPIC_CHAT_MODEL | claude-sonnet-4-5-20250929 |
| agent-framework-foundry | FoundryEmbeddingClient | FOUNDRY_MODELS_ENDPOINT | https://my-endpoint.inference.ai.azure.com |
| agent-framework-foundry | FoundryEmbeddingClient | FOUNDRY_MODELS_API_KEY | env-key |
| agent-framework-foundry | FoundryEmbeddingClient | FOUNDRY_EMBEDDING_MODEL | text-embedding-3-small |
| agent-framework-foundry | FoundryEmbeddingClient | FOUNDRY_IMAGE_EMBEDDING_MODEL | Cohere-embed-v3-english |
| agent-framework-azure-ai-search | AzureAISearchContextProvider | AZURE_SEARCH_ENDPOINT | https://my-search.search.windows.net |
| agent-framework-azure-ai-search | AzureAISearchContextProvider | AZURE_SEARCH_API_KEY | search-key |
| agent-framework-azure-ai-search | AzureAISearchContextProvider | AZURE_SEARCH_INDEX_NAME | hotels-index |
| agent-framework-azure-ai-search | AzureAISearchContextProvider | AZURE_SEARCH_KNOWLEDGE_BASE_NAME | hotels-kb |
| agent-framework-azure-cosmos | CosmosHistoryProvider | AZURE_COSMOS_ENDPOINT | https://my-cosmos.documents.azure.com:443/ |
| agent-framework-azure-cosmos | CosmosHistoryProvider | AZURE_COSMOS_DATABASE_NAME | agent-history |
| agent-framework-azure-cosmos | CosmosHistoryProvider | AZURE_COSMOS_CONTAINER_NAME | messages |
| agent-framework-azure-cosmos | CosmosHistoryProvider | AZURE_COSMOS_KEY | C2F...== |
| agent-framework-bedrock | BedrockChatClient | BEDROCK_REGION | us-east-1 |
| agent-framework-bedrock | BedrockChatClient | BEDROCK_CHAT_MODEL | anthropic.claude-3-5-sonnet-20241022-v2:0 |
| agent-framework-bedrock | BedrockEmbeddingClient | BEDROCK_REGION | us-east-1 |
| agent-framework-bedrock | BedrockEmbeddingClient | BEDROCK_EMBEDDING_MODEL | amazon.titan-embed-text-v2:0 |
| agent-framework-bedrock | BedrockChatClient / BedrockEmbeddingClient | AWS_ACCESS_KEY_ID | AKIAIOSFODNN7EXAMPLE |
| agent-framework-bedrock | BedrockChatClient / BedrockEmbeddingClient | AWS_SECRET_ACCESS_KEY | wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY |
| agent-framework-bedrock | BedrockChatClient / BedrockEmbeddingClient | AWS_SESSION_TOKEN | IQoJb3JpZ2luX2VjEO7//////////wEaCXVzLXdlc3QtMiJHMEUCIQD... |
| agent-framework-copilotstudio | CopilotStudioAgent | COPILOTSTUDIOAGENT__ENVIRONMENTID | 00000000-0000-0000-0000-000000000000 |
| agent-framework-copilotstudio | CopilotStudioAgent | COPILOTSTUDIOAGENT__SCHEMANAME | cr123_agentname |
| agent-framework-copilotstudio | CopilotStudioAgent | COPILOTSTUDIOAGENT__TENANTID | 11111111-1111-1111-1111-111111111111 |
| agent-framework-copilotstudio | CopilotStudioAgent | COPILOTSTUDIOAGENT__AGENTAPPID | 22222222-2222-2222-2222-222222222222 |
| agent-framework-core | observability | ENABLE_INSTRUMENTATION | true |
| agent-framework-core | observability | ENABLE_SENSITIVE_DATA | false |
| agent-framework-core | observability | ENABLE_CONSOLE_EXPORTERS | true |
| agent-framework-core | observability | OTEL_EXPORTER_OTLP_ENDPOINT | http://localhost:4317 |
| agent-framework-core | observability | OTEL_EXPORTER_OTLP_TRACES_ENDPOINT | http://localhost:4318/v1/traces |
| agent-framework-core | observability | OTEL_EXPORTER_OTLP_METRICS_ENDPOINT | http://localhost:4318/v1/metrics |
| agent-framework-core | observability | OTEL_EXPORTER_OTLP_LOGS_ENDPOINT | http://localhost:4318/v1/logs |
| agent-framework-core | observability | OTEL_EXPORTER_OTLP_PROTOCOL | grpc |
| agent-framework-core | observability | OTEL_EXPORTER_OTLP_HEADERS | api-key=demo |
| agent-framework-core | observability | OTEL_EXPORTER_OTLP_TRACES_HEADERS | api-key=trace-demo |
| agent-framework-core | observability | OTEL_EXPORTER_OTLP_METRICS_HEADERS | api-key=metric-demo |
| agent-framework-core | observability | OTEL_EXPORTER_OTLP_LOGS_HEADERS | api-key=log-demo |
| agent-framework-core | observability | OTEL_SERVICE_NAME | sample-agent |
| agent-framework-core | observability | OTEL_SERVICE_VERSION | 1.0.0 |
| agent-framework-core | observability | OTEL_RESOURCE_ATTRIBUTES | deployment.environment=dev,service.namespace=agent-framework |
| agent-framework-devui | DevUI server | DEVUI_AUTH_TOKEN | my-devui-token |
| agent-framework-foundry | FoundryChatClient | FOUNDRY_PROJECT_ENDPOINT | https://my-project.services.ai.azure.com/api/projects/my-project |
| agent-framework-foundry | FoundryChatClient | FOUNDRY_MODEL | gpt-4o |
| agent-framework-foundry | FoundryAgent | FOUNDRY_AGENT_NAME | travel-planner |
| agent-framework-foundry | FoundryAgent | FOUNDRY_AGENT_VERSION | v1 |
| agent-framework-github-copilot | GitHubCopilotAgent | GITHUB_COPILOT_CLI_PATH | copilot |
| agent-framework-github-copilot | GitHubCopilotAgent | GITHUB_COPILOT_MODEL | gpt-5 |
| agent-framework-github-copilot | GitHubCopilotAgent | GITHUB_COPILOT_TIMEOUT | 60 |
| agent-framework-github-copilot | GitHubCopilotAgent | GITHUB_COPILOT_LOG_LEVEL | info |
| agent-framework-mem0 | agent_framework_mem0 package import | MEM0_TELEMETRY | false |
| agent-framework-ollama | OllamaChatClient | OLLAMA_HOST | http://localhost:11434 |
| agent-framework-ollama | OllamaChatClient | OLLAMA_MODEL | llama3.1:8b |
| agent-framework-openai | OpenAIChatClient / OpenAIChatCompletionClient / OpenAIEmbeddingClient | OPENAI_API_KEY | sk-proj-... |
| agent-framework-openai | OpenAIChatClient / OpenAIChatCompletionClient / OpenAIEmbeddingClient | OPENAI_MODEL | gpt-4o-mini |
| agent-framework-openai | OpenAIChatClient | OPENAI_CHAT_MODEL | gpt-4.1-mini |
| agent-framework-openai | OpenAIChatCompletionClient | OPENAI_CHAT_COMPLETION_MODEL | gpt-4o |
| agent-framework-openai | OpenAIEmbeddingClient | OPENAI_EMBEDDING_MODEL | text-embedding-3-small |
| agent-framework-openai | OpenAIChatClient / OpenAIChatCompletionClient / OpenAIEmbeddingClient | OPENAI_BASE_URL | https://api.openai.com/v1/ |
| agent-framework-openai | OpenAIChatClient / OpenAIChatCompletionClient / OpenAIEmbeddingClient | OPENAI_ORG_ID | org_123456789 |
| agent-framework-openai | OpenAIChatClient / OpenAIChatCompletionClient / OpenAIEmbeddingClient | AZURE_OPENAI_ENDPOINT | https://my-resource.openai.azure.com/ |
| agent-framework-openai | OpenAIChatClient / OpenAIChatCompletionClient / OpenAIEmbeddingClient | AZURE_OPENAI_API_KEY | sk-azure-... |
| agent-framework-openai | OpenAIChatClient / OpenAIChatCompletionClient / OpenAIEmbeddingClient | AZURE_OPENAI_API_VERSION | 2024-10-21 |
| agent-framework-openai | OpenAIChatClient / OpenAIChatCompletionClient / OpenAIEmbeddingClient | AZURE_OPENAI_BASE_URL | https://my-resource.openai.azure.com/openai/v1/ |
| agent-framework-openai | OpenAIChatClient / OpenAIChatCompletionClient / OpenAIEmbeddingClient | AZURE_OPENAI_MODEL | gpt-4o |
| agent-framework-openai | OpenAIChatClient | AZURE_OPENAI_CHAT_MODEL | gpt-4.1 |
| agent-framework-openai | OpenAIChatCompletionClient | AZURE_OPENAI_CHAT_COMPLETION_MODEL | gpt-4o-mini |
| agent-framework-openai | OpenAIEmbeddingClient | AZURE_OPENAI_EMBEDDING_MODEL | text-embedding-3-large |
| agent-framework-openai | OpenAIChatClient / OpenAIChatCompletionClient / OpenAIEmbeddingClient | AZURE_OPENAI_RESOURCE_URL | https://cognitiveservices.azure.com/ |
agent-framework-openai supports the Azure OpenAI client-specific deployment aliases listed above; keep
packages/openai/README.md as the authoritative reference for the exact fallback order and package-specific
behavior.
Note for production: In production environments, set environment variables through your deployment platform (e.g., Azure App Settings, Kubernetes ConfigMaps/Secrets) rather than using .env files. The load_dotenv() call in samples will have no effect when a .env file is not present, allowing environment variables to be loaded from the system.
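The fallback described in this note can be illustrated with the standard library alone. The sketch below is a simplified stand-in and not the python-dotenv implementation: `load_env_file` is a hypothetical helper, it only understands plain KEY=VALUE lines, and it deliberately never overrides a variable that is already set, so values supplied by the deployment platform stay in effect.

```python
from __future__ import annotations

import os
import tempfile
from collections.abc import MutableMapping
from pathlib import Path


def load_env_file(path: str | Path, env: MutableMapping[str, str] | None = None) -> bool:
    """Copy KEY=VALUE lines from path into env; return False if the file is absent."""
    env = os.environ if env is None else env
    file = Path(path)
    if not file.is_file():
        return False  # nothing to load, existing variables stay in effect
    for raw in file.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        # setdefault keeps any value that was already present in env.
        env.setdefault(key.strip(), value.strip().strip("\"'"))
    return True


with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    platform_env = {"FOUNDRY_MODEL": "gpt-4o"}  # as if set by the deployment platform
    print(load_env_file(env_path, platform_env))  # False, no .env file yet
    env_path.write_text('FOUNDRY_MODEL="other-model"\n', encoding="utf-8")
    print(load_env_file(env_path, platform_env))  # True
    print(platform_env["FOUNDRY_MODEL"])  # gpt-4o, the platform value is kept
```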
For Azure authentication, run az login before running samples.
Note on XML tags
Some sample files include XML-style snippet tags (for example <snippet_name> and </snippet_name>). These are used by our documentation tooling and can be ignored or removed when you use the samples outside this repository.
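If you prefer to remove the tags when copying a sample, a short script can do it. This is a minimal sketch under one assumption that this README does not confirm: that each tag sits alone on a Python comment line such as `# <create_agent>`. `strip_snippet_tags` is a hypothetical helper, not part of the repository tooling.

```python
import re

# Matches a comment line holding only an opening or closing tag, e.g. "# <name>" or "# </name>".
_SNIPPET_TAG_LINE = re.compile(r"^\s*#\s*</?[A-Za-z_][\w-]*>\s*$")


def strip_snippet_tags(source: str) -> str:
    """Drop comment lines that contain nothing but a snippet tag."""
    kept = [
        line
        for line in source.splitlines(keepends=True)
        if not _SNIPPET_TAG_LINE.match(line)
    ]
    return "".join(kept)


sample = "# <create_agent>\nagent = build_agent()\n# </create_agent>\n"
print(strip_snippet_tags(sample), end="")  # agent = build_agent()
```

Lines that carry real code are left untouched, even when a trailing comment happens to contain angle brackets.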
Additional Resources
- Agent Framework Documentation
- AGENTS.md — Structure documentation for maintainers
- SAMPLE_GUIDELINES.md — Coding conventions for samples