mirror of https://github.com/microsoft/agent-framework.git synced 2026-06-16 21:04:09 +08:00

Eduard van Valkenburg 390f93344c Python: Add samples syntax checking with pyright (#3710)

* Add samples syntax checking with pyright

- Add pyrightconfig.samples.json with relaxed type checking but import validation
- Add samples-syntax poe task to check samples for syntax and import errors
- Add samples-syntax to check and pre-commit-check tasks
- Fix 78 sample errors:
  - Update workflow builder imports to use agent_framework_orchestrations
  - Change content type isinstance checks to content.type comparisons
  - Use Content factory methods instead of removed content type classes
  - Fix TypedDict access patterns for Annotation
  - Fix various API mismatches (normalize_messages, ChatMessage.text, role)

* fixed a bunch of samples and tweaks to pre-commit

* updated lock

* updated lock

* fixes

* added lint to samples

390f93344c · 2026-02-07 07:10:47 +00:00

_tools.py · Python: Add samples syntax checking with pyright (#3710) · 2026-02-07 07:10:47 +00:00
.env.example · Python: Use Foundry evaluators to evaluate agent workflows (#2322) · 2025-11-21 09:51:44 +00:00
create_workflow.py · Python: Add samples syntax checking with pyright (#3710) · 2026-02-07 07:10:47 +00:00
README.md · Python: Use Foundry evaluators to evaluate agent workflows (#2322) · 2025-11-21 09:51:44 +00:00
run_evaluation.py · Python: Add samples syntax checking with pyright (#3710) · 2026-02-07 07:10:47 +00:00

README.md

Multi-Agent Travel Planning Workflow Evaluation

This sample demonstrates how to evaluate a multi-agent workflow using Azure AI's built-in evaluators. The workflow processes travel planning requests through seven specialized agents in a fan-out/fan-in pattern: a travel request handler, three search agents (hotel, flight, and activity), a booking aggregator, a booking confirmation agent, and a payment processing agent.
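The fan-out/fan-in shape described above can be illustrated with a minimal sketch in plain `asyncio`. This is not the sample's implementation: `search_agent` and `plan_trip` are hypothetical stand-ins, and the real workflow is built with the agent framework in `create_workflow.py`. The sketch also assumes that the three search agents are the branches that fan out.

```python
import asyncio


# Hypothetical stand-in for one of the sample's search agents.
# A real agent would call a model or a tool here.
async def search_agent(kind: str, request: str) -> dict:
    await asyncio.sleep(0)  # placeholder for real asynchronous work
    return {"agent": f"{kind}-search", "result": f"{kind} options for {request}"}


async def plan_trip(request: str) -> dict:
    # Fan-out: the three search branches run concurrently on the same request.
    results = await asyncio.gather(
        search_agent("hotel", request),
        search_agent("flight", request),
        search_agent("activity", request),
    )
    # Fan-in: one aggregation step collects the output of every branch.
    return {r["agent"]: r["result"] for r in results}


if __name__ == "__main__":
    print(asyncio.run(plan_trip("3 nights in Lisbon")))
```

The aggregation step only runs once every branch has returned, which is the property the booking aggregator relies on in a fan-in design.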

Evaluation Metrics

The evaluation uses four Azure AI built-in evaluators:

- Relevance: How well responses address the user query
- Groundedness: Whether responses are grounded in available context
- Tool Call Accuracy: Correct tool selection and parameter usage
- Tool Output Utilization: Effective use of tool outputs in responses

Setup

Create a .env file in this folder, using the .env.example file as a template for the required configuration.
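For readers unfamiliar with the .env format, the following is a minimal sketch of how `KEY=value` lines can be read into the process environment using only the standard library. It is an illustration, not how the sample itself loads configuration, and `parse_env_text` and `load_env_file` are hypothetical helper names. The required variable names are the ones listed in .env.example; the keys used in the usage note are placeholders.

```python
import os


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of optional quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Read a .env file and export its values without overriding existing ones."""
    with open(path, encoding="utf-8") as fh:
        values = parse_env_text(fh.read())
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values
```

For example, `parse_env_text('EXAMPLE_KEY="demo"')` yields a dictionary with the single placeholder key `EXAMPLE_KEY`. Using `setdefault` means values already exported in the shell take precedence over the file.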

Running the Evaluation

Execute the complete workflow and evaluation:

python run_evaluation.py

The script will:

1. Execute the multi-agent travel planning workflow
2. Display a response summary for each agent
3. Create and run an evaluation on the hotel, flight, and activity search agents
4. Monitor progress and display the evaluation report URL
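The progress-monitoring step is typically a polling loop. The sketch below shows that pattern in isolation, under stated assumptions: `wait_for_run`, the `get_status` callable, and the status names in `TERMINAL_STATES` are hypothetical and do not come from `run_evaluation.py` or from the Azure AI SDK.

```python
import time
from typing import Callable

# Assumed status names; the real service may report different values.
TERMINAL_STATES = {"completed", "failed", "canceled"}


def wait_for_run(
    get_status: Callable[[], str],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status until it reports a terminal state or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still '{status}' after {timeout_s} seconds")
        sleep(interval_s)
```

In a real script, `get_status` would wrap whatever call retrieves the evaluation run's current state, and the report URL would be printed once a terminal state is returned.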