Multi-Agent Travel Planning Workflow Evaluation

This sample demonstrates how to evaluate a multi-agent workflow with Azure AI's built-in evaluators. The workflow processes travel planning requests through seven specialized agents arranged in a fan-out/fan-in pattern: a travel request handler; hotel, flight, and activity search agents; a booking aggregator; a booking confirmation agent; and a payment processing agent.
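The fan-out/fan-in shape described above can be illustrated with a minimal, framework-free sketch. It uses plain asyncio rather than the workflow API this sample is built on, and the names search_agent and plan_trip are hypothetical stand-ins, not functions from the sample:

```python
import asyncio


async def search_agent(name: str, request: str) -> str:
    # Hypothetical stand-in for a search agent; a real agent would call a model and tools.
    await asyncio.sleep(0)
    return f"{name} results for: {request}"


async def plan_trip(request: str) -> str:
    # Fan-out: the three search branches run concurrently on the same request.
    results = await asyncio.gather(
        search_agent("hotel", request),
        search_agent("flight", request),
        search_agent("activity", request),
    )
    # Fan-in: a single aggregation step combines the branch outputs, in branch order.
    return " | ".join(results)


summary = asyncio.run(plan_trip("3 days in Lisbon"))
print(summary)
```

asyncio.gather returns results in the order the branches were passed, so the aggregation step sees hotel, flight, and activity output in a stable order regardless of which branch finishes first.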

Evaluation Metrics

The evaluation uses four Azure AI built-in evaluators:

  • Relevance - How well responses address the user query
  • Groundedness - Whether responses are grounded in available context
  • Tool Call Accuracy - Correct tool selection and parameter usage
  • Tool Output Utilization - Effective use of tool outputs in responses

Setup

Create a .env file in this folder, using the .env.example file as a template for the required configuration.
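Before running the script, it can help to confirm that every required setting has a value. The sketch below is one way to do that, assuming the settings have already been loaded into the process environment. The helper missing_settings and the EXAMPLE_* names are placeholders for illustration only; the real variable names are the ones listed in .env.example.

```python
import os
from typing import Mapping, Sequence


def missing_settings(
    required: Sequence[str], env: Mapping[str, str] = os.environ
) -> list[str]:
    # Report every required name that is unset or blank, preserving the given order.
    return [name for name in required if not env.get(name, "").strip()]


# Placeholder names and values; substitute the keys from .env.example.
example_env = {"EXAMPLE_ENDPOINT": "https://example.invalid", "EXAMPLE_MODEL": ""}
missing = missing_settings(
    ["EXAMPLE_ENDPOINT", "EXAMPLE_MODEL", "EXAMPLE_KEY"], example_env
)
print(missing)
```

Passing the environment as a parameter keeps the check easy to test without touching the real process environment.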

Running the Evaluation

Execute the complete workflow and evaluation:

python run_evaluation.py

The script will:

  1. Execute the multi-agent travel planning workflow
  2. Display a response summary for each agent
  3. Create and run an evaluation of the hotel, flight, and activity search agents
  4. Monitor progress and display the evaluation report URL