mirror of https://github.com/microsoft/agent-framework.git synced 2026-06-16 21:04:09 +08:00


Agent Framework Lab - GAIA

This module lets you evaluate agents and workflows built with the Agent Framework against the GAIA benchmark. It includes built-in benchmarks as well as utilities for running custom evaluations.

Note: This module is part of the consolidated agent-framework-lab package. Install the package with the gaia extra to use this module.

Setup

Install the agent-framework-lab package with GAIA dependencies:

pip install "agent-framework-lab[gaia]"

Set your Hugging Face token:

export HF_TOKEN="hf_..." # must have access to gaia-benchmark/GAIA
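A missing token is easier to diagnose if the script checks for it before the benchmark starts. The following is a minimal sketch using only the standard library; `require_hf_token` is a hypothetical helper name and is not part of agent-framework-lab.

```python
import os
from typing import Mapping, Optional


def require_hf_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Hugging Face token from the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of agent-framework-lab.
    """
    # Default to the real process environment; a mapping can be passed for testing.
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set; export a token that has access to gaia-benchmark/GAIA."
        )
    return token
```

Calling `require_hf_token()` at the top of `main()` fails fast with a readable message. It only checks that the variable is set, not that the token actually has access to the dataset.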

Create an evaluation script

Create a Python script (e.g., run_gaia.py) with the following content:

import asyncio

from agent_framework.lab.gaia import GAIA, GAIATelemetryConfig, Prediction, Task

async def run_task(task: Task) -> Prediction:
    # Placeholder: replace with a call to your agent or workflow.
    return Prediction(prediction="answer here", messages=[])

async def main() -> None:
    # Optional: enable telemetry for detailed tracing
    telemetry_config = GAIATelemetryConfig(
        enable_tracing=True,
        trace_to_file=True,
        file_path="gaia_traces.jsonl",
    )

    runner = GAIA(telemetry_config=telemetry_config)
    await runner.run(run_task, level=1, max_n=5, parallel=2)

if __name__ == "__main__":
    # Start the event loop and run the benchmark.
    asyncio.run(main())

See gaia_sample.py for more details.
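In practice, `run_task` usually wraps a call to an agent that may hang or raise. The sketch below shows one way to add a timeout and a fallback answer. It makes several assumptions: the `Task` and `Prediction` dataclasses are local stand-ins so the example runs without the package installed, the `question` field name is a guess that this README does not confirm, and `make_run_task` and `answer_fn` are hypothetical names. Whether `GAIA.run` already handles exceptions raised by `run_task` is not stated here, so treat the fallback as a design choice rather than a requirement.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable, List


# Stand-ins so the sketch is self-contained. In a real script, import
# Task and Prediction from agent_framework.lab.gaia instead.
@dataclass
class Task:
    question: str  # assumed field name, not confirmed by this README


@dataclass
class Prediction:
    prediction: str
    messages: List[Any] = field(default_factory=list)


def make_run_task(
    answer_fn: Callable[[str], Awaitable[str]],
    timeout_s: float = 60.0,
) -> Callable[[Task], Awaitable[Prediction]]:
    """Build a run_task coroutine that guards answer_fn with a timeout."""

    async def run_task(task: Task) -> Prediction:
        try:
            answer = await asyncio.wait_for(answer_fn(task.question), timeout=timeout_s)
        except Exception:
            # Covers timeouts (asyncio.TimeoutError) and errors raised by answer_fn.
            # An empty prediction lets the rest of the run continue.
            return Prediction(prediction="", messages=[])
        return Prediction(prediction=answer.strip(), messages=[])

    return run_task
```

With the real package, the returned coroutine function would be passed as the first argument to `runner.run(...)` in place of the placeholder `run_task`.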

View results

We provide a console viewer for reading GAIA results:

uv run gaia_viewer "gaia_results_<timestamp>.jsonl" --detailed
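The results file can also be inspected from Python. The sketch below assumes only what the `.jsonl` extension suggests, namely one JSON value per line. The record schema is not documented here, so the hypothetical helper `summarize_jsonl` just counts records and tallies top-level keys instead of reading specific fields.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Tuple


def summarize_jsonl(path: str) -> Tuple[int, Counter]:
    """Count records and tally top-level keys in a JSON Lines file.

    Hypothetical helper for illustration; it assumes one JSON value per line
    and makes no assumption about which fields the records contain.
    """
    count = 0
    keys: Counter = Counter()
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            count += 1
            if isinstance(record, dict):
                keys.update(record.keys())
    return count, keys
```

Printing the returned key counts is a quick way to discover which fields the results contain before writing any analysis against them.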