mirror of
https://github.com/microsoft/agent-framework.git
Agent Framework Lab - GAIA
The GAIA benchmark can be used for evaluating agents and workflows built using the Agent Framework. It includes built-in benchmarks as well as utilities for running custom evaluations.
Note: This module is part of the consolidated agent-framework-lab package. Install the package with the gaia extra to use this module.
Setup
Install the agent-framework-lab package with GAIA dependencies:
pip install "agent-framework-lab[gaia]"
Set up your Hugging Face token:
export HF_TOKEN="hf_..." # must have access to gaia-benchmark/GAIA
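Since the benchmark data is gated, it can help to fail fast when the token is missing. The following is a minimal sketch, not part of the package: the helper name require_hf_token is hypothetical, and it only checks that the HF_TOKEN environment variable is non-empty (it does not verify access to gaia-benchmark/GAIA).

```python
import os


def require_hf_token(env=None):
    """Return the Hugging Face token from the environment or raise a clear error.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    This is a hypothetical helper, not an agent-framework-lab API.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set; export it before running the GAIA benchmark."
        )
    return token
```

Calling require_hf_token() at the top of an evaluation script surfaces a missing token before any tasks are started.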
Create an evaluation script
Create a Python script (e.g., run_gaia.py) with the following content:
import asyncio

from agent_framework.lab.gaia import GAIA, Task, Prediction, GAIATelemetryConfig

async def run_task(task: Task) -> Prediction:
    # Placeholder: call your agent or workflow here and return its answer.
    return Prediction(prediction="answer here", messages=[])

async def main() -> None:
    # Optional: Enable telemetry for detailed tracing
    telemetry_config = GAIATelemetryConfig(
        enable_tracing=True,
        trace_to_file=True,
        file_path="gaia_traces.jsonl",
    )
    runner = GAIA(telemetry_config=telemetry_config)
    await runner.run(run_task, level=1, max_n=5, parallel=2)

if __name__ == "__main__":
    asyncio.run(main())
See gaia_sample.py for more detail.
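Because run_task is an ordinary async callable, it can be wrapped with standard asyncio tools before being handed to the runner. As an illustration only, the sketch below bounds how long a single call may take using asyncio.wait_for. The helper name with_timeout and its parameters are hypothetical, and nothing here reflects how the GAIA runner itself handles slow tasks.

```python
import asyncio


async def with_timeout(coro_fn, arg, seconds, fallback):
    """Await coro_fn(arg); return `fallback` if it does not finish in `seconds`.

    Hypothetical helper: `coro_fn` is any async callable taking one argument,
    for example a task function like run_task.
    """
    try:
        return await asyncio.wait_for(coro_fn(arg), timeout=seconds)
    except asyncio.TimeoutError:
        # The wrapped call was cancelled by wait_for; report the fallback.
        return fallback
```

Inside a task function, the fallback would typically be a prediction object that marks the task as unanswered, so one stalled task does not hold up the rest of the run.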
View results
We provide a console viewer for reading GAIA results:
uv run gaia_viewer "gaia_results_<timestamp>.jsonl" --detailed
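For custom analysis outside the console viewer, the results file can also be read directly. The sketch below assumes only what the .jsonl extension suggests, namely one JSON object per line. The helper name load_jsonl is hypothetical, and no field names are assumed because the record schema is not described here.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Parse a JSON Lines file into a list of objects, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records
```

Printing the keys of the first loaded record is a quick way to discover which fields the results actually contain.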