update link (#1051)
committed by GitHub
parent 5f1417ab94
commit 08b602c290
@@ -44,10 +44,12 @@ This structure maintains a single PyPI package `agent-framework-lab` while suppo
 
 ## Installation
 
-Install the base lab package:
+Install from source:
 
 ```bash
-pip install agent-framework-lab
+git clone https://github.com/microsoft/agent-framework.git
+cd agent-framework/python/packages/lab
+pip install -e .
 ```
 
 For details on installing individual modules, see their respective README files listed above.
@@ -7,10 +7,12 @@ It includes built-in benchmarks as well as utilities for running custom evaluati
 
 ## Setup
 
-Install the agent-framework-lab package with GAIA dependencies:
+Install from source with GAIA dependencies:
 
 ```bash
-pip install "agent-framework-lab[gaia]"
+git clone https://github.com/microsoft/agent-framework.git
+cd agent-framework/python/packages/lab
+pip install -e ".[gaia]"
 ```
 
 Set up Hugging Face token:
@@ -8,20 +8,22 @@ This package enables you to train and fine-tune agents using advanced RL algorit
 
 ## Installation
 
-Install the agent-framework-lab package with Lightning dependencies:
+Install from source with Lightning dependencies:
 
 ```bash
-pip install "agent-framework-lab[lightning]"
+git clone https://github.com/microsoft/agent-framework.git
+cd agent-framework/python/packages/lab
+pip install -e ".[lightning]"
 ```
 
 ### Optional Dependencies
 
 ```bash
 # For math-related training
-pip install agent-framework-lab[lightning,math]
+pip install -e ".[lightning,math]"
 
 # For tau2 benchmarking
-pip install agent-framework-lab[lightning,tau2]
+pip install -e ".[lightning,tau2]"
 ```
 
 To prepare for RL training, you'll also need to install dependencies like PyTorch, Ray, and vLLM. See the [Agent-lightning setup instructions](https://github.com/microsoft/agent-lightning) for more details.
@@ -13,20 +13,22 @@ Each evaluation runs a multi-turn conversation where the user simulator presents
 
 ## Supported Domains
 
-| Domain | Status | Description |
-|--------|--------|-------------|
-| **airline** | ✅ Supported | Customer service for airline booking, changes, and support |
-| **retail** | 🚧 In Development | E-commerce customer support scenarios |
-| **telecom** | 🚧 In Development | Telecommunications service support |
+| Domain      | Status            | Description                                                |
+| ----------- | ----------------- | ---------------------------------------------------------- |
+| **airline** | ✅ Supported      | Customer service for airline booking, changes, and support |
+| **retail**  | 🚧 In Development | E-commerce customer support scenarios                      |
+| **telecom** | 🚧 In Development | Telecommunications service support                         |
 
-*Note: Currently only the airline domain is fully supported.*
+_Note: Currently only the airline domain is fully supported._
 
 ## Installation
 
-Install the agent-framework-lab package with TAU2 dependencies:
+Install from source with TAU2 dependencies:
 
 ```bash
-pip install "agent-framework-lab[tau2]"
+git clone https://github.com/microsoft/agent-framework.git
+cd agent-framework/python/packages/lab
+pip install -e ".[tau2]"
 ```
 
 Download data from [Tau2-Bench](https://github.com/sierra-research/tau2-bench):
@@ -104,15 +106,15 @@ python samples/run_benchmark.py --max-steps 20
 
 The following results are reproduced from our implementation of τ²-bench with `samples/run_benchmark.py`. It shows the average success rate over the dataset of 50 tasks.
 
-| Agent Model | User Model | Success Rate |
-|-------------|------------|----------|
-| gpt-5 | gpt-4.1 | 62.0% |
-| gpt-5-mini | gpt-4.1 | 52.0% |
-| gpt-4.1 | gpt-4.1 | 60.0% |
-| gpt-4.1-mini | gpt-4.1 | 50.0% |
-| gpt-4.1 | gpt-4o-mini | 42.0% |
-| gpt-4o | gpt-4.1 | 42.0% |
-| gpt-4o-mini | gpt-4.1 | 26.0% |
+| Agent Model  | User Model  | Success Rate |
+| ------------ | ----------- | ------------ |
+| gpt-5        | gpt-4.1     | 62.0%        |
+| gpt-5-mini   | gpt-4.1     | 52.0%        |
+| gpt-4.1      | gpt-4.1     | 60.0%        |
+| gpt-4.1-mini | gpt-4.1     | 50.0%        |
+| gpt-4.1      | gpt-4o-mini | 42.0%        |
+| gpt-4o       | gpt-4.1     | 42.0%        |
+| gpt-4o-mini  | gpt-4.1     | 26.0%        |
 
 ## Advanced Usage
 