update link (#1051)

2026-06-16 21:04:09 +08:00 · 2025-10-01 04:18:11 -07:00
parent 5f1417ab94
commit 08b602c290
4 changed files with 33 additions and 25 deletions
@@ -44,10 +44,12 @@ This structure maintains a single PyPI package `agent-framework-lab` while suppo

 ## Installation

-Install the base lab package:
+Install from source:

 ```bash
-pip install agent-framework-lab
+git clone https://github.com/microsoft/agent-framework.git
+cd agent-framework/python/packages/lab
+pip install -e .
 ```

 For details on installing individual modules, see their respective README files listed above.
@@ -7,10 +7,12 @@ It includes built-in benchmarks as well as utilities for running custom evaluati

 ## Setup

-Install the agent-framework-lab package with GAIA dependencies:
+Install from source with GAIA dependencies:

 ```bash
-pip install "agent-framework-lab[gaia]"
+git clone https://github.com/microsoft/agent-framework.git
+cd agent-framework/python/packages/lab
+pip install -e ".[gaia]"
 ```

 Set up Hugging Face token:
@@ -8,20 +8,22 @@ This package enables you to train and fine-tune agents using advanced RL algorit

 ## Installation

-Install the agent-framework-lab package with Lightning dependencies:
+Install from source with Lightning dependencies:

 ```bash
-pip install "agent-framework-lab[lightning]"
+git clone https://github.com/microsoft/agent-framework.git
+cd agent-framework/python/packages/lab
+pip install -e ".[lightning]"
 ```

 ### Optional Dependencies

 ```bash
 # For math-related training
-pip install agent-framework-lab[lightning,math]
+pip install -e ".[lightning,math]"

 # For tau2 benchmarking
-pip install agent-framework-lab[lightning,tau2]
+pip install -e ".[lightning,tau2]"
 ```

 To prepare for RL training, you'll also need to install dependencies like PyTorch, Ray, and vLLM. See the [Agent-lightning setup instructions](https://github.com/microsoft/agent-lightning) for more details.
@@ -13,20 +13,22 @@ Each evaluation runs a multi-turn conversation where the user simulator presents

 ## Supported Domains

-| Domain | Status | Description |
-|--------|--------|-------------|
-| **airline** | ✅ Supported | Customer service for airline booking, changes, and support |
-| **retail** | 🚧 In Development | E-commerce customer support scenarios |
-| **telecom** | 🚧 In Development | Telecommunications service support |
+| Domain      | Status            | Description                                                |
+| ----------- | ----------------- | ---------------------------------------------------------- |
+| **airline** | ✅ Supported      | Customer service for airline booking, changes, and support |
+| **retail**  | 🚧 In Development | E-commerce customer support scenarios                      |
+| **telecom** | 🚧 In Development | Telecommunications service support                         |

-*Note: Currently only the airline domain is fully supported.*
+_Note: Currently only the airline domain is fully supported._

 ## Installation

-Install the agent-framework-lab package with TAU2 dependencies:
+Install from source with TAU2 dependencies:

 ```bash
-pip install "agent-framework-lab[tau2]"
+git clone https://github.com/microsoft/agent-framework.git
+cd agent-framework/python/packages/lab
+pip install -e ".[tau2]"
 ```

 Download data from [Tau2-Bench](https://github.com/sierra-research/tau2-bench):
@@ -104,15 +106,15 @@ python samples/run_benchmark.py --max-steps 20

 The following results are reproduced from our implementation of τ²-bench with `samples/run_benchmark.py`. The table shows the average success rate over the dataset of 50 tasks.

-| Agent Model | User Model | Success Rate |
-|-------------|------------|----------|
-| gpt-5 | gpt-4.1 | 62.0% |
-| gpt-5-mini | gpt-4.1 | 52.0% |
-| gpt-4.1 | gpt-4.1 | 60.0% |
-| gpt-4.1-mini | gpt-4.1 | 50.0% |
-| gpt-4.1 | gpt-4o-mini | 42.0% |
-| gpt-4o | gpt-4.1 | 42.0% |
-| gpt-4o-mini | gpt-4.1 | 26.0% |
+| Agent Model  | User Model  | Success Rate |
+| ------------ | ----------- | ------------ |
+| gpt-5        | gpt-4.1     | 62.0%        |
+| gpt-5-mini   | gpt-4.1     | 52.0%        |
+| gpt-4.1      | gpt-4.1     | 60.0%        |
+| gpt-4.1-mini | gpt-4.1     | 50.0%        |
+| gpt-4.1      | gpt-4o-mini | 42.0%        |
+| gpt-4o       | gpt-4.1     | 42.0%        |
+| gpt-4o-mini  | gpt-4.1     | 26.0%        |
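
As a rough illustration of how a figure in the table above relates to the 50-task dataset, the sketch below averages per-task pass/fail outcomes into a percentage. The `success_rate` helper and the list of boolean outcomes are hypothetical and are not taken from `samples/run_benchmark.py`, whose output format is not shown here. With 50 tasks, each task accounts for 2 percentage points, so 62.0% would correspond to 31 successful tasks.

```python
def success_rate(outcomes: list[bool]) -> float:
    """Return the percentage of tasks that succeeded (hypothetical helper)."""
    if not outcomes:
        raise ValueError("no task outcomes to average")
    # True counts as 1 and False as 0, so sum() gives the number of successes.
    return 100.0 * sum(outcomes) / len(outcomes)


# Hypothetical run: 31 of the 50 tasks succeed.
outcomes = [True] * 31 + [False] * 19
print(f"{success_rate(outcomes):.1f}%")  # prints "62.0%"
```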

 ## Advanced Usage