Agent Framework Lab - Lightning
Agent Framework Lab Lightning is a specialized package that integrates Microsoft Agent Framework with Agent-lightning to provide reinforcement learning (RL) training capabilities for AI agents.
This package enables you to train and fine-tune agents using advanced RL algorithms from VERL (e.g., GRPO, PPO, Reinforce++) with support for distributed training, multi-GPU setups, and comprehensive monitoring. It also supports complex multi-turn agent interactions during training and optimization techniques like prompt optimization. See the Agent-lightning documentation for details.
Note: This module is part of the consolidated agent-framework-lab package. Install the package with the lightning extra to use this module.
Installation
Install the agent-framework-lab package with Lightning dependencies:
pip install "agent-framework-lab[lightning]"
Optional Dependencies
# For math-related training
pip install -e ".[lightning,math]"
# For tau2 benchmarking
pip install -e ".[lightning,tau2]"
To prepare for RL training, you'll also need to install dependencies like PyTorch, Ray, and vLLM. See the Agent-lightning setup instructions for more details.
Usage Patterns
The basic usage pattern follows these steps:
- Prepare your dataset as a list of samples (typically dictionaries)
- Create an agent function that processes samples and returns evaluation scores
- Decorate with @agentlightning.rollout to enable training
- Configure and run training with the agentlightning.Trainer class
Example Implementation
from typing import Any

from agent_framework import Agent, MCPStdioTool
from agent_framework.openai import OpenAIChatClient
from agent_framework.lab.lightning import AgentFrameworkTracer
from agentlightning import rollout, Trainer, LLM, Dataset
from agentlightning.algorithm.verl import VERL

# Each task is one dataset sample, typically a dictionary
TaskType = Any

@rollout
async def math_agent(task: TaskType, llm: LLM) -> float:
    """A function that solves a math problem and returns the evaluation score."""
    async with (
        MCPStdioTool(name="calculator", command="uvx", args=["mcp-server-calculator"]) as mcp_server,
        Agent(
            client=OpenAIChatClient(
                model_id=llm.model,
                api_key="your-api-key",
                base_url=llm.endpoint,
            ),
            name="MathAgent",
            instructions="Solve the math problem and output answer after ###",
            temperature=llm.sampling_parameters.get("temperature", 0.0),
        ) as agent,
    ):
        result = await agent.run(task["question"], tools=mcp_server)
        # Your evaluation logic here: compute evaluation_score from result
        return evaluation_score

# Training configuration
config = {
    "data": {"train_batch_size": 8},
    "trainer": {"total_epochs": 2, "n_gpus_per_node": 1},
    # ... additional config
}

# Initialize agent-framework tracer to send telemetry data to agent-lightning's observability backend
tracer = AgentFrameworkTracer()
trainer = Trainer(algorithm=VERL(config), tracer=tracer, n_workers=2)

# Both train_dataset and val_dataset are lists of TaskType
trainer.fit(math_agent, train_dataset, val_data=val_dataset)
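The example above leaves the evaluation logic open. Since the agent is instructed to output its answer after ###, one possible scoring step is sketched below. The helper names extract_answer and score_answer are hypothetical and not part of Agent Framework or Agent-lightning. The sketch also assumes that the agent's reply is available as plain text and that each sample carries a reference answer as a string.

```python
from typing import Optional


def extract_answer(response_text: str) -> Optional[str]:
    """Return the text after the last '###' marker, or None if the marker is absent."""
    if "###" not in response_text:
        return None
    return response_text.rsplit("###", 1)[1].strip()


def score_answer(response_text: str, expected: str, tolerance: float = 1e-6) -> float:
    """Hypothetical reward: 1.0 for a matching answer, 0.0 otherwise."""
    answer = extract_answer(response_text)
    if answer is None:
        return 0.0
    try:
        # Compare numerically when both sides parse as numbers (commas are ignored).
        predicted_value = float(answer.replace(",", ""))
        expected_value = float(expected.replace(",", ""))
    except ValueError:
        # Otherwise fall back to a case-insensitive string comparison.
        return 1.0 if answer.lower() == expected.strip().lower() else 0.0
    return 1.0 if abs(predicted_value - expected_value) <= tolerance else 0.0
```

A binary reward like this is easy to reason about during debugging. A real training run may need a more forgiving parser or partial credit.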
Example 1: Training a Math Agent
This example trains an agent that uses an MCP calculator tool to solve math problems. The dataset is a small subset from the Calc-X dataset. The Agent-lightning team has also experimented with a similar agent using a larger dataset. See this example for more details.
Running this example requires at least 40 GB of GPU memory. If you don't have enough GPU memory, you can use a smaller model like Qwen2.5-0.5B-Instruct, though the results won't be as good. To run the example:
cd samples
# Run the ray cluster (see the troubleshooting section for more details)
ray start --head --dashboard-host=0.0.0.0
# Run the training script
python train_math_agent.py
To debug the agent used in the example, you can run the script with the --debug flag:
python train_math_agent.py --debug
The training curve below shows results with Qwen2.5-1.5B-Instruct and GRPO. Validation accuracy increases from 10% to 35% in the first 8 steps, then begins to overfit.
Example 2: Training a Tau2 Agent
This advanced example demonstrates training on complex multi-agent scenarios using the Tau2 benchmark. It features a multi-agent setup with an assistant agent and a user simulator agent, training the assistant while keeping the user simulator fixed. The example incorporates a multi-step workflow with tool usage and complex evaluation metrics. Currently, training uses the airline domain with a 50/50 split between training and validation data.
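The text does not say how the 50/50 split is produced. The sketch below shows one reproducible way to do it, assuming the tasks are available as a list. The split_tasks helper, its train_fraction parameter, and the seed value are hypothetical and are not taken from the sample script.

```python
import random
from typing import Sequence, TypeVar

T = TypeVar("T")


def split_tasks(
    tasks: Sequence[T], train_fraction: float = 0.5, seed: int = 42
) -> tuple[list[T], list[T]]:
    """Shuffle tasks with a fixed seed and split them into train and validation lists."""
    shuffled = list(tasks)
    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Fixing the seed keeps the validation set identical across runs, so accuracy numbers from different training runs remain comparable.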
Before running this example, please read the agent-lightning-lab-tau2 documentation and follow the setup instructions.
To run the example:
# Set required environment variables
export TAU2_DATA_DIR="/path/to/tau2/data"
# Used for user simulator and LLM judge
export OPENAI_BASE_URL="your-endpoint"
export OPENAI_API_KEY="your-key"
# Used for tracking on Weights & Biases
export WANDB_API_KEY="your-key"
# Start the Ray cluster
ray start --head --dashboard-host=0.0.0.0
# Train the tau2 agent
cd samples
python train_tau2_agent.py
# Debug mode
python train_tau2_agent.py --debug
This example uses more advanced Agent-lightning features compared to the math example. It's based on the LitAgent class rather than the @rollout decorator and involves concepts like resources and agent filtering. We recommend reading the Agent-lightning documentation to learn more.
Results with Qwen2.5-1.5B-Instruct and GRPO are shown below. Validation accuracy improves from 28% to 40% over 8 epochs.
Troubleshooting
Ray Connection Issues
Agent-lightning uses VERL for RL training, which depends on Ray. To avoid issues, it's recommended to start Ray manually beforehand. If you encounter Ray startup problems:
# Stop existing Ray processes
ray stop
# Start Ray with debugging enabled
env RAY_DEBUG=legacy HYDRA_FULL_ERROR=1 VLLM_USE_V1=1 ray start --head --dashboard-host=0.0.0.0
Important: Run Ray commands in the same directory as your training script. Set any required environment variables (WANDB_API_KEY, HF_TOKEN) before starting Ray.
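Because Ray picks up its environment when it starts, a small preflight check can catch a missing variable before the cluster is launched. The sketch below is one way to do that. The missing_env_vars helper is hypothetical, and the list of variable names should be adjusted to whatever your run actually needs.

```python
import os
from typing import Iterable, Mapping, Optional


def missing_env_vars(
    names: Iterable[str], env: Optional[Mapping[str, str]] = None
) -> list[str]:
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars(["WANDB_API_KEY", "HF_TOKEN"])
    if missing:
        raise SystemExit(f"Set these variables before running 'ray start': {', '.join(missing)}")
    print("All required environment variables are set.")
```

Run the check in the same shell session in which you then start Ray, so that both see the same environment.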
GPU Memory Issues
- Reduce gpu_memory_utilization to <0.8
- Enable FSDP offloading:
  "fsdp_config": { "param_offload": True, "optimizer_offload": True, }
- Decrease batch sizes:
  - train_batch_size
  - ppo_mini_batch_size
  - log_prob_micro_batch_size_per_gpu
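The sketch below shows how these settings might be layered onto a training config. The nesting under actor_rollout_ref follows VERL's usual config layout, which is an assumption here, so check it against the config your script actually passes to VERL. The numeric values are illustrative only, and deep_merge is a hypothetical helper.

```python
import copy

# Illustrative overrides; the key nesting is assumed from VERL's usual layout.
memory_saving_overrides = {
    "data": {"train_batch_size": 4},
    "actor_rollout_ref": {
        "rollout": {
            "gpu_memory_utilization": 0.6,
            "log_prob_micro_batch_size_per_gpu": 2,
        },
        "actor": {
            "ppo_mini_batch_size": 4,
            "fsdp_config": {"param_offload": True, "optimizer_offload": True},
        },
    },
}


def deep_merge(base: dict, overrides: dict) -> dict:
    """Return a copy of base with overrides applied recursively to nested dicts."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


base_config = {
    "data": {"train_batch_size": 8},
    "trainer": {"total_epochs": 2, "n_gpus_per_node": 1},
}
config = deep_merge(base_config, memory_saving_overrides)
```

Keeping the overrides in a separate dictionary makes it easy to switch back to the original settings once more GPU memory is available.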
Agent Debugging
Always test your agent before training:
# Use debug mode to validate agent behavior
python your_training_script.py --debug
# Check agent responses and evaluation logic
# Ensure proper tool integration and result extraction
Contributing
This package is part of the Microsoft Agent Framework Lab. Please see the main repository for contribution guidelines.
License
This project is licensed under the MIT License - see the LICENSE file for details.

