Python: Foundry hosted agent V2 (#5379)

* Python: Wrapper + Samples 1st (#5177)

* Experiment

* Update dependency and add non streaming

* Add more samples

* Rename samples

* Add invocations

* Comments 1

* Comments 2

* Comments 3

* Improve README

* Add local shell sample

* WIP: Add eval and memory samples

* Update user agent prefix

* Update user agent prefix doc

* Update dependency (#5215)

* Add tests and more content types (#5235)

* Add tests

* fix tests and sample

* Fix formatting

* Remove function approval contents

* Python: Refine samples and upgrade packages (#5261)

* Refine samples and upgrade packages

* Upgrade to a new package that fixes a bug

* Update model env var

* Move samples (#5281)

* Python: Upgrade agentserver packages (#5284)

* Upgrade agentserver packages

* Fix new types

* Python: Add special handling for workflows (#5298)

* Add special handling for workflows

* Address comments

* Improve samples (#5372)

* Python: Add more types (#5378)

* Add more type supports

* Upgrade packages

* Remove TODOs in README

* Fix README

* Comments and mypy

* User agent scoped

* Fix README

* Fix pre commit

* Fix pre commit 2

* Fix pre commit 3

* Fix pre commit 4

* Fix pre commit 5

* Fix pre commit 6

* Add azure-monitor-opentelemetry to dev deps

Fixes Samples & Markdown CI failure. The PR's new transitive dep on
azure-monitor-opentelemetry-exporter (via azure-ai-agentserver-core) makes
pyright resolve the azure.monitor.opentelemetry namespace, flipping the
check_md_code_blocks diagnostic for `configure_azure_monitor` from
reportMissingImports (filtered) to reportAttributeAccessIssue (not filtered).
Installing the umbrella azure-monitor-opentelemetry package in dev makes
pyright resolve the symbol correctly, matching the install guidance the
observability README already gives users.

---------

Co-authored-by: Evan Mattson <evan.mattson@microsoft.com>
Tao Chen
2026-04-20 22:21:27 -07:00
committed by GitHub
parent 07f4c8a8d6
commit ce8b6305d8
87 changed files with 3597 additions and 1197 deletions
@@ -0,0 +1,56 @@
# Foundry Hosted Agents Samples
This directory contains samples that demonstrate how to use the Agent Framework to host agents on Foundry with different capabilities and configurations. Each sample includes a README with instructions on how to set up, run, and interact with the agent.
Read more about Foundry Hosted Agents [here](https://learn.microsoft.com/en-us/azure/foundry/agents/concepts/hosted-agents).
## Environment setup
1. Navigate to the sample directory you want to run, then create and activate a virtual environment:
```bash
python -m venv .venv
# Windows
.venv\Scripts\Activate
# macOS/Linux
source .venv/bin/activate
```
2. Install dependencies:
```bash
pip install -r requirements.txt
```
3. Create a `.env` file with your Foundry configuration following the `env.example` file in the sample.
4. Make sure you are logged in with the Azure CLI:
```bash
az login
```
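Before starting a sample, it can help to confirm that the settings from step 3 are actually present in the environment. The following is a minimal sketch and not part of the samples: the helper name `missing_settings` is hypothetical, and the two variable names are the ones listed in the `env.example` files in this directory.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

# Variable names taken from the samples' env.example files.
REQUIRED_SETTINGS = ("FOUNDRY_PROJECT_ENDPOINT", "MODEL_DEPLOYMENT_NAME")


def missing_settings(env: Mapping[str, str] | None = None) -> list[str]:
    """Return the required settings that are unset or empty in `env`."""
    # Resolve the default at call time so later changes to os.environ are seen.
    source = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not source.get(name)]
```

Calling `missing_settings()` after `load_dotenv()` returns an empty list when both values are set. Some samples need additional variables (for example `TOOLBOX_NAME` and `GITHUB_PAT`), so the tuple would have to be extended for those.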
## Deploying to a Docker container
Navigate to the sample directory and build the Docker image:
```bash
docker build -t hosted-agent-sample .
```
Run the container, passing in the required environment variables:
```bash
docker run -p 8088:8088 \
-e FOUNDRY_PROJECT_ENDPOINT=<your-endpoint> \
-e MODEL_DEPLOYMENT_NAME=<your-model> \
hosted-agent-sample
```
The server will be available at `http://localhost:8088`. You can send requests using the `curl` commands shown in each sample's README.
## Deploying to Foundry
Follow this [guide](https://learn.microsoft.com/en-us/azure/foundry/agents/how-to/deploy-hosted-agent?tabs=bash#configure-your-agent) to deploy your agent to Foundry.
@@ -0,0 +1,6 @@
.venv
__pycache__
*.pyc
*.pyo
*.pyd
.Python
@@ -0,0 +1,2 @@
FOUNDRY_PROJECT_ENDPOINT="..."
MODEL_DEPLOYMENT_NAME="..."
@@ -0,0 +1,16 @@
FROM python:3.12-slim
WORKDIR /app
COPY . user_agent/
WORKDIR /app/user_agent
RUN if [ -f requirements.txt ]; then \
pip install -r requirements.txt; \
else \
echo "No requirements.txt found"; \
fi
EXPOSE 8088
CMD ["python", "main.py"]
@@ -0,0 +1,44 @@
# Basic example of hosting an agent with the `invocations` API
## Running the server locally
### Environment setup
Follow the instructions in the [Environment setup](../../README.md#environment-setup) section of the README in the parent directory to set up your environment and install dependencies.
Run the following command to start the server:
```bash
python main.py
```
### Interacting with the agent
Send a POST request to the server with a JSON body containing a "message" field to interact with the agent. For example:
```bash
curl -X POST http://localhost:8088/invocations -i -H "Content-Type: application/json" -d '{"message": "Hi"}'
```
The server responds with a JSON object containing the response text. The `-i` flag makes `curl` print the HTTP response headers, which include the session ID (`x-agent-session-id`) used for multi-turn conversations. Here is an example of the response:
```text
HTTP/1.1 200
content-length: 34
content-type: application/json
x-agent-invocation-id: ec04d020-a0e7-441e-ae83-db75635a9f83
x-agent-session-id: 9370b9d4-cd13-4436-a57f-03b843ac0e17
x-platform-server: azure-ai-agentserver-core/2.0.0a20260410006 (python/3.12)
date: Fri, 17 Apr 2026 23:46:44 GMT
server: hypercorn-h11
{"response":"Hi! How can I help?"}
```
### Multi-turn conversation
To have a multi-turn conversation with the agent, take the session ID from the `x-agent-session-id` response header of the previous request and pass it as the `agent_session_id` query parameter of the next request. Quote the URL so that the shell does not interpret the `?`. For example:
```bash
curl -X POST "http://localhost:8088/invocations?agent_session_id=9370b9d4-cd13-4436-a57f-03b843ac0e17" -i -H "Content-Type: application/json" -d '{"message": "How are you?"}'
```
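The same two-step exchange can be scripted. The sketch below only builds the requests and reads the session header; the helper names are hypothetical, while the endpoint path, the `agent_session_id` parameter, and the `x-agent-session-id` header are the ones shown in the `curl` examples in this README.

```python
from __future__ import annotations

import json
from collections.abc import Mapping
from urllib.parse import urlencode
from urllib.request import Request

SESSION_HEADER = "x-agent-session-id"


def extract_session_id(headers: Mapping[str, str]) -> str | None:
    """Find the session ID in response headers, ignoring header-name case."""
    for name, value in headers.items():
        if name.lower() == SESSION_HEADER:
            return value
    return None


def build_invocation_request(base_url: str, message: str, session_id: str | None = None) -> Request:
    """Build (but do not send) a POST request for the /invocations endpoint."""
    url = f"{base_url.rstrip('/')}/invocations"
    if session_id:
        # urlencode escapes the value, so the URL stays valid for any session ID.
        url += "?" + urlencode({"agent_session_id": session_id})
    body = json.dumps({"message": message}).encode("utf-8")
    return Request(url, data=body, headers={"Content-Type": "application/json"}, method="POST")
```

To use it, send the first request with `urllib.request.urlopen`, pass the headers of the response to `extract_session_id`, and supply the result as `session_id` when building the next request.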
@@ -0,0 +1,23 @@
name: agent-framework-agent-basic-invocations
description: >
  A basic Agent Framework agent hosted by Foundry.
metadata:
  tags:
    - Agent Framework
    - AI Agent Hosting
    - Azure AI AgentServer
    - Invocations Protocol
    - Streaming
template:
  name: agent-framework-agent-basic-invocations
  kind: hosted
  protocols:
    - protocol: invocations
      version: 1.0.0
  environment_variables:
    - name: MODEL_DEPLOYMENT_NAME
      value: "{{MODEL_DEPLOYMENT_NAME}}"
resources:
  - kind: model
    id: gpt-4.1-mini
    name: MODEL_DEPLOYMENT_NAME
@@ -0,0 +1,9 @@
# yaml-language-server: $schema=https://raw.githubusercontent.com/microsoft/AgentSchema/refs/heads/main/schemas/v1.0/ContainerAgent.yaml
kind: hosted
name: agent-framework-agent-basic-invocations
protocols:
  - protocol: invocations
    version: 1.0.0
resources:
  cpu: '0.25'
  memory: '0.5Gi'
@@ -0,0 +1,36 @@
# Copyright (c) Microsoft. All rights reserved.

import os

from agent_framework import Agent
from agent_framework.foundry import FoundryChatClient
from agent_framework_foundry_hosting import InvocationsHostServer
from azure.identity import AzureCliCredential
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()


def main():
    client = FoundryChatClient(
        project_endpoint=os.environ["FOUNDRY_PROJECT_ENDPOINT"],
        model=os.environ["MODEL_DEPLOYMENT_NAME"],
        credential=AzureCliCredential(),
    )

    agent = Agent(
        client=client,
        instructions="You are a friendly assistant. Keep your answers brief.",
        # History will be managed by the hosting infrastructure, thus there
        # is no need to store history by the service. Learn more at:
        # https://developers.openai.com/api/reference/resources/responses/methods/create
        default_options={"store": False},
    )

    server = InvocationsHostServer(agent)
    server.run()


if __name__ == "__main__":
    main()
@@ -0,0 +1,2 @@
agent-framework
agent-framework-foundry-hosting
@@ -0,0 +1,6 @@
.venv
__pycache__
*.pyc
*.pyo
*.pyd
.Python
@@ -0,0 +1,2 @@
FOUNDRY_PROJECT_ENDPOINT="..."
MODEL_DEPLOYMENT_NAME="..."
@@ -0,0 +1,16 @@
FROM python:3.12-slim
WORKDIR /app
COPY . user_agent/
WORKDIR /app/user_agent
RUN if [ -f requirements.txt ]; then \
pip install -r requirements.txt; \
else \
echo "No requirements.txt found"; \
fi
EXPOSE 8088
CMD ["python", "main.py"]
@@ -0,0 +1,46 @@
# Break-glass example of hosting an agent with the `invocations` API
This sample is the same as the [01_basic](../01_basic/README.md) example, but it demonstrates the "break glass" scenario, in which you write your own `invoke_handler` instead of relying on the default one. This is useful when you want to override the default behavior for certain requests or add custom processing logic.
## Running the server locally
### Environment setup
Follow the instructions in the [Environment setup](../../README.md#environment-setup) section of the README in the parent directory to set up your environment and install dependencies.
Run the following command to start the server:
```bash
python main.py
```
### Interacting with the agent
Send a POST request to the server with a JSON body containing a "message" field to interact with the agent. For example:
```bash
curl -X POST http://localhost:8088/invocations -i -H "Content-Type: application/json" -d '{"message": "Hi"}'
```
The server responds with a JSON object containing the response text. The `-i` flag makes `curl` print the HTTP response headers, which include the session ID (`x-agent-session-id`) used for multi-turn conversations. Here is an example of the response:
```text
HTTP/1.1 200
content-length: 34
content-type: application/json
x-agent-invocation-id: ec04d020-a0e7-441e-ae83-db75635a9f83
x-agent-session-id: 9370b9d4-cd13-4436-a57f-03b843ac0e17
x-platform-server: azure-ai-agentserver-core/2.0.0a20260410006 (python/3.12)
date: Fri, 17 Apr 2026 23:46:44 GMT
server: hypercorn-h11
{"response":"Hi! How can I help?"}
```
### Multi-turn conversation
To have a multi-turn conversation with the agent, take the session ID from the `x-agent-session-id` response header of the previous request and pass it as the `agent_session_id` query parameter of the next request. Quote the URL so that the shell does not interpret the `?`. For example:
```bash
curl -X POST "http://localhost:8088/invocations?agent_session_id=9370b9d4-cd13-4436-a57f-03b843ac0e17" -i -H "Content-Type: application/json" -d '{"message": "How are you?"}'
```
@@ -0,0 +1,23 @@
name: agent-framework-agent-basic-invocations
description: >
  A basic Agent Framework agent hosted by Foundry.
metadata:
  tags:
    - Agent Framework
    - AI Agent Hosting
    - Azure AI AgentServer
    - Invocations Protocol
    - Streaming
template:
  name: agent-framework-agent-basic-invocations
  kind: hosted
  protocols:
    - protocol: invocations
      version: 1.0.0
  environment_variables:
    - name: MODEL_DEPLOYMENT_NAME
      value: "{{MODEL_DEPLOYMENT_NAME}}"
resources:
  - kind: model
    id: gpt-4.1-mini
    name: MODEL_DEPLOYMENT_NAME
@@ -0,0 +1,9 @@
# yaml-language-server: $schema=https://raw.githubusercontent.com/microsoft/AgentSchema/refs/heads/main/schemas/v1.0/ContainerAgent.yaml
kind: hosted
name: agent-framework-agent-basic-invocations
protocols:
  - protocol: invocations
    version: 1.0.0
resources:
  cpu: '0.25'
  memory: '0.5Gi'
@@ -0,0 +1,74 @@
# Copyright (c) Microsoft. All rights reserved.

import os
from collections.abc import AsyncGenerator

from agent_framework import Agent, AgentSession
from agent_framework.foundry import FoundryChatClient
from azure.ai.agentserver.invocations import InvocationAgentServerHost
from azure.identity import DefaultAzureCredential
from dotenv import load_dotenv
from starlette.requests import Request
from starlette.responses import JSONResponse, Response, StreamingResponse

# Load environment variables from .env file
load_dotenv()

# In-memory session store, keyed by session ID.
# WARNING: This is lost on restart. Use durable storage in production.
_sessions: dict[str, AgentSession] = {}

# Create the agent
client = FoundryChatClient(
    project_endpoint=os.environ["FOUNDRY_PROJECT_ENDPOINT"],
    model=os.environ["MODEL_DEPLOYMENT_NAME"],
    credential=DefaultAzureCredential(),
)

agent = Agent(
    client=client,
    instructions="You are a friendly assistant. Keep your answers brief.",
    # History will be managed by the hosting infrastructure, thus there
    # is no need to store history by the service. Learn more at:
    # https://developers.openai.com/api/reference/resources/responses/methods/create
    default_options={"store": False},
)

app = InvocationAgentServerHost()


@app.invoke_handler
async def handle_invoke(request: Request):
    """Handle a multi-turn chat request, streaming the reply when `stream` is true."""
    data = await request.json()
    session_id = request.state.session_id

    stream = data.get("stream", False)
    user_message = data.get("message", None)

    if user_message is None:
        # The error body is a single short string, so a plain Response is
        # sufficient whether or not streaming was requested.
        return Response(content="Missing 'message' in request", status_code=400)

    # Reuse the session for this ID if one exists; otherwise create it.
    session = _sessions.setdefault(session_id, AgentSession(session_id=session_id))

    if stream:

        async def stream_response() -> AsyncGenerator[str, None]:
            async for update in agent.run(user_message, session=session, stream=True):
                yield update.text

        return StreamingResponse(
            stream_response(),
            media_type="text/event-stream",
            headers={"Cache-Control": "no-cache", "Connection": "keep-alive"},
        )

    response = await agent.run(user_message, session=session, stream=False)
    return JSONResponse({"response": response.text})


if __name__ == "__main__":
    app.run()
@@ -0,0 +1,2 @@
agent-framework
azure-ai-agentserver-invocations
@@ -0,0 +1,8 @@
# Hosting agents with Foundry Hosting and the `invocations` API
This folder contains samples that show how to host agents using the `invocations` API and deploy them to Foundry Hosting.
| Sample | Description |
| --- | --- |
| [01_basic](./01_basic) | A basic example of hosting an agent with the `invocations` API and carrying on a multi-turn conversation. |
| [02_break_glass](./02_break_glass) | An example of hosting an agent with the `invocations` API and a "break glass" scenario where you can create your own `invoke_handler` to handle specific types of invocations. |
@@ -0,0 +1,6 @@
.venv
__pycache__
*.pyc
*.pyo
*.pyd
.Python
@@ -0,0 +1,2 @@
FOUNDRY_PROJECT_ENDPOINT="..."
MODEL_DEPLOYMENT_NAME="..."
@@ -0,0 +1,16 @@
FROM python:3.12-slim
WORKDIR /app
COPY . user_agent/
WORKDIR /app/user_agent
RUN if [ -f requirements.txt ]; then \
pip install -r requirements.txt; \
else \
echo "No requirements.txt found"; \
fi
EXPOSE 8088
CMD ["python", "main.py"]
@@ -0,0 +1,31 @@
# Basic example of hosting an agent with the `responses` API
This agent has only an instruction (a persona). It is the most basic kind of agent: an LLM with no tools.
## Running the server locally
### Environment setup
Follow the instructions in the [Environment setup](../../README.md#environment-setup) section of the README in the parent directory to set up your environment and install dependencies.
Run the following command to start the server:
```bash
python main.py
```
## Interacting with the agent
Send a POST request to the server with a JSON body containing an "input" field to interact with the agent. For example:
```bash
curl -X POST http://localhost:8088/responses -H "Content-Type: application/json" -d '{"input": "Hi"}'
```
## Multi-turn conversation
To have a multi-turn conversation with the agent, include the ID of the previous response in the request body as `previous_response_id`. For example:
```bash
curl -X POST http://localhost:8088/responses -H "Content-Type: application/json" -d '{"input": "How are you?", "previous_response_id": "REPLACE_WITH_PREVIOUS_RESPONSE_ID"}'
```
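The chaining of turns can be sketched in Python as follows. The helper names are hypothetical, and the sketch assumes that the response body is JSON with the response identifier in a top-level `id` field, as in the OpenAI Responses API; this README does not show the response body, so verify that against your server's actual output.

```python
from __future__ import annotations

import json
from typing import Any


def build_responses_payload(input_text: str, previous_response_id: str | None = None) -> dict[str, Any]:
    """Build the JSON body for a POST to /responses."""
    payload: dict[str, Any] = {"input": input_text}
    if previous_response_id:
        # Linking to the previous response continues the same conversation.
        payload["previous_response_id"] = previous_response_id
    return payload


def response_id_of(response_body: str) -> str | None:
    """Read the `id` field from a JSON response body (assumed field name)."""
    data = json.loads(response_body)
    value = data.get("id")
    return value if isinstance(value, str) else None
```

The first turn sends `build_responses_payload("Hi")`; each later turn passes the ID returned by `response_id_of` for the previous response body.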
@@ -0,0 +1,23 @@
name: agent-framework-agent-basic
description: >
  A basic Agent Framework agent hosted by Foundry.
metadata:
  tags:
    - Agent Framework
    - AI Agent Hosting
    - Azure AI AgentServer
    - Responses Protocol
    - Streaming
template:
  name: agent-framework-agent-basic
  kind: hosted
  protocols:
    - protocol: responses
      version: 1.0.0
  environment_variables:
    - name: MODEL_DEPLOYMENT_NAME
      value: "{{MODEL_DEPLOYMENT_NAME}}"
resources:
  - kind: model
    id: gpt-4.1-mini
    name: MODEL_DEPLOYMENT_NAME
@@ -0,0 +1,8 @@
kind: hosted
name: agent-framework-agent-basic
protocols:
  - protocol: responses
    version: 1.0.0
resources:
  cpu: "0.25"
  memory: 0.5Gi
@@ -0,0 +1,36 @@
# Copyright (c) Microsoft. All rights reserved.

import os

from agent_framework import Agent
from agent_framework.foundry import FoundryChatClient
from agent_framework_foundry_hosting import ResponsesHostServer
from azure.identity import AzureCliCredential
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()


def main():
    client = FoundryChatClient(
        project_endpoint=os.environ["FOUNDRY_PROJECT_ENDPOINT"],
        model=os.environ["MODEL_DEPLOYMENT_NAME"],
        credential=AzureCliCredential(),
    )

    agent = Agent(
        client=client,
        instructions="You are a friendly assistant. Keep your answers brief.",
        # History will be managed by the hosting infrastructure, thus there
        # is no need to store history by the service. Learn more at:
        # https://developers.openai.com/api/reference/resources/responses/methods/create
        default_options={"store": False},
    )

    server = ResponsesHostServer(agent)
    server.run()


if __name__ == "__main__":
    main()
@@ -0,0 +1,2 @@
agent-framework
agent-framework-foundry-hosting
@@ -0,0 +1,6 @@
.venv
__pycache__
*.pyc
*.pyo
*.pyd
.Python
@@ -0,0 +1,2 @@
FOUNDRY_PROJECT_ENDPOINT="..."
MODEL_DEPLOYMENT_NAME="..."
@@ -0,0 +1,16 @@
FROM python:3.12-slim
WORKDIR /app
COPY . user_agent/
WORKDIR /app/user_agent
RUN if [ -f requirements.txt ]; then \
pip install -r requirements.txt; \
else \
echo "No requirements.txt found"; \
fi
EXPOSE 8088
CMD ["python", "main.py"]
@@ -0,0 +1,27 @@
# Basic example of hosting an agent with the `responses` API and local tools
This agent is equipped with a function tool and a local shell tool.
> We recommend running this sample in a local container or deploying it to Foundry Hosting, because the agent has access to a local shell tool that can run arbitrary commands on the machine.
## Running the server locally
### Environment setup
Follow the instructions in the [Environment setup](../../README.md#environment-setup) section of the README in the parent directory to set up your environment and install dependencies.
Run the following command to start the server:
```bash
python main.py
```
## Interacting with the agent
Send a POST request to the server with a JSON body containing an "input" field to interact with the agent. For example:
```bash
curl -X POST http://localhost:8088/responses -H "Content-Type: application/json" -d '{"input": "What is the weather in Seattle?"}'
curl -X POST http://localhost:8088/responses -H "Content-Type: application/json" -d '{"input": "List the files in the current directory."}'
```
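If you adapt the shell tool for your own agent, one way to reduce exposure is to accept only a fixed set of executables and to run them without a shell. The sketch below is an illustration under stated assumptions and is not how the sample's `run_bash` tool works: the names `ALLOWED_COMMANDS`, `is_allowed`, and `run_restricted` are hypothetical, and the allowlist contents are placeholders.

```python
from __future__ import annotations

import shlex
import subprocess

# Hypothetical allowlist; choose the commands your own agent actually needs.
ALLOWED_COMMANDS = frozenset({"ls", "pwd", "cat"})


def is_allowed(command: str, allowed: frozenset[str] = ALLOWED_COMMANDS) -> bool:
    """Return True if the command's executable is on the allowlist."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar parse errors are rejected outright.
        return False
    return bool(tokens) and tokens[0] in allowed


def run_restricted(command: str, timeout: float = 30.0) -> str:
    """Run an allowlisted command without a shell and return its stdout."""
    if not is_allowed(command):
        return f"Command rejected: {command!r}"
    # Passing a token list (no shell=True) means pipes, redirects, and
    # command substitution are not interpreted by a shell.
    result = subprocess.run(shlex.split(command), capture_output=True, text=True, timeout=timeout)
    return result.stdout
```

An allowlist is not a sandbox: an allowed command such as `cat` can still read any file the process can access, so the container recommendation above still applies.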
@@ -0,0 +1,23 @@
name: agent-framework-agent-with-local-tools
description: >
  An Agent Framework agent with local tools hosted by Foundry.
metadata:
  tags:
    - Agent Framework
    - AI Agent Hosting
    - Azure AI AgentServer
    - Responses Protocol
    - Streaming
template:
  name: agent-framework-agent-with-local-tools
  kind: hosted
  protocols:
    - protocol: responses
      version: 1.0.0
  environment_variables:
    - name: MODEL_DEPLOYMENT_NAME
      value: "{{MODEL_DEPLOYMENT_NAME}}"
resources:
  - kind: model
    id: gpt-4.1-mini
    name: MODEL_DEPLOYMENT_NAME
@@ -0,0 +1,8 @@
kind: hosted
name: agent-framework-agent-with-local-tools
protocols:
  - protocol: responses
    version: 1.0.0
resources:
  cpu: "0.25"
  memory: 0.5Gi
@@ -0,0 +1,74 @@
# Copyright (c) Microsoft. All rights reserved.

import os
import subprocess
from random import randint
from typing import Annotated

from agent_framework import Agent, tool
from agent_framework.foundry import FoundryChatClient
from agent_framework_foundry_hosting import ResponsesHostServer
from azure.identity import AzureCliCredential
from dotenv import load_dotenv
from pydantic import Field

# Load environment variables from .env file
load_dotenv()


@tool(approval_mode="never_require")
def get_weather(
    location: Annotated[str, Field(description="The location to get the weather for.")],
) -> str:
    """Get the weather for a given location."""
    conditions = ["sunny", "cloudy", "rainy", "stormy"]
    return f"The weather in {location} is {conditions[randint(0, 3)]} with a high of {randint(10, 30)}°C."


@tool(approval_mode="always_require")
def run_bash(command: str) -> str:
    """Execute a shell command locally and return stdout, stderr, and exit code."""
    try:
        result = subprocess.run(
            command,
            shell=True,
            capture_output=True,
            text=True,
            timeout=30,
        )
        parts: list[str] = []
        if result.stdout:
            parts.append(result.stdout)
        if result.stderr:
            parts.append(f"stderr: {result.stderr}")
        parts.append(f"exit_code: {result.returncode}")
        return "\n".join(parts)
    except subprocess.TimeoutExpired:
        return "Command timed out after 30 seconds"
    except Exception as e:
        return f"Error executing command: {e}"


def main():
    client = FoundryChatClient(
        project_endpoint=os.environ["FOUNDRY_PROJECT_ENDPOINT"],
        model=os.environ["MODEL_DEPLOYMENT_NAME"],
        credential=AzureCliCredential(),
    )

    agent = Agent(
        client=client,
        instructions="You are a friendly assistant. Keep your answers brief.",
        tools=[get_weather, run_bash],
        # History will be managed by the hosting infrastructure, thus there
        # is no need to store history by the service. Learn more at:
        # https://developers.openai.com/api/reference/resources/responses/methods/create
        default_options={"store": False},
    )

    server = ResponsesHostServer(agent)
    server.run()


if __name__ == "__main__":
    main()
@@ -0,0 +1,2 @@
agent-framework
agent-framework-foundry-hosting
@@ -0,0 +1,6 @@
.venv
__pycache__
*.pyc
*.pyo
*.pyd
.Python
@@ -0,0 +1,4 @@
FOUNDRY_PROJECT_ENDPOINT="..."
MODEL_DEPLOYMENT_NAME="..."
TOOLBOX_NAME="..."
GITHUB_PAT="..."
@@ -0,0 +1,16 @@
FROM python:3.12-slim
WORKDIR /app
COPY . user_agent/
WORKDIR /app/user_agent
RUN if [ -f requirements.txt ]; then \
pip install -r requirements.txt; \
else \
echo "No requirements.txt found"; \
fi
EXPOSE 8088
CMD ["python", "main.py"]
@@ -0,0 +1,25 @@
# Basic example of hosting an agent with the `responses` API and remote MCP servers
This agent is equipped with a GitHub MCP server and a Foundry Toolbox, both of which are remote MCP servers.
> Note that there are other ways to interact with Foundry Toolboxes. Using one as an MCP server is just one of the options.
## Running the server locally
### Environment setup
Follow the instructions in the [Environment setup](../../README.md#environment-setup) section of the README in the parent directory to set up your environment and install dependencies.
Run the following command to start the server:
```bash
python main.py
```
## Interacting with the agent
Send a POST request to the server with a JSON body containing an "input" field to interact with the agent. For example:
```bash
curl -X POST http://localhost:8088/responses -H "Content-Type: application/json" -d '{"input": "List all the repositories I own on GitHub."}'
```
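The sample's `main.py` derives the Toolbox MCP endpoint from the `FOUNDRY_PROJECT_ENDPOINT` and `TOOLBOX_NAME` settings. As a small sketch of that derivation, the helper below reproduces the URL shape used in `main.py`; the function name `toolbox_mcp_endpoint` is hypothetical, and the endpoint in the usage note is a made-up placeholder.

```python
def toolbox_mcp_endpoint(project_endpoint: str, toolbox_name: str) -> str:
    """Build the Toolbox MCP URL the same way the sample's main.py does."""
    if not toolbox_name:
        raise ValueError("toolbox_name must not be empty")
    # Strip a trailing slash so the joined path never contains "//".
    return f"{project_endpoint.rstrip('/')}/toolboxes/{toolbox_name}/mcp?api-version=v1"
```

For example, `toolbox_mcp_endpoint("https://example.invalid/api/projects/demo/", "my-toolbox")` gives a URL ending in `/toolboxes/my-toolbox/mcp?api-version=v1`, whether or not the project endpoint has a trailing slash.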
@@ -0,0 +1,27 @@
name: agent-framework-agent-with-remote-mcp-tools
description: >
  An Agent Framework agent with remote MCP tools hosted by Foundry.
metadata:
  tags:
    - Agent Framework
    - AI Agent Hosting
    - Azure AI AgentServer
    - Responses Protocol
    - Streaming
template:
  name: agent-framework-agent-with-remote-mcp-tools
  kind: hosted
  protocols:
    - protocol: responses
      version: 1.0.0
  environment_variables:
    - name: MODEL_DEPLOYMENT_NAME
      value: "{{MODEL_DEPLOYMENT_NAME}}"
    - name: GITHUB_PAT
      value: ${GITHUB_PAT}
    - name: TOOLBOX_NAME
      value: ${TOOLBOX_NAME}
resources:
  - kind: model
    id: gpt-4.1-mini
    name: MODEL_DEPLOYMENT_NAME
@@ -0,0 +1,8 @@
kind: hosted
name: agent-framework-agent-with-remote-mcp-tools
protocols:
  - protocol: responses
    version: 1.0.0
resources:
  cpu: "0.25"
  memory: 0.5Gi
@@ -0,0 +1,76 @@
# Copyright (c) Microsoft. All rights reserved.

import os

import httpx
from agent_framework import Agent, MCPStreamableHTTPTool
from agent_framework.foundry import FoundryChatClient
from agent_framework_foundry_hosting import ResponsesHostServer
from azure.identity import AzureCliCredential
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()


class ToolboxAuth(httpx.Auth):
    """httpx Auth that injects a fresh bearer token on every request."""

    def auth_flow(self, request: httpx.Request):
        credential = AzureCliCredential()
        token = credential.get_token("https://ai.azure.com/.default").token
        request.headers["Authorization"] = f"Bearer {token}"
        yield request


def main():
    client = FoundryChatClient(
        project_endpoint=os.environ["FOUNDRY_PROJECT_ENDPOINT"],
        model=os.environ["MODEL_DEPLOYMENT_NAME"],
        credential=AzureCliCredential(),
    )

    # Foundry Toolbox as an MCP tool
    project_endpoint = os.environ["FOUNDRY_PROJECT_ENDPOINT"]
    toolbox_name = os.environ["TOOLBOX_NAME"]
    toolbox_endpoint = f"{project_endpoint.rstrip('/')}/toolboxes/{toolbox_name}/mcp?api-version=v1"
    http_client = httpx.AsyncClient(auth=ToolboxAuth(), headers={"Foundry-Features": "Toolboxes=V1Preview"})
    foundry_mcp_tool = MCPStreamableHTTPTool(
        name="toolbox",
        url=toolbox_endpoint,
        http_client=http_client,
        load_prompts=False,
    )

    # GitHub MCP server
    # Use .get() so that a missing variable reaches the explanatory error
    # below instead of raising a bare KeyError.
    github_pat = os.environ.get("GITHUB_PAT")
    if not github_pat:
        raise ValueError(
            "GITHUB_PAT environment variable must be set. Create a token at https://github.com/settings/tokens"
        )
    github_mcp_tool = client.get_mcp_tool(
        name="GitHub",
        url="https://api.githubcopilot.com/mcp/",
        headers={
            "Authorization": f"Bearer {github_pat}",
        },
        approval_mode="never_require",
    )

    agent = Agent(
        client=client,
        instructions="You are a friendly assistant. Keep your answers brief.",
        tools=[foundry_mcp_tool, github_mcp_tool],
        # History will be managed by the hosting infrastructure, thus there
        # is no need to store history by the service. Learn more at:
        # https://developers.openai.com/api/reference/resources/responses/methods/create
        default_options={"store": False},
    )

    server = ResponsesHostServer(agent)
    server.run()


if __name__ == "__main__":
    main()
@@ -0,0 +1,2 @@
agent-framework
agent-framework-foundry-hosting
@@ -0,0 +1,6 @@
.venv
__pycache__
*.pyc
*.pyo
*.pyd
.Python
@@ -0,0 +1,2 @@
FOUNDRY_PROJECT_ENDPOINT="..."
MODEL_DEPLOYMENT_NAME="..."
@@ -0,0 +1,16 @@
FROM python:3.12-slim
WORKDIR /app
COPY . user_agent/
WORKDIR /app/user_agent
RUN if [ -f requirements.txt ]; then \
pip install -r requirements.txt; \
else \
echo "No requirements.txt found"; \
fi
EXPOSE 8088
CMD ["python", "main.py"]
@@ -0,0 +1,23 @@
# Basic example of hosting an agent with the `responses` API and a workflow
This sample demonstrates how to host a workflow using the `responses` API.
## Running the server locally
### Environment setup
Follow the instructions in the [Environment setup](../../README.md#environment-setup) section of the README in the parent directory to set up your environment and install dependencies.
Run the following command to start the server:
```bash
python main.py
```
## Interacting with the agent
Send a POST request to the server with a JSON body containing an "input" field to interact with the workflow. For example:
```bash
curl -X POST http://localhost:8088/responses -H "Content-Type: application/json" -d '{"input": "Create a slogan for a new electric SUV that is affordable and fun to drive."}'
```
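The sample's `main.py` chains three agents (writer, legal reviewer, formatter), passes each agent only the previous agent's output, and returns only the final result. The plain-Python sketch below illustrates that data flow without using any Agent Framework API: `run_sequential` and the three stand-in functions are hypothetical, and in the sample each stage is an LLM-backed agent.

```python
from collections.abc import Callable, Sequence

Stage = Callable[[str], str]


def run_sequential(stages: Sequence[Stage], user_input: str) -> str:
    """Feed each stage only the previous stage's output; return the last output."""
    current = user_input
    for stage in stages:
        # Earlier outputs are discarded, so a stage never sees the full history.
        current = stage(current)
    return current


# Stand-in stages with fixed behavior, used only to show the flow.
def writer(topic: str) -> str:
    return f"Slogan for {topic}"


def legal_reviewer(slogan: str) -> str:
    return slogan + " (terms apply)"


def formatter(slogan: str) -> str:
    return f"*** {slogan.upper()} ***"
```

Calling `run_sequential([writer, legal_reviewer, formatter], "an electric SUV")` returns only the formatter's string, which mirrors how the sample limits its output to the last executor.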
@@ -0,0 +1,23 @@
name: agent-framework-workflows
description: >
  An Agent Framework workflow hosted by Foundry.
metadata:
  tags:
    - Agent Framework
    - AI Agent Hosting
    - Azure AI AgentServer
    - Responses Protocol
    - Streaming
template:
  name: agent-framework-workflows
  kind: hosted
  protocols:
    - protocol: responses
      version: 1.0.0
  environment_variables:
    - name: MODEL_DEPLOYMENT_NAME
      value: "{{MODEL_DEPLOYMENT_NAME}}"
resources:
  - kind: model
    id: gpt-4.1-mini
    name: MODEL_DEPLOYMENT_NAME
@@ -0,0 +1,8 @@
kind: hosted
name: agent-framework-workflows
protocols:
  - protocol: responses
    version: 1.0.0
resources:
  cpu: "0.25"
  memory: 0.5Gi
@@ -0,0 +1,70 @@
# Copyright (c) Microsoft. All rights reserved.

import os

from agent_framework import Agent, AgentExecutor, WorkflowBuilder
from agent_framework.foundry import FoundryChatClient
from agent_framework_foundry_hosting import ResponsesHostServer
from azure.identity import AzureCliCredential
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()


def main():
    client = FoundryChatClient(
        project_endpoint=os.environ["FOUNDRY_PROJECT_ENDPOINT"],
        model=os.environ["MODEL_DEPLOYMENT_NAME"],
        credential=AzureCliCredential(),
    )

    writer_agent = Agent(
        client=client,
        instructions=("You are an excellent slogan writer. You create new slogans based on the given topic."),
        name="writer",
    )

    legal_agent = Agent(
        client=client,
        instructions=(
            "You are an excellent legal reviewer. "
            "Make necessary corrections to the slogan so that it is legally compliant."
        ),
        name="legal_reviewer",
    )

    format_agent = Agent(
        client=client,
        instructions=(
            "You are an excellent content formatter. "
            "You take the slogan and format it in a cool retro style when printing to a terminal."
        ),
        name="formatter",
    )

    # Set the context mode to `last_agent` so that each agent only sees the output of the
    # previous agent instead of the full conversation history
    writer_executor = AgentExecutor(writer_agent, context_mode="last_agent")
    legal_executor = AgentExecutor(legal_agent, context_mode="last_agent")
    format_executor = AgentExecutor(format_agent, context_mode="last_agent")

    workflow_agent = (
        WorkflowBuilder(
            start_executor=writer_executor,
            # Limiting the output to only the final formatted result.
            # If this is not set, all intermediate results will be included in the output.
            output_executors=[format_executor],
        )
        .add_edge(writer_executor, legal_executor)
        .add_edge(legal_executor, format_executor)
        .build()
        .as_agent()
    )

    server = ResponsesHostServer(workflow_agent)
    server.run()


if __name__ == "__main__":
    main()
@@ -0,0 +1,2 @@
agent-framework
agent-framework-foundry-hosting
@@ -0,0 +1,11 @@
# Hosting agents with Foundry Hosting and the `responses` API
This folder contains samples that show how to host agents using the `responses` API and deploy them to Foundry Hosting.
| Sample | Description |
| --- | --- |
| [01_basic](./01_basic) | A basic example of hosting an agent with the `responses` API and carrying on a multi-turn conversation. |
| [02_local_tools](./02_local_tools) | An example of hosting an agent with the `responses` API and local tools including a function tool and a local shell tool. |
| [03_remote_mcp](./03_remote_mcp) | An example of hosting an agent with the `responses` API and remote MCPs, including a GitHub MCP server and a Foundry Toolbox. |
| [04_workflows](./04_workflows) | An example of hosting a workflow with the `responses` API. |
| [using_deployed_agent.py](./using_deployed_agent.py) | An example of how to use the deployed agent in Agent Framework. |
@@ -0,0 +1,50 @@
# Copyright (c) Microsoft. All rights reserved.

import asyncio
from typing import Any

from agent_framework import Agent, AgentResponse, AgentResponseUpdate, ResponseStream
from agent_framework.openai import OpenAIChatClient

"""
This script demonstrates how to talk to a deployed agent using the OpenAIChatClient.
Depending on where you have deployed your agent (local or Foundry Hosting), you may
need to change the base_url when initializing the OpenAIChatClient.
"""


async def print_streaming_response(streaming_response: ResponseStream[AgentResponseUpdate, AgentResponse[Any]]) -> None:
    # Print each text chunk as it arrives, without a trailing newline.
    async for chunk in streaming_response:
        if chunk.text:
            print(chunk.text, end="", flush=True)


async def main() -> None:
    agent = Agent(client=OpenAIChatClient(base_url="http://localhost:8088"))
    # Reusing one session across turns keeps the conversation context.
    session = agent.create_session()

    # First turn
    query = "Hi!"
    print(f"User: {query}")
    print("Agent: ", end="", flush=True)
    streaming_response = agent.run(query, session=session, stream=True)
    await print_streaming_response(streaming_response)

    # Second turn
    query = "Your name is Javis. What can you do?"
    print(f"\nUser: {query}")
    print("Agent: ", end="", flush=True)
    streaming_response = agent.run(query, session=session, stream=True)
    await print_streaming_response(streaming_response)

    # Third turn
    query = "What is your name?"
    print(f"\nUser: {query}")
    print("Agent: ", end="", flush=True)
    streaming_response = agent.run(query, session=session, stream=True)
    await print_streaming_response(streaming_response)


if __name__ == "__main__":
    asyncio.run(main())