Agent Framework Foundry

This package contains the Microsoft Foundry integrations for Microsoft Agent Framework, including Foundry chat clients, preconfigured Foundry agents, Foundry embedding clients, and Foundry memory providers.

Toolboxes

A toolbox is a named, versioned bundle of hosted tool configurations (code interpreter, file search, image generation, MCP, web search, and so on) stored inside a Microsoft Foundry project. Toolboxes let you define tool configuration once and reuse it across agents.

Authoring a toolbox

Toolboxes can be authored in two ways:

  • Foundry portal: create and version toolboxes through the UI without touching code.
  • Programmatically: use the azure-ai-projects SDK to create, update, and version toolboxes from Python.

Toolbox authoring APIs (ToolboxVersionObject, ToolboxObject, project_client.beta.toolboxes.*) require azure-ai-projects>=2.1.0. Earlier versions can only consume toolboxes that already exist.
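The version requirement above can be checked at startup before any authoring call is attempted. The following is a minimal sketch using only the standard library; the helper names (parse_release, supports_toolbox_authoring, installed_supports_toolbox_authoring) are hypothetical and not part of azure-ai-projects, and the parser deliberately ignores pre-release suffixes, so "2.1.0b1" is treated like "2.1.0".

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum azure-ai-projects release for toolbox authoring, per the note above.
MIN_AUTHORING_VERSION = (2, 1, 0)


def parse_release(version_string: str) -> tuple[int, ...]:
    """Return the leading numeric release segment, e.g. '2.1.0b1' -> (2, 1, 0).

    Simplification (assumption): pre-release and local suffixes are ignored.
    """
    parts: list[int] = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
        if len(digits) != len(piece):
            break  # stop at the first non-numeric suffix
    return tuple(parts)


def supports_toolbox_authoring(version_string: str) -> bool:
    """True if the given version string meets the authoring minimum."""
    release = parse_release(version_string)
    padded = release + (0,) * (3 - len(release))  # '2.2' -> (2, 2, 0)
    return padded[:3] >= MIN_AUTHORING_VERSION


def installed_supports_toolbox_authoring() -> bool:
    """Check the installed azure-ai-projects distribution, if any."""
    try:
        return supports_toolbox_authoring(version("azure-ai-projects"))
    except PackageNotFoundError:
        return False
```

If the check returns False, the environment can still consume existing toolboxes; only the authoring APIs are unavailable.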

Using toolboxes with FoundryAgent

For hosted FoundryAgent, the toolbox must already be attached to the agent in the Microsoft Foundry project. Once attached, the agent invokes its toolbox tools transparently, with no client-side wiring required, and you interact with it the same way you would with any other tool-equipped Foundry agent.

Using toolboxes with FoundryChatClient

Each toolbox is reachable as an MCP server. Connect to the toolbox's MCP endpoint with MCPStreamableHTTPTool; the agent then discovers and calls the toolbox's tools over MCP at runtime:

import asyncio

from agent_framework import Agent, MCPStreamableHTTPTool
from agent_framework.foundry import FoundryChatClient


async def main() -> None:
    # FoundryChatClient(...) is left elided; supply your own project configuration.
    async with Agent(
        client=FoundryChatClient(...),
        instructions="You are a helpful assistant. Use the toolbox tools when useful.",
        # Point the MCP tool at the toolbox's MCP endpoint.
        tools=MCPStreamableHTTPTool(
            name="my_toolbox",
            description="Tools served by my Foundry toolbox",
            url="https://<your-toolbox-mcp-endpoint>",
        ),
    ) as agent:
        result = await agent.run("What tools are available?")
        print(result.text)


asyncio.run(main())
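
Rather than hard-coding the toolbox endpoint, it can be read from configuration and validated before the agent is built. The sketch below is one possible approach under stated assumptions: the environment variable name FOUNDRY_TOOLBOX_MCP_URL and the helper toolbox_endpoint_from_env are hypothetical, and the https-only rule is an assumption based on the placeholder URL above rather than a documented requirement.

```python
import os
from urllib.parse import urlparse


def toolbox_endpoint_from_env(var_name: str = "FOUNDRY_TOOLBOX_MCP_URL") -> str:
    """Read a toolbox MCP endpoint from the environment and sanity-check it.

    The variable name is a hypothetical convention, not one defined by the package.
    """
    url = os.environ.get(var_name, "").strip()
    parsed = urlparse(url)
    # Assumption: endpoints are https URLs; also reject unfilled <placeholders>.
    if parsed.scheme != "https" or not parsed.netloc or "<" in url or ">" in url:
        raise ValueError(f"{var_name} must be set to a valid https URL, got {url!r}")
    return url
```

The returned string can then be passed as the url argument of MCPStreamableHTTPTool.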