agent-framework/python/packages/core
Eduard van Valkenburg 6305e3e092 Python: feat(python): Add embedding abstractions and OpenAI implementation (Phase 1) (#4153)
* feat(python): Add embedding abstractions and OpenAI implementation (Phase 1)

This PR contains two parts:

1. **Overall migration plan** for porting vector stores and embeddings from
   Semantic Kernel to Agent Framework (docs/features/vector-stores-and-embeddings/README.md)
   covering all 10 phases from core abstractions through connectors and TextSearch.

2. **Phase 1 implementation** — core embedding abstractions and OpenAI/Azure OpenAI
   embedding clients:

   Core types (_types.py):
   - EmbeddingGenerationOptions TypedDict (total=False)
   - Embedding[EmbeddingT] generic class with model_id, dimensions, created_at
   - GeneratedEmbeddings[EmbeddingT, EmbeddingOptionsT] list container with options, usage
   - EmbeddingInputT (default str) and EmbeddingT (default list[float]) TypeVars

   Protocol + base class (_clients.py):
   - SupportsGetEmbeddings protocol — Generic[EmbeddingInputT, EmbeddingT, OptionsContraT]
   - BaseEmbeddingClient ABC — Generic[EmbeddingInputT, EmbeddingT, OptionsCoT]

   Telemetry (observability.py):
   - EmbeddingTelemetryLayer with gen_ai.operation.name = "embeddings"

   OpenAI implementation (openai/_embedding_client.py):
   - RawOpenAIEmbeddingClient, OpenAIEmbeddingClient, OpenAIEmbeddingOptions
   - Uses _ensure_client() factory pattern

   Azure OpenAI implementation (azure/_embedding_client.py):
   - AzureOpenAIEmbeddingClient following AzureOpenAIChatClient pattern
   - Supports API key, Entra ID credentials, env var configuration

   Tests:
   - 47 unit tests for types, protocol, base class, OpenAI, and Azure clients
   - 6 integration tests (gated behind RUN_INTEGRATION_TESTS + credentials)

   Samples:
   - samples/02-agents/embeddings/openai_embeddings.py
   - samples/02-agents/embeddings/azure_openai_embeddings.py

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: Add AzureOpenAIEmbeddingClient to azure __init__.pyi stub

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* ci: Add embedding env vars to Python integration tests

Map OPENAI_EMBEDDING_MODEL_ID and AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME
from GitHub vars to the integration test environment.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: Handle base64 encoding_format in OpenAI embedding client

When encoding_format='base64' is used, the OpenAI API returns base64-encoded
floats instead of a JSON array. Decode these automatically to list[float]
so the return type stays consistent regardless of encoding format.

Also adds a unit test for base64 decoding and fixes minor docstring/import issues.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: Only record INPUT_TOKENS for embedding telemetry

Embeddings have no output/completion tokens. Remove OUTPUT_TOKENS recording
which was double-counting prompt_tokens via the total_tokens fallback.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: Resolve mypy variance error and lint warning

Use contravariant/covariant TypeVars for SupportsGetEmbeddings Protocol.
Combine nested if into single statement in telemetry layer.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: Make EmbeddingCoT invariant for mypy compatibility

GeneratedEmbeddings is invariant in its type param, so the Protocol
TypeVar cannot be covariant.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: Address PR review - empty values guard, service_url for telemetry

- Add early return for empty values in get_embeddings to avoid unnecessary API calls
- Add service_url() method to RawOpenAIEmbeddingClient for proper telemetry endpoint reporting
- Add test for empty values behavior

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Fix OpenAI chat client compatibility with third-party endpoints and OTel 0.4.14 (#4161)

* Fix system message content sent as list instead of string

Some OpenAI-compatible endpoints (e.g. NVIDIA NIM) reject system messages
when content is a list of content parts. This change flattens system and
developer message content to a plain string in the Chat Completions client.

Fixes https://github.com/microsoft/agent-framework/issues/1407

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix compatibility with opentelemetry-semantic-conventions-ai 0.4.14

Version 0.4.14 removed several LLM_* attributes from SpanAttributes
(LLM_SYSTEM, LLM_REQUEST_MODEL, LLM_RESPONSE_MODEL, LLM_REQUEST_MAX_TOKENS,
LLM_REQUEST_TEMPERATURE, LLM_REQUEST_TOP_P, LLM_TOKEN_TYPE).

Move these to the OtelAttr enum with their well-known gen_ai.* string values
and update all references in observability.py and tests.

Fixes https://github.com/microsoft/agent-framework/issues/4160

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Flatten text-only message content to string for all roles

Extend the system/developer fix to all message roles. Text-only content
lists are now post-processed into plain strings, while multimodal content
(text + images/audio) remains as a list. This fixes compatibility with
OpenAI-like endpoints that cannot deserialize list content (e.g. Foundry
Local's Neutron backend).

Partially fixes https://github.com/microsoft/agent-framework/issues/4084

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix streaming text lost when usage data in same chunk

Some providers (e.g. Gemini) include both usage data and text content
in the same streaming chunk. The early return on chunk.usage caused
text and tool call parsing to be skipped entirely. Remove the early
return and process usage alongside text/tool calls.

Fixes https://github.com/microsoft/agent-framework/issues/3434

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix mypy errors in _chat_client.py

Rename shadowed variable 'args' in system/developer branch to 'sys_args'
and rename loop variable 'content' to 'msg_content' to avoid type conflict.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* reorder imports

* fix: Use OtelAttr.REQUEST_MODEL instead of removed SpanAttributes.LLM_REQUEST_MODEL

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: Add score_threshold to vector store plan

Reference SK .NET PR #13501 for score threshold filtering semantics.
Include score_threshold in SearchOptions from Phase 3.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: Add reference to roji's SK .NET MEVD work for SQL connectors

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: Clear env vars in construction tests to avoid CI leakage

Tests for missing API key / model ID now use monkeypatch.delenv to ensure
env vars from the integration test environment don't prevent the expected
ValueError from being raised.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
6305e3e092 · 2026-02-24 07:40:20 +00:00
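
One of the fixes listed above decodes `encoding_format='base64'` responses back into `list[float]`. The snippet below is a minimal standard-library sketch of that idea, not the client's actual code. It assumes the payload is a packed array of little-endian 32-bit floats, and the helper name `decode_base64_embedding` is hypothetical.

```python
import base64
import struct


def decode_base64_embedding(data: str) -> list[float]:
    """Decode a base64 string holding packed little-endian float32 values.

    Assumption: the payload is a flat float32 array; this helper is
    illustrative and is not part of agent_framework.
    """
    raw = base64.b64decode(data)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


# Round trip with values that float32 represents exactly.
vector = [0.5, -1.25, 2.0]
encoded = base64.b64encode(struct.pack("<3f", *vector)).decode("ascii")
print(decode_base64_embedding(encoded))  # [0.5, -1.25, 2.0]
```

Decoding on the client side keeps the return type the same for both encoding formats, which is the stated goal of that fix.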

Get Started with Microsoft Agent Framework

Highlights

  • Flexible Agent Framework: build, orchestrate, and deploy AI agents and multi-agent systems
  • Multi-Agent Orchestration: Group chat, sequential, concurrent, and handoff patterns
  • Plugin Ecosystem: Extend with native functions, OpenAPI, Model Context Protocol (MCP), and more
  • LLM Support: OpenAI, Azure OpenAI, Azure AI, and more
  • Runtime Support: In-process and distributed agent execution
  • Multimodal: Text, vision, and function calling
  • Cross-Platform: .NET and Python implementations

Quick Install

pip install agent-framework-core --pre
# Optional: Add Azure AI integration
pip install agent-framework-azure-ai --pre

Supported Platforms:

  • Python: 3.10+
  • OS: Windows, macOS, Linux

1. Set Up API Keys

Set as environment variables, or create a .env file at your project root:

OPENAI_API_KEY=sk-...
OPENAI_CHAT_MODEL_ID=...
OPENAI_RESPONSES_MODEL_ID=...
...
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=...
AZURE_OPENAI_CHAT_DEPLOYMENT_NAME=...
...
AZURE_AI_PROJECT_ENDPOINT=...
AZURE_AI_MODEL_DEPLOYMENT_NAME=...

You can also override environment variables by explicitly passing configuration parameters to the chat client constructor:

from agent_framework.azure import AzureOpenAIChatClient

client = AzureOpenAIChatClient(
    api_key="",
    endpoint="",
    deployment_name="",
    api_version="",
)

See the following setup guide for more information.

2. Create a Simple Agent

Create agents and invoke them directly:

import asyncio
from agent_framework import Agent
from agent_framework.openai import OpenAIChatClient

async def main():
    agent = Agent(
        client=OpenAIChatClient(),
        instructions="""
        1) A robot may not injure a human being...
        2) A robot must obey orders given it by human beings...
        3) A robot must protect its own existence...

        Give me the TLDR in exactly 5 words.
        """
    )

    result = await agent.run("Summarize the Three Laws of Robotics")
    print(result)

asyncio.run(main())
# Example output: Protect humans, obey, self-preserve, prioritized.

3. Directly Use Chat Clients (No Agent Required)

You can use the chat client classes directly for advanced workflows:

import asyncio
from agent_framework.openai import OpenAIChatClient
from agent_framework import Message

async def main():
    client = OpenAIChatClient()

    messages = [
        Message("system", ["You are a helpful assistant."]),
        Message("user", ["Write a haiku about Agent Framework."])
    ]

    response = await client.get_response(messages)
    print(response.messages[0].text)

    # Example output:
    #
    # Agents work in sync,
    # Framework threads through each task—
    # Code sparks collaboration.

asyncio.run(main())

4. Build an Agent with Tools and Functions

Enhance your agent with custom tools and function calling:

import asyncio
from typing import Annotated
from random import randint
from pydantic import Field
from agent_framework import Agent
from agent_framework.openai import OpenAIChatClient


def get_weather(
    location: Annotated[str, Field(description="The location to get the weather for.")],
) -> str:
    """Get the weather for a given location."""
    conditions = ["sunny", "cloudy", "rainy", "stormy"]
    return f"The weather in {location} is {conditions[randint(0, 3)]} with a high of {randint(10, 30)}°C."


def get_menu_specials() -> str:
    """Get today's menu specials."""
    return """
    Special Soup: Clam Chowder
    Special Salad: Cobb Salad
    Special Drink: Chai Tea
    """


async def main():
    agent = Agent(
        client=OpenAIChatClient(),
        instructions="You are a helpful assistant that can provide weather and restaurant information.",
        tools=[get_weather, get_menu_specials]
    )

    response = await agent.run("What's the weather in Amsterdam and what are today's specials?")
    print(response)

    # Example output:
    # The weather in Amsterdam is sunny with a high of 22°C. Today's specials include
    # Clam Chowder soup, Cobb Salad, and Chai Tea as the special drink.

asyncio.run(main())

You can explore additional agent samples here.
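
Function calling depends on the model receiving a description of each tool. How the framework builds that description is not shown in this README. As a rough illustration only, the standard library can recover a function's name, docstring, and annotated parameter descriptions. This sketch uses plain string metadata in `Annotated` instead of pydantic's `Field`, and `describe_tool` is a hypothetical helper.

```python
import inspect
from typing import Annotated, Any, Callable, get_args, get_origin, get_type_hints


def describe_tool(func: Callable[..., Any]) -> dict[str, Any]:
    """Build a simple description dict from a function's signature.

    Illustrative only: this is not how agent_framework derives tool schemas.
    """
    hints = get_type_hints(func, include_extras=True)
    params: dict[str, dict[str, str]] = {}
    for name in inspect.signature(func).parameters:
        hint = hints.get(name)
        if get_origin(hint) is Annotated:
            # Annotated[T, meta...] unpacks to (T, meta...)
            base, *metadata = get_args(hint)
            description = next((m for m in metadata if isinstance(m, str)), "")
        else:
            base, description = hint, ""
        params[name] = {
            "type": getattr(base, "__name__", str(base)),
            "description": description,
        }
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": params,
    }


def get_weather(
    location: Annotated[str, "The location to get the weather for."],
) -> str:
    """Get the weather for a given location."""
    return f"The weather in {location} is sunny."


print(describe_tool(get_weather))
```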

5. Multi-Agent Orchestration

Coordinate multiple agents to collaborate on complex tasks using orchestration patterns:

import asyncio
from agent_framework import Agent
from agent_framework.openai import OpenAIChatClient


async def main():
    # Create specialized agents
    writer = Agent(
        client=OpenAIChatClient(),
        name="Writer",
        instructions="You are a creative content writer. Generate and refine slogans based on feedback."
    )

    reviewer = Agent(
        client=OpenAIChatClient(),
        name="Reviewer",
        instructions="You are a critical reviewer. Provide detailed feedback on proposed slogans."
    )

    # Sequential workflow: Writer creates, Reviewer provides feedback
    task = "Create a slogan for a new electric SUV that is affordable and fun to drive."

    # Step 1: Writer creates initial slogan
    initial_result = await writer.run(task)
    print(f"Writer: {initial_result}")

    # Step 2: Reviewer provides feedback
    feedback_request = f"Please review this slogan: {initial_result}"
    feedback = await reviewer.run(feedback_request)
    print(f"Reviewer: {feedback}")

    # Step 3: Writer refines based on feedback
    refinement_request = f"Please refine this slogan based on the feedback: {initial_result}\nFeedback: {feedback}"
    final_result = await writer.run(refinement_request)
    print(f"Final Slogan: {final_result}")

    # Example Output:
    # Writer: "Charge Forward: Affordable Adventure Awaits!"
    # Reviewer: "Good energy, but 'Charge Forward' is overused in EV marketing..."
    # Final Slogan: "Power Up Your Adventure: Premium Feel, Smart Price!"

if __name__ == "__main__":
    asyncio.run(main())

Note: Sequential, Concurrent, Group Chat, Handoff, and Magentic orchestrations are available. See examples in orchestration samples.
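
The example above awaits one agent at a time. For contrast, a concurrent fan-out can be sketched with plain `asyncio.gather`. This is not the framework's concurrent orchestration API, which is covered by the orchestration samples. `StubAgent` is a hypothetical stand-in that only mimics an async `run()` method, so the snippet runs without API keys.

```python
import asyncio


class StubAgent:
    """Hypothetical stand-in exposing an async run() like the agents above."""

    def __init__(self, name: str) -> None:
        self.name = name

    async def run(self, task: str) -> str:
        await asyncio.sleep(0)  # yield control, as a real network call would
        return f"{self.name} handled: {task}"


async def run_concurrently(agents: list[StubAgent], task: str) -> list[str]:
    """Send the same task to every agent at once; results keep input order."""
    return await asyncio.gather(*(agent.run(task) for agent in agents))


results = asyncio.run(
    run_concurrently([StubAgent("Writer"), StubAgent("Reviewer")], "draft a slogan")
)
print(results)
# ['Writer handled: draft a slogan', 'Reviewer handled: draft a slogan']
```

`asyncio.gather` returns results in the order the awaitables were passed, regardless of which one finishes first.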

More Examples & Samples

Agent Framework Documentation