mirror of https://github.com/microsoft/agent-framework.git synced 2026-06-16 21:04:09 +08:00

Multimodal Input Examples

This folder contains examples demonstrating how to send multimodal content (images, audio, PDF files) to AI agents using the Agent Framework.

Examples

OpenAI Chat Client

File: openai_chat_multimodal.py
Description: Shows how to send images, audio, and PDF files to OpenAI's Chat Completions API
Supported formats: PNG/JPEG images, WAV/MP3 audio, PDF documents

Azure OpenAI Chat Client

File: azure_chat_multimodal.py
Description: Shows how to send images to Azure OpenAI Chat Completions API
Supported formats: PNG/JPEG images (PDF and audio files are NOT supported by the Azure OpenAI Chat Completions API)

Azure OpenAI Responses Client

File: azure_responses_multimodal.py
Description: Shows how to send images and PDF files to Azure OpenAI Responses API
Supported formats: PNG/JPEG images, PDF documents

Environment Variables

Set the following environment variables before running the examples:

For OpenAI:

OPENAI_API_KEY: Your OpenAI API key

For Azure OpenAI:

AZURE_OPENAI_ENDPOINT: Your Azure OpenAI endpoint
AZURE_OPENAI_CHAT_DEPLOYMENT_NAME: The name of your Azure OpenAI chat model deployment
AZURE_OPENAI_RESPONSES_DEPLOYMENT_NAME: The name of your Azure OpenAI responses model deployment

Optionally for Azure OpenAI:

AZURE_OPENAI_API_VERSION: The API version to use (default is 2024-10-21)
AZURE_OPENAI_API_KEY: Your Azure OpenAI API key (if not using AzureCliCredential)
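A quick pre-flight check can catch missing configuration before a sample runs. The sketch below is not part of the samples: the helper name missing_env_vars and the pairing of each deployment variable with a particular example file are assumptions made for illustration, and only the variable names listed above are used.

```python
import os

# Required variables per example, taken from the lists above.
# The mapping of variables to files is an assumption, not framework behavior.
REQUIRED_ENV_VARS = {
    "openai_chat_multimodal.py": ["OPENAI_API_KEY"],
    "azure_chat_multimodal.py": [
        "AZURE_OPENAI_ENDPOINT",
        "AZURE_OPENAI_CHAT_DEPLOYMENT_NAME",
    ],
    "azure_responses_multimodal.py": [
        "AZURE_OPENAI_ENDPOINT",
        "AZURE_OPENAI_RESPONSES_DEPLOYMENT_NAME",
    ],
}


def missing_env_vars(example, environ=None):
    """Return the required variables that are unset or empty for an example."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS[example] if not env.get(name)]


if __name__ == "__main__":
    for example in REQUIRED_ENV_VARS:
        missing = missing_env_vars(example)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{example}: {status}")
```

Passing a plain dictionary as environ makes the check easy to try without touching the real environment.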

Note: You can also provide configuration directly in code instead of using environment variables:

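from agent_framework.azure import AzureOpenAIChatClient
from azure.identity import AzureCliCredential

# Example: Pass deployment_name and endpoint directly
client = AzureOpenAIChatClient(
    credential=AzureCliCredential(),
    deployment_name="your-deployment-name",
    endpoint="https://your-resource.openai.azure.com",
)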

Authentication

The Azure examples use AzureCliCredential for authentication. Run az login in your terminal before running them, or replace AzureCliCredential with your preferred authentication method (for example, pass the api_key parameter).

Running the Examples

# Run OpenAI example
python openai_chat_multimodal.py

# Run Azure Chat example (requires az login or API key)
python azure_chat_multimodal.py

# Run Azure Responses example (requires az login or API key)
python azure_responses_multimodal.py

Using Your Own Files

The examples include small embedded test files for demonstration. To use your own files:

Method 1: Data URIs (recommended)

import base64

from agent_framework import Content

# Load and encode your file
with open("path/to/your/image.jpg", "rb") as f:
    image_base64 = base64.b64encode(f.read()).decode("utf-8")

# Build the data URI; its media type must match media_type below
image_uri = f"data:image/jpeg;base64,{image_base64}"

# Create a Content item from the data URI
image_content = Content.from_uri(
    uri=image_uri,
    media_type="image/jpeg",
)
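When the file type varies, hard-coding image/jpeg becomes error-prone. The following is a minimal sketch, using only the Python standard library, of a helper that derives the media type from the file extension and builds the data URI in one step. The function name file_to_data_uri and the fallback media type are hypothetical choices, not part of the Agent Framework.

```python
import base64
import mimetypes
from pathlib import Path


def file_to_data_uri(path, default_media_type="application/octet-stream"):
    """Return (data_uri, media_type) for a local file.

    The media type is guessed from the file extension only; the file
    contents are not inspected.
    """
    media_type, _ = mimetypes.guess_type(str(path))
    media_type = media_type or default_media_type
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{media_type};base64,{encoded}", media_type
```

Both returned values can then be passed as the uri and media_type arguments of Content.from_uri, as in the snippet above.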

Method 2: Raw bytes

from agent_framework import Content

# Load raw bytes
with open("path/to/your/image.jpg", "rb") as f:
    image_bytes = f.read()

# Create a Content item from the raw bytes
image_content = Content.from_data(
    data=image_bytes,
    media_type="image/jpeg",
)
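With raw bytes there is no file extension to consult, so the media_type has to come from somewhere else. One option, sketched here under the assumption that a check of the leading bytes is sufficient, is to recognize the well-known file signatures of the formats listed in this README. The helper name sniff_media_type is hypothetical, and the MP3 check only covers files that begin with an ID3v2 tag.

```python
def sniff_media_type(data):
    """Guess a media type from leading magic bytes; return None if unknown."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "image/gif"
    if data.startswith(b"%PDF-"):
        return "application/pdf"
    # WAV and WebP are both RIFF containers; bytes 8-11 name the format.
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "audio/wav"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    # Many, but not all, MP3 files start with an ID3v2 tag.
    if data.startswith(b"ID3"):
        return "audio/mpeg"
    return None
```

A None result is a signal to set media_type explicitly rather than guess.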

Supported File Types

Type	Formats	Notes
Images	PNG, JPEG, GIF, WebP	Most common image formats
Audio	WAV, MP3	For transcription and analysis
Documents	PDF	Text extraction and analysis

API Differences

OpenAI Chat Completions API: Supports images, audio, and PDF files
Azure OpenAI Chat Completions API: Supports images only (no PDF/audio file types)
Azure OpenAI Responses API: Supports images and PDF files

Choose the appropriate client based on your multimodal needs and available APIs.
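The differences listed above can be written down as a small lookup table. This is only an illustrative sketch: the dictionary API_SUPPORT, the helper apis_supporting, and the short content-kind labels are hypothetical names, and the data is transcribed from this README rather than queried from any service.

```python
# Content kinds each API accepts, transcribed from the list above.
API_SUPPORT = {
    "OpenAI Chat Completions": {"image", "audio", "pdf"},
    "Azure OpenAI Chat Completions": {"image"},
    "Azure OpenAI Responses": {"image", "pdf"},
}


def apis_supporting(*kinds):
    """Return the APIs that accept every requested content kind."""
    needed = set(kinds)
    return [api for api, supported in API_SUPPORT.items() if needed <= supported]


if __name__ == "__main__":
    # Which APIs can take an image together with a PDF?
    print(apis_supporting("image", "pdf"))
```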