mirror of https://github.com/microsoft/agent-framework.git synced 2026-06-16 21:04:09 +08:00

Eduard van Valkenburg a2856d3b92 Python: restructure: Python samples into progressive 01-05 layout (#3862 )

2026-02-12 17:36:36 +00:00

Multimodal Input Examples

This folder contains examples demonstrating how to send multimodal content (images, audio, PDF files) to AI agents using the Agent Framework.

Examples

OpenAI Chat Client

File: openai_chat_multimodal.py
Description: Shows how to send images, audio, and PDF files to OpenAI's Chat Completions API
Supported formats: PNG/JPEG images, WAV/MP3 audio, PDF documents

Azure OpenAI Chat Client

File: azure_chat_multimodal.py
Description: Shows how to send images to the Azure OpenAI Chat Completions API
Supported formats: PNG/JPEG images (PDF files are NOT supported by the Azure OpenAI Chat Completions API)

Azure OpenAI Responses Client

File: azure_responses_multimodal.py
Description: Shows how to send images and PDF files to the Azure OpenAI Responses API
Supported formats: PNG/JPEG images, PDF documents (full multimodal support)

Environment Variables

Set the following environment variables before running the examples:

For OpenAI:

OPENAI_API_KEY: Your OpenAI API key

For Azure OpenAI:

AZURE_OPENAI_ENDPOINT: Your Azure OpenAI endpoint
AZURE_OPENAI_CHAT_DEPLOYMENT_NAME: The name of your Azure OpenAI chat model deployment
AZURE_OPENAI_RESPONSES_DEPLOYMENT_NAME: The name of your Azure OpenAI responses model deployment

Optionally for Azure OpenAI:

AZURE_OPENAI_API_VERSION: The API version to use (default is 2024-10-21)
AZURE_OPENAI_API_KEY: Your Azure OpenAI API key (if not using AzureCliCredential)
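
Before running an example, it can help to confirm that the variables it needs are actually set. The following is a minimal sketch using only the standard library. The variable names come from the lists above, but the grouping into per-example profiles (`openai`, `azure_chat`, `azure_responses`) and the helper name `missing_env_vars` are illustrative assumptions, not part of the Agent Framework.

```python
import os

# Variable names are taken from this README. The per-example grouping below
# is an assumption made for illustration; adjust it to match the sample you run.
REQUIRED_VARS = {
    "openai": ["OPENAI_API_KEY"],
    "azure_chat": [
        "AZURE_OPENAI_ENDPOINT",
        "AZURE_OPENAI_CHAT_DEPLOYMENT_NAME",
    ],
    "azure_responses": [
        "AZURE_OPENAI_ENDPOINT",
        "AZURE_OPENAI_RESPONSES_DEPLOYMENT_NAME",
    ],
}


def missing_env_vars(profile, env=None):
    """Return the required variables for `profile` that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS[profile] if not env.get(name)]


if __name__ == "__main__":
    # Report the status of each profile against the current environment.
    for profile in REQUIRED_VARS:
        missing = missing_env_vars(profile)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{profile}: {status}")
```

The optional variables (AZURE_OPENAI_API_VERSION, AZURE_OPENAI_API_KEY) are deliberately left out of the check, since the examples can run without them.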

Note: You can also provide configuration directly in code instead of using environment variables:

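from agent_framework.azure import AzureOpenAIChatClient
from azure.identity import AzureCliCredential

# Example: pass the deployment name and endpoint directly
client = AzureOpenAIChatClient(
    credential=AzureCliCredential(),
    deployment_name="your-deployment-name",
    endpoint="https://your-resource.openai.azure.com",
)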

Authentication

The Azure examples use AzureCliCredential for authentication. Run az login in your terminal before running them, or replace AzureCliCredential with your preferred authentication method (e.g., pass the api_key parameter).

Running the Examples

# Run OpenAI example
python openai_chat_multimodal.py

# Run Azure Chat example (requires az login or API key)
python azure_chat_multimodal.py

# Run Azure Responses example (requires az login or API key)
python azure_responses_multimodal.py

Using Your Own Files

The examples include small embedded test files for demonstration. To use your own files:

Method 1: Data URIs (recommended)

import base64

from agent_framework import Content

# Load your file and encode it as a base64 data URI
with open("path/to/your/image.jpg", "rb") as f:
    image_data = f.read()
image_base64 = base64.b64encode(image_data).decode("utf-8")
image_uri = f"data:image/jpeg;base64,{image_base64}"

# Wrap the data URI in a Content object
image_content = Content.from_uri(
    uri=image_uri,
    media_type="image/jpeg",
)
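
The encoding step in Method 1 generalizes to the other file types listed in this README. The sketch below is a standard-library helper that picks a media type from the file extension and builds the data URI. The helper name `file_to_data_uri` and the `MEDIA_TYPES` map are hypothetical, and the exact media type strings a given API accepts (in particular `audio/mpeg` for MP3) are assumptions worth checking against the sample you are adapting.

```python
import base64
from pathlib import Path

# Extension-to-media-type map for the formats named in this README.
# Which strings a particular API accepts is an assumption, not a guarantee.
MEDIA_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".gif": "image/gif",
    ".webp": "image/webp",
    ".wav": "audio/wav",
    ".mp3": "audio/mpeg",
    ".pdf": "application/pdf",
}


def file_to_data_uri(path):
    """Return (data_uri, media_type) for a file with a known extension."""
    path = Path(path)
    media_type = MEDIA_TYPES.get(path.suffix.lower())
    if media_type is None:
        raise ValueError(f"Unsupported file extension: {path.suffix!r}")
    # Base64 output is pure ASCII, so decoding as ASCII is safe.
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{media_type};base64,{encoded}", media_type
```

The returned pair lines up with the `uri` and `media_type` arguments of `Content.from_uri` shown in Method 1.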

Method 2: Raw bytes

from agent_framework import Content

# Load raw bytes
with open("path/to/your/image.jpg", "rb") as f:
    image_bytes = f.read()

# Wrap the raw bytes in a Content object
image_content = Content.from_data(
    data=image_bytes,
    media_type="image/jpeg",
)
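
With raw bytes, the `media_type` has to be supplied by the caller, and a wrong value is easy to pass when files come from users or other systems. The sketch below guesses the type from well-known file signatures (magic bytes). The function `sniff_media_type` is a hypothetical helper and not part of the Agent Framework. It covers only PNG, JPEG, GIF, WebP, WAV, and PDF; MP3 is omitted because it has no single fixed signature.

```python
def sniff_media_type(data):
    """Guess a media type from leading magic bytes, or return None."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "image/gif"
    if data.startswith(b"%PDF-"):
        return "application/pdf"
    # WebP and WAV are both RIFF containers; bytes 8-11 name the format.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "audio/wav"
    return None
```

A `None` result means the bytes did not match any known signature, in which case it is safer to fail early than to send a guessed media type.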

Supported File Types

Type	Formats	Notes
Images	PNG, JPEG, GIF, WebP	Most common image formats
Audio	WAV, MP3	For transcription and analysis
Documents	PDF	Text extraction and analysis

API Differences

OpenAI Chat Completions API: Supports images, audio, and PDF files
Azure OpenAI Chat Completions API: Supports images only (no PDF/audio file types)
Azure OpenAI Responses API: Supports images and PDF files (full multimodal support)

Choose the appropriate client based on your multimodal needs and available APIs.
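
The differences above can be captured as a small lookup table, which makes the choice explicit when an application needs several content types at once. This is a sketch only: the capability sets restate what this README says, while the keys (`openai_chat`, `azure_chat`, `azure_responses`) and the function `clients_supporting` are illustrative labels rather than framework identifiers.

```python
# Capabilities as stated in this README. The keys are illustrative labels,
# not Agent Framework class names.
SUPPORTED_CONTENT = {
    "openai_chat": {"image", "audio", "pdf"},
    "azure_chat": {"image"},
    "azure_responses": {"image", "pdf"},
}


def clients_supporting(needed):
    """Return the client labels whose supported kinds cover all of `needed`."""
    needed = set(needed)
    return [name for name, kinds in SUPPORTED_CONTENT.items() if needed <= kinds]


if __name__ == "__main__":
    # Which clients can take both an image and a PDF in the same request?
    print(clients_supporting({"image", "pdf"}))
```

An empty result means none of the three example clients covers the requested combination.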
