mirror of
https://github.com/microsoft/agent-framework.git
synced 2026-06-16 21:04:09 +08:00
f5419b9f38
* Python: bump package versions for 1.2.2 release

  PATCH bump (1.2.1 -> 1.2.2) for the released cohort. Five PRs land in this window:

  - agent-framework-openai: fix file_search citations breaking the assistant-message history roundtrip (#5557); drives the released-tier PATCH
  - agent-framework-orchestrations: [BREAKING] standardize orchestration terminal outputs as AgentResponse (#5301)
  - agent-framework-core, agent-framework-declarative: preserve Workflow.run() shared state across calls, accept list[Message] in declarative start executor, and coerce Enum values when serializing PowerFx symbols (#5531)
  - agent-framework-foundry-hosting: add hosted Durable Workflow support (#5531)
  - agent-framework-azure-contentunderstanding: new alpha package, an Azure AI Content Understanding context provider (#4829)
  - dependencies: workspace package dependency refresh (#5555)

  Per lockstep convention, all 21 beta packages stamp 1.0.0b260429 and all 4 alpha packages (now including the new contentunderstanding) stamp 1.0.0a260429. Date stamp reflects 2026-04-29 Pacific. Every non-core package floor on agent-framework-core is raised to >=1.2.2; the new contentunderstanding package's stale >=1.0.0 floor is brought into line.

  Two follow-on fixes bundled to keep validate-dependency-bounds-test green at lowest-direct resolution:

  - Bump agent-framework-azure-contentunderstanding's azure-ai-contentunderstanding lower bound from >=1.0.0 to >=1.0.1 (1.0.0 ships without proper typing; pyright reports 65 unknown-type errors)
  - Add pyright ignore comments to core/foundry/__init__.pyi for the new alpha package's type-stub imports, since alpha packages are not in core's [all] extra and therefore aren't installed at lowest-direct

* Python: add #5552 to 1.2.2 CHANGELOG

  Add the streaming-span observability fix to the Fixed section. PR is on upstream/main but not yet pulled into origin/main; the code itself will land via the PR merge.
* Python: address PR #5561 review feedback on dependency bounds

  Two packaging fixes flagged in review:

  1. agent-framework-azure-contentunderstanding: add agent-framework-foundry as a runtime dependency. The package's README directs users to `pip install agent-framework-azure-contentunderstanding --pre` and the basic example imports `FoundryChatClient` from `agent_framework.foundry`, so the documented install path was failing with ImportError. Pulling agent-framework-foundry into deps makes the advertised entry path self-contained.
  2. agent-framework-foundry: bump agent-framework-openai lower bound from >=1.1.0 to >=1.2.2,<2. Foundry imports private modules from agent_framework_openai (`_chat_client.py:22`, `_agent.py:34`), so resolvers were free to pair foundry==1.2.2 with older OpenAI versions that lack this release's coordinated Responses/history fix. Lockstep the floor with the released cohort to prevent mismatched installs.

  Both changes pass `validate-dependency-bounds-test` lower + upper at their respective packages.
67 lines
2.7 KiB
Python
# Copyright (c) Microsoft. All rights reserved.

"""DevUI Multi-Modal Agent: file upload + CU-powered analysis.

This agent uses Azure Content Understanding (CU) to analyze uploaded files
(PDFs, scanned documents, handwritten images, audio recordings, video)
and answer questions about them through the DevUI web interface.

Unlike the standard azure_responses_agent, which sends files directly to the LLM,
this agent uses CU for structured extraction, which is superior for scanned PDFs,
handwritten content, audio transcription, and video analysis.

Required environment variables:
    FOUNDRY_PROJECT_ENDPOINT: Azure AI Foundry project endpoint
    FOUNDRY_MODEL: Model deployment name (e.g. gpt-4.1)
    AZURE_CONTENTUNDERSTANDING_ENDPOINT: CU endpoint URL

Optional environment variables:
    AZURE_CONTENTUNDERSTANDING_API_KEY: CU API key; when unset, the Azure CLI
        credential is used for CU as well

Run with DevUI:
    uv run poe devui --agent packages/azure-contentunderstanding/samples/devui_multimodal_agent
"""

import os

from agent_framework import Agent
from agent_framework.foundry import ContentUnderstandingContextProvider, FoundryChatClient
from azure.core.credentials import AzureKeyCredential
from azure.identity import AzureCliCredential
from dotenv import load_dotenv

load_dotenv()

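# --- Auth ---
# Foundry always uses the Azure CLI credential. CU uses an API key when
# AZURE_CONTENTUNDERSTANDING_API_KEY is set and falls back to the same
# Azure CLI credential otherwise.
_credential = AzureCliCredential()
_cu_api_key = os.environ.get("AZURE_CONTENTUNDERSTANDING_API_KEY")
_cu_credential = AzureKeyCredential(_cu_api_key) if _cu_api_key else _credential
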
cu = ContentUnderstandingContextProvider(
    endpoint=os.environ["AZURE_CONTENTUNDERSTANDING_ENDPOINT"],
    credential=_cu_credential,
    # max_wait controls how long before_run() waits for CU analysis before
    # deferring to background. For interactive DevUI use, a short timeout
    # (e.g. 5s) keeps the chat responsive: the agent tells the user the
    # file is still being analyzed and resolves it on the next turn.
    # Use max_wait=None to always wait for analysis to complete.
    max_wait=5.0,
)

client = FoundryChatClient(
    project_endpoint=os.environ["FOUNDRY_PROJECT_ENDPOINT"],
    model=os.environ["FOUNDRY_MODEL"],
    credential=_credential,
)

agent = Agent(
    client=client,
    name="MultiModalDocAgent",
    instructions=(
        "You are a helpful document analysis assistant. "
        "When a user uploads files, they are automatically analyzed using Azure Content Understanding. "
        "Use list_documents() to check which documents are ready, pending, or failed "
        "and to see which files are available for answering questions. "
        "Tell the user if any documents are still being analyzed. "
        "You can process PDFs, scanned documents, handwritten images, audio recordings, and video files. "
        "When answering, cite specific content from the documents."
    ),
    context_providers=[cu],
)
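

# --- Illustration only: not part of the sample agent above ---
# The max_wait comment describes a "wait briefly, then defer to background"
# pattern. The sketch below shows one way that pattern can be written with
# plain asyncio. It is an assumption about the general technique, not the
# provider's actual implementation; the names wait_or_defer, _demo, and
# fake_analysis are hypothetical and exist only for this example.
import asyncio
from typing import Any, Awaitable, Optional, Tuple


async def wait_or_defer(work: Awaitable[Any], max_wait: Optional[float]) -> Tuple[bool, Any]:
    """Return (True, result) if work finishes within max_wait, else (False, task)."""
    task = asyncio.ensure_future(work)
    if max_wait is None:
        # No timeout: always wait for the work to complete.
        return True, await task
    # asyncio.wait does not cancel the task on timeout, so it keeps running
    # in the background and can be awaited later (e.g. on the next turn).
    done, _pending = await asyncio.wait({task}, timeout=max_wait)
    if task in done:
        return True, task.result()
    return False, task


async def _demo() -> list:
    async def fake_analysis(delay: float) -> str:
        # Stand-in for a slow analysis call.
        await asyncio.sleep(delay)
        return "analysis complete"

    # Finishes inside the timeout, so the result is available immediately.
    ready, fast_result = await wait_or_defer(fake_analysis(0.01), max_wait=1.0)
    # Exceeds the timeout, so the still-running task is handed back instead.
    ready_slow, background_task = await wait_or_defer(fake_analysis(0.2), max_wait=0.01)
    later_result = await background_task
    return [ready, fast_result, ready_slow, later_result]


# To try the sketch on its own: asyncio.run(_demo())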