mirror of
https://github.com/microsoft/agent-framework.git
synced 2026-06-16 21:04:09 +08:00
f5419b9f38
* Python: bump package versions for 1.2.2 release

  PATCH bump (1.2.1 -> 1.2.2) for the released cohort. Five PRs land in this window:

  - agent-framework-openai: fix file_search citations breaking the assistant-message history roundtrip (#5557); drives the released-tier PATCH
  - agent-framework-orchestrations: [BREAKING] standardize orchestration terminal outputs as AgentResponse (#5301)
  - agent-framework-core, agent-framework-declarative: preserve Workflow.run() shared state across calls, accept list[Message] in declarative start executor, and coerce Enum values when serializing PowerFx symbols (#5531)
  - agent-framework-foundry-hosting: add hosted Durable Workflow support (#5531)
  - agent-framework-azure-contentunderstanding: new alpha package, the Azure AI Content Understanding context provider (#4829)
  - dependencies: workspace package dependency refresh (#5555)

  Per lockstep convention, all 21 beta packages stamp 1.0.0b260429 and all 4 alpha packages (now including the new contentunderstanding) stamp 1.0.0a260429. Date stamp reflects 2026-04-29 Pacific.

  Every non-core package floor on agent-framework-core is raised to >=1.2.2; the new contentunderstanding package's stale >=1.0.0 floor is brought into line.

  Two follow-on fixes bundled to keep validate-dependency-bounds-test green at lowest-direct resolution:

  - Bump agent-framework-azure-contentunderstanding's azure-ai-contentunderstanding lower bound from >=1.0.0 to >=1.0.1 (1.0.0 ships without proper typing; pyright reports 65 unknown-type errors)
  - Add pyright ignore comments to core/foundry/__init__.pyi for the new alpha package's type-stub imports, since alpha packages are not in core's [all] extra and therefore aren't installed at lowest-direct

* Python: add #5552 to 1.2.2 CHANGELOG

  Add the streaming-span observability fix to the Fixed section. PR is on upstream/main but not yet pulled into origin/main; the code itself will land via the PR merge.

* Python: address PR #5561 review feedback on dependency bounds

  Two packaging fixes flagged in review:

  1. agent-framework-azure-contentunderstanding: add agent-framework-foundry as a runtime dependency. The package's README directs users to `pip install agent-framework-azure-contentunderstanding --pre` and the basic example imports `FoundryChatClient` from `agent_framework.foundry`, so the documented install path was failing with ImportError. Pulling agent-framework-foundry into deps makes the advertised entry path self-contained.
  2. agent-framework-foundry: bump agent-framework-openai lower bound from >=1.1.0 to >=1.2.2,<2. Foundry imports private modules from agent_framework_openai (`_chat_client.py:22`, `_agent.py:34`), so resolvers were free to pair foundry==1.2.2 with older OpenAI versions that lack this release's coordinated Responses/history fix. Lockstep the floor with the released cohort to prevent mismatched installs.

  Both changes pass `validate-dependency-bounds-test` lower + upper at their respective packages.
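The lockstep stamping described in the first commit (beta packages at 1.0.0b260429, alpha packages at 1.0.0a260429, both derived from the date 2026-04-29) can be sketched as below. This is only an illustration of the naming pattern, not the repository's release tooling: `prerelease_stamp` is a hypothetical helper, and the YYMMDD reading of the suffix is inferred from the commit message.

```python
from datetime import date


def prerelease_stamp(release_day: date, tier: str) -> str:
    """Build a lockstep pre-release version such as 1.0.0b260429.

    Hypothetical helper: `tier` is "a" (alpha) or "b" (beta), and the
    suffix is the release date formatted as YYMMDD.
    """
    if tier not in ("a", "b"):
        raise ValueError(f"unknown tier: {tier!r}")
    return f"1.0.0{tier}{release_day:%y%m%d}"


# The Pacific release date named in the commit message.
release_day = date(2026, 4, 29)
beta_version = prerelease_stamp(release_day, "b")
alpha_version = prerelease_stamp(release_day, "a")
print(beta_version, alpha_version)
```

Deriving both tiers from one date is what keeps the cohort in lockstep: every package in a tier gets the same version string for a given release day.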
118 lines
4.1 KiB
Python
# Copyright (c) Microsoft. All rights reserved.
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "agent-framework-azure-contentunderstanding",
#     "agent-framework-foundry",
#     "azure-identity",
#     "python-dotenv",
# ]
# ///
# Run with: uv run packages/azure-contentunderstanding/samples/01-get-started/01_document_qa.py

import asyncio
import os
from pathlib import Path

from agent_framework import Agent, Content, Message
from agent_framework.foundry import ContentUnderstandingContextProvider, FoundryChatClient
from azure.identity import AzureCliCredential
from dotenv import load_dotenv

# Load settings from a local .env file, if one exists.
load_dotenv()

"""
Document Q&A: PDF upload with CU-powered extraction

This sample demonstrates the simplest CU integration: upload a PDF and
ask questions about it. Azure Content Understanding extracts structured
markdown with table preservation, which is superior to LLM-only vision for
scanned PDFs, handwritten content, and complex layouts.

Environment variables:
    FOUNDRY_PROJECT_ENDPOINT: Azure AI Foundry project endpoint
    FOUNDRY_MODEL: Model deployment name (e.g. gpt-4.1)
    AZURE_CONTENTUNDERSTANDING_ENDPOINT: CU endpoint URL
"""

# Path to the sample PDF in the shared sample assets directory.
# The file must exist locally; this sample does not download it.
SAMPLE_PDF_PATH = Path(__file__).resolve().parents[1] / "shared" / "sample_assets" / "invoice.pdf"


async def main() -> None:
    credential = AzureCliCredential()

    # Set up Azure Content Understanding context provider
    cu = ContentUnderstandingContextProvider(
        endpoint=os.environ["AZURE_CONTENTUNDERSTANDING_ENDPOINT"],
        credential=credential,
        analyzer_id="prebuilt-documentSearch",  # RAG-optimized document analyzer
        max_wait=None,  # wait until CU analysis finishes (no background deferral)
    )

    # Set up the LLM client
    client = FoundryChatClient(
        project_endpoint=os.environ["FOUNDRY_PROJECT_ENDPOINT"],
        model=os.environ["FOUNDRY_MODEL"],
        credential=credential,
    )

    # Create agent with CU context provider.
    # The provider extracts document content via CU and injects it into the
    # LLM context so the agent can answer questions about the document.
    async with cu:
        agent = Agent(
            client=client,
            name="DocumentQA",
            instructions=(
                "You are a helpful document analyst. Use the analyzed document "
                "content and extracted fields to answer questions precisely."
            ),
            context_providers=[cu],
        )

        # --- Upload PDF and ask a question ---
        # The CU provider extracts markdown + fields from the PDF and injects
        # the full content into context so the agent can answer precisely.
        print("--- Upload PDF and ask questions ---")

        pdf_bytes = SAMPLE_PDF_PATH.read_bytes()

        response = await agent.run(
            Message(
                role="user",
                contents=[
                    Content.from_text(
                        "What is this document about? Who is the vendor, and what is the total amount due?"
                    ),
                    Content.from_data(
                        pdf_bytes,
                        "application/pdf",
                        # Always provide a filename; it is used as the document key
                        additional_properties={"filename": SAMPLE_PDF_PATH.name},
                    ),
                ],
            )
        )
        usage = response.usage_details or {}
        print(f"Agent: {response}")
        print(f" [Input tokens: {usage.get('input_token_count', 'N/A')}]\n")


if __name__ == "__main__":
    asyncio.run(main())

"""
Sample output:

--- Upload PDF and ask questions ---
Agent: This document is an **invoice** for services and fees billed to
**MICROSOFT CORPORATION** (Invoice **INV-100**), including line items
(e.g., Consulting Services, Document Fee, Printing Fee) and a billing summary.
- **Vendor:** **CONTOSO LTD.**
- **Total amount due:** **$610.00**
 [Input tokens: 988]
"""
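The sample reads its three settings with `os.environ[...]`, which stops at the first missing name with a bare `KeyError`. A minimal sketch of an up-front check that reports every missing variable at once might look like the following. `require_env` and `REQUIRED_VARS` are hypothetical names introduced for illustration; they are not part of Agent Framework, and the endpoint values below are placeholders.

```python
import os

# The three variables listed in the sample's docstring.
REQUIRED_VARS = (
    "FOUNDRY_PROJECT_ENDPOINT",
    "FOUNDRY_MODEL",
    "AZURE_CONTENTUNDERSTANDING_ENDPOINT",
)


def require_env(names, environ=None):
    """Return {name: value} for every name, or raise listing all missing ones.

    `environ` defaults to os.environ; passing a plain dict makes the check
    easy to exercise without touching the real environment.
    """
    env = os.environ if environ is None else environ
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in names}


# Demonstration with placeholder values instead of the real environment.
example_env = {
    "FOUNDRY_PROJECT_ENDPOINT": "https://foundry.example.invalid",
    "FOUNDRY_MODEL": "gpt-4.1",
    "AZURE_CONTENTUNDERSTANDING_ENDPOINT": "https://cu.example.invalid",
}
settings = require_env(REQUIRED_VARS, example_env)
print(settings["FOUNDRY_MODEL"])
```

In the sample itself, one would presumably call `require_env(REQUIRED_VARS)` after `load_dotenv()` and before constructing the credential, so a misconfigured shell fails before any Azure call is attempted.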