agent-framework/python/packages
Giles Odigwe bb4fe48c9a Python: Enhance Azure AI Search Citations with Document URLs in Foundry V2 (#4028)
* Python: Enhance Azure AI Search citations with document URLs in Foundry V2 (Responses API)

Override _parse_response_from_openai and _parse_chunk_from_openai in
RawAzureAIClient to extract get_urls from azure_ai_search_call_output
items and enrich url_citation annotations with document-specific URLs.

- Non-streaming: first pass collects get_urls, post-processes annotations
- Streaming: captures search output state, enriches url_citation events
  (also handles url_citation annotation type not handled by base class)
- Updated V2 sample to demonstrate citation URL extraction
- Added 14 unit tests covering extraction, enrichment, and edge cases
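
The non-streaming two-pass flow above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `SearchOutput`, `OutputItem`, `collect_get_urls`, `enrich_annotations`, and the `document_url` key are hypothetical names. The commit message does not say how an annotation is matched to a URL, so the sketch assumes a `doc_<n>` title that indexes into `get_urls`.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SearchOutput:
    # Hypothetical stand-in for the output payload of a search call item.
    get_urls: list = field(default_factory=list)


@dataclass
class OutputItem:
    # Hypothetical stand-in for one item in the response output list.
    type: str
    output: Optional[SearchOutput] = None
    annotations: list = field(default_factory=list)


# Assumption: citation titles look like "doc_0", "doc_1", ...
_DOC_TITLE = re.compile(r"^doc_(\d+)$")


def collect_get_urls(items):
    """First pass: gather get_urls from every azure_ai_search_call_output item."""
    urls = []
    for item in items:
        if item.type == "azure_ai_search_call_output" and item.output is not None:
            urls.extend(item.output.get_urls)
    return urls


def enrich_annotations(items, urls):
    """Second pass: attach a document URL to each url_citation annotation."""
    for item in items:
        for ann in item.annotations:
            if ann.get("type") != "url_citation":
                continue
            match = _DOC_TITLE.match(str(ann.get("title") or ""))
            if match and int(match.group(1)) < len(urls):
                # "document_url" is an illustrative key, not the real field name.
                ann["document_url"] = urls[int(match.group(1))]


items = [
    OutputItem(
        "azure_ai_search_call_output",
        SearchOutput(["https://example.test/docs/0", "https://example.test/docs/1"]),
    ),
    OutputItem(
        "message",
        annotations=[
            {"type": "url_citation", "title": "doc_1", "url": "https://example.test/index"}
        ],
    ),
]
enrich_annotations(items, collect_get_urls(items))
```

Annotations whose title does not match, or whose index is out of range, are left untouched rather than raising.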

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: rework search citation enrichment to override _inner_get_response

- Remove all direct openai/pydantic imports from _client.py
- Override _inner_get_response instead of _parse_response_from_openai/_parse_chunk_from_openai
- Use closure-local state for streaming instead of instance-level _streaming_search_get_urls
- Add _build_url_citation_content helper for streaming url_citation handling
- Fix mypy errors by using str(value or '') for Annotation TypedDict fields
- Fix docstring to say 'citation' instead of 'url_citation'
- Update tests to match new approach
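
The move from instance-level state to closure-local state can be illustrated with a small sketch. Everything here is hypothetical (the `make_stream_handler` factory, the event dictionaries, the `kind`, `index`, and `document_url` keys), and it uses plain synchronous calls for brevity. The only point is that each request gets its own URL list, so concurrent streams on one client cannot see each other's search results.

```python
from typing import Callable


def make_stream_handler() -> Callable[[dict], dict]:
    """Return a per-request event handler that owns its own URL list."""
    # Closure-local state: created per call, never stored on the client instance.
    search_get_urls: list = []

    def handle(event: dict) -> dict:
        if event.get("kind") == "search_output":
            search_get_urls.extend(event.get("get_urls", []))
        elif event.get("kind") == "citation":
            index = event["index"]
            if index < len(search_get_urls):
                # Return an enriched copy; the incoming event is not mutated.
                return {**event, "document_url": search_get_urls[index]}
        return event

    return handle


first = make_stream_handler()
second = make_stream_handler()
first({"kind": "search_output", "get_urls": ["https://example.test/docs/0"]})
enriched = first({"kind": "citation", "index": 0})
isolated = second({"kind": "citation", "index": 0})
```

Here `enriched` carries a `document_url`, while `isolated` does not, because the second handler never saw the first handler's search output.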

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: handle streaming search citations from output_item.done events

The azure_ai_search_call_output item only has populated output data
(including get_urls) in the response.output_item.done event, not in
the response.output_item.added event. Also removed the search_get_urls
guard on url_citation handling so annotations are always produced even
if get_urls haven't been captured yet.
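
A rough sketch of the event handling described above, using `SimpleNamespace` objects as stand-ins for stream events. The annotation event type string, the `doc_<n>` title convention, and the `document_url` key are assumptions made for illustration; only the `response.output_item.added` and `response.output_item.done` names and the `azure_ai_search_call_output` item type come from this commit message.

```python
import re
from types import SimpleNamespace as NS


def citations_from_stream(events):
    """Collect citations, reading get_urls only from output_item.done events."""
    search_get_urls = []
    citations = []
    for event in events:
        if event.type == "response.output_item.done":
            item = event.item
            if item.type == "azure_ai_search_call_output":
                search_get_urls.extend(item.output.get_urls)
        elif event.type == "response.output_text.annotation.added":  # assumed name
            ann = event.annotation
            if ann["type"] != "url_citation":
                continue
            # Always produce the citation, even if no get_urls were captured yet.
            citation = {"title": ann["title"], "url": ann["url"]}
            match = re.fullmatch(r"doc_(\d+)", str(ann.get("title") or ""))
            if match and int(match.group(1)) < len(search_get_urls):
                citation["document_url"] = search_get_urls[int(match.group(1))]
            citations.append(citation)
        # output_item.added events are ignored: their output is not populated.
    return citations


annotation = {"type": "url_citation", "title": "doc_0", "url": "https://example.test/index"}
events = [
    NS(type="response.output_item.added",
       item=NS(type="azure_ai_search_call_output", output=[])),
    NS(type="response.output_text.annotation.added", annotation=annotation),
    NS(type="response.output_item.done",
       item=NS(type="azure_ai_search_call_output",
               output=NS(get_urls=["https://example.test/docs/0"]))),
    NS(type="response.output_text.annotation.added", annotation=annotation),
]
result = citations_from_stream(events)
```

The first citation arrives before the `done` event and is emitted without a document URL; the second arrives afterwards and is enriched.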

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* addressed comments

* refactor: address PR review - eliminate type: ignore[assignment] pattern

Call super()._inner_get_response() independently in each branch instead
of once at the top with union type reassignment. Non-streaming uses
two-arg super() in the closure; streaming uses cast() for type narrowing.
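
The branching pattern can be sketched as below. `BaseClient`, `SearchClient`, and the response shapes are hypothetical stand-ins rather than the framework's real classes. The sketch shows why the closure needs two-argument `super()`: a nested function with no parameters has no first argument for zero-argument `super()` to bind to.

```python
import asyncio
from typing import AsyncIterator, cast


class BaseClient:
    # Hypothetical base: returns a coroutine or an async iterator depending on stream.
    def _inner_get_response(self, *, stream: bool = False):
        return self._stream() if stream else self._once()

    async def _once(self):
        return {"text": "hi"}

    async def _stream(self):
        for chunk in ("h", "i"):
            yield chunk


class SearchClient(BaseClient):
    def _inner_get_response(self, *, stream: bool = False):
        if stream:
            # Streaming branch: call super() here and narrow the union with cast().
            base = cast(AsyncIterator[str], super()._inner_get_response(stream=True))

            async def _enriched():
                async for chunk in base:
                    yield chunk.upper()  # placeholder for citation enrichment

            return _enriched()

        async def _get():
            # Zero-arg super() would fail in this parameterless closure,
            # so the class and instance are passed explicitly.
            response = await super(SearchClient, self)._inner_get_response(stream=False)
            response["enriched"] = True  # placeholder for post-processing
            return response

        return _get()


async def collect(stream):
    """Drain an async iterator into a list."""
    return [chunk async for chunk in stream]
```

Calling `asyncio.run(SearchClient()._inner_get_response())` runs the non-streaming branch, and `asyncio.run(collect(SearchClient()._inner_get_response(stream=True)))` drains the streaming one.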

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: remove defensive patterns per PR review

- Replace all getattr() with direct attribute access
- Remove cast() for streaming branch, use type: ignore[assignment]
- Simplify _build_url_citation_content to use dict access directly
- Simplify _extract_azure_search_urls to use item.type/item.output
- Handle empty list output from streaming 'added' events
- Update tests to match actual runtime types (objects, not dicts)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* mypy fix

* small fixes

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bb4fe48c9a · 2026-02-24 01:21:33 +00:00