
Self-Reflection Evaluation Sample

This sample demonstrates the self-reflection pattern using Agent Framework and Azure AI Foundry's Groundedness Evaluator. For background on the pattern, see the paper "Reflexion: Language Agents with Verbal Reinforcement Learning" (NeurIPS 2023).

Overview

What it demonstrates:

  • Iterative self-reflection loop that automatically improves responses based on groundedness evaluation
  • Batch processing of prompts from JSONL files with progress tracking
  • Using AzureOpenAIResponsesClient with a Project Endpoint and Azure CLI authentication
  • Comprehensive summary statistics and detailed result tracking
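
As a rough illustration of the batch-processing and summary steps listed above, the sketch below reads prompts from JSONL, prints a progress line per prompt, and aggregates scores. It uses only the standard library. The helper names (`read_prompts`, `summarize`), the `query` field, and the statistics chosen are assumptions made for this example and are not taken from `self_reflection.py`.

```python
import io
import json
from statistics import mean
from typing import Iterator, Optional, TextIO


def read_prompts(stream: TextIO, limit: Optional[int] = None) -> Iterator[dict]:
    """Yield one dict per non-empty JSONL line, stopping after `limit` records."""
    count = 0
    for line in stream:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        if limit is not None and count >= limit:
            break
        yield json.loads(line)
        count += 1


def summarize(scores: list[int], perfect_score: int = 5) -> dict:
    """Aggregate final groundedness scores into a few summary numbers."""
    return {
        "count": len(scores),
        "mean_score": mean(scores) if scores else None,
        "perfect": sum(1 for s in scores if s == perfect_score),
    }


# Hypothetical in-memory input; a real run would open a JSONL file instead.
sample = io.StringIO('{"query": "q1"}\n\n{"query": "q2"}\n{"query": "q3"}\n')
prompts = list(read_prompts(sample, limit=2))
for index, record in enumerate(prompts, start=1):
    print(f"[{index}/{len(prompts)}] Processing prompt {index - 1}...")

# Made-up final scores, used only to exercise summarize().
stats = summarize([3, 5, 5])
```

Reading line by line keeps memory use flat for large prompt files; the sample itself installs pandas, which may be the more convenient choice for the statistics step.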

Prerequisites

Azure Resources

  • Azure OpenAI Responses in Foundry: Deploy models (default: gpt-5.2 for both agent and judge)
  • Azure CLI: Run az login to authenticate

Python Environment

pip install agent-framework-core pandas --pre

Environment Variables

AZURE_AI_PROJECT_ENDPOINT=https://<your-ai-resource>.services.ai.azure.com/api/projects/<your-ai-project>/
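
One way to fail fast when the variable is missing is to validate it before any client is created. The following is a minimal sketch; `get_project_endpoint` is a hypothetical helper, not part of the sample or of Agent Framework, and the example endpoint is a placeholder.

```python
import os
from typing import Mapping

ENV_VAR = "AZURE_AI_PROJECT_ENDPOINT"


def get_project_endpoint(env: Mapping[str, str] = os.environ) -> str:
    """Return the configured project endpoint or raise a descriptive error."""
    endpoint = env.get(ENV_VAR, "").strip()
    if not endpoint:
        raise RuntimeError(f"{ENV_VAR} is not set; export it before running the sample.")
    if not endpoint.startswith("https://"):
        raise RuntimeError(f"{ENV_VAR} should be an https:// URL, got: {endpoint!r}")
    return endpoint


# Placeholder values for illustration only; substitute your own resource and project.
example_env = {
    ENV_VAR: "https://example-resource.services.ai.azure.com/api/projects/example-project/"
}
endpoint = get_project_endpoint(example_env)
```

Passing the environment as a mapping keeps the check easy to test without touching the real process environment.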

Running the Sample

# Basic usage
python self_reflection.py

# With options
python self_reflection.py --input my_prompts.jsonl \
                          --output results.jsonl \
                          --max-reflections 5 \
                          -n 10

CLI Options:

  • --input, -i: Input JSONL file
  • --output, -o: Output JSONL file
  • --agent-model, -m: Agent model name (default: gpt-4.1)
  • --judge-model, -e: Evaluator model name (default: gpt-4.1)
  • --max-reflections: Max iterations (default: 3)
  • --limit, -n: Process only first N prompts
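
For orientation, the options above could be declared with `argparse` roughly as follows. This is a sketch reconstructed from the option list, not the parser in `self_reflection.py`; the help strings are paraphrased, and because the list does not state defaults for `--input`, `--output`, or `--limit`, they are left as `None` here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented CLI options (illustrative only)."""
    parser = argparse.ArgumentParser(description="Self-reflection evaluation sample (sketch)")
    parser.add_argument("--input", "-i", help="Input JSONL file")
    parser.add_argument("--output", "-o", help="Output JSONL file")
    parser.add_argument("--agent-model", "-m", default="gpt-4.1", help="Agent model name")
    parser.add_argument("--judge-model", "-e", default="gpt-4.1", help="Evaluator model name")
    parser.add_argument("--max-reflections", type=int, default=3, help="Max iterations")
    parser.add_argument("--limit", "-n", type=int, default=None, help="Process only first N prompts")
    return parser


# Parse the same arguments as the "With options" invocation shown earlier.
args = build_parser().parse_args(
    ["--input", "my_prompts.jsonl", "--output", "results.jsonl", "--max-reflections", "5", "-n", "10"]
)
```

`argparse` converts dashes in long option names to underscores, so the values are read as `args.agent_model`, `args.judge_model`, and `args.max_reflections`.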

Understanding Results

The agent iteratively improves responses:

  1. Generate initial response
  2. Evaluate groundedness (1-5 scale)
  3. If score < 5, provide feedback and retry
  4. Stop at max iterations or perfect score (5/5)
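
The four steps above can be sketched as a small loop. This is an illustration under stated assumptions rather than the sample's actual implementation: `generate` and `evaluate` are hypothetical callables standing in for the agent call and the groundedness evaluator, and the feedback wording is invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class ReflectionResult:
    best_response: str
    best_score: int
    best_iteration: int
    iterations_run: int


def reflect(
    generate: Callable[[str, Optional[str]], str],
    evaluate: Callable[[str, str], Tuple[int, str]],
    prompt: str,
    max_reflections: int = 3,
    perfect_score: int = 5,
) -> ReflectionResult:
    """Generate, score, and retry with feedback until a perfect score or the limit."""
    best_response, best_score, best_iteration = "", 0, 0
    feedback: Optional[str] = None
    iterations_run = 0
    for iteration in range(1, max_reflections + 1):
        iterations_run = iteration
        response = generate(prompt, feedback)            # step 1 (and each retry)
        score, reasoning = evaluate(prompt, response)    # step 2: score on a 1-5 scale
        if score > best_score:
            best_response, best_score, best_iteration = response, score, iteration
        if score >= perfect_score:                       # step 4: stop on a perfect score
            break
        # Step 3: turn the evaluator's reasoning into feedback for the next attempt.
        feedback = f"Groundedness score was {score}/{perfect_score}. {reasoning}"
    return ReflectionResult(best_response, best_score, best_iteration, iterations_run)


# Stub callables so the loop can run without any Azure resources.
def fake_generate(prompt: str, feedback: Optional[str]) -> str:
    return "draft" if feedback is None else "revised"


def fake_evaluate(prompt: str, response: str) -> Tuple[int, str]:
    return (5, "fully grounded") if response == "revised" else (3, "unsupported claim")


result = reflect(fake_generate, fake_evaluate, "What is X?")
```

Tracking the best-scoring response separately from the latest one means a later, worse attempt never replaces an earlier, better one, which is consistent with the "best at iteration" line in the sample's output.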

Example output:

[1/31] Processing prompt 0...
  Self-reflection iteration 1/3...
  Groundedness score: 3/5
  Self-reflection iteration 2/3...
  Groundedness score: 5/5
  ✓ Perfect groundedness score achieved!
  ✓ Completed with score: 5/5 (best at iteration 2/3)

In the Foundry UI, under Build/Evaluations, you can view detailed results for each prompt, including:

  • Context
  • Query
  • Response
  • Groundedness scores and reasoning for each iteration of each prompt