mirror of https://github.com/microsoft/agent-framework.git synced 2026-06-16 21:04:09 +08:00

Eduard van Valkenburg 0521f5bed8 Python: [BREAKING] Simplify API: ChatAgent -> Agent, ChatMessage -> Message (#3747)

* [BREAKING] Rename ChatAgent -> Agent, ChatMessage -> Message, ChatClientProtocol -> SupportsChatGetResponse

Simplify the public API by removing redundant 'Chat' prefix from core types:
- ChatAgent -> Agent
- RawChatAgent -> RawAgent
- ChatMessage -> Message
- ChatClientProtocol -> SupportsChatGetResponse

Also renamed internal WorkflowMessage (was Message in _runner_context) to avoid collision.

No backward compatibility aliases - this is a clean breaking change.

* [BREAKING] Rename Agent chat_client parameter to client

* Fix rebase issues: WorkflowMessage references and broken markdown links

* Fix formatting and lint issues from code quality checks

* Fix import ordering in workflow sample files

* fixed rebase

* Fix test failures: use WorkflowMessage and A2AMessage after ChatMessage→Message rename

- Replace Message(data=..., source_id=...) with WorkflowMessage(...) in workflow tests
- Fix isinstance check in A2A agent to use A2AMessage instead of Message
- Fix import in test_workflow_observability.py (Message→WorkflowMessage)

* Fix lint, fmt, and sample errors after ChatMessage→Message rename

- Auto-fix 70+ ruff lint issues across samples (ChatMessage→Message refs)
- Fix HostedVectorStoreContent→Content.from_hosted_vector_store in file search sample
- Fix _normalize_messages→normalize_messages in custom agent sample
- Fix context.terminate→raise MiddlewareTermination in middleware samples
- Fix with_update_hook→with_transform_hook in override middleware sample
- Add TOptions_co import back to custom_chat_client sample
- Add noqa for FastAPI File() default in chatkit sample
- Fix B023 loop variable capture in weather agent sample

* fix: update Agent constructor calls from chat_client to client in declaration-only tool tests

* fix: add register_cleanup to devui lazy-loading proxy and type stub

* fixed tests and updated new pieces

* fix agui typevar

* fix merge errors

* fix merge conflicts

* fix merge

* Remove unused links

---------

Co-authored-by: Evan Mattson <evan.mattson@microsoft.com>

0521f5bed8 · 2026-02-10 23:04:32 +00:00

resources · Python: [BREAKING] Observability updates (#2782) · 2025-12-16 06:56:30 +00:00
.env.example · Python: Replace Eval SDK with AI Projects SDK in evaluation sample (#2540) · 2025-12-02 20:28:52 +00:00
README.md · Python: Replace Eval SDK with AI Projects SDK in evaluation sample (#2540) · 2025-12-02 20:28:52 +00:00
self_reflection.py · Python: [BREAKING] Simplify API: ChatAgent -> Agent, ChatMessage -> Message (#3747) · 2026-02-10 23:04:32 +00:00

README.md

Self-Reflection Evaluation Sample

This sample demonstrates the self-reflection pattern using Agent Framework and Azure AI Foundry's Groundedness Evaluator. For background on the pattern, see "Reflexion: Language Agents with Verbal Reinforcement Learning" (NeurIPS 2023).

Overview

What it demonstrates:

- Iterative self-reflection loop that automatically improves responses based on groundedness evaluation
- Batch processing of prompts from JSONL files with progress tracking
- Using AzureOpenAIChatClient with Azure CLI authentication
- Comprehensive summary statistics and detailed result tracking
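
The batch-processing item above can be pictured with a small sketch. It is not taken from self_reflection.py: the helper names `load_jsonl` and `progress_label` are hypothetical, and the progress format simply mirrors the `[1/31] Processing prompt 0...` line shown in the example output of this README.

```python
import json
from pathlib import Path
from typing import Any, Dict, List, Optional


def load_jsonl(path: Path, limit: Optional[int] = None) -> List[Dict[str, Any]]:
    """Read one JSON object per non-empty line, keeping at most `limit` records."""
    records: List[Dict[str, Any]] = []
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            if limit is not None and len(records) >= limit:
                break  # same idea as the --limit / -n option
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            records.append(json.loads(line))
    return records


def progress_label(position: int, total: int, prompt_index: int) -> str:
    """Format a progress line (hypothetical helper, format copied from the README)."""
    return f"[{position}/{total}] Processing prompt {prompt_index}..."
```

Reading line by line keeps memory use flat for large prompt files, and stopping at `limit` avoids parsing records that will not be processed.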

Prerequisites

Azure Resources

- Azure OpenAI: Deploy models (default: gpt-4.1 for both agent and judge)
- Azure CLI: Run az login to authenticate

Python Environment

pip install agent-framework-core azure-ai-projects pandas --pre

Environment Variables

# .env file
AZURE_AI_PROJECT_ENDPOINT=https://<your-ai-resource>.services.ai.azure.com/api/projects/<your-ai-project>/
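
A minimal sketch of checking this variable before any work starts. The function name `get_project_endpoint` is hypothetical, and how the sample itself loads the .env file is not shown in this README, so the sketch only reads the process environment (or a mapping passed in for testing).

```python
import os
from typing import Mapping, Optional

ENV_VAR = "AZURE_AI_PROJECT_ENDPOINT"


def get_project_endpoint(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the project endpoint, failing early with a readable message."""
    source = os.environ if env is None else env
    endpoint = source.get(ENV_VAR, "").strip()
    if not endpoint:
        raise RuntimeError(f"{ENV_VAR} is not set; add it to your .env file or shell.")
    return endpoint
```

Failing fast here gives a clearer error than a connection failure halfway through a batch run.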

Running the Sample

# Basic usage
python self_reflection.py

# With options
python self_reflection.py --input my_prompts.jsonl \
                          --output results.jsonl \
                          --max-reflections 5 \
                          -n 10

CLI Options:

- --input, -i: Input JSONL file
- --output, -o: Output JSONL file
- --agent-model, -m: Agent model name (default: gpt-4.1)
- --judge-model, -e: Evaluator model name (default: gpt-4.1)
- --max-reflections: Max iterations (default: 3)
- --limit, -n: Process only first N prompts
Understanding Results

The agent iteratively improves responses:

1. Generate initial response
2. Evaluate groundedness (1-5 scale)
3. If score < 5, provide feedback and retry
4. Stop at max iterations or perfect score (5/5)
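
The steps above can be sketched as a plain loop. This is an illustration under stated assumptions, not the sample's implementation: `generate` and `evaluate` are hypothetical callables standing in for the agent call and the groundedness evaluator, and `ReflectionResult` is an invented container. Tracking the best-scoring iteration follows the "best at iteration 2/3" wording in the example output.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

PERFECT_SCORE = 5  # top of the 1-5 groundedness scale


@dataclass
class ReflectionResult:
    best_response: str
    best_score: int
    best_iteration: int
    iterations_run: int


def self_reflect(
    prompt: str,
    generate: Callable[[str, Optional[str]], str],   # (prompt, feedback) -> response
    evaluate: Callable[[str, str], Tuple[int, str]],  # (prompt, response) -> (score, feedback)
    max_reflections: int = 3,
) -> ReflectionResult:
    """Generate, score, and retry with feedback until a perfect score or the cap."""
    best: Optional[ReflectionResult] = None
    feedback: Optional[str] = None
    for iteration in range(1, max_reflections + 1):
        response = generate(prompt, feedback)
        score, feedback = evaluate(prompt, response)
        if best is None or score > best.best_score:
            best = ReflectionResult(response, score, iteration, iteration)
        best.iterations_run = iteration
        if score >= PERFECT_SCORE:
            break  # nothing left to improve
    if best is None:
        raise ValueError("max_reflections must be at least 1")
    return best
```

Keeping the best response rather than the last one guards against a later iteration scoring lower than an earlier one.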

Example output:

[1/31] Processing prompt 0...
  Self-reflection iteration 1/3...
  Groundedness score: 3/5
  Self-reflection iteration 2/3...
  Groundedness score: 5/5
  ✓ Perfect groundedness score achieved!
  ✓ Completed with score: 5/5 (best at iteration 2/3)
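
The overview mentions summary statistics across the batch. A rough sketch of such an aggregation is below; the field names `best_score` and `best_iteration` and the function `summarize` are assumptions for illustration, and the actual output schema of self_reflection.py may differ.

```python
from statistics import mean
from typing import Dict, List


def summarize(results: List[Dict[str, int]]) -> Dict[str, float]:
    """Aggregate per-prompt results; field names are assumed, not the sample's schema."""
    if not results:
        return {"count": 0, "mean_score": 0.0, "perfect_rate": 0.0, "mean_best_iteration": 0.0}
    scores = [r["best_score"] for r in results]
    return {
        "count": len(results),
        "mean_score": mean(scores),
        # Share of prompts that reached the top score of 5.
        "perfect_rate": sum(s == 5 for s in scores) / len(scores),
        "mean_best_iteration": mean(r["best_iteration"] for r in results),
    }
```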
