Python: [BREAKING] PR2 — Wire context provider pipeline, remove old types, update all consumers (#3850)

* PR2: Wire context provider pipeline and update all internal consumers
  - Replace AgentThread with AgentSession across all packages
  - Replace ContextProvider with BaseContextProvider across all packages
  - Replace context_provider param with context_providers (Sequence)
  - Replace thread= with session= in run() signatures
  - Replace get_new_thread() with create_session()
  - Add get_session(service_session_id) to agent interface
  - DurableAgentThread -> DurableAgentSession
  - Remove _notify_thread_of_new_messages from WorkflowAgent
  - Wire before_run/after_run context provider pipeline in RawAgent
  - Auto-inject InMemoryHistoryProvider when no providers configured
* fix: update all tests for context provider pipeline, fix lazy-loaders, remove old test files
* refactor: update all sample files for context provider pipeline (AgentThread→AgentSession, ContextProvider→BaseContextProvider)
* fix: update remaining ag-ui references (client docstring, getting_started sample)
* fix: make get_session service_session_id keyword-only to avoid confusion with session_id
* refactor: rename _RunContext.thread_messages to session_messages
* refactor: remove _threads.py, _memory.py, and old provider files; migrate devui to use plain message lists
* rename: remove _new_ prefix from test files
* refactor: rewrite SlidingWindowChatMessageStore as SlidingWindowHistoryProvider(InMemoryHistoryProvider)
* fix: read full history from session state directly instead of reaching into provider internals
* fix: update stale .pyi stubs, sample imports, and README references for new provider types
* fix: remove stale message_store, _notify_thread_of_new_messages, and session_id.key references in samples
* refactor: merge context_providers and sessions sample folders into sessions, remove aggregate_context_provider
* refactor: UserInfoMemory stores state in session.state instead of instance attributes
* feat: add Pydantic BaseModel support to session state serialization
  Pydantic models stored in session.state are now automatically serialized via model_dump() and restored via model_validate() during to_dict()/from_dict() round-trips. Models are auto-registered on first serialization; use register_state_type() for cold-start deserialization. Also export register_state_type as a public API.
* fix mem0
* Update sample README links and descriptions for session terminology
  - Replace 'thread' with 'session' in sample descriptions across all READMEs
  - Update file links for renamed samples (mem0_sessions, redis_sessions, etc.)
  - Fix Threads section → Sessions section in main samples/README.md
  - Update tools, middleware, workflows, durabletask, azure_functions READMEs
  - Update architecture diagrams in concepts/tools/README.md
  - Update migration guides (autogen, semantic-kernel)
* Fix broken Redis README link to renamed sample
* Fix Mem0 OSS client search: pass scoping params as direct kwargs
  AsyncMemory (OSS) expects user_id/agent_id/run_id as direct kwargs, while AsyncMemoryClient (Platform) expects them in a filters dict. Adds tests for both client types. Port of fix from #3844 to new Mem0ContextProvider.
* Fix rebase issues: restore missing _conversation_state.py and checkpoint decode logic
  - Add back _conversation_state.py (encode/decode_chat_messages) lost in rebase
  - Fix on_checkpoint_restore to decode cache/conversation with decode_chat_messages
  - Fix on_checkpoint_restore to use decode_checkpoint_value for pending requests
  - Add tests/workflow/__init__.py for relative import support
  - Fix test_agent_executor checkpoint selection (checkpoints[1] not superstep)
* Add STORES_BY_DEFAULT ClassVar to skip redundant InMemoryHistoryProvider injection
  Chat clients that store history server-side by default (OpenAI Responses API, Azure AI Agent) now declare STORES_BY_DEFAULT = True. The agent checks this during auto-injection and skips InMemoryHistoryProvider unless the user explicitly sets store=False.
* Fix broken markdown links in azure_ai and redis READMEs
* Fix getting-started samples to use session API instead of removed thread/ContextProvider API
* updates to workflow as agent
* fix group chat import
* Rename Thread→Session throughout, fix service_session_id propagation, remove stale AGUIThread
  - Fix: Propagate conversation_id from ChatResponse back to session.service_session_id in both streaming and non-streaming paths in _agents.py
  - Rename AgentThreadException → AgentSessionException
  - Remove stale AGUIThread from ag_ui lazy-loader
  - Rename use_service_thread → use_service_session in ag-ui package
  - Rename test functions from *_thread_* to *_session_*
  - Rename sample files from *_thread* to *_session*
  - Update docstrings and comments: thread → session
  - Update _mcp.py kwargs filter: add 'session' alongside 'thread'
  - Fix ContinuationToken docstring example: thread=thread → session=session
  - Fix _clients.py docstring: 'Agent threads' → 'Agent sessions'
* Fix broken markdown links after thread→session file renames
* fix azure ai test
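
The auto-injection rule described in the commit message (inject InMemoryHistoryProvider when no providers are configured, but skip it for clients that declare STORES_BY_DEFAULT = True unless store=False is set) can be sketched roughly as follows. This is an illustrative reading of the message, not the code in _agents.py: `should_inject_in_memory_history`, `LocalChatClient`, and `ServerStoredChatClient` are hypothetical names, and the behavior for an explicit store=True is an assumption.

```python
from typing import ClassVar, Optional, Sequence


class LocalChatClient:
    """Hypothetical client that keeps no history on the service side."""

    STORES_BY_DEFAULT: ClassVar[bool] = False


class ServerStoredChatClient:
    """Hypothetical client whose service stores history by default."""

    STORES_BY_DEFAULT: ClassVar[bool] = True


def should_inject_in_memory_history(
    client: object,
    context_providers: Sequence[object],
    store: Optional[bool] = None,
) -> bool:
    """Return True when a default in-memory history provider would be added.

    Hypothetical helper: it only mirrors the rule stated in the commit message.
    """
    if context_providers:
        # Providers were configured explicitly, so nothing is auto-injected.
        return False
    if getattr(client, "STORES_BY_DEFAULT", False) and store is not False:
        # The service keeps the history and the user did not opt out.
        return False
    # Assumption: every other case (including store=True on a local client) injects.
    return True


print(should_inject_in_memory_history(LocalChatClient(), []))                     # True
print(should_inject_in_memory_history(ServerStoredChatClient(), []))              # False
print(should_inject_in_memory_history(ServerStoredChatClient(), [], store=False)) # True
```

Reading the flag with `getattr` and a `False` default keeps the sketch tolerant of clients that never declare it; whether the real agent does the same is not stated in the message.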
2026-06-16 21:04:09 +08:00 · 2026-02-12 22:00:32 +01:00
parent 0c67dbbce5
commit 1e350ea22f
312 changed files with 6669 additions and 11423 deletions
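
For downstream sample code, the two call-site renames named in the commit message (get_new_thread() to create_session(), and thread= to session=) are mechanical enough to script. The sketch below is a naive regex pass, not a tool shipped with this PR: `migrate_source` and `RENAMES` are hypothetical names. It also rewrites the bare word "thread" in comments and strings, leaves plurals and class names such as AgentThread alone, and does not handle the `session_id.key` change, so its output should be reviewed by hand.

```python
import re

# (pattern, replacement) pairs derived from the renames listed in the commit message.
RENAMES = [
    # get_new_thread -> create_session (method name only, call parentheses untouched)
    (re.compile(r"\bget_new_thread\b"), "create_session"),
    # thread / <prefix>_thread -> session / <prefix>_session
    # (covers both the thread= keyword and variables such as writer_thread)
    (re.compile(r"\b((?:\w*_)?)thread\b"), r"\1session"),
]


def migrate_source(text: str) -> str:
    """Apply each rename in order and return the rewritten source text."""
    for pattern, replacement in RENAMES:
        text = pattern.sub(replacement, text)
    return text


before = (
    "writer_thread = writer.get_new_thread()\n"
    "result = writer.run(messages=prompt, thread=writer_thread)\n"
)
print(migrate_source(before))
# writer_session = writer.create_session()
# result = writer.run(messages=prompt, session=writer_session)
```

The method rename runs first so that the broader `_thread` rule never sees `get_new_thread` and turns it into `get_new_session`.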
@@ -7,7 +7,7 @@ This sample demonstrates how to use the Durable Extension for Agent Framework to
 - Defining a simple agent with the Microsoft Agent Framework and wiring it into
  an Azure Functions app via the Durable Extension for Agent Framework.
 - Calling the agent through generated HTTP endpoints (`/api/agents/Joker/run`).
- Managing conversation state with thread identifiers, so multiple clients can
+- Managing conversation state with session identifiers, so multiple clients can
  interact with the agent concurrently without sharing context.

 ## Prerequisites
@@ -6,7 +6,7 @@ This sample demonstrates how to use the Durable Extension for Agent Framework to

 - Using the Microsoft Agent Framework to define multiple AI agents with unique names and instructions.
 - Registering multiple agents with the Function app and running them using HTTP.
- Conversation management (via thread IDs) for isolated interactions per agent.
+- Conversation management (via session IDs) for isolated interactions per agent.
 - Two different methods for registering agents: list-based initialization and incremental addition.

 ## Prerequisites
@@ -20,7 +20,7 @@ from azure.identity import AzureCliCredential
 logger = logging.getLogger(__name__)


-# NOTE: approval_mode="never_require" is for sample brevity. Use "always_require" in production; see samples/02-agents/tools/function_tool_with_approval.py and samples/02-agents/tools/function_tool_with_approval_and_threads.py.
+# NOTE: approval_mode="never_require" is for sample brevity. Use "always_require" in production; see samples/02-agents/tools/function_tool_with_approval.py and samples/02-agents/tools/function_tool_with_approval_and_sessions.py.
 @tool(approval_mode="never_require")
 def get_weather(location: str) -> dict[str, Any]:
    """Get current weather for a location."""
@@ -4,8 +4,8 @@ This sample shows how to chain two invocations of the same agent inside a Durabl
 preserving the conversation state between runs.

 ## Key Concepts
- Deterministic orchestrations that make sequential agent calls on a shared thread
- Reusing an agent thread to carry conversation history across invocations
+- Deterministic orchestrations that make sequential agent calls on a shared session
+- Reusing an agent session to carry conversation history across invocations
 - HTTP endpoints for starting the orchestration and polling for status/output

 ## Prerequisites
@@ -5,7 +5,7 @@
 Components used in this sample:
 - AzureOpenAIChatClient to construct the writer agent hosted by Agent Framework.
 - AgentFunctionApp to surface HTTP and orchestration triggers via the Azure Functions extension.
- Durable Functions orchestration to run sequential agent invocations on the same conversation thread.
+- Durable Functions orchestration to run sequential agent invocations on the same conversation session.

 Prerequisites: configure `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_CHAT_DEPLOYMENT_NAME`, and either
 `AZURE_OPENAI_API_KEY` or authenticate with Azure CLI before starting the Functions host."""
@@ -45,17 +45,17 @@ def _create_writer_agent() -> Any:
 app = AgentFunctionApp(agents=[_create_writer_agent()], enable_health_check=True)


-# 4. Orchestration that runs the agent sequentially on a shared thread for chaining behaviour.
+# 4. Orchestration that runs the agent sequentially on a shared session for chaining behaviour.
 @app.orchestration_trigger(context_name="context")
 def single_agent_orchestration(context: DurableOrchestrationContext) -> Generator[Any, Any, str]:
-    """Run the writer agent twice on the same thread to mirror chaining behaviour."""
+    """Run the writer agent twice on the same session to mirror chaining behaviour."""

    writer = app.get_agent(context, WRITER_AGENT_NAME)
-    writer_thread = writer.get_new_thread()
+    writer_session = writer.create_session()

    initial = yield writer.run(
        messages="Write a concise inspirational sentence about learning.",
-        thread=writer_thread,
+        session=writer_session,
    )

    improved_prompt = (
@@ -65,7 +65,7 @@ def single_agent_orchestration(context: DurableOrchestrationContext) -> Generato

    refined = yield writer.run(
        messages=improved_prompt,
-        thread=writer_thread,
+        session=writer_session,
    )

    return refined.text
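
The orchestration above passes the same session object to both `writer.run(...)` calls so the second run can build on the first. A minimal stand-in sketch of that idea, without Durable Functions or the Agent Framework: `StubAgent` and `StubSession` are hypothetical classes whose only shared surface with the sample is the `create_session()` / `run(..., session=...)` call shape, and the message list on the session is an assumption made for illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StubSession:
    """Stand-in for a session: an identifier plus the accumulated messages."""

    session_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    messages: list[str] = field(default_factory=list)


class StubAgent:
    """Stand-in agent that reports how much prior context it could see."""

    def create_session(self) -> StubSession:
        return StubSession()

    def run(self, messages: str, *, session: StubSession) -> str:
        seen = len(session.messages)
        reply = f"saw {seen} earlier messages"
        # Record both the prompt and the reply so the next run can see them.
        session.messages.extend([messages, reply])
        return reply


writer = StubAgent()
shared = writer.create_session()

print(writer.run("Write a sentence about learning.", session=shared))  # saw 0 earlier messages
print(writer.run("Refine it.", session=shared))                        # saw 2 earlier messages
print(writer.run("Refine it.", session=writer.create_session()))       # saw 0 earlier messages
```

A fresh session for the second call would drop the first exchange, which is why the sample creates `writer_session` once and reuses it.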
@@ -64,12 +64,12 @@ def multi_agent_concurrent_orchestration(context: DurableOrchestrationContext) -
    physicist = app.get_agent(context, PHYSICIST_AGENT_NAME)
    chemist = app.get_agent(context, CHEMIST_AGENT_NAME)

-    physicist_thread = physicist.get_new_thread()
-    chemist_thread = chemist.get_new_thread()
+    physicist_session = physicist.create_session()
+    chemist_session = chemist.create_session()

    # Create tasks from agent.run() calls
-    physicist_task = physicist.run(messages=str(prompt), thread=physicist_thread)
-    chemist_task = chemist.run(messages=str(prompt), thread=chemist_thread)
+    physicist_task = physicist.run(messages=str(prompt), session=physicist_session)
+    chemist_task = chemist.run(messages=str(prompt), session=chemist_session)

    # Execute both tasks concurrently using task_all
    task_results = yield context.task_all([physicist_task, chemist_task])
@@ -89,7 +89,7 @@ def spam_detection_orchestration(context: DurableOrchestrationContext) -> Genera
    spam_agent = app.get_agent(context, SPAM_AGENT_NAME)
    email_agent = app.get_agent(context, EMAIL_AGENT_NAME)

-    spam_thread = spam_agent.get_new_thread()
+    spam_session = spam_agent.create_session()

    spam_prompt = (
        "Analyze this email for spam content and return a JSON response with 'is_spam' (boolean) "
@@ -100,7 +100,7 @@ def spam_detection_orchestration(context: DurableOrchestrationContext) -> Genera

    spam_result_raw = yield spam_agent.run(
        messages=spam_prompt,
-        thread=spam_thread,
+        session=spam_session,
        options={"response_format": SpamDetectionResult},
    )

@@ -113,7 +113,7 @@ def spam_detection_orchestration(context: DurableOrchestrationContext) -> Genera
        result = yield context.call_activity("handle_spam_email", spam_result.reason)  # type: ignore[misc]
        return result

-    email_thread = email_agent.get_new_thread()
+    email_session = email_agent.create_session()

    email_prompt = (
        "Draft a professional response to this email. Return a JSON response with a 'response' field "
@@ -124,7 +124,7 @@ def spam_detection_orchestration(context: DurableOrchestrationContext) -> Genera

    email_result_raw = yield email_agent.run(
        messages=email_prompt,
-        thread=email_thread,
+        session=email_session,
        options={"response_format": EmailResponse},
    )

@@ -93,13 +93,13 @@ def content_generation_hitl_orchestration(context: DurableOrchestrationContext)
        raise ValueError(f"Invalid content generation input: {exc}") from exc

    writer = app.get_agent(context, WRITER_AGENT_NAME)
-    writer_thread = writer.get_new_thread()
+    writer_session = writer.create_session()

    context.set_custom_status(f"Starting content generation for topic: {payload.topic}")

    initial_raw = yield writer.run(
        messages=f"Write a short article about '{payload.topic}'.",
-        thread=writer_thread,
+        session=writer_session,
        options={"response_format": GeneratedContent},
    )

@@ -150,7 +150,7 @@ def content_generation_hitl_orchestration(context: DurableOrchestrationContext)
            )
            rewritten_raw = yield writer.run(
                messages=rewrite_prompt,
-                thread=writer_thread,
+                session=writer_session,
                options={"response_format": GeneratedContent},
            )

@@ -6,7 +6,7 @@ This sample demonstrates how to create a worker-client setup that hosts a single

 - Using the Microsoft Agent Framework to define a simple AI agent with a name and instructions.
 - Registering durable agents with the worker and interacting with them via a client.
- Conversation management (via threads) for isolated interactions.
+- Conversation management (via sessions) for isolated interactions.
 - Worker-client architecture for distributed agent execution.

 ## Environment Setup
@@ -46,7 +46,7 @@ Using taskhub: default
 Using endpoint: http://localhost:8080

 Getting reference to Joker agent...
-Created conversation thread: a1b2c3d4-e5f6-7890-abcd-ef1234567890
+Created conversation session: a1b2c3d4-e5f6-7890-abcd-ef1234567890

 User: Tell me a short joke about cloud computing.

@@ -69,9 +69,9 @@ def run_client(agent_client: DurableAIAgentClient) -> None:
    logger.debug("Getting reference to Joker agent...")
    joker = agent_client.get_agent("Joker")

-    # Create a new thread for the conversation
-    thread = joker.get_new_thread()
-    logger.debug(f"Thread ID: {thread.session_id}")
+    # Create a new session for the conversation
+    session = joker.create_session()
+    logger.debug(f"Session ID: {session.session_id}")
    logger.info("Start chatting with the Joker agent! (Type 'exit' to quit)")

    # Interactive conversation loop
@@ -94,7 +94,7 @@ def run_client(agent_client: DurableAIAgentClient) -> None:

        # Send message to agent and get response
        try:
-            response = joker.run(user_message, thread=thread)
+            response = joker.run(user_message, session=session)
            logger.info(f"Joker: {response.text} \n")
        except Exception as e:
            logger.error(f"Error getting response: {e}")
@@ -6,7 +6,7 @@ This sample demonstrates how to host multiple AI agents with different tools in

 - Hosting multiple agents (WeatherAgent and MathAgent) in a single worker process.
 - Each agent with its own specialized tools and instructions.
- Interacting with different agents using separate conversation threads.
+- Interacting with different agents using separate conversation sessions.
 - Worker-client architecture for multi-agent systems.

 ## Environment Setup
@@ -49,7 +49,7 @@ Using endpoint: http://localhost:8080
 Testing WeatherAgent
 ================================================================================

-Created weather conversation thread: <guid>
+Created weather conversation session: <guid>
 User: What is the weather in Seattle?

 🔧 [TOOL CALLED] get_weather(location=Seattle)
@@ -61,7 +61,7 @@ WeatherAgent: The current weather in Seattle is sunny with a temperature of 72°
 Testing MathAgent
 ================================================================================

-Created math conversation thread: <guid>
+Created math conversation session: <guid>
 User: Calculate a 20% tip on a $50 bill

 🔧 [TOOL CALLED] calculate_tip(bill_amount=50.0, tip_percentage=20.0)
@@ -70,30 +70,30 @@ def run_client(agent_client: DurableAIAgentClient) -> None:

    # Get reference to WeatherAgent
    weather_agent = agent_client.get_agent("WeatherAgent")
-    weather_thread = weather_agent.get_new_thread()
+    weather_session = weather_agent.create_session()

-    logger.debug(f"Created weather conversation thread: {weather_thread.session_id}")
+    logger.debug(f"Created weather conversation session: {weather_session.session_id}")

    # Test WeatherAgent
    weather_message = "What is the weather in Seattle?"
    logger.info(f"User: {weather_message}")

-    weather_response = weather_agent.run(weather_message, thread=weather_thread)
+    weather_response = weather_agent.run(weather_message, session=weather_session)
    logger.info(f"WeatherAgent: {weather_response.text} \n")

    logger.debug("Testing MathAgent")

    # Get reference to MathAgent
    math_agent = agent_client.get_agent("MathAgent")
-    math_thread = math_agent.get_new_thread()
+    math_session = math_agent.create_session()

-    logger.debug(f"Created math conversation thread: {math_thread.session_id}")
+    logger.debug(f"Created math conversation session: {math_session.session_id}")

    # Test MathAgent
    math_message = "Calculate a 20% tip on a $50 bill"
    logger.info(f"User: {math_message}")

-    math_response = math_agent.run(math_message, thread=math_thread)
+    math_response = math_agent.run(math_message, session=math_session)
    logger.info(f"MathAgent: {math_response.text} \n")

    logger.debug("Both agents completed successfully!")
@@ -140,14 +140,14 @@ def run_client(agent_client: DurableAIAgentClient) -> None:
    logger.debug("Getting reference to TravelPlanner agent...")
    travel_planner = agent_client.get_agent("TravelPlanner")

-    # Create a new thread for the conversation
-    thread = travel_planner.get_new_thread()
-    if not thread.session_id:
-        logger.error("Failed to create a new thread with session ID!")
+    # Create a new session for the conversation
+    session = travel_planner.create_session()
+    if not session.session_id:
+        logger.error("Failed to create a new session with session ID!")
        return

-    key = thread.session_id.key
-    logger.info(f"Thread ID: {key}")
+    key = session.session_id
+    logger.info(f"Session ID: {key}")

    # Get user input
    print("\nEnter your travel planning request:")
@@ -164,7 +164,7 @@ def run_client(agent_client: DurableAIAgentClient) -> None:
    # Start the agent run with wait_for_response=False for non-blocking execution
    # This signals the agent to start processing without waiting for completion
    # The agent will execute in the background and write chunks to Redis
-    travel_planner.run(user_message, thread=thread, options={"wait_for_response": False})
+    travel_planner.run(user_message, session=session, options={"wait_for_response": False})

    # Stream the response from Redis
    # This demonstrates that the client can stream from Redis while
@@ -6,7 +6,7 @@ This sample demonstrates how to chain multiple invocations of the same agent usi

 - Using durable orchestrations to coordinate sequential agent invocations.
 - Chaining agent calls where the output of one run becomes input to the next.
- Maintaining conversation context across sequential runs using a shared thread.
+- Maintaining conversation context across sequential runs using a shared session.
 - Using `DurableAIAgentOrchestrationContext` to access agents within orchestrations.

 ## Environment Setup
@@ -42,7 +42,7 @@ The orchestration will execute the writer agent twice sequentially:

 ```
 [Orchestration] Starting single agent chaining...
-[Orchestration] Created thread: abc-123
+[Orchestration] Created session: abc-123
 [Orchestration] First agent run: Generating initial sentence...
 [Orchestration] Initial response: Every small step forward is progress toward mastery.
 [Orchestration] Second agent run: Refining the sentence...
@@ -62,7 +62,7 @@ You can view the state of the orchestration in the Durable Task Scheduler dashbo
 1. Open your browser and navigate to `http://localhost:8082`
 2. In the dashboard, you can view:
   - The sequential execution of both agent runs
-   - The conversation thread shared between runs
+   - The conversation session shared between runs
   - Input and output at each step
   - Overall orchestration state and history

@@ -87,17 +87,17 @@ def single_agent_chaining_orchestration(
    # Get the writer agent using the agent context
    writer = agent_context.get_agent(WRITER_AGENT_NAME)

-    # Create a new thread for the conversation - this will be shared across both runs
-    writer_thread = writer.get_new_thread()
+    # Create a new session for the conversation - this will be shared across both runs
+    writer_session = writer.create_session()

-    logger.debug(f"[Orchestration] Created thread: {writer_thread.session_id}")
+    logger.debug(f"[Orchestration] Created session: {writer_session.session_id}")

    prompt = "Write a concise inspirational sentence about learning."
    # First run: Generate an initial inspirational sentence
    logger.info("[Orchestration] First agent run: Generating initial sentence about: %s", prompt)
    initial_response = yield writer.run(
        messages=prompt,
-        thread=writer_thread,
+        session=writer_session,
    )
    logger.info(f"[Orchestration] Initial response: {initial_response.text}")

@@ -110,7 +110,7 @@ def single_agent_chaining_orchestration(
    logger.info("[Orchestration] Second agent run: Refining the sentence: %s", improved_prompt)
    refined_response = yield writer.run(
        messages=improved_prompt,
-        thread=writer_thread,
+        session=writer_session,
    )

    logger.info(f"[Orchestration] Refined response: {refined_response.text}")
@@ -7,7 +7,7 @@ This sample demonstrates how to host multiple agents and run them concurrently u
 - Running multiple specialized agents in parallel within an orchestration.
 - Using `OrchestrationAgentExecutor` to get `DurableAgentTask` objects for concurrent execution.
 - Aggregating results from multiple agents using `task.when_all()`.
- Creating separate conversation threads for independent agent contexts.
+- Creating separate conversation sessions for independent agent contexts.

 ## Environment Setup

@@ -64,7 +64,7 @@ You can view the state of the orchestration in the Durable Task Scheduler dashbo
 1. Open your browser and navigate to `http://localhost:8082`
 2. In the dashboard, you can view:
   - The concurrent execution of both agents (PhysicistAgent and ChemistAgent)
-   - Separate conversation threads for each agent
+   - Separate conversation sessions for each agent
   - Parallel task execution and completion timing
   - Aggregated results from both agents

@@ -80,15 +80,15 @@ def multi_agent_concurrent_orchestration(context: OrchestrationContext, prompt:
    physicist = agent_context.get_agent(PHYSICIST_AGENT_NAME)
    chemist = agent_context.get_agent(CHEMIST_AGENT_NAME)

-    # Create separate threads for each agent
-    physicist_thread = physicist.get_new_thread()
-    chemist_thread = chemist.get_new_thread()
+    # Create separate sessions for each agent
+    physicist_session = physicist.create_session()
+    chemist_session = chemist.create_session()

-    logger.debug(f"[Orchestration] Created threads - Physicist: {physicist_thread.session_id}, Chemist: {chemist_thread.session_id}")
+    logger.debug(f"[Orchestration] Created sessions - Physicist: {physicist_session.session_id}, Chemist: {chemist_session.session_id}")

    # Create tasks from agent.run() calls - these return DurableAgentTask instances
-    physicist_task = physicist.run(messages=str(prompt), thread=physicist_thread)
-    chemist_task = chemist.run(messages=str(prompt), thread=chemist_thread)
+    physicist_task = physicist.run(messages=str(prompt), session=physicist_session)
+    chemist_task = chemist.run(messages=str(prompt), session=chemist_session)

    logger.debug("[Orchestration] Created agent tasks, executing concurrently...")

@@ -82,6 +82,6 @@ You can view the state of the WriterAgent and orchestration in the Durable Task
 1. Open your browser and navigate to `http://localhost:8082`
 2. In the dashboard, you can view:
   - Orchestration instance status and pending events
-   - WriterAgent entity state and conversation threads
+   - WriterAgent entity state and conversation sessions
   - Activity execution logs
   - External event history
@@ -150,16 +150,16 @@ def content_generation_hitl_orchestration(

    # Get the writer agent
    writer = agent_context.get_agent(WRITER_AGENT_NAME)
-    writer_thread = writer.get_new_thread()
+    writer_session = writer.create_session()

-    logger.info(f"ThreadID: {writer_thread.session_id}")
+    logger.info(f"SessionID: {writer_session.session_id}")

    # Generate initial content
    logger.info("[Orchestration] Generating initial content...")

    initial_response: AgentResponse = yield writer.run(
        messages=f"Write a short article about '{payload.topic}'.",
-        thread=writer_thread,
+        session=writer_session,
            options={"response_format": GeneratedContent},
    )
    content = cast(GeneratedContent, initial_response.value)
@@ -251,11 +251,11 @@ def content_generation_hitl_orchestration(

            logger.debug("[Orchestration] Regenerating content with feedback...")

-            logger.warning(f"Regenerating with ThreadID: {writer_thread.session_id}")
+            logger.warning(f"Regenerating with SessionID: {writer_session.session_id}")

            rewrite_response: AgentResponse = yield writer.run(
                messages=rewrite_prompt,
-                thread=writer_thread,
+                session=writer_session,
                    options={"response_format": GeneratedContent},
            )
            rewritten_content = cast(GeneratedContent, rewrite_response.value)