Design doc draft (#5)

* wip * wip * wip * wip * wip * wip * Update docs/design/main.md Co-authored-by: Evan Mattson <35585003+moonbox3@users.noreply.github.com> * Update docs/design/main.md Co-authored-by: Evan Mattson <35585003+moonbox3@users.noreply.github.com> * wip * wip * wip * wip * wip * wip * wip * wip * wip * update * update * update * wip * wip * wip * wip * address comment * update * add custom agent example * address comment * update code teaser * Update docs/design/main.md Co-authored-by: Evan Mattson <35585003+moonbox3@users.noreply.github.com> * update * address comments * update guardrails * address some of mark's comments * add new separate sections for agents and workflows * update agent doc * Update agent.md Co-authored-by: Jack Gerrits <jackgerrits@users.noreply.github.com> * add foundry agent doc * wip * refine the component registration interface with agent runtime * update * workflows * update * update * Update * Update * update * Update design doc to remove runtime * Update * Update * Update * update * Add eval section notes (#9) * add notes on eval * remove duplicate title * update docs * update docs * save updates before merge * update evaluation script * Update agents.md * update workflows * Update Co-authored-by: Jack Gerrits <jackgerrits@users.noreply.github.com> * update workflow * Updated design doc * Update * Update * update * update * Update * update * update * Update * update * Update with agent abstraction alternatives * Update discussion * Update * update * Update * Update * Update * Update --------- Co-authored-by: Evan Mattson <35585003+moonbox3@users.noreply.github.com> Co-authored-by: Jack Gerrits <jackgerrits@users.noreply.github.com> Co-authored-by: Victor Dibia <chuvidi2003@gmail.com>
2026-06-16 21:04:09 +08:00 · 2025-05-29 12:36:54 -07:00
parent d0531bb93b
commit 6089446f04
14 changed files with 1528 additions and 0 deletions
@@ -0,0 +1,103 @@
+# Agent Framework Design Doc
+
+What values does the framework provide?
+
+- A set of configurable, extensible and high-quality components (e.g., model clients, tools, MCP servers and memory).
+- An easy path for deploying, securing and scaling applications, both locally and in the cloud.
+- Integration with tools for monitoring, debugging, evaluation and optimization, both locally and in the cloud.
+- A community of developers and users for support, ideas, and contributions, benefiting everyone in the ecosystem.
+
+What is this document?
+
+- An overview of the new framework.
+- Defining the major elements of the framework and their relationships.
+- Detailed design of each element and its implementation will be in a separate document.
+
+## Core Data Types
+
+To unify the interaction between components, we define a set of core
+data types that are used throughout the framework.
+
+See [Core Data Types](types.md) for more details.
+
+## Components
+
+A component is a class that provides a specific functionality and can be used
+independently by applications.
+There are two types of components in the framework: agent components and agents. Agent components are the building blocks of agents, while agents are
+the higher-level components, and can be composed from agent components
+and other agents (as in workflows).
+
+The framework defines the following components. Follow the links to
+find the design details of each component:
+
+- Agent Components:
+  - [Model Client](models.md)
+  - [Vector Store and Embedding Client](vector-stores.md)
+  - [Tool](tools.md)
+  - [MCP Server](mcp-servers.md)
+  - [Memory](memory.md)
+  - [Thread](threads.md)
+  - [Guardrail](guardrails.md)
+- Agent and Workflow:
+  - [Agent](agents.md)
+  - [Workflow](workflows.md)
+
+### Composition
+
+Components can be composed to create complex components. For example,
+an agent can be composed from model clients, tools and memory,
+and a tool can be composed from an agent or a workflow.
+It is the responsibility of the framework to validate components
+and their composition.
+
+### Configuration
+
+A component can be created from a set of serializable configuration parameters,
+with the help of dependency injection to resolve non-serializable dependencies.
+
+### Relationships
+
+The following diagram shows the component relationship of the framework:
+
+```mermaid
+graph TD
+    Component[Component] --> |extends| Agent[Agent]
+    Agent --> |extends| Workflow[Workflow]
+    
+    Component --> |extends| ModelClient[Model Client]
+    Component --> |extends| VectorStore[Vector Store]
+    Component --> |extends| EmbeddingClient[Embedding Client]
+    Component --> |extends| Tool[Tool]
+    Component --> |extends| MCPServer[MCP Server]
+    Component --> |extends| Memory[Memory]
+    Component --> |extends| Thread[Thread]
+    Component --> |extends| Guardrail[Guardrail]
+    
+    Agent --> |uses| uses1[Model Client]
+    Agent --> |uses| uses2[Thread]
+    Agent --> |uses| uses3[Tools and MCP Servers]
+    Agent --> |uses| uses4[Memory]
+    Agent --> |uses| uses5[Guardrail]
+    
+    Workflow --> |contains| contains[Child Agents]
+
+    Memory --> |uses| uses5[Vector Store]
+    VectorStore --> |uses| uses6[Embedding Client]
+```
+
+## Deployment and Scaling
+
+[Deployment](deployment.md).
+
+## Observability and Monitoring
+
+[Observability](observability.md).
+
+## Evaluation
+
+[Evaluation](evaluation.md).
+
+## Optimization
+
+[Optimization](optimization.md).
@@ -0,0 +1,476 @@
+# Agents
+
+An agent is a component that processes messages in a thread and returns a result.
+
+During its handling of messages, an agent may:
+
+- Use model client to process messages,
+- Use thread to keep track of the interaction with the model,
+- Invoke tools or MCP servers, and
+- Retrieve and store data through memory.
+
+It is up to the implementation of the agent class to decide how these components are used.
+
+__An important design goal of the framework is to ensure the developer experience
+of creating custom agent is as easy as possible.__ Existing frameworks
+have made "kitchen-sink" agents that are hard to understand and maintain.
+
+An agent might not use the components provided by the framework to implement
+the agent interface.
+Azure AI Agent is an example of such agent: its implementation is
+backed by the Azure AI Agent Service.
+
+The framework provides a set of pre-built agents:
+
+- `ChatCompletionAgent`: an agent that uses a chat-completion model to process messages
+and use thread, memory, tools and MCP servers in a configurable way. __If we can make
+custom agents easy to implement, we can remove this agent.__
+- `AzureAIAgent`: an agent that is backed by Azure AI Agent Service.
+- `ResponsesAgent`: an agent that is backed by OpenAI's Responses API.
+- `A2AAgent`: an agent that is backed by the [A2A Protocol](https://google.github.io/A2A/documentation/).
+
+## `Agent` protocol
+
+```python
+class Agent(Protocol):
+    """The protocol for all agents in the framework."""
+
+    async def run(
+        self, 
+        thread: Thread,
+        context: Context,
+    ) -> Result:
+        """The method to run the agent on a thread of messages, and return the result.
+
+        Args:
+            thread: The thread of messages to process: it may be a local thread
+                or a stub thread that is backed by a remote service.
+            context: The context for the current invocation of the agent, providing
+                access to the event channel, and human-in-the-loop (HITL) features.
+        
+        Returns:
+            The result of running the agent, which includes the final response.
+        """
+        ...
+
+
+@dataclass
+class Context:
+    """The context for the current invocation of the agent."""
+    event_handler: EventHandler
+    """The event consumer for handling events emitted by the agent."""
+    user_input_source: UserInputSource
+    """The user input source for requesting for user input during the agent run."""
+    ... # Other fields, could be extended to include more for application-specific needs.
+
+
+@dataclass
+class Result:
+    """The result of running an agent."""
+    final_response: Message
+    ... # Other fields, could be extended to include more for application-specific needs.
+```
+
+## `ToolCallingAgent` example
+
+Here is an example of a custom agent that calls a tool and returns the result.
+The `ToolCallingAgent` implements the `Agent` base class and
+it implements the `run` method to process incoming messages and call tools if needed.
+
+```python
+class ToolCallingAgent(Agent):
+    def __init__(
+        self, 
+        model_client: ModelClient,
+        tools: list[Tool],
+    ) -> None:
+        self.model_client = model_client
+        self.tools = tools
+
+    async def run(self, thread: Thread, context: Context) -> Result:
+        # Create a response using the model client, passing the thread and context.
+        create_result = await self.model_client.create(thread, context, tools=self.tools)
+        # Emit the event to notify the workflow consumer of a model response.
+        await context.emit(ModelResponseEvent(create_result))
+        if create_result.is_tool_call():
+            # Get user approval for the tool call through the context.
+            approval = await context.get_user_approval(create_result.tool_calls)
+            if not approval:
+                # ... return a canned response.
+            # Call the tools with the tool calls in the response.
+            tools = ... # Find the tool by name in the tools list.
+            tool_results = ... # Call the tool with the tool call arguments.
+            # Emit the event to notify the workflow consumer of a tool call.
+            await context.emit(ToolCallEvent(tool_result))
+            # Update the thread with the tool result.
+            await thread.append(tool_result.to_messages())
+            # Return the tool result as the response.
+            return Result(
+                final_response=tool_result,
+            )
+        else: 
+            # Return the response as the result.
+            return Result(
+                final_response=create_result,
+            )
+```
+
+Things to note in the implementation of the `run` method:
+- Orchestration of tools and model is completly customizable.
+- Components such as `thread` and `model_client` interacts smoothly with little boilerplate code.
+- The `context` parameter provides convenient access to the workflow run fixtures such as event channel.
+
+An agent doesn't need to use components provided by the framework to implement the agent interface.
+
+For example, in a multi-agent workflow, we may need a verification agent in a using deterministic
+logic to critic another agent's response.
+
+```python
+class CriticAgent(Agent):
+    def __init__(self) -> None:
+        self.verification_logic = ... # Some verification logic, e.g. a set of rules.
+
+    async def run(self, thread: Thread, context: Context) -> Result:
+        # Use the verification logic to verify the last message in the thread.
+        response = thread.get_last_message()
+        is_verified = self.verification_logic.verify(response)
+        if is_verified:
+            final_response = Message("The response is verified.")
+        else:
+            final_response = Message("The response is not verified.")
+        
+        return Result(
+            final_response=final_response,
+        )
+```
+
+## Run
+
+A _run_ is a single invocation of the agent or a workflow given a thread of messages.
+
+## Run agent
+
+Developer can instantiate a subclass of `Agent` directly using it's constructor, 
+and run it by calling the `run` method.
+
+```python
+@FunctionTool
+def my_tool(input: str) -> str:
+    return f"Tool result for {input}"
+
+model_client = OpenAIChatCompletionClient("gpt-4.1")
+agent = ToolCallingAgent(
+    model_client=model_client, 
+    tools=[my_tool],
+)
+
+# Create a thread for the current task.
+thread = [
+    Message("Hello"),
+    Message("Can you find the file 'foo.txt' for me?"),
+]
+
+# Create a context that uses a handler that prints emitted events to the console, 
+# and a user input source that reads from the console.
+context = Context(event_handler=ConsoleEventHandler(), user_input_source=ConsoleUserInputSource())
+
+# Run the agent with the thread and context.
+result = await agent.run(thread, context)
+```
+
+## User session
+
+A user session is a logical concept which involves a sequence of messages exchanged between the user and the agent.
+Consider the following examples:
+
+- A chat session in ChatGPT.
+- A delegation of task to a workflow agent from a user, with data exchanged between the user
+    and the workflow such as occassional feedbacks from the user and status updates from the workflow.
+
+A user session may involve multiple runs.
+
+
+## User session state
+
+Rather than classifying agents as stateless or stateful, we focus on how state is managed during a user session.
+
+There are several states that an application may maintain during a user session:
+- **Conversation or workflow state**. This is the conversation history or execution 
+    history in a workflow. This state is typically owned and managed by the thread object.
+- **Long-term memory**. This can be information relevant to the user, 
+    such as user preferences, past interactions, or other relevant data.
+    This can also be information relevant to the task, such as past trajectories,
+    past results, or other task-related data. These states are typically
+    owned and managed by a memory object.
+
+The thread is always passed through the agent's `run` method.
+Whereas the memory is can be set through the constructor of the agent.
+
+See the [Memory](memory.md) design document for more details on how memory
+is used in the framework.
+
+It is up to the application to decide whether to reuse state across different
+user sessions. The framework should provide the necessary methods and storage layer integration
+for persisting and retrieving state, but the application should decide how to use them.
+
+## Run agent concurrently
+
+If the agent just call models and tools that are stateless, 
+we can run the same instance of the agent concurrently.
+
+```python
+# Create threads for concurrent tasks.
+thread1 = [
+    Message("Hello"),
+    Message("Can you find the file 'foo.txt' for me?"),
+]
+thread2 = [
+    Message("Hello"),
+    Message("Can you find the file 'bar.txt' for me?"),
+]
+
+# Run the agent concurrently on multiple threads.
+results = await asyncio.gather(
+    agent.run(thread1, context),
+    agent.run(thread2, context),
+)
+# The `context`'s event handlers will emit events from both runs.
+```
+
+This is not always the right way to run concurrent agents, as some tools
+or memory associated with the agent may not be concurrent-safe.
+
+It is up the application to decide if an agent can run concurrently,
+or multiple instances should be created for each thread.
+
+
+## Using Foundry Agent Service
+
+The framework offers a built-in agent class for users of the Foundry Agent Service.
+The agent class essentially acts as a proxy to the agent hosted by the Foundry Agent Service.
+
+```python
+agent = FoundryAgent(
+    name="my_foundry_agent",
+    project_client="ProjectClient",
+    agent_id="my_agent_id", # If not provided, a new agent will be created.
+    deployment_name="my_deployment",
+    instruction="my_instruction",
+    ... # Other parameters for the agent creation.
+)
+
+# Create a thread that is backed by the Foundry Agent Service.
+thread = FoundryThread(thread_id="my_thread_id")
+
+# Run the agent on the thread and an new context that emits events to the console.
+result = await agent.run(thread, RunContext(event_channel="console"))
+```
+
+## Alternative agent abstractions
+
+There are two alternatives:
+
+1. **Agent with private conversation state**: The agent manages its own conversation state,
+    either by using a thread or other custom logics. The conversation state is 
+    not shared with other agents or workflows. It is up to the agent to decide how
+    to manage the conversation state.
+2. **Agent without conversation state**: The conversation state is externalized
+    and managed by a thread abstraction. The agent is invoked with a thread on
+    every run. While it can still use the thread to append messages etc., it loses
+    control over the conversation state the moment the run method returns.
+
+### Protocol comparison
+
+For agent with private conversation state, agent is invoked with new messages
+and the agent is responsible for managing the conversation state while exposing
+public methods for the orchestration code to manipulate its conversation state
+indirectly.
+
+```python
+class Agent(Protocol):
+
+    async def run(
+        self, 
+        messages: list[Message],
+        context: Context,
+    ) -> Result:
+        """The method to run the agent and return the result.
+
+        Args:
+            messages: The list of new messages to process.
+            context: The context for the current invocation of the agent, providing
+                access to the event channel, and human-in-the-loop (HITL) features.
+
+        Returns:
+            The result of running the agent, which includes the final response.
+        """
+        ...
+    
+    async def reset() -> None:
+        """Reset the conversation state of the agent."""
+        ...
+    
+    # And other methods for managing the conversation state.
+```
+
+For agent without conversation state, the agent is invoked with a thread
+and the agent is responsible for processing the messages in the thread.
+
+```python
+class Agent(Protocol):
+
+    async def run(
+        self, 
+        thread: Thread,
+        context: Context,
+    ) -> Result:
+        """The method to run the agent on a thread of messages, and return the result.
+
+        Args:
+            thread: The current conversation state.
+            context: The context for the current invocation of the agent, providing
+                access to the event channel, and human-in-the-loop (HITL) features.
+        Returns:
+            The result of running the agent, which includes the final response.
+        """
+        ...
+```
+
+### Constructor comparison
+
+For agent with private conversation state, the agent is initialized with
+the a state in addition to components like model client and tools, which could be a thread passed to the constructor,
+or a custom state object that the agent uses to manage its conversation state.
+
+```python
+class CustomAgent(Agent):
+    def __init__(self, 
+        model_client: ModelClient,
+        tools: list[Tool],
+        state: CustomState, # Could be a thread or a custom state object, or nothing at all.
+    ) -> None:
+        self.model_client = model_client
+        self.tools = tools
+        self.state = state # Could be created by the agent within the constructor.
+
+```
+
+For agent without conversation state, the agent is initialized with
+the components it needs to process messages, such as a model client and tools.
+
+```python
+class CustomAgent(Agent):
+    def __init__(
+        self, 
+        model_client: ModelClient,
+        tools: list[Tool],
+    ) -> None:
+        self.model_client = model_client
+        self.tools = tools
+```
+
+### Thread-Agent compatibility considerations
+
+For agent with private conversation state, compatibility with thread is not a concern,
+as this is completely managed by the agent itself.
+
+For agent without conversation state, the thread must be compatible with the agent's
+`run` method. For example, a `FoundryAgent` must work with a `FoundryThread`
+because the thread is backed by the Foundry Agent Service, and the implementation
+requires the thread to be compatible with the service's API.
+
+Compatibility constraints:
+- `FoundryAgent` must work with `FoundryThread`.
+- `OpenAIAssistantAgent` must work with `OpenAIAssistantThread`.
+- `ResponsesAgent` must work with `ResponsesThread`, when using the stateful mode of the Responses API.
+
+### Workflow-Agent compatibility considerations
+
+For agent with private conversation state, the orchestration code cannot directly
+modifies the conversation state of every agent in the workflow.
+This means that for resetting the conversation state, branching a conversation,
+or other orchestration logic, the agent must provides public
+methods for the orchestration code to manipulate its conversation state.
+
+Potential methods (just initial ideas):
+- `reset()` to reset the conversation state.
+- `branch()` to create a new branch of the conversation state from an existing state.
+
+Example: AutoGen's MagenticOne orchestration requires the agents to be able to
+reset their conversation states during re-planning. It is reasonable to expect
+other types of orchestration logic will require behavior like branching
+or backtracking.
+
+For agent without conversation state, the orchestration code can directly
+manipulate the thread that is passed to the agent's `run` method. So the orchestration code
+can clone, fork, or reset the thread as needed.
+This also means that the agent's converstion state must be abstracted as a thread.
+
+### Extensibility considerations
+
+For agent with private conversation state, the management of the conversation state
+is completely up to the agent implementation. This means that custom agents can
+be created with different conversation state management strategies, such as:
+- Using a custom thread implementation that provides additional features.
+- Using a custom state object that provides additional features.
+When using a custom state object, the developer must also implement
+methods for exporting and importing the state.
+
+For agent without conversation state, the thread abstraction is required to
+encapsulate the conversation state and ensure that the agent's `run` method
+can use it without any issues. This puts a constraint on the agent implementation,
+and also what can be represented as state in the thread.
+Though, if the thread abstraction is designed well, it relieves the developer
+from implementing the conversation state management logic themselves.
+The developer only needs to come up with custom thread when the built-in thread
+abstraction does not work with their custom agent.
+
+### Discussion
+
+- Either agent or thread must manage the conversation state.
+- The class that manages the conversation state must provide a way to manipulate
+    it for orchestration purposes.
+- Isolate thread as a separate required abstraction may introduce compatibility
+    issues.
+- A thread abstraction with methods for manipulating the conversation state
+    should always be provided by the framework, whether it is exposed again
+    through the agent or not.
+
+In a scenario with built-in agents and built-in threads, the developer experience
+is nearly identical except for agent without conversation state the developer
+must ensure the thread is compatible with the agent's `run` method.
+
+In a scenario with custom agents and built-in threads, the developer experience
+is simpler for agent without conversation state, as the thread abstraction
+is already provided by the framework and the agent can use it directly. Plus,
+the developer doesn't need to implement the conversation state management logic
+through the agent's other methods, which will mostly likely be boilerplate code.
+
+In a scenario with built-in agents and custom threads, the developer experience
+is nearly identical, as in either case the developer must ensure
+the agent's `run` method is compatible with the thread or general state object.
+
+In a scenario with custom agents and custom threads, the developer experience
+is nearly identical, as in either case the developer must ensure
+the agent's `run` method is compatible with the thread or general state object,
+and that the state management logic is implemented in the agent or the thread.
+
+| Scenario | Agent with Conversation State | Agent without Conversation State |
+|----------|------------------------------------------|---------------------------------------------|
+| Built-in Agents, Built-in Threads | Simpler -- it should just work as there is no compatibility issue at runtime | Developer must ensure thread compatibility with agent's `run` method at runtime |
+| Custom Agents, Built-in Threads | Developer must implement state management methods on the agent. | Simpler, as thread abstraction is provided by the framework and agent can use it directly |
+| Built-in Agents, Custom Threads | Developer must ensure compatibility of the custom thread or state with agent's `run` method | Developer must ensure compatibility of the custom thread with agent's `run` method |
+| Custom Agents, Custom Threads | Developer is fully responsible for implementing state management. | Developer is fully responsible for implementing state management. |
+
+Overall, the agent without conversation state abstraction
+provides a simpler and more consistent developer experience, as it relies on
+the thread abstraction provided by the framework. The downside is that 
+developer must ensure the thread used is compatible with the agent's `run` method
+-- this can be mitigated by enforcing strong types and validation, as well as
+built-in factory methods for creating new threads given the agent type.
+
+Another factor to consider is that Semantic Kernel already has agent abstraction
+that passes a thread per invocation, so it is easier for us to migrate to the
+new interface. 
+
+> **We should continue to question this decision as we implement more agents and workflows, and revisit the design.**
@@ -0,0 +1,129 @@
+## Evaluation
+
+The goal of Evaluation is to enable developers measure both the quality of agent responses and the efficiency of their decision-making processes.
+
+### Core Evaluation Concepts
+
+To enable effective evaluation (mindful of the fact that agents may be implemented with different approaches or even frameworks), it is useful to focus on the following core concepts:
+
+- **Standardized Trajectory Format**: A unified representation of agent interactions (messages, tool calls, events) enabling consistent evaluation across different agent implementations.
+- **Trajectory and Outcome Evaluation**: Analyze both the path an agent takes and the final response it generates. This includes evaluating the sequence of tool calls, the order of operations, and the final output.
+
+### Evaluation Components
+
+The framework provides these key evaluation components:
+
+- **Trajectory Converter**: Transforms agent runs from various frameworks into a standardized format for evaluation.
+- **Metrics Library**:
+  - Computation-based metrics: Direct algorithms that calculate objective measures without requiring a model
+  - Model-based metrics: Evaluation criteria that require an AI model to assess subjective qualities
+- **Judge**: For model-based metrics, a judge is the LLM responsible for applying evaluation criteria. Different judge models can be selected based on evaluation needs.
+- **Evaluator**: Coordinates the evaluation process by running computation-based metrics directly and applying judges to model-based metrics.
+- **Integration**: Connect with cloud evaluation services including Azure AI Evaluation.
+
+### (Example) Metrics
+
+Metrics may be pointwise (evaluating a single response on some criteria) or pairwise (evaluating two responses against each other e.g., where some ground truth is available).
+
+#### Computation-based Metrics
+
+- **Tool Match**: Measures tool call sequence matching in various ways:
+  - Exact Match: Perfect match with reference sequence
+  - In-Order Match: Required tools called in correct order (extra steps allowed)
+  - Any-Order Match: All required tools called regardless of order
+- **Precision**: Proportion of agent's tool calls that match reference tool calls.
+- **Recall**: Proportion of reference tool calls included in the agent's tool calls.
+- **Single Tool Usage**: Checks if a specific tool was used during the trajectory.
+- **Tool Call Errors**: Measures rate of tool call failures or errors.
+- **Latency**: Time required for agent to complete its task.
+
+#### Model-based Metrics
+
+- **Task Adherence**: Evaluates how well the agent's response addresses the assigned task.
+- **Coherence**: Assesses logical flow and internal consistency of the response.
+- **Safety**: Detects potential harmful content in responses.
+- **Follows Trajectory**: Evaluates if the response logically follows from the tools used.
+- **Efficiency**: Measures if the agent took an optimal path to reach the solution.
+
+This can build on the suite of metrics provided by [Azure AI evaluation](https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/develop/agent-evaluate-sdk).
+
+### Sample Developer Experience
+
+**Sample Developer Experience:**
+
+1. **Run Agent**: Execute your agent on tasks to generate trajectories.
+2. **Create Trajectory**: Structure task, run data, and optional reference.
+3. **Configure Metrics**: Select pre-built or custom metrics for evaluation.
+4. **Evaluate**: Run evaluator to get scores and detailed results.
+5. **Analyze**: Review metrics to identify improvements.
+
+```python
+from azure.ai.evaluation import AzureOpenAIModelConfiguration
+from agent_framework.evaluation import (
+    TrajectoryMatchMetric,
+    TaskAdherenceMetric,
+    Evaluator,
+    Trajectory
+)
+
+# Model configuration for judge
+model_config = AzureOpenAIModelConfiguration(
+    azure_deployment="o3-mini",
+    api_version="2024-02-01",
+    temperature=0
+)
+
+# Run your agent
+task = "What's the weather in Seattle?"
+run = your_agent.run(task)
+
+# Create trajectory object
+trajectory = Trajectory(
+    task=task,
+    run=run,
+    reference=[  # Optional reference trajectory
+        {"type": "tool_call", "tool": "weather_api", "args": {"location": "Seattle"}},
+        {"type": "response", "content": "Weather information for Seattle"}
+    ]
+)
+
+# Define metrics
+trajectory_match = TrajectoryMatchMetric(match_type="exact")
+task_adherence = TaskAdherenceMetric(
+    criteria={
+        "Task adherence": (
+            "Does the response address the user's request and incorporate "
+            "information from tool calls appropriately?"
+        )
+    },
+    rating_rubric={
+        "5": "Excellent - Fully addresses task with complete detail",
+        "4": "Good - Addresses most aspects effectively",
+        "3": "Adequate - Addresses core task, minor gaps",
+        "2": "Poor - Partial addressing with significant gaps",
+        "1": "Inadequate - Fails to address task properly"
+    }
+)
+
+# Create evaluator
+evaluator = Evaluator(
+    metrics=[trajectory_match, task_adherence],
+    model_config=model_config,
+    trajectory=trajectory
+)
+
+# Run evaluation
+result = evaluator.run()
+
+# Results follow Azure format
+print("Evaluation Results:")
+for metric_name, score in result.items():
+    if isinstance(score, dict):
+        print(f"{metric_name}: {score.get('score', 'N/A')}")
+        print(f"  Result: {score.get('result', 'N/A')}")
+        print(f"  Reason: {score.get('reason', 'N/A')}")
+    else:
+        print(f"{metric_name}: {score}")
+
+
+```
@@ -0,0 +1,50 @@
+# Guardrails
+
+The design goal is to provide a flexible and extensible way to implement guardrails
+and a built-in set of guardrails that can be used for common use cases.
+
+> NOTE: this is work in progress.
+
+Guardrails can be template-based to adapt to different input data types, which
+include:
+- `Message` for agent messages.
+- `ToolCall` for tool call requests.
+- `ToolResult` for tool call results.
+
+Guardrails are added to other components such as `ModelClient` and `MCPServer`
+as hooks that are called before and after the main logic of the component.
+
+For example, the `ModelClient` has methods to add input and output guardrails.
+
+```python
+model_client = ModelClient(...)
+model_client.add_input_guardrails([
+    PIIGuardrail[Message](...),
+    SensitiveDataGuardrail[Message](...),
+])
+model_client.add_output_guardrails([
+    HarmfulContentGuardrail[Message](...),
+])
+```
+
+Another example to show how to use a guardrail with an MCP server:
+
+```python
+guardrail = PIIGuardrail(
+    config={
+        "rules": [
+            {
+                "type": "email",
+                "action": "block"
+            },
+            {
+                "type": "phone",
+                "action": "block"
+            }
+        ]
+    }
+)
+
+mcp_server = MCPServer(...)
+mcp_server.add_output_guardrail(guardrail)
+```
@@ -0,0 +1,71 @@
+# MCP Servers
+
+An MCP server is a component that wraps a session to an
+[Model Context Protocol](https://modelcontextprotocol.io/) (MCP) server.
+
+The tools provided by MCP server should match the tool interface to ensure
+minimal boilerplate code when dealing with both tools and MCP servers.
+
+Other features like sampling and resources, should be accessible through
+the MCP server interface as well.
+
+
+## MCP Server base class (draft)
+
+```python
+
+class MCPServer(ABC):
+    """The base class for all MCP servers in the framework."""
+
+    @abstractmethod
+    async def list_tools(self, context: Context) -> list[ToolSchema]:
+        """List all available tools in the MCP server.
+
+        Returns:
+            A list of tool schemas available in the MCP server.
+        """
+        ...
+
+    @abstractmethod
+    async def call_tool(
+        self,
+        call: ToolCall,
+        context: Context,
+    ) -> ToolResult:
+        """Call a tool with the given name and arguments.
+
+        Args:
+            tool_name: The name of the tool to call.
+            args: The arguments to pass to the tool.
+            context: The context for the current invocation of the MCP server.
+
+        Returns:
+            The result of calling the tool.
+        """
+        ...
+    
+    def add_input_guardrails(
+        self, 
+        guardrails: list[InputGuardrail[ToolCall]]
+    ) -> None:
+        """Add input guardrails to the MCP server.
+
+        Args:
+            guardrails: The list of input guardrails to add.
+        """
+        ...
+    
+    def add_output_guardrails(
+        self, 
+        guardrails: list[OutputGuardrail[ToolResult]]
+    ) -> None:
+        """Add output guardrails to the MCP server.
+
+        Args:
+            guardrails: The list of output guardrails to add.
+        """
+        ...
+    
+```
+
+MCP specs have other APIs. We should consider adding them as well.
@@ -0,0 +1,85 @@
+# Model Clients
+
+A model client is a component that implements a unified interface for
+interacting with different language models. It exposes a standardized metadata
+about the model it provides (e.g., model name, tool call and vision capabilities, etc.)
+to support validation and composition with other components.
+
+The framework provides a set of pre-built model clients:
+
+- `OpenAIChatCompletionClient`
+- `AzureOpenAIChatCompletionClient`
+- `AzureOpenAIResponseClient`
+- `AzureAIClient`
+- `AnthropicClient`
+- `GeminiClient`
+- `HuggingFaceClient`
+- `OllamaClient`
+- `VLLMClient`
+- `ONNXRuntimeClient`
+- `BedrockClient`
+- `NIMClient`
+
+Prompt template is a component that is used by model clients to generate prompts with parameters set based on some injected context.
+prompts with parameters set based on some injected context.
+This gets into the actual interface and implementation detail of model clients,
+so we just mention it here.
+
+The design goal is to provide integration with a wide range of model providers,
+including both open-source and commercial models, while maintaining a consistent
+interface for developers to use.
+
+## `ModelClient` base class (draft)
+
+```python
+class ModelClient(ABC):
+    """The base class for all model clients in the framework."""
+
+    @abstractmethod
+    async def create(
+        self,
+        thread: Thread,
+        context: Context,
+        stream: bool = False,
+        tools: Optional[list[Tool]] = None,
+        output_format: Optional[OutputFormat] = None,
+    ) -> Message:
+        """Generate a response from the model based on the provided messages.
+
+        Args:
+            thread: The conversation context to generate a response.
+            context: The context for the current invocation of the model client.
+                This is for accessing event channels for streaming tokens.
+            stream: Whether to stream the response tokens.
+            tools: Optional list of tools to use for tool calling.
+            output_format: Optional structured output format for the response.
+                If provided, the model will generate a response in this format
+                and returns a structured response message.
+
+        Returns:
+            The generated response message.
+        """
+        ...
+    
+    def add_input_guardrails(
+        self, 
+        guardrails: list[InputGuardrail[Message]]
+    ) -> None:
+        """Add input guardrails to the model client.
+
+        Args:
+            guardrails: The list of input guardrails to add.
+        """
+        ...
+    
+    def add_output_guardrails(
+        self, 
+        guardrails: list[OutputGuardrail[Message]]
+    ) -> None:
+        """Add output guardrails to the model client.
+
+        Args:
+            guardrails: The list of output guardrails to add.
+        """
+        ...
+```
@@ -0,0 +1,3 @@
+# Observability and Monitoring
+
+Traces should follow the [OTEL GenAI Conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/).
@@ -0,0 +1,10 @@
+# Optimization and Tuning
+
+> For future consideration.
+
+The framework should support optimization of agents and workflows
+with task feedback, by tuning the various components such as 
+system prompts, model parameters, and tool configurations.
+
+We should also consider fine-tuning of the models and embeddings
+as part of the optimization process.
@@ -0,0 +1,29 @@
+# Threads
+
+Threads are stateful objects to manage the conversation context of an agent or a workflow.
+They are meant to be shown to the user as part of a user interface.
+They can be persisted to a database or a file system, and used to
+resume a previous user session.
+
+Thread should use message and content types as defined in [Core Data Types](types.md).
+
+A thread can contain sub-threads as a dictionary of threads. 
+This is to ensure agents in a workflow can run concurrently on different threads.
+The default thread has the key `main` and the sub-threads having keys that are usually
+corresponding to the agents in a workflow.
+
+For workflows, thread should also support the concept of execution state, which includes:
+- The history of steps taken.
+- The current step in the workflow.
+- The next steps to be taken.
+
+This is to ensure the workflow can be resumed from where it left off, without losing
+the state of execution.
+
+The framework should provides default implementations of a thread class that:
+- Can be backed by a database (i.e., Redis) or a file system (i.e., JSON file).
+- Can be backed by the Foundry Agent Service.
+- Can be copied and forked.
+- Can be serialized and deserialized to/from JSON.
+- Can support checkpointing, rollback, and time travel, for both agent and workflow.
+- Can automantically export truncated views to be used by model clients to keep the context size within limits.
@@ -0,0 +1,215 @@
+# Tools
+
+> The design goal is to make it easy to create new tools and integrate existing APIs and make them available to agents.
+
+A tool is a component that can be used to invoke procedure code
+and returns a well-defined result type to the caller.
+
+The result type should indicate the success or failure of the invocation,
+as well as the output of the invocation in terms of the core data types.
+There may be other fields in the result type for things like
+side effects, etc.. We should address this when designing the
+tool interface.
+
+A tool may have arguments for invocation.
+The arguments must be defined using JSON schema that language model supports.
+
+A tool may have dependencies such as tokens, credentials,
+or output message channels that will be provided by through
+a context variable passed to the tool when it is invoked.
+
+A tool may also have guardrails that are used to ensure the
+tool is invoked with proper arguments, or that the agent has the
+right context such as human approval to invoke the tool.
+
+The framework provides a set of pre-built tools:
+
+- `FunctionTool`: a tool that wraps a function.
+- `AzureAISearchTool`: a tool that is backed by Azure AI Search Service.
+- `OpenAPITool`: a tool that is backed by a service that defines an OpenAPI spec.
+- Other tools backed by Foundry.
+
+## `Tool` base class
+
+```python
+@dataclass
+class ToolResult:
+    """The result of running a tool."""
+    is_error: bool
+    output: List[ImageContent | TextContent] # The content types are defined as part of the core data types.
+    ... # Other fields, could be extended to include more for application-specific needs.
+
+class Tool(ABC):
+    """The base class for all tools in the framework."""
+
+    @property
+    def name(self) -> str:
+        """The name of the tool, used to identify it in the system."""
+        ...
+    
+    @property
+    def description(self) -> str:
+        """The description of the tool, used to provide information about its
+        functionality.
+        """
+        ...
+
+    @property
+    def schema(self) -> ToolSchema:
+        """The schema of the tool, which defines the JSON schema of the input
+        arguments."""
+        ...
+
+    @property
+    def strict(self) -> bool:
+        """Whether the JSON schema is in strict mode. If true, no optional
+        arguments are allowed.
+        """
+        ...
+    
+    async def __call__(
+        self,
+        call: ToolCall,
+        context: Context,
+    ) -> ToolResult:
+        """The method to call to run the tool with arguments and return the result.
+        
+        Args:
+            call: The tool call containing the name and arguments to pass to the tool.
+            context: The context for the current invocation of the tool, providing
+                access to the event channel, and human-in-the-loop (HITL) features.
+
+        Returns:
+            The result of running the tool.
+        """
+        try:
+            # Call the on_invoke method to allow for input guardrails to be applied
+            # to the arguments before the tool is run.
+            await self.on_invoke(args, context)
+            # Call the run method to actually run the tool.
+            result = await self.run(args, context)
+            # Call the on_output method to handle the output of the tool.
+            result = await self.on_output(result, context)
+        except Exception as e:
+            # If an error occurs, call the on_error method to handle it.
+            result = await self.on_error(e, context)
+        return result
+
+    @abstractmethod
+    async def run(
+        self,
+        calls: ToolCall,
+        context: Context,
+    ) -> ToolResult:
+        """The method called by the tool itself to run the tool with arguments and return the result."""
+        ...
+    
+    async def on_invoke(
+        self,
+        calls: ToolCall,
+        context: Context,
+    ) -> None:
+        """The method called by the tool when is invoked but before it is run.
+
+        This is useful for input guardrails to be applied to the arguments
+        before the tool is run.
+        """ 
+        ...
+    
+    async def on_error(
+        self,
+        error: Exception,
+        context: Context,
+    ) -> ToolResult:
+        """The method called by the tool when an error is raised."""
+        ...
+    
+    async def on_output(
+        self,
+        output: ToolResult,
+        context: Context,
+    ) -> ToolResult:
+        """The method called by the tool when the output is ready.
+
+        This is where output guardrails can be applied to the result
+        before it is returned to the caller.
+        """
+        ...
+    
+    def add_input_guardrails(
+        self, 
+        guardrails: list[InputGuardrail[ToolCall]]
+    ) -> None:
+        """Add input guardrails to the tool.
+
+        Args:
+            guardrails: The list of input guardrails to add.
+        """
+        ...
+    
+    def add_output_guardrails(
+        self, 
+        guardrails: list[OutputGuardrail[ToolResult]]
+    ) -> None:
+        """Add output guardrails to the tool.
+
+        Args:
+            guardrails: The list of output guardrails to add.
+        """
+        ...
+    
+    def add_on_error_func(
+        self, 
+        on_error_func: Callable[[Exception, Context], Awaitable[ToolResult]]
+    ) -> None:
+        """Add a function to call when an error is raised during the call to `run`.
+
+        Args:
+            on_error_func: The function to call when an error is raised.
+        """
+        ...
+```
+
+## `FunctionTool`
+
+The `FunctionTool` is a decorator that can be used to create a tool from a function.
+
+```python
+@FunctionTool
+def web_search(
+    query: str,
+    num_results: int = 10,
+) -> str:
+    """A tool that performs a web search and returns the results."""
+    ...
+```
+
+`FunctionTool` supports customization of the following:
+- `name`
+- `description`
+- `on_error_func`: the function to call when an error is raised during the call to `run`.
+- `strict`: whether the JSON schema is in strict mode. If true, no optional
+  arguments are allowed.
+- `input_guardrails`: a list of input guardrails to apply to the arguments
+  before the tool is run.
+- `output_guardrails`: a list of output guardrails to apply to the result
+  before it is returned.
+
+## `AgentTool`
+
+The `AgentTool` is a wrapper around an agent that can be used as a tool.
+
+```python
+agent = SomeAgent(...)
+tool = AgentTool(
+    agent=agent,
+    name="SomeAgent",
+    description="Some description of this agent tool.",
+    output_extractor=..., # Optional, a function to extract a ToolResult from the agent's run Result.
+    on_error_func=..., # Optional, a function to call when an error is raised during the call to `run`.
+)
+```
+
+The argument to the `AgentTool` is a single string.
+
+> NOTE: Do we also need to support passing a thread to the agent tool?
@@ -0,0 +1,118 @@
+# Core Data Types
+
+A design goal of the new framework to simplify the interaction between agent components
+through a common set of data types, minimizing boilerplate code
+in the application for transforming data between components.
+
+For example, text, images, function calls, tool schema are
+all examples of such data types.
+These data types are used to interact with agent components (model clients, tools, MCP, threads, and memory),
+forming the connective tissue between those components.
+
+In AutoGen, these are the data types mostly defined in `autogen_core.models` module,
+and others like `autogen_core.Image` and `autogen_core.FunctionCall`. This is just
+an example as AutoGen has no formal definition of model context.
+
+To start, we should follow [MEAI](https://learn.microsoft.com/en-us/dotnet/api/microsoft.extensions.ai?view=net-9.0-pp).
+
+This document describes the data types from Python perspective,
+while for .NET, we should directly use the MEAI data types.
+
+## Content types
+
+```python
+class AIContent(ABC):
+    """Base class for all AI content types."""
+    additional_properties: Dict[str, Any] = field(default_factory=dict)
+    """Additional properties for extensibility, allowing custom fields."""
+
+class DataContent(AIContent):
+    """Data content type."""
+    data: bytes # Raw binary data.
+    media_type: str # MIME type of the data, e.g., "image/png", "application/json"
+    uri: str # URI constructed from the data.
+    base64: str # Base64 encoded data for easy transport.
+
+class ErrorContent(AIContent):
+    """Error content type."""
+    details: str # Detailed error message.
+    error_code: str # Error code for programmatic handling.
+    message: str # Human-readable error message.
+
+class FunctionCallContent(AIContent):
+    """Function call content type."""
+    name: str # Name of the function to call.
+    arguments: Dict[str, Any] # Arguments for the function call, serialized as JSON.
+    call_id: str # Unique identifier for the function call.
+    exception: Optional[Exception] = None # Optional exception for error occurred while mapping the original function call data to this content type.
+
+class FunctionResultContent(AIContent):
+    """Function result content type."""
+    call_id: str # Unique identifier for the function call.
+    result: Any # Result of the function call, or a generic error message.
+    exception: Optional[Exception] = None # Optional exception for error occurred while executing the function call.
+
+class TextContent(AIContent):
+    """Text content type."""
+    text: str
+
+class TextReasoningContent(AIContent):
+    """Text reasoning content type."""
+    text: str
+
+
+class UriContent(AIContent):
+    """URI content type."""
+    uri: str # URI of the content, e.g., a link to an image or document.
+    media_type: str # MIME type of the content, e.g., "image/png", "application/pdf".
+
+
+class UsageDetails:
+    input_token_count: Optional[int] = None
+    output_token_count: Optional[int] = None
+    additional_counts: Optional[Dict[str, int]] = None
+    total_token_count: Optional[int] = None
+
+
+class UsageContent(AIContent):
+    """Usage content type."""
+    details: UsageDetails
+
+```
+
+## `ChatMessage`
+
+A message in a thread that is sent to or received from a model client.
+
+> Should we use `Message` instead of `ChatMessage`?
+
+> We may need to extend this class to support more framework-level functionalities 
+> such as handoff, stopping, and so on?
+
+```python
+class ChatRole(Enum):
+    """The role of the author in a chat message."""
+    USER = "user"
+    ASSISTANT = "assistant"
+    SYSTEM = "system"
+    TOOL = "tool"
+
+class ChatMessage:
+    message_id: str # Unique identifier for the message.
+    author: str # Unique identifier for the author of the message.
+    role: ChatRole # Role of the author in the chat, e.g., user, assistant, system, tool.
+    contents: List[AIContent] # List of content types in the message, e.g., text, images, function calls.
+```
+
+## Tool types
+
+Align with the [MEAI tool types](https://learn.microsoft.com/en-us/dotnet/api/microsoft.extensions.ai.aifunction?view=net-9.0-pp) 
+in terms of the core attributes and methods.
+
+See [Tools](./tools.md) for more details.
+
+## Model client types
+
+Align with the MEAI model client types in terms of the core attributes and methods.
+
+See [Models](./models.md) for more details.
@@ -0,0 +1,38 @@
+# Vector Stores and Embedding Clients
+
+A vector store is component that provides a unified interface for
+interacting with different vector databases, similar to model clients.
+It exposes indexing and querying methods, including vector, text-based
+and hybrid queries.
+
+The details can be filled in based on the existing vector abstraction
+in Semantic Kernel.
+
+The framework provides pre-built vector stores (already exist in
+Semantic Kernel):
+
+- Azure AI Search
+- Cosmos DB
+- Chroma
+- Couchbase
+- Elasticsearch
+- Faiss
+- In-memory
+- JDBC
+- MongoDB
+- Pinecone
+- Postgres
+- Qdrant
+- Redis
+- SQL Server
+- SQLite
+- Volatile
+- Weaviate
+
+Many vector store implementations will require embedding clients
+to function. An embedding client is a component that implements a unified interface
+to interact with different embedding models.
+
+The framework provides a set of pre-built embedding clients:
+
+- TBD.
@@ -0,0 +1,201 @@
+# Workflow
+
+The design goal is to create workflows that can be specified in a declarative
+way to allow for easy creation and modification without needing to change the
+underlying code. 
+
+## `Workflow` is Agent
+
+A `Workflow` is an agent composed of other agents. It follows the same interface
+as an agent. This allows for nested workflows, where a workflow can contain other
+workflows.
+
+## Agents in a `Workflow`
+
+Each agent (or a `Workflow`) in a `Workflow` has a thread on which it will
+always run. The thread may be privated, or shared among some or all of the agents.
+
+When do agents share a `thread`?
+
+- When an agent is called through handoff or as a tool by another agent, the caller
+    agent's thread may be shared with the callee agent.
+
+When do agents not share a `thread`?
+
+- When a set of worker agents are called through a "fan-out" and "fan-in" pattern, where the worker
+    agents are called in parallel and the results are combined by an aggregator agent.
+
+Thread sharing can be configured through the `Workflow`'s constructor.
+By default, each agent has its own private thread and no sharing.
+
+See [Threads](threads.md) for more details on how threads work.
+
+## `Workflow` from control flow graph
+
+A `Workflow` can be created from a control flow graph of agents.
+The graph is a directed graph where each node is an agent and each edge
+is a transition between agents. The graph can contain loops
+and conditional transitions.
+
+The control flow graph specifies the order in which agents are called
+and the conditions under which they are called.
+
+```python
+# Create agent instances.
+agent1 = MCPAgent(
+    model_client="OpenAIChatCompletionClient",
+    mcp_server=["MCPServer1", "MCPServer2"],
+)
+agent2 = MCPAgent(
+    model_client="OpenAIChatCompletionClient",
+    mcp_server=["MCPServer3", "MCPServer4"],
+)
+agent3 = MCPAgent(
+    model_client="OpenAIChatCompletionClient",
+    mcp_server="MCPServer5",
+)
+
+# Create a directed graph of agents with conditional loops and transitions.
+# The graph builder validates the graph.
+graph = GraphBuilder() \
+    .add_agent(agent1) \
+    .add_agent(agent2) \
+    .add_agent(agent3) \
+    .add_loop(agent1, agent2, conditions=Any(...)) \
+    .add_transition(agent2, agent3, conditions=Any(..., All(...))]) \
+    .build()
+
+# Create a workflow from the graph.
+workflow = Workflow(graph=graph)
+```
+
+## `Workflow` from message router
+
+By default, each message is delivered to an _inbox_ of every agent in a `Workflow`.
+When an agent is called, the inbox is cleared and the messages are added
+to the thread that is used by the agent.
+
+If multiple agents share a thread, each message is added exactly once to the thread.
+
+To customize the message flow, we can configure how each inbox behaves.
+Each agent's inbox can be configured to only accept messages from a specific sender(s). 
+We can also configure the inbox batch size, time-to-live for messages in the inbox
+and various other parameters that controls how the inbox is processed.
+
+The configuration of agents' inboxes is done using a `Router` object,
+which can be built using a `RouterBuilder` object.
+
+```python
+graph = ...
+
+router = RouterBuilder() \
+    .add_route(source=agent1, target=agent2) \ # Agent2 will receive messages from agent1.
+    .add_route(source=[agent1, agent2], target=agent3, batch_size=10, ttl="1h") \ # Agent3 will receive messages from agent1 and agent2, with a batch size of 10 and a time-to-live of 1 hour.
+    .add_route(source=Router.ANY, target=agent4) \ # Agent4 will receive all messages.
+).build()
+
+# Create a workflow from the graph and router.
+workflow = Workflow(graph=graph, router=router)
+```
+
+You can also skip the graph all together and just create a workflow from the router.
+In this case, all agents will run concurrently to process the messages delivered
+to their inboxes, according to the inbox rules.
+
+```python
+# Create a workflow from the router.
+workflow = Workflow(router=router)
+```
+
+The validation of the router is done as part of the workflow creation, to ensure
+that no gap exists in the routing, and warning for cascading routes.
+
+## Run `Workflow`
+
+It is the same as running an agent.
+
+```python
+# Create a message batch to send to the workflow.
+# The run context is used to pass in the event channel and other context
+# shared by the agents.
+thread = [
+    Message("Hello"),
+    Message("Can you find the file 'foo.txt' for me?"),
+]
+context = RunContext(event_channel="console")
+result = await workflow.run(thread, context=context)
+```
+
+## `Workflow` has a final response
+
+A `Workflow` is expected to have a final response, which is the final response in the 
+result of the last agent in the workflow. The final response is returned as part of the
+`Result` object returned by the `run` method.
+
+This is to ensure the workflow can be used in the same way as an agent. 
+
+## Stopping `Workflow`
+
+A `workflow` may run indefinitely, so it is important to have a way to stop it.
+
+```python
+# Use a stopping condition to stop the workflow when the condition is met.
+# Detail design TBD.
+condition = StopCondition(
+    condition=Any(...),
+    timeout="1h",
+)
+workflow = Workflow(graph=graph, stop_condition=condition)
+```
+
+TBD.
+
+## `Workflow` can be stateless
+
+The workflow state is kept in the thread object as input to the `run` method.
+If not provided, the workflow will create new sub-threads for each agent
+in the workflow for their private threads, otherwise, the workflow will
+use the provided sub-thread.
+
+```python
+# Create a workflow with a graph and router.
+workflow = Workflow(graph=graph, router=router, stop_condition=condition)
+
+# Create a new thread.
+thread = [
+    Message("Hello"),
+    Message("Can you find the file 'foo.txt' for me?"),
+]
+
+# Run the workflow.
+result = await workflow.run(thread, context=context)
+
+# Update the thread with new messages from the user.
+thread = result.thread + [
+    Message("Can you find the file 'bar.txt' for me?"),
+]
+
+# Resume the workflow from where it left off.
+result = await workflow.run(thread, context=context)
+```
+
+Read more about [Threads](threads.md) for more details on threads.
+
+## Pre-defined workflows
+
+The framework ships with a few pre-defined workflows for common orchestration
+patterns. These workflows can be used as-is or as a starting point for
+new developers, however, when using them, you should be aware of the underlying
+implementation and move on to custom workflows when a limit is reached.
+
+The pre-defined workflows are:
+- `Sequential`: A sequential workflow that calls each agent in order,
+  its message flow can be configured separately.
+- `MapReduce`: A map-reduce workflow that splits a task into smaller
+  tasks, runs them in parallel and then combines the results.
+- `RoundRobinGroupChat`: agents are called in a round-robin fashion in a loop.
+- `SelectorGroupChat`: agents are selected on each iteration by the workflow's built-in
+  LLM based selector.
+- `Swarm`: use handoffs.
+
+The predefined workflows are implemented as subclasses of the `Workflow` class.