Mirror of https://github.com/microsoft/agent-framework.git, synced 2026-06-16 21:04:09 +08:00
Memory, RAG, and other Context Providers (#54)
* initial draft
* fixes
* fixes
* incorporate reviewer feedback
* Remove attachment of context providers to threads. A separate issue now tracks this design question.
Committed by GitHub
Parent: 41facf3a2d
Commit: a990866901
@@ -36,7 +36,7 @@ find the design details of each component:
- [Vector Store and Embedding Client](vector-stores.md)
- [Tool](tools.md)
- [MCP Server](mcp-servers.md)
- [Memory](memory.md)
- [Context Provider (memory, RAG, etc.)](context_providers.md)
- [Thread](threads.md)
- [Guardrail](guardrails.md)
- Agent and Workflow:
@@ -70,19 +70,19 @@ graph TD
Component --> |extends| EmbeddingClient[Embedding Client]
Component --> |extends| Tool[Tool]
Component --> |extends| MCPServer[MCP Server]
Component --> |extends| Memory[Memory]
Component --> |extends| ContextProvider[Context Provider]
Component --> |extends| Thread[Thread]
Component --> |extends| Guardrail[Guardrail]

Agent --> |uses| uses1[Model Client]
Agent --> |uses| uses2[Thread]
Agent --> |uses| uses3[Tools and MCP Servers]
Agent --> |uses| uses4[Memory]
Agent --> |uses| uses4[Context Provider]
Agent --> |uses| uses5[Guardrail]

Workflow --> |contains| contains[Child Agents]

Memory --> |uses| uses5[Vector Store]
ContextProvider --> |uses| uses5[Vector Store]
VectorStore --> |uses| uses6[Embedding Client]
```

@@ -241,10 +241,10 @@ There are several states that an application may maintain during a user session:
owned and managed by a memory object.

The thread is always passed through the agent's `run` method.
Whereas the memory can be set through the constructor of the agent.
Memory may be attached to the thread or passed to the agent's constructor.

See the [Memory](memory.md) design document for more details on how memory
is used in the framework.
See the [Context Providers](context_providers.md) design document for more details on how memory
and other context providers like RAG are used in the framework.

It is up to the application to decide whether to reuse state across different
user sessions. The framework should provide the necessary methods and storage layer integration

@@ -0,0 +1,75 @@
# Memory, RAG, and other Context Providers

Prior to calling a model client, it's often necessary to add information gathered from various sources to the model's context window.
Two prime examples are long-term memory and retrieval-augmented generation (RAG) systems.
The `ContextProvider` class supports such scenarios through a unified interface for storing and retrieving context data.

## `ContextProvider` base class

```python
from abc import ABC, abstractmethod


class ContextProvider(ABC):
    """
    The base class for context providers like Memory and RAG.
    Subclasses will typically have extra methods and constructor parameters for specific functionality,
    such as clearing memory contents, or adding files to a RAG provider.
    """

    @abstractmethod
    async def get_relevant_context(self, messages: list["Message"]) -> "ProvidedContext | None":
        """Searches for and returns any information relevant to the messages."""
        ...

    @abstractmethod
    async def on_new_messages(self, messages: list["Message"]) -> None:
        """Stores any information derived from the messages that may be useful to retrieve later."""
        ...

    # To close, delete and release any runtime resources, each subclass should override the built-in Python `__del__` method.
```
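
To make the interface concrete, here is a minimal sketch of a subclass that keeps remembered statements in a plain list and retrieves them by word overlap. The `Message` and `ProvidedContext` dataclasses, their fields, and the `KeywordMemory` class are hypothetical stand-ins, since this document does not define them; a real provider would more likely delegate to a vector store.

```python
from __future__ import annotations

import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical message type; the framework's real type may differ."""
    role: str
    text: str


@dataclass
class ProvidedContext:
    """Hypothetical container for text to add to the context window."""
    text: str


class ContextProvider(ABC):
    @abstractmethod
    async def get_relevant_context(self, messages: list[Message]) -> ProvidedContext | None: ...

    @abstractmethod
    async def on_new_messages(self, messages: list[Message]) -> None: ...


def _words(text: str) -> set[str]:
    """Lowercased words longer than three characters, with punctuation trimmed."""
    cleaned = (w.lower().strip("?.,!") for w in text.split())
    return {w for w in cleaned if len(w) > 3}


class KeywordMemory(ContextProvider):
    """Toy provider: remembers user statements and matches them by shared words."""

    def __init__(self) -> None:
        self._facts: list[str] = []

    def clear(self) -> None:
        self._facts.clear()

    async def on_new_messages(self, messages: list[Message]) -> None:
        # Assumption: only user messages that are not questions are worth storing.
        for m in messages:
            if m.role == "user" and not m.text.rstrip().endswith("?"):
                self._facts.append(m.text)

    async def get_relevant_context(self, messages: list[Message]) -> ProvidedContext | None:
        query: set[str] = set()
        for m in messages:
            query |= _words(m.text)
        hits = [f for f in self._facts if query & _words(f)]
        return ProvidedContext("\n".join(hits)) if hits else None
```

In this sketch, `get_relevant_context` returns `None` until `on_new_messages` has stored at least one statement that shares a word with the query.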

## Usage

As an example, consider the following scenario involving long-term memory as a context provider.

Suppose that an app defines a subclass of `ContextProvider` called `Mem0Wrapper` which implements all required methods.
At runtime the app instantiates the memory provider, passing any necessary parameters to its constructor.
```python
mem = Mem0Wrapper(<params>)
```

In this example, the app then clears memory to ensure that it starts empty.
```python
mem.clear()
```

Then the app creates an agent and passes the memory provider to it through the constructor or some other method.
```python
agent.add_context_provider(mem)
```

After creating the agent, the app calls `agent.run(message, thread, run_config)` as usual,
where the user message assigns a task that requires knowledge the agent doesn't have.
`agent.run()` calls `get_relevant_context(message)` on each of the agent's context providers,
but `None` is returned since memory is empty.
Then `agent.run()` calls the model client as usual, but the LLM can't solve the task.
It may realize the information is missing, and ask the user for it.

The original user message (which assigned the task) is then added to the thread's message history as usual,
which automatically calls `Agent.on_new_messages(messages)`,
which in turn calls `on_new_messages` on each context provider.
In this case the memory provider finds no information worth storing in the user's message.
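
The call order described above can be sketched as follows. Everything here is an assumption made for illustration: the body of `Agent`, the `fake_model` stand-in for a model client, the list-of-strings thread, and `RecordingProvider` are hypothetical. They only mirror the sequence in the text: collect context from every provider, call the model, then notify every provider of the new message.

```python
from __future__ import annotations

import asyncio


class RecordingProvider:
    """Hypothetical provider that logs each call; it stores nothing, so it returns None."""

    def __init__(self, name: str, log: list[str]) -> None:
        self.name = name
        self.log = log

    async def get_relevant_context(self, messages: list[str]) -> str | None:
        self.log.append(f"{self.name}.get_relevant_context")
        return None

    async def on_new_messages(self, messages: list[str]) -> None:
        self.log.append(f"{self.name}.on_new_messages")


async def fake_model(prompt: list[str]) -> str:
    """Stand-in for a model client call."""
    return f"reply to {len(prompt)} prompt item(s)"


class Agent:
    def __init__(self, context_providers=None) -> None:
        # Providers may be passed to the constructor or added later.
        self._providers = list(context_providers or [])

    def add_context_provider(self, provider) -> None:
        self._providers.append(provider)

    async def on_new_messages(self, messages: list[str]) -> None:
        for provider in self._providers:
            await provider.on_new_messages(messages)

    async def run(self, message: str, thread: list[str]) -> str:
        # 1. Ask every provider for context; drop providers that return None.
        found = [await p.get_relevant_context([message]) for p in self._providers]
        context = [c for c in found if c is not None]
        # 2. Call the model with the context, the history, and the new message.
        reply = await fake_model(context + thread + [message])
        # 3. Record the user message and notify the providers.
        thread.append(message)
        await self.on_new_messages([message])
        # Whether replies are also forwarded to providers is left open in this sketch.
        thread.append(reply)
        return reply
```

Keeping the fan-out in `Agent.on_new_messages` means the providers never need to know about each other or about the thread.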

Suppose then that the user responds by supplying the missing information.
This time the agent will succeed, since the LLM's context window now contains the relevant information.
More importantly for our example, when the second user message is added to the thread's message history,
`mem.on_new_messages()` will extract and store the relevant information for later retrieval.

For this example, suppose the user then initiates a new chat (clearing the message history),
and assigns the original task again, but without providing the missing information.
This time when `mem.get_relevant_context(message)` is called, the memory provider finds the relevant information stored from the previous chat.
Then `agent.run()` attaches the retrieved information to the context window before calling the model client,
which allows the agent to succeed at the task without the user needing to repeat the missing information.
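
The three turns of this scenario can be simulated compactly. The sketch below rests on stated assumptions: `DictMemory`, `run_turn`, and the `key: value` convention for facts are hypothetical, and `toy_model` stands in for an LLM that succeeds only when the needed fact appears in its prompt.

```python
from __future__ import annotations

import asyncio


class DictMemory:
    """Hypothetical memory provider: stores 'key: value' facts from user messages."""

    def __init__(self) -> None:
        self._facts: dict[str, str] = {}

    async def on_new_messages(self, messages: list[str]) -> None:
        for text in messages:
            key, sep, value = text.partition(":")
            if sep:  # only messages shaped like "key: value" are stored
                self._facts[key.strip().lower()] = value.strip()

    async def get_relevant_context(self, messages: list[str]) -> str | None:
        # A fact is relevant if its key appears in any of the messages.
        hits = [f"{k}: {v}" for k, v in self._facts.items()
                if any(k in m.lower() for m in messages)]
        return "\n".join(hits) or None


def toy_model(prompt: list[str]) -> str:
    """Stand-in for an LLM: it can act only if the port is in its prompt."""
    for item in prompt:
        if item.lower().startswith("port:"):
            return "Connecting on port " + item.split(":", 1)[1].strip()
    return "Which port should I use?"


async def run_turn(mem: DictMemory, thread: list[str], user_text: str) -> str:
    context = await mem.get_relevant_context([user_text])
    prompt = ([context] if context else []) + thread + [user_text]
    reply = toy_model(prompt)
    thread.append(user_text)
    await mem.on_new_messages([user_text])
    thread.append(reply)
    return reply


async def demo() -> list[str]:
    mem = DictMemory()
    task = "Connect to the server on the usual port"
    chat1: list[str] = []
    first = await run_turn(mem, chat1, task)           # fact unknown, model asks
    second = await run_turn(mem, chat1, "port: 8443")  # fact supplied and stored
    chat2: list[str] = []  # new chat: empty history, same memory provider
    third = await run_turn(mem, chat2, task)           # fact comes from memory
    return [first, second, third]
```

Running `asyncio.run(demo())` shows the model asking for the port in the first turn and succeeding in the third turn even though the new thread starts empty, because the same provider instance outlives the thread.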

For more advanced memory implementations that have the ability to learn from their own experience
(instead of only from the user), `mem.get_relevant_context(message)` may return useful context
that was not previously extracted by `mem.on_new_messages(messages)`.