Files
Eduard van Valkenburg f970a699d8 Python: Fix compaction message-id collisions and tool-loop summary persistence (#6299)
* Fix compaction message-id collisions and tool-loop summary persistence

Fixes two bugs in the compaction strategies:

- #5237: incremental group annotation assigned message ids by position
  within the re-annotated slice, so moving the re-annotation start back to
  a previous group start restarted ids at 0 and produced collisions
  (e.g. a user message reusing an assistant message's id), merging groups
  and causing tool-result compaction to wrongly exclude messages.
  group_messages/_ensure_message_ids now take an id_offset and guard
  against existing-id collisions; annotate_message_groups threads the
  slice start index through as the offset.

- #4991: the function-invocation loop copied the message list each
  iteration, so summaries inserted by compaction landed in a throwaway
  copy and were lost across tool-loop iterations (only the persistent
  excluded flags survived). _prepare_messages_for_model_call now compacts
  the list in place when messages is a list, so inserted summaries persist.

Adds regression tests (incremental id uniqueness, existing-id collision
avoidance, idempotency, and tool-loop summary persistence including
streaming and conversation-id modes).

Also adds a summarization.py sample demonstrating SummarizationStrategy
directly with a real client, and reworks advanced.py with tool-call
groups and a real summarizer.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Guard incremental message-id assignment against prefix-id collisions

Addresses PR review on #5237: _ensure_message_ids only guarded against
collisions within the re-annotated slice. A preexisting (e.g. user-supplied)
id in the preserved prefix could still be reassigned in the suffix when the
id was numerically out of position, merging groups across the re-annotation
boundary again.

group_messages/_ensure_message_ids now accept reserved_ids, and
annotate_message_groups passes the preserved prefix's ids so auto-assigned
suffix ids never collide across the full list. Adds a regression test
reproducing the out-of-position prefix-id collision.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-04 08:37:59 +00:00

160 lines
6.6 KiB
Python

# Copyright (c) Microsoft. All rights reserved.
import asyncio
from typing import Any, cast
from agent_framework import (
GROUP_ANNOTATION_KEY,
SUMMARIZED_BY_SUMMARY_ID_KEY,
SUMMARY_OF_MESSAGE_IDS_KEY,
Message,
SummarizationStrategy,
apply_compaction,
)
from agent_framework.openai import OpenAIChatClient
from dotenv import load_dotenv
load_dotenv()
"""This sample demonstrates the SummarizationStrategy directly.
Unlike SlidingWindow/Truncation strategies that simply drop older groups,
``SummarizationStrategy`` calls a real chat client to *summarize* the oldest
message groups, replaces them with a single linked summary message, and keeps
the most recent turns verbatim. This preserves long-range context (decisions,
goals, unresolved items) while bounding the prompt size.
Key components:
- SummarizationStrategy with a real OpenAIChatClient summarizer
- ``apply_compaction`` to run the strategy over a message list
- Bidirectional summary trace metadata (summary -> originals, original -> summary)
Run with:
uv run samples/02-agents/compaction/summarization.py # requires OPENAI_API_KEY
"""
def _annotation(message: Message) -> dict[str, Any] | None:
annotation = message.additional_properties.get(GROUP_ANNOTATION_KEY)
return cast("dict[str, Any]", annotation) if isinstance(annotation, dict) else None
def _build_history() -> list[Message]:
"""Build a multi-turn conversation long enough to trigger summarization."""
return [
Message(role="system", contents=["You are a project planning assistant."]),
Message(role="user", contents=["We are migrating a monolith to microservices. Where do we start?"]),
Message(
role="assistant",
contents=["Start by mapping bounded contexts and identifying the highest-churn modules to extract first."],
),
Message(role="user", contents=["The billing module changes most often. What are the risks of extracting it?"]),
Message(
role="assistant",
contents=["Main risks: distributed transactions, invoices-table ownership, and latency on hot paths."],
),
Message(role="user", contents=["How should we handle the shared invoices table?"]),
Message(
role="assistant",
contents=["Use the strangler-fig pattern: dual-write during transition, then make billing the owner."],
),
Message(role="user", contents=["What is the most recent decision we made?"]),
Message(role="assistant", contents=["We decided to extract billing first using the strangler-fig pattern."]),
]
def _print_messages(label: str, messages: list[Message]) -> None:
print(f"\n--- {label} ---")
print(f"Message count: {len(messages)}")
for index, message in enumerate(messages, start=1):
text = message.text or ", ".join(content.type for content in message.contents)
print(f"{index:02d}. [{message.role}] {text[:90]}")
async def main() -> None:
# 1. Create a real summarizing client. SummarizationStrategy only requires a
# SupportsChatGetResponse-compatible client, so any chat client works.
summarizer = OpenAIChatClient(model="gpt-4o-mini")
# 2. Build a conversation and show it before compaction.
messages = _build_history()
_print_messages("Before compaction", messages)
# 3. Configure the strategy. It triggers once the included non-system message
# count exceeds ``target_count + threshold`` (here 4 + 2 = 6), summarizing
# the oldest groups down toward ``target_count`` while keeping recent turns.
strategy = SummarizationStrategy(
client=summarizer,
target_count=4,
threshold=2,
)
# 4. Apply the strategy. The oldest groups are summarized into a single
# assistant message; the projected list is what the model would receive.
projected = await apply_compaction(messages, strategy=strategy)
_print_messages("After compaction (SummarizationStrategy)", projected)
# 5. Inspect the generated summary and its bidirectional trace metadata.
print("\n--- Summary trace ---")
for message in messages:
annotation = _annotation(message)
if annotation is None:
continue
summarizes = annotation.get(SUMMARY_OF_MESSAGE_IDS_KEY)
if summarizes:
print(f"Generated summary ({message.message_id}):")
print(f" {message.text}")
print(f" summarizes original ids: {summarizes}")
summarized_by: dict[str | None, Any] = {}
for message in messages:
annotation = _annotation(message)
if annotation is None:
continue
summary_id = annotation.get(SUMMARIZED_BY_SUMMARY_ID_KEY)
if summary_id:
summarized_by[message.message_id] = summary_id
if summarized_by:
print("Originals replaced by the summary:")
for original_id, summary_id in summarized_by.items():
print(f" {original_id} -> {summary_id}")
if __name__ == "__main__":
asyncio.run(main())
"""
Sample output (summary text varies because it is generated by the model):
--- Before compaction ---
Message count: 9
01. [system] You are a project planning assistant.
02. [user] We are migrating a monolith to microservices. Where do we start?
03. [assistant] Start by mapping bounded contexts and identifying the highest-churn modules to ex
04. [user] The billing module changes most often. What are the risks of extracting it?
05. [assistant] Main risks: distributed transactions, data ownership of the invoices table, and lat
06. [user] How should we handle the shared invoices table?
07. [assistant] Use the strangler-fig pattern: dual-write during transition, then make billing the
08. [user] What is the most recent decision we made?
09. [assistant] We decided to extract billing first using the strangler-fig pattern.
--- After compaction (SummarizationStrategy) ---
Message count: 6
01. [system] You are a project planning assistant.
02. [assistant] The user is migrating a monolith to microservices and decided to extract the billin
03. [user] How should we handle the shared invoices table?
04. [assistant] Use the strangler-fig pattern: dual-write during transition, then make billing the
05. [user] What is the most recent decision we made?
06. [assistant] We decided to extract billing first using the strangler-fig pattern.
--- Summary trace ---
Generated summary (summary_9):
The user is migrating a monolith to microservices and decided to extract the billing module first...
summarizes original ids: ['msg_1', 'msg_2', 'msg_3', 'msg_4', 'msg_5']
Originals replaced by the summary:
msg_1 -> summary_9
msg_2 -> summary_9
msg_3 -> summary_9
msg_4 -> summary_9
msg_5 -> summary_9
"""