mirror of
https://github.com/microsoft/agent-framework.git
synced 2026-06-16 21:04:09 +08:00
dbac4eab56
Agent-Logs-Url: https://github.com/microsoft/agent-framework/sessions/9241f85a-8657-4979-8d64-0736dc6e7ebb Co-authored-by: lokitoth <6936551+lokitoth@users.noreply.github.com>
385 lines
13 KiB
Markdown
385 lines
13 KiB
Markdown
# Agent Hosting Executor Invariants and Test Coverage
|
|
|
|
This document defines the **invariants** that Agent hosting executors must maintain, identifies edge cases in `MessageMerger`, and recommends additional test coverage.
|
|
|
|
## Hosting Executor Invariants
|
|
|
|
Agent hosting executors (e.g., `HostedAgentResponseExecutor`, `AIAgentResponseExecutor`, `WorkflowOrchestrator`) **must** maintain the following invariants:
|
|
|
|
### Invariant 1: Single ResponseId per Turn
|
|
|
|
**For a given "turn", only one ResponseId is emitted.**
|
|
|
|
- If the underlying Agent does not provide a `ResponseId`, the hosting executor **must** assign one.
|
|
- All `AgentResponseUpdate` items emitted during a single turn must share the same `ResponseId`.
|
|
- This ensures clients can reliably group streaming updates into a single logical response.
|
|
|
|
**Relevant Code:**
|
|
- `IdGenerator` creates a `ResponseId` in `InMemoryResponsesService.InitializeResponse()`
|
|
- `AgentInvocationContext.ResponseId` provides the single ResponseId for the turn
|
|
- `ToStreamingResponseAsync()` uses `context.ResponseId` for the `Response` object
|
|
|
|
### Invariant 2: Output Order Preservation for Untimestamped Messages
|
|
|
|
**For a given turn, all messages without `CreatedAt` must be merged in output (arrival) order.**
|
|
|
|
- When `AgentResponseUpdate` items lack a `CreatedAt` timestamp, their relative order must be preserved based on arrival sequence.
|
|
- The current `MessageMerger.CompareByDateTimeOffset()` falls back to index comparison when timestamps are missing or equal, which satisfies this invariant.
|
|
- Edge case: Mixed timestamped/untimestamped messages require careful handling to avoid non-transitive sort issues.
|
|
|
|
**Relevant Code:**
|
|
- `MessageMerger.CompareByDateTimeOffset()` uses index as tiebreaker
|
|
- `ResponseMergeState.AddUpdate()` preserves insertion order in lists
|
|
|
|
### Invariant 3: Agent Message Grouping (No Interleaving)
|
|
|
|
**For multi-agent systems speaking concurrently, a given agent's response messages must all be grouped together (not interleaved with other agents' messages).**
|
|
|
|
- When multiple agents emit responses concurrently (e.g., in handoff scenarios), messages from each agent must remain contiguous.
|
|
- The merged output should group messages by agent, not interleave them arbitrarily.
|
|
- This is critical for maintaining conversation coherence in multi-agent orchestration.
|
|
|
|
**Relevant Code:**
|
|
- `MessageMerger` groups by `ResponseId`, then by `MessageId`
|
|
- Multi-agent scenarios may emit different `ResponseId` values per agent
|
|
- `ComputeMerged()` processes responses in dictionary iteration order (first-seen order)
|
|
|
|
---
|
|
|
|
## Overview
|
|
|
|
`MessageMerger` (`dotnet/src/Microsoft.Agents.AI.Workflows/MessageMerger.cs`) handles merging streaming `AgentResponseUpdate` messages into a final `AgentResponse`. It groups updates by `ResponseId`, then by `MessageId`, sorts messages by `CreatedAt` timestamp, and produces a consolidated response.
|
|
|
|
---
|
|
|
|
## Edge Cases Identified
|
|
|
|
| # | Edge Case | Risk | Current Test Coverage |
|
|
|---|-----------|------|----------------------|
|
|
| 1 | **Non-transitive timestamp comparison** — mixed timestamped/untimestamped updates with 3+ messages can produce inconsistent sort order | High | ❌ Not covered |
|
|
| 2 | **Cross-ResponseId ordering** — messages from different ResponseIds are emitted in first-seen order, not chronological | Medium | ❌ Not covered |
|
|
| 3 | **ResponseId=null updates always last** — dangling updates appended after all response-scoped messages regardless of arrival time | Medium | ❌ Not covered |
|
|
| 4 | **MessageId=null updates within a response** — keyed messages always precede message-id-less updates | Low | ❌ Not covered |
|
|
| 5 | **CreatedAt overwritten** — all messages in a merged response get same `CreatedAt`, erasing original timestamps | Low | Partial (test asserts message timestamps) |
|
|
| 6 | **Dangling metadata lost** — `FinishReason`, `Usage`, `AgentId` from ResponseId=null updates not merged | Medium | ❌ Not covered |
|
|
| 7 | **Unused `createdTimes` collection** — final response uses `UtcNow`, collected times are unused | Low (code smell) | N/A |
|
|
|
|
---
|
|
|
|
## Edge Case Details
|
|
|
|
### 1. Non-transitive Timestamp Comparison (High Risk)
|
|
|
|
**Problem:** The `OrderBy` lambda uses a comparison that isn't transitive when some messages have `CreatedAt` and others don't:
|
|
|
|
```csharp
|
|
.OrderBy(kvp => kvp.Value.CreatedAt ?? m.CreatedAt)
|
|
```
|
|
|
|
With three messages:
|
|
- A: `CreatedAt = 10`, idx=0
|
|
- B: `CreatedAt = null`, idx=1
|
|
- C: `CreatedAt = 5`, idx=2
|
|
|
|
The comparison produces: A < B (10 < 10=false, so equal), B < C (10 < 5=false), but A > C (10 > 5). This violates transitivity and can cause non-deterministic sort results.
|
|
|
|
**Recommendation:** Use insertion order as fallback for null timestamps, or store original index.
|
|
|
|
---
|
|
|
|
### 2. Cross-ResponseId Ordering (Medium Risk)
|
|
|
|
**Problem:** When multiple `ResponseId`s arrive interleaved, messages are emitted in response-first-seen order:
|
|
|
|
```
|
|
Updates: R1-msg1 → R2-msg1 → R1-msg2 → R2-msg2
|
|
Output: [R1-msg1, R1-msg2], [R2-msg1, R2-msg2]
|
|
```
|
|
|
|
This may not match chronological arrival order.
|
|
|
|
---
|
|
|
|
### 3. ResponseId=null Updates Always Last (Medium Risk)
|
|
|
|
**Problem:** Updates with `ResponseId = null` are grouped as "dangling" and always appended last, regardless of when they arrived:
|
|
|
|
```csharp
|
|
if (update.ResponseId is null)
|
|
{
|
|
danglingUpdates.Add(update);
|
|
continue;
|
|
}
|
|
```
|
|
|
|
This means metadata-only updates sent early in the stream appear at the end.
|
|
|
|
---
|
|
|
|
### 4. MessageId=null Updates Within a Response (Low Risk)
|
|
|
|
**Problem:** Within a `ResponseId` group, updates with `MessageId = null` are stored in a separate list and appended after keyed messages:
|
|
|
|
```csharp
|
|
if (update.MessageId is null)
|
|
{
|
|
grouping.Value.noIds.Add(update);
|
|
}
|
|
```
|
|
|
|
This is likely intentional but not documented or tested.
|
|
|
|
---
|
|
|
|
### 5. CreatedAt Overwritten (Low Risk)
|
|
|
|
**Problem:** All messages in the final response receive the same `CreatedAt` timestamp:
|
|
|
|
```csharp
|
|
message.CreatedAt = m.CreatedAt ?? DateTimeOffset.UtcNow;
|
|
```
|
|
|
|
Individual message timestamps are overwritten with the response-level timestamp or current time.
|
|
|
|
---
|
|
|
|
### 6. Dangling Metadata Lost (Medium Risk)
|
|
|
|
**Problem:** When `ResponseId = null` updates contain `FinishReason`, `Usage`, or `AgentId`, these values are not merged into the final response:
|
|
|
|
```csharp
|
|
// Dangling updates become orphan messages, not merged with response metadata
|
|
foreach (var orphan in danglingUpdates)
|
|
{
|
|
orphanMessages.Add(CreateAgentMessage(orphan));
|
|
}
|
|
```
|
|
|
|
---
|
|
|
|
### 7. Unused `createdTimes` Collection (Low Risk - Code Smell)
|
|
|
|
**Problem:** The code collects `CreatedAt` values into `createdTimes` list but never uses them:
|
|
|
|
```csharp
|
|
List<DateTimeOffset> createdTimes = [];
|
|
// ... later ...
|
|
if (update.CreatedAt.HasValue) createdTimes.Add(update.CreatedAt.Value);
|
|
// createdTimes is never used
|
|
```
|
|
|
|
---
|
|
|
|
## Recommended Additional Tests
|
|
|
|
### High Priority
|
|
|
|
1. **Non-transitive sorting with mixed timestamps**
|
|
```csharp
|
|
[Fact]
|
|
public void MergeMessages_MixedTimestamps_ProducesStableOrder()
|
|
{
|
|
// Arrange: 3 messages - A (CreatedAt=10), B (null), C (CreatedAt=5)
|
|
// Act: Merge
|
|
// Assert: Order is deterministic (either chronological or insertion order)
|
|
}
|
|
```
|
|
|
|
2. **Function call/result sequencing with 3+ messages**
|
|
```csharp
|
|
[Fact]
|
|
public void MergeMessages_FunctionCallAndResultsWithMixedTimestamps_PreservesLogicalOrder()
|
|
{
|
|
// Arrange: FunctionCall (null), FunctionResult (T1), Assistant (null)
|
|
// Act: Merge
|
|
// Assert: FunctionCall precedes FunctionResult precedes Assistant
|
|
}
|
|
```
|
|
|
|
### Medium Priority
|
|
|
|
3. **Multiple ResponseIds interleaved**
|
|
```csharp
|
|
[Fact]
|
|
public void MergeMessages_InterleavedResponseIds_GroupsByResponseId()
|
|
{
|
|
// Arrange: R1-msg1, R2-msg1, R1-msg2, R2-msg2
|
|
// Act: Merge
|
|
// Assert: Messages grouped by ResponseId, verify order
|
|
}
|
|
```
|
|
|
|
4. **Dangling updates with FinishReason/Usage**
|
|
```csharp
|
|
[Fact]
|
|
public void MergeMessages_DanglingUpdatesWithMetadata_MetadataPropagates()
|
|
{
|
|
// Arrange: ResponseId=null update with FinishReason=Stop, Usage=(10,20,30)
|
|
// Act: Merge
|
|
// Assert: Final response contains FinishReason and Usage
|
|
}
|
|
```
|
|
|
|
5. **ResponseId=null timing**
|
|
```csharp
|
|
[Fact]
|
|
public void MergeMessages_DanglingUpdatesFirst_AppearsAfterKeyedMessages()
|
|
{
|
|
// Arrange: null-response update, then keyed update
|
|
// Act: Merge
|
|
// Assert: Keyed messages appear before dangling
|
|
}
|
|
```
|
|
|
|
### Low Priority
|
|
|
|
6. **MessageId=null updates ordering**
|
|
```csharp
|
|
[Fact]
|
|
public void MergeMessages_MixedMessageIds_KeyedBeforeUnkeyed()
|
|
{
|
|
// Arrange: Mix of keyed and unkeyed updates within same ResponseId
|
|
// Act: Merge
|
|
// Assert: Keyed messages precede unkeyed in arrival order
|
|
}
|
|
```
|
|
|
|
---
|
|
|
|
## Summary
|
|
|
|
| Priority | Test Count | Risk Addressed |
|
|
|----------|------------|----------------|
|
|
| High | 2 | Non-deterministic sorting, Function sequencing |
|
|
| Medium | 3 | ResponseId grouping, Metadata propagation, Timing |
|
|
| Low | 1 | MessageId ordering |
|
|
|
|
**Recommendation:** Add at minimum the 2 high-priority tests and 2 medium-priority tests (interleaved ResponseIds, dangling metadata) to ensure `MessageMerger` behaves correctly in production streaming scenarios.
|
|
|
|
---
|
|
|
|
## Required Tests for Hosting Executor Invariants
|
|
|
|
### Invariant 1: Single ResponseId per Turn
|
|
|
|
```csharp
|
|
[Fact]
|
|
public async Task HostingExecutor_AssignsSingleResponseId_WhenAgentProvidesNone()
|
|
{
|
|
// Arrange: Agent that emits updates without ResponseId
|
|
// Act: Execute through hosting executor
|
|
// Assert: All emitted updates have the same ResponseId assigned by executor
|
|
}
|
|
|
|
[Fact]
|
|
public async Task HostingExecutor_PreservesAgentResponseId_WhenProvided()
|
|
{
|
|
// Arrange: Agent that emits updates with a consistent ResponseId
|
|
// Act: Execute through hosting executor
|
|
// Assert: ResponseId from agent is preserved (or overridden if executor policy requires)
|
|
}
|
|
|
|
[Fact]
|
|
public async Task HostingExecutor_RejectsMultipleResponseIds_InSingleTurn()
|
|
{
|
|
// Arrange: Agent that incorrectly emits updates with different ResponseIds
|
|
// Act: Execute through hosting executor
|
|
// Assert: Executor normalizes to single ResponseId or throws validation error
|
|
}
|
|
```
|
|
|
|
### Invariant 2: Output Order Preservation
|
|
|
|
```csharp
|
|
[Fact]
|
|
public void MessageMerger_PreservesInsertionOrder_WhenNoTimestamps()
|
|
{
|
|
// Arrange: Multiple updates without CreatedAt, in specific order A, B, C
|
|
// Act: Merge
|
|
// Assert: Output order is A, B, C
|
|
}
|
|
|
|
[Fact]
|
|
public void MessageMerger_PreservesInsertionOrder_WhenMixedTimestamps()
|
|
{
|
|
// Arrange: Updates where some have CreatedAt and some don't
|
|
// Act: Merge
|
|
// Assert: Untimestamped updates maintain relative order among themselves
|
|
}
|
|
|
|
[Fact]
|
|
public void MessageMerger_StableSort_WithThreeOrMoreMixedTimestampMessages()
|
|
{
|
|
// Arrange: 3+ messages with mixed null/non-null CreatedAt values
|
|
// Act: Merge multiple times
|
|
// Assert: Result is deterministic and consistent across runs
|
|
}
|
|
```
|
|
|
|
### Invariant 3: Agent Message Grouping
|
|
|
|
```csharp
|
|
[Fact]
|
|
public void MessageMerger_GroupsMessagesByAgent_InMultiAgentScenario()
|
|
{
|
|
// Arrange: Interleaved updates from Agent1 and Agent2
|
|
// A1-msg1, A2-msg1, A1-msg2, A2-msg2
|
|
// Act: Merge
|
|
// Assert: Output groups Agent1 messages together, Agent2 messages together
|
|
// Either [A1-msg1, A1-msg2, A2-msg1, A2-msg2] or [A2-msg1, A2-msg2, A1-msg1, A1-msg2]
|
|
}
|
|
|
|
[Fact]
|
|
public void MessageMerger_MaintainsAgentGrouping_WithDifferentResponseIds()
|
|
{
|
|
// Arrange: Agent1 uses ResponseId=R1, Agent2 uses ResponseId=R2
|
|
// Act: Merge with primaryResponseId
|
|
// Assert: Messages from each agent are contiguous, not interleaved
|
|
}
|
|
|
|
[Fact]
|
|
public async Task HostingExecutor_HandoffScenario_MaintainsMessageCoherence()
|
|
{
|
|
// Arrange: Agent1 hands off to Agent2 mid-conversation
|
|
// Act: Execute through hosting executor
|
|
// Assert: Agent1's messages appear before Agent2's messages (no interleaving)
|
|
}
|
|
|
|
[Fact]
|
|
public void MessageMerger_PreservesAgentMessageOrder_WithConcurrentAgents()
|
|
{
|
|
// Arrange: Two agents emitting messages concurrently with timestamps
|
|
// A1 at T1, A2 at T2, A1 at T3, A2 at T4 (where T1 < T2 < T3 < T4)
|
|
// Act: Merge
|
|
// Assert: Agent grouping is maintained, not sorted purely by timestamp
|
|
}
|
|
```
|
|
|
|
---
|
|
|
|
## Implementation Recommendations
|
|
|
|
### For Invariant 1 (Single ResponseId)
|
|
|
|
1. **Hosting Executor Layer**: The `AgentInvocationContext` already generates a `ResponseId`. Ensure this is consistently applied to all outgoing `AgentResponseUpdate` items.
|
|
|
|
2. **Validation**: Consider adding runtime validation that throws if an agent emits conflicting `ResponseId` values during a single turn.
|
|
|
|
3. **Code Location**: `ToStreamingResponseAsync()` in `AgentResponseUpdateExtensions.cs` should ensure the `context.ResponseId` is used consistently.
|
|
|
|
### For Invariant 2 (Output Order)
|
|
|
|
1. **Index Tracking**: Store original insertion index alongside each update to ensure stable sorting.
|
|
|
|
2. **Code Location**: `MessageMerger.ResponseMergeState.AddUpdate()` should track insertion order explicitly.
|
|
|
|
3. **Sort Stability**: Replace the current `OrderBy` with a stable sort that uses insertion index as the final tiebreaker.
|
|
|
|
### For Invariant 3 (Agent Grouping)
|
|
|
|
1. **Agent-First Grouping**: Modify `ComputeMerged()` to group by `AgentId` before processing.
|
|
|
|
2. **Deterministic Order**: Define explicit ordering rules (e.g., first-seen agent order, or alphabetical by AgentId).
|
|
|
|
3. **Code Location**: `MessageMerger.ComputeMerged()` needs logic to ensure agent messages are contiguous.
|