mirror of
https://github.com/microsoft/agent-framework.git
synced 2026-06-16 21:04:09 +08:00
Python: Fix incorrect workflow timings in DevUI by adding created_at to executor events (#5615)
* fix(devui): add created_at to custom output item events for correct workflow timings (#5545) CustomResponseOutputItemAddedEvent and CustomResponseOutputItemDoneEvent lacked a created_at field, causing the frontend to synthesize timestamps using integer-second precision with a forced +1s minimum gap between events. This made instant workflows appear to take 3+ seconds in the DevUI timeline. Fix: - Add optional created_at: float | None field to both custom event models - Populate created_at=float(time.time()) in the mapper for executor_invoked, executor_completed, and executor_failed events Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(devui): use event created_at for accurate workflow timeline timings workflow-view.tsx synthesized _uiTimestamp using Math.max(baseTimestamp, lastTimestamp + 1) with integer-second precision, forcing a minimum 1-second gap between every sequential event. This made instant workflows appear to take several seconds in the DevUI timeline. The fix prefers event.created_at (a float Unix timestamp populated by the backend mapper for all executor events) and only falls back to the synthetic timestamp when created_at is absent. This matches the pattern already used in devuiStore.ts:addDebugEvent. Added a regression test in test_mapper.py verifying that the mapper attaches created_at to all executor lifecycle events (invoked, completed, failed). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(devui): address review feedback for issue #5545 - Read data.timestamp (ISO string) and response.created_at in addition to top-level created_at when deriving _uiTimestamp, so response.workflow_event.completed events get a real server timestamp instead of a synthesized one - Change uniqueTimestamp tiebreaker: when a real server timestamp is available use Math.max(eventTimestamp, lastTimestamp) rather than lastTimestamp + 1, eliminating artificial 1-second gaps while still preserving monotonic ordering - Apply the same fix in the HIL streaming path (second setOpenAIEvents call in workflow-view.tsx) - Add assert event.created_at > 0 to regression test to guard against zero or negative timestamps - Add test_custom_output_item_event_models_have_created_at_field model- level test so removing the field produces a clear named failure rather than a downstream ValidationError Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(#5545): guard NaN timestamps, fix fallback ID uniqueness, add regression tests - workflow-view.tsx (×2): Wrap data.timestamp ISO→number conversion in a Number.isFinite() guard. Python's datetime.now().isoformat() emits microseconds without a trailing 'Z' (e.g. '2024-01-15T12:34:56.123456'), which some JS engines cannot parse, returning NaN. NaN !== undefined is true so the eventTimestamp !== undefined guard did not catch it, poisoning _uiTimestamp and resetting the monotonic ordering seed (NaN || 0 → 0). - execution-timeline.tsx: Replace uiTimestamp in the fallback syntheticItemId with the per-executor runNumber counter. Two runs of the same executor within the same second previously received identical _uiTimestamp values and therefore identical syntheticItemIds, causing their output buckets, state, and run entries to collide (execution-timeline.tsx:360–408). - Add missing test_workflow_timings_bug.py source file (only a stale .pyc existed). Three regression tests: · test_custom_event_models_lack_created_at_field – model field guard · test_workflow_executor_events_lack_created_at – mapper populates created_at · test_rapid_workflow_events_have_no_top_level_timestamps – confirms data.timestamp format that requires the frontend NaN guard Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address review feedback for #5545: Python: [Bug]: Workflow timings in DevUI are incorrect * devui: move timing regression tests into test_mapper.py, remove dedicated bug file - Delete test_workflow_timings_bug.py; tests belong in existing module files - The two tests already present in test_mapper.py (test_executor_events_carry_created_at_timestamp and test_custom_output_item_event_models_have_created_at_field) cover the same ground as the first two tests in the deleted file - Add test_executor_completed_maps_to_output_item_done_event to test_mapper.py, replacing the third test from the deleted file with a generic, issue-agnostic name and docstring Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address review feedback for #5545: review comment fixes --------- Co-authored-by: Copilot <copilot@github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This commit is contained in:
committed by
GitHub
Unverified
parent
7476049d7e
commit
27f926609f
@@ -1056,6 +1056,7 @@ class MessageMapper:
|
||||
output_index=context["output_index"],
|
||||
sequence_number=self._next_sequence(context),
|
||||
item=executor_item,
|
||||
created_at=float(time.time()),
|
||||
)
|
||||
]
|
||||
|
||||
@@ -1088,6 +1089,7 @@ class MessageMapper:
|
||||
output_index=context.get("output_index", 0),
|
||||
sequence_number=self._next_sequence(context),
|
||||
item=executor_item,
|
||||
created_at=float(time.time()),
|
||||
)
|
||||
]
|
||||
|
||||
@@ -1121,6 +1123,7 @@ class MessageMapper:
|
||||
output_index=context.get("output_index", 0),
|
||||
sequence_number=self._next_sequence(context),
|
||||
item=executor_item,
|
||||
created_at=float(time.time()),
|
||||
)
|
||||
]
|
||||
|
||||
|
||||
@@ -64,6 +64,7 @@ class CustomResponseOutputItemAddedEvent(BaseModel):
|
||||
output_index: int
|
||||
sequence_number: int
|
||||
item: dict[str, Any] | ExecutorActionItem | Any # Flexible item type
|
||||
created_at: float | None = None # Unix timestamp; used by frontend for accurate workflow timings
|
||||
|
||||
|
||||
class CustomResponseOutputItemDoneEvent(BaseModel):
|
||||
@@ -77,6 +78,7 @@ class CustomResponseOutputItemDoneEvent(BaseModel):
|
||||
output_index: int
|
||||
sequence_number: int
|
||||
item: dict[str, Any] | ExecutorActionItem | Any # Flexible item type
|
||||
created_at: float | None = None # Unix timestamp; used by frontend for accurate workflow timings
|
||||
|
||||
|
||||
class ResponseWorkflowEventComplete(BaseModel):
|
||||
|
||||
+4
-2
@@ -356,8 +356,10 @@ export function ExecutionTimeline({
|
||||
const runNumber = (runCount.get(executorId) || 0) + 1;
|
||||
runCount.set(executorId, runNumber);
|
||||
|
||||
// Create synthetic item ID for fallback format (no real item.id from backend)
|
||||
const syntheticItemId = `fallback_${executorId}_${uiTimestamp}`;
|
||||
// Create synthetic item ID using the run counter for guaranteed uniqueness.
|
||||
// Using uiTimestamp here caused collisions when the same executor ran
|
||||
// twice within the same second (both fallback entries would share an ID).
|
||||
const syntheticItemId = `fallback_${executorId}_run${runNumber}`;
|
||||
|
||||
runs.push({
|
||||
executorId,
|
||||
|
||||
@@ -576,17 +576,37 @@ export function WorkflowView({
|
||||
openAIEvent.type === "response.workflow_event.complete" // Fallback variant
|
||||
) {
|
||||
setOpenAIEvents((prev) => {
|
||||
// Generate unique timestamp for each event
|
||||
// Derive a server-side timestamp from the event, in priority order:
|
||||
// 1. top-level created_at (custom output-item events)
|
||||
// 2. response.created_at (response.created / lifecycle events)
|
||||
// 3. data.timestamp (response.workflow_event.completed ISO string)
|
||||
// Fall back to a synthesized timestamp only when none is present.
|
||||
const anyEvent = openAIEvent as Record<string, unknown>;
|
||||
const eventTimestamp: number | undefined =
|
||||
typeof anyEvent["created_at"] === "number" && anyEvent["created_at"]
|
||||
? (anyEvent["created_at"] as number)
|
||||
: typeof (anyEvent["response"] as Record<string, unknown> | undefined)?.["created_at"] === "number"
|
||||
? ((anyEvent["response"] as Record<string, number>)["created_at"] as number)
|
||||
: (() => {
|
||||
const ts = (anyEvent["data"] as Record<string, unknown> | undefined)?.["timestamp"];
|
||||
if (typeof ts !== "string") return undefined;
|
||||
const ms = new Date(ts).getTime();
|
||||
// Guard against NaN: Python isoformat() emits microseconds without Z,
|
||||
// which some JS engines cannot parse. Number.isFinite rejects NaN.
|
||||
return Number.isFinite(ms) ? ms / 1000 : undefined;
|
||||
})();
|
||||
const baseTimestamp = Math.floor(Date.now() / 1000);
|
||||
const lastTimestamp =
|
||||
prev.length > 0
|
||||
? (prev[prev.length - 1] as { _uiTimestamp?: number })
|
||||
._uiTimestamp || 0
|
||||
: 0;
|
||||
const uniqueTimestamp = Math.max(
|
||||
baseTimestamp,
|
||||
lastTimestamp + 1
|
||||
);
|
||||
// When we have a real server timestamp clamp to lastTimestamp (no +1s gap).
|
||||
// When synthesizing, keep the +1 s gap so ordering is always monotonic.
|
||||
const uniqueTimestamp =
|
||||
eventTimestamp !== undefined
|
||||
? Math.max(eventTimestamp, lastTimestamp)
|
||||
: Math.max(baseTimestamp, lastTimestamp + 1);
|
||||
|
||||
return [
|
||||
...prev,
|
||||
@@ -992,14 +1012,37 @@ export function WorkflowView({
|
||||
openAIEvent.type === "response.workflow_event.completed"
|
||||
) {
|
||||
setOpenAIEvents((prev) => {
|
||||
// Generate unique timestamp for each event
|
||||
// Derive a server-side timestamp from the event, in priority order:
|
||||
// 1. top-level created_at (custom output-item events)
|
||||
// 2. response.created_at (response.created / lifecycle events)
|
||||
// 3. data.timestamp (response.workflow_event.completed ISO string)
|
||||
// Fall back to a synthesized timestamp only when none is present.
|
||||
const anyEvent = openAIEvent as Record<string, unknown>;
|
||||
const eventTimestamp: number | undefined =
|
||||
typeof anyEvent["created_at"] === "number" && anyEvent["created_at"]
|
||||
? (anyEvent["created_at"] as number)
|
||||
: typeof (anyEvent["response"] as Record<string, unknown> | undefined)?.["created_at"] === "number"
|
||||
? ((anyEvent["response"] as Record<string, number>)["created_at"] as number)
|
||||
: (() => {
|
||||
const ts = (anyEvent["data"] as Record<string, unknown> | undefined)?.["timestamp"];
|
||||
if (typeof ts !== "string") return undefined;
|
||||
const ms = new Date(ts).getTime();
|
||||
// Guard against NaN: Python isoformat() emits microseconds without Z,
|
||||
// which some JS engines cannot parse. Number.isFinite rejects NaN.
|
||||
return Number.isFinite(ms) ? ms / 1000 : undefined;
|
||||
})();
|
||||
const baseTimestamp = Math.floor(Date.now() / 1000);
|
||||
const lastTimestamp =
|
||||
prev.length > 0
|
||||
? (prev[prev.length - 1] as { _uiTimestamp?: number })
|
||||
._uiTimestamp || 0
|
||||
: 0;
|
||||
const uniqueTimestamp = Math.max(baseTimestamp, lastTimestamp + 1);
|
||||
// When we have a real server timestamp clamp to lastTimestamp (no +1s gap).
|
||||
// When synthesizing, keep the +1 s gap so ordering is always monotonic.
|
||||
const uniqueTimestamp =
|
||||
eventTimestamp !== undefined
|
||||
? Math.max(eventTimestamp, lastTimestamp)
|
||||
: Math.max(baseTimestamp, lastTimestamp + 1);
|
||||
|
||||
return [
|
||||
...prev,
|
||||
|
||||
@@ -391,6 +391,94 @@ async def test_executor_failed_event(mapper: MessageMapper, test_request: AgentF
|
||||
assert "Executor failed" in str(item["error"])
|
||||
|
||||
|
||||
async def test_executor_events_carry_created_at_timestamp(
|
||||
mapper: MessageMapper, test_request: AgentFrameworkRequest
|
||||
) -> None:
|
||||
"""REGRESSION TEST: Executor mapped events must include a created_at timestamp.
|
||||
|
||||
Without created_at, the frontend synthesizes timestamps using
|
||||
Math.max(baseTimestamp, lastTimestamp + 1) with second precision, forcing
|
||||
a minimum 1-second gap between sequential events regardless of their actual
|
||||
elapsed time. This makes instant workflows appear to take multiple seconds
|
||||
in the DevUI timeline.
|
||||
"""
|
||||
invoke_event = create_executor_invoked_event(executor_id="exec_ts")
|
||||
complete_event = create_executor_completed_event(executor_id="exec_ts")
|
||||
fail_event = create_executor_failed_event(executor_id="exec_ts_fail")
|
||||
|
||||
invoked_results = await mapper.convert_event(invoke_event, test_request)
|
||||
completed_results = await mapper.convert_event(complete_event, test_request)
|
||||
|
||||
# Set up a separate context for the failed path
|
||||
mapper2 = MessageMapper()
|
||||
await mapper2.convert_event(create_executor_invoked_event(executor_id="exec_ts_fail"), test_request)
|
||||
failed_results = await mapper2.convert_event(fail_event, test_request)
|
||||
|
||||
for label, results in [
|
||||
("executor_invoked", invoked_results),
|
||||
("executor_completed", completed_results),
|
||||
("executor_failed", failed_results),
|
||||
]:
|
||||
assert results, f"mapper.convert_event should return events for {label}"
|
||||
for event in results:
|
||||
assert getattr(event, "created_at", None) is not None, (
|
||||
f"{label} mapped event {type(event).__name__} is missing 'created_at'. "
|
||||
"The frontend relies on this field for accurate workflow timeline timings."
|
||||
)
|
||||
assert event.created_at > 0, (
|
||||
f"{label} mapped event {type(event).__name__} has a non-positive "
|
||||
f"created_at value ({event.created_at!r}); expected a valid Unix timestamp."
|
||||
)
|
||||
|
||||
|
||||
def test_custom_output_item_event_models_have_created_at_field() -> None:
|
||||
"""MODEL TEST: CustomResponseOutputItemAddedEvent and Done must declare created_at.
|
||||
|
||||
This guards against accidentally removing the field from the model definition.
|
||||
A missing field causes a downstream ValidationError instead of a clear test failure.
|
||||
"""
|
||||
from agent_framework_devui.models._openai_custom import (
|
||||
CustomResponseOutputItemAddedEvent,
|
||||
CustomResponseOutputItemDoneEvent,
|
||||
)
|
||||
|
||||
assert "created_at" in CustomResponseOutputItemAddedEvent.model_fields, (
|
||||
"CustomResponseOutputItemAddedEvent is missing 'created_at' in model_fields. "
|
||||
"The frontend uses this field for accurate workflow timeline timings."
|
||||
)
|
||||
assert "created_at" in CustomResponseOutputItemDoneEvent.model_fields, (
|
||||
"CustomResponseOutputItemDoneEvent is missing 'created_at' in model_fields. "
|
||||
"The frontend uses this field for accurate workflow timeline timings."
|
||||
)
|
||||
|
||||
|
||||
async def test_executor_completed_maps_to_output_item_done_event(
|
||||
mapper: MessageMapper, test_request: AgentFrameworkRequest
|
||||
) -> None:
|
||||
"""Test executor_completed events are mapped to CustomResponseOutputItemDoneEvent.
|
||||
|
||||
Ensures executor_completed does not fall through to the legacy
|
||||
ResponseWorkflowEventComplete path, which lacks a top-level created_at field.
|
||||
"""
|
||||
from agent_framework_devui.models._openai_custom import ResponseWorkflowEventComplete
|
||||
|
||||
invoke_event = create_executor_invoked_event(executor_id="exec_output_item")
|
||||
await mapper.convert_event(invoke_event, test_request)
|
||||
|
||||
complete_event = create_executor_completed_event(executor_id="exec_output_item")
|
||||
results = await mapper.convert_event(complete_event, test_request)
|
||||
|
||||
assert results, "mapper.convert_event should return events for executor_completed"
|
||||
|
||||
workflow_events = [r for r in results if isinstance(r, ResponseWorkflowEventComplete)]
|
||||
assert not workflow_events, (
|
||||
"executor_completed should map to CustomResponseOutputItemDoneEvent, not ResponseWorkflowEventComplete."
|
||||
)
|
||||
|
||||
output_item_done = [r for r in results if r.type == "response.output_item.done"]
|
||||
assert output_item_done, f"Expected at least one response.output_item.done event; got: {[r.type for r in results]}"
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Workflow Lifecycle Event Tests
|
||||
# =============================================================================
|
||||
|
||||
Reference in New Issue
Block a user