Python: fix reasoning model workflow handoff and history serialization (#4083)

mirror of https://github.com/microsoft/agent-framework.git synced 2026-06-16 21:04:09 +08:00

* fix: strip function_call and text_reasoning from cross-agent workflow handoff

When a reasoning model (e.g. gpt-5-mini) runs as Agent 1 in a workflow, its
response includes text_reasoning items (with server-scoped IDs like rs_XXXX)
and function_call items. Forwarding these to Agent 2 in a fresh conversation
caused API errors because the reasoning/call IDs are scoped to the original
stored response context.

Changes:
- Strip 'function_call', 'text_reasoning', 'function_approval_request', and
  'function_approval_response' from handoff messages in _agent_executor.py
- Keep 'function_result' so the actual tool output content is preserved for
  the next agent's context
- Update unit tests to reflect that function_result messages survive handoff
  (messages grow from 2→3: user, tool(result), assistant(summary))
- Fix incorrect test assertions in test_function_invocation_stop_clears_*
  that assumed the client layer updates session.service_session_id
- Also fixed _extract_function_calls to search all messages with call_id
  deduplication, and the error-limit stop path to submit function_call_output
  items before halting (via tool_choice=none cleanup call)

Relates to: https://github.com/microsoft/agent-framework/issues/4047

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: reasoning model workflow handoff and history serialization

Fixes multiple related issues when using reasoning models (gpt-5-mini,
gpt-5.2) in multi-agent workflows that chain agents via from_response
or replay full conversation history via AgentExecutorRequest.

## Reasoning items always emitted on output_item.added

When a reasoning model produces encrypted or hidden reasoning (no
visible text), the Responses API still fires a reasoning output item
without any reasoning_text.delta events. Previously no text_reasoning
Content was emitted in that case, making it invisible to downstream
logic. Both the non-streaming (_parse_response_from_openai) and
streaming (output_item.added) paths now always emit at least one
text_reasoning Content — with empty text if no content is available —
so co-occurrence detection and serialization guards work reliably.

## Reasoning items only serialized when paired with a function_call

The Responses API only accepts reasoning items in input when they
directly preceded a function_call in the original response. Sending a
reasoning item that preceded a text response (no tool call) causes:
  "reasoning was provided without its required following item"
_prepare_message_for_openai now checks has_function_call per message
and skips text_reasoning serialization when there is no accompanying
function_call.

## summary field is an array, not an object

The reasoning item summary field sent to the Responses API must be an
array of objects ([{"type": "summary_text", "text": ...}]), not a
single object. Fixed _prepare_content_for_openai accordingly.

## service_session_id cleared when explicit history is provided

When a workflow coordinator replays a full conversation (including
function calls from a previous agent run) back to an executor via
AgentExecutorRequest or from_response, the executor's session still
held a service_session_id (previous_response_id) from the prior run.
The API then received the same function-call items twice — once from
previous_response_id (server-stored) and once from the explicit input —
causing: "Duplicate item found with id fc_...".

AgentExecutor.run (when should_respond=True) and from_response now
reset self._session.service_session_id = None before running so that
explicit input is the sole source of conversation context.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* small improvements in text reasoning

* refactor: add reset_service_session to AgentExecutorRequest for explicit history replay

Replace the implicit 'always clear service_session_id when should_respond=True'
with an explicit opt-in field on AgentExecutorRequest.

The old approach used should_respond=True as a proxy for 'full history replay',
but that conflates two distinct intents:
- Orchestrations group chat sends should_respond=True with an empty/single-message
  list (not a full replay) — unnecessarily clearing service_session_id.
- HITL / feedback coordinators send the full prior conversation and truly need
  a fresh service session ID to avoid duplicate-item API errors.

Changes:
- Add AgentExecutorRequest.reset_service_session: bool = False
- AgentExecutor.run only clears service_session_id when this flag is True
- AgentExecutor.from_response unchanged (always clears; always full conversation)
- Set reset_service_session=True in all full-history-replay call sites:
  agents_with_HITL.py, azure_chat_agents_tool_calls_with_feedback.py,
  autogen-migration round-robin coordinator, tau2 runner
- Update _FullHistoryReplayCoordinator test helper to pass the flag

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* comment update

* fixes from feedback

* fix test

* reverted changes to agent executor

* fix: remove reset_service_session from tau2 runner

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* two other reverts

* fix sample

---------

Co-authored-by: Giles Odigwe <79032838+giles17@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

This commit is contained in:

Eduard van Valkenburg

2026-02-19 22:02:20 +01:00

committed by

GitHub

Unverified

parent 2cb4137501

commit 67ce1baecf

11 changed files with 445 additions and 66 deletions

									
										python/samples/03-workflows/agents/azure_chat_agents_tool_calls_with_feedback.py
									
		+3
		-4
	
												View File
												
				@@ -153,7 +153,7 @@ class Coordinator(Executor):

				            # Human approved the draft as-is; forward it unchanged.

				            await ctx.send_message(

				                AgentExecutorRequest(

				                    messages=original_request.conversation + [Message("user", text="The draft is approved as-is.")],

				                    messages=[*original_request.conversation, *[Message("user", text="The draft is approved as-is.")]],

				                    should_respond=True,

				                ),

				                target_id=self.final_editor_id,

				@@ -161,16 +161,15 @@ class Coordinator(Executor):

				            return

				        # Human provided feedback; prompt the writer to revise.

				        conversation: list[Message] = list(original_request.conversation)

				        instruction = (

				            "A human reviewer shared the following guidance:\n"

				            f"{note or 'No specific guidance provided.'}\n\n"

				            "Rewrite the draft from the previous assistant message into a polished final version. "

				            "Keep the response under 120 words and reflect any requested tone adjustments."

				        )

				        conversation.append(Message("user", text=instruction))

				        await ctx.send_message(

				            AgentExecutorRequest(messages=conversation, should_respond=True), target_id=self.writer_id

				            AgentExecutorRequest(messages=[Message("user", text=instruction)], should_respond=True),

				            target_id=self.writer_id,

				        )