Python: Unify tool results as Content items with rich content support (#4331)

* feat(python): allow @tool functions to return rich content (images, audio)

Add support for tool functions to return Content objects that the model can perceive natively. Closes #4272

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Anthropic logging + mypy fix

* Address PR review: fix MCP ordering, fold helper into from_function_result, fix Chat client

- Preserve original content order in MCP tool results instead of text-first
- Move _build_function_result logic into Content.from_function_result()
- Chat Completions: inject user message for rich items (API only supports string tool content)
- Update tests for ordering and new from_function_result behavior
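
The order-preserving conversion described above can be sketched roughly as follows. This is an illustration only, not the code from this commit: `convert_blocks_in_order` is a hypothetical name, and the input is assumed to be plain dicts shaped like MCP text and image blocks (`type`, `text`, `data`, `mimeType`).

```python
from typing import Any, Dict, List


def convert_blocks_in_order(blocks: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Convert MCP-style blocks in a single pass, keeping their original order.

    Hypothetical helper: the dict shapes are assumptions for illustration.
    """
    converted: List[Dict[str, Any]] = []
    for block in blocks:
        kind = block.get("type")
        if kind == "text":
            converted.append({"type": "text", "text": block.get("text", "")})
        elif kind in ("image", "audio"):
            # Rich blocks become data URIs at the position where they appeared.
            media_type = block.get("mimeType", "application/octet-stream")
            converted.append({"type": "data", "uri": f"data:{media_type};base64,{block['data']}"})
        else:
            # Unknown block types fall back to their string form.
            converted.append({"type": "text", "text": str(block)})
    return converted


blocks = [
    {"type": "image", "data": "AAAA", "mimeType": "image/png"},
    {"type": "text", "text": "caption"},
]
result = convert_blocks_in_order(blocks)
```

Because each block is appended as it is visited, an image that precedes its caption stays ahead of it, which a text-first strategy would not guarantee.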

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Use native Responses API multi-part output, warn+omit for Chat client

- Responses client: put rich items directly in function_call_output's
  output field as list (native API support) instead of user message injection
- Chat client: warn and omit rich items (API doesn't support multi-part
  tool results), matching Ollama/Bedrock pattern
- Unify test image: use sample_image.jpg across all integration tests
- Add Azure OpenAI Responses integration test
- Assert model describes house image to verify perception
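
The warn-and-omit behaviour for clients whose API accepts only string tool results might look like the sketch below. `Item` and `text_only_tool_result` are hypothetical stand-ins, not the library's types; the item kinds `"text"`, `"data"`, and `"uri"` follow the names that appear in the Bedrock diff of this commit.

```python
import logging
from dataclasses import dataclass
from typing import List, Optional

logger = logging.getLogger(__name__)


@dataclass
class Item:
    """Hypothetical stand-in for a tool result item."""

    type: str  # "text", "data", or "uri"
    text: Optional[str] = None
    uri: Optional[str] = None


def text_only_tool_result(items: List[Item]) -> str:
    """Join text items and warn once if any rich item had to be dropped."""
    text_parts = [item.text or "" for item in items if item.type == "text"]
    if any(item.type in ("data", "uri") for item in items):
        logger.warning("Rich content is not supported in tool results; omitting it.")
    return "\n".join(text_parts)


joined = text_only_tool_result(
    [Item("text", text="a"), Item("data", uri="data:image/png;base64,AAAA"), Item("text", text="b")]
)
```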

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix lint: remove print statement, wrap long line

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address review feedback: bug fixes, single-pass MCP, unit tests

- Add isinstance guard in from_function_result for non-Content lists
- Fix Anthropic empty tool_content fallback to string result
- Fix Content(type='text', text=None) edge case in parse_result
- Rewrite MCP _parse_tool_result_from_mcp as single-pass (no index counters)
- Add Anthropic unit tests: data image, uri image, unsupported media, all-unsupported
- Add OpenAI Chat unit test: rich items warning and omission
- Add OpenAI Responses unit tests: function_result with/without items
- Add test_types tests: only-rich-items list, non-Content list fallback

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix pyright errors: add type ignore comments for Any list iteration

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix mypy/pyright: ensure ToolExecutionException receives str

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix lint: remove duplicate test_prepare_options_excludes_conversation_id

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: unify all tool results into Content items

* Address Copilot review comments

* pyright fix

* small fix

* comments

* fix: address Copilot review - warnings, blob safety, dedup

- Add warning logs when rich content is dropped in Claude agent and
  MCP server handlers (matching Chat/Bedrock/Ollama pattern)
- Defensive blob URI construction: wrap plain base64 in data: prefix
- Simplify Chat client _prepare_content_for_openai to use content.result
- Simplify Responses client text-only path, remove redundant nesting
- Add test for plain base64 blob without data: prefix
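
The defensive blob handling mentioned above can be illustrated with a small sketch. `ensure_data_uri` and its `media_type` parameter are hypothetical names, and the assumption is that a blob arrives either as a complete `data:` URI or as bare base64 text.

```python
import base64


def ensure_data_uri(blob: str, media_type: str = "application/octet-stream") -> str:
    """Return the blob unchanged if it is already a data: URI, otherwise wrap it.

    Hypothetical helper for illustration; not the function used in the commit.
    """
    if blob.startswith("data:"):
        return blob
    return f"data:{media_type};base64,{blob}"


raw = base64.b64encode(b"\x89PNG").decode("ascii")
wrapped = ensure_data_uri(raw, "image/png")
```

Calling the helper a second time on `wrapped` returns it unchanged, so already-prefixed blobs are not wrapped twice.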

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix token double-counting in compaction and address review comments

- Exclude items from _serialize_content() to prevent double-counting
  tokens when items mirrors result in function_result content
- Add rich content warning in GitHub Copilot agent tool handler
- Replace raw Content debug log with concise item count/type summary
- Update stale test comments about FunctionTool.invoke return type
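
One way to picture the double-counting fix: when `items` mirrors `result`, a serializer used for token estimation should include only one of them. The sketch below uses a hypothetical `Content` stand-in and a hypothetical `serialize_for_counting` function; it is not the library's `_serialize_content()`.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Content:
    """Hypothetical stand-in with only the fields needed for this sketch."""

    type: str
    text: Optional[str] = None
    result: Any = None
    items: List["Content"] = field(default_factory=list)


def serialize_for_counting(content: Content) -> str:
    """Build the string used for token estimation, skipping `items`."""
    parts = [content.type]
    if content.text:
        parts.append(content.text)
    if content.result is not None:
        parts.append(str(content.result))
    # `items` is deliberately left out: it mirrors `result`, so including it
    # would count the same payload twice.
    return " ".join(parts)


mirrored = Content(
    type="function_result",
    result="sunny",
    items=[Content(type="text", text="sunny")],
)
serialized = serialize_for_counting(mirrored)
```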

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Giles Odigwe
2026-03-12 15:30:09 -07:00
committed by GitHub
parent b6a1315386
commit 5e33deff45
27 changed files with 1337 additions and 332 deletions
@@ -523,10 +523,22 @@ class BedrockChatClient(
                     }
                 }
             case "function_result":
+                if content.items:
+                    text_parts = [item.text or "" for item in content.items if item.type == "text"]
+                    rich_items = [item for item in content.items if item.type in ("data", "uri")]
+                    if rich_items:
+                        logger.warning(
+                            "Bedrock does not support rich content (images, audio) in tool results. "
+                            "Rich content items will be omitted."
+                        )
+                    tool_result_text = "\n".join(text_parts) if text_parts else ""
+                    tool_result_blocks = self._convert_tool_result_to_blocks(tool_result_text)
+                else:
+                    tool_result_blocks = self._convert_tool_result_to_blocks(content.result)
                 tool_result_block = {
                     "toolResult": {
                         "toolUseId": content.call_id,
-                        "content": self._convert_tool_result_to_blocks(content.result),
+                        "content": tool_result_blocks,
                         "status": "error" if content.exception else "success",
                     }
                 }
@@ -547,7 +559,12 @@ class BedrockChatClient(
                 return None
 
     def _convert_tool_result_to_blocks(self, result: Any) -> list[dict[str, Any]]:
-        prepared_result = result if isinstance(result, str) else FunctionTool.parse_result(result)
+        if isinstance(result, str):
+            prepared_result = result
+        else:
+            parsed = FunctionTool.parse_result(result)
+            text_parts = [c.text or "" for c in parsed if c.type == "text"]
+            prepared_result = "\n".join(text_parts) if text_parts else str(result)
         try:
             parsed_result: object = json.loads(prepared_result)
         except json.JSONDecodeError:
@@ -132,4 +132,5 @@ def test_process_response_parses_tool_result() -> None:
     contents = chat_response.messages[0].contents
     assert contents[0].type == "function_result"
-    assert contents[0].result == {"answer": 42}
+    assert "answer" in str(contents[0].result)
+    assert contents[0].items is not None