Python: Add timeout parameter to FoundryAgent to fix ConnectTimeout on multi-turn conversations (#6263)

* Python: fix ConnectTimeout on multi-turn FoundryAgent conversations (#6241)

Expose a `timeout` parameter on `RawFoundryAgentChatClient`,
`_FoundryAgentChatClient`, `RawFoundryAgent`, `FoundryAgent`, and
`RawOpenAIChatClient` so callers can override the HTTP timeout used by
the underlying AsyncOpenAI client.

Root cause: `RawFoundryAgentChatClient.__init__` called
`project_client.get_openai_client()` without configuring any timeout,
inheriting the OpenAI SDK default of `httpx.Timeout(timeout=600.0, connect=5.0)`.
When connections are recycled between turns under load, the 5 s connect
timeout fires and surfaces as `openai.APITimeoutError`.

Fix:
- `load_openai_service_settings` (`_shared.py`): accept `timeout` and
  include it in `client_args` for all three `AsyncOpenAI`/
  `AsyncAzureOpenAI` construction paths.
- `RawOpenAIChatClient.__init__` (`_chat_client.py`): accept `timeout`
  and forward to `load_openai_service_settings`.
- `RawFoundryAgentChatClient.__init__` (`_agent.py`): accept `timeout`
  and set `openai_client.timeout = timeout` on the client returned by
  `get_openai_client()` before passing it to the base class.
- `_FoundryAgentChatClient`, `RawFoundryAgent`, `FoundryAgent`: accept
  and propagate `timeout` through the construction chain.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add timeout parameter to FoundryAgent and RawOpenAIChatClient

Expose a timeout parameter on RawFoundryAgentChatClient,
_FoundryAgentChatClient, RawFoundryAgent, FoundryAgent, and
RawOpenAIChatClient. When provided, the value is applied to the
underlying AsyncOpenAI client so that connect timeouts under load
or after connection recycling can be tuned by callers.

Previously, get_openai_client() was called without any timeout
override, so the SDK default of httpx.Timeout(timeout=600.0, connect=5.0) was
inherited and could fire on multi-turn conversations where the
underlying connection is recycled between turns.
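
A small illustration of what a single-number override means, assuming the
value is interpreted the way httpx interprets a bare float (one budget applied
to every phase). `PhaseTimeouts` and `expand` are hypothetical names used only
for this sketch; they model the behaviour and are not part of any SDK.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseTimeouts:
    """Per-phase timeout budget in seconds (stand-in for an httpx-style timeout)."""

    connect: float
    read: float
    write: float
    pool: float


def expand(timeout: float | PhaseTimeouts) -> PhaseTimeouts:
    """Expand a bare number into the same budget for every phase."""
    if isinstance(timeout, PhaseTimeouts):
        return timeout
    return PhaseTimeouts(connect=timeout, read=timeout, write=timeout, pool=timeout)


# Models the SDK default named in this commit: 5 s connect, 600 s otherwise.
SDK_DEFAULT = PhaseTimeouts(connect=5.0, read=600.0, write=600.0, pool=600.0)

# A caller passing timeout=30.0 raises the connect budget from 5 s to 30 s,
# but under this assumption the read budget also drops from 600 s to 30 s.
override = expand(30.0)
```

If that assumption holds, callers with long-running responses may want a
value large enough to cover the slowest expected turn, not only the connect
phase.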

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Add `timeout` parameter to `FoundryAgent` to fix `ConnectTimeout` on multi-turn conversations

Fixes #6241

* fix(foundry): use with_options to avoid mutating shared OpenAI client timeout (#6241)

Replace direct assignment `openai_client.timeout = timeout` with
`openai_client = openai_client.with_options(timeout=timeout)` in
RawFoundryAgentChatClient.__init__.

The Azure AI Projects SDK caches and returns a shared AsyncOpenAI client
per AIProjectClient. Mutating its .timeout attribute leaked the override
to all other code paths sharing that client (other agents, user code).
with_options() returns a new client instance with the override applied,
leaving the original shared client untouched.
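
The difference between the two approaches can be sketched with stand-ins.
`StandInClient` and `StandInProject` are hypothetical classes that only mimic
the caching behaviour described above; the `with_options` method here is a
simplified model that returns a shallow copy with the override applied.

```python
from __future__ import annotations

import copy


class StandInClient:
    """Hypothetical client with a mutable timeout and a copy-on-override helper."""

    def __init__(self, timeout: float = 600.0) -> None:
        self.timeout = timeout

    def with_options(self, *, timeout: float) -> "StandInClient":
        # Return a new instance; the receiver is left untouched.
        clone = copy.copy(self)
        clone.timeout = timeout
        return clone


class StandInProject:
    """Hypothetical project client that caches one shared client."""

    def __init__(self) -> None:
        self._client = StandInClient()

    def get_openai_client(self) -> StandInClient:
        return self._client


# Approach 1: direct assignment. Every later caller sees the override.
project_a = StandInProject()
project_a.get_openai_client().timeout = 30.0
leaked = project_a.get_openai_client().timeout

# Approach 2: with_options. Only the returned copy carries the override.
project_b = StandInProject()
scoped = project_b.get_openai_client().with_options(timeout=30.0)
untouched = project_b.get_openai_client().timeout
```

Here `leaked` ends up as the overridden value while `untouched` keeps the
original one, which is the isolation property the fix relies on.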

Update tests to assert with_options is called with the correct timeout
and that the original shared client's timeout attribute is not mutated.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(foundry): assert with_options return value flows to instance.client (#6241)

The four timeout propagation tests verified that with_options was called
but did not confirm that the returned (timeout-configured) client was
actually stored on the instance. A silent discard of the return value
would have left the tests green while the timeout had no effect.

Each test now captures the constructed instance and asserts:
  assert <instance>.client is openai_client_mock.with_options.return_value

Affected tests:
- test_raw_foundry_agent_chat_client_init_applies_timeout_to_openai_client
- test_raw_foundry_agent_chat_client_init_applies_timeout_with_preview_enabled
- test_foundry_agent_chat_client_init_propagates_timeout
- test_foundry_agent_init_propagates_timeout_to_openai_client
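
A sketch of why the identity assertion matters, using `unittest.mock` only.
`ClientUnderTest` and `BuggyClient` are hypothetical stand-ins for a class
whose `__init__` applies the timeout; the buggy variant calls `with_options`
but discards the result, which a call-only assertion would not detect.

```python
from unittest.mock import MagicMock


class ClientUnderTest:
    """Hypothetical class that stores the client returned by with_options."""

    def __init__(self, project_client, timeout=None):
        openai_client = project_client.get_openai_client()
        if timeout is not None:
            openai_client = openai_client.with_options(timeout=timeout)
        self.client = openai_client


class BuggyClient:
    """Hypothetical variant that silently drops the configured client."""

    def __init__(self, project_client, timeout=None):
        openai_client = project_client.get_openai_client()
        if timeout is not None:
            openai_client.with_options(timeout=timeout)  # return value discarded
        self.client = openai_client


def stores_configured_client(cls) -> bool:
    """Build an instance against mocks and report whether the copy was kept."""
    openai_client_mock = MagicMock()
    project_client = MagicMock()
    project_client.get_openai_client.return_value = openai_client_mock

    instance = cls(project_client, timeout=30.0)

    # Both variants pass this call-level check.
    openai_client_mock.with_options.assert_called_once_with(timeout=30.0)
    # Only the correct variant passes the identity check.
    return instance.client is openai_client_mock.with_options.return_value


good = stores_configured_client(ClientUnderTest)
bad = stores_configured_client(BuggyClient)
```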

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Evan Mattson
2026-06-05 03:25:18 +09:00
committed by GitHub
parent bc0e65d716
commit 6b94315161
5 changed files with 173 additions and 3 deletions
@@ -191,6 +191,7 @@ class RawFoundryAgentChatClient( # type: ignore[misc]
compaction_strategy: CompactionStrategy | None = None,
tokenizer: TokenizerProtocol | None = None,
additional_properties: dict[str, Any] | None = None,
+timeout: float | None = None,
) -> None:
"""Initialize a raw Foundry Agent client.
@@ -211,6 +212,8 @@ class RawFoundryAgentChatClient( # type: ignore[misc]
compaction_strategy: Optional per-client compaction override.
tokenizer: Optional tokenizer for compaction strategies.
additional_properties: Additional properties stored on the client instance.
+timeout: HTTP timeout in seconds for requests. When not provided, the
+    OpenAI SDK default is used (connect: 5s, total: 600s).
"""
settings = load_settings(
FoundryAgentSettings,
@@ -260,8 +263,11 @@ class RawFoundryAgentChatClient( # type: ignore[misc]
openai_client_kwargs["default_headers"] = dict(default_headers)
if allow_preview:
openai_client_kwargs["agent_name"] = self.agent_name
+openai_client = self.project_client.get_openai_client(**openai_client_kwargs)
+if timeout is not None:
+    openai_client = openai_client.with_options(timeout=timeout)
super().__init__(
-async_client=self.project_client.get_openai_client(**openai_client_kwargs),
+async_client=openai_client,
default_headers=default_headers,
instruction_role=instruction_role,
compaction_strategy=compaction_strategy,
@@ -537,6 +543,7 @@ class _FoundryAgentChatClient( # type: ignore[misc]
additional_properties: dict[str, Any] | None = None,
middleware: (Sequence[ChatAndFunctionMiddlewareTypes] | None) = None,
function_invocation_configuration: FunctionInvocationConfiguration | None = None,
+timeout: float | None = None,
) -> None:
"""Initialize a Foundry Agent client with full middleware support.
@@ -556,6 +563,8 @@ class _FoundryAgentChatClient( # type: ignore[misc]
additional_properties: Additional properties stored on the client instance.
middleware: Optional sequence of middleware.
function_invocation_configuration: Optional function invocation configuration.
+timeout: HTTP timeout in seconds for requests. When not provided, the
+    OpenAI SDK default is used (connect: 5s, total: 600s).
"""
super().__init__(
project_endpoint=project_endpoint,
@@ -573,6 +582,7 @@ class _FoundryAgentChatClient( # type: ignore[misc]
additional_properties=additional_properties,
middleware=middleware,
function_invocation_configuration=function_invocation_configuration,
+timeout=timeout,
)
@@ -625,6 +635,7 @@ class RawFoundryAgent( # type: ignore[misc]
compaction_strategy: CompactionStrategy | None = None,
tokenizer: TokenizerProtocol | None = None,
additional_properties: Mapping[str, Any] | None = None,
+timeout: float | None = None,
) -> None:
"""Initialize a Foundry Agent.
@@ -657,6 +668,8 @@ class RawFoundryAgent( # type: ignore[misc]
compaction_strategy: Optional agent-level in-run compaction override.
tokenizer: Optional agent-level tokenizer override.
additional_properties: Additional properties stored on the local agent wrapper.
+timeout: HTTP timeout in seconds for requests. When not provided, the
+    OpenAI SDK default is used (connect: 5s, total: 600s).
"""
# Create the client
actual_client_type = client_type or _FoundryAgentChatClient
@@ -675,6 +688,7 @@ class RawFoundryAgent( # type: ignore[misc]
"default_headers": default_headers,
"env_file_path": env_file_path,
"env_file_encoding": env_file_encoding,
+"timeout": timeout,
}
if function_invocation_configuration is not None:
if not issubclass(actual_client_type, FunctionInvocationLayer):
@@ -912,6 +926,7 @@ class FoundryAgent( # type: ignore[misc]
compaction_strategy: CompactionStrategy | None = None,
tokenizer: TokenizerProtocol | None = None,
additional_properties: Mapping[str, Any] | None = None,
+timeout: float | None = None,
) -> None:
"""Initialize a Foundry Agent with full middleware and telemetry.
@@ -958,6 +973,8 @@ class FoundryAgent( # type: ignore[misc]
compaction_strategy: Optional agent-level in-run compaction override.
tokenizer: Optional agent-level tokenizer override.
additional_properties: Additional properties stored on the local agent wrapper.
+timeout: HTTP timeout in seconds for requests. When not provided, the
+    OpenAI SDK default is used (connect: 5s, total: 600s).
"""
super().__init__(
project_endpoint=project_endpoint,
@@ -983,4 +1000,5 @@ class FoundryAgent( # type: ignore[misc]
compaction_strategy=compaction_strategy,
tokenizer=tokenizer,
additional_properties=additional_properties,
+timeout=timeout,
)