- treat tools as a global allowlist across built-in, extension, and SDK tools
- remove process-cwd singleton tool usage from SDK and CLI paths
- add regression coverage for extension tool filtering
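
For illustration, a minimal sketch of what a single global allowlist across tool sources could look like. The ToolDef shape, the filterTools helper, and the tool names are hypothetical and not taken from the codebase; the only point is that one filter applies to built-in, extension, and SDK tools alike.

```typescript
// Hypothetical tool shape; the real definitions likely carry more fields.
type ToolSource = "builtin" | "extension" | "sdk";

interface ToolDef {
  name: string;
  source: ToolSource;
}

// Apply one allowlist to every tool, regardless of where it came from.
// An undefined allowlist is assumed here to mean "no filtering".
function filterTools(all: ToolDef[], allowlist?: string[]): ToolDef[] {
  if (allowlist === undefined) return all;
  const allowed = new Set(allowlist);
  return all.filter((tool) => allowed.has(tool.name));
}

const candidates: ToolDef[] = [
  { name: "read", source: "builtin" },
  { name: "bash", source: "builtin" },
  { name: "deploy", source: "extension" },
  { name: "lookup", source: "sdk" },
];

// Only "read" and "deploy" survive; the extension tool is filtered by the
// same rule as the built-in ones.
const active = filterTools(candidates, ["read", "deploy"]);
```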
closes #3452
closes #2835
When auto-retry fires after a retryable error (e.g. overloaded_error) and the
retry response includes tool_use, session.prompt() returned prematurely because
_resolveRetry() was called on the first successful message_end — while the
agent loop was still executing tools via the fire-and-forget agent.continue().
This caused callers to observe isStreaming=true after prompt() returned, and
follow-up session.prompt() calls threw 'Agent is already processing'. The
tool execution results were silently lost.
Fix: move _resolveRetry() from the message_end handler to the agent_end
handler. The _retryAttempt counter reset stays on message_end (preventing
accumulation across LLM calls within a turn), but the promise that unblocks
waitForRetry() now only resolves when the full agent loop completes.
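
A simplified sketch of the event handling after this change, assuming a deferred promise behind waitForRetry(). The RetryTracker class, beginRetry(), and the attempts/pending getters are hypothetical stand-ins for the session internals, not the actual implementation.

```typescript
type AgentEvent = { type: "message_end" } | { type: "agent_end" };

class RetryTracker {
  private retryAttempt = 0;
  private retryPromise: Promise<void> | undefined;
  private resolveRetry: (() => void) | undefined;

  // Called when a retryable error schedules another attempt.
  beginRetry(): void {
    this.retryAttempt += 1;
    if (this.retryPromise === undefined) {
      this.retryPromise = new Promise<void>((resolve) => {
        this.resolveRetry = resolve;
      });
    }
  }

  handleEvent(event: AgentEvent): void {
    if (event.type === "message_end") {
      // Reset the counter so attempts do not accumulate across LLM calls
      // within a turn. The retry promise is deliberately left pending,
      // because tools may still be running.
      this.retryAttempt = 0;
    } else {
      // agent_end: the full agent loop has finished, so unblock waiters.
      this.resolveRetry?.();
      this.resolveRetry = undefined;
      this.retryPromise = undefined;
    }
  }

  // Resolves immediately when no retry is in flight.
  waitForRetry(): Promise<void> {
    return this.retryPromise ?? Promise.resolve();
  }

  get attempts(): number {
    return this.retryAttempt;
  }

  get pending(): boolean {
    return this.retryPromise !== undefined;
  }
}
```

With this ordering, a caller awaiting waitForRetry() stays blocked through message_end and is released only on agent_end.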
The _isRetryableError() regex used literal spaces ("server error",
"internal error") but Codex SSE error events use underscores
("server_error"). Change the separator to .? so that a space, an
underscore, or no separator at all is matched, enabling automatic retry
on transient Codex SSE errors.
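
A small sketch of the separator change. The pattern below is a hypothetical reduction to the two phrases mentioned above; the real _isRetryableError() regex likely covers more cases, and the case-insensitive flag is an assumption.

```typescript
// Hypothetical reduced pattern: `.?` accepts a space, an underscore,
// or no separator between the two words.
const RETRYABLE_PATTERN = /server.?error|internal.?error/i;

function isRetryableError(message: string): boolean {
  return RETRYABLE_PATTERN.test(message);
}

isRetryableError("server error");    // true (literal space, matched before the change too)
isRetryableError("server_error");    // true (Codex SSE style)
isRetryableError("internalerror");   // true (direct concatenation)
isRetryableError("invalid request"); // false
```

Note that .? accepts any single character as the separator, so the pattern is slightly broader than "space or underscore".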
fixes #2091