fix(agent): timeout _execute_initial_actions so empty-history hangs cannot happen

The main execution loop already wraps _execute_step with asyncio.wait_for
using settings.step_timeout (default 180s). But _execute_initial_actions,
which runs before the main loop, is unwrapped — if it hangs (e.g. the
first navigate stalls on a silent CDP WebSocket before the per-action
timeout can catch it), the agent blocks indefinitely without ever
entering the main loop. No step gets recorded, history stays empty, and
any outer watchdog eventually kills the run with zero diagnostic data.

Wrap _execute_initial_actions with the same step_timeout. On timeout,
record the failure in state.last_result / consecutive_failures and fall
through to the main execution loop so the agent can still attempt to
recover. InterruptedError (from an interrupting callback) is still
swallowed silently — same contract as before.
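
For illustration, the control flow described above can be sketched in
isolation roughly as follows. This is a minimal sketch, not the agent's
actual code: SketchState, run_initial_actions, stalled_navigate and
interrupted are hypothetical stand-ins, and the real change stores an
ActionResult rather than a plain string.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class SketchState:
    # Hypothetical stand-in for the agent state named in the commit message.
    last_result: list[str] = field(default_factory=list)
    consecutive_failures: int = 0


async def run_initial_actions(initial_actions, state: SketchState, step_timeout: float) -> None:
    """Bound the initial actions by step_timeout and record a failure on timeout."""
    try:
        await asyncio.wait_for(initial_actions(), timeout=step_timeout)
    except InterruptedError:
        # Same contract as before: an interrupting callback is swallowed silently.
        pass
    except asyncio.TimeoutError:
        # Record the failure and return normally so the caller's main loop still runs.
        state.last_result = [f'Initial actions timed out after {step_timeout}s']
        state.consecutive_failures += 1


async def stalled_navigate() -> None:
    # Simulates a navigate that never returns: a future that nobody resolves.
    await asyncio.get_running_loop().create_future()


async def interrupted() -> None:
    # Simulates a callback that interrupts the initial actions.
    raise InterruptedError


def demo() -> SketchState:
    state = SketchState()
    asyncio.run(run_initial_actions(stalled_navigate, state, step_timeout=0.05))
    return state
```

Returning normally from the timeout branch, instead of re-raising, is
what lets the caller proceed to the main loop with the failure already
recorded.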

Paired with the per-action asyncio.wait_for added in tools/service.py,
this closes the last unprotected path in the pre-main-loop flow.
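
As a rough sketch of why the two guards complement each other: a
per-action timeout only covers the awaited action itself, so a stall
that happens outside it still needs an outer bound. All names below
(never_returns, run_action, initial_actions, guarded) and the timeout
values are assumptions made for the example, not the code in
tools/service.py.

```python
import asyncio


async def never_returns() -> None:
    # A future that nobody resolves, standing in for a silent connection.
    await asyncio.get_running_loop().create_future()


async def run_action(action, action_timeout: float) -> str:
    # Inner guard: bounds a single action.
    try:
        await asyncio.wait_for(action(), timeout=action_timeout)
        return 'ok'
    except asyncio.TimeoutError:
        return 'action timeout'


async def initial_actions(stall_before_action: bool) -> str:
    if stall_before_action:
        # A stall outside the inner guard, which the per-action timeout never sees.
        await never_returns()
    return await run_action(never_returns, action_timeout=0.05)


async def guarded(stall_before_action: bool, step_timeout: float = 0.2) -> str:
    # Outer guard: bounds the whole pre-loop phase.
    try:
        return await asyncio.wait_for(initial_actions(stall_before_action), timeout=step_timeout)
    except asyncio.TimeoutError:
        return 'step timeout'
```

In this sketch the inner guard fires first when the action itself
stalls, and the outer guard is the backstop when the stall occurs
before the inner guard is ever reached.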
Saurav Panda
2026-04-20 16:39:22 -07:00
parent d2985dcab9
commit a97ba48345

@@ -2552,11 +2552,25 @@ class Agent(Generic[Context, AgentStructuredOutput]):
 # Register skills as actions if SkillService is configured
 await self._register_skills_as_actions()
-# Normally there was no try catch here but the callback can raise an InterruptedError
+# Normally there was no try catch here but the callback can raise an InterruptedError.
+# Wrap with step_timeout so initial actions (usually a single URL navigate) can't
+# hang indefinitely on a silent CDP WebSocket — without this the agent would take
+# zero steps and return with an empty history while any outer watchdog waits.
 try:
-    await self._execute_initial_actions()
+    await asyncio.wait_for(
+        self._execute_initial_actions(),
+        timeout=self.settings.step_timeout,
+    )
 except InterruptedError:
     pass
+except TimeoutError:
+    initial_timeout_msg = (
+        f'Initial actions timed out after {self.settings.step_timeout}s '
+        f'(browser may be unresponsive). Proceeding to main execution loop.'
+    )
+    self.logger.error(f'{initial_timeout_msg}')
+    self.state.last_result = [ActionResult(error=initial_timeout_msg)]
+    self.state.consecutive_failures += 1
 except Exception as e:
     raise e