browser-use/eval
Magnus Müller 58c3b25cfd Complete last_message flow to comprehensive judge
- Update reformat_agent_history to accept and save last_message parameter
- Ensure last_message flows from run_agent_with_browser to comprehensive judge
- Add last_message to result.json structure for judge evaluation
- Judge now has complete context including agent's final reasoning
2025-06-25 10:34:50 +02:00
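
The flow described in the commit message can be illustrated with a minimal sketch. It is not the repository's code: the function names `reformat_agent_history_sketch` and `build_judge_context`, their signatures, and every field in the result dictionary other than `last_message` are hypothetical stand-ins. It assumes only what the commit states, namely that the agent's final message is passed along, saved in result.json, and made available to the judge.

```python
import json
from pathlib import Path
from typing import Any, Optional


def reformat_agent_history_sketch(
    history: list[dict[str, Any]],
    task_id: str,
    out_dir: Path,
    last_message: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical stand-in: build the result dict and save it as result.json."""
    result = {
        "task_id": task_id,            # assumed field, for illustration only
        "steps": history,              # assumed field, for illustration only
        "last_message": last_message,  # the field the commit adds
    }
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "result.json").write_text(
        json.dumps(result, indent=2), encoding="utf-8"
    )
    return result


def build_judge_context(result_path: Path) -> str:
    """Hypothetical judge-side helper: read result.json and include the final message."""
    result = json.loads(result_path.read_text(encoding="utf-8"))
    final = result.get("last_message") or "(no final message recorded)"
    return (
        f"Task {result['task_id']}: {len(result['steps'])} steps\n"
        f"Final agent message: {final}"
    )
```

Defaulting `last_message` to `None` in the sketch keeps older callers working, and the judge-side helper falls back to a placeholder string when the field is missing or empty.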