sim · medium · vision-robot · metric · varies

Learning from Trials and Errors: Reflective Test-Time Planning for Embodied LLMs

Description

Embodied LLMs endow robots with high-level task reasoning, but they cannot reflect on what went wrong or why, turning deployment into a sequence of independent trials where mistakes repeat rather than accumulate into experience. Drawing upon human reflective practitioners, we introduce Reflective Test-Time Planning, which integrates two modes of reflection: reflection-in-action, where the agent uses test-time scaling to generate and score multiple candidate actions using internal reflec
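
The generate-and-score step described above can be illustrated with a minimal best-of-N sketch. It is not the paper's implementation: `select_action`, `toy_propose`, `toy_score`, and the hard-coded scores are hypothetical stand-ins for an LLM proposer and an internal scoring step, whose actual form the truncated description does not specify.

```python
from itertools import cycle
from typing import Callable, Tuple


def select_action(
    propose: Callable[[str], str],
    score: Callable[[str, str], float],
    observation: str,
    num_candidates: int = 4,
) -> Tuple[str, float]:
    """Sample several candidate actions and return the highest-scoring one."""
    if num_candidates < 1:
        raise ValueError("num_candidates must be at least 1")
    # Generate candidates; in a real system each call would query the model.
    candidates = [propose(observation) for _ in range(num_candidates)]
    # Score every candidate against the current observation.
    scored = [(score(observation, action), action) for action in candidates]
    best_score, best_action = max(scored, key=lambda pair: pair[0])
    return best_action, best_score


# Hypothetical stand-ins so the sketch runs without a model.
_toy_actions = cycle(["open drawer", "pick up mug", "push mug"])
_toy_scores = {"open drawer": 0.2, "pick up mug": 0.9, "push mug": 0.4}


def toy_propose(observation: str) -> str:
    # Assumption: a fixed rotation of actions replaces LLM sampling.
    return next(_toy_actions)


def toy_score(observation: str, action: str) -> float:
    # Assumption: a lookup table replaces the agent's internal scoring.
    return _toy_scores[action]


best_action, best_score = select_action(
    toy_propose, toy_score, "mug on table, drawer closed", num_candidates=3
)
print(best_action, best_score)
```

Passing the proposer and scorer as callables keeps the selection logic independent of any particular model, so the toy functions could be swapped for real model calls without changing `select_action`.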

Source

http://arxiv.org/abs/2602.21198v1