sim · medium · navigation · metric: varies
Think, Remember, Navigate: Zero-Shot Object-Goal Navigation with VLM-Powered Reasoning
Description
While Vision-Language Models (VLMs) are set to transform robotic navigation, existing methods often underutilize their reasoning capabilities. To unlock the full potential of VLMs in robotics, we shift their role from passive observers to active strategists in the navigation process. Our framework outsources high-level planning to a VLM, which leverages its contextual understanding to guide a frontier-based exploration agent. This intelligent guidance is achieved through a trio of techniques: st
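The core loop described above, a frontier-based explorer whose next target is chosen by a high-level reasoner, can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the occupancy-grid encoding, the `find_frontiers` / `select_frontier` helpers, and the stand-in scoring function (which substitutes a simple distance heuristic for the actual VLM query) are all assumptions made for the example.

```python
# Minimal sketch of VLM-guided frontier exploration (hypothetical names).
# Grid cells: UNKNOWN (unexplored), FREE (explored, traversable), OBSTACLE.
UNKNOWN, FREE, OBSTACLE = -1, 0, 1

def find_frontiers(grid):
    """Return FREE cells that border at least one UNKNOWN cell."""
    rows, cols = len(grid), len(grid[0])
    frontiers = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != FREE:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break
    return frontiers

def select_frontier(frontiers, score_fn):
    """Pick the frontier the high-level planner scores highest.

    In the framework described above, score_fn would query a VLM with the
    goal description and scene context; here it is any callable.
    """
    return max(frontiers, key=score_fn)

if __name__ == "__main__":
    grid = [
        [ 0,  0,  0, -1, -1],
        [ 0,  1,  0, -1, -1],
        [ 0,  0,  0,  0,  0],
        [-1, -1,  0,  0,  0],
        [-1, -1, -1,  0,  0],
    ]
    # Stand-in for the VLM: suppose it reasoned the goal is likely near
    # the top-right region, so we reward proximity to that hint.
    hint = (0, 4)
    score = lambda f: -(abs(f[0] - hint[0]) + abs(f[1] - hint[1]))
    frontiers = find_frontiers(grid)
    best = select_frontier(frontiers, score)
    print(frontiers)
    print(best)
```

The design point is the separation of concerns: geometric frontier detection stays classical and cheap, while the semantic decision of *which* frontier to pursue is delegated to the reasoner behind `score_fn`.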