sim · medium · quadruped · metric · varies
MAG-Nav: Language-Driven Object Navigation Leveraging Memory-Reserved Active Grounding
Description
Visual navigation in unknown environments based solely on natural language descriptions is a key capability for intelligent robots. In this work, we propose a navigation framework built upon off-the-shelf Visual Language Models (VLMs), enhanced with two human-inspired mechanisms: perspective-based active grounding, which dynamically adjusts the robot's viewpoint for improved visual inspection, and historical memory backtracking, which enables the system to retain and re-evaluate uncertain observ