sim · medium · quadruped · metric · varies

MAG-Nav: Language-Driven Object Navigation Leveraging Memory-Reserved Active Grounding

Description

Visual navigation in unknown environments based solely on natural language descriptions is a key capability for intelligent robots. In this work, we propose a navigation framework built upon off-the-shelf Visual Language Models (VLMs), enhanced with two human-inspired mechanisms: perspective-based active grounding, which dynamically adjusts the robot's viewpoint for improved visual inspection, and historical memory backtracking, which enables the system to retain and re-evaluate uncertain observ

Source

http://arxiv.org/abs/2508.05021v1