sim · medium · navigation · metric · varies

VLN-Pilot: Large Vision-Language Model as an Autonomous Indoor Drone Operator

Description

This paper introduces VLN-Pilot, a novel framework in which a large Vision-and-Language Model (VLLM) assumes the role of a human pilot for indoor drone navigation. By leveraging the multimodal reasoning abilities of VLLMs, VLN-Pilot interprets free-form natural language instructions and grounds them in visual observations to plan and execute drone trajectories in GPS-denied indoor environments. Unlike traditional rule-based or geometric path-planning approaches, our framework integrates language…

Source

http://arxiv.org/abs/2602.05552v1