sim · medium · robotics · metric · varies

FloorPlan-VLN: A New Paradigm for Floor Plan Guided Vision-Language Navigation

Description

Existing Vision-Language Navigation (VLN) tasks require agents to follow verbose instructions while ignoring potentially useful global spatial priors, which limits their ability to reason about spatial structure. Although human-readable spatial schematics (e.g., floor plans) are ubiquitous in real-world buildings, current agents lack the cognitive ability to comprehend and use them. To bridge this gap, we introduce FloorPlan-VLN, a new paradigm that leverages structured semantic flo

Source

http://arxiv.org/abs/2603.17437v1