sim · medium · navigation · metric · varies

NaviDriveVLM: Decoupling High-Level Reasoning and Motion Planning for Autonomous Driving

Description

Vision-language models (VLMs) have emerged as a promising direction for end-to-end autonomous driving (AD) by jointly modeling visual observations, driving context, and language-based reasoning. However, existing VLM-based systems face a trade-off between high-level reasoning and motion planning: large models offer strong semantic understanding but are costly to adapt for precise control, whereas small VLMs can be fine-tuned efficiently but often exhibit weaker reasoning. We propose NaviDr

Source

http://arxiv.org/abs/2603.07901v1