sim · medium · navigation · metric · varies
NaviDriveVLM: Decoupling High-Level Reasoning and Motion Planning for Autonomous Driving
Description
Vision-language models (VLMs) have emerged as a promising direction for end-to-end autonomous driving (AD) by jointly modeling visual observations, driving context, and language-based reasoning. However, existing VLM-based systems face a trade-off between high-level reasoning and motion planning: large models offer strong semantic understanding but are costly to adapt for precise control, whereas small VLMs can be fine-tuned efficiently but often exhibit weaker reasoning. We propose NaviDr