sim · medium · robotics · metric · varies

Learning from Mistakes: Post-Training for Driving VLA with Takeover Data

Description

Current Vision-Language-Action (VLA) paradigms in end-to-end autonomous driving rely on offline training from static datasets, leaving them vulnerable to distribution shift. Recent post-training methods use takeover data to mitigate this by augmenting the dataset with high-quality expert takeover samples, yet they suffer from two key limitations: supervision restricted to the period after the takeover moments leads to policies with limited safety margins, and passive preference optimization lack

Source

http://arxiv.org/abs/2603.14972v1