sim · medium · navigation · metric · varies
LongNav-R1: Horizon-Adaptive Multi-Turn RL for Long-Horizon VLA Navigation
Description
This paper develops LongNav-R1, an end-to-end multi-turn reinforcement learning (RL) framework designed to optimize Visual-Language-Action (VLA) models for long-horizon navigation. Unlike the existing single-turn paradigm, LongNav-R1 reformulates the navigation decision process as a continuous multi-turn conversation between the VLA policy and the embodied environment. This multi-turn RL framework offers two distinct advantages: i) it enables the agent to reason about the causal effects of historica