sim · medium · imitation · metric · varies

DAgger Diffusion Navigation: DAgger Boosted Diffusion Policy for Vision-Language Navigation

Description

Vision-Language Navigation in Continuous Environments (VLN-CE) requires agents to follow natural language instructions through free-form 3D spaces. Existing VLN-CE approaches typically use a two-stage waypoint planning framework, where a high-level waypoint predictor generates the navigable waypoints, and then a navigation planner suggests the intermediate goals in the high-level action space. However, this two-stage decomposition framework suffers from: (1) global sub-optimization due to the pr
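The two-stage waypoint planning framework described above can be sketched roughly as follows. This is a toy illustration only, not the paper's code or any existing VLN-CE baseline: the names `predict_waypoints` and `choose_goal`, the evenly spaced candidates, and the heading-alignment score are all hypothetical stand-ins for learned components.

```python
import math
from typing import List, Tuple

# (x, y) offset in metres relative to the agent (assumed convention for this sketch)
Waypoint = Tuple[float, float]


def predict_waypoints(num_candidates: int = 8, radius: float = 1.5) -> List[Waypoint]:
    """Stage 1 (hypothetical stand-in for a learned waypoint predictor):
    propose candidate waypoints evenly spaced on a circle around the agent."""
    return [
        (
            radius * math.cos(2 * math.pi * k / num_candidates),
            radius * math.sin(2 * math.pi * k / num_candidates),
        )
        for k in range(num_candidates)
    ]


def choose_goal(candidates: List[Waypoint], instruction_heading: float) -> Waypoint:
    """Stage 2 (hypothetical stand-in for a navigation planner):
    pick the candidate whose direction best matches a heading (radians)
    assumed to have been derived from the language instruction."""

    def alignment(wp: Waypoint) -> float:
        # Cosine of the angular difference: 1.0 means perfectly aligned.
        return math.cos(math.atan2(wp[1], wp[0]) - instruction_heading)

    return max(candidates, key=alignment)


# Usage: the planner selects an intermediate goal from the predictor's candidates.
candidates = predict_waypoints()
goal = choose_goal(candidates, instruction_heading=math.pi / 2)
```

In this sketch the second stage can only select among the candidates that the first stage proposes, so the two stages are decided separately rather than jointly.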

Source

http://arxiv.org/abs/2508.09444v1