sim · medium · manipulation · metric · varies

LiLo-VLA: Compositional Long-Horizon Manipulation via Linked Object-Centric Policies

Description

General-purpose robots must master long-horizon manipulation, defined as tasks involving multiple kinematic structure changes (e.g., attaching or detaching objects) in unstructured environments. While Vision-Language-Action (VLA) models offer the potential to master diverse atomic skills, they struggle with the combinatorial complexity of sequencing them and are prone to cascading failures due to environmental sensitivity. To address these challenges, we propose LiLo-VLA (Linked Local VLA), a mo

Source

http://arxiv.org/abs/2602.21531v1