sim · medium · manipulation-data · metric · varies

Learning Humanoid End-Effector Control for Open-Vocabulary Visual Loco-Manipulation

Description

Visual loco-manipulation of arbitrary objects in the wild with humanoid robots requires accurate end-effector (EE) control and a generalizable understanding of the scene via visual inputs (e.g., RGB-D images). Existing approaches are based on real-world imitation learning and exhibit limited generalization due to the difficulty in collecting large-scale training datasets. This paper presents a new paradigm, HERO, for object loco-manipulation with humanoid robots that combines the strong generali

Source

http://arxiv.org/abs/2602.16705v2