sim · medium · humanoid · metric: varies

HumanoidVLM: Vision-Language-Guided Impedance Control for Contact-Rich Humanoid Manipulation

Description

Humanoid robots must adapt their contact behavior to diverse objects and tasks, yet most controllers rely on fixed, hand-tuned impedance gains and gripper settings. This paper introduces HumanoidVLM, a vision-language-driven retrieval framework that enables the Unitree G1 humanoid to select task-appropriate Cartesian impedance parameters and gripper configurations directly from an egocentric RGB image. The system couples a vision-language model for semantic task inference with a FAISS-based retrieval module.
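
The sketch below illustrates the general idea of pairing semantic task inference with nearest-neighbor retrieval of impedance parameters. It is not the authors' implementation: the `embed_task_description` encoder, the parameter library entries, and the stiffness/gripper values are all placeholder assumptions, and a real system would obtain the task description from the VLM's reading of the egocentric image.

```python
# Minimal sketch, assuming a hypothetical text encoder and a small hand-built
# library of Cartesian impedance parameter sets; not the paper's actual code.
import numpy as np
import faiss

# Hypothetical library: each entry pairs a task phrase with Cartesian stiffness
# (translational N/m, rotational Nm/rad) and a gripper width in meters.
PARAM_LIBRARY = [
    {"task": "wipe a rigid table surface",
     "stiffness": [300, 300, 150, 30, 30, 30], "gripper_width": 0.08},
    {"task": "insert a peg into a tight hole",
     "stiffness": [800, 800, 400, 60, 60, 60], "gripper_width": 0.02},
    {"task": "hand over a soft plush toy",
     "stiffness": [150, 150, 100, 15, 15, 15], "gripper_width": 0.06},
]

DIM = 64  # embedding dimensionality (placeholder)

def embed_task_description(text: str, dim: int = DIM) -> np.ndarray:
    """Placeholder embedding; a real pipeline would call a VLM / text encoder."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(dim).astype("float32")
    return v / np.linalg.norm(v)

# Build a flat L2 FAISS index over embeddings of the stored task phrases.
index = faiss.IndexFlatL2(DIM)
index.add(np.stack([embed_task_description(e["task"]) for e in PARAM_LIBRARY]))

def retrieve_parameters(inferred_task: str) -> dict:
    """Map an inferred task description to the nearest stored parameter set."""
    query = embed_task_description(inferred_task)[None, :]
    _, idx = index.search(query, 1)          # nearest neighbor only
    return PARAM_LIBRARY[int(idx[0][0])]

if __name__ == "__main__":
    # In the full system the task string would come from the VLM; here it is
    # passed in directly for illustration.
    print(retrieve_parameters("press a sponge against the table to wipe it"))
```

The retrieved stiffness vector and gripper width would then be handed to the robot's Cartesian impedance controller; how HumanoidVLM parameterizes and executes that controller is described in the paper itself.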

Source

http://arxiv.org/abs/2601.14874v1