sim · medium · imitation · metric · varies

CLIP-RT: Learning Language-Conditioned Robotic Policies from Natural Language Supervision

Description

Teaching robots desired skills in real-world environments remains challenging, especially for non-experts. A key bottleneck is that collecting robotic data often requires expertise or specialized hardware, limiting accessibility and scalability. We posit that natural language offers an intuitive and accessible interface for robot learning. To this end, we study two aspects: (1) enabling non-experts to collect robotic data through natural language supervision (e.g., "move the arm to the right") a

Source

http://arxiv.org/abs/2411.00508v4