sim · medium · policy-learning · metric · varies

Human-Robot Copilot for Data-Efficient Imitation Learning

Description

Collecting human demonstrations via teleoperation is a common approach for teaching robots task-specific skills. However, when only a limited number of demonstrations are available, policies are prone to entering out-of-distribution (OOD) states due to compounding errors or environmental stochasticity. Existing interactive imitation learning or human-in-the-loop methods try to address this issue by following the Human-Gated DAgger (HG-DAgger) paradigm, an approach that augments demonstrations th

Source

http://arxiv.org/abs/2604.03613v1