sim · medium · policy-learning · metric: varies

CRAFT: Video Diffusion for Bimanual Robot Data Generation

Description

Bimanual robot learning from demonstrations is fundamentally limited by the cost and narrow visual diversity of real-world data, which constrains policy robustness across viewpoints, object configurations, and embodiments. We present Canny-guided Robot Data Generation using Video Diffusion Transformers (CRAFT), a video diffusion-based framework for scalable bimanual demonstration generation that synthesizes temporally coherent manipulation videos while producing action labels. By conditioning vi

Source

http://arxiv.org/abs/2604.03552v1