sim · medium · sim-to-real · metric · varies

RoboPaint: From Human Demonstration to Any Robot and Any View

Description

Acquiring large-scale, high-fidelity robot demonstration data remains a critical bottleneck for scaling Vision-Language-Action (VLA) models in dexterous manipulation. We propose a Real-Sim-Real data collection and data editing pipeline that transforms human demonstrations into robot-executable, environment-specific training data without direct robot teleoperation. Standardized data collection rooms are built to capture multimodal human demonstrations (synchronized 3 RGB-D videos, 11 RGB videos,

Source

http://arxiv.org/abs/2602.05325v2