sim · medium · sim-to-real · metric · varies

PolaRiS: Scalable Real-to-Sim Evaluations for Generalist Robot Policies

Description

A significant challenge for robot learning research is accurately measuring and comparing the performance of robot policies. Benchmarking in robotics has historically been difficult because real-world rollouts are stochastic, hard to reproduce, and time-consuming. This challenge is exacerbated for recent generalist policies, which must be evaluated across a wide variety of scenes and tasks. Evaluation in simulation offers a scalable complement to real-world evaluations, but th

Source

http://arxiv.org/abs/2512.16881v2