sim · medium · humanoid · metric: varies
RoboTracer: Mastering Spatial Trace with Reasoning in Vision-Language Models for Robotics
Description
Spatial tracing is a fundamental embodied interaction ability for robots, yet it is inherently challenging: it requires multi-step, metric-grounded reasoning combined with complex spatial referring and real-world metric measurement. Existing methods struggle with this compositional task. To this end, we propose RoboTracer, a 3D-aware VLM that first achieves both 3D spatial referring and measuring via a universal spatial encoder and a regression-supervised decoder to enhance scale awareness