sim · medium · manipulation · metric · varies

DeFM: Learning Foundation Representations from Depth for Robotics

Description

Depth sensors are widely deployed across robotic platforms, and advances in fast, high-fidelity depth simulation have enabled robotic policies trained on depth observations to achieve robust sim-to-real transfer for a wide range of tasks. Despite this, representation learning for the depth modality remains underexplored compared to RGB, where large-scale foundation models now define the state of the art. To address this gap, we present DeFM, a self-supervised foundation model trained entirely on dep

Source

http://arxiv.org/abs/2601.18923v1