sim · medium · manipulation · metric · varies

PocketDP3: Efficient Pocket-Scale 3D Visuomotor Policy

Description

Recently, 3D vision-based diffusion policies have shown strong capability in learning complex robotic manipulation skills. However, a common architectural mismatch exists in these models: a tiny yet efficient point-cloud encoder is often paired with a massive decoder. Given a compact scene representation, we argue that this may lead to substantial parameter waste in the decoder. Motivated by this observation, we propose PocketDP3, a pocket-scale 3D diffusion policy that replaces the heavy condit

Source

http://arxiv.org/abs/2601.22018v2