sim · medium · vision-robot · metric · varies

Draft-and-Target Sampling for Video Generation Policy

Description

Video generation models have been used as robot policies that predict the future states of a task execution, conditioned on a task description and an observation. Previous works ignore their high computational cost and long inference time. To address this challenge, we propose Draft-and-Target Sampling, a novel, training-free diffusion inference paradigm for video generation policies that can improve inference efficiency. We introduce a self-play denoising approach by utilizing two complementary den

Source

http://arxiv.org/abs/2603.13438v1