sim · medium · imitation · metric · varies

Exploring Conditions for Diffusion models in Robotic Control

Description

While pre-trained visual representations have significantly advanced imitation learning, they are often task-agnostic because they remain frozen during policy learning. In this work, we explore leveraging pre-trained text-to-image diffusion models to obtain task-adaptive visual representations for robotic control, without fine-tuning the model itself. However, we find that naively applying textual conditions, a successful strategy in other vision domains, yields minimal or even negative gains in co

Source

http://arxiv.org/abs/2510.15510v2