sim · medium · mobile-manipulation · metric · varies

GigaBrain-0: A World Model-Powered Vision-Language-Action Model

Description

Training Vision-Language-Action (VLA) models for generalist robots typically requires large-scale real-world robot data, which is expensive and time-consuming to collect. The inefficiency of physical data collection severely limits the scalability and generalization capacity of current VLA systems. To address this challenge, we introduce GigaBrain-0, a novel VLA foundation model empowered by world model-generated data (e.g., video generation, real2real transfer, human transfer, view transfer, s

Source

http://arxiv.org/abs/2510.19430v3