policy

OpenVLA

Stanford / UC Berkeley / Google DeepMind / TRI · PyTorch


Overview

Name
OpenVLA
Author
Stanford / UC Berkeley / Google DeepMind / TRI
Framework
PyTorch
License
MIT
Skill type
manipulation
Evidence level
verified
Task description
7B-parameter vision-language-action model trained on 970K real-world robot demonstrations from Open X-Embodiment. A fused SigLIP + DINOv2 visual encoder feeds a Llama 2 backbone. Given an RGB image and a language instruction, the model outputs tokenized robot actions. It can be fine-tuned with LoRA on consumer GPUs.
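To make "tokenized robot actions" concrete, here is a minimal sketch of one way a continuous 7-dim action can be turned into discrete tokens and back. It assumes uniform binning into 256 bins over a normalized [-1, 1] range; the bin count, the range, and the function names `discretize` and `undiscretize` are illustrative assumptions and are not taken from this listing.

```python
from typing import List, Sequence

N_BINS = 256            # assumed number of bins per action dimension
LOW, HIGH = -1.0, 1.0   # assumed normalized action range


def discretize(action: Sequence[float], n_bins: int = N_BINS) -> List[int]:
    """Map each continuous action dimension to an integer bin index."""
    width = (HIGH - LOW) / n_bins
    tokens = []
    for value in action:
        clipped = min(max(value, LOW), HIGH)   # out-of-range values saturate
        index = int((clipped - LOW) / width)
        tokens.append(min(index, n_bins - 1))  # HIGH itself falls in the last bin
    return tokens


def undiscretize(tokens: Sequence[int], n_bins: int = N_BINS) -> List[float]:
    """Map bin indices back to the center of each bin."""
    width = (HIGH - LOW) / n_bins
    return [LOW + (t + 0.5) * width for t in tokens]


if __name__ == "__main__":
    action = [0.0, -1.0, 1.0, 0.5, -0.5, 0.25, 0.1]  # hypothetical 7-dim action
    tokens = discretize(action)
    print(tokens)
    print(undiscretize(tokens))
```

Under these assumptions, the round trip loses at most half a bin width per dimension, which is the price of predicting actions as discrete tokens.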

Spaces

Action space
end-effector-pose · 7-dim · 5Hz
Observation space
  • type: multimodal
  • primary_rgb (224x224)
  • language_instruction
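The spaces listed above can be written down as a small interface check. The following is a sketch only: the class name `PolicySpec`, its methods, and the assumption that the RGB image has 3 channels in height-width-channel order are hypothetical and not part of any OpenVLA API. Only the 224x224 image size, the 7-dim action, and the 5 Hz rate come from this listing.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class PolicySpec:
    """Hypothetical description of the policy's input/output contract."""
    image_size: Tuple[int, int] = (224, 224)  # primary_rgb resolution
    action_dim: int = 7                       # end-effector-pose action
    control_hz: float = 5.0                   # listed action rate

    @property
    def control_period_s(self) -> float:
        """Seconds between consecutive actions at the listed rate."""
        return 1.0 / self.control_hz

    def check_observation(self, image_shape: Tuple[int, int, int],
                          instruction: str) -> None:
        """Raise ValueError if the observation does not match the spec."""
        height, width, channels = image_shape
        # Assumes 3-channel RGB in (height, width, channels) order.
        if (height, width) != self.image_size or channels != 3:
            raise ValueError(f"unexpected image shape: {image_shape}")
        if not instruction.strip():
            raise ValueError("language instruction must be non-empty")

    def check_action(self, action: Sequence[float]) -> None:
        """Raise ValueError if the action has the wrong dimensionality."""
        if len(action) != self.action_dim:
            raise ValueError(f"expected {self.action_dim} dims, got {len(action)}")


if __name__ == "__main__":
    spec = PolicySpec()
    spec.check_observation((224, 224, 3), "pick up the cup")
    spec.check_action([0.0] * 7)
    print(f"one action every {spec.control_period_s} s")
```

A check like this is useful at the boundary between a robot driver and the policy, where a mismatched camera resolution or action length would otherwise fail silently.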

Links

Compatible robots

0 in catalog, plus 2 mentioned but not in the catalog yet

No robots list OpenVLA as compatible yet.

Compatible environments

2

Datasets that reference this policy

0

No datasets reference OpenVLA yet.