policy

VoxPoser

Stanford · PyTorch

Overview

Name
VoxPoser
Author
Stanford
Framework
PyTorch
License
MIT
Skill type
manipulation
Evidence level
verified
Task description
Composes 3D voxel value maps for robot manipulation from LLM-generated code. An LLM writes Python code that calls open-vocabulary vision models to build 3D affordance and constraint maps. A model predictive control (MPC) planner then synthesizes closed-loop 6-DoF trajectories from these maps. Designed for zero-shot generalization to novel tasks and objects.
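To make the idea of composing voxel maps concrete, here is a minimal sketch in NumPy. It is not VoxPoser's implementation: the function names, the distance-based cost, the `avoid_radius` and `avoid_weight` parameters, and the greedy neighbour search (a stand-in for the MPC planner named above) are all illustrative assumptions.

```python
import numpy as np

def distance_map(shape, voxel):
    """Euclidean distance from every voxel in the grid to `voxel` (hypothetical helper)."""
    grid = np.indices(shape).astype(float)  # shape (3, X, Y, Z)
    offset = grid - np.asarray(voxel, dtype=float).reshape(3, 1, 1, 1)
    return np.linalg.norm(offset, axis=0)

def compose_cost_map(shape, target, obstacle, avoid_radius=3.0, avoid_weight=10.0):
    """Sum an affordance term (low near target) and a constraint term (high near obstacle)."""
    affordance_cost = distance_map(shape, target)
    obstacle_dist = distance_map(shape, obstacle)
    # Penalty grows linearly as we get closer than `avoid_radius` to the obstacle.
    avoidance_cost = avoid_weight * np.clip(avoid_radius - obstacle_dist, 0.0, None)
    return affordance_cost + avoidance_cost

def greedy_path(cost, start, max_steps=100):
    """Greedy descent over the 26-neighbourhood; a simple stand-in, NOT an MPC planner."""
    offsets = [(dx, dy, dz)
               for dx in (-1, 0, 1) for dy in (-1, 0, 1) for dz in (-1, 0, 1)
               if (dx, dy, dz) != (0, 0, 0)]
    pos = tuple(start)
    path = [pos]
    for _ in range(max_steps):
        best = pos
        for d in offsets:
            nxt = tuple(p + o for p, o in zip(pos, d))
            in_bounds = all(0 <= n < s for n, s in zip(nxt, cost.shape))
            if in_bounds and cost[nxt] < cost[best]:
                best = nxt
        if best == pos:  # no neighbour is strictly better: local minimum reached
            break
        pos = best
        path.append(pos)
    return path

if __name__ == "__main__":
    cost = compose_cost_map((20, 20, 20), target=(15, 15, 15), obstacle=(15, 3, 3))
    print(greedy_path(cost, start=(2, 2, 2))[-1])
```

Greedy descent can stall in a local minimum when an obstacle sits directly between start and target, which is one reason a real system would use a proper trajectory optimizer rather than this sketch.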

Spaces

Action space
end-effector-pose · 6-dim · 5 Hz
Observation space
  • type: multimodal
  • fields:
    · rgbd_images
    · point_cloud
    · language_instruction
    · open_vocab_detections
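
The spaces above can be sketched as typed containers, which may help when wiring this policy into a control loop. Only the field names, the 6-dim action size, and the 5 Hz rate come from this entry; the array shapes, the detection format, and the names `Observation`, `validate_action`, and `CONTROL_PERIOD_S` are assumptions made for illustration.

```python
from dataclasses import dataclass, field

import numpy as np

CONTROL_HZ = 5.0                      # action rate listed in this entry
CONTROL_PERIOD_S = 1.0 / CONTROL_HZ   # 0.2 s between consecutive actions
ACTION_DIM = 6                        # end-effector pose, 6-dim per this entry

@dataclass
class Observation:
    """Hypothetical container; shapes below are assumptions, not documented."""
    rgbd_images: np.ndarray           # assumed (num_cameras, H, W, 4)
    point_cloud: np.ndarray           # assumed (N, 3)
    language_instruction: str
    open_vocab_detections: list = field(default_factory=list)  # assumed list of dicts

def validate_action(action):
    """Return the action as a float array, rejecting anything that is not 6-dim."""
    a = np.asarray(action, dtype=float)
    if a.shape != (ACTION_DIM,):
        raise ValueError(f"expected shape ({ACTION_DIM},), got {a.shape}")
    return a
```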

Links

HuggingFace repo
Not listed

Compatible robots

0 in catalog, plus 1 mentioned but not yet in the catalog

No robots list VoxPoser as compatible yet.

Compatible environments

2

Datasets that reference this policy

0

No datasets reference VoxPoser yet.