policy
VoxPoser
Stanford · PyTorch
Overview
Name
VoxPoser
Author
Stanford
Framework
PyTorch
License
MIT
Skill type
manipulation
Evidence level
verified
Task description
Composes 3D voxel value maps for robot manipulation from LLM-generated code. An LLM writes Python code that calls open-vocabulary vision models to build 3D affordance and constraint maps. A model predictive control (MPC) planner then synthesizes closed-loop 6-DoF trajectories from these maps. Generalizes zero-shot to novel tasks and objects.
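The map composition described above can be illustrated with a small sketch. This is not the VoxPoser implementation: the grid size, the target and obstacle positions, the weighting, and the helper names (`distance_field`, `compose_cost_map`, `greedy_path`) are all assumptions chosen for illustration, and a naive greedy descent stands in for the MPC planner.

```python
import itertools
import numpy as np

def distance_field(shape, center):
    """Euclidean distance from every voxel to `center` (hypothetical helper)."""
    grid = np.indices(shape).astype(float)              # shape (3, X, Y, Z)
    offset = grid - np.array(center, dtype=float).reshape(3, 1, 1, 1)
    return np.linalg.norm(offset, axis=0)

def compose_cost_map(affordance, constraint, constraint_weight=2.0):
    """Lower cost is better: reward affordance, penalize constraint voxels."""
    return -affordance + constraint_weight * constraint

def greedy_path(cost, start, max_steps=100):
    """Step to the cheapest of the 26 neighbors until no neighbor improves.

    A stand-in for a real planner; it can stall in local minima.
    """
    offsets = list(itertools.product((-1, 0, 1), repeat=3))
    path = [start]
    for _ in range(max_steps):
        current = path[-1]
        neighbors = [tuple(c + o for c, o in zip(current, off)) for off in offsets]
        neighbors = [n for n in neighbors
                     if all(0 <= n[i] < cost.shape[i] for i in range(3))]
        best = min(neighbors, key=lambda n: cost[n])
        if cost[best] >= cost[current]:
            break                                        # no strictly better neighbor
        path.append(best)
    return path

# Illustrative scene (assumed values): a 10x10x10 workspace,
# a target to reach and an obstacle to stay away from.
shape, target, obstacle, start = (10, 10, 10), (8, 8, 5), (4, 4, 5), (0, 0, 5)

d_target = distance_field(shape, target)
affordance = 1.0 - d_target / d_target.max()             # 1 at the target, 0 farthest away
constraint = np.exp(-distance_field(shape, obstacle) ** 2 / (2 * 1.5 ** 2))

cost = compose_cost_map(affordance, constraint)
path = greedy_path(cost, start)
```

In the system described above, the affordance and constraint arrays would come from LLM-written code querying vision models rather than from hand-placed points; the sketch only shows how two such maps can be merged into one cost volume that a planner consumes.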
Spaces
Action space
end-effector-pose · 6-dim · 5Hz
Observation space
type: multimodal
- rgbd_images
- point_cloud
- language_instruction
- open_vocab_detections
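The action and observation spaces listed above could be captured in code roughly as follows. This is a sketch under assumptions: `ActionSpec`, `OBSERVATION_KEYS`, and `missing_observation_keys` are hypothetical names, and the card does not say how the 6 dimensions are laid out, so the sketch only checks their count.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ActionSpec:
    """Action space as stated on the card: end-effector pose, 6-dim, 5 Hz."""
    kind: str = "end-effector-pose"
    dim: int = 6
    rate_hz: float = 5.0

    @property
    def period_s(self) -> float:
        """Time between consecutive actions, in seconds."""
        return 1.0 / self.rate_hz

    def validate(self, action):
        """Return the action as a tuple of floats, or raise if the size is wrong."""
        if len(action) != self.dim:
            raise ValueError(f"expected {self.dim} values, got {len(action)}")
        return tuple(float(a) for a in action)

# Observation fields exactly as listed on the card.
OBSERVATION_KEYS = (
    "rgbd_images",
    "point_cloud",
    "language_instruction",
    "open_vocab_detections",
)

def missing_observation_keys(observation: dict) -> list:
    """List the card's observation fields that are absent from `observation`."""
    return [key for key in OBSERVATION_KEYS if key not in observation]

spec = ActionSpec()
```

At 5 Hz, `spec.period_s` works out to 0.2 seconds per action, which is the budget a controller would have for each planning step.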
Links
HuggingFace repo
Not listed
Paper (arXiv)
Compatible robots
0 (+1 mentioned but not in catalog yet)
No robots list VoxPoser as compatible yet.
Compatible environments
2
Datasets that reference this policy
0
No datasets reference VoxPoser yet.