Vision-Language-Action-Based-Robotic-Navigation-System-with-Multimodal-Transformer-Reasoning
Sunkeerth · PyTorch
Overview
Name
Vision-Language-Action-Based-Robotic-Navigation-System-with-Multimodal-Transformer-Reasoning
Author
Sunkeerth
Framework
PyTorch
License
unknown
Skill type
locomotion
Evidence level
untested
Task description
A Vision–Language–Action (VLA) robotic navigation model that combines a ResNet vision encoder with a transformer language encoder for multimodal fusion. It enables real-time autonomous control in physics simulation, is trained on GPU, and runs locally with CUDA. Intended as a foundation for XR/humanoid robotics; research code fo…
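A minimal sketch of the architecture the description implies: a ResNet vision encoder and a transformer language encoder whose pooled features are fused into an action head. The class name, layer sizes, fusion-by-concatenation strategy, vocabulary size, and the 2-dim action output are all assumptions for illustration; the card itself leaves the action space unspecified.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class VLANavigationPolicy(nn.Module):  # hypothetical name, not from the repo
    def __init__(self, vocab_size=10_000, embed_dim=256, action_dim=2):
        super().__init__()
        # Vision encoder: ResNet-18 backbone with the classifier removed.
        backbone = resnet18(weights=None)
        self.vision = nn.Sequential(*list(backbone.children())[:-1])  # -> (B, 512, 1, 1)
        self.vision_proj = nn.Linear(512, embed_dim)

        # Language encoder: token embedding + a small transformer encoder.
        self.token_embed = nn.Embedding(vocab_size, embed_dim)
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=4, batch_first=True)
        self.language = nn.TransformerEncoder(layer, num_layers=2)

        # Fusion + action head: concatenate pooled features, regress actions.
        self.action_head = nn.Sequential(
            nn.Linear(2 * embed_dim, embed_dim),
            nn.ReLU(),
            nn.Linear(embed_dim, action_dim),  # e.g. (linear_vel, angular_vel) — an assumption
        )

    def forward(self, image, tokens):
        # image: (B, 3, H, W) RGB frames; tokens: (B, T) instruction token ids.
        v = self.vision_proj(self.vision(image).flatten(1))      # (B, D)
        l = self.language(self.token_embed(tokens)).mean(dim=1)  # (B, D), mean-pooled
        return self.action_head(torch.cat([v, l], dim=-1))       # (B, action_dim)

model = VLANavigationPolicy()
actions = model(torch.randn(1, 3, 224, 224), torch.randint(0, 10_000, (1, 8)))
print(actions.shape)  # torch.Size([1, 2])
```

Running the model on a CUDA device, as the description suggests, would only require `model.to("cuda")` and moving the inputs accordingly.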
Spaces
Action space
other · 0-dim · 0 Hz
Observation space
- type: other
Links
HuggingFace repo
null
Paper (arXiv)
null
Compatible robots
No robots are listed as compatible with this policy yet.
Compatible environments
No environments are listed as compatible with this policy yet.
Datasets that reference this policy
No datasets reference this policy yet.