Policy

Vision-Language-Action-Based-Robotic-Navigation-System-with-Multimodal-Transformer-Reasoning

Sunkeerth · PyTorch


Overview

Name
Vision-Language-Action-Based-Robotic-Navigation-System-with-Multimodal-Transformer-Reasoning
Author
Sunkeerth
Framework
PyTorch
License
unknown
Skill type
locomotion
Evidence level
untested
Task description
A Vision–Language–Action (VLA) robotic navigation model that combines a ResNet vision encoder with a transformer language encoder for multimodal fusion. It enables real-time autonomous control in physics simulation, is trained on GPU, and runs locally with CUDA. Intended as a foundation for XR/humanoid robotics; research code for…
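
The repository is not linked from this card, so as a rough illustration only: the sketch below pairs a ResNet-18 image encoder with a small transformer text encoder and fuses the pooled features into a navigation action head, matching the architecture the description implies. Every module name, dimension, and the 2-D velocity action are assumptions, not the project's actual code.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class VLANavigationPolicy(nn.Module):
    """Illustrative VLA navigation policy (hypothetical, not the repo's code):
    ResNet vision encoder + transformer language encoder, fused by
    concatenation into an action head."""

    def __init__(self, vocab_size=1000, d_model=256, action_dim=2):
        super().__init__()
        # Vision encoder: ResNet-18 backbone with the classifier removed.
        backbone = resnet18(weights=None)
        self.vision = nn.Sequential(*list(backbone.children())[:-1])  # -> (B, 512, 1, 1)
        self.vision_proj = nn.Linear(512, d_model)

        # Language encoder: token embedding + a small transformer encoder.
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
        self.text = nn.TransformerEncoder(layer, num_layers=2)

        # Fusion + action head: concatenate pooled features, regress actions.
        self.head = nn.Sequential(
            nn.Linear(2 * d_model, d_model),
            nn.ReLU(),
            nn.Linear(d_model, action_dim),  # e.g. (linear vel, angular vel) — assumed
        )

    def forward(self, image, tokens):
        v = self.vision(image).flatten(1)          # (B, 512)
        v = self.vision_proj(v)                    # (B, d_model)
        t = self.text(self.embed(tokens)).mean(1)  # mean-pool over tokens
        return self.head(torch.cat([v, t], dim=-1))

# Smoke test with a dummy image and instruction token sequence.
policy = VLANavigationPolicy()
action = policy(torch.randn(1, 3, 224, 224), torch.randint(0, 1000, (1, 12)))
```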

Spaces

Action space
other · 0-dim · 0 Hz
Observation space
  • type: other

Links

HuggingFace repo
none
Paper (arXiv)
none

Compatible robots

0

No robots list Vision-Language-Action-Based-Robotic-Navigation-System-with-Multimodal-Transformer-Reasoning as compatible yet.

Compatible environments

0

No environments list Vision-Language-Action-Based-Robotic-Navigation-System-with-Multimodal-Transformer-Reasoning as compatible yet.

Datasets that reference this policy

0

No datasets reference Vision-Language-Action-Based-Robotic-Navigation-System-with-Multimodal-Transformer-Reasoning yet.