sim · medium · manipulation-data · metric · varies
TaF-VLA: Tactile-Force Alignment in Vision-Language-Action Models for Force-aware Manipulation
Description
Vision-Language-Action (VLA) models have recently emerged as powerful generalists for robotic manipulation. However, because they rely predominantly on visual modalities, they lack the physical intuition needed for contact-rich tasks that demand precise force regulation and physical reasoning. Existing attempts to incorporate vision-based tactile sensing into VLA models typically treat tactile inputs as auxiliary visual textures, thereby overlooking the underlying correlation