← Back to Benchmarks
simmediummanipulation-datametric · varies

TaF-VLA: Tactile-Force Alignment in Vision-Language-Action Models for Force-aware Manipulation

Description

Vision-Language-Action (VLA) models have recently emerged as powerful generalists for robotic manipulation. However, due to their predominant reliance on visual modalities, they fundamentally lack the physical intuition required for contact-rich tasks that require precise force regulation and physical reasoning. Existing attempts to incorporate vision-based tactile sensing into VLA models typically treat tactile inputs as auxiliary visual textures, thereby overlooking the underlying correlation

Source

http://arxiv.org/abs/2601.20321v2