sim · medium · grasping · metric · varies
Using Visual Language Models to Control Bionic Hands: Assessment of Object Perception and Grasp Inference
Description
This study examines the potential of Vision Language Models (VLMs) to improve the perceptual capabilities of semi-autonomous prosthetic hands. We introduce a unified benchmark for end-to-end perception and grasp inference, evaluating whether a single VLM can perform tasks that traditionally require complex pipelines with separate modules for object detection, pose estimation, and grasp planning. To establish the feasibility and current limitations of this approach, we benchmark eight contempora