sim · medium · grasping · metric · varies

Using Visual Language Models to Control Bionic Hands: Assessment of Object Perception and Grasp Inference

Description

This study examines the potential of Vision Language Models (VLMs) to improve the perceptual capabilities of semi-autonomous prosthetic hands. We introduce a unified benchmark for end-to-end perception and grasp inference, evaluating whether a single VLM can perform tasks that traditionally require complex pipelines with separate modules for object detection, pose estimation, and grasp planning. To establish the feasibility and current limitations of this approach, we benchmark eight contempora

Source

http://arxiv.org/abs/2509.13572v1