sim · medium · grasping · metric · varies

PAVLM: Advancing Point Cloud based Affordance Understanding Via Vision-Language Model

Description

Affordance understanding, the task of identifying actionable regions on 3D objects, plays a vital role in allowing robotic systems to engage with and operate within the physical world. Although Visual Language Models (VLMs) have excelled in high-level reasoning and long-horizon planning for robotic manipulation, they still fall short in grasping the nuanced physical properties required for effective human-robot interaction. In this paper, we introduce PAVLM (Point cloud Affordance Vision-Language Model) ...

Source

http://arxiv.org/abs/2410.11564v2