sim · medium · navigation · metric · varies

Enhancing Lightweight Vision Language Models through Group Competitive Learning for Socially Compliant Navigation

Description

Social robot navigation requires a sophisticated integration of scene semantics and human social norms. Scaling up Vision Language Models (VLMs) generally improves reasoning and decision-making capabilities for socially compliant navigation. However, increased model size incurs substantial computational overhead, limiting suitability for real-time robotic deployment. Conversely, lightweight VLMs enable efficient inference but often exhibit weaker reasoning and decision-making performance in soci

Source

http://arxiv.org/abs/2603.11447v1