sim · medium · manipulation · metric · varies

NS-VLA: Towards Neuro-Symbolic Vision-Language-Action Models

Description

Vision-Language-Action (VLA) models are formulated to ground instructions in visual context and generate action sequences for robotic manipulation. Despite recent progress, VLA models still face challenges in learning related and reusable primitives, reducing reliance on large-scale data and complex architectures, and enabling exploration beyond demonstrations. To address these challenges, we propose a novel Neuro-Symbolic Vision-Language-Action (NS-VLA) framework via online reinforcement learni

Source

http://arxiv.org/abs/2603.09542v1