Understanding the Role of Hallucination in Reinforcement Post-Training of Multimodal Reasoning Models

Description

The recent success of reinforcement learning (RL) in large reasoning models has inspired the growing adoption of RL for post-training Multimodal Large Language Models (MLLMs) to enhance their visual reasoning capabilities. Although many studies have reported improved performance, it remains unclear whether RL training truly enables models to learn from visual information. In this work, we propose the Hallucination-as-Cue Framework, an analytical framework designed to investigate the effects of R

Source

http://arxiv.org/abs/2604.03179v1