Online reinforcement learning via sparse Gaussian mixture model Q-functions

Description

This paper introduces a structured and interpretable online policy-iteration framework for reinforcement learning (RL), built around the novel class of sparse Gaussian mixture model Q-functions (S-GMM-QFs). Extending earlier work that trained GMM-QFs offline, the proposed framework develops an online scheme that leverages streaming data to encourage exploration. Model complexity is regulated through sparsification by Hadamard overparametrization, which mitigates overfitting while preserving expr
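
As a rough illustration of the two ingredients named above, the following Python sketch evaluates a Q-function written as a weighted sum of Gaussian densities and shows why a Hadamard (elementwise-product) parametrization of the mixture weights promotes sparsity. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the choice of a joint state-action vector `z` as input, the real-valued mixture weights, and the specific penalty. The sparsity argument rests on a standard identity: if the weights are written as `w = u * v`, the smallest value of `0.5 * (||u||^2 + ||v||^2)` over all such factorizations equals the L1 norm of `w`, so a smooth ridge penalty on the factors acts like a sparsity-inducing L1 penalty on the weights.

```python
import numpy as np


def gmm_q(z, weights, means, covs):
    """Evaluate Q(z) = sum_k weights[k] * N(z; means[k], covs[k]).

    z is a hypothetical state-action feature vector; the weights are
    treated as arbitrary real numbers (an assumption of this sketch).
    """
    z = np.asarray(z, dtype=float)
    d = z.shape[0]
    total = 0.0
    for w, m, C in zip(weights, means, covs):
        diff = z - m
        quad = diff @ np.linalg.solve(C, diff)  # Mahalanobis term
        norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(C))
        total += w * np.exp(-0.5 * quad) / norm
    return total


def hadamard_weights(u, v):
    """Overparametrized weights: elementwise product of two factors."""
    return u * v


def hadamard_penalty(u, v):
    """Smooth ridge penalty on the factors u and v."""
    return 0.5 * (np.sum(u ** 2) + np.sum(v ** 2))


def balanced_factors(w):
    """Factors with u * v == w that minimize hadamard_penalty."""
    root = np.sqrt(np.abs(w))
    return np.sign(w) * root, root


# A sparse weight vector: the middle mixture component is switched off.
w = np.array([0.5, 0.0, -2.0])
u, v = balanced_factors(w)

# At the balanced factorization the ridge penalty equals the L1 norm of w.
penalty = hadamard_penalty(u, v)
l1_norm = np.abs(w).sum()

# One-dimensional check: a single standard-normal component at z = 0.
q0 = gmm_q(np.zeros(1), [1.0], [np.zeros(1)], [np.eye(1)])
```

In a training loop one would optimize `u` and `v` directly with an ordinary gradient method; components whose factors shrink to zero can then be pruned from the mixture, which is one plausible way to regulate model complexity.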

Source

http://arxiv.org/abs/2509.14585v3