sim · medium · policy-learning · metric · varies

ViSA: Visited-State Augmentation for Generalized Goal-Space Contrastive Reinforcement Learning

Description

Goal-Conditioned Reinforcement Learning (GCRL) is a framework for learning a policy that can reach arbitrary goals. Within GCRL, Contrastive Reinforcement Learning (CRL) updates the policy using an approximation of the value function estimated via contrastive learning, and achieves higher sample efficiency than conventional methods. However, since CRL treats the visited state as a pseudo-goal during learning, it can accurately estimate the value function only f
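To make the pseudo-goal idea concrete, the following is a minimal sketch of how a generic contrastive critic objective of this kind is often set up; it is not the paper's ViSA method, which the description above does not detail. It assumes an inner-product critic over precomputed representations, an InfoNCE-style loss with in-batch negatives, and geometric sampling of future visited states. The function names `sample_future_goals` and `infonce_loss` and the discount `gamma` are hypothetical choices made for illustration.

```python
import numpy as np


def sample_future_goals(traj_states, gamma, rng):
    """Pick, for each step t, a later visited state to serve as a pseudo-goal.

    Assumption: offsets follow a geometric distribution with parameter
    1 - gamma and are clipped to the final step of the trajectory.
    """
    num_steps = traj_states.shape[0]
    t = np.arange(num_steps - 1)
    offsets = rng.geometric(1.0 - gamma, size=num_steps - 1)  # values >= 1
    future = np.minimum(t + offsets, num_steps - 1)
    return traj_states[future]


def infonce_loss(sa_repr, goal_repr):
    """InfoNCE-style loss with in-batch negatives.

    Row i of sa_repr is paired with row i of goal_repr as the positive;
    every other row of goal_repr acts as a negative for row i.
    """
    logits = sa_repr @ goal_repr.T  # (batch, batch) critic values
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

In this sketch the goals seen during training are exactly the states the agent has visited, which is the property the description identifies as limiting where the value estimate is accurate.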

Source

http://arxiv.org/abs/2603.14887v2