sim · medium · offline-rl · metric · varies
Option-aware Temporally Abstracted Value for Offline Goal-Conditioned Reinforcement Learning
Description
Offline goal-conditioned reinforcement learning (GCRL) offers a practical learning paradigm in which goal-reaching policies are trained from abundant state-action trajectory datasets without additional environment interaction. However, offline GCRL still struggles with long-horizon tasks, even with recent advances that employ hierarchical policy structures, such as HIQL. In identifying the root cause of this challenge, we make the following observations. First, performance bottlenecks mainly stem f