sim · medium · offline-rl · metric · varies

Offline Goal-Conditioned Reinforcement Learning with Projective Quasimetric Planning

Description

Offline Goal-Conditioned Reinforcement Learning seeks to train agents to reach specified goals from previously collected trajectories. Scaling that promise to long-horizon tasks remains challenging, notably because value-estimation errors compound. Principled geometry offers a potential solution to these issues. Following this insight, we introduce Projective Quasimetric Planning (ProQ), a compositional framework that learns an asymmetric distance and then repurposes it, firstly as a

Source

http://arxiv.org/abs/2506.18847v3