sim · medium · rl · metric · varies
FlowRL: A Taxonomy and Modular Framework for Reinforcement Learning with Diffusion Policies
Description
Thanks to their remarkable flexibility, diffusion models and flow models have emerged as promising candidates for policy representation. However, efficient reinforcement learning (RL) with these policies remains a challenge, because vanilla policy gradient estimators require explicit log-probabilities that these models do not provide. While numerous methods have been proposed to address this, the field lacks a unified perspective that reconciles these seemingly disparate approaches, which hampers ongoing development. In this