Flow Models for Unbounded and Geometry-Aware Distributional Reinforcement Learning

Description

We introduce a new architecture for Distributional Reinforcement Learning (DistRL) that models return distributions using normalizing flows. This approach enables flexible, unbounded support for return distributions, in contrast to categorical approaches like C51 that rely on fixed or bounded representations. It also offers richer modeling capacity than quantile-based approaches, capturing multi-modality, skewness, and tail behavior. Our method is significantly more parameter-efficient than cate
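As a rough illustration of how a normalizing flow gives a return density with unbounded support, the sketch below pushes standard normal noise through a single invertible one-dimensional transform and evaluates the density with the change-of-variables formula. The sinh-arcsinh transform, the class name `SinhArcsinhFlow`, and the parameters `loc`, `scale`, `skew`, and `tail` are assumptions chosen for illustration. They are not taken from the paper, whose architecture is not described in this excerpt, and a single transform of this kind is unimodal, so it shows skewness and tail control but not multi-modality.

```python
import math
import random


class SinhArcsinhFlow:
    """Illustrative 1-D flow: x = loc + scale * sinh((asinh(z) + skew) / tail), z ~ N(0, 1).

    Hypothetical stand-in for a learned flow; not the architecture from the paper.
    """

    def __init__(self, loc=0.0, scale=1.0, skew=0.0, tail=1.0):
        if scale <= 0 or tail <= 0:
            raise ValueError("scale and tail must be positive")
        self.loc, self.scale, self.skew, self.tail = loc, scale, skew, tail

    def forward(self, z):
        # Map base noise to a return value; the image is the whole real line,
        # so no fixed return range has to be chosen in advance.
        return self.loc + self.scale * math.sinh((math.asinh(z) + self.skew) / self.tail)

    def inverse(self, x):
        # Exact inverse of forward().
        y = (x - self.loc) / self.scale
        return math.sinh(self.tail * math.asinh(y) - self.skew)

    def log_prob(self, x):
        # Change of variables: log p(x) = log N(z; 0, 1) + log |dz/dx|.
        y = (x - self.loc) / self.scale
        u = self.tail * math.asinh(y) - self.skew
        z = math.sinh(u)
        log_base = -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
        log_det = (math.log(math.cosh(u)) + math.log(self.tail)
                   - 0.5 * math.log1p(y * y) - math.log(self.scale))
        return log_base + log_det

    def sample(self, rng):
        # Draw base noise and push it through the flow.
        return self.forward(rng.gauss(0.0, 1.0))


# Usage: a right-skewed return distribution with heavier-than-Gaussian tails.
flow = SinhArcsinhFlow(loc=1.0, scale=2.0, skew=0.5, tail=0.8)
rng = random.Random(0)
returns = [flow.sample(rng) for _ in range(5)]
log_density_at_zero = flow.log_prob(0.0)
```

In this sketch a positive `skew` shifts mass to the right and a `tail` below 1 makes the tails heavier than Gaussian. In a DistRL agent these four scalars would presumably be produced by a network conditioned on state and action, which is again an assumption rather than something stated above.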

Source

http://arxiv.org/abs/2505.04310v1