sim · medium · offline-rl · metric · varies
Epigraph-Guided Flow Matching for Safe and Performant Offline Reinforcement Learning
Description
Offline reinforcement learning (RL) provides a compelling paradigm for training autonomous systems without the risks of online exploration, particularly in safety-critical domains. However, jointly achieving strong safety and performance from fixed datasets remains challenging. Existing safe offline RL methods often rely on soft constraints that allow violations, introduce excessive conservatism, or struggle to balance safety, reward optimization, and adherence to the data distribution. To addre