sim · medium · offline-rl · metric · varies
Safe Deployment of Offline Reinforcement Learning via Input Convex Action Correction
Description
Offline reinforcement learning (offline RL) offers a promising framework for developing control strategies in chemical process systems using historical data, without the risks or costs of online experimentation. This work investigates the application of offline RL to the safe and efficient control of an exothermic polymerisation continuous stirred-tank reactor. We introduce a Gymnasium-compatible simulation environment that captures the reactor's nonlinear dynamics, including reaction kinetics,