sim · medium · offline-rl · metric · varies

Safe Deployment of Offline Reinforcement Learning via Input Convex Action Correction

Description

Offline reinforcement learning (offline RL) offers a promising framework for developing control strategies in chemical process systems using historical data, without the risks or costs of online experimentation. This work investigates the application of offline RL to the safe and efficient control of an exothermic polymerisation continuous stirred-tank reactor. We introduce a Gymnasium-compatible simulation environment that captures the reactor's nonlinear dynamics, including reaction kinetics,
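To make the idea of a Gymnasium-compatible reactor environment concrete, here is a minimal sketch, not the authors' implementation. It assumes a generic first-order exothermic CSTR with Arrhenius kinetics rather than the polymerisation model, uses illustrative textbook-style parameter values, and only mimics the `reset`/`step` return convention without importing the `gymnasium` package. The class name `ToyExothermicCSTR`, the coolant-temperature action, the reward, and the temperature limit are all hypothetical.

```python
import math


class ToyExothermicCSTR:
    """Toy first-order exothermic CSTR with a Gymnasium-style interface.

    Hypothetical stand-in: NOT the paper's polymerisation model.
    All parameter values below are illustrative assumptions.
    """

    def __init__(self, dt=0.01, substeps=10, max_steps=200,
                 temp_setpoint=350.0, temp_limit=400.0):
        self.dt = dt                      # control interval, min
        self.substeps = substeps          # explicit Euler sub-steps per interval
        self.max_steps = max_steps
        self.temp_setpoint = temp_setpoint
        self.temp_limit = temp_limit      # assumed safety limit, K
        self.q_over_v = 1.0               # dilution rate, 1/min
        self.c_feed = 1.0                 # feed concentration, mol/L
        self.t_feed = 350.0               # feed temperature, K
        self.k0 = 7.2e10                  # Arrhenius pre-exponential factor, 1/min
        self.e_over_r = 8750.0            # activation energy over gas constant, K
        self.heat_term = 5.0e4 / (1000.0 * 0.239)           # (-dH)/(rho*Cp)
        self.cool_term = 5.0e4 / (100.0 * 1000.0 * 0.239)   # UA/(V*rho*Cp)
        self.action_low, self.action_high = 280.0, 320.0    # coolant bounds, K

    def _derivatives(self, conc, temp, coolant):
        # Reaction kinetics: Arrhenius rate for a first-order reaction.
        rate = self.k0 * math.exp(-self.e_over_r / temp) * conc
        dconc = self.q_over_v * (self.c_feed - conc) - rate
        dtemp = (self.q_over_v * (self.t_feed - temp)
                 + self.heat_term * rate
                 + self.cool_term * (coolant - temp))
        return dconc, dtemp

    def reset(self, seed=None):
        # seed is accepted for interface parity only; the toy model is deterministic.
        self.conc, self.temp = 0.5, 350.0
        self.steps = 0
        return (self.conc, self.temp), {}

    def step(self, action):
        # Clip the coolant temperature to the assumed actuator bounds.
        coolant = min(max(float(action), self.action_low), self.action_high)
        h = self.dt / self.substeps
        for _ in range(self.substeps):
            dconc, dtemp = self._derivatives(self.conc, self.temp, coolant)
            self.conc += h * dconc
            self.temp += h * dtemp
        self.steps += 1
        reward = -abs(self.temp - self.temp_setpoint)
        terminated = self.temp > self.temp_limit   # safety violation ends the episode
        truncated = self.steps >= self.max_steps
        return (self.conc, self.temp), reward, terminated, truncated, {}


if __name__ == "__main__":
    env = ToyExothermicCSTR()
    obs, info = env.reset()
    for _ in range(5):
        obs, reward, terminated, truncated, info = env.step(300.0)
        print(obs, reward)
```

Keeping the five-value `step` return means a loop written against this toy class would need little change to run against a real Gymnasium environment, which is where a logged dataset for offline RL would normally be collected.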

Source

http://arxiv.org/abs/2507.22640v1