simmediumoffline-rlmetric · varies

CS-GBA: A Critical Sample-based Gradient-guided Backdoor Attack for Offline Reinforcement Learning

Description

Offline Reinforcement Learning (RL) enables policy optimization from static datasets but is inherently vulnerable to backdoor attacks. Existing attack strategies typically struggle against safety-constrained algorithms (e.g., CQL) due to inefficient random poisoning and the use of easily detectable Out-of-Distribution (OOD) triggers. In this paper, we propose CS-GBA (Critical Sample-based Gradient-guided Backdoor Attack), a novel framework designed to achieve high stealthiness and destructivenes

Source

http://arxiv.org/abs/2601.10407v1