sim | medium | offline-rl | metric · varies

Prompting Decision Transformers for Zero-Shot Reach-Avoid Policies

Description

Offline goal-conditioned reinforcement learning methods have shown promise for reach-avoid tasks, where an agent must reach a target state while avoiding undesirable regions of the state space. Existing approaches typically encode avoid-region information into an augmented state space and cost function, which prevents flexible, dynamic specification of novel avoid-region information at evaluation time. They also rely heavily on well-designed reward and cost functions, limiting scalability to com
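
To make the reach-avoid criterion in the description concrete, the following is a minimal sketch of an evaluation-time success check. It is not the paper's method. The function name `reach_avoid_success`, the spherical avoid regions given as (center, radius) pairs, the Euclidean distance, and the `goal_tol` threshold are all assumptions made for illustration.

```python
import math


def reach_avoid_success(trajectory, goal, avoid_regions, goal_tol=0.5):
    """Return True if the trajectory reaches the goal before entering any avoid region.

    Hypothetical helper for illustration only.
    trajectory:    list of states, each a tuple of floats
    goal:          target state, a tuple of floats
    avoid_regions: list of (center, radius) pairs (spherical regions are an assumption)
    goal_tol:      assumed distance threshold for counting the goal as reached
    """
    for state in trajectory:
        # Entering any avoid region fails the episode immediately.
        for center, radius in avoid_regions:
            if math.dist(state, center) <= radius:
                return False
        # Reaching the goal with no prior violation counts as success.
        if math.dist(state, goal) <= goal_tol:
            return True
    # The trajectory ended without reaching the goal.
    return False


path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
goal = (3.0, 0.0)

print(reach_avoid_success(path, goal, avoid_regions=[]))
print(reach_avoid_success(path, goal, avoid_regions=[((1.0, 0.0), 0.25)]))
```

Because the avoid regions are passed as an argument rather than baked into the state or a cost function, the same trajectory can be scored against regions that were never seen before. This is the kind of evaluation-time flexibility the description contrasts with augmented-state approaches.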

Source

http://arxiv.org/abs/2505.19337v2