sim · medium · offline-rl · metric: varies
Prompting Decision Transformers for Zero-Shot Reach-Avoid Policies
Description
Offline goal-conditioned reinforcement learning methods have shown promise for reach-avoid tasks, where an agent must reach a target state while avoiding undesirable regions of the state space. Existing approaches typically encode avoid-region information into an augmented state space and cost function, which prevents flexible, dynamic specification of novel avoid-region information at evaluation time. They also rely heavily on well-designed reward and cost functions, limiting scalability to com