sim · medium · atari · metric · varies

Exploration with Foundation Models: Capabilities, Limitations, and Hybrid Approaches

Description

Exploration in reinforcement learning (RL) remains challenging, particularly in sparse-reward settings. While foundation models possess strong semantic priors, their capabilities as zero-shot exploration agents in classic RL benchmarks are not well understood. We benchmark LLMs and VLMs on multi-armed bandits, Gridworlds, and sparse-reward Atari to test zero-shot exploration. Our investigation reveals a key limitation: while VLMs can infer high-level objectives from visual input, they consistent

Source

http://arxiv.org/abs/2509.19924v1