dataset

Reagent-RL-709K

bunny127

Overview

Name
Reagent-RL-709K
Source
bunny127
Episodes
0
Robot count
0
Format
other
Description
Official repository of the Reagent Agent RL training dataset (Reagent-RL-709K). Paper: https://arxiv.org/abs/2601.22154 Abstract: Agentic Reinforcement Learning (Agentic RL) has achieved notable success in enabling agents to perform complex reasoning and tool use. However, most methods still rely on sparse outcome-based rewards for training. Such feedback fails to differentiate intermediate reasoning quality, leading to suboptimal training results. In this paper, we introduce… See the full description on the dataset page: https://huggingface.co/datasets/bunny127/Reagent-RL-709K.
Robots used
null

Links

HuggingFace dataset