dataset
Reagent-RL-709K
bunny127
Overview
Name: Reagent-RL-709K
Source: bunny127
Episodes: 0
Robot count: 0
Format: other
Description
Official repository for the Reagent Agent RL training dataset (Reagent-RL-709K).
Paper: https://arxiv.org/abs/2601.22154
Abstract:
Agentic Reinforcement Learning (Agentic RL) has achieved notable success in enabling agents to perform complex reasoning and tool use.
However, most methods still rely on sparse, outcome-based rewards for training.
Such feedback fails to differentiate intermediate reasoning quality, leading to suboptimal training results.
In this paper, we introduce… See the full description on the dataset page: https://huggingface.co/datasets/bunny127/Reagent-RL-709K.
Robots used: null
Links
HuggingFace dataset