dataset

Reagent-RL-709K

bunny127

Overview

Name

Reagent-RL-709K

Source

bunny127

Episodes

Robot count

Format

other

Description

Official repository for the Reagent Agent RL training dataset (Reagent-RL-709K). Paper: https://arxiv.org/abs/2601.22154. Abstract: Agentic Reinforcement Learning (Agentic RL) has achieved notable success in enabling agents to perform complex reasoning and tool use. However, most methods still rely on sparse outcome-based rewards for training. Such feedback fails to differentiate intermediate reasoning quality, leading to suboptimal training results. In this paper, we introduce… See the full description on the dataset page: https://huggingface.co/datasets/bunny127/Reagent-RL-709K.

Robots used

null

Links

HuggingFace dataset

bunny127/Reagent-RL-709K
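
Because the Format field is listed as "other", the file layout inside the repository is not known from this card. The following is a minimal sketch for fetching the raw files, assuming the `huggingface_hub` package is installed and the repository is publicly accessible. The helper names `dataset_page_url` and `download_snapshot` are hypothetical and introduced only for illustration.

```python
# Sketch only: assumes `huggingface_hub` is installed and the repo is public.
# The structure of the downloaded files is not described on this card.
REPO_ID = "bunny127/Reagent-RL-709K"  # taken from the Links field


def dataset_page_url(repo_id: str) -> str:
    """Return the Hugging Face dataset page URL for a repo id."""
    return f"https://huggingface.co/datasets/{repo_id}"


def download_snapshot(repo_id: str = REPO_ID) -> str:
    """Download every file in the dataset repo; return the local directory."""
    # Imported lazily so dataset_page_url works without the dependency.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, repo_type="dataset")
```

Calling `download_snapshot()` returns the path of a local cache directory; inspect its contents before choosing a parser, since the card does not state the file types.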