dataset

Innovator-VL-RL-172K

InnovatorLab

Overview

Name
Innovator-VL-RL-172K
Source
InnovatorLab
Episodes
0
Robot count
0
Format
parquet
Description
Innovator-VL-RL-172K is a curated multimodal reinforcement learning (RL) training dataset containing approximately 172K instances. It is designed to support vision-language reasoning and complex decision-making during RL/RLHF-style optimization, where the goal is to improve a model’s ability to consistently select high-quality responses rather than merely expanding knowledge coverage. The dataset emphasizes samples that are… See the full description on the dataset page: https://huggingface.co/datasets/InnovatorLab/Innovator-VL-RL-172K.
Robots used
null

Links