dataset
Nemotron-RL-Super-Training-Blends
nvidia
Overview
Name
Nemotron-RL-Super-Training-Blends
Source
nvidia
Episodes
0
Robot count
0
Format
other
Description
Nemotron-3-Super-RL-Training-Blends contains the dataset blends used to train the Nemotron-3-Super-120B-A12B model. RL training for this model proceeds in six stages: RLVR 1, RLVR 2, RLVR 3, SWE 1, SWE 2, and RLHF. The blend for each stage draws on several datasets, which are detailed on the dataset page. The percentages in parentheses there indicate the mixing ratios of the dataset components. Note that the model was also trained on additional data… See the full description on the dataset page: https://huggingface.co/datasets/nvidia/Nemotron-RL-Super-Training-Blends.
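To illustrate what a mixing ratio means in practice, here is a minimal sketch of turning percentage ratios into per-component sample counts. The component names, the percentages, and the helper `allocate_by_ratio` are hypothetical and are not taken from the dataset page; the rounding scheme (largest remainder) is also an assumption, chosen only so the counts add up to the requested total.

```python
from typing import Dict


def allocate_by_ratio(total: int, ratios: Dict[str, float]) -> Dict[str, int]:
    """Split `total` samples across components in proportion to their ratios."""
    weight_sum = sum(ratios.values())
    if weight_sum <= 0:
        raise ValueError("ratios must sum to a positive value")

    # Exact (possibly fractional) share of the total for each component.
    exact = {name: total * pct / weight_sum for name, pct in ratios.items()}

    # Start from the integer part of each share.
    counts = {name: int(share) for name, share in exact.items()}

    # Give the leftover samples to the components with the largest
    # fractional remainders, so the counts sum exactly to `total`.
    leftover = total - sum(counts.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - counts[n], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


# Hypothetical blend: names and percentages are placeholders for illustration.
example_blend = {"component_a": 50.0, "component_b": 30.0, "component_c": 20.0}
example_counts = allocate_by_ratio(1000, example_blend)
```

Under these assumptions, a 50/30/20 blend of 1000 samples yields 500, 300, and 200 samples respectively. The actual components and ratios for each stage are listed on the dataset page.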
Robots used
null
Links
Hugging Face dataset: https://huggingface.co/datasets/nvidia/Nemotron-RL-Super-Training-Blends