dataset

Nemotron-RL-Super-Training-Blends

nvidia


Overview

Name

Source

nvidia

Episodes

Robot count

Format

other

Description

Nemotron-3-Super-RL-Training-Blends contains the dataset blends used to train the Nemotron-3-Super-120B-A12B model. RL training for this model proceeds in six stages: RLVR 1, RLVR 2, RLVR 3, SWE 1, SWE 2, and RLHF. The blend for each stage combines data from several datasets, which we detail below. The percentages in parentheses give the mixing ratio of each dataset component within its blend. Note that the model was also trained on additional data… See the full description on the dataset page: https://huggingface.co/datasets/nvidia/Nemotron-RL-Super-Training-Blends.
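To make the idea of percentage mixing ratios concrete, here is a minimal sketch of how a blend of a given size could be split across its components. It is an illustration only, not the procedure NVIDIA used: the function name `allocate_counts`, the component names, and the 50/30/20 ratios are all hypothetical, and the largest-remainder rounding is just one reasonable way to make the counts sum exactly to the requested total.

```python
def allocate_counts(ratios_percent, total):
    """Split `total` samples across components in proportion to their ratios.

    `ratios_percent` maps a component name to its mixing ratio. The ratios are
    normalized by their sum, so they need not add up to exactly 100.
    Rounding uses the largest-remainder method so the counts sum to `total`.
    """
    if total < 0:
        raise ValueError("total must be non-negative")
    if any(pct < 0 for pct in ratios_percent.values()):
        raise ValueError("mixing ratios must be non-negative")
    weight_sum = sum(ratios_percent.values())
    if weight_sum <= 0:
        raise ValueError("at least one mixing ratio must be positive")

    # Ideal (fractional) share of each component.
    exact = {name: total * pct / weight_sum for name, pct in ratios_percent.items()}
    # Round every share down first.
    counts = {name: int(share) for name, share in exact.items()}
    # Hand the samples lost to rounding to the largest fractional remainders.
    leftover = total - sum(counts.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - counts[name], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


# Hypothetical component names and ratios, chosen only for illustration.
example_blend = {"component_a": 50, "component_b": 30, "component_c": 20}
print(allocate_counts(example_blend, 1000))
# {'component_a': 500, 'component_b': 300, 'component_c': 200}
```

The actual component names and percentages for each stage are listed on the dataset page linked above; substituting them for the placeholder values gives the per-component sample counts for a blend of any chosen size.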

Robots used

null

Links

HuggingFace dataset

nvidia/Nemotron-RL-Super-Training-Blends