dataset

VPPO_MMK12_validation

chamber111


Overview

Name

VPPO_MMK12_validation

Source

chamber111

Episodes

Robot count

Format

parquet

Description

Dataset card for VPPO_MMK12_validation. This dataset is the official validation split used to fine-tune the VPPO-7B and VPPO-32B models presented in our paper, "Spotlight on Token Perception for Multimodal Reinforcement Learning". It is a direct copy of the test split of the FanqingM/MMK12 dataset. We have isolated it here to ensure the exact version used in our experiments is publicly available, guaranteeing reproducibility for… See the full description on the dataset page: https://huggingface.co/datasets/chamber111/VPPO_MMK12_validation.

Robots used

null

Links

HuggingFace dataset

chamber111/VPPO_MMK12_validation
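
Loading example

The listing gives a repository ID and a parquet format but no loading instructions, so here is a minimal sketch of one way to fetch the data. It assumes the Hugging Face `datasets` package is installed and that network access is available. The helper names `dataset_page_url` and `load_validation_data` are hypothetical, introduced only for illustration. The split names inside the repository are not stated on this page, so the sketch passes no split and leaves them to be inspected after loading.

```python
# Repository ID copied from the Links field above.
REPO_ID = "chamber111/VPPO_MMK12_validation"


def dataset_page_url(repo_id: str) -> str:
    """Build the Hugging Face dataset page URL for a repository ID."""
    return f"https://huggingface.co/datasets/{repo_id}"


def load_validation_data(repo_id: str = REPO_ID):
    """Download the dataset from the Hugging Face Hub.

    Requires network access and the `datasets` package. The import is
    done lazily so this module can be imported without the package.
    """
    from datasets import load_dataset

    # No split is passed (an assumption, since the split names are not
    # listed on this page), so the result is keyed by whatever splits
    # the repository defines.
    return load_dataset(repo_id)
```

Calling `load_validation_data()` and printing the result should list the available splits and their columns, which is a quick way to confirm the schema before relying on any particular field name.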