dataset
VPPO_MMK12_validation
chamber111
Overview
Name: VPPO_MMK12_validation
Source: chamber111
Episodes: 0
Robot count: 0
Format: parquet
Description
Dataset Card for VPPO_MMK12_validation
Dataset Details
Dataset Description
This dataset is the official validation split used while fine-tuning the VPPO-7B and VPPO-32B models presented in our paper, "Spotlight on Token Perception for Multimodal Reinforcement Learning".
It is a direct copy of the test split of the FanqingM/MMK12 dataset. We have isolated it here so that the exact version used in our experiments is publicly available, guaranteeing reproducibility for… See the full description on the dataset page: https://huggingface.co/datasets/chamber111/VPPO_MMK12_validation.
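Since the data is published as parquet on the Hugging Face Hub, it can probably be loaded with the `datasets` library. The following is a minimal sketch, not part of the original card: the helper names `dataset_page_url` and `load_validation_data` are hypothetical, and because the split names are not stated here, the loader returns whatever splits the repository exposes rather than assuming one.

```python
from urllib.parse import quote

# Repository ID taken from the dataset page URL quoted in the description.
REPO_ID = "chamber111/VPPO_MMK12_validation"


def dataset_page_url(repo_id: str) -> str:
    """Build the Hugging Face dataset page URL for a repository ID."""
    return "https://huggingface.co/datasets/" + quote(repo_id, safe="/")


def load_validation_data(repo_id: str = REPO_ID):
    """Download the dataset from the Hub (hypothetical helper).

    Assumes the `datasets` package is installed and network access is
    available. No split is passed because the split names are not
    documented here; the call returns all splits found in the repository.
    """
    # Imported lazily so the URL helper works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(repo_id)
```

Calling `load_validation_data()` and printing the result should list the available splits and their columns, which is a quick way to confirm that the copy matches the MMK12 test split before using it for evaluation.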
Robots used: null
Links
Hugging Face dataset: https://huggingface.co/datasets/chamber111/VPPO_MMK12_validation