dataset

VinciCoder-1.6M-SFT

DocTron-Hub

Overview

Name
VinciCoder-1.6M-SFT
Source
DocTron-Hub
Episodes
0
Robot count
0
Format
parquet
Description
VinciCoder: Unified Multimodal Code Generation Dataset. This repository contains the datasets used for "VinciCoder: Unifying Multimodal Code Generation via Coarse-to-fine Visual Reinforcement Learning", a project that introduces a unified multimodal code generation model. The framework trains in two stages, using a large-scale Supervised Finetuning (SFT) corpus and a Visual Reinforcement Learning (ViRL) dataset. These datasets are designed for tasks involving direct… See the full description on the dataset page: https://huggingface.co/datasets/DocTron-Hub/VinciCoder-1.6M-SFT.
Robots used
null
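
Since the card lists the format as parquet and points to a Hugging Face dataset page, a minimal sketch of inspecting a few records might look like the following. It assumes the Hugging Face `datasets` library is installed, that the repository exposes a `train` split, and that no configuration name is required; none of these is confirmed by this page. The helper names `dataset_url` and `peek` are hypothetical.

```python
from itertools import islice

# Repository ID built from the "Source" and "Name" fields above.
REPO_ID = "DocTron-Hub/VinciCoder-1.6M-SFT"


def dataset_url(repo_id: str = REPO_ID) -> str:
    """Return the dataset page URL for a Hugging Face repository ID."""
    return f"https://huggingface.co/datasets/{repo_id}"


def peek(n: int = 3, split: str = "train"):
    """Stream the first n records without downloading the full corpus.

    Assumption: a split named "train" exists and no config name is
    needed. Check the dataset page if this call raises an error.
    """
    # Imported lazily so the rest of the module works without `datasets`.
    from datasets import load_dataset

    stream = load_dataset(REPO_ID, split=split, streaming=True)
    return list(islice(stream, n))
```

Streaming is used here because the name suggests roughly 1.6M samples; calling `peek()` fetches only the first few records, whose column names can then be checked against the full description on the dataset page.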

Links