dataset

VinciCoder-1.6M-SFT

DocTron-Hub

Overview

Name

VinciCoder-1.6M-SFT

Source

DocTron-Hub

Format

parquet

Description

VinciCoder: Unified Multimodal Code Generation Dataset

This repository contains the datasets used for VinciCoder: Unifying Multimodal Code Generation via Coarse-to-fine Visual Reinforcement Learning, a project that introduces a unified multimodal code generation model. The framework uses a two-stage training approach, supported by a large-scale Supervised Finetuning (SFT) corpus and a Visual Reinforcement Learning (ViRL) dataset. These datasets are designed for tasks involving direct… See the full description on the dataset page: https://huggingface.co/datasets/DocTron-Hub/VinciCoder-1.6M-SFT.

Robots used

null

Links

HuggingFace dataset

DocTron-Hub/VinciCoder-1.6M-SFT
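
Because the corpus is published as parquet on the Hugging Face Hub, it can probably be inspected without a full download. The following is a minimal sketch, not an official loader: it assumes the `datasets` library is installed and that a split named "train" exists, neither of which this card confirms. `dataset_page_url` and `peek` are hypothetical helper names used only for illustration, and no column names are assumed.

```python
from itertools import islice

# Repo ID taken from the Links field of this card.
REPO_ID = "DocTron-Hub/VinciCoder-1.6M-SFT"


def dataset_page_url(repo_id: str) -> str:
    """Build the Hugging Face dataset page URL for a repo ID."""
    return f"https://huggingface.co/datasets/{repo_id}"


def peek(repo_id: str = REPO_ID, split: str = "train", n: int = 3):
    """Stream the first n records instead of downloading the whole corpus.

    Assumption: a split called "train" exists. If the repository defines
    several configurations, load_dataset may also need a config name.
    """
    # Imported lazily so the URL helper works without the library installed.
    from datasets import load_dataset

    stream = load_dataset(repo_id, split=split, streaming=True)
    return list(islice(stream, n))
```

Calling `peek()` needs network access and returns a short list of records as dictionaries; printing `record.keys()` for one of them is a quick way to discover the actual schema before writing any processing code.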