dataset

MM-RLHF-RewardBench

yifanzhang114

Overview

Name

MM-RLHF-RewardBench

Source

yifanzhang114

Episodes

Robot count

Format

parquet

Description

The Next Step Forward in Multimodal LLM Alignment

[2025/02/10] 🔥 We are proud to open-source MM-RLHF, a comprehensive project for aligning Multimodal Large Language Models (MLLMs) with human preferences. This release includes:

A high-quality MLLM alignment dataset.
A strong Critique-Based MLLM reward model and its training algorithm.
A novel…

See the full description on the dataset page: https://huggingface.co/datasets/yifanzhang114/MM-RLHF-RewardBench.

Robots used

null

Links

HuggingFace dataset

yifanzhang114/MM-RLHF-RewardBench
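
Loading the dataset

A minimal sketch of how the parquet data listed above might be pulled from the Hugging Face Hub, assuming the `datasets` library is installed and the repository loads with its default configuration. The helper names `dataset_page_url` and `load_rewardbench` are hypothetical and only for illustration; split and column names are not stated on this card, so the sketch does not assume any.

```python
from urllib.parse import quote

# Repository id as given in the Links field of this card.
REPO_ID = "yifanzhang114/MM-RLHF-RewardBench"


def dataset_page_url(repo_id: str) -> str:
    """Build the Hub dataset page URL for a repo id (hypothetical helper)."""
    # quote() keeps "/" by default, so "owner/name" stays a two-part path.
    return f"https://huggingface.co/datasets/{quote(repo_id)}"


def load_rewardbench(repo_id: str = REPO_ID):
    """Download the dataset from the Hub (hypothetical helper).

    Assumption: the repo loads with its default configuration. The import
    is deferred so that the URL helper works without `datasets` installed.
    """
    from datasets import load_dataset

    # With no split argument, load_dataset returns every available split.
    return load_dataset(repo_id)
```

Calling `load_rewardbench()` and printing the result should list whatever splits and columns the repository actually exposes, which is the safest way to discover the schema before writing evaluation code against it.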