dataset
MM-RLHF-RewardBench
yifanzhang114
Overview
Name
MM-RLHF-RewardBench
Source
yifanzhang114
Episodes
0
Robot count
0
Format
parquet
Description
[arXiv Paper]
[MM-RLHF Data]
[Homepage]
[Reward Model]
[MM-RewardBench]
[MM-SafetyBench]
[Evaluation Suite]
The Next Step Forward in Multimodal LLM Alignment
[2025/02/10] We are proud to open-source MM-RLHF, a comprehensive project for aligning Multimodal Large Language Models (MLLMs) with human preferences. This release includes:
A high-quality MLLM alignment dataset.
A strong Critique-Based MLLM reward model and its training algorithm.
A novel… See the full description on the dataset page: https://huggingface.co/datasets/yifanzhang114/MM-RLHF-RewardBench.
Robots used
null
Links
HuggingFace dataset
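Loading sketch
Because the card lists the format as parquet and points to a Hugging Face dataset page, the data can probably be read with the Hugging Face `datasets` library. The snippet below is a minimal sketch under that assumption, not the authors' documented loading procedure. The helper names `repo_id_from_url` and `load_reward_bench` are hypothetical, and the card does not list split names, so none are hard-coded.

```python
from urllib.parse import urlparse

# Dataset page URL as given in the description above.
DATASET_URL = "https://huggingface.co/datasets/yifanzhang114/MM-RLHF-RewardBench"


def repo_id_from_url(url: str) -> str:
    """Extract the '<owner>/<name>' repo id from a Hugging Face dataset URL."""
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"not a Hugging Face dataset URL: {url}")
    return "/".join(parts[1:3])


def load_reward_bench(split=None):
    """Hypothetical helper: download the benchmark via the `datasets` library.

    Assumes `pip install datasets` and network access. With split=None,
    load_dataset returns every available split, so no split name is assumed.
    """
    from datasets import load_dataset  # imported lazily; third-party dependency

    return load_dataset(repo_id_from_url(DATASET_URL), split=split)


if __name__ == "__main__":
    # Only prints the repo id; call load_reward_bench() to actually download.
    print(repo_id_from_url(DATASET_URL))
```

Calling `load_reward_bench()` with no arguments returns whatever splits the repository exposes, which is a reasonable first step for inspecting column names before writing any evaluation code.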