dataset

MM-RLHF-RewardBench

yifanzhang114


Overview

Name
MM-RLHF-RewardBench
Source
yifanzhang114
Episodes
0
Robot count
0
Format
parquet
Description
[📖 arXiv Paper] [📊 MM-RLHF Data] [📝 Homepage] [🏆 Reward Model] [🔮 MM-RewardBench] [🔮 MM-SafetyBench] [📈 Evaluation Suite] The Next Step Forward in Multimodal LLM Alignment. [2025/02/10] 🔥 We are proud to open-source MM-RLHF, a comprehensive project for aligning Multimodal Large Language Models (MLLMs) with human preferences. This release includes: A high-quality MLLM alignment dataset. A strong Critique-Based MLLM reward model and its training algorithm. A novel… See the full description on the dataset page: https://huggingface.co/datasets/yifanzhang114/MM-RLHF-RewardBench.
Robots used
null
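
Since the card lists the format as parquet and points to a Hugging Face dataset page, one plausible way to fetch the data is the `datasets` library. The following is a minimal sketch, not the authors' documented procedure: the helper names (`repo_id`, `dataset_url`, `load_benchmark`) are hypothetical, and it assumes the repository loads with its default configuration. Split and column names are not stated on this card, so the sketch does not assume any.

```python
"""Sketch: locate and load MM-RLHF-RewardBench from the Hugging Face Hub."""


def repo_id(source: str, name: str) -> str:
    # Hub repository IDs take the form "<owner>/<dataset name>".
    return f"{source}/{name}"


def dataset_url(source: str, name: str) -> str:
    # Dataset page URL, matching the link given in the description field.
    return f"https://huggingface.co/datasets/{repo_id(source, name)}"


def load_benchmark(source: str = "yifanzhang114",
                   name: str = "MM-RLHF-RewardBench"):
    # Imported lazily so the helpers above work without `datasets` installed.
    # Requires `pip install datasets` and network access.
    from datasets import load_dataset

    # Assumption: the default configuration is sufficient. With no `split`
    # argument, this returns a mapping of every available split.
    return load_dataset(repo_id(source, name))
```

Calling `load_benchmark()` and printing the result lists the splits and columns actually present, which is worth checking before writing any evaluation code against them.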

Links