dataset

aflow-qwen-rl-training

beita6969

Overview

Name: aflow-qwen-rl-training
Source: beita6969
Episodes: 0
Robot count: 0
Format: other
Description: AFlow-Qwen2.5-7B-Instruct RL Training: Workflow optimization using PPO and GiGPO on HumanEval dataset
Robots used: null

Links

HuggingFace dataset: null