dataset
aflow-qwen-rl-training
beita6969
Overview
Name: aflow-qwen-rl-training
Source: beita6969
Episodes: 0
Robot count: 0
Format: other
Description: AFlow-Qwen2.5-7B-Instruct RL Training: Workflow optimization using PPO and GiGPO on HumanEval dataset
Robots used: null
Links
HuggingFace dataset: null