dataset

swe-bench-trajectory-quality-subsets

davongluck


Overview

Name
swe-bench-trajectory-quality-subsets
Source
davongluck
Episodes
0
Robot count
0
Format
parquet
Description
SWE-bench Trajectory Quality Subsets

Curated subsets of nebius/SWE-rebench-openhands-trajectories, constructed using the v3 quality scoring framework for fine-tuning evaluation.

Subsets Overview

Subset | Size | Selection | Mean Score | Resolved Rate | Purpose
Ablation-NoB2-500 | 500 | Top 500 with Efficiency = B3 alone (drop B2 error_retry) | 0.6410 | 100% | Ablation study
Ablation-NoB3-500 | 500 | Top 500 with Efficiency = B2 alone (drop B3 step_count_ratio) | 0.7253 | 100% | Ablation…

See the full description on the dataset page: https://huggingface.co/datasets/davongluck/swe-bench-trajectory-quality-subsets.
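
The selection rule named in the description (top 500 with Efficiency computed from B3 alone or from B2 alone) can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the field names `b2_error_retry` and `b3_step_count_ratio`, the averaging of the two components when both are present, and the toy values are hypothetical. The toy also ranks on Efficiency only, whereas the v3 framework presumably combines it with other dimensions that this description does not specify.

```python
from typing import Any, Callable, Dict, List

Trajectory = Dict[str, Any]


def efficiency(traj: Trajectory, use_b2: bool = True, use_b3: bool = True) -> float:
    """Hypothetical Efficiency score built from the components that are kept.

    Averaging is an assumption; with one component dropped (the ablation
    case) the result is simply the remaining component.
    """
    parts: List[float] = []
    if use_b2:
        parts.append(traj["b2_error_retry"])
    if use_b3:
        parts.append(traj["b3_step_count_ratio"])
    if not parts:
        raise ValueError("at least one efficiency component must be kept")
    return sum(parts) / len(parts)


def top_k(trajs: List[Trajectory], k: int,
          score: Callable[[Trajectory], float]) -> List[Trajectory]:
    """Return the k highest-scoring trajectories, best first."""
    return sorted(trajs, key=score, reverse=True)[:k]


# Toy data with made-up scores, not taken from the dataset.
toy = [
    {"id": "a", "b2_error_retry": 0.9, "b3_step_count_ratio": 0.2},
    {"id": "b", "b2_error_retry": 0.3, "b3_step_count_ratio": 0.8},
    {"id": "c", "b2_error_retry": 0.6, "b3_step_count_ratio": 0.5},
]

# "NoB2": Efficiency = B3 alone. "NoB3": Efficiency = B2 alone.
no_b2 = top_k(toy, 2, lambda t: efficiency(t, use_b2=False))
no_b3 = top_k(toy, 2, lambda t: efficiency(t, use_b3=False))
```

Dropping a component changes which trajectories make the cut, which is what allows the two ablation subsets to be compared against each other.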
Robots used
null

Links