dataset
swe-bench-trajectory-quality-subsets
davongluck
Overview
Name
swe-bench-trajectory-quality-subsets
Source
davongluck
Episodes
0
Robot count
0
Format
parquet
Description
SWE-bench Trajectory Quality Subsets
Curated subsets of nebius/SWE-rebench-openhands-trajectories, selected with the v3 quality scoring framework and intended for fine-tuning evaluation.
Subsets Overview
| Subset | Size | Selection | Mean Score | Resolved Rate | Purpose |
| --- | --- | --- | --- | --- | --- |
| Ablation-NoB2-500 | 500 | Top 500 with Efficiency = B3 alone (drop B2 error_retry) | 0.6410 | 100% | Ablation study |
| Ablation-NoB3-500 | 500 | Top 500 with Efficiency = B2 alone (drop B3 step_count_ratio) | 0.7253 | 100% | Ablation… |
See the full description on the dataset page: https://huggingface.co/datasets/davongluck/swe-bench-trajectory-quality-subsets.
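The two ablation subsets listed here differ only in which efficiency component feeds the ranking before the top 500 are taken. The following is a minimal sketch of that idea, not the dataset author's code. The field names `rest`, `b2_error_retry`, and `b3_step_count_ratio`, the assumption that every component lies in [0, 1], and the equal-weight combination are all hypothetical; the description does not state how the v3 framework combines its components.

```python
from typing import Callable, Dict, List

# Hypothetical record layout: the real column names are not given in the description.
Trajectory = Dict[str, object]


def make_score(efficiency_key: str) -> Callable[[Trajectory], float]:
    """Build a scoring function that uses a single efficiency component.

    ASSUMPTION: the overall score is the plain average of a combined
    non-efficiency term ("rest") and the chosen efficiency component.
    The actual v3 weighting is unknown.
    """
    def score(t: Trajectory) -> float:
        return 0.5 * float(t["rest"]) + 0.5 * float(t[efficiency_key])
    return score


def top_k(trajectories: List[Trajectory],
          score_fn: Callable[[Trajectory], float],
          k: int = 500) -> List[Trajectory]:
    """Return the k highest-scoring trajectories, best first."""
    return sorted(trajectories, key=score_fn, reverse=True)[:k]


# Efficiency = B3 alone (B2 error_retry dropped), as in Ablation-NoB2-500.
score_no_b2 = make_score("b3_step_count_ratio")
# Efficiency = B2 alone (B3 step_count_ratio dropped), as in Ablation-NoB3-500.
score_no_b3 = make_score("b2_error_retry")
```

Calling `top_k(rows, score_no_b2)` and `top_k(rows, score_no_b3)` on the same pool yields two rankings that can disagree, which is what makes the pair usable as an ablation.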
Robots used
null
Links
HuggingFace dataset