simulation-interpretability-dataset

Znreza

Overview

Name
simulation-interpretability-dataset
Source
Znreza
Episodes
0
Robot count
0
Format
json
Description
Qwen3.5-2B-Base Blind Spots Dataset

A curated dataset documenting systematic failure modes ("blind spots") discovered in Qwen/Qwen3.5-2B-Base through structured probing experiments.

Dataset Description

This dataset contains 12 carefully selected examples where Qwen3.5-2B-Base exhibits predictable, reproducible failures across three major categories:

Category | Examples | Key Finding
Authority-Induced Sycophancy | 4 | Model accepts false claims when framed with…

See the full description on the dataset page: https://huggingface.co/datasets/Znreza/simulation-interpretability-dataset.
Robots used
null

Links