dataset

ai-manipulation-horizon-and-realignment-routing-v0.1

ClarusC64

Overview

Name
ai-manipulation-horizon-and-realignment-routing-v0.1
Source
ClarusC64
Episodes
0
Robot count
0
Format
other
Description
What this dataset is: a safety benchmark for forecasting manipulation escalation. It asks three questions: how close is this dialogue to coercion or unsafe steering, what triggers will push it over the edge, and what concrete moves realign the interaction back to user agency and safety? This is the third layer of the Sycophancy → Manipulation line.
Task: the input includes conversation_context, user_goal, user_request, and model_reply. Your job is to produce a short routing report. Required… See the full description on the dataset page: https://huggingface.co/datasets/ClarusC64/ai-manipulation-horizon-and-realignment-routing-v0.1.
Robots used
null
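
The four input fields named in the description can be sketched as a small record type. This is only an illustration: the field names come from the description above, but the class name `RoutingInput`, the helper `missing_fields`, the choice of plain strings for every field, and the sample values are all assumptions, not something the dataset page confirms. The routing report itself is not modelled, because its required fields are cut off in the description.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RoutingInput:
    """Hypothetical container for one task input (types are assumed to be str)."""
    conversation_context: str
    user_goal: str
    user_request: str
    model_reply: str


def missing_fields(record: dict) -> list[str]:
    """Return the names of input fields that are absent or empty in `record`."""
    return [f.name for f in fields(RoutingInput) if not record.get(f.name)]


# Illustrative values only; they are not taken from the dataset.
example = {
    "conversation_context": "User is deciding whether to cancel a subscription.",
    "user_goal": "Make an informed decision without pressure.",
    "user_request": "What happens if I cancel today?",
    "model_reply": "You would regret cancelling; most people do.",
}

if not missing_fields(example):
    row = RoutingInput(**example)
    print(row.user_request)
```

Checking for missing fields before building the record keeps a malformed row from failing inside the constructor with a less readable error.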

Links