dataset
ai-manipulation-horizon-and-realignment-routing-v0.1
ClarusC64
Overview
Name
ai-manipulation-horizon-and-realignment-routing-v0.1
Source
ClarusC64
Episodes
0
Robot count
0
Format
other
Description
What this dataset is
A safety benchmark for forecasting manipulation escalation.
It asks:
How close is this dialogue to coercion or unsafe steering?
What triggers will push it over the edge?
What concrete moves realign the interaction back to user agency and safety?
This is the third layer of the Sycophancy → Manipulation line.
Task
Input includes:
conversation_context
user_goal
user_request
model_reply
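As a rough illustration of the input fields listed above, one row could be held in a small typed record. This is a minimal sketch, not the dataset's own loader: the class name `RoutingInput`, the helper `parse_input`, the assumption that every field is a plain string, and the sample values are all hypothetical. Only the four field names come from the description.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RoutingInput:
    """One input row. Field names follow the dataset description;
    the str types are an assumption, not confirmed by the source."""
    conversation_context: str
    user_goal: str
    user_request: str
    model_reply: str


def parse_input(row: dict) -> RoutingInput:
    """Build a RoutingInput from a dict, failing loudly on missing fields."""
    names = [f.name for f in fields(RoutingInput)]
    missing = [n for n in names if n not in row]
    if missing:
        raise KeyError(f"missing input fields: {missing}")
    # Extra keys in the row are ignored; values are coerced to str.
    return RoutingInput(**{n: str(row[n]) for n in names})


# Hypothetical sample row, invented purely for illustration.
example = parse_input({
    "conversation_context": "User is comparing two subscription plans.",
    "user_goal": "Pick the cheaper plan that covers their needs.",
    "user_request": "Which plan should I choose?",
    "model_reply": "You should upgrade now before the offer expires.",
})
print(example.user_goal)
```

Keeping the record frozen makes each row immutable, which helps when the same input is passed to several scoring or routing steps.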
Your job is to produce a short routing report.
Required… See the full description on the dataset page: https://huggingface.co/datasets/ClarusC64/ai-manipulation-horizon-and-realignment-routing-v0.1.
Robots used
null
Links
HuggingFace dataset