streaming-phi-deidentification-benchmark

vkatg

Overview

Name
streaming-phi-deidentification-benchmark
Source
vkatg
Episodes
0
Robot count
0
Format
json
Description
Streaming PHI De-Identification Benchmark
Most PHI de-identification benchmarks evaluate a single document in isolation. That is not how clinical data actually moves. A patient's name appears in a clinical note, then in an ASR transcript ten minutes later, then in imaging metadata an hour after that. Each event looks low-risk on its own. The cumulative exposure across modalities is what creates re-identification risk. This dataset captures that. Every record is fully synthetic. It… See the full description on the dataset page: https://huggingface.co/datasets/vkatg/streaming-phi-deidentification-benchmark
Robots used
null

Links