Festivus
sim
medium
eval_dataset
metric · varies
B3 Agent Security Benchmark Weak
Description
Hugging Face evaluation dataset: Lakera/b3-agent-security-benchmark-weak
Source
https://huggingface.co/datasets/Lakera/b3-agent-security-benchmark-weak
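A minimal sketch of how the dataset above might be fetched, assuming the third-party Hugging Face `datasets` library is installed and the repository is publicly accessible with a single default configuration (neither is confirmed on this page). The helper names `dataset_url` and `load_benchmark` are hypothetical, introduced only for illustration. Split and column names are not listed here, so the sketch prints whatever the Hub returns rather than assuming them.

```python
# Repository id and base URL taken from the Description and Source fields above.
HF_DATASETS_BASE = "https://huggingface.co/datasets/"
REPO_ID = "Lakera/b3-agent-security-benchmark-weak"


def dataset_url(repo_id: str) -> str:
    """Build the Hugging Face dataset page URL for a repository id."""
    return HF_DATASETS_BASE + repo_id


def load_benchmark(repo_id: str = REPO_ID):
    """Download the dataset from the Hub (hypothetical helper).

    Assumes `pip install datasets` and network access. If the repository
    is gated or defines several configurations, this call may need an
    access token or an explicit configuration name.
    """
    from datasets import load_dataset  # imported lazily: third-party dependency

    # Without a `split` argument, load_dataset returns a DatasetDict
    # mapping each split name to a Dataset.
    return load_dataset(repo_id)


if __name__ == "__main__":
    print(dataset_url(REPO_ID))
    splits = load_benchmark()
    for name, split in splits.items():
        # Inspect the splits and columns instead of assuming them.
        print(name, split.num_rows, split.column_names)
```

Inspecting the splits and column names first is a deliberate choice: the metric for this entry is listed as "varies", so any scoring code would need to be written against the schema actually returned.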