AgentHazard: A Benchmark for Evaluating Harmful Behavior in Computer-Use Agents Paper • 2604.02947 • Published 10 days ago
BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks on Large Language Models Paper • 2408.12798 • Published Aug 23, 2024