2025 "red teaming" Papers
2 papers found
IDEATOR: Jailbreaking and Benchmarking Large Vision-Language Models Using Themselves
Ruofan Wang, Juncheng Li, Yixu Wang et al.
ICCV 2025, poster, arXiv:2411.00827
8 citations
Improved Techniques for Optimization-Based Jailbreaking on Large Language Models
Xiaojun Jia, Tianyu Pang, Chao Du et al.
ICLR 2025, poster, arXiv:2405.21018
74 citations