Xiangyu Qi
Papers: 4
Total citations: 418

Papers (4)
1. Safety Alignment Should be Made More Than Just a Few Tokens Deep (ICLR 2025) - 277 citations
2. SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal (ICLR 2025, arXiv) - 141 citations
3. Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications (ICML 2024) - 0 citations
4. Towards Practical Deployment-Stage Backdoor Attack on Deep Neural Networks (CVPR 2022, arXiv) - 0 citations