Xiangyu Qi
Papers: 4
Total citations: 418

Papers (4)
1. Safety Alignment Should be Made More Than Just a Few Tokens Deep (ICLR 2025) - 277 citations
2. SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal (ICLR 2025, arXiv) - 141 citations
3. Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications (ICML 2024) - 0 citations
4. Towards Practical Deployment-Stage Backdoor Attack on Deep Neural Networks (CVPR 2022, arXiv) - 0 citations