"backdoor attacks" Papers
21 papers found
Backdoor Mitigation by Distance-Driven Detoxification
Shaokui Wei, Jiayin Liu, Hongyuan Zha
Certifying Language Model Robustness with Fuzzed Randomized Smoothing: An Efficient Defense Against Backdoor Attacks
Bowei He, Lihao Yin, Huiling Zhen et al.
DeDe: Detecting Backdoor Samples for SSL Encoders via Decoders
Sizai Hou, Songze Li, Duanyi Yao
FedRACE: A Hierarchical and Statistical Framework for Robust Federated Learning
Gang Yan, Sikai Yang, Wan Du
Infighting in the Dark: Multi-Label Backdoor Attack in Federated Learning
Ye Li, Yanchao Zhao, Chengcheng Zhu et al.
Virus Infection Attack on LLMs: Your Poisoning Can Spread "VIA" Synthetic Data
Zi Liang, Qingqing Ye, Xuan Liu et al.
Backdoor Attacks via Machine Unlearning
Zihao Liu, Tianhao Wang, Mengdi Huai et al.
BadRL: Sparse Targeted Backdoor Attack against Reinforcement Learning
Jing Cui, Yufei Han, Yuzhe Ma et al.
Better Safe than Sorry: Pre-training CLIP against Targeted Data Poisoning and Backdoor Attacks
Wenhan Yang, Jingdong Gao, Baharan Mirzasoleiman
Causality Based Front-door Defense Against Backdoor Attack on Language Models
Yiran Liu, Xiaoang Xu, Zhiyi Hou et al.
Defense against Backdoor Attack on Pre-trained Language Models via Head Pruning and Attention Normalization
Xingyi Zhao, Depeng Xu, Shuhan Yuan
Does Few-Shot Learning Suffer from Backdoor Attacks?
Xinwei Liu, Xiaojun Jia, Jindong Gu et al.
Elijah: Eliminating Backdoors Injected in Diffusion Models via Distribution Shift
Shengwei An, Sheng-Yen Chou, Kaiyuan Zhang et al.
Flatness-aware Sequential Learning Generates Resilient Backdoors
Hoang Pham, The-Anh Ta, Anh Tran et al.
IBD-PSC: Input-level Backdoor Detection via Parameter-oriented Scaling Consistency
Linshan Hou, Ruili Feng, Zhongyun Hua et al.
Progressive Poisoned Data Isolation for Training-Time Backdoor Defense
Yiming Chen, Haiwei Wu, Jiantao Zhou
Resisting Backdoor Attacks in Federated Learning via Bidirectional Elections and Individual Perspective
Zhen Qin, Feiyi Chen, Chen Zhi et al.
SHINE: Shielding Backdoors in Deep Reinforcement Learning
Zhuowen Yuan, Wenbo Guo, Jinyuan Jia et al.
TERD: A Unified Framework for Safeguarding Diffusion Models Against Backdoors
Yichuan Mo, Hui Huang, Mingjie Li et al.
TrojVLM: Backdoor Attack Against Vision Language Models
Weimin Lyu, Lu Pang, Tengfei Ma et al.
WBP: Training-time Backdoor Attacks through Hardware-based Weight Bit Poisoning
Kunbei Cai, Zhenkai Zhang, Qian Lou et al.