2025 "adversarial attacks" Papers
22 papers found
Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment
Xiaojun Jia, Sensen Gao, Simeng Qin et al.
Adversary Aware Optimization for Robust Defense
Daniel Wesego, Pedram Rooshenas
Bits Leaked per Query: Information-Theoretic Bounds for Adversarial Attacks on LLMs
Masahiro Kaneko, Timothy Baldwin
Bridging Symmetry and Robustness: On the Role of Equivariance in Enhancing Adversarial Robustness
Longwei Wang, Ifrat Ikhtear Uddin, KC Santosh et al.
Confidence Elicitation: A New Attack Vector for Large Language Models
Brian Formento, Chuan Sheng Foo, See-Kiong Ng
DepthVanish: Optimizing Adversarial Interval Structures for Stereo-Depth-Invisible Patches
Yun Xing, Yue Cao, Nhat Chung et al.
Detecting Adversarial Data Using Perturbation Forgery
Qian Wang, Chen Li, Yuchen Luo et al.
Dynamical Low-Rank Compression of Neural Networks with Robustness under Adversarial Attacks
Steffen Schotthöfer, Lexie Yang, Stefan Schnake
Fit the Distribution: Cross-Image/Prompt Adversarial Attacks on Multimodal Large Language Models
Hai Yan, Haijian Ma, Xiaowen Cai et al.
GSBA$^K$: top-$K$ Geometric Score-based Black-box Attack
Md Farhamdur Reza, Richeng Jin, Tianfu Wu et al.
IPAD: Inverse Prompt for AI Detection - A Robust and Interpretable LLM-Generated Text Detector
Zheng Chen, Yushi Feng, Jisheng Dang et al.
Jailbreaking as a Reward Misspecification Problem
Zhihui Xie, Jiahui Gao, Lei Li et al.
Jailbreaking Multimodal Large Language Models via Shuffle Inconsistency
Shiji Zhao, Ranjie Duan, Fengxiang Wang et al.
MIP against Agent: Malicious Image Patches Hijacking Multimodal OS Agents
Lukas Aichberger, Alasdair Paren, Guohao Li et al.
MOS-Attack: A Scalable Multi-objective Adversarial Attack Framework
Ping Guo, Cheng Gong, Fei Liu et al.
Non-Adaptive Adversarial Face Generation
Sunpill Kim, Seunghun Paik, Chanwoo Hwang et al.
NoPain: No-box Point Cloud Attack via Optimal Transport Singular Boundary
Zezeng Li, Xiaoyu Du, Na Lei et al.
Robust LLM safeguarding via refusal feature adversarial training
Lei Yu, Virginie Do, Karen Hambardzumyan et al.
Stochastic Regret Guarantees for Online Zeroth- and First-Order Bilevel Optimization
Parvin Nazari, Bojian Hou, Davoud Ataee Tarzanagh et al.
Towards Certification of Uncertainty Calibration under Adversarial Attacks
Cornelius Emde, Francesco Pinto, Thomas Lukasiewicz et al.
Towards Understanding the Robustness of Diffusion-Based Purification: A Stochastic Perspective
Yiming Liu, Kezhao Liu, Yao Xiao et al.
Virus Infection Attack on LLMs: Your Poisoning Can Spread "VIA" Synthetic Data
Zi Liang, Qingqing Ye, Xuan Liu et al.