"adversarial attacks" Papers

53 papers found • Page 1 of 2

Adversary Aware Optimization for Robust Defense

Daniel Wesego, Pedram Rooshenas

NeurIPS 2025 • poster

Bits Leaked per Query: Information-Theoretic Bounds for Adversarial Attacks on LLMs

Masahiro Kaneko, Timothy Baldwin

NeurIPS 2025 • spotlight • arXiv:2510.17000

Bridging Symmetry and Robustness: On the Role of Equivariance in Enhancing Adversarial Robustness

Longwei Wang, Ifrat Ikhtear Uddin, Prof. KC Santosh (PhD) et al.

NeurIPS 2025 • spotlight • arXiv:2510.16171 • 2 citations

DepthVanish: Optimizing Adversarial Interval Structures for Stereo-Depth-Invisible Patches

Yun Xing, Yue Cao, Nhat Chung et al.

NeurIPS 2025 • poster • arXiv:2506.16690

Dynamical Low-Rank Compression of Neural Networks with Robustness under Adversarial Attacks

Steffen Schotthöfer, Lexie Yang, Stefan Schnake

NeurIPS 2025 • oral • arXiv:2505.08022 • 6 citations

Fit the Distribution: Cross-Image/Prompt Adversarial Attacks on Multimodal Large Language Models

Hai Yan, Haijian Ma, Xiaowen Cai et al.

NeurIPS 2025 • poster

IPAD: Inverse Prompt for AI Detection - A Robust and Interpretable LLM-Generated Text Detector

Zheng CHEN, Yushi Feng, Jisheng Dang et al.

NeurIPS 2025 • poster • arXiv:2502.15902

Jailbreaking Multimodal Large Language Models via Shuffle Inconsistency

Shiji Zhao, Ranjie Duan, Fengxiang Wang et al.

ICCV 2025 • poster • arXiv:2501.04931 • 28 citations

MIP against Agent: Malicious Image Patches Hijacking Multimodal OS Agents

Lukas Aichberger, Alasdair Paren, Guohao Li et al.

NeurIPS 2025 • poster • arXiv:2503.10809 • 10 citations

MOS-Attack: A Scalable Multi-objective Adversarial Attack Framework

Ping Guo, Cheng Gong, Fei Liu et al.

CVPR 2025 • poster • arXiv:2501.07251

Stochastic Regret Guarantees for Online Zeroth- and First-Order Bilevel Optimization

Parvin Nazari, Bojian Hou, Davoud Ataee Tarzanagh et al.

NeurIPS 2025 • poster • arXiv:2511.01126

Towards Certification of Uncertainty Calibration under Adversarial Attacks

Cornelius Emde, Francesco Pinto, Thomas Lukasiewicz et al.

ICLR 2025 • poster • arXiv:2405.13922 • 2 citations

Virus Infection Attack on LLMs: Your Poisoning Can Spread "VIA" Synthetic Data

Zi Liang, Qingqing Ye, Xuan Liu et al.

NeurIPS 2025 • spotlight

MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts

Guanjie Chen, Xinyu Zhao, Tianlong Chen et al.

ICML 2024 • poster

Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent Diffusion Model

Decheng Liu, Xijun Wang, Chunlei Peng et al.

AAAI 2024 • paper • arXiv:2312.11285 • 34 citations

Adversarial Attacks on the Interpretation of Neuron Activation Maximization

Géraldin Nanfack, Alexander Fulleringer, Jonathan Marty et al.

AAAI 2024 • paper • arXiv:2306.07397 • 12 citations

Adversarially Robust Deep Multi-View Clustering: A Novel Attack and Defense Framework

Haonan Huang, Guoxu Zhou, Yanghang Zheng et al.

ICML 2024 • poster

Breaking the Barrier: Enhanced Utility and Robustness in Smoothed DRL Agents

Chung-En Sun, Sicun Gao, Lily Weng

ICML 2024 • poster

Comparing the Robustness of Modern No-Reference Image- and Video-Quality Metrics to Adversarial Attacks

Anastasia Antsiferova, Khaled Abud, Aleksandr Gushchin et al.

AAAI 2024 • paper • arXiv:2310.06958 • 16 citations

CosPGD: an efficient white-box adversarial attack for pixel-wise prediction tasks

Shashank Agnihotri, Steffen Jung, Margret Keuper

ICML 2024 • poster

DataFreeShield: Defending Adversarial Attacks without Training Data

Hyeyoon Lee, Kanghyun Choi, Dain Kwon et al.

ICML 2024 • poster

Enhancing Adversarial Robustness in SNNs with Sparse Gradients

Yujia Liu, Tong Bu, Ding Jianhao et al.

ICML 2024 • poster

Exploring Vulnerabilities in Spiking Neural Networks: Direct Adversarial Attacks on Raw Event Data

Yanmeng Yao, Xiaohan Zhao, Bin Gu

ECCV 2024 • poster • 9 citations

Extending Adversarial Attacks to Produce Adversarial Class Probability Distributions

Jon Vadillo, Roberto Santana, Jose A Lozano

ICML 2024 • poster

Fast Adversarial Attacks on Language Models In One GPU Minute

Vinu Sankar Sadasivan, Shoumik Saha, Gaurang Sriramanan et al.

ICML 2024 • poster

Graph Neural Network Explanations are Fragile

Jiate Li, Meng Pang, Yun Dong et al.

ICML 2024 • poster

Improved Dimensionality Dependence for Zeroth-Order Optimisation over Cross-Polytopes

Weijia Shao

ICML 2024 • poster

IOI: Invisible One-Iteration Adversarial Attack on No-Reference Image- and Video-Quality Metrics

Ekaterina Shumitskaya, Anastasia Antsiferova, Dmitriy Vatolin

ICML 2024 • oral

Lyapunov-Stable Deep Equilibrium Models

Haoyu Chu, Shikui Wei, Ting Liu et al.

AAAI 2024 • paper • arXiv:2304.12707 • 7 citations

Manifold Integrated Gradients: Riemannian Geometry for Feature Attribution

Eslam Zaher, Maciej Trzaskowski, Quan Nguyen et al.

ICML 2024 • poster

MathAttack: Attacking Large Language Models towards Math Solving Ability

Zihao Zhou, Qiufeng Wang, Mingyu Jin et al.

AAAI 2024 • paper • arXiv:2309.01686 • 37 citations

MM-SafetyBench: A Benchmark for Safety Evaluation of Multimodal Large Language Models

Xin Liu, Yichen Zhu, Jindong Gu et al.

ECCV 2024 • poster • arXiv:2311.17600 • 183 citations

On the Duality Between Sharpness-Aware Minimization and Adversarial Training

Yihao Zhang, Hangzhou He, Jingyu Zhu et al.

ICML 2024 • poster

Rethinking Adversarial Robustness in the Context of the Right to be Forgotten

Chenxu Zhao, Wei Qian, Yangyi Li et al.

ICML 2024 • poster

Rethinking Independent Cross-Entropy Loss For Graph-Structured Data

Rui Miao, Kaixiong Zhou, Yili Wang et al.

ICML 2024 • poster

Revisiting Character-level Adversarial Attacks for Language Models

Elias Abad Rocamora, Yongtao Wu, Fanghui Liu et al.

ICML 2024 • poster

RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content

Zhuowen Yuan, Zidi Xiong, Yi Zeng et al.

ICML 2024 • poster

Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models

Christian Schlarmann, Naman Singh, Francesco Croce et al.

ICML 2024 • poster

Robust Communicative Multi-Agent Reinforcement Learning with Active Defense

Lebin Yu, Yunbo Qiu, Quanming Yao et al.

AAAI 2024 • paper • arXiv:2312.11545 • 8 citations

Robustness Tokens: Towards Adversarial Robustness of Transformers

Brian Pulfer, Yury Belousov, Slava Voloshynovskiy

ECCV 2024 • poster • arXiv:2503.10191

Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models

Yongshuo Zong, Ondrej Bohdal, Tingyang Yu et al.

ICML 2024 • poster

SignSGD with Federated Defense: Harnessing Adversarial Attacks through Gradient Sign Decoding

Chanho Park, Namyoon Lee

ICML 2024 • poster

Spear and Shield: Adversarial Attacks and Defense Methods for Model-Based Link Prediction on Continuous-Time Dynamic Graphs

Dongjin Lee, Juho Lee, Kijung Shin

AAAI 2024 • paper • arXiv:2308.10779

SpecFormer: Guarding Vision Transformer Robustness via Maximum Singular Value Penalization

Xixu Hu, Runkai Zheng, Jindong Wang et al.

ECCV 2024 • poster • arXiv:2402.03317 • 5 citations

Stealthy Adversarial Attacks on Stochastic Multi-Armed Bandits

Zhiwei Wang, Hongning Wang, Huazheng Wang

AAAI 2024 • paper • arXiv:2402.13487 • 1 citation

The Pitfalls and Promise of Conformal Inference Under Adversarial Attacks

Ziquan Liu, Yufei Cui, Yan Yan et al.

ICML 2024 • poster

Towards Robust Image Stitching: An Adaptive Resistance Learning against Compatible Attacks

Zhiying Jiang, Xingyuan Li, Jinyuan Liu et al.

AAAI 2024 • paper • arXiv:2402.15959 • 14 citations

Towards the Theory of Unsupervised Federated Learning: Non-asymptotic Analysis of Federated EM Algorithms

Ye Tian, Haolei Weng, Yang Feng

ICML 2024 • poster

Trustworthy Actionable Perturbations

Jesse Friedbaum, Sudarshan Adiga, Ravi Tandon

ICML 2024 • poster

Two Heads are Actually Better than One: Towards Better Adversarial Robustness via Transduction and Rejection

Nils Palumbo, Yang Guo, Xi Wu et al.

ICML 2024 • poster