Prateek Mittal
Papers: 19
Total citations: 463
Papers (19)
Safety Alignment Should be Made More Than Just a Few Tokens Deep
ICLR 2025
277
citations
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal
ICLR 2025 (arXiv)
141
citations
Data Shapley in One Training Run
ICLR 2025 (arXiv)
44
citations
ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search
NeurIPS 2025
1
citations
Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
ICML 2024
0
citations
A New Linear Scaling Rule for Private Adaptive Hyperparameter Optimization
ICML 2024
0
citations
Adapting to Evolving Adversaries with Regularized Continual Robust Training
ICML 2025
0
citations
PatchDEMUX: A Certifiably Robust Framework for Multi-label Classifiers Against Adversarial Patches
CVPR 2025
0
citations
Differentially Private Image Classification by Learning Priors from Random Processes
NeurIPS 2023
0
citations
Characterizing the Optimal $0-1$ Loss for Multi-class Classification with a Test-time Attacker
NeurIPS 2023
0
citations
A Privacy-Friendly Approach to Data Valuation
NeurIPS 2023
0
citations
Analyzing Federated Learning through an Adversarial Lens
ICML 2019
0
citations
PAC-learning in the presence of adversaries
NeurIPS 2018
0
citations
Lower Bounds on Adversarial Robustness from Optimal Transport
NeurIPS 2019
0
citations
HYDRA: Pruning Adversarially Robust Neural Networks
NeurIPS 2020
0
citations
Formulating Robustness Against Unforeseen Attacks
NeurIPS 2022
0
citations
Understanding Robust Learning through the Lens of Representation Similarities
NeurIPS 2022
0
citations
Renyi Differential Privacy of Propose-Test-Release and Applications to Private and Robust Machine Learning
NeurIPS 2022
0
citations
A Randomized Approach to Tight Privacy Accounting
NeurIPS 2023
0
citations