Prateek Mittal

19
Papers
463
Total Citations

Papers (19)

Safety Alignment Should be Made More Than Just a Few Tokens Deep

ICLR 2025
277
citations

SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal

ICLR 2025, arXiv
141
citations

Data Shapley in One Training Run

ICLR 2025, arXiv
44
citations

ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search

NeurIPS 2025
1
citation

Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications

ICML 2024
0
citations

A New Linear Scaling Rule for Private Adaptive Hyperparameter Optimization

ICML 2024
0
citations

Adapting to Evolving Adversaries with Regularized Continual Robust Training

ICML 2025
0
citations

PatchDEMUX: A Certifiably Robust Framework for Multi-label Classifiers Against Adversarial Patches

CVPR 2025
0
citations

Differentially Private Image Classification by Learning Priors from Random Processes

NeurIPS 2023
0
citations

Characterizing the Optimal $0-1$ Loss for Multi-class Classification with a Test-time Attacker

NeurIPS 2023
0
citations

A Privacy-Friendly Approach to Data Valuation

NeurIPS 2023
0
citations

Analyzing Federated Learning through an Adversarial Lens

ICML 2019
0
citations

PAC-learning in the presence of adversaries

NeurIPS 2018
0
citations

Lower Bounds on Adversarial Robustness from Optimal Transport

NeurIPS 2019
0
citations

HYDRA: Pruning Adversarially Robust Neural Networks

NeurIPS 2020
0
citations

Formulating Robustness Against Unforeseen Attacks

NeurIPS 2022
0
citations

Understanding Robust Learning through the Lens of Representation Similarities

NeurIPS 2022
0
citations

Renyi Differential Privacy of Propose-Test-Release and Applications to Private and Robust Machine Learning

NeurIPS 2022
0
citations

A Randomized Approach to Tight Privacy Accounting

NeurIPS 2023
0
citations