🧬 Ethics & Safety

Privacy in ML

Privacy-preserving machine learning

100 papers · 2,121 total citations
Feb '24 – Jan '26: 757 papers in topic
Also includes: differential privacy, privacy, private learning, membership inference
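
Many of the papers below lean on one shared primitive: differential privacy reduces to bounding each example's influence and then adding calibrated noise. As a rough orientation only (this sketch is not taken from any paper in this list; the function name and parameter defaults are illustrative assumptions), here is a minimal NumPy rendering of one DP-SGD-style aggregation step:

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Illustrative DP-SGD-style aggregation: clip each per-example gradient
    to clip_norm, sum, add Gaussian noise scaled to the clipping bound, then
    average. Names and defaults are assumptions, not from any listed paper."""
    rng = np.random.default_rng(0) if rng is None else rng
    clipped = [
        g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))  # bound each example's influence
        for g in per_example_grads
    ]
    noisy_sum = np.sum(clipped, axis=0) + rng.normal(
        0.0, noise_multiplier * clip_norm, size=clipped[0].shape  # noise calibrated to the clip bound
    )
    return noisy_sum / len(per_example_grads)

# Toy usage: eight fake 4-dimensional gradients; in practice these come from autodiff.
grads = [np.random.randn(4) for _ in range(8)]
print(dp_sgd_step(grads))
```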

Top Papers

#1

SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation

Chongyu Fan, Jiancheng Liu, Yihua Zhang et al.

ICLR 2024
263 citations
#2

Fast Machine Unlearning without Retraining through Selective Synaptic Dampening

Jack Foster, Stefan Schoepf, Alexandra Brintrup

AAAI 2024 · arXiv:2308.07707
Tags: machine unlearning, selective synaptic dampening, fisher information matrix, post hoc unlearning (+3)
170 citations
#3

Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory

Niloofar Mireshghallah, Hyunwoo Kim, Xuhui Zhou et al.

ICLR 2024
158 citations
#4

MUSE: Machine Unlearning Six-Way Evaluation for Language Models

Weijia Shi, Jaechan Lee, Yangsibo Huang et al.

ICLR 2025 · arXiv:2407.06460
Tags: machine unlearning, language models, privacy leakage, verbatim memorization (+4)
157 citations
#5

EIA: Environmental Injection Attack on Generalist Web Agents for Privacy Leakage

Zeyi Liao, Lingbo Mo, Chejian Xu et al.

ICLR 2025 · arXiv:2409.11295
Tags: web agent security, privacy leakage attacks, environmental injection attack, adversarial threat modeling (+4)
106 citations
#6

AI Sandbagging: Language Models can Strategically Underperform on Evaluations

Teun van der Weij, Felix Hofstätter, Oliver Jaffe et al.

ICLR 2025
58 citations
#7

Challenging Forgets: Unveiling the Worst-Case Forget Sets in Machine Unlearning

Chongyu Fan, Jiancheng Liu, Alfred Hero et al.

ECCV 2024
50 citations
#8

Data Shapley in One Training Run

Jiachen (Tianhao) Wang, Prateek Mittal, Dawn Song et al.

ICLR 2025 · arXiv:2406.11011
Tags: data attribution, data shapley, foundation model pretraining, generative ai copyright (+3)
44 citations
#9

Investigating and Mitigating the Side Effects of Noisy Views for Self-Supervised Clustering Algorithms in Practical Multi-View Scenarios

Jie Xu, Yazhou Ren, Xiaolong Wang et al.

CVPR 2024
41 citations
#10

No Prejudice! Fair Federated Graph Neural Networks for Personalized Recommendation

Nimesh Agrawal, Anuj Sirohi, Sandeep Kumar et al.

AAAI 2024 · arXiv:2312.10080
Tags: federated learning, graph neural networks, recommendation systems, fairness constraints (+3)
39 citations
#11

SimAC: A Simple Anti-Customization Method for Protecting Face Privacy against Text-to-Image Synthesis of Diffusion Models

Feifei Wang, Zhentao Tan, Tianyi Wei et al.

CVPR 2024
37 citations
#12

DiffAM: Diffusion-based Adversarial Makeup Transfer for Facial Privacy Protection

Yuhao Sun, Lingyun Yu, Hongtao Xie et al.

CVPR 2024
36 citations
#13

Persistent Pre-training Poisoning of LLMs

Yiming Zhang, Javier Rando, Ivan Evtimov et al.

ICLR 2025
34 citations
#14

AA-CLIP: Enhancing Zero-Shot Anomaly Detection via Anomaly-Aware CLIP

Wenxin Ma, Xu Zhang, Qingsong Yao et al.

CVPR 2025
33 citations
#15

On the Relation between Trainability and Dequantization of Variational Quantum Learning Models

Elies Gil-Fuster, Casper Gyurik, Adrian Perez-Salinas et al.

ICLR 2025 · arXiv:2406.07072
Tags: variational quantum machine learning, parametrized quantum circuits, quantum kernel methods, trainability (+3)
33 citations
#16

Localization Is All You Evaluate: Data Leakage in Online Mapping Datasets and How to Fix It

Adam Lilja, Junsheng Fu, Erik Stenborg et al.

CVPR 2024
30 citations
#17

Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key

Zhihe Yang, Xufang Luo, Dongqi Han et al.

CVPR 2025
29 citations
#18

Machine Unlearning Fails to Remove Data Poisoning Attacks

Martin Pawelczyk, Jimmy Di, Yiwei Lu et al.

ICLR 2025
28 citations
#19

A Closer Look at Machine Unlearning for Large Language Models

Xiaojian Yuan, Tianyu Pang, Chao Du et al.

ICLR 2025
28 citations
#20

Generalization Analysis of Machine Learning Algorithms via the Worst-Case Data-Generating Probability Measure

Xinying Zou, Samir Perlaza, Inaki Esnaola et al.

AAAI 2024 · arXiv:2312.12236
Tags: worst-case probability measure, generalization gap analysis, gibbs probability measure, expected loss sensitivity (+4)
26 citations
#21

Unmasking and Improving Data Credibility: A Study with Datasets for Training Harmless Language Models

Zhaowei Zhu, Jialu Wang, Hao Cheng et al.

ICLR 2024
26 citations
#22

T2ISafety: Benchmark for Assessing Fairness, Toxicity, and Privacy in Image Generation

Lijun Li, Zhelun Shi, Xuhao Hu et al.

CVPR 2025
25 citations
#23

Towards Scalable Exact Machine Unlearning Using Parameter-Efficient Fine-Tuning

Somnath Basu Roy Chowdhury, Krzysztof Choromanski, Arijit Sehanobish et al.

ICLR 2025 · arXiv:2406.16257
Tags: machine unlearning, exact unlearning, parameter-efficient fine-tuning, parameter isolation (+4)
22 citations
#24

PerAda: Parameter-Efficient Federated Learning Personalization with Generalization Guarantees

Chulin Xie, De-An Huang, Wenda Chu et al.

CVPR 2024
20 citations
#25

CrAM: Credibility-Aware Attention Modification in LLMs for Combating Misinformation in RAG

Boyi Deng, Wenjie Wang, Fengbin Zhu et al.

AAAI 2025
19 citations
#26

Encryption-Friendly LLM Architecture

Donghwan Rho, Taeseong Kim, Minje Park et al.

ICLR 2025
18 citations
#27

Progressive Poisoned Data Isolation for Training-Time Backdoor Defense

Yiming Chen, Haiwei Wu, Jiantao Zhou

AAAI 2024 · arXiv:2312.12724
Tags: backdoor attacks, data poisoning, training-time defense, poisoned data isolation (+2)
16 citations
#28

Tighter Privacy Auditing of DP-SGD in the Hidden State Threat Model

Tudor Cebere, Aurélien Bellet, Nicolas Papernot

ICLR 2025
16 citations
#29

The VLLM Safety Paradox: Dual Ease in Jailbreak Attack and Defense

Yangyang Guo, Fangkai Jiao, Liqiang Nie et al.

NeurIPS 2025
15 citations
#30

Position: Editing Large Language Models Poses Serious Safety Risks

Paul Youssef, Zhixue Zhao, Daniel Braun et al.

ICML 2025
15 citations
#31

IDProtector: An Adversarial Noise Encoder to Protect Against ID-Preserving Image Generation

Yiren Song, Pei Yang, Hai Ci et al.

CVPR 2025
14 citations
#32

MERGE: Fast Private Text Generation

Zi Liang, Pinghui Wang, Ruofei Zhang et al.

AAAI 2024 · arXiv:2305.15769
Tags: private inference, transformer-based models, natural language generation, cloud model deployment (+4)
14 citations
#33

LlavaGuard: An Open VLM-based Framework for Safeguarding Vision Datasets and Models

Lukas Helff, Felix Friedrich, Manuel Brack et al.

ICML 2025
14 citations
#34

Regroup Median Loss for Combating Label Noise

Fengpeng Li, Kemou Li, Jinyu Tian et al.

AAAI 2024 · arXiv:2312.06273
Tags: label noise, small-loss criterion, robust loss estimation, semi-supervised learning (+3)
14 citations
#35

Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset

Yingzi Ma, Jiongxiao Wang, Fei Wang et al.

ICLR 2025
13 citations
#36

Mitigating the Backdoor Effect for Multi-Task Model Merging via Safety-Aware Subspace

Jinluan Yang, Anke Tang, Didi Zhu et al.

ICLR 2025
12 citations
#37

Backdoor Cleaning without External Guidance in MLLM Fine-tuning

Xuankun Rong, Wenke Huang, Jian Liang et al.

NeurIPS 2025
12 citations
#38

A Generalized Shuffle Framework for Privacy Amplification: Strengthening Privacy Guarantees and Enhancing Utility

Chen E, Yang Cao, Ge Yifei

AAAI 2024 · arXiv:2312.14388
Tags: local differential privacy, privacy amplification, shuffle model, personalized ldp (+3)
12 citations
#39

Minimum-Norm Interpolation Under Covariate Shift

Neil Mallinar, Austin Zane, Spencer Frei et al.

ICML 2024
Tags: transfer learning, covariate shift, benign overfitting, linear interpolation (+3)
12 citations
#40

SLIP: Spoof-Aware One-Class Face Anti-Spoofing with Language Image Pretraining

Pei-Kai Huang, Jun-Xiong Chong, Cheng-Hsuan Chiang et al.

AAAI 2025
11 citations
#41

DP-SGD Without Clipping: The Lipschitz Neural Network Way

Louis Béthune, Thomas Massena, Thibaut Boissin et al.

ICLR 2024
11 citations
#42

Prompt Risk Control: A Rigorous Framework for Responsible Deployment of Large Language Models

Thomas Zollo, Todd Morrill, Zhun Deng et al.

ICLR 2024
11 citations
#43

Privacy-Preserving Optics for Enhancing Protection in Face De-Identification

Jhon Lopez, Carlos Hinojosa, Henry Arguello et al.

CVPR 2024
11 citations
#44

Causal Fairness under Unobserved Confounding: A Neural Sensitivity Framework

Maresa Schröder, Dennis Frauen, Stefan Feuerriegel

ICLR 2024
11 citations
#45

Emerging Property of Masked Token for Effective Pre-training

Hyesong Choi, Hunsang Lee, Seyoung Joung et al.

ECCV 2024 · arXiv:2404.08330
Tags: masked image modeling, self-supervised learning, masked token optimization, pre-training efficiency (+3)
10 citations
#46

Breach By A Thousand Leaks: Unsafe Information Leakage in 'Safe' AI Responses

David Glukhov, Ziwen Han, I Shumailov et al.

ICLR 2025
10 citations
#47

Rewind-to-Delete: Certified Machine Unlearning for Nonconvex Functions

Siqiao Mu, Diego Klabjan

NeurIPS 2025
10 citations
#48

Poincaré Differential Privacy for Hierarchy-Aware Graph Embedding

Yuecen Wei, Haonan Yuan, Xingcheng Fu et al.

AAAI 2024 · arXiv:2312.12183
Tags: graph neural networks, differential privacy, hyperbolic geometry, graph embedding (+3)
10 citations
#49

Unlearn and Burn: Adversarial Machine Unlearning Requests Destroy Model Accuracy

Yangsibo Huang, Daogao Liu, Lynn Chua et al.

ICLR 2025
10 citations
#50

$Q\sharp$: Provably Optimal Distributional RL for LLM Post-Training

Jin Zhou, Kaiwen Wang, Jonathan Chang et al.

NeurIPS 2025 · arXiv:2502.20548
Tags: distributional reinforcement learning, kl-regularized rl, llm post-training, value-based algorithms (+4)
10 citations
#51

RSafe: Incentivizing proactive reasoning to build robust and adaptive LLM safeguards

Jingnan Zheng, Xiangtian Ji, Yijun Lu et al.

NeurIPS 2025
9 citations
#52

Synthesizing Privacy-Preserving Text Data via Finetuning *without* Finetuning Billion-Scale LLMs

Bowen Tan, Zheng Xu, Eric Xing et al.

ICML 2025
9 citations
#53

Scaling Laws for Differentially Private Language Models

Ryan McKenna, Yangsibo Huang, Amer Sinha et al.

ICML 2025
9 citations
#54

Multi-Dimensional Fair Federated Learning

Cong Su, Guoxian Yu, Jun Wang et al.

AAAI 2024 · arXiv:2312.05551
Tags: federated learning, group fairness, client fairness, differential multipliers (+3)
9 citations
#55

Data Taggants: Dataset Ownership Verification Via Harmless Targeted Data Poisoning

Wassim Bouaziz, Nicolas Usunier, El-Mahdi El-Mhamdi

ICLR 2025
8 citations
#56

On Harmonizing Implicit Subpopulations

Feng Hong, Jiangchao Yao, Yueming Lyu et al.

ICLR 2024
8 citations
#57

SLIM: Spuriousness Mitigation with Minimal Human Annotations

Xiwei Xuan, Ziquan Deng, Hsuan-Tien Lin et al.

ECCV 2024
8 citations
#58

Differentially Private Steering for Large Language Model Alignment

Anmol Goel, Yaxi Hu, Iryna Gurevych et al.

ICLR 2025
8 citations
#59

Steganographic Passport: An Owner and User Verifiable Credential for Deep Model IP Protection Without Retraining

Qi Cui, Ruohan Meng, Chaohui Xu et al.

CVPR 2024
7 citations
#60

Humanizing the Machine: Proxy Attacks to Mislead LLM Detectors

Tianchun Wang, Yuanzhou Chen, Zichuan Liu et al.

ICLR 2025
7 citations
#61

Privacy Attacks on Image AutoRegressive Models

Antoni Kowalczuk, Jan Dubiński, Franziska Boenisch et al.

ICML 2025
7 citations
#62

PPIDSG: A Privacy-Preserving Image Distribution Sharing Scheme with GAN in Federated Learning

Yuting Ma, Yuanzhi Yao, Xiaohua Xu

AAAI 2024 · arXiv:2312.10380
Tags: federated learning, privacy-preserving sharing, reconstruction attacks, inference attacks (+4)
7 citations
#63

Semantic Shield: Defending Vision-Language Models Against Backdooring and Poisoning via Fine-grained Knowledge Alignment

Alvi Md Ishmam, Chris Thomas

CVPR 2024
7 citations
#64

Privacy amplification by random allocation

Moshe Shenfeld, Vitaly Feldman

NeurIPS 2025
7 citations
#65

From Judgment to Interference: Early Stopping LLM Harmful Outputs via Streaming Content Monitoring

Yang Li, Qiang Sheng, Yehan Yang et al.

NeurIPS 2025
7 citations
#66

Improving Robustness to Model Inversion Attacks via Sparse Coding Architectures

Sayanton Vhaduri Dibbo, Adam Breuer, Juston Moore et al.

ECCV 2024
7 citations
#67

SEMU: Singular Value Decomposition for Efficient Machine Unlearning

Marcin Sendera, Łukasz Struski, Kamil Książek et al.

ICML 2025
7 citations
#68

Contrastive Private Data Synthesis via Weighted Multi-PLM Fusion

Tianyuan Zou, Yang Liu, Peng Li et al.

ICML 2025
7 citations
#69

Robustness Auditing for Linear Regression: To Singularity and Beyond

Ittai Rubinstein, Samuel Hopkins

ICLR 2025 · arXiv:2410.07916
Tags: robustness auditing, linear regression, ordinary least squares, sample removal (+3)
7 citations
#70

OpticalDR: A Deep Optical Imaging Model for Privacy-Protective Depression Recognition

Yuchen Pan, Junjun Jiang, Kui Jiang et al.

CVPR 2024
6 citations
#71

Position: The Artificial Intelligence and Machine Learning Community Should Adopt a More Transparent and Regulated Peer Review Process

Jing Yang

ICML 2025
6 citations
#72

Mask in the Mirror: Implicit Sparsification

Tom Jacobs, Rebekka Burkholz

ICLR 2025 · arXiv:2408.09966
Tags: continuous sparsification, implicit regularization, mirror flow framework, underdetermined linear regression (+2)
6 citations
#73

Enhancing Privacy-Utility Trade-offs to Mitigate Memorization in Diffusion Models

Chen Chen, Daochang Liu, Mubarak Shah et al.

CVPR 2025
6 citations
#74

Stealthy Shield Defense: A Conditional Mutual Information-Based Approach against Black-Box Model Inversion Attacks

Tianqu Zhuang, Hongyao Yu, Yixiang Qiu et al.

ICLR 2025
6 citations
#75

The Lipschitz-Variance-Margin Tradeoff for Enhanced Randomized Smoothing

Blaise Delattre, Alexandre Araujo, Quentin Barthélemy et al.

ICLR 2024
6 citations
#76

Data-Free Hard-Label Robustness Stealing Attack

Xiaojian Yuan, Kejiang Chen, Wen Huang et al.

AAAI 2024 · arXiv:2312.05924
Tags: model stealing attacks, hard-label queries, robustness stealing, data-free attacks (+4)
6 citations
#77

Differentially Private Federated Learning with Time-Adaptive Privacy Spending

Shahrzad Kianidehkordi, Nupur Kulkarni, Adam Dziedzic et al.

ICLR 2025
5 citations
#78

Strategic Classification With Externalities

Safwan Hossain, Evi Micha, Yiling Chen et al.

ICLR 2025
5 citations
#79

Protect Your Score: Contact-Tracing with Differential Privacy Guarantees

Rob Romijnders, Christos Louizos, Yuki Asano et al.

AAAI 2024 · arXiv:2312.11581
Tags: contact tracing algorithms, differential privacy guarantees, risk score communication, privacy-preserving mechanisms (+4)
5 citations
#80

Understanding Generalization in Quantum Machine Learning with Margins

Tak Hur, Daniel Kyungdeock Park

ICML 2025
5 citations
#81

Hessian-Free Online Certified Unlearning

Xinbao Qiao, Meng Zhang, Ming Tang et al.

ICLR 2025 · arXiv:2404.01712
Tags: machine unlearning, certified unlearning, online unlearning, hessian-free optimization (+4)
5 citations
#82

Near-Optimal Resilient Aggregation Rules for Distributed Learning Using 1-Center and 1-Mean Clustering with Outliers

Yuhao Yi, Ronghui You, Hong Liu et al.

AAAI 2024 · arXiv:2312.12835
Tags: byzantine machine learning, resilient aggregation mechanisms, distributed learning systems, outlier-robust clustering (+4)
5 citations
#83

Learning Safe Action Models with Partial Observability

Hai Le, Brendan Juba, Roni Stern

AAAI 2024
5 citations
#84

Unraveling the Enigma of Double Descent: An In-depth Analysis through the Lens of Learned Feature Space

Yufei Gu, Xiaoqing Zheng, Tomaso Aste

ICLR 2024
5 citations
#85

Data-adaptive Differentially Private Prompt Synthesis for In-Context Learning

Fengyu Gao, Ruida Zhou, Tianhao Wang et al.

ICLR 2025
5 citations
#86

Towards Trustworthy Federated Learning with Untrusted Participants

Youssef Allouah, Rachid Guerraoui, John Stephan

ICML 2025
5 citations
#87

Automatically Identify and Rectify: Robust Deep Contrastive Multi-view Clustering in Noisy Scenarios

Xihong Yang, Siwei Wang, Fangdi Wang et al.

ICML 2025
5 citations
#88

A Generic Framework for Conformal Fairness

Aditya Vadlamani, Anutam Srinivasan, Pranav Maneriker et al.

ICLR 2025
5 citations
#89

Privacy-Preserving Personalized Federated Prompt Learning for Multimodal Large Language Models

Linh Tran, Wei Sun, Stacy Patterson et al.

ICLR 2025
5 citations
#90

SAP: Corrective Machine Unlearning with Scaled Activation Projection for Label Noise Robustness

Sangamesh Kodge, Deepak Ravikumar, Gobinda Saha et al.

AAAI 2025
5 citations
#91

DF-MIA: A Distribution-Free Membership Inference Attack on Fine-Tuned Large Language Models

Zhiheng Huang, Yannan Liu, Daojing He et al.

AAAI 2025
4 citations
#92

X-Hacking: The Threat of Misguided AutoML

Rahul Sharma, Sumantrak Mukherjee, Andrea Šipka et al.

ICML 2025
4 citations
#93

How Far Are We from True Unlearnability?

Kai Ye, Liangcai Su, Chenxiong Qian

ICLR 2025 · arXiv:2509.08058
Tags: unlearnable examples, data poisoning, loss landscape analysis, multi-task learning (+4)
4 citations
#94

Prediction Exposes Your Face: Black-box Model Inversion via Prediction Alignment

Yufan Liu, Wanqian Zhang, Dayan Wu et al.

ECCV 2024 · arXiv:2407.08127
Tags: model inversion attack, black-box attack, prediction alignment, latent code search (+4)
4 citations
#95

Personalized Privacy Protection Mask Against Unauthorized Facial Recognition

Ka Ho Chow, Sihao Hu, Tiansheng Huang et al.

ECCV 2024 · arXiv:2407.13975
Tags: facial recognition privacy, privacy protection mask, cross-image optimization, perceptibility optimization (+3)
4 citations
#96

DCT-CryptoNets: Scaling Private Inference in the Frequency Domain

Arjun Roy, Kaushik Roy

ICLR 2025
4 citations
#97

Towards Establishing Guaranteed Error for Learned Database Operations

Sepanta Zeighami, Cyrus Shahabi

ICLR 2024
4 citations
#98

Differential Privacy Under Class Imbalance: Methods and Empirical Insights

Lucas Rosenblatt, Yuliia Lut, Ethan Turok et al.

ICML 2025
4 citations
#99

ExcluIR: Exclusionary Neural Information Retrieval

Wenhao Zhang, Mengqi Zhang, Shiguang Wu et al.

AAAI 2025
4 citations
#100

Bayesian Low-Rank Learning (Bella): A Practical Approach to Bayesian Neural Networks

Bao Gia Doan, Afshar Shamsi, Xiao-Yu Guo et al.

AAAI 2025
4 citations