Knowledge Distillation
Transferring knowledge from large teacher models to smaller, more efficient student models
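For readers new to the topic, the classic recipe (Hinton et al., 2015) trains a compact student to match the temperature-softened output distribution of a larger teacher alongside the usual hard-label loss. The sketch below is a minimal, generic PyTorch illustration of that loss; the function and argument names (distillation_loss, temperature, alpha) are illustrative and not taken from any of the papers listed here.

```python
# Minimal sketch of classic logit-based knowledge distillation (Hinton et al., 2015).
# Names (distillation_loss, temperature, alpha) are illustrative, not from any listed paper.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      temperature: float = 4.0,
                      alpha: float = 0.5) -> torch.Tensor:
    """Blend soft-target KL divergence with hard-label cross-entropy."""
    # Soften both distributions with the same temperature.
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # Scale the KL term by T^2 so its gradient magnitude stays comparable across temperatures.
    kd_term = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * temperature ** 2
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term
```

Many of the papers below replace or augment this logit-matching term (e.g., with feature, relation, or data-level objectives), but the teacher-to-student transfer idea is the common starting point.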
Top Papers
CLIP-KD: An Empirical Study of CLIP Model Distillation
Chuanguang Yang, Zhulin An, Libo Huang et al.
Towards Continual Knowledge Graph Embedding via Incremental Distillation
Jiajun Liu, Wenjun Ke, Peng Wang et al.
LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation
Fangxun Shu, Yue Liao, Lei Zhang et al.
Distribution-aware Knowledge Prototyping for Non-exemplar Lifelong Person Re-identification
Kunlun Xu, Xu Zou, Yuxin Peng et al.
Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling
Wenda Xu, Rujun Han, Zifeng Wang et al.
Frequency-Aligned Knowledge Distillation for Lightweight Spatiotemporal Forecasting
Yuqi Li, Chuanguang Yang, Hansheng Zeng et al.
VkD: Improving Knowledge Distillation using Orthogonal Projections
Roy Miles, Ismail Elezi, Jiankang Deng
Multi-Level Optimal Transport for Universal Cross-Tokenizer Knowledge Distillation on Language Models
Xiao Cui, Mo Zhu, Yulei Qin et al.
Unlocking Dataset Distillation with Diffusion Models
Brian Moser, Federico Raue, Sebastian Palacio et al.
An Upload-Efficient Scheme for Transferring Knowledge From a Server-Side Pre-trained Generator to Clients in Heterogeneous Federated Learning
Jianqing Zhang, Yang Liu, Yang Hua et al.
Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation
Marco Mistretta, Alberto Baldrati, Marco Bertini et al.
Embarrassingly Simple Dataset Distillation
Yunzhen Feng, Shanmukha Ramakrishna Vedantam, Julia Kempe
Good Teachers Explain: Explanation-Enhanced Knowledge Distillation
Amin Parchami, Moritz Böhle, Sukrut Rao et al.
De-confounded Data-free Knowledge Distillation for Handling Distribution Shifts
Yuzheng Wang, Dingkang Yang, Zhaoyu Chen et al.
KD-DETR: Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling
Yu Wang, Xin Li, Shengzhao Wen et al.
Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective
Ming Zhong, Chenxin An, Weizhu Chen et al.
MiniPLM: Knowledge Distillation for Pre-training Language Models
Yuxian Gu, Hao Zhou, Fandong Meng et al.
Select and Distill: Selective Dual-Teacher Knowledge Transfer for Continual Learning on Vision-Language Models
Yu-Chu Yu, Chi-Pin Huang, Jr-Jen Chen et al.
Towards Robust and Parameter-Efficient Knowledge Unlearning for LLMs
Sungmin Cha, Sungjun Cho, Dasol Hwang et al.
Universal Cross-Tokenizer Distillation via Approximate Likelihood Matching
Benjamin Minixhofer, Ivan Vulić, Edoardo Maria Ponti
A Good Learner can Teach Better: Teacher-Student Collaborative Knowledge Distillation
Ayan Sengupta, Shantanu Dixit, Md Shad Akhtar et al.
Large Language Model Meets Graph Neural Network in Knowledge Distillation
Shengxiang Hu, Guobing Zou, Song Yang et al.
Fine-Grained Knowledge Selection and Restoration for Non-exemplar Class Incremental Learning
Jiang-Tian Zhai, Xialei Liu, Lu Yu et al.
Distilling Reliable Knowledge for Instance-Dependent Partial Label Learning
Dong-Dong Wu, Deng-Bao Wang, Min-Ling Zhang
Teddy: Efficient Large-Scale Dataset Distillation via Taylor-Approximated Matching
Ruonan Yu, Songhua Liu, Jingwen Ye et al.
Asymmetric Masked Distillation for Pre-Training Small Foundation Models
Zhiyu Zhao, Bingkun Huang, Sen Xing et al.
Multi-Teacher Knowledge Distillation with Reinforcement Learning for Visual Recognition
Chuanguang Yang, Xinqiang Yu, Han Yang et al.
Bridging the User-side Knowledge Gap in Knowledge-aware Recommendations with Large Language Models
Zheng Hu, Zhe Li, Ziyun Jiao et al.
Knowledge-Aware Parameter Coaching for Personalized Federated Learning
Mingjian Zhi, Yuanguo Bi, Wenchao Xu et al.
LabelDistill: Label-guided Cross-modal Knowledge Distillation for Camera-based 3D Object Detection
Sanmin Kim, Youngseok Kim, Sihwan Hwang et al.
Distilling LLM Agent into Small Models with Retrieval and Code Tools
Minki Kang, Jongwon Jeong, Seanie Lee et al.
How to Train the Teacher Model for Effective Knowledge Distillation
Shayan Mohajer Hamidi, Xizhen Deng, Renhao Tan et al.
Active Object Detection with Knowledge Aggregation and Distillation from Large Models
Dejie Yang, Yang Liu
MergeNet: Knowledge Migration Across Heterogeneous Models, Tasks, and Modalities
Kunxi Li, Tianyu Zhan, Kairui Fu et al.
Improving Knowledge Distillation via Regularizing Feature Direction and Norm
Yuzhu Wang, Lechao Cheng, Manni Duan et al.
Distill Gold from Massive Ores: Bi-level Data Pruning towards Efficient Dataset Distillation
Yue Xu, Yong-Lu Li, Kaitong Cui et al.
Teacher as a Lenient Expert: Teacher-Agnostic Data-Free Knowledge Distillation
Hyunjune Shin, Dong-Wan Choi
Improving Language Model Distillation through Hidden State Matching
Sayantan Dasgupta, Trevor Cohn
Generative Model-Based Feature Knowledge Distillation for Action Recognition
Guiqin Wang, Peng Zhao, Yanjiang Shi et al.
Boosting Residual Networks with Group Knowledge
Shengji Tang, Peng Ye, Baopu Li et al.
Graph-Based Cross-Domain Knowledge Distillation for Cross-Dataset Text-to-Image Person Retrieval
Bingjun Luo, Jinpeng Wang, Zewen Wang et al.
CMT: A Memory Compression Method for Continual Knowledge Learning of Large Language Models
Dongfang Li, Zetian Sun, Xinshuo Hu et al.
Progressive Distillation Based on Masked Generation Feature Method for Knowledge Graph Completion
Cunhang Fan, Yujie Chen, Jun Xue et al.
Domain Guidance: A Simple Transfer Approach for a Pre-trained Diffusion Model
Jincheng Zhong, Xiangcheng Zhang, Jianmin Wang et al.
A General Theoretical Framework for Learning Smallest Interpretable Models
Sebastian Ordyniak, Giacomo Paesani, Mateusz Banany et al.
Building Optimal Neural Architectures using Interpretable Knowledge
Keith Mills, Fred Han, Mohammad Salameh et al.
DCSF-KD: Dynamic Channel-wise Spatial Feature Knowledge Distillation for Object Detection
Tao Dai, Yang Lin, Hang Guo et al.
Dynamic Contrastive Knowledge Distillation for Efficient Image Restoration
Yunshuai Zhou, Junbo Qiao, Jincheng Liao et al.
Knowledge Distillation with Refined Logits
Wujie Sun, Defang Chen, Siwei Lyu et al.
Gatekeeper: Improving Model Cascades Through Confidence Tuning
Stephan Rabanser, Nathalie Rauschmayr, Achin Kulshrestha et al.
AuG-KD: Anchor-Based Mixup Generation for Out-of-Domain Knowledge Distillation
Zihao Tang, Zheqi Lv, Shengyu Zhang et al.
Less or More From Teacher: Exploiting Trilateral Geometry For Knowledge Distillation
Chengming Hu, Haolun Wu, Xuan Li et al.
What Makes a Good Dataset for Knowledge Distillation?
Logan Frank, Jim Davis
Data-to-Model Distillation: Data-Efficient Learning Framework
Ahmad Sajedi, Samir Khaki, Lucy Z. Liu et al.
EA-KD: Entropy-based Adaptive Knowledge Distillation
Chi-Ping Su, Ching-Hsun Tseng, Bin Pu et al.
AugKD: Ingenious Augmentations Empower Knowledge Distillation for Image Super-Resolution
Yun Zhang, Wei Li, Simiao Li et al.
Enhancing Generalized Few-Shot Semantic Segmentation via Effective Knowledge Transfer
Xinyue Chen, Miaojing Shi, Zijian Zhou et al.
From Models to Microtheories: Distilling a Model's Topical Knowledge for Grounded Question-Answering
Nathaniel Weir, Bhavana Dalvi Mishra, Orion Weller et al.
Query-based Knowledge Transfer for Heterogeneous Learning Environments
Norah Alballa, Wenxuan Zhang, Ziquan Liu et al.
Towards Understanding How Knowledge Evolves in Large Vision-Language Models
Sudong Wang, Yunjian Zhang, Yao Zhu et al.
Data Mixing Can Induce Phase Transitions in Knowledge Acquisition
Xinran Gu, Kaifeng Lyu, Jiazheng Li et al.
When Models Know More Than They Can Explain: Quantifying Knowledge Transfer in Human-AI Collaboration
Quan Shi, Carlos Jimenez, Shunyu Yao et al.
Harmonizing Knowledge Transfer in Neural Network with Unified Distillation
Yaomin Huang, Faming Fang, Zaoming Yan et al.
Feature Distillation is the Better Choice for Model-Heterogeneous Federated Learning
Yichen Li, Xiuying Wang, Wenchao Xu et al.
Cooperative Knowledge Distillation: A Learner Agnostic Approach
Michael Livanos, Ian Davidson, Stephen Wong
Neural Collapse Inspired Knowledge Distillation
Shuoxi Zhang, Zijian Song, Kun He
Reducing Spatial Fitting Error in Distillation of Denoising Diffusion Models
Shengzhe Zhou, Zejian Li, Shengyuan Zhang et al.
Hybrid Data-Free Knowledge Distillation
Jialiang Tang, Shuo Chen, Chen Gong
Boost Self-Supervised Dataset Distillation via Parameterization, Predefined Augmentation, and Approximation
Sheng-Feng Yu, Jia-Jiun Yao, Wei-Chen Chiu
DistillHGNN: A Knowledge Distillation Approach for High-Speed Hypergraph Neural Networks
Saman Forouzandeh, Parham Moradi Dowlatabadi, Mahdi Jalili
KDAT: Inherent Adversarial Robustness via Knowledge Distillation with Adversarial Tuning for Object Detection Models
Yarin Yerushalmi Levi, Edita Grolman, Idan Yankelev et al.
SelKD: Selective Knowledge Distillation via Optimal Transport Perspective
Liangliang Shi, Zhengyan Shi, Junchi Yan
Logit Standardization in Knowledge Distillation
Shangquan Sun, Wenqi Ren, Jingzhi Li et al.
CrossKD: Cross-Head Knowledge Distillation for Object Detection
Jiabao Wang, Yuming Chen, Zhaohui Zheng et al.
Boosting Self-Supervision for Single-View Scene Completion via Knowledge Distillation
Keonhee Han, Dominik Muhle, Felix Wimbauer et al.
Harnessing Language Model for Cross-Heterogeneity Graph Knowledge Transfer
Jinyu Yang, Ruijia Wang, Cheng Yang et al.
RadarDistill: Boosting Radar-based Object Detection Performance via Knowledge Distillation from LiDAR Features
Geonho Bang, Kwangjin Choi, Jisong Kim et al.
Co-Progression Knowledge Distillation with Knowledge Prototype for Industrial Anomaly Detection
Bokang Yang, Zhe Zhang, Jie Ma
Data Shunt: Collaboration of Small and Large Models for Lower Costs and Better Performance
Dong Chen, Yueting Zhuang, Shuo Zhang et al.
Visual Program Distillation: Distilling Tools and Programmatic Reasoning into Vision-Language Models
Yushi Hu, Otilia Stretcu, Chun-Ta Lu et al.
Maintaining Fairness in Logit-based Knowledge Distillation for Class-Incremental Learning
Zijian Gao, Shanhao Han, Xingxing Zhang et al.
Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws
Zeyuan Allen-Zhu, Yuanzhi Li
BLADE: Enhancing Black-Box Large Language Models with Small Domain-Specific Models
Haitao Li, Qingyao Ai, Jia Chen et al.
Neural-Symbolic Collaborative Distillation: Advancing Small Language Models for Complex Reasoning Tasks
Huanxuan Liao, Shizhu He, Yao Xu et al.
Small Scale Data-Free Knowledge Distillation
He Liu, Yikai Wang, Huaping Liu et al.
Progressively Knowledge Distillation via Re-parameterizing Diffusion Reverse Process
Xufeng Yao, Fanbin Lu, Yuechen Zhang et al.
Understanding the Role of the Projector in Knowledge Distillation
Heuristic-free Knowledge Distillation for Streaming ASR via Multi-modal Training
Ji Won Yoon
Out of Thin Air: Exploring Data-Free Adversarial Robustness Distillation
Yuzheng Wang, Zhaoyu Chen, Dingkang Yang et al.
Low-Rank Knowledge Decomposition for Medical Foundation Models
Yuhang Zhou, Haolin Li, Siyuan Du et al.
Small-to-Large Generalization: Training Data Influences Models Consistently Across Scale
Alaa Khaddaj, Logan Engstrom, Aleksander Madry
A Knowledge Distillation-Based Approach to Enhance Transparency of Classifier Models
Yuchen Jiang, Xinyuan Zhao, Yihang Wu et al.
Complementary Knowledge Distillation for Robust and Privacy-Preserving Model Serving in Vertical Federated Learning
Dashan Gao, Sheng Wan, Lixin Fan et al.
Knowledge in Superposition: Unveiling the Failures of Lifelong Knowledge Editing for Large Language Models
Chenhui Hu, Pengfei Cao, Yubo Chen et al.
Spatial-Temporal Knowledge Distillation for Takeaway Recommendation
Shuyuan Zhao, Wei Chen, Boyan Shi et al.
Self-Training Based Few-Shot Node Classification by Knowledge Distillation
Zongqian Wu, Yujie Mo, Peng Zhou et al.
D^4: Dataset Distillation via Disentangled Diffusion Model
Duo Su, Junjie Hou, Weizhi Gao et al.
Overcoming Generic Knowledge Loss with Selective Parameter Update
Wenxuan Zhang, Paul Janson, Rahaf Aljundi et al.
Adaptive Dual Guidance Knowledge Distillation
Tong Li, Long Liu, Kang Liu et al.
Real-Time Neural Denoising with Render-Aware Knowledge Distillation
Mengxun Kong, Jie Guo, Chen Wang et al.