🧬 Efficiency

Model Compression

Making models smaller and faster

100 papers · 2,062 total citations
[Chart: papers over time, Feb '24 – Jan '26; 327 papers]
Also includes: model compression, compression, model efficiency, compact models
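To illustrate the kind of technique this topic covers — not drawn from any specific paper below — here is a minimal sketch of unstructured magnitude pruning in Python/NumPy, one of the simplest ways to make a model smaller: zero out the smallest-magnitude fraction of a layer's weights.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude `sparsity` fraction of weights
    (unstructured magnitude pruning). Illustrative sketch only."""
    k = int(weights.size * sparsity)  # number of weights to remove
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    mask = np.abs(weights) > threshold  # keep only weights above the cutoff
    return weights * mask

# Prune a random 256x256 weight matrix to 90% sparsity
rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256))
pruned = magnitude_prune(w, sparsity=0.9)
print(f"nonzero fraction: {np.count_nonzero(pruned) / pruned.size:.2f}")
```

In practice, the sparse matrix would then be stored in a compressed format (e.g. CSR) or paired with quantization; many of the pruning papers below refine how the removed weights are chosen or how accuracy is recovered afterwards.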

Top Papers

#1 · Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs
Suyu Ge, Yunan Zhang, Liyuan Liu et al. · ICLR 2024 · 372 citations

#2 · RECOMP: Improving Retrieval-Augmented LMs with Context Compression and Selective Augmentation
Fangyuan Xu, Weijia Shi, Eunsol Choi · ICLR 2024 · 222 citations

#3 · HAC: Hash-grid Assisted Context for 3D Gaussian Splatting Compression
Yihang Chen, Qianyi Wu, Weiyao Lin et al. · ECCV 2024 · 179 citations

#4 · OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models
Gaojie Lin, Jianwen Jiang, Jiaqi Yang et al. · ICCV 2025 · 86 citations

#5 · Consistency Models Made Easy
Zhengyang Geng, Ashwini Pokle, Weijian Luo et al. · ICLR 2025 · 81 citations

#6 · Model Stock: All we need is just a few fine-tuned models
Dong-Hwan Jang, Sangdoo Yun, Dongyoon Han · ECCV 2024 · 71 citations

#7 · UMA: A Family of Universal Models for Atoms
Brandon Wood, Misko Dzamba, Xiang Fu et al. · NeurIPS 2025 · arXiv:2506.23971 · 62 citations
atomic simulations, materials science, mixture of linear experts, empirical scaling laws (+4)

#8 · Accelerating Diffusion Transformers with Token-wise Feature Caching
Chang Zou, Xuyang Liu, Ting Liu et al. · ICLR 2025 · 62 citations

#9 · Generalized Large-Scale Data Condensation via Various Backbone and Statistical Matching
Shitong Shao, Zeyuan Yin, Muxin Zhou et al. · CVPR 2024 · 54 citations

#10 · Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation
Lanqing Guo, Yingqing He, Haoxin Chen et al. · ECCV 2024 · 50 citations
#11 · EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
Rang Meng, Xingyu Zhang, Yuming Li et al. · CVPR 2025 · 49 citations

#12 · DiffAvatar: Simulation-Ready Garment Optimization with Differentiable Simulation
Yifei Li, Hsiaoyu Chen, Egor Larionov et al. · CVPR 2024 · 41 citations

#13 · TinySAM: Pushing the Envelope for Efficient Segment Anything Model
Han Shu, Wenshuo Li, Yehui Tang et al. · AAAI 2025 · 37 citations

#14 · Texture-Preserving Diffusion Models for High-Fidelity Virtual Try-On
Xu Yang, Changxing Ding, Zhibin Hong et al. · CVPR 2024 · 37 citations

#15 · From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning
Le Zhuo, Liangbing Zhao, Sayak Paul et al. · ICCV 2025 · 28 citations

#16 · A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training
Kai Wang, Mingjia Shi, YuKun Zhou et al. · CVPR 2025 · 24 citations

#17 · Training-Free Pretrained Model Merging
Zhengqi Xu, Ke Yuan, Huiqiong Wang et al. · CVPR 2024 · 24 citations

#18 · Towards Fast, Specialized Machine Learning Force Fields: Distilling Foundation Models via Energy Hessians
Ishan Amin, Sanjeev Raja, Aditi Krishnapriyan · ICLR 2025 · 21 citations

#19 · Effective Interplay between Sparsity and Quantization: From Theory to Practice
Simla Harma, Ayan Chakraborty, Elizaveta Kostenok et al. · ICLR 2025 · 19 citations

#20 · Merging on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging
Anke Tang, Enneng Yang, Li Shen et al. · NeurIPS 2025 · 18 citations
#21 · Swift4D: Adaptive divide-and-conquer Gaussian Splatting for compact and efficient reconstruction of dynamic scene
Jiahao Wu, Rui Peng, Zhiyan Wang et al. · ICLR 2025 · 16 citations

#22 · OATS: Outlier-Aware Pruning Through Sparse and Low Rank Decomposition
Stephen Zhang, Vardan Papyan · ICLR 2025 · 16 citations

#23 · Locality-aware Gaussian Compression for Fast and High-quality Rendering
Seungjoo Shin, Jaesik Park, Sunghyun Cho · ICLR 2025 · 15 citations

#24 · Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation
Xiyi Chen, Marko Mihajlovic, Shaofei Wang et al. · CVPR 2024 · 15 citations

#25 · Active Data Curation Effectively Distills Large-Scale Multimodal Models
Vishaal Udandarao, Nikhil Parthasarathy, Muhammad Ferjad Naeem et al. · CVPR 2025 · 14 citations

#26 · MambaIC: State Space Models for High-Performance Learned Image Compression
Fanhu Zeng, Hao Tang, Yihua Shao et al. · CVPR 2025 · 14 citations

#27 · MrT5: Dynamic Token Merging for Efficient Byte-level Language Models
Julie Kallini, Shikhar Murty, Christopher Manning et al. · ICLR 2025 · 14 citations

#28 · Palu: KV-Cache Compression with Low-Rank Projection
Chi-Chih Chang, Wei-Cheng Lin, Chien-Yu Lin et al. · ICLR 2025 · 14 citations

#29 · ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference
Xiang Liu, Zhenheng Tang, Peijie Dong et al. · NeurIPS 2025 · 14 citations

#30 · Compression of 3D Gaussian Splatting with Optimized Feature Planes and Standard Video Codecs
Soonbin Lee, Fangwen Shu, Yago Sanchez de la Fuente et al. · ICCV 2025 · arXiv:2501.03399 · 13 citations
3d gaussian splatting, 3d scene representation, feature planes, progressive tri-plane structure (+4)
#31 · Turbo: Informativity-Driven Acceleration Plug-In for Vision-Language Large Models
Chen Ju, Haicheng Wang, Haozhe Cheng et al. · ECCV 2024 · 12 citations

#32 · Addax: Utilizing Zeroth-Order Gradients to Improve Memory Efficiency and Performance of SGD for Fine-Tuning Language Models
Zeman Li, Xinwei Zhang, Peilin Zhong et al. · ICLR 2025 · 11 citations

#33 · MatryoshkaKV: Adaptive KV Compression via Trainable Orthogonal Projection
Bokai Lin, Zihao Zeng, Zipeng Xiao et al. · ICLR 2025 · 10 citations

#34 · 70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float (DFloat11)
Tianyi Zhang, Mohsen Hariri, Shaochen (Henry) Zhong et al. · NeurIPS 2025 · 10 citations

#35 · PLeaS - Merging Models with Permutations and Least Squares
Anshul Nasery, Jonathan Hayase, Pang Wei Koh et al. · CVPR 2025 · arXiv:2407.02447 · 10 citations
model merging, permutation symmetries, least squares approximation, fine-tuned models (+3)

#36 · PowerMLP: An Efficient Version of KAN
Ruichen Qiu, Yibo Miao, Shiwen Wang et al. · AAAI 2025 · 9 citations

#37 · You Only Prune Once: Designing Calibration-Free Model Compression With Policy Learning
Ayan Sengupta, Siddhant Chaudhary, Tanmoy Chakraborty · ICLR 2025 · 9 citations

#38 · MagCache: Fast Video Generation with Magnitude-Aware Cache
Zehong Ma, Longhui Wei, Feng Wang et al. · NeurIPS 2025 · 9 citations

#39 · Scaling Collapse Reveals Universal Dynamics in Compute-Optimally Trained Neural Networks
Shikai Qiu, Lechao Xiao, Andrew Wilson et al. · ICML 2025 · 9 citations

#40 · HybridGS: High-Efficiency Gaussian Splatting Data Compression using Dual-Channel Sparse Representation and Point Cloud Encoder
Qi Yang, Le Yang, Geert Van der Auwera et al. · ICML 2025 · 9 citations
#41 · Towards More Accurate Diffusion Model Acceleration with A Timestep Tuner
Mengfei Xia, Yujun Shen, Changsong Lei et al. · CVPR 2024 · 9 citations

#42 · Efficient 3D Implicit Head Avatar with Mesh-anchored Hash Table Blendshapes
Ziqian Bai, Feitong Tan, Sean Fanello et al. · CVPR 2024 · 9 citations

#43 · ModSkill: Physical Character Skill Modularization
Yiming Huang, Zhiyang Dou, Lingjie Liu · ICCV 2025 · 8 citations

#44 · AMES: Asymmetric and Memory-Efficient Similarity Estimation for Instance-level Retrieval
Pavel Suma, Giorgos Kordopatis-Zilos, Ahmet Iscen et al. · ECCV 2024 · 8 citations

#45 · Toward Tiny and High-quality Facial Makeup with Data Amplify Learning
Qiaoqiao Jin, Xuanhong Chen, Meiguang Jin et al. · ECCV 2024 · 8 citations

#46 · Compression via Pre-trained Transformers: A Study on Byte-Level Multimodal Data
David Heurtel-Depeiges, Anian Ruoss, Joel Veness et al. · ICML 2025 · 7 citations

#47 · SwiftTry: Fast and Consistent Video Virtual Try-On with Diffusion Models
Hung Nguyen, Quang Qui-Vinh Nguyen, Khoi Nguyen et al. · AAAI 2025 · 7 citations

#48 · Kinetics: Rethinking Test-Time Scaling Law
Ranajoy Sadhukhan, Zhuoming Chen, Haizhong Zheng et al. · NeurIPS 2025 · arXiv:2506.05333 · 7 citations
test-time scaling, memory access bottlenecks, sparse attention, inference-time strategies (+3)

#49 · AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models
Xinghui Li, Qichao Sun, Pengze Zhang et al. · CVPR 2025 · 7 citations

#50 · Vision-centric Token Compression in Large Language Model
Ling Xing, Alex Jinpeng Wang, Rui Yan et al. · NeurIPS 2025 · arXiv:2502.00791 · 7 citations
token compression, vision-language models, context window expansion, in-context learning (+4)
#51 · Token Compensator: Altering Inference Cost of Vision Transformer without Re-Tuning
Shibo Jie, Yehui Tang, Jianyuan Guo et al. · ECCV 2024 · 7 citations

#52 · Learned Image Compression with Hierarchical Progressive Context Modeling
Yuqi Li, Haotian Zhang, Li Li et al. · ICCV 2025 · 6 citations

#53 · Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better
Enshu Liu, Junyi Zhu, Zinan Lin et al. · ICLR 2025 · arXiv:2404.02241 · 6 citations
diffusion models, consistency models, checkpoint averaging, generative models (+3)

#54 · ZPressor: Bottleneck-Aware Compression for Scalable Feed-Forward 3DGS
Weijie Wang, Donny Y. Chen, Zeyu Zhang et al. · NeurIPS 2025 · 6 citations

#55 · Differentiable Product Quantization for Memory Efficient Camera Relocalization
Zakaria Laskar, Iaroslav Melekhov, Assia Benbihi et al. · ECCV 2024 · 6 citations

#56 · Visual Persona: Foundation Model for Full-Body Human Customization
Jisu Nam, Soowon Son, Zhan Xu et al. · CVPR 2025 · 6 citations

#57 · Model Reveals What to Cache: Profiling-Based Feature Reuse for Video Diffusion Models
Xuran Ma, Yexin Liu, Yaofu Liu et al. · ICCV 2025 · 6 citations

#58 · StableCodec: Taming One-Step Diffusion for Extreme Image Compression
Tianyu Zhang, Xin Luo, Li Li et al. · ICCV 2025 · 6 citations

#59 · A Unified Model for Compressed Sensing MRI Across Undersampling Patterns
Armeet Singh Jatyani, Jiayun Wang, Aditi Chandrashekar et al. · CVPR 2025 · 6 citations

#60 · Stationary Representations: Optimally Approximating Compatibility and Implications for Improved Model Replacements
Niccolò Biondi, Federico Pernici, Simone Ricci et al. · CVPR 2024 · 5 citations
#61 · MERGE³: Efficient Evolutionary Merging on Consumer-grade GPUs
Tommaso Mencattini, Adrian Robert Minut, Donato Crisostomi et al. · ICML 2025 · 5 citations

#62 · Two Sparse Matrices are Better than One: Sparsifying Neural Networks with Double Sparse Factorization
Vladimir Boza, Vladimir Macko · ICLR 2025 · 5 citations

#63 · PixelMan: Consistent Object Editing with Diffusion Models via Pixel Manipulation and Generation
Liyao Jiang, Negar Hassanpour, Mohammad Salameh et al. · AAAI 2025 · 5 citations

#64 · CoMBO: Conflict Mitigation via Branched Optimization for Class Incremental Segmentation
Kai Fang, Anqi Zhang, Guangyu Gao et al. · CVPR 2025 · 5 citations

#65 · Make Lossy Compression Meaningful for Low-Light Images
Shilv Cai, Liqun Chen, Sheng Zhong et al. · AAAI 2024 · arXiv:2305.15030 · 5 citations
lossy image compression, low-light image enhancement, end-to-end architecture, signal-to-noise ratio aware (+2)

#66 · Lightweight Predictive 3D Gaussian Splats
Junli Cao, Vidit Goel, Chaoyang Wang et al. · ICLR 2025 · arXiv:2406.19434 · 5 citations
3d gaussian splats, scene representation, lightweight rendering, point cloud compression (+3)

#67 · Unsegment Anything by Simulating Deformation
Jiahao Lu, Xingyi Yang, Xinchao Wang · CVPR 2024 · 5 citations

#68 · We Should Chart an Atlas of All the World's Models
Eliahu Horwitz, Nitzan Kurer, Jonathan Kahana et al. · NeurIPS 2025 · 5 citations

#69 · ETCH: Generalizing Body Fitting to Clothed Humans via Equivariant Tightness
Boqian Li, Zeyu Cai, Michael Black et al. · ICCV 2025 · arXiv:2503.10624 · 5 citations
body fitting, clothed humans, equivariant tightness, point cloud processing (+4)

#70 · Leveraging Hierarchical Feature Sharing for Efficient Dataset Condensation
Haizhong Zheng, Jiachen Sun, Shutong Wu et al. · ECCV 2024 · 5 citations
#71 · PhysRig: Differentiable Physics-Based Skinning and Rigging Framework for Realistic Articulated Object Modeling
Hao Zhang, Haolan Xu, Chun Feng et al. · ICCV 2025 · 5 citations

#72 · Make Your Training Flexible: Towards Deployment-Efficient Video Models
Chenting Wang, Kunchang Li, Tianxiang Jiang et al. · ICCV 2025 · 5 citations

#73 · Asymptotic Unbiased Sample Sampling to Speed Up Sharpness-Aware Minimization
Jiaxin Deng, Junbiao Pang, Baochang Zhang et al. · AAAI 2025 · 4 citations

#74 · WISE: A Framework for Gigapixel Whole-Slide-Image Lossless Compression
Yu Mao, Jun Wang, Nan Guan et al. · CVPR 2025 · 4 citations

#75 · Metamizer: A Versatile Neural Optimizer for Fast and Accurate Physics Simulations
Nils Wandel, Stefan Schulz, Reinhard Klein · ICLR 2025 · 4 citations

#76 · Efficient and Accurate Explanation Estimation with Distribution Compression
Hubert Baniecki, Giuseppe Casalicchio, Bernd Bischl et al. · ICLR 2025 · 4 citations

#77 · SpeCache: Speculative Key-Value Caching for Efficient Generation of LLMs
Shibo Jie, Yehui Tang, Kai Han et al. · ICML 2025 · 4 citations

#78 · Fast and Low-Cost Genomic Foundation Models via Outlier Removal
Haozheng Luo, Chenghao Qiu, Maojiang Su et al. · ICML 2025 · 4 citations

#79 · LLaMaFlex: Many-in-one LLMs via Generalized Pruning and Weight Sharing
Ruisi Cai, Saurav Muralidharan, Hongxu Yin et al. · ICLR 2025 · 4 citations
structured pruning, knowledge distillation, weight sharing, zero-shot pruning (+4)

#80 · Anatomically Constrained Implicit Face Models
Prashanth Chandran, Gaspard Zoss · CVPR 2024 · 4 citations
#81 · Accelerated Methods with Compressed Communications for Distributed Optimization Problems Under Data Similarity
Dmitry Bylinkin, Aleksandr Beznosikov · AAAI 2025 · 3 citations

#82 · Scale Efficient Training for Large Datasets
Qing Zhou, Junyu Gao, Qi Wang · CVPR 2025 · 3 citations

#83 · CuMPerLay: Learning Cubical Multiparameter Persistence Vectorizations
Caner Korkmaz, Brighton Nuwagira, Baris Coskunuzer et al. · ICCV 2025 · 3 citations

#84 · Scaling Down Text Encoders of Text-to-Image Diffusion Models
Lifu Wang, Daqing Liu, Xinchen Liu et al. · CVPR 2025 · 3 citations

#85 · Compressing Streamable Free-Viewpoint Videos to 0.1 MB per Frame
Luyang Tang, Jiayu Yang, Rui Peng et al. · AAAI 2025 · 3 citations

#86 · Cache Me If You Must: Adaptive Key-Value Quantization for Large Language Models
Alina Shutova, Vladimir Malinovskii, Vage Egiazarian et al. · ICML 2025 · 3 citations

#87 · A Closer Look at Model Collapse: From a Generalization-to-Memorization Perspective
Lianghe Shi, Meng Wu, Huijie Zhang et al. · NeurIPS 2025 · 3 citations

#88 · ShoeModel: Learning to Wear on the User-specified Shoes via Diffusion Model
Wenyu Li, Binghui Chen, Yifeng Geng et al. · ECCV 2024 · 3 citations

#89 · Adaptive Pruning of Pretrained Transformer via Differential Inclusions
Yizhuo Ding, Ke Fan, Yikai Wang et al. · ICLR 2025 · 3 citations

#90 · RivuletMLP: An MLP-based Architecture for Efficient Compressed Video Quality Enhancement
Gang He, Weiran Wang, Guancheng Quan et al. · CVPR 2025 · 3 citations
#91 · FREE-Merging: Fourier Transform for Efficient Model Merging
Shenghe Zheng, Hongzhi Wang · ICCV 2025 · 3 citations

#92 · DMesh++: An Efficient Differentiable Mesh for Complex Shapes
Sanghyun Son, Matheus Gadelha, Yang Zhou et al. · ICCV 2025 · arXiv:2412.16776 · 3 citations
differentiable mesh processing, 3d triangular meshes, mesh connectivity, shape reconstruction (+3)

#93 · Faster and Better 3D Splatting via Group Training
Chengbo Wang, Guozheng Ma, Yizhen Lao et al. · ICCV 2025 · arXiv:2412.07608 · 3 citations
3d gaussian splatting, novel view synthesis, scene reconstruction, training efficiency (+2)

#94 · Improving Visual and Downstream Performance of Low-Light Enhancer with Vision Foundation Models Collaboration
Yuxuan Gu, Huaian Chen, Yi Jin et al. · CVPR 2025 · 2 citations

#95 · Adaptive Routing of Text-to-Image Generation Requests Between Large Cloud Model and Light-Weight Edge Model
Zewei Xin, Qinya Li, Chaoyue Niu et al. · ICCV 2025 · 2 citations

#96 · Trade-offs in Image Generation: How Do Different Dimensions Interact?
Sicheng Zhang, Binzhu Xie, Zhonghao Yan et al. · ICCV 2025 · 2 citations

#97 · Singular Value Scaling: Efficient Generative Model Compression via Pruned Weights Refinement
Hyeonjin Kim, Jaejun Yoo · AAAI 2025 · 2 citations

#98 · DeepCompress-ViT: Rethinking Model Compression to Enhance Efficiency of Vision Transformers at the Edge
Sabbir Ahmed, Abdullah Al Arafat, Deniz Najafi et al. · CVPR 2025 · 2 citations

#99 · Test-Time Fine-Tuning of Image Compression Models for Multi-Task Adaptability
Unki Park, Seongmoon Jeong, Jang Youngchan et al. · CVPR 2025 · 2 citations

#100 · PocketSR: The Super-Resolution Expert in Your Pocket Mobiles
Haoze Sun, Linfeng Jiang, Fan Li et al. · NeurIPS 2025 · 2 citations