2025 Poster Papers matching "model compression"

29 papers found

Balanced Token Pruning: Accelerating Vision Language Models Beyond Local Optimization

Kaiyuan Li, Xiaoyue Chen, Chen Gao et al.

NeurIPS 2025 poster · arXiv:2505.22038 · 4 citations

Composable Interventions for Language Models

Arinbjörn Kolbeinsson, Kyle O'Brien, Tianjin Huang et al.

ICLR 2025 poster · arXiv:2407.06483 · 4 citations

Computation and Memory-Efficient Model Compression with Gradient Reweighting

Zhiwei Li, Yuesen Liao, Binrui Wu et al.

NeurIPS 2025 poster

DeRS: Towards Extremely Efficient Upcycled Mixture-of-Experts Models

Yongqi Huang, Peng Ye, Chenyu Huang et al.

CVPR 2025 poster · arXiv:2503.01359 · 6 citations

EdgeTAM: On-Device Track Anything Model

Chong Zhou, Chenchen Zhu, Yunyang Xiong et al.

CVPR 2025 poster · arXiv:2501.07256 · 8 citations

EMLoC: Emulator-based Memory-efficient Fine-tuning with LoRA Correction

Hsi-Che Lin, Yu-Chu Yu, Kai-Po Chang et al.

NeurIPS 2025 poster · arXiv:2506.12015

Fast Feedforward 3D Gaussian Splatting Compression

Yihang Chen, Qianyi Wu, Mengyao Li et al.

ICLR 2025 poster · arXiv:2410.08017 · 26 citations

FedWSQ: Efficient Federated Learning with Weight Standardization and Distribution-Aware Non-Uniform Quantization

Seung-Wook Kim, Seongyeol Kim, Jiah Kim et al.

ICCV 2025 poster · arXiv:2506.23516

HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models

Seanie Lee, Haebin Seong, Dong Bok Lee et al.

ICLR 2025 poster · arXiv:2410.01524 · 13 citations

Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search

Yuxian Gu, Qinghao Hu, Haocheng Xi et al.

NeurIPS 2025 poster · arXiv:2508.15884 · 15 citations

Knowledge Distillation with Multi-granularity Mixture of Priors for Image Super-Resolution

Simiao Li, Yun Zhang, Wei Li et al.

ICLR 2025 poster · arXiv:2404.02573 · 4 citations

Layer as Puzzle Pieces: Compressing Large Language Models through Layer Concatenation

Fei Wang, Li Shen, Liang Ding et al.

NeurIPS 2025 poster · arXiv:2510.15304

LLaMaFlex: Many-in-one LLMs via Generalized Pruning and Weight Sharing

Ruisi Cai, Saurav Muralidharan, Hongxu Yin et al.

ICLR 2025 poster · 4 citations

LLaVA-KD: A Framework of Distilling Multimodal Large Language Models

Yuxuan Cai, Jiangning Zhang, Haoyang He et al.

ICCV 2025 poster · arXiv:2410.16236 · 23 citations

Mixture Compressor for Mixture-of-Experts LLMs Gains More

Wei Huang, Yue Liao, Jianhui Liu et al.

ICLR 2025 poster · arXiv:2410.06270 · 22 citations

MODEL SHAPLEY: Find Your Ideal Parameter Player via One Gradient Backpropagation

Chu Xu, Xinke Jiang, Rihong Qiu et al.

NeurIPS 2025 poster

MosaicDiff: Training-free Structural Pruning for Diffusion Model Acceleration Reflecting Pretraining Dynamics

Bowei Guo, Shengkun Tang, Cong Zeng et al.

ICCV 2025 poster · arXiv:2510.11962 · 1 citation

One-Shot Knowledge Transfer for Scalable Person Re-Identification

Longhua Li, Lei Qi, Xin Geng

ICCV 2025 poster · arXiv:2511.06016

Optimal Brain Apoptosis

Mingyuan Sun, Zheng Fang, Jiaxu Wang et al.

ICLR 2025 poster · arXiv:2502.17941 · 3 citations

PLD: A Choice-Theoretic List-Wise Knowledge Distillation

Ejafa Bassam, Dawei Zhu, Kaigui Bian

NeurIPS 2025 poster · arXiv:2506.12542

Quantization without Tears

Minghao Fu, Hao Yu, Jie Shao et al.

CVPR 2025 poster · arXiv:2411.13918 · 14 citations

Quantized Spike-driven Transformer

Xuerui Qiu, Malu Zhang, Jieyuan Zhang et al.

ICLR 2025 poster · arXiv:2501.13492 · 14 citations

RSAVQ: Riemannian Sensitivity-Aware Vector Quantization for Large Language Models

Zukang Xu, Xing Hu, Qiang Wu et al.

NeurIPS 2025 poster · arXiv:2510.01240

SeedLM: Compressing LLM Weights into Seeds of Pseudo-Random Generators

Rasoul Shafipour, David Harrison, Maxwell Horton et al.

ICLR 2025 poster · arXiv:2410.10714 · 2 citations

SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking

Xingrun Xing, Boyan Gao, Zheng Liu et al.

ICLR 2025 poster · arXiv:2407.04752 · 21 citations

Systematic Outliers in Large Language Models

Yongqi An, Xu Zhao, Tao Yu et al.

ICLR 2025 poster · arXiv:2502.06415 · 15 citations

The Unreasonable Ineffectiveness of the Deeper Layers

Andrey Gromov, Kushal Tirumala, Hassan Shapourian et al.

ICLR 2025 poster · arXiv:2403.17887 · 160 citations

TSENOR: Highly-Efficient Algorithm for Finding Transposable N:M Sparse Masks

Xiang Meng, Mehdi Makni, Rahul Mazumder

NeurIPS 2025 poster

Two is Better than One: Efficient Ensemble Defense for Robust and Compact Models

Yoojin Jung, Byung Cheol Song

CVPR 2025 poster · arXiv:2504.04747 · 1 citation