SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity

4 citations · ranked #1154 of 3340 papers in ICML 2025

Abstract

Fine-tuning LLMs is both computationally and memory-intensive. While parameter-efficient fine-tuning methods, such as QLoRA and DoRA, reduce the number of trainable parameters and lower memory usage, they do not decrease computational cost. In some cases, they may even slow down fine-tuning. In this paper, we introduce SparseLoRA, a method that accelerates LLM fine-tuning through contextual sparsity. We propose a lightweight, training-free SVD sparsity estimator that dynamically selects a sparse subset of weights for loss and gradient computation. We also systematically analyze and address sensitivity across layers, tokens, and training steps. Our experimental results show that SparseLoRA reduces computational cost by up to $2.0\times$ and yields a measured speedup of up to $1.5\times$ while maintaining accuracy across various downstream tasks, including commonsense and arithmetic reasoning, code generation, and instruction following.
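The core idea of an SVD sparsity estimator can be sketched as follows. This is a hypothetical NumPy illustration, not the paper's implementation: the function names (`svd_sparsity_estimator`, `select_sparse_channels`), the rank, and the keep ratio are all assumptions. The point is that a truncated SVD gives a cheap low-rank proxy for `W @ x`, whose per-channel magnitudes can rank output channels so that only the most active subset is used in the forward and backward pass.

```python
import numpy as np

def svd_sparsity_estimator(W, rank=8):
    """Build a low-rank proxy of the weight matrix via truncated SVD.

    Hypothetical sketch: returns factors U_r (out x rank) and V_r
    (rank x in) such that U_r @ V_r approximates W.
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * S[:rank]   # fold singular values into U
    V_r = Vt[:rank, :]
    return U_r, V_r

def select_sparse_channels(U_r, V_r, x, keep_ratio=0.5):
    """Score output channels by the cheap proxy |U_r @ (V_r @ x)|
    instead of the full |W @ x|, and keep the top fraction."""
    proxy = np.abs(U_r @ (V_r @ x))
    k = max(1, int(keep_ratio * proxy.shape[0]))
    return np.argsort(proxy)[-k:]  # indices of the k most active channels

# Toy usage: pick the most active half of 16 output channels.
rng = np.random.default_rng(0)
W = rng.standard_normal((16, 32))
x = rng.standard_normal(32)
U_r, V_r = svd_sparsity_estimator(W, rank=8)
idx = select_sparse_channels(U_r, V_r, x, keep_ratio=0.5)
```

Because the rank-`r` proxy costs `O(r * (in + out))` per token rather than `O(in * out)`, the estimator stays lightweight relative to the dense matmul it helps to sparsify.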

Citation History

Jan 28, 2026: 0
Feb 13, 2026: 4