NeurIPS "computational efficiency" Papers
20 papers found
Accelerating Multimodal Large Language Models via Dynamic Visual-Token Exit and the Empirical Findings
Qiong Wu, Wenhao Lin, Yiyi Zhou et al.
Accurate and Efficient Low-Rank Model Merging in Core Space
Aniello Panariello, Daniel Marczak, Simone Magistri et al.
Adaptive Inference-Time Scaling via Cyclic Diffusion Search
Gyubin Lee, Bao Truong, Jaesik Yoon et al.
Balanced Token Pruning: Accelerating Vision Language Models Beyond Local Optimization
kaiyuan Li, Xiaoyue Chen, Chen Gao et al.
Beyond Greedy Exits: Improved Early Exit Decisions for Risk Control and Reliability
Divya Jyoti Bajpai, Manjesh Kumar Hanawal
Bio-Inspired Image Restoration
Yuning Cui, Wenqi Ren, Alois Knoll
Efficient RAW Image Deblurring with Adaptive Frequency Modulation
Wenlong Jiao, Binglong Li, Wei Shang et al.
Faithful Group Shapley Value
Kiljae Lee, Ziqi Liu, Weijing Tang et al.
Fast Solvers for Discrete Diffusion Models: Theory and Applications of High-Order Algorithms
Yinuo Ren, Haoxuan Chen, Yuchen Zhu et al.
FastVID: Dynamic Density Pruning for Fast Video Large Language Models
Leqi Shen, Guoqiang Gong, Tao He et al.
Gatekeeper: Improving Model Cascades Through Confidence Tuning
Stephan Rabanser, Nathalie Rauschmayr, Achin Kulshrestha et al.
HoliTom: Holistic Token Merging for Fast Video Large Language Models
Kele Shao, Keda TAO, Can Qin et al.
Multi-Agent Collaboration via Evolving Orchestration
Yufan Dang, Chen Qian, Xueheng Luo et al.
One Head to Rule Them All: Amplifying LVLM Safety through a Single Critical Attention Head
Junhao Xia, Haotian Zhu, Shuchao Pang et al.
Robust Regression of General ReLUs with Queries
Ilias Diakonikolas, Daniel Kane, Mingchen Ma
SCOPE: Saliency-Coverage Oriented Token Pruning for Efficient Multimodel LLMs
Jinhong Deng, Wen Li, Joey Tianyi Zhou et al.
SHF: Symmetrical Hierarchical Forest with Pretrained Vision Transformer Encoder for High-Resolution Medical Segmentation
Enzhi Zhang, Peng Chen, Rui Zhong et al.
VCM: Vision Concept Modeling with Adaptive Vision Token Compression via Instruction Fine-Tuning
Run Luo, Renke Shan, Longze Chen et al.
Vision-centric Token Compression in Large Language Model
Ling Xing, Alex Jinpeng Wang, Rui Yan et al.
VORTA: Efficient Video Diffusion via Routing Sparse Attention
Wenhao Sun, Rong-Cheng Tu, Yifu Ding et al.