"mixture of experts" Papers
20 papers found
HMVLM: Human Motion-Vision-Language Model via MoE LoRA
Lei Hu, Yongjing Ye, Shihong Xia
LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation
Fangxun Shu, Yue Liao, Lei Zhang et al.
MoBA: Mixture of Block Attention for Long-Context LLMs
Enzhe Lu, Zhejun Jiang, Jingyuan Liu et al.
Multi-Task Vehicle Routing Solver via Mixture of Specialized Experts under State-Decomposable MDP
Yuxin Pan, Zhiguang Cao, Chengyang Gu et al.
NetMoE: Accelerating MoE Training through Dynamic Sample Placement
Xinyi Liu, Yujie Wang, Fangcheng Fu et al.
SkySense V2: A Unified Foundation Model for Multi-modal Remote Sensing
Yingying Zhang, Lixiang Ru, Kang Wu et al.
Towards Interpretability Without Sacrifice: Faithful Dense Layer Decomposition with Mixture of Decoders
James Oldfield, Shawn Im, Sharon Li et al.
Acquiring Diverse Skills using Curriculum Reinforcement Learning with Mixture of Experts
Onur Celik, Aleksandar Taranovic, Gerhard Neumann
A Provably Effective Method for Pruning Experts in Fine-tuned Sparse Mixture-of-Experts
Mohammed Nowaz Rabbani Chowdhury, Meng Wang, Kaoutar El Maghraoui et al.
BECoTTA: Input-dependent Online Blending of Experts for Continual Test-time Adaptation
Daeun Lee, Jaehong Yoon, Sung Ju Hwang
Exploring Training on Heterogeneous Data with Mixture of Low-rank Adapters
Yuhang Zhou, Zihua Zhao, Siyuan Du et al.
Frequency-Adaptive Pan-Sharpening with Mixture of Experts
Xuanhua He, Keyu Yan, Rui Li et al.
Is Temperature Sample Efficient for Softmax Gaussian Mixture of Experts?
Huy Nguyen, Pedram Akbarian, Nhat Ho
Merging Multi-Task Models via Weight-Ensembling Mixture of Experts
Anke Tang, Li Shen, Yong Luo et al.
Mixtures of Experts Unlock Parameter Scaling for Deep RL
Johan Obando-Ceron, Ghada Sokar, Timon Willi et al.
MoAI: Mixture of All Intelligence for Large Language and Vision Models
Byung-Kwan Lee, Beomchan Park, Chae Won Kim et al.
Norface: Improving Facial Expression Analysis by Identity Normalization
Hanwei Liu, Rudong An, Zhimeng Zhang et al.
On Least Squares Estimation in Softmax Gating Mixture of Experts
Huy Nguyen, Nhat Ho, Alessandro Rinaldo
Receptive Fields As Experts in Convolutional Neural Architectures
Dongze Lian, Weihao Yu, Xinchao Wang
Scaling Laws for Fine-Grained Mixture of Experts
Jan Ludziejewski, Jakub Krajewski, Kamil Adamczewski et al.