Mixture of Experts
Sparse MoE architectures
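Most of the papers below build on the same core primitive: a learned router that dispatches each token to a small subset of expert networks, so only a fraction of the model's parameters are active per token. A minimal sketch of top-k gated routing follows, written in PyTorch as an assumed framework choice; the module, expert shape, and hyperparameters are illustrative and not taken from any listed paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    """Illustrative top-k sparse MoE layer (a sketch, not any paper's exact method)."""

    def __init__(self, d_model: int, d_hidden: int, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # gating network: token -> expert logits
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        tokens = x.reshape(-1, x.shape[-1])                    # (T, d_model)
        gates = F.softmax(self.router(tokens), dim=-1)         # (T, n_experts)
        weights, idx = gates.topk(self.k, dim=-1)              # each token keeps its top-k experts
        weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize over the chosen k
        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):
            token_ids, slot = (idx == e).nonzero(as_tuple=True)  # tokens routed to expert e
            if token_ids.numel():
                out[token_ids] += weights[token_ids, slot, None] * expert(tokens[token_ids])
        return out.reshape(x.shape)
```

Production systems typically add a load-balancing auxiliary loss and capacity limits on top of this routing step; those refinements, along with expert design, compression, and scheduling, are where the papers below differ.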
Top Papers
MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images
Yuedong Chen, Haofei Xu, Chuanxia Zheng et al.
Scaling and evaluating sparse autoencoders
Leo Gao, Tom Dupré la Tour, Henk Tillman et al.
Mixture-of-Agents Enhances Large Language Model Capabilities
Junlin Wang, Jue Wang, Ben Athiwaratkun et al.
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
Samuel Marks, Can Rager, Eric Michaud et al.
RoMa: Robust Dense Feature Matching
Johan Edstedt, Qiyu Sun, Georg Bökman et al.
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
Linrui Tian, Qi Wang, Bang Zhang et al.
SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference
Yuan Zhang, Chun-Kai Fan, Junpeng Ma et al.
Why Do Multi-Agent LLM Systems Fail?
Mert Cemri, Melissa Z Pan, Shuyi Yang et al.
Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts
Xiaoming Shi, Shiyu Wang, Yuqi Nie et al.
Efficient LoFTR: Semi-Dense Local Feature Matching with Sparse-Like Speed
Yifan Wang, Xingyi He, Sida Peng et al.
Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning
Ted Zadouri, Ahmet Üstün, Arash Ahmadian et al.
MogaNet: Multi-order Gated Aggregation Network
Siyuan Li, Zedong Wang, Zicheng Liu et al.
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
Min Shi, Fuxiao Liu, Shihao Wang et al.
OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Models
Changhun Lee, Jungyu Jin, Taesu Kim et al.
AxBench: Steering LLMs? Even Simple Baselines Outperform Sparse Autoencoders
Zhengxuan Wu, Aryaman Arora, Atticus Geiger et al.
MoBA: Mixture of Block Attention for Long-Context LLMs
Enzhe Lu, Zhejun Jiang, Jingyuan Liu et al.
Consistency Models Made Easy
Zhengyang Geng, Ashwini Pokle, Weijian Luo et al.
EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling
Haiyang Liu, Zihao Zhu, Giorgio Becherini et al.
MMTEB: Massive Multilingual Text Embedding Benchmark
Kenneth Enevoldsen, Isaac Chung, Imene Kerboua et al.
Towards Compact 3D Representations via Point Feature Enhancement Masked Autoencoders
Yaohua Zha, Huizhen Ji, Jinmin Li et al.
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?
Zhaorun Chen, Zichen Wen, Yichao Du et al.
What Matters When Repurposing Diffusion Models for General Dense Perception Tasks?
Guangkai Xu, Yongtao Ge, Mingyu Liu et al.
RE-Bench: Evaluating Frontier AI R&D Capabilities of Language Model Agents against Human Experts
Hjalmar Wijk, Tao Lin, Joel Becker et al.
Inductive Moment Matching
Linqi Zhou, Stefano Ermon, Jiaming Song
FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference
Xunhao Lai, Jianqiao Lu, Yao Luo et al.
SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability
Adam Karvonen, Can Rager, Johnny Lin et al.
CoMo: Controllable Motion Generation through Language Guided Pose Code Editing
Yiming Huang, Weilin Wan, Yue Yang et al.
OMG: Towards Open-vocabulary Motion Generation via Mixture of Controllers
Han Liang, Jiacheng Bao, Ruichi Zhang et al.
TabM: Advancing tabular deep learning with parameter-efficient ensembling
Yury Gorishniy, Akim Kotelnikov, Artem Babenko
Cross-Layer and Cross-Sample Feature Optimization Network for Few-Shot Fine-Grained Image Classification
Zhen-Xiang Ma, Zhen-Duo Chen, Li-Jun Zhao et al.
SEPT: Towards Efficient Scene Representation Learning for Motion Prediction
Zhiqian Lan, Yuxuan Jiang, Yao Mu et al.
S2MAE: A Spatial-Spectral Pretraining Foundation Model for Spectral Remote Sensing Data
Xuyang Li, Danfeng Hong, Jocelyn Chanussot
Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models
Keisuke Kamahori, Tian Tang, Yile Gu et al.
Self-Evolving Multi-Agent Collaboration Networks for Software Development
Yue Hu, Yuzhu Cai, Yaxin Du et al.
Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking
Heli Ben-Hamu, Itai Gat, Daniel Severo et al.
Theory on Mixture-of-Experts in Continual Learning
Hongbo Li, Sen Lin, Lingjie Duan et al.
HSEvo: Elevating Automatic Heuristic Design with Diversity-Driven Harmony Search and Genetic Algorithm Using LLMs
Pham Vu Tuan Dat, Long Doan, Huynh Thi Thanh Binh
Multi-Architecture Multi-Expert Diffusion Models
Yunsung Lee, Jin-Young Kim, Hyojun Go et al.
STanHop: Sparse Tandem Hopfield Model for Memory-Enhanced Time Series Prediction
Yu-Hsuan Wu, Jerry Hu, Weijian Li et al.
TESTAM: A Time-Enhanced Spatio-Temporal Attention Model with Mixture of Experts
Hyunwook Lee, Sungahn Ko
SmartRefine: A Scenario-Adaptive Refinement Framework for Efficient Motion Prediction
Yang Zhou, Hao Shao, Letian Wang et al.
MoH: Multi-Head Attention as Mixture-of-Head Attention
Peng Jin, Bo Zhu, Li Yuan et al.
STI-Bench: Are MLLMs Ready for Precise Spatial-Temporal World Understanding?
Yun Li, Yiming Zhang, Tao Lin et al.
Towards Energy Efficient Spiking Neural Networks: An Unstructured Pruning Framework
Xinyu Shi, Jianhao Ding, Zecheng Hao et al.
NeuSurf: On-Surface Priors for Neural Surface Reconstruction from Sparse Input Views
Han Huang, Yulun Wu, Junsheng Zhou et al.
LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation
Fangxun Shu, Yue Liao, Lei Zhang et al.
COMBO: Compositional World Models for Embodied Multi-Agent Cooperation
Hongxin Zhang, Zeyuan Wang, Qiushi Lyu et al.
Spurious Feature Diversification Improves Out-of-distribution Generalization
Yong Lin, Lu Tan, Yifan Hao et al.
MoAI: Mixture of All Intelligence for Large Language and Vision Models
Byung-Kwan Lee, Beomchan Park, Chae Won Kim et al.
Sparc3D: Sparse Representation and Construction for High-Resolution 3D Shape Modeling
Zhihao Li, Yufei Wang, Heliang Zheng et al.
SCBench: A KV Cache-Centric Analysis of Long-Context Methods
Yucheng Li, Huiqiang Jiang, Qianhui Wu et al.
Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation
Yiming Wang, Pei Zhang, Baosong Yang et al.
Frequency-Adaptive Pan-Sharpening with Mixture of Experts
Xuanhua He, Keyu Yan, Rui Li et al.
RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering Assisted Distillation
Haiming Zhang, Xu Yan, Dongfeng Bai et al.
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
Peng Jin, Bo Zhu, Yuan Li et al.
MVPGS: Excavating Multi-view Priors for Gaussian Splatting from Sparse Input Views
Wangze Xu, Huachen Gao, Shihe Shen et al.
Jailbreaking Multimodal Large Language Models via Shuffle Inconsistency
Shiji Zhao, Ranjie Duan, Fengxiang Wang et al.
Archetypal SAE: Adaptive and Stable Dictionary Learning for Concept Extraction in Large Vision Models
Thomas Fel, Ekdeep Singh Lubana, Jacob Prince et al.
Your Mixture-of-Experts LLM Is Secretly an Embedding Model for Free
Ziyue Li, Tianyi Zhou
DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification
Yuhao Wang, Yang Liu, Aihua Zheng et al.
Sparse autoencoders reveal selective remapping of visual concepts during adaptation
Hyesu Lim, Jinho Choi, Jaegul Choo et al.
UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios
Baichuan Zhou, Haote Yang, Dairong Chen et al.
UMBRAE: Unified Multimodal Brain Decoding
Weihao Xia, Raoul de Charette, Cengiz Öztireli et al.
MoDGS: Dynamic Gaussian Splatting from Casually-captured Monocular Videos with Depth Priors
Qingming Liu, Yuan Liu, Jiepeng Wang et al.
KABB: Knowledge-Aware Bayesian Bandits for Dynamic Expert Coordination in Multi-Agent Systems
Jusheng Zhang, Zimeng Huang, Yijia Fan et al.
Efficient Inference of Vision Instruction-Following Models with Elastic Cache
Zuyan Liu, Benlin Liu, Jiahui Wang et al.
Efficient Deweather Mixture-of-Experts with Uncertainty-Aware Feature-Wise Linear Modulation
Rongyu Zhang, Yulin Luo, Jiaming Liu et al.
MoDE: CLIP Data Experts via Clustering
Jiawei Ma, Po-Yao Huang, Saining Xie et al.
Universal Sparse Autoencoders: Interpretable Cross-Model Concept Alignment
Harrish Thasarathan, Julian Forsyth, Thomas Fel et al.
MUSE-VL: Modeling Unified VLM through Semantic Discrete Encoding
Rongchang Xie, Chen Du, Ping Song et al.
$\text{D}_{2}\text{O}$: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models
Zhongwei Wan, Xinjian Wu, Yu Zhang et al.
Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles
Jiangjie Chen, Qianyu He, Siyu Yuan et al.
The AdEMAMix Optimizer: Better, Faster, Older
Matteo Pagliardini, Pierre Ablin, David Grangier
Mixture Compressor for Mixture-of-Experts LLMs Gains More
Wei Huang, Yue Liao, Jianhui Liu et al.
DreamActor-M1: Holistic, Expressive and Robust Human Image Animation with Hybrid Guidance
Yuxuan Luo, Zhengkun Rong, Lizhen Wang et al.
Improving Medical Multi-modal Contrastive Learning with Expert Annotations
Yogesh Kumar, Pekka Marttinen
Hybrid-Supervised Dual-Search: Leveraging Automatic Learning for Loss-Free Multi-Exposure Image Fusion
Guanyao Wu, Hongming Fu, Jinyuan Liu et al.
Pathologies of Predictive Diversity in Deep Ensembles
Geoff Pleiss, Taiga Abe, E. Kelly Buchanan et al.
MOFDiff: Coarse-grained Diffusion for Metal-Organic Framework Design
Xiang Fu, Tian Xie, Andrew Rosen et al.
MC^2: Multi-concept Guidance for Customized Multi-concept Generation
Jiaxiu Jiang, Yabo Zhang, Kailai Feng et al.
Contrastive Tuning: A Little Help to Make Masked Autoencoders Forget
Johannes Lehner, Benedikt Alkin, Andreas Fürst et al.
GOAL: A Generalist Combinatorial Optimization Agent Learner
Darko Drakulić, Sofia Michel, Jean-Marc Andreoli
Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations for MLLMs
Barrett Tang, Zile Huang, Chengzhi Liu et al.
Hyper-Connections
Defa Zhu, Hongzhi Huang, Zihao Huang et al.
EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE
Junyi Chen, Longteng Guo, Jia Sun et al.
ExeDec: Execution Decomposition for Compositional Generalization in Neural Program Synthesis
Kensen Shi, Joey Hong, Yinlin Deng et al.
ECoFLaP: Efficient Coarse-to-Fine Layer-Wise Pruning for Vision-Language Models
Yi-Lin Sung, Jaehong Yoon, Mohit Bansal
Effective Interplay between Sparsity and Quantization: From Theory to Practice
Simla Harma, Ayan Chakraborty, Elizaveta Kostenok et al.
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts
Junmo Kang, Leonid Karlinsky, Hongyin Luo et al.
Adaptive Message Passing: A General Framework to Mitigate Oversmoothing, Oversquashing, and Underreaching
Federico Errica, Henrik Christiansen, Viktor Zaverkin et al.
Exploring Diverse Representations for Open Set Recognition
Yu Wang, Junxian Mu, Pengfei Zhu et al.
TopV: Compatible Token Pruning with Inference Time Optimization for Fast and Low-Memory Multimodal Vision Language Model
Cheng Yang, Yang Sui, Jinqi Xiao et al.
Delta Decompression for MoE-based LLMs Compression
Hao Gu, Wei Li, Lujun Li et al.
MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models
Ziyu Liu, Yuhang Zang, Xiaoyi Dong et al.
Merging on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging
Anke Tang, Enneng Yang, Li Shen et al.
Unsupervised Layer-Wise Score Aggregation for Textual OOD Detection
Maxime Darrin, Guillaume Staerman, Eduardo Dadalto Camara Gomes et al.
SFPNet: Sparse Focal Point Network for Semantic Segmentation on General LiDAR Point Clouds
Yanbo Wang, Wentao Zhao, Chuan Cao et al.
CHiP: Cross-modal Hierarchical Direct Preference Optimization for Multimodal LLMs
Jinlan Fu, Shenzhen Huangfu, Hao Fei et al.
InfMAE: A Foundation Model in the Infrared Modality
Fangcen Liu, Chenqiang Gao, Yaming Zhang et al.
UniM2AE: Multi-modal Masked Autoencoders with Unified 3D Representation for 3D Perception in Autonomous Driving
Jian Zou, Tianyu Huang, Guanglei Yang et al.