Multi-Task Learning
Learning multiple tasks jointly
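Most of the papers below build on the basic joint-training recipe: a shared backbone feeds several task-specific heads, and the per-task losses are combined into a single objective. The sketch below is a minimal, hypothetical PyTorch illustration of that setup; the module names, dimensions, and loss weighting are placeholders and are not taken from any listed paper.

```python
# Minimal sketch of joint multi-task training (hypothetical example):
# one shared encoder, one lightweight head per task, and an unweighted
# sum of per-task losses as the joint objective.
import torch
import torch.nn as nn

class MultiTaskModel(nn.Module):
    def __init__(self, in_dim, hidden_dim, task_out_dims):
        super().__init__()
        # Shared representation used by every task.
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
        # One task-specific head per task name.
        self.heads = nn.ModuleDict({
            name: nn.Linear(hidden_dim, out_dim)
            for name, out_dim in task_out_dims.items()
        })

    def forward(self, x):
        z = self.encoder(x)
        return {name: head(z) for name, head in self.heads.items()}

model = MultiTaskModel(in_dim=16, hidden_dim=32,
                       task_out_dims={"classify": 3, "regress": 1})
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(8, 16)
targets = {"classify": torch.randint(0, 3, (8,)), "regress": torch.randn(8, 1)}
criteria = {"classify": nn.CrossEntropyLoss(), "regress": nn.MSELoss()}

outputs = model(x)
# Joint objective: sum of the per-task losses (many listed papers study
# smarter weighting, routing, or merging strategies on top of this).
loss = sum(criteria[name](outputs[name], targets[name]) for name in outputs)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```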
Top Papers
Sequential Modeling Enables Scalable Learning for Large Vision Models
Yutong Bai, Xinyang Geng, Karttikeya Mangalam et al.
Why Do Multi-Agent LLM Systems Fail?
Mert Cemri, Melissa Z Pan, Shuyi Yang et al.
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
Guangxuan Xiao, Jiaming Tang, Jingwei Zuo et al.
UniIR: Training and Benchmarking Universal Multimodal Information Retrievers
Cong Wei, Yang Chen, Haonan Chen et al.
Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks
MohammadReza Davari, Eugene Belilovsky
MLVU: Benchmarking Multi-task Long Video Understanding
Junjie Zhou, Yan Shu, Bo Zhao et al.
Reliable Conflictive Multi-View Learning
Cai Xu, Jiajun Si, Ziyu Guan et al.
How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression?
Jingfeng Wu, Difan Zou, Zixiang Chen et al.
VMT-Adapter: Parameter-Efficient Transfer Learning for Multi-Task Dense Scene Understanding
Yi Xin, Junlong Du, Qiang Wang et al.
MMTEB: Massive Multilingual Text Embedding Benchmark
Kenneth Enevoldsen, Isaac Chung, Imene Kerboua et al.
SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution
Zhixuan Liang, Yao Mu, Hengbo Ma et al.
Attention-Challenging Multiple Instance Learning for Whole Slide Image Classification
Yunlong Zhang, Honglin Li, Yuxuan Sun et al.
HGPrompt: Bridging Homogeneous and Heterogeneous Graphs for Few-Shot Prompt Learning
Xingtong Yu, Yuan Fang, Zemin Liu et al.
SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints
Jianhong Bai, Menghan Xia, Xintao Wang et al.
Task Singular Vectors: Reducing Task Interference in Model Merging
Antonio Andrea Gargiulo, Donato Crisostomi, Maria Sofia Bucarelli et al.
TabM: Advancing tabular deep learning with parameter-efficient ensembling
Yury Gorishniy, Akim Kotelnikov, Artem Babenko
PARTNR: A Benchmark for Planning and Reasoning in Embodied Multi-agent Tasks
Matthew Chang, Gunjan Chhablani, Alexander Clegg et al.
XKD: Cross-Modal Knowledge Distillation with Domain Alignment for Video Representation Learning
Pritam Sarkar, Ali Etemad
STEM: Unleashing the Power of Embeddings for Multi-Task Recommendation
Liangcai Su, Junwei Pan, Ximei Wang et al.
Multi-resolution HuBERT: Multi-resolution Speech Self-Supervised Learning with Masked Unit Prediction
Jiatong Shi, Hirofumi Inaguma, Xutai Ma et al.
COMBO: Compositional World Models for Embodied Multi-Agent Cooperation
Hongxin Zhang, Zeyuan Wang, Qiushi Lyu et al.
Jointly Training Large Autoregressive Multimodal Models
Emanuele Aiello, Lili Yu, Yixin Nie et al.
Graph-Aware Contrasting for Multivariate Time-Series Classification
Yucheng Wang, Yuecong Xu, Jianfei Yang et al.
What to align in multimodal contrastive learning?
Benoit Dufumier, Javiera Castillo Navarro, Devis Tuia et al.
UMIE: Unified Multimodal Information Extraction with Instruction Tuning
Lin Sun, Kai Zhang, Qingyuan Li et al.
LAMM: Label Alignment for Multi-Modal Prompt Learning
Jingsheng Gao, Jiacheng Ruan, Suncheng Xiang et al.
Progressive Pretext Task Learning for Human Trajectory Prediction
Xiaotong Lin, Tianming Liang, Jian-Huang Lai et al.
No Task Left Behind: Isotropic Model Merging with Common and Task-Specific Subspaces
Daniel Marczak, Simone Magistri, Sebastian Cygert et al.
KABB: Knowledge-Aware Bayesian Bandits for Dynamic Expert Coordination in Multi-Agent Systems
Jusheng Zhang, Zimeng Huang, Yijia Fan et al.
eTag: Class-Incremental Learning via Embedding Distillation and Task-Oriented Generation
Libo Huang, Yan Zeng, Chuanguang Yang et al.
MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders
Baijiong Lin, Weisen Jiang, Pengguang Chen et al.
DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification
Wenhui Zhu, Xiwen Chen, Peijie Qiu et al.
Contrastive Pre-Training with Multi-View Fusion for No-Reference Point Cloud Quality Assessment
Ziyu Shan, Yujie Zhang, Qi Yang et al.
Training-Free Pretrained Model Merging
Zhengqi Xu, Ke Yuan, Huiqiong Wang et al.
Contrastive Learning for DeepFake Classification and Localization via Multi-Label Ranking
Cheng-Yao Hong, Yen-Chi Hsu, Tyng-Luh Liu
Decoupled Spatio-Temporal Consistency Learning for Self-Supervised Tracking
Yaozong Zheng, Bineng Zhong, Qihua Liang et al.
MMQA: Evaluating LLMs with Multi-Table Multi-Hop Complex Questions
Jian Wu, Linyi Yang, Dongyuan Li et al.
LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging
Ke Wang, Nikos Dimitriadis, Alessandro Favero et al.
Robust Tracking via Mamba-based Context-aware Token Learning
Jinxia Xie, Bineng Zhong, Qihua Liang et al.
Category-Level Multi-Part Multi-Joint 3D Shape Assembly
Yichen Li, Kaichun Mo, Yueqi Duan et al.
Contrastive Tuning: A Little Help to Make Masked Autoencoders Forget
Johannes Lehner, Benedikt Alkin, Andreas Fürst et al.
TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data
Siyi Du, Shaoming Zheng, Yinsong Wang et al.
AnyTouch: Learning Unified Static-Dynamic Representation across Multiple Visuo-tactile Sensors
Ruoxuan Feng, Jiangyu Hu, Wenke Xia et al.
GOAL: A Generalist Combinatorial Optimization Agent Learner
Darko Drakulić, Sofia Michel, Jean-Marc Andreoli
Improving Plasticity in Online Continual Learning via Collaborative Learning
Maorong Wang, Nicolas Michel, Ling Xiao et al.
HENet: Hybrid Encoding for End-to-end Multi-task 3D Perception from Multi-view Cameras
Zhongyu Xia, Zhiwei Lin, Xinhao Wang et al.
Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment
Ziang Yan, Zhilin Li, Yinan He et al.
Task-driven Image Fusion with Learnable Fusion Loss
Haowen Bai, Jiangshe Zhang, Zixiang Zhao et al.
Class Incremental Learning via Likelihood Ratio Based Task Prediction
Haowei Lin, Yijia Shao, Weinan Qian et al.
TIME-FS: Joint Learning of Tensorial Incomplete Multi-View Unsupervised Feature Selection and Missing-View Imputation
Yanyong Huang, Minghui Lu, Wei Huang et al.
Dense Audio-Visual Event Localization Under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration
Ziheng Zhou, Jinxing Zhou, Wei Qian et al.
Cloud-Device Collaborative Learning for Multimodal Large Language Models
Guanqun Wang, Jiaming Liu, Chenxuan Li et al.
Merging on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging
Anke Tang, Enneng Yang, Li Shen et al.
T4P: Test-Time Training of Trajectory Prediction via Masked Autoencoder and Actor-specific Token Memory
Daehee Park, Jaeseok Jeong, Sung-Hoon Yoon et al.
BCLNet: Bilateral Consensus Learning for Two-View Correspondence Pruning
Xiangyang Miao, Guobao Xiao, Shiping Wang et al.
MMR: A Large-scale Benchmark Dataset for Multi-target and Multi-granularity Reasoning Segmentation
Donggon Jang, Yucheol Cho, Suin Lee et al.
AlignMamba: Enhancing Multimodal Mamba with Local and Global Cross-modal Alignment
Yan Li, Yifei Xing, Xiangyuan Lan et al.
Mixture of Noise for Pre-Trained Model-Based Class-Incremental Learning
Kai Jiang, Zhengyan Shi, Dell Zhang et al.
Three Heads Are Better than One: Complementary Experts for Long-Tailed Semi-supervised Learning
Chengcheng Ma, Ismail Elezi, Jiankang Deng et al.
Every Node Is Different: Dynamically Fusing Self-Supervised Tasks for Attributed Graph Clustering
Pengfei Zhu, Qian Wang, Yu Wang et al.
DG-PIC: Domain Generalized Point-In-Context Learning for Point Cloud Understanding
Jincen Jiang, Qianyu Zhou, Yuhang Li et al.
Quad Bayer Joint Demosaicing and Denoising Based on Dual Encoder Network with Joint Residual Learning
Bolun Zheng, Haoran Li, Quan Chen et al.
Jointly Training and Pruning CNNs via Learnable Agent Guidance and Alignment
Alireza Ganjdanesh, Shangqian Gao, Heng Huang
UniMD: Towards Unifying Moment Retrieval and Temporal Action Detection
Yingsen Zeng, Yujie Zhong, Chengjian Feng et al.
Learning Intra-view and Cross-view Geometric Knowledge for Stereo Matching
Rui Gong, Weide Liu, Zaiwang Gu et al.
Learning Instance-Aware Correspondences for Robust Multi-Instance Point Cloud Registration in Cluttered Scenes
Zhiyuan Yu, Zheng Qin, Lintao Zheng et al.
A Second-Order Perspective on Model Compositionality and Incremental Learning
Angelo Porrello, Lorenzo Bonicelli, Pietro Buzzega et al.
Tri-Modal Motion Retrieval by Learning a Joint Embedding Space
Kangning Yin, Shihao Zou, Yuxuan Ge et al.
MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning
Yaming Yang, Dilxat Muhtar, Yelong Shen et al.
Multi-Label Cluster Discrimination for Visual Representation Learning
Xiang An, Kaicheng Yang, Xiangzi Dai et al.
HierarQ: Task-Aware Hierarchical Q-Former for Enhanced Video Understanding
Shehreen Azad, Vibhav Vineet, Yogesh S. Rawat
Enhancing Multi-Robot Semantic Navigation Through Multimodal Chain-of-Thought Score Collaboration
Zhixuan Shen, Haonan Luo, Kexun Chen et al.
Mitigating the Backdoor Effect for Multi-Task Model Merging via Safety-Aware Subspace
Jinluan Yang, Anke Tang, Didi Zhu et al.
Multi-Teacher Knowledge Distillation with Reinforcement Learning for Visual Recognition
Chuanguang Yang, XinQiang Yu, Han Yang et al.
Training-Free Model Merging for Multi-target Domain Adaptation
Wenyi Li, Huan-ang Gao, Mingju Gao et al.
Adapter Merging with Centroid Prototype Mapping for Scalable Class-Incremental Learning
Takuma Fukuda, Hiroshi Kera, Kazuhiko Kawamoto
Quantifying Task Priority for Multi-Task Optimization
Wooseong Jeong, Kuk-Jin Yoon
RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception
Jianbing Shen, Chunliang Li, Wencheng Han et al.
Let All Be Whitened: Multi-Teacher Distillation for Efficient Visual Retrieval
Zhe Ma, Jianfeng Dong, Shouling Ji et al.
SyncMask: Synchronized Attentional Masking for Fashion-centric Vision-Language Pretraining
Chull Hwan Song, Taebaek Hwang, Jooyoung Yoon et al.
Data-Efficient Multimodal Fusion on a Single GPU
Noël Vouitsis, Zhaoyan Liu, Satya Krishna Gorti et al.
FedLPS: Heterogeneous Federated Learning for Multiple Tasks with Local Parameter Sharing
Yongzhe Jia, Xuyun Zhang, Amin Beheshti et al.
PARSAC: Accelerating Robust Multi-Model Fitting with Parallel Sample Consensus
Florian Kluger, Bodo Rosenhahn
Not All Tasks Are Equally Difficult: Multi-Task Deep Reinforcement Learning with Dynamic Depth Routing
Jinmin He, Kai Li, Yifan Zang et al.
Pareto Set Learning for Multi-Objective Reinforcement Learning
Erlong Liu, Yu-Chang Wu, Xiaobin Huang et al.
TraF-Align: Trajectory-aware Feature Alignment for Asynchronous Multi-agent Perception
Zhiying Song, Lei Yang, Fuxi Wen et al.
RMP-SAM: Towards Real-Time Multi-Purpose Segment Anything
Shilin Xu, Haobo Yuan, Qingyu Shi et al.
ConfigX: Modular Configuration for Evolutionary Algorithms via Multitask Reinforcement Learning
Hongshu Guo, Zeyuan Ma, Jiacheng Chen et al.
Live and Learn: Continual Action Clustering with Incremental Views
Xiaoqiang Yan, Yingtao Gan, Yiqiao Mao et al.
Multi-Pair Temporal Sentence Grounding via Multi-Thread Knowledge Transfer Network
Xiang Fang, Wanlong Fang, Changshuo Wang et al.
MergeBench: A Benchmark for Merging Domain-Specialized LLMs
Yifei He, Siqi Zeng, Yuzheng Hu et al.
Task-Adaptive Saliency Guidance for Exemplar-free Class Incremental Learning
Xialei Liu, Jiang-Tian Zhai, Andrew Bagdanov et al.
Incomplete Multi-view Deep Clustering with Data Imputation and Alignment
Jiyuan Liu, Xinwang Liu, Xinhang Wan et al.
ROUTE: Robust Multitask Tuning and Collaboration for Text-to-SQL
Yang Qin, Chao Chen, Zhihang Fu et al.
MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents
Junpeng Yue, Xinrun Xu, Börje F. Karlsson et al.
Continual Multimodal Contrastive Learning
Xiaohao Liu, Xiaobo Xia, See-Kiong Ng et al.
MergeNet: Knowledge Migration Across Heterogeneous Models, Tasks, and Modalities
Kunxi Li, Tianyu Zhan, Kairui Fu et al.
Mimic In-Context Learning for Multimodal Tasks
Yuchu Jiang, Jiale Fu, Chenduo Hao et al.
Activation-Informed Merging of Large Language Models
Amin Heyrani Nobari, Kaveh Alimohammadi, Ali ArjomandBigdeli et al.
H-ensemble: An Information Theoretic Approach to Reliable Few-Shot Multi-Source-Free Transfer
Yanru Wu, Jianning Wang, Weida Wang et al.