All Papers

34,598 papers found • Page 586 of 692

Mixture of Weak and Strong Experts on Graphs

Hanqing Zeng, Hanjia Lyu, Diyi Hu et al.

ICLR 2024 • 10 citations

Mixtures of Experts Unlock Parameter Scaling for Deep RL

Johan Obando-Ceron, Ghada Sokar, Timon Willi et al.

ICML 2024 (spotlight) • arXiv:2402.08609 • 64 citations

MKG-FENN: A Multimodal Knowledge Graph Fused End-to-End Neural Network for Accurate Drug–Drug Interaction Prediction

Di Wu, Wu Sun, Yi He et al.

AAAI 2024

MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation

Qian Huang, Jian Vora, Percy Liang et al.

ICML 2024 • arXiv:2310.03302 • 168 citations

MLI Formula: A Nearly Scale-Invariant Solution with Noise Perturbation

Bowen Tao, Xin-Chun Li, De-Chuan Zhan

ICML 2024

MLIP: Efficient Multi-Perspective Language-Image Pretraining with Exhaustive Data Utilization

Yu Zhang, Qi Zhang, Zixuan Gong et al.

ICML 2024 • arXiv:2406.01460 • 7 citations

MLIP: Enhancing Medical Visual Representation with Divergence Encoder and Knowledge-guided Contrastive Learning

Zhe Li, Laurence Yang, Bocheng Ren et al.

CVPR 2024 • arXiv:2402.02045 • 34 citations

MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark

Dongping Chen, Ruoxi Chen, Shilin Zhang et al.

ICML 2024 • arXiv:2402.04788 • 281 citations

MLNet: Mutual Learning Network with Neighborhood Invariance for Universal Domain Adaptation

Yanzuo Lu, Meng Shen, Andy J Ma et al.

AAAI 2024 • arXiv:2312.07871 • 23 citations

MLP Can Be A Good Transformer Learner

Sihao Lin, Pumeng Lyu, Dongrui Liu et al.

CVPR 2024 • arXiv:2404.05657 • 21 citations

MLPHand: Real Time Multi-View 3D Hand Reconstruction via MLP Modeling

Jian Yang, Jiakun Li, Guoming Li et al.

ECCV 2024

ML-SemReg: Boosting Point Cloud Registration with Multi-level Semantic Consistency

Shaocheng Yan, Pengcheng Shi, Jiayuan Li

ECCV 2024 • arXiv:2407.09862 • 6 citations

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

Brandon McKinzie, Zhe Gan, Jean-Philippe Fauconnier et al.

ECCV 2024 • arXiv:2403.09611 • 250 citations

MMA-Diffusion: MultiModal Attack on Diffusion Models

Yijun Yang, Ruiyuan Gao, Xiaosen Wang et al.

CVPR 2024 • arXiv:2311.17516 • 150 citations

MMA: Multi-Modal Adapter for Vision-Language Models

Lingxiao Yang, Ru-Yuan Zhang, Yanchen Wang et al.

CVPR 2024

MmAP: Multi-Modal Alignment Prompt for Cross-Domain Multi-Task Learning

Yi Xin, Junlong Du, Qiang Wang et al.

AAAI 2024 • arXiv:2312.08636 • 88 citations

MMBench: Is Your Multi-Modal Model an All-around Player?

Yuan Liu, Haodong Duan, Yuanhan Zhang et al.

ECCV 2024 • arXiv:2307.06281 • 1745 citations

MMCert: Provable Defense against Adversarial Attacks to Multi-modal Models

Yanting Wang, Hongye Fu, Wei Zou et al.

CVPR 2024 • arXiv:2403.19080 • 5 citations

MMD Graph Kernel: Effective Metric Learning for Graphs via Maximum Mean Discrepancy

Yan Sun, Jicong Fan

ICLR 2024 (spotlight)

MMEarth: Exploring Multi-Modal Pretext Tasks For Geospatial Representation Learning

Vishal Nedungadi, Ankit Kariryaa, Stefan Oehmcke et al.

ECCV 2024 • arXiv:2405.02771 • 68 citations

MMICL: Empowering Vision-language Model with Multi-Modal In-Context Learning

Haozhe Zhao, Zefan Cai, Shuzheng Si et al.

ICLR 2024 • arXiv:2309.07915 • 191 citations

MMM: Generative Masked Motion Model

Ekkasit Pinyoanuntapong, Pu Wang, Minwoo Lee et al.

CVPR 2024 (highlight) • arXiv:2312.03596 • 103 citations

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

Xiang Yue, Yuansheng Ni, Kai Zhang et al.

CVPR 2024 • arXiv:2311.16502 • 1715 citations

MM-Narrator: Narrating Long-form Videos with Multimodal In-Context Learning

Chaoyi Zhang, Kevin Lin, Zhengyuan Yang et al.

CVPR 2024 (highlight) • arXiv:2311.17435 • 50 citations

MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance

Yake Wei, Di Hu

ICML 2024 • arXiv:2405.17730 • 64 citations

MM-Point: Multi-View Information-Enhanced Multi-Modal Self-Supervised 3D Point Cloud Understanding

HaiTao Yu, Mofei Song

AAAI 2024 • arXiv:2402.10002 • 18 citations

m&m’s: A Benchmark to Evaluate Tool-Use for multi-step multi-modal Tasks

Zixian Ma, Weikai Huang, Jieyu Zhang et al.

ECCV 2024 • arXiv:2403.11085 • 36 citations

MM-SafetyBench: A Benchmark for Safety Evaluation of Multimodal Large Language Models

Xin Liu, Yichen Zhu, Jindong Gu et al.

ECCV 2024 • arXiv:2311.17600 • 199 citations

MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos

Jielin Qiu, Jiacheng Zhu, William Han et al.

CVPR 2024 (highlight) • arXiv:2306.04216 • 14 citations

MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI

Kaining Ying, Fanqing Meng, Jin Wang et al.

ICML 2024 • arXiv:2404.16006 • 163 citations

MM-TTS: Multi-Modal Prompt Based Style Transfer for Expressive Text-to-Speech Synthesis

Wenhao Guan, Yishuang Li, Tao Li et al.

AAAI 2024 • arXiv:2312.10687 • 24 citations

MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities

Weihao Yu, Zhengyuan Yang, Linjie Li et al.

ICML 2024 • arXiv:2308.02490 • 1066 citations

MMVP: A Multimodal MoCap Dataset with Vision and Pressure Sensors

He Zhang, Shenghao Ren, Haolei Yuan et al.

CVPR 2024 • arXiv:2403.17610 • 17 citations

MMVR: Millimeter-wave Multi-View Radar Dataset and Benchmark for Indoor Perception

Mohammad Mahbubur Rahman, Ryoma Yataka, Sorachi Kato et al.

ECCV 2024 • arXiv:2406.10708 • 20 citations

M&M VTO: Multi-Garment Virtual Try-On and Editing

Luyang Zhu, Yingwei Li, Nan Liu et al.

CVPR 2024 (highlight) • arXiv:2406.04542 • 27 citations

MoAI: Mixture of All Intelligence for Large Language and Vision Models

Byung-Kwan Lee, Beomchan Park, Chae Won Kim et al.

ECCV 2024 • arXiv:2403.07508 • 34 citations

Mobile Attention: Mobile-Friendly Linear-Attention for Vision Transformers

Zhiyu Yao, Jian Wang, Haixu Wu et al.

ICML 2024

MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training

Pavan Kumar Anasosalu Vasu, Hadi Pouransari, Fartash Faghri et al.

CVPR 2024 • arXiv:2311.17049 • 90 citations

MobileDiffusion: Instant Text-to-Image Generation on Mobile Devices

Yang Zhao, Zhisheng Xiao, Yanwu Xu et al.

ECCV 2024 • arXiv:2311.16567 • 36 citations

MobileInst: Video Instance Segmentation on the Mobile

Renhong Zhang, Tianheng Cheng, Shusheng Yang et al.

AAAI 2024 • arXiv:2303.17594 • 10 citations

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases

Zechun Liu, Changsheng Zhao, Forrest Iandola et al.

ICML 2024 • arXiv:2402.14905 • 195 citations

MobileNetV4: Universal Models for the Mobile Ecosystem

Danfeng Qin, Chas Leichner, Manolis Delakis et al.

ECCV 2024 • arXiv:2404.10518 • 434 citations

Möbius Transform for Mitigating Perspective Distortions in Representation Learning

Prakash Chandra Chhipa, Meenakshi Subhash Chippa, Kanjar De et al.

ECCV 2024 • arXiv:2405.02296 • 1 citation

Mocap Everyone Everywhere: Lightweight Motion Capture With Smartwatches and a Head-Mounted Camera

Jiye Lee, Hanbyul Joo

CVPR 2024 • arXiv:2401.00847 • 30 citations

MoCha-Stereo: Motif Channel Attention Network for Stereo Matching

Ziyang Chen, Wei Long, He Yao et al.

CVPR 2024 • arXiv:2404.06842 • 73 citations

Modality-agnostic Domain Generalizable Medical Image Segmentation by Multi-Frequency in Multi-Scale Attention

Ju-Hyeon Nam, Nur Suriza Syazwany, Su Jung Kim et al.

CVPR 2024 • arXiv:2405.06284 • 59 citations

Modality-Agnostic Structural Image Representation Learning for Deformable Multi-Modality Medical Image Registration

Tony C. W. Mok, Zi Li, Yunhao Bai et al.

CVPR 2024 (highlight) • arXiv:2402.18933 • 19 citations

Modality-Collaborative Test-Time Adaptation for Action Recognition

Baochen Xiong, Xiaoshan Yang, Yaguang Song et al.

CVPR 2024

Modality Translation for Object Detection Adaptation Without Forgetting Prior Knowledge

Heitor Rapela Medeiros, Masih Aminbeidokhti, Fidel A Guerrero Pena et al.

ECCV 2024 • arXiv:2404.01492 • 4 citations

ModaVerse: Efficiently Transforming Modalities with LLMs

Xinyu Wang, Bohan Zhuang, Qi Wu

CVPR 2024 • arXiv:2401.06395 • 23 citations