CVPR Poster Papers
MatAnyone: Stable Video Matting with Consistent Memory Propagation
Peiqing Yang, Shangchen Zhou, Jixin Zhao et al.
Matrix-Free Shared Intrinsics Bundle Adjustment
Daniel Safari
MBQ: Modality-Balanced Quantization for Large Vision-Language Models
Shiyao Li, Yingchun Hu, Xuefei Ning et al.
MC^2: Multi-concept Guidance for Customized Multi-concept Generation
Jiaxiu Jiang, Yabo Zhang, Kailai Feng et al.
MCCD: Multi-Agent Collaboration-based Compositional Diffusion for Complex Text-to-Image Generation
Mingcheng Li, Xiaolu Hou, Ziyang Liu et al.
MDP: Multidimensional Vision Model Pruning with Latency Constraint
Xinglong Sun, Barath Lakshmanan, Maying Shen et al.
MEAT: Multiview Diffusion Model for Human Generation on Megapixels with Mesh Attention
Yuhan Wang, Fangzhou Hong, Shuai Yang et al.
MedUnifier: Unifying Vision-and-Language Pre-training on Medical Data with Vision Generation Task using Discrete Visual Representations
Ziyang Zhang, Yang Yu, Yucheng Chen et al.
Medusa: A Multi-Scale High-order Contrastive Dual-Diffusion Approach for Multi-View Clustering
Liang Chen, Zhe Xue, Yawen Li et al.
MEET: Towards Memory-Efficient Temporal Sparse Deep Neural Networks
Zeqi Zhu, Ibrahim Batuhan Akkaya, Luc Waeijen et al.
MeGA: Hybrid Mesh-Gaussian Head Avatar for High-Fidelity Rendering and Head Editing
Cong Wang, Di Kang, Heyi Sun et al.
MEGA: Masked Generative Autoencoder for Human Mesh Recovery
Guénolé Fiche, Simon Leglaive, Xavier Alameda-Pineda et al.
MegaSaM: Accurate, Fast and Robust Structure and Motion from Casual Dynamic Videos
Zhengqi Li, Richard Tucker, Forrester Cole et al.
MegaSynth: Scaling Up 3D Scene Reconstruction with Synthesized Data
Hanwen Jiang, Zexiang Xu, Desai Xie et al.
MERGE: Multi-faceted Hierarchical Graph-based GNN for Gene Expression Prediction from Whole Slide Histopathology Images
Aniruddha Ganguly, Debolina Chatterjee, Wentao Huang et al.
MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization
Siyuan Li, Luyuan Zhang, Zedong Wang et al.
MESC-3D: Mining Effective Semantic Cues for 3D Reconstruction from a Single Image
Shaoming Li, Qing Cai, Songqi Kong et al.
MeshArt: Generating Articulated Meshes with Structure-Guided Transformers
Daoyi Gao, Mohd Yawar Nihal Siddiqui, Lei Li et al.
Mesh Mamba: A Unified State Space Model for Saliency Prediction in Non-Textured and Textured Meshes
Kaiwei Zhang, Dandan Zhu, Xiongkuo Min et al.
MET3R: Measuring Multi-View Consistency in Generated Images
Mohammad Asim, Christopher Wewer, Thomas Wimmer et al.
METASCENES: Towards Automated Replica Creation for Real-world 3D Scans
Huangyue Yu, Baoxiong Jia, Yixin Chen et al.
MetaShadow: Object-Centered Shadow Detection, Removal, and Synthesis
Tianyu Wang, Jianming Zhang, Haitian Zheng et al.
MetaWriter: Personalized Handwritten Text Recognition Using Meta-Learned Prompt Tuning
Wenhao Gu, Li Gu, Ching Suen et al.
MExD: An Expert-Infused Diffusion Model for Whole-Slide Image Classification
Jianwei Zhao, Xin Li, Fan Yang et al.
MFogHub: Bridging Multi-Regional and Multi-Satellite Data for Global Marine Fog Detection and Forecasting
Mengqiu Xu, Kaixin Chen, Heng Guo et al.
MG-MotionLLM: A Unified Framework for Motion Comprehension and Generation across Multiple Granularities
Bizhu Wu, Jinheng Xie, Keming Shen et al.
MICAS: Multi-grained In-Context Adaptive Sampling for 3D Point Cloud Processing
Feifei Shao, Ping Liu, Zhao Wang et al.
MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research
James Burgess, Jeffrey J Nirschl, Laura Bravo-Sánchez et al.
MI-DETR: An Object Detection Model with Multi-time Inquiries Mechanism
Zhixiong Nan, Xianghong Li, Tao Xiang et al.
MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation
Zehuan Huang, Yuanchen Guo, Xingqiao An et al.
Mimic In-Context Learning for Multimodal Tasks
Yuchu Jiang, Jiale Fu, Chenduo Hao et al.
Mimir: Improving Video Diffusion Models for Precise Text Understanding
Shuai Tan, Biao Gong, Yutong Feng et al.
MIMO: A Medical Vision Language Model with Visual Referring Multimodal Input and Pixel Grounding Multimodal Output
Yanyuan Chen, Dexuan Xu, Yu Huang et al.
MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling
Yifang Men, Yuan Yao, Miaomiao Cui et al.
Minding Fuzzy Regions: A Data-driven Alternating Learning Paradigm for Stable Lesion Segmentation
Lexin Fang, Yunyang Xu, Xiang Ma et al.
Mind the Gap: Confidence Discrepancy Can Guide Federated Semi-Supervised Learning Across Pseudo-Mismatch
Yijie Liu, Xinyi Shang, Yiqun Zhang et al.
Mind the Gap: Detecting Black-box Adversarial Attacks in the Making through Query Update Analysis
Jeonghwan Park, Niall McLaughlin, Ihsen Alouani
Mind the Time: Temporally-Controlled Multi-Event Video Generation
Ziyi Wu, Aliaksandr Siarohin, Willi Menapace et al.
Minimal Interaction Separated Tuning: A New Paradigm for Visual Adaptation
Ningyuan Tang, Minghao Fu, Jianxin Wu
MINIMA: Modality Invariant Image Matching
Jiangwei Ren, Xingyu Jiang, Zizhuo Li et al.
Minimizing Labeled, Maximizing Unlabeled: An Image-Driven Approach for Video Instance Segmentation
Fangyun Wei, Jinjing Zhao, Kun Yan et al.
Minority-Focused Text-to-Image Generation via Prompt Optimization
Soobin Um, Jong Chul Ye
MIRE: Matched Implicit Neural Representations
Dhananjaya Jayasundara, Heng Zhao, Demetrio Labate et al.
MirrorVerse: Pushing Diffusion Models to Realistically Reflect the World
Ankit Dhiman, Manan Shah, R. Venkatesh Babu
Missing Target-Relevant Information Prediction with World Model for Accurate Zero-Shot Composed Image Retrieval
Yuanmin Tang, Jing Yu, Keke Gai et al.
Mitigating Ambiguities in 3D Classification with Gaussian Splatting
Ruiqi Zhang, Hao Zhu, Jingyi Zhao et al.
Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key
Zhihe Yang, Xufang Luo, Dongqi Han et al.
Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention
Wenbin An, Feng Tian, Sicong Leng et al.
Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation
Jiaming Zhou, Teli Ma, Kun-Yu Lin et al.
MixerMDM: Learnable Composition of Human Motion Diffusion Models
Pablo Ruiz-Ponce, German Barquero, Cristina Palmero et al.