CVPR Papers
5,589 papers found • Page 34 of 112
Monocular and Generalizable Gaussian Talking Head Animation
Shengjie Gong, Haojie Li, Jiapeng Tang et al.
MonoDGP: Monocular 3D Object Detection with Decoupled-Query and Geometry-Error Priors
Fanqi Pu, Yifan Wang, Jiru Deng et al.
MonoInstance: Enhancing Monocular Priors via Multi-view Instance Alignment for Neural Rendering and Reconstruction
Wenyuan Zhang, Yixiao Yang, Han Huang et al.
Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training
Luo, Xue Yang, Wenhan Dou et al.
MonoPlace3D: Learning 3D-Aware Object Placement for 3D Monocular Detection
Rishubh Parihar, Srinjay Sarkar, Sarthak Vora et al.
MonoSplat: Generalizable 3D Gaussian Splatting from Monocular Depth Foundation Models
Yifan Liu, Keyu Fan, Weihao Yu et al.
MonoTAKD: Teaching Assistant Knowledge Distillation for Monocular 3D Object Detection
Hou-I Liu, Christine Wu, Jen-Hao Cheng et al.
MonSter: Marry Monodepth to Stereo Unleashes Power
JunDa Cheng, Longliang Liu, Gangwei Xu et al.
Morpheus: Text-Driven 3D Gaussian Splat Shape and Color Stylization
Jamie Wynn, Zawar Qureshi, Jakub Powierza et al.
Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation
Junha Lee, Chunghyun Park, Jaesung Choe et al.
Mosaic of Modalities: A Comprehensive Benchmark for Multimodal Graph Learning
Jing Zhu, Yuhang Zhou, Shengyi Qian et al.
MOS-Attack: A Scalable Multi-objective Adversarial Attack Framework
Ping Guo, Cheng Gong, Fei Liu et al.
MoSca: Dynamic Gaussian Fusion from Casual Videos via 4D Motion Scaffolds
Jiahui Lei, Yijia Weng, Adam W Harley et al.
MOS: Modeling Object-Scene Associations in Generalized Category Discovery
Zhengyuan Peng, Jinpeng Ma, Zhimin Sun et al.
MoST: Efficient Monarch Sparse Tuning for 3D Representation Learning
Xu Han, Yuan Tang, Jinfeng Xu et al.
MotiF: Making Text Count in Image Animation with Motion Focal Loss
Shijie Wang, Samaneh Azadi, Rohit Girdhar et al.
MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models
Wenyi Hong, Yean Cheng, Zhuoyi Yang et al.
Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level
Andong Deng, Tongjia Chen, Shoubin Yu et al.
MotionMap: Representing Multimodality in Human Pose Forecasting
Reyhaneh Hosseininejad, Megh Shukla, Saeed Saadatnejad et al.
Motion Modes: What Could Happen Next?
Karran Pandey, Yannick Hold-Geoffroy, Matheus Gadelha et al.
MotionPro: A Precise Motion Controller for Image-to-Video Generation
Zhongwei Zhang, Fuchen Long, Zhaofan Qiu et al.
MotionPRO: Exploring the Role of Pressure in Human MoCap and Beyond
Shenghao Ren, Yi Lu, Jiayi Huang et al.
Motion Prompting: Controlling Video Generation with Motion Trajectories
Daniel Geng, Charles Herrmann, Junhwa Hur et al.
Motions as Queries: One-Stage Multi-Person Holistic Human Motion Capture
Kenkun Liu, Yurong Fu, Weihao Yuan et al.
MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation
Shuwei Shi, Biao Gong, Xi Chen et al.
Move-in-2D: 2D-Conditioned Human Motion Generation
Hsin-Ping Huang, Yang Zhou, Jui-Hsien Wang et al.
MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders
jiajun cao, Yuan Zhang, Tao Huang et al.
MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation
Weijia Wu, Mingyu Liu, Zeyu Zhu et al.
Movie Weaver: Tuning-Free Multi-Concept Video Personalization with Anchored Prompts
Feng Liang, Haoyu Ma, Zecheng He et al.
MOVIS: Enhancing Multi-Object Novel View Synthesis for Indoor Scenes
Ruijie Lu, Yixin Chen, Junfeng Ni et al.
MPDrive: Improving Spatial Understanding with Marker-Based Prompt Learning for Autonomous Driving
Zhi-Yuan Zhang, Xiaofan Li, Zhihao Xu et al.
MP-GUI: Modality Perception with MLLMs for GUI Understanding
Ziwei Wang, Weizhi Chen, Leyang Yang et al.
MP-SfM: Monocular Surface Priors for Robust Structure-from-Motion
Zador Pataki, Paul-Edouard Sarlin, Johannes Schönberger et al.
Mr. DETR: Instructive Multi-Route Training for Detection Transformers
Chang-Bin Zhang, Yujie Zhong, Kai Han
MTADiffusion: Mask Text Alignment Diffusion Model for Object Inpainting
jun huang, Ting Liu, Yihang Wu et al.
Multi-focal Conditioned Latent Diffusion for Person Image Synthesis
Jiaqi Liu, Jichao Zhang, Paolo Rota et al.
MultiGO: Towards Multi-level Geometry Learning for Monocular 3D Textured Human Reconstruction
Gangjian Zhang, Nanjie Yao, Shunsi Zhang et al.
Multi-Granularity Class Prototype Topology Distillation for Class-Incremental Source-Free Unsupervised Domain Adaptation
Peihua Deng, Jiehua Zhang, Xichun Sheng et al.
Multi-Group Proportional Representations for Text-to-Image Models
Sangwon Jung, Alex Oesterling, Claudio Mayrink Verdun et al.
Multi-Label Prototype Visual Spatial Search for Weakly Supervised Semantic Segmentation
Songsong Duan, Xi Yang, Nannan Wang
Multi-Layer Visual Feature Fusion in Multimodal LLMs: Methods, Analysis, and Best Practices
Junyan Lin, Haoran Chen, Yue Fan et al.
Multi-Modal Aerial-Ground Cross-View Place Recognition with Neural ODEs
Sijie Wang, Rui She, Qiyu Kang et al.
Multimodal Autoregressive Pre-training of Large Vision Encoders
Enrico Fini, Mustafa Shukor, Xiujun Li et al.
Multi-modal Contrastive Learning with Negative Sampling Calibration for Phenotypic Drug Discovery
Jiahua Rao, Hanjing Lin, Leyu Chen et al.
Multi-Modal Contrastive Masked Autoencoders: A Two-Stage Progressive Pre-training Approach for RGBD Datasets
Muhammad Abdullah Jamal, Omid Mohareri
Multi-modal Knowledge Distillation-based Human Trajectory Forecasting
Jaewoo Jeong, Seohee Lee, Daehee Park et al.
Multi-modal Medical Diagnosis via Large-small Model Collaboration
Wanyi Chen, Zihua Zhao, Jiangchao Yao et al.
MultimodalStudio: A Heterogeneous Sensor Dataset and Framework for Neural Rendering across Multiple Imaging Modalities
Federico Lincetto, Gianluca Agresti, Mattia Rossi et al.
Multi-Modal Synergistic Implicit Image Enhancement for Efficient Optical Flow Estimation
Weichen Dai, wu hexing, xiaoyang weng et al.
Multi-modal Topology-embedded Graph Learning for Spatially Resolved Genes Prediction from Pathology Images with Prior Gene Similarity Information
Hang Shi, Chi Changxi, Peng Wan et al.