CVPR Papers
5,589 papers found • Page 36 of 112
Noise-Consistent Siamese-Diffusion for Medical Image Synthesis and Segmentation
Kunpeng Qiu, Zhiqiang Gao, Zhiying Zhou et al.
NoiseCtrl: A Sampling-Algorithm-Agnostic Conditional Generation Method for Diffusion Models
Longquan Dai, He Wang, Jinhui Tang
Noise Diffusion for Enhancing Semantic Faithfulness in Text-to-Image Synthesis
Boming Miao, Chunxiao Li, Xiaoxiao Wang et al.
Noise Modeling in One Hour: Minimizing Preparation Efforts for Self-supervised Low-Light RAW Image Denoising
Feiran Li, Haiyang Jiang, Daisuke Iso
Noise-Resistant Video Anomaly Detection via RGB Error-Guided Multiscale Predictive Coding and Dynamic Memory
Han Hu, Wenli Du, Peng Liao et al.
Nonisotropic Gaussian Diffusion for Realistic 3D Human Motion Prediction
Cecilia Curreli, Dominik Muhle, Abhishek Saroha et al.
Non-Natural Image Understanding with Advancing Frequency-based Vision Encoders
Wang Lin, Qingsong Wang, Yueying Feng et al.
NoPain: No-box Point Cloud Attack via Optimal Transport Singular Boundary
Zezeng Li, Xiaoyu Du, Na Lei et al.
No Pains, More Gains: Recycling Sub-Salient Patches for Efficient High-Resolution Image Recognition
Rong Qin, Xin Liu, Xingyu Liu et al.
Not All Parameters Matter: Masking Diffusion Models for Enhancing Generation Ability
Lei Wang, Senmao Li, Fei Yang et al.
Notes-guided MLLM Reasoning: Enhancing MLLM with Knowledge and Visual Notes for Visual Question Answering
Wenlong Fang, Qiaofeng Wu, Jing Chen et al.
NoT: Federated Unlearning via Weight Negation
Yasser Khalil, Leo Maxime Brunswic, Soufiane Lamghari et al.
No Thing, Nothing: Highlighting Safety-Critical Classes for Robust LiDAR Semantic Segmentation in Adverse Weather
Junsung Park, HwiJeong Lee, Inha Kang et al.
Not Just Text: Uncovering Vision Modality Typographic Threats in Image Generation Models
Hao Cheng, Erjia Xiao, Jiayan Yang et al.
Not Only Text: Exploring Compositionality of Visual Representations in Vision-Language Models
Davide Berasi, Matteo Farina, Massimiliano Mancini et al.
Novel View Synthesis with Pixel-Space Diffusion Models
Noam Elata, Bahjat Kawar, Yaron Ostrovsky-Berman et al.
NSD-Imagery: A Benchmark Dataset for Extending fMRI Vision Decoding Methods to Mental Imagery
Reese Kneeland, Paul Scotti, Ghislain St-Yves et al.
NTClick: Achieving Precise Interactive Segmentation With Noise-tolerant Clicks
Chenyi Zhang, Ting Liu, Xiaochao Qu et al.
NTR-Gaussian: Nighttime Dynamic Thermal Reconstruction with 4D Gaussian Splatting Based on Thermodynamics
Kun Yang, Yuxiang Liu, Zeyu Cui et al.
Nullu: Mitigating Object Hallucinations in Large Vision-Language Models via HalluSpace Projection
Le Yang, Ziwei Zheng, Boxu Chen et al.
Number it: Temporal Grounding Videos like Flipping Manga
Yongliang Wu, Xinting Hu, Yuyang Sun et al.
NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images
Lingen Li, Zhaoyang Zhang, Yaowei Li et al.
NVILA: Efficient Frontier Visual Language Models
Zhijian Liu, Ligeng Zhu, Baifeng Shi et al.
Object-aware Sound Source Localization via Audio-Visual Scene Understanding
Sung Jin Um, Dongjin Kim, Sangmin Lee et al.
Object-Centric Prompt-Driven Vision-Language-Action Model for Robotic Manipulation
Xiaoqi Li, Lingyun Xu, Mingxu Zhang et al.
Object Detection using Event Camera: A MoE Heat Conduction based Detector and A New Benchmark Dataset
Xiao Wang, Yu Jin, Wentao Wu et al.
ObjectMover: Generative Object Movement with Video Prior
Xin Yu, Tianyu Wang, Soo Ye Kim et al.
Object-Shot Enhanced Grounding Network for Egocentric Video
Yisen Feng, Haoyu Zhang, Meng Liu et al.
Occlusion-aware Text-Image-Point Cloud Pretraining for Open-World 3D Object Recognition
Khanh Nguyen, Ghulam Mubashar Hassan, Ajmal Mian
OccMamba: Semantic Occupancy Prediction with State Space Models
Heng Li, Yuenan Hou, Xiaohan Xing et al.
OCRT: Boosting Foundation Models in the Open World with Object-Concept-Relation Triad
Luyao Tang, Chaoqi Chen, Yuxuan Yuan et al.
Octopus: Alleviating Hallucination via Dynamic Contrastive Decoding
Wei Suo, Lijun Zhang, Mengyang Sun et al.
ODA-GAN: Orthogonal Decoupling Alignment GAN Assisted by Weakly-supervised Learning for Virtual Immunohistochemistry Staining
Tong Wang, Mingkang Wang, Zhongze Wang et al.
Odd-One-Out: Anomaly Detection by Comparing with Neighbors
Ankan Kumar Bhunia, Changjian Li, Hakan Bilen
ODE: Open-Set Evaluation of Hallucinations in Multimodal Large Language Models
Yahan Tu, Rui Hu, Jitao Sang
ODHSR: Online Dense 3D Reconstruction of Humans and Scenes from Monocular Videos
Zetong Zhang, Manuel Kaufmann, Lixin Xue et al.
OFER: Occluded Face Expression Reconstruction
Pratheba Selvaraju, Victoria Abrevaya, Timo Bolkart et al.
OffsetOPT: Explicit Surface Reconstruction without Normals
Huan Lei
Olympus: A Universal Task Router for Computer Vision Tasks
Yuanze Lin, Yunsheng Li, Dongdong Chen et al.
Omnia de EgoTempo: Benchmarking Temporal Understanding of Multi-Modal LLMs in Egocentric Videos
Chiara Plizzari, Alessio Tonioni, Yongqin Xian et al.
Omnidirectional Multi-Object Tracking
Kai Luo, Hao Shi, Sheng Wu et al.
OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations
Linke Ouyang, Yuan Qu, Hongbin Zhou et al.
OmniDrive: A Holistic Vision-Language Dataset for Autonomous Driving with Counterfactual Reasoning
Shihao Wang, Zhiding Yu, Xiaohui Jiang et al.
OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows
Shufan Li, Konstantinos Kallidromitis, Akash Gokul et al.
OmniGen: Unified Image Generation
Shitao Xiao, Yueze Wang, Junjie Zhou et al.
OmniGuard: Hybrid Manipulation Localization via Augmented Versatile Deep Image Watermarking
Xuanyu Zhang, Zecheng Tang, Zhipei Xu et al.
Omni-ID: Holistic Identity Representation Designed for Generative Tasks
Guocheng Qian, Kuan-Chieh Wang, Or Patashnik et al.
OmniManip: Towards General Robotic Manipulation via Object-Centric Interaction Primitives as Spatial Constraints
Mingjie Pan, Jiyao Zhang, Tianshu Wu et al.
OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts
Yuxuan Wang, Yueqian Wang, Bo Chen et al.
Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks
Miran Heo, Min-Hung Chen, De-An Huang et al.