Most Cited CVPR "confidence scoring" Papers

5,589 papers found • Page 27 of 28

#5201

SoundVista: Novel-View Ambient Sound Synthesis via Visual-Acoustic Binding

Mingfei Chen, Israel D. Gebru, Ishwarya Ananthabhotla et al.

CVPR 2025 • highlight • arXiv:2504.05576
#5202

Omni-ID: Holistic Identity Representation Designed for Generative Tasks

Guocheng Qian, Kuan-Chieh Wang, Or Patashnik et al.

CVPR 2025 • poster • arXiv:2412.09694
#5203

Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields

Shijie Zhou, Hui Ren, Yijia Weng et al.

CVPR 2025 • poster • arXiv:2503.20776
#5204

Generative Inbetweening through Frame-wise Conditions-Driven Video Generation

Tianyi Zhu, Dongwei Ren, Qilong Wang et al.

CVPR 2025 • poster • arXiv:2412.11755
#5205

Exploring Temporally-Aware Features for Point Tracking

Inès Hyeonsu Kim, Seokju Cho, Gabriel Huang et al.

CVPR 2025 • poster • arXiv:2501.12218
#5206

Style-Editor: Text-driven Object-centric Style Editing

Jihun Park, Jongmin Gim, Kyoungmin Lee et al.

CVPR 2025 • highlight • arXiv:2408.08461
#5207

Locally Orderless Images for Optimization in Differentiable Rendering

Ishit Mehta, Manmohan Chandraker, Ravi Ramamoorthi

CVPR 2025 • highlight • arXiv:2503.21931
#5208

Efficient Event-Based Object Detection: A Hybrid Neural Network with Spatial and Temporal Attention

Soikat Hasan Ahmed, Jan Finkbeiner, Emre Neftci

CVPR 2025 • poster • arXiv:2403.10173
#5209

A Dataset for Semantic Segmentation in the Presence of Unknowns

Zakaria Laskar, Tomas Vojir, Matej Grcic et al.

CVPR 2025 • poster • arXiv:2503.22309
#5210

Light Transport-aware Diffusion Posterior Sampling for Single-View Reconstruction of 3D Volumes

Ludwic Leonard, Nils Thuerey, Rüdiger Westermann

CVPR 2025 • highlight • arXiv:2501.05226
#5211

DFormerv2: Geometry Self-Attention for RGBD Semantic Segmentation

Bo-Wen Yin, Jiao-Long Cao, Ming-Ming Cheng et al.

CVPR 2025 • poster • arXiv:2504.04701
#5212

Recover and Match: Open-Vocabulary Multi-Label Recognition through Knowledge-Constrained Optimal Transport

Hao Tan, Zichang Tan, Jun Li et al.

CVPR 2025 • poster • arXiv:2503.15337
#5213

MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models

Wenyi Hong, Yean Cheng, Zhuoyi Yang et al.

CVPR 2025 • poster • arXiv:2501.02955
#5214

Adaptive Parameter Selection for Tuning Vision-Language Models

Yi Zhang, Yi-Xuan Deng, Meng-Hao Guo et al.

CVPR 2025 • poster
#5215

TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization

Liang Pan, Zeshi Yang, Zhiyang Dou et al.

CVPR 2025 • poster • arXiv:2503.19901
#5216

ImagineFSL: Self-Supervised Pretraining Matters on Imagined Base Set for VLM-based Few-shot Learning

Haoyuan Yang, Xiaoou Li, Jiaming Lv et al.

CVPR 2025 • highlight
#5217

DarkIR: Robust Low-Light Image Restoration

Daniel Feijoo, Juan C. Benito, Alvaro Garcia et al.

CVPR 2025 • poster • arXiv:2412.13443
#5218

PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models

Chenyu Yang, Xuan Dong, Xizhou Zhu et al.

CVPR 2025 • poster • arXiv:2412.09613
#5219

PUP 3D-GS: Principled Uncertainty Pruning for 3D Gaussian Splatting

Alex Hanson, Allen Tu, Vasu Singla et al.

CVPR 2025 • poster • arXiv:2406.10219
#5220

Free Lunch Enhancements for Multi-modal Crowd Counting

Haoliang Meng, Xiaopeng Hong, Zhengqin Lai et al.

CVPR 2025 • poster
#5221

From Sparse Signal to Smooth Motion: Real-Time Motion Generation with Rolling Prediction Models

German Barquero, Nadine Bertsch, Manojkumar Marramreddy et al.

CVPR 2025 • poster • arXiv:2504.05265
#5222

Efficient Personalization of Quantized Diffusion Model without Backpropagation

Hoigi Seo, Wongi Jeong, Kyungryeol Lee et al.

CVPR 2025 • poster • arXiv:2503.14868
#5223

KVQ: Boosting Video Quality Assessment via Saliency-guided Local Perception

Yunpeng Qu, Kun Yuan, Qizhi Xie et al.

CVPR 2025 • poster • arXiv:2503.10259
#5224

Extreme Rotation Estimation in the Wild

Hana Bezalel, Dotan Ankri, Ruojin Cai et al.

CVPR 2025 • poster • arXiv:2411.07096
#5225

PromptHash: Affinity-Prompted Collaborative Cross-Modal Learning for Adaptive Hashing Retrieval

Qiang Zou, Shuli Cheng, Jiayi Chen

CVPR 2025 • poster • arXiv:2503.16064
#5226

Preserving Clusters in Prompt Learning for Unsupervised Domain Adaptation

Long Tung Vuong, Hoang Phan, Vy Vo et al.

CVPR 2025 • poster • arXiv:2506.11493
#5227

EdgeMovingNet: Edge-preserving Point Cloud Reconstruction via Joint Geometry Features

Xinran Yang, Donghao Ji, Yuanqi Li et al.

CVPR 2025 • poster
#5228

InterMimic: Towards Universal Whole-Body Control for Physics-Based Human-Object Interactions

Sirui Xu, Hung Yu Ling, Yu-Xiong Wang et al.

CVPR 2025 • highlight • arXiv:2502.20390
#5229

CoMapGS: Covisibility Map-based Gaussian Splatting for Sparse Novel View Synthesis

Youngkyoon Jang, Eduardo Pérez-Pellitero

CVPR 2025 • poster • arXiv:2503.20998
#5230

SketchFusion: Learning Universal Sketch Features through Fusing Foundation Models

Subhadeep Koley, Tapas Kumar Dutta, Aneeshan Sain et al.

CVPR 2025 • poster • arXiv:2503.14129
#5231

EgoLife: Towards Egocentric Life Assistant

Jingkang Yang, Shuai Liu, Hongming Guo et al.

CVPR 2025 • poster • arXiv:2503.03803
#5232

AutoSSVH: Exploring Automated Frame Sampling for Efficient Self-Supervised Video Hashing

Niu Lian, Jun Li, Jinpeng Wang et al.

CVPR 2025 • poster • arXiv:2504.03587
#5233

PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation

Qiyao Xue, Xiangyu Yin, Boyuan Yang et al.

CVPR 2025 • poster • arXiv:2412.00596
#5234

MARVEL-40M+: Multi-Level Visual Elaboration for High-Fidelity Text-to-3D Content Creation

Sankalp Sinha, Mohammad Sadil Khan, Muhammad Usama et al.

CVPR 2025 • poster • arXiv:2411.17945
#5235

OpenMIBOOD: Open Medical Imaging Benchmarks for Out-Of-Distribution Detection

Max Gutbrod, David Rauber, Danilo Weber Nunes et al.

CVPR 2025 • poster • arXiv:2503.16247
#5236

TAROT: Towards Essentially Domain-Invariant Robustness with Theoretical Justification

Dongyoon Yang, Jihu Lee, Yongdai Kim

CVPR 2025 • poster • arXiv:2505.06580
#5237

Explaining in Diffusion: Explaining a Classifier with Diffusion Semantics

Tahira Kazimi, Ritika Allada, Pinar Yanardag

CVPR 2025 • poster
#5238

AnyMoLe: Any Character Motion In-betweening Leveraging Video Diffusion Models

Kwan Yun, Seokhyeon Hong, Chaelin Kim et al.

CVPR 2025 • poster • arXiv:2503.08417
#5239

Learning with Noisy Triplet Correspondence for Composed Image Retrieval

Shuxian Li, Changhao He, Xiting Liu et al.

CVPR 2025 • poster
#5240

Img-Diff: Contrastive Data Synthesis for Multimodal Large Language Models

Qirui Jiao, Daoyuan Chen, Yilun Huang et al.

CVPR 2025 • poster • arXiv:2408.04594
#5241

When Domain Generalization meets Generalized Category Discovery: An Adaptive Task-Arithmetic Driven Approach

Vaibhav Rathore, Shubhranil B, Saikat Dutta et al.

CVPR 2025 • poster • arXiv:2503.14897
#5242

HOT3D: Hand and Object Tracking in 3D from Egocentric Multi-View Videos

Prithviraj Banerjee, Sindi Shkodrani, Pierre Moulon et al.

CVPR 2025 • highlight • arXiv:2411.19167
#5243

DViN: Dynamic Visual Routing Network for Weakly Supervised Referring Expression Comprehension

Xiaofu Chen, Yaxin Luo, Luo et al.

CVPR 2025 • poster
#5244

Multi-View Pose-Agnostic Change Localization with Zero Labels

Chamuditha Jayanga Galappaththige, Jason Lai, Lloyd Windrim et al.

CVPR 2025 • poster • arXiv:2412.03911
#5245

FinePhys: Fine-grained Human Action Generation by Explicitly Incorporating Physical Laws for Effective Skeletal Guidance

Dian Shao, Mingfei Shi, Shengda Xu et al.

CVPR 2025 • poster • arXiv:2505.13437
#5246

HVI: A New Color Space for Low-light Image Enhancement

Qingsen Yan, Yixu Feng, Cheng Zhang et al.

CVPR 2025 • poster • arXiv:2502.20272
#5247

LMO: Linear Mamba Operator for MRI Reconstruction

Wei Li, Jiawei Jiang, Jie Wu et al.

CVPR 2025 • poster
#5248

Curriculum Coarse-to-Fine Selection for High-IPC Dataset Distillation

Yanda Chen, Gongwei Chen, Miao Zhang et al.

CVPR 2025 • poster • arXiv:2503.18872
#5249

Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level

Andong Deng, Tongjia Chen, Shoubin Yu et al.

CVPR 2025 • poster • arXiv:2411.09921
#5250

CountLLM: Towards Generalizable Repetitive Action Counting via Large Language Model

Ziyu Yao, Xuxin Cheng, Zhiqi Huang et al.

CVPR 2025 • poster • arXiv:2503.17690
#5251

Low-Rank Adaptation in Multilinear Operator Networks for Security-Preserving Incremental Learning

Huu Binh Ta, Duc Nguyen, Quyen Tran et al.

CVPR 2025 • poster
#5252

T-FAKE: Synthesizing Thermal Images for Facial Landmarking

Philipp Flotho, Moritz Piening, Anna Kukleva et al.

CVPR 2025 • poster • arXiv:2408.15127
#5253

A Theory of Learning Unified Model via Knowledge Integration from Label Space Varying Domains

Dexuan Zhang, Thomas Westfechtel, Tatsuya Harada

CVPR 2025 • poster
#5254

Focal Split: Untethered Snapshot Depth from Differential Defocus

Junjie Luo, John Mamish, Alan Fu et al.

CVPR 2025 • poster • arXiv:2504.11202
#5255

Generative Hard Example Augmentation for Semantic Point Cloud Segmentation

Qi Zhang, Jibin Peng, Zhao Huang et al.

CVPR 2025 • poster
#5256

Localized Concept Erasure for Text-to-Image Diffusion Models Using Training-Free Gated Low-Rank Adaptation

Byung Hyun Lee, Sungjin Lim, Se Young Chun

CVPR 2025 • poster • arXiv:2503.12356
#5257

Continuous Space-Time Video Resampling with Invertible Motion Steganography

Yuantong Zhang, Zhenzhong Chen

CVPR 2025 • poster
#5258

Geometry-guided Online 3D Video Synthesis with Multi-View Temporal Consistency

Hyunho Ha, Lei Xiao, Christian Richardt et al.

CVPR 2025 • poster • arXiv:2505.18932
#5259

Everything to the Synthetic: Diffusion-driven Test-time Adaptation via Synthetic-Domain Alignment

Jiayi Guo, Zhao Junhao, Chaoqun Du et al.

CVPR 2025 • poster • arXiv:2406.04295
#5260

Free on the Fly: Enhancing Flexibility in Test-Time Adaptation with Online EM

Qiyuan Dai, Sibei Yang

CVPR 2025 • poster • arXiv:2507.06973
#5261

OralXrays-9: Towards Hospital-Scale Panoramic X-ray Anomaly Detection via Personalized Multi-Object Query-Aware Mining

Bingzhi Chen, Sisi Fu, Xiaocheng Fang et al.

CVPR 2025 • oral
#5262

Vid2Sim: Generalizable, Video-based Reconstruction of Appearance, Geometry and Physics for Mesh-free Simulation

Chuhao Chen, Zhiyang Dou, Chen Wang et al.

CVPR 2025 • poster • arXiv:2506.06440
#5263

Event Ellipsometer: Event-based Mueller-Matrix Video Imaging

Ryota Maeda, Yunseong Moon, Seung-Hwan Baek

CVPR 2025 • highlight • arXiv:2411.17313
#5264

WeatherGen: A Unified Diverse Weather Generator for LiDAR Point Clouds via Spider Mamba Diffusion

Yang Wu, Yun Zhu, Kaihua Zhang et al.

CVPR 2025 • poster • arXiv:2504.13561
#5265

SeCap: Self-Calibrating and Adaptive Prompts for Cross-view Person Re-Identification in Aerial-Ground Networks

Shining Wang, Yunlong Wang, Ruiqi Wu et al.

CVPR 2025 • highlight • arXiv:2503.06965
#5266

V2V3D: View-to-View Denoised 3D Reconstruction for Light Field Microscopy

Jiayin Zhao, Zhenqi Fu, Tao Yu et al.

CVPR 2025 • poster • arXiv:2504.07853
#5267

A Unified Framework for Heterogeneous Semi-supervised Learning

Marzi Heidari, Abdullah Alchihabi, Hao Yan et al.

CVPR 2025 • poster • arXiv:2503.00286
#5268

SLADE: Shielding against Dual Exploits in Large Vision-Language Models

Md Zarif Hossain, Ahmed Imteaj

CVPR 2025 • poster
#5269

MODA: Motion-Drift Augmentation for Inertial Human Motion Analysis

Yinghao Wu, Shihui Guo, Yipeng Qin

CVPR 2025 • poster
#5270

Learning to Filter Outlier Edges in Global SfM

Nicole Damblon, Marc Pollefeys, Daniel Barath

CVPR 2025 • highlight
#5271

Improving the Training of Data-Efficient GANs via Quality Aware Dynamic Discriminator Rejection Sampling

Zhaoyu Zhang, Yang Hua, Guanxiong Sun et al.

CVPR 2025 • poster
#5272

NLPrompt: Noise-Label Prompt Learning for Vision-Language Models

Bikang Pan, Qun Li, Xiaoying Tang et al.

CVPR 2025 • highlight • arXiv:2412.01256
#5273

No Pains, More Gains: Recycling Sub-Salient Patches for Efficient High-Resolution Image Recognition

Rong Qin, Xin Liu, Xingyu Liu et al.

CVPR 2025 • highlight
#5274

Where's the Liability in the Generative Era? Recovery-based Black-Box Detection of AI-Generated Content

Haoyue Bai, Yiyou Sun, Wei Cheng et al.

CVPR 2025 • poster • arXiv:2505.01008
#5275

Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Video Content

Qiuheng Wang, Yukai Shi, Jiarong Ou et al.

CVPR 2025 • poster • arXiv:2410.08260
#5276

VASparse: Towards Efficient Visual Hallucination Mitigation via Visual-Aware Token Sparsification

Xianwei Zhuang, Zhihong Zhu, Yuxin Xie et al.

CVPR 2025 • poster • arXiv:2501.06553
#5277

Dual-Interrelated Diffusion Model for Few-Shot Anomaly Image Generation

Ying Jin, Jinlong Peng, Qingdong He et al.

CVPR 2025 • poster • arXiv:2408.13509
#5278

CoMatcher: Multi-View Collaborative Feature Matching

Jintao Zhang, Zimin Xia, Mingyue Dong et al.

CVPR 2025 • poster • arXiv:2504.01872
#5279

PillarHist: A Quantization-aware Pillar Feature Encoder based on Height-aware Histogram

Sifan Zhou, Zhihang Yuan, Dawei Yang et al.

CVPR 2025 • poster
#5280

Towards Explicit Geometry-Reflectance Collaboration for Generalized LiDAR Segmentation in Adverse Weather

Longyu Yang, Ping Hu, Shangbo Yuan et al.

CVPR 2025 • poster • arXiv:2506.02396
#5281

Generalizable Object Keypoint Localization from Generative Priors

Dongkai Wang, Jiang Duan, Liangjian Wen et al.

CVPR 2025 • poster
#5282

Chebyshev Attention Depth Permutation Texture Network with Latent Texture Attribute Loss

Ravishankar Evani, Deepu Rajan, Shangbo Mao

CVPR 2025 • poster
#5283

Shining Yourself: High-Fidelity Ornaments Virtual Try-on with Diffusion Model

Yingmao Miao, Zhanpeng Huang, Rui Han et al.

CVPR 2025 • poster • arXiv:2503.16065
#5284

DreamCache: Finetuning-Free Lightweight Personalized Image Generation via Feature Caching

Emanuele Aiello, Umberto Michieli, Diego Valsesia et al.

CVPR 2025 • poster • arXiv:2411.17786
#5285

Linguistics-aware Masked Image Modeling for Self-supervised Scene Text Recognition

Yifei Zhang, Chang Liu, Jin Wei et al.

CVPR 2025 • poster • arXiv:2503.18746
#5286

EIDT-V: Exploiting Intersections in Diffusion Trajectories for Model-Agnostic, Zero-Shot, Training-Free Text-to-Video Generation

Diljeet Jagpal, Xi Chen, Vinay P. Namboodiri

CVPR 2025 • poster • arXiv:2504.06861
#5287

ReWind: Understanding Long Videos with Instructed Learnable Memory

Anxhelo Diko, Tinghuai Wang, Wassim Swaileh et al.

CVPR 2025 • poster • arXiv:2411.15556
#5288

ABBSPO: Adaptive Bounding Box Scaling and Symmetric Prior based Orientation Prediction for Detecting Aerial Image Objects

Woojin Lee, Hyugjae Chang, Jaeho Moon et al.

CVPR 2025 • poster • arXiv:2512.10031
#5289

Semantic-guided Cross-Modal Prompt Learning for Skeleton-based Zero-shot Action Recognition

Anqi Zhu, Jingmin Zhu, James Bailey et al.

CVPR 2025 • poster
#5290

VidHalluc: Evaluating Temporal Hallucinations in Multimodal Large Language Models for Video Understanding

Chaoyu Li, Eun Woo Im, Pooyan Fazli

CVPR 2025 • poster • arXiv:2412.03735
#5291

All-directional Disparity Estimation for Real-world QPD Images

Hongtao Yu, Shaohui Song, Lihu Sun et al.

CVPR 2025 • highlight
#5292

COBRA: COmBinatorial Retrieval Augmentation for Few-Shot Adaptation

Arnav Mohanty Das, Gantavya Bhatt, Lilly Kumari et al.

CVPR 2025 • poster • arXiv:2412.17684
#5293

MFogHub: Bridging Multi-Regional and Multi-Satellite Data for Global Marine Fog Detection and Forecasting

Mengqiu Xu, Kaixin Chen, Heng Guo et al.

CVPR 2025 • poster • arXiv:2505.10281
#5294

Dual Consolidation for Pre-Trained Model-Based Domain-Incremental Learning

Da-Wei Zhou, Zi-Wen Cai, Han-Jia Ye et al.

CVPR 2025 • poster • arXiv:2410.00911
#5295

Efficient Test-time Adaptive Object Detection via Sensitivity-Guided Pruning

Kunyu Wang, Xueyang Fu, Xin Lu et al.

CVPR 2025 • poster • arXiv:2506.02462
#5296

Alignment, Mining and Fusion: Representation Alignment with Hard Negative Mining and Selective Knowledge Fusion for Medical Visual Question Answering

Yuanhao Zou, Zhaozheng Yin

CVPR 2025 • poster • arXiv:2510.08791
#5297

Q-Bench-Video: Benchmark the Video Quality Understanding of LMMs

Zicheng Zhang, Ziheng Jia, Haoning Wu et al.

CVPR 2025 • poster • arXiv:2409.20063
#5298

MAD: Memory-Augmented Detection of 3D Objects

Ben Agro, Sergio Casas, Patrick Wang et al.

CVPR 2025 • poster
#5299

Training-free Neural Architecture Search through Variance of Knowledge of Deep Network Weights

Ondrej Tybl, Lukas Neumann

CVPR 2025 • poster
#5300

RAEncoder: A Label-Free Reversible Adversarial Examples Encoder for Dataset Intellectual Property Protection

Fan Xing, Zhuo Tian, Xuefeng Fan et al.

CVPR 2025 • poster
#5301

Towards Fine-Grained Interpretability: Counterfactual Explanations for Misclassification with Saliency Partition

Zhang Lintong, Kang Yin, Seong-Whan Lee

CVPR 2025 • poster • arXiv:2511.07974
#5302

Not Just Text: Uncovering Vision Modality Typographic Threats in Image Generation Models

Hao Cheng, Erjia Xiao, Jiayan Yang et al.

CVPR 2025 • poster • arXiv:2412.05538
#5303

Mamba-Reg: Vision Mamba Also Needs Registers

Feng Wang, Jiahao Wang, Sucheng Ren et al.

CVPR 2025 • poster
#5304

OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts

Yuxuan Wang, Yueqian Wang, Bo Chen et al.

CVPR 2025 • poster • arXiv:2503.22952
#5305

Imputation-free and Alignment-free: Incomplete Multi-view Clustering Driven by Consensus Semantic Learning

Yuzhuo Dai, Jiaqi Jin, Zhibin Dong et al.

CVPR 2025 • poster • arXiv:2505.11182
#5306

Autoregressive Sequential Pretraining for Visual Tracking

Shiyi Liang, Yifan Bai, Yihong Gong et al.

CVPR 2025 • poster
#5307

Number it: Temporal Grounding Videos like Flipping Manga

Yongliang Wu, Xinting Hu, Yuyang Sun et al.

CVPR 2025 • poster • arXiv:2411.10332
#5308

Uncertainty Meets Diversity: A Comprehensive Active Learning Framework for Indoor 3D Object Detection

Jiangyi Wang, Na Zhao

CVPR 2025 • poster • arXiv:2503.16125
#5309

SeriesBench: A Benchmark for Narrative-Driven Drama Series Understanding

Chenkai Zhang, Yiming Lei, Zeming Liu et al.

CVPR 2025 • poster • arXiv:2504.21435
#5310

GS-2DGS: Geometrically Supervised 2DGS for Reflective Object Reconstruction

Jinguang Tong, Xuesong Li, Fahira Afzal Maken et al.

CVPR 2025 • poster • arXiv:2506.13110
#5311

PanSplat: 4K Panorama Synthesis with Feed-Forward Gaussian Splatting

Cheng Zhang, Haofei Xu, Qianyi Wu et al.

CVPR 2025 • poster • arXiv:2412.12096
#5312

LEDiff: Latent Exposure Diffusion for HDR Generation

Chao Wang, Zhihao Xia, Thomas Leimkuehler et al.

CVPR 2025 • poster • arXiv:2412.14456
#5313

FloVD: Optical Flow Meets Video Diffusion Model for Enhanced Camera-Controlled Video Synthesis

Wonjoon Jin, Qi Dai, Chong Luo et al.

CVPR 2025 • poster • arXiv:2502.08244
#5314

NVILA: Efficient Frontier Visual Language Models

Zhijian Liu, Ligeng Zhu, Baifeng Shi et al.

CVPR 2025 • poster • arXiv:2412.04468
#5315

Fuzzy Multimodal Learning for Trusted Cross-modal Retrieval

Siyuan Duan, Yuan Sun, Dezhong Peng et al.

CVPR 2025 • poster
#5316

AniGS: Animatable Gaussian Avatar from a Single Image with Inconsistent Gaussian Reconstruction

Lingteng Qiu, Shenhao Zhu, Qi Zuo et al.

CVPR 2025 • poster • arXiv:2412.02684
#5317

Seeing More with Less: Human-like Representations in Vision Models

Andrey Gizdov, Shimon Ullman, Daniel Harari

CVPR 2025 • highlight
#5318

Disentangling Safe and Unsafe Image Corruptions via Anisotropy and Locality

Ramchandran Muthukumar, Ambar Pal, Jeremias Sulam et al.

CVPR 2025 • poster
#5319

ViCaS: A Dataset for Combining Holistic and Pixel-level Video Understanding using Captions with Grounded Segmentation

Ali Athar, Xueqing Deng, Liang-Chieh Chen

CVPR 2025 • poster • arXiv:2412.09754
#5320

IRIS: Inverse Rendering of Indoor Scenes from Low Dynamic Range Images

Chih-Hao Lin, Jia-Bin Huang, Zhengqin Li et al.

CVPR 2025 • poster • arXiv:2401.12977
#5321

Can Machines Understand Composition? Dataset and Benchmark for Photographic Image Composition Embedding and Understanding

Zhaoran Zhao, Peng Lu, Anran Zhang et al.

CVPR 2025 • highlight
#5322

Dense-SfM: Structure from Motion with Dense Consistent Matching

JongMin Lee, Sungjoo Yoo

CVPR 2025 • poster • arXiv:2501.14277
#5323

Let's Chorus: Partner-aware Hybrid Song-Driven 3D Head Animation

Xiumei Xie, Zikai Huang, Wenhao Xu et al.

CVPR 2025 • poster
#5324

SPARS3R: Semantic Prior Alignment and Regularization for Sparse 3D Reconstruction

Yutao Tang, Yuxiang Guo, Deming Li et al.

CVPR 2025 • poster • arXiv:2411.12592
#5325

Factored-NeuS: Reconstructing Surfaces, Illumination, and Materials of Possibly Glossy Objects

Yue Fan, Ningjing Fan, Ivan Skorokhodov et al.

CVPR 2025 • poster • arXiv:2305.17929
#5326

TransPixeler: Advancing Text-to-Video Generation with Transparency

Luozhou Wang, Yijun Li, ZhiFei Chen et al.

CVPR 2025 • poster • arXiv:2501.03006
#5327

FlashGS: Efficient 3D Gaussian Splatting for Large-scale and High-resolution Rendering

Guofeng Feng, Siyan Chen, Rong Fu et al.

CVPR 2025 • poster • arXiv:2408.07967
#5328

Variance-Based Membership Inference Attacks Against Large-Scale Image Captioning Models

Daniel Samira, Edan Habler, Yuval Elovici et al.

CVPR 2025 • poster
#5329

VLog: Video-Language Models by Generative Retrieval of Narration Vocabulary

Kevin Qinghong Lin, Mike Zheng Shou

CVPR 2025 • poster • arXiv:2503.09402
#5330

ERUPT: Efficient Rendering with Unposed Patch Transformer

Maxim Shugaev, Vincent Chen, Maxim Karrenbach et al.

CVPR 2025 • poster • arXiv:2503.24374
#5331

Improved Monocular Depth Prediction Using Distance Transform Over Pre-semantic Contours with Self-supervised Neural Networks

Marwane Hariat, Antoine Manzanera, David Filliat

CVPR 2025 • poster
#5332

CompGS: Unleashing 2D Compositionality for Compositional Text-to-3D via Dynamically Optimizing 3D Gaussians

Chongjian Ge, Chenfeng Xu, Yuanfeng Ji et al.

CVPR 2025 • poster • arXiv:2410.20723
#5333

FIRE: Robust Detection of Diffusion-Generated Images via Frequency-Guided Reconstruction Error

Beilin Chu, Xuan Xu, Xin Wang et al.

CVPR 2025 • poster • arXiv:2412.07140
#5334

BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature

Alejandro Lozano, Min Woo Sun, James Burgess et al.

CVPR 2025 • poster • arXiv:2501.07171
#5335

DPC: Dual-Prompt Collaboration for Tuning Vision-Language Models

Haoyang Li, Liang Wang, Chao Wang et al.

CVPR 2025 • poster • arXiv:2503.13443
#5336

Taxonomy-Aware Evaluation of Vision-Language Models

Vésteinn Snæbjarnarson, Kevin Du, Niklas Stoehr et al.

CVPR 2025 • poster • arXiv:2504.05457
#5337

TAPT: Test-Time Adversarial Prompt Tuning for Robust Inference in Vision-Language Models

Xin Wang, Kai Chen, Jiaming Zhang et al.

CVPR 2025 • poster • arXiv:2411.13136
#5338

Towards Practical Real-Time Neural Video Compression

Zhaoyang Jia, Bin Li, Jiahao Li et al.

CVPR 2025 • poster • arXiv:2502.20762
#5339

CDI: Copyrighted Data Identification in Diffusion Models

Jan Dubiński, Antoni Kowalczuk, Franziska Boenisch et al.

CVPR 2025 • poster • arXiv:2411.12858
#5340

Binarized Neural Network for Multi-spectral Image Fusion

Junming Hou, Xiaoyu Chen, Ran Ran et al.

CVPR 2025 • poster
#5341

GaussianIP: Identity-Preserving Realistic 3D Human Generation via Human-Centric Diffusion Prior

Zichen Tang, Yuan Yao, Miaomiao Cui et al.

CVPR 2025 • poster • arXiv:2503.11143
#5342

Sim-to-Real Causal Transfer: A Metric Learning Approach to Causally-Aware Interaction Representations

Ahmad Rahimi, Po-Chien Luan, Yuejiang Liu et al.

CVPR 2025 • poster • arXiv:2312.04540
#5343

Holmes-VAU: Towards Long-term Video Anomaly Understanding at Any Granularity

Huaxin Zhang, Xiaohao Xu, Xiang Wang et al.

CVPR 2025 • highlight • arXiv:2412.06171
#5344

MOS-Attack: A Scalable Multi-objective Adversarial Attack Framework

Ping Guo, Cheng Gong, Fei Liu et al.

CVPR 2025 • poster • arXiv:2501.07251
#5345

Weakly Supervised Semantic Segmentation via Progressive Confidence Region Expansion

Xiangfeng Xu, Pinyi Zhang, Wenxuan Huang et al.

CVPR 2025 • poster
#5346

Disentangled Pose and Appearance Guidance for Multi-Pose Generation

Tengfei Xiao, Yue Wu, Yuelong Li et al.

CVPR 2025 • poster
#5347

ByTheWay: Boost Your Text-to-Video Generation Model to Higher Quality in a Training-free Way

Jiazi Bu, Pengyang Ling, Pan Zhang et al.

CVPR 2025 • poster • arXiv:2410.06241
#5348

Learning Conditional Space-Time Prompt Distributions for Video Class-Incremental Learning

Xiaohan Zou, Wenchao Ma, Shu Zhao

CVPR 2025 • highlight
#5349

Convex Combination Star Shape Prior for Data-driven Image Semantic Segmentation

Xinyu Zhao, Jun Xie, Shengzhe Chen et al.

CVPR 2025 • poster
#5350

Rethinking Personalized Aesthetics Assessment: Employing Physique Aesthetics Assessment as An Exemplification

Haobin Zhong, Shuai He, Anlong Ming et al.

CVPR 2025 • highlight
#5351

Domain Adaptive Diabetic Retinopathy Grading with Model Absence and Flowing Data

Wenxin Su, Song Tang, Xiaofeng Liu et al.

CVPR 2025 • poster • arXiv:2412.01203
#5352

SCFlow2: Plug-and-Play Object Pose Refiner with Shape-Constraint Scene Flow

Qingyuan Wang, Rui Song, Jiaojiao Li et al.

CVPR 2025 • poster • arXiv:2504.09160
#5353

GoLF-NRT: Integrating Global Context and Local Geometry for Few-Shot View Synthesis

You Wang, Li Fang, Hao Zhu et al.

CVPR 2025 • poster • arXiv:2505.19813
#5354

SpatialLLM: A Compound 3D-Informed Design towards Spatially-Intelligent Large Multimodal Models

Wufei Ma, Luoxin Ye, Nessa McWeeney et al.

CVPR 2025 • highlight • arXiv:2505.00788
#5355

Semi-Supervised State-Space Model with Dynamic Stacking Filter for Real-World Video Deraining

Shangquan Sun, Wenqi Ren, Juxiang Zhou et al.

CVPR 2025 • poster • arXiv:2505.16811
#5356

EntropyMark: Towards More Harmless Backdoor Watermark via Entropy-based Constraint for Open-source Dataset Copyright Protection

Ming Sun, Rui Wang, Zixuan Zhu et al.

CVPR 2025 • poster
#5357

Rethinking the Adversarial Robustness of Multi-Exit Neural Networks in an Attack-Defense Game

Keyizhi Xu, Chi Zhang, Zhan Chen et al.

CVPR 2025 • poster
#5358

DrivingSphere: Building a High-fidelity 4D World for Closed-loop Simulation

Tianyi Yan, Dongming Wu, Wencheng Han et al.

CVPR 2025 • poster • arXiv:2411.11252
#5359

MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders

Jiajun Cao, Yuan Zhang, Tao Huang et al.

CVPR 2025 • poster • arXiv:2501.01709
#5360

EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance

Yang Yue, Yulin Wang, Haojun Jiang et al.

CVPR 2025 • poster • arXiv:2504.13065
#5361

Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass

Jianing "Jed" Yang, Alexander Sax, Kevin Liang et al.

CVPR 2025 • poster • arXiv:2501.13928
#5362

Pursuing Temporal-Consistent Video Virtual Try-On via Dynamic Pose Interaction

Dong Li, Wenqi Zhong, Wei Yu et al.

CVPR 2025 • poster • arXiv:2505.16980
#5363

A Unified Image-Dense Annotation Generation Model for Underwater Scenes

Hongkai Lin, Dingkang Liang, Zhenghao Qi et al.

CVPR 2025 • poster • arXiv:2503.21771
#5364

3D-SLNR: A Super Lightweight Neural Representation for Large-scale 3D Mapping

Chenhui Shi, Fulin Tang, Ning An et al.

CVPR 2025 • poster
#5365

STINR: Deciphering Spatial Transcriptomics via Implicit Neural Representation

Yisi Luo, Xile Zhao, Kai Ye et al.

CVPR 2025 • poster
#5366

Multi-Modal Contrastive Masked Autoencoders: A Two-Stage Progressive Pre-training Approach for RGBD Datasets

Muhammad Abdullah Jamal, Omid Mohareri

CVPR 2025 • poster
#5367

Font-Agent: Enhancing Font Understanding with Large Language Models

Yingxin Lai, Cuijie Xu, Haitian Shi et al.

CVPR 2025 • poster
#5368

RADIOv2.5: Improved Baselines for Agglomerative Vision Foundation Models

Greg Heinrich, Mike Ranzinger, Danny Yin et al.

CVPR 2025 • poster • arXiv:2412.07679
#5369

Stabilizing and Accelerating Autofocus with Expert Trajectory Regularized Deep Reinforcement Learning

Shouhang Zhu, Chenglin Li, Yuankun Jiang et al.

CVPR 2025 • poster
#5370

GeoMM: On Geodesic Perspective for Multi-modal Learning

Shibin Mei, Hang Wang, Bingbing Ni

CVPR 2025 • poster • arXiv:2505.11216
#5371

Breaking the Memory Barrier of Contrastive Loss via Tile-Based Strategy

Zesen Cheng, Hang Zhang, Kehan Li et al.

CVPR 2025 • highlight
#5372

GeoAvatar: Geometrically-Consistent Multi-Person Avatar Reconstruction from Sparse Multi-View Videos

Soohyun Lee, SeoYeon Kim, HeeKyung Lee et al.

CVPR 2025 • poster
#5373

Cross-Modal Distillation for 2D/3D Multi-Object Discovery from 2D Motion

Saad Lahlali, Sandra Kara, Hejer Ammar et al.

CVPR 2025 • poster • arXiv:2503.15022
#5374

MASH-VLM: Mitigating Action-Scene Hallucination in Video-LLMs through Disentangled Spatial-Temporal Representations

Kyungho Bae, Jinhyung Kim, Sihaeng Lee et al.

CVPR 2025 • highlight • arXiv:2503.15871
#5375

GPVK-VL: Geometry-Preserving Virtual Keyframes for Visual Localization under Large Viewpoint Changes

Yunxuan Li, Lei Fan, Xiaoying Xing et al.

CVPR 2025 • poster
#5376

Be More Specific: Evaluating Object-centric Realism in Synthetic Images

Anqi Liang, Ciprian Adrian Corneanu, Qianli Feng et al.

CVPR 2025 • poster
#5377

Bootstrap Your Own Views: Masked Ego-Exo Modeling for Fine-grained View-invariant Video Representations

Jungin Park, Jiyoung Lee, Kwanghoon Sohn

CVPR 2025 • poster • arXiv:2503.19706
#5378

CoSDH: Communication-Efficient Collaborative Perception via Supply-Demand Awareness and Intermediate-Late Hybridization

Junhao Xu, Yanan Zhang, Zhi Cai et al.

CVPR 2025 • poster • arXiv:2503.03430
#5379

Hierarchical Knowledge Prompt Tuning for Multi-task Test-Time Adaptation

Qiang Zhang, Mengsheng Zhao, Jiawei Liu et al.

CVPR 2025 • poster
#5380

DPSeg: Dual-Prompt Cost Volume Learning for Open-Vocabulary Semantic Segmentation

Ziyu Zhao, Xiaoguang Li, Lingjia Shi et al.

CVPR 2025 • poster • arXiv:2505.11676
#5381

Hazy Low-Quality Satellite Video Restoration Via Learning Optimal Joint Degradation Patterns and Continuous-Scale Super-Resolution Reconstruction

Ning Ni, Libao Zhang

CVPR 2025 • poster
#5382

Visual Prompting for One-shot Controllable Video Editing without Inversion

Zhengbo Zhang, Yuxi Zhou, Duo Peng et al.

CVPR 2025 • poster • arXiv:2504.14335
#5383

Segment Any Motion in Videos

Nan Huang, Wenzhao Zheng, Chenfeng Xu et al.

CVPR 2025 • poster • arXiv:2503.22268
#5384

Rethinking Few-Shot Adaptation of Vision-Language Models in Two Stages

Matteo Farina, Massimiliano Mancini, Giovanni Iacca et al.

CVPR 2025 • poster • arXiv:2503.11609
#5385

TAGA: Self-supervised Learning for Template-free Animatable Gaussian Articulated Model

Zhichao Zhai, Guikun Chen, Wenguan Wang et al.

CVPR 2025 • poster
#5386

MambaVO: Deep Visual Odometry Based on Sequential Matching Refinement and Training Smoothing

Shuo Wang, Wanting Li, Yongcai Wang et al.

CVPR 2025 • poster • arXiv:2412.20082
#5387

MoManipVLA: Transferring Vision-language-action Models for General Mobile Manipulation

Zhenyu Wu, Yuheng Zhou, Xiuwei Xu et al.

CVPR 2025 • poster • arXiv:2503.13446
#5388

Beyond Single-Modal Boundary: Cross-Modal Anomaly Detection through Visual Prototype and Harmonization

Kai Mao, Ping Wei, Yiyang Lian et al.

CVPR 2025 • poster
#5389

Augmenting Perceptual Super-Resolution via Image Quality Predictors

Fengjia Zhang, Samrudhdhi Rangrej, Tristan T Aumentado-Armstrong et al.

CVPR 2025 • poster • arXiv:2504.18524
#5390

Perception Tokens Enhance Visual Reasoning in Multimodal Language Models

Mahtab Bigverdi, Zelun Luo, Cheng-Yu Hsieh et al.

CVPR 2025 • poster • arXiv:2412.03548
#5391

ViKIENet: Towards Efficient 3D Object Detection with Virtual Key Instance Enhanced Network

Zhuochen Yu, Bijie Qiu, Andy W. H. Khong

CVPR 2025 • poster
#5392

From Head to Tail: Efficient Black-box Model Inversion Attack via Long-tailed Learning

Ziang Li, Hongguang Zhang, Juan Wang et al.

CVPR 2025 • poster • arXiv:2503.16266
#5393

Proximal Algorithm Unrolling: Flexible and Efficient Reconstruction Networks for Single-Pixel Imaging

Ping Wang, Lishun Wang, Gang Qu et al.

CVPR 2025 • poster • arXiv:2505.23180
#5394

Compositional Targeted Multi-Label Universal Perturbations

Hassan Mahmood, Ehsan Elhamifar

CVPR 2025 • poster
#5395

CGMatch: A Different Perspective of Semi-supervised Learning

Bo Cheng, Jueqing Lu, Yuan Tian et al.

CVPR 2025 • poster • arXiv:2503.02231
#5396

Leveraging Temporal Cues for Semi-Supervised Multi-View 3D Object Detection

Jinhyung Park, Navyata Sanghvi, Hiroki Adachi et al.

CVPR 2025 • poster
#5397

DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation

Jianzong Wu, Chao Tang, Jingbo Wang et al.

CVPR 2025 • poster • arXiv:2412.07589
#5398

CocoER: Aligning Multi-Level Feature by Competition and Coordination for Emotion Recognition

Xuli Shen, Hua Cai, Weilin Shen et al.

CVPR 2025 • poster
#5399

Dynamic Motion Blending for Versatile Motion Editing

Nan Jiang, Hongjie Li, Ziye Yuan et al.

CVPR 2025 • poster • arXiv:2503.20724
#5400

A Unified Approach to Interpreting Self-supervised Pre-training Methods for 3D Point Clouds via Interactions

Qiang Li, Jian Ruan, Fanghao Wu et al.

CVPR 2025 • highlight