Most Cited 2025 "experiment design" Papers

22,274 papers found • Page 97 of 112

#19201

Free Lunch Enhancements for Multi-modal Crowd Counting

Haoliang Meng, Xiaopeng Hong, Zhengqin Lai et al.

CVPR 2025 • poster
#19202

From Sparse Signal to Smooth Motion: Real-Time Motion Generation with Rolling Prediction Models

German Barquero, Nadine Bertsch, Manojkumar Marramreddy et al.

CVPR 2025 • poster • arXiv:2504.05265
#19203

Efficient Personalization of Quantized Diffusion Model without Backpropagation

Hoigi Seo, Wongi Jeong, Kyungryeol Lee et al.

CVPR 2025 • poster • arXiv:2503.14868
#19204

KVQ: Boosting Video Quality Assessment via Saliency-guided Local Perception

Yunpeng Qu, Kun Yuan, Qizhi Xie et al.

CVPR 2025 • poster • arXiv:2503.10259
#19205

Extreme Rotation Estimation in the Wild

Hana Bezalel, Dotan Ankri, Ruojin Cai et al.

CVPR 2025 • poster • arXiv:2411.07096
#19206

PromptHash: Affinity-Prompted Collaborative Cross-Modal Learning for Adaptive Hashing Retrieval

Qiang Zou, Shuli Cheng, Jiayi Chen

CVPR 2025 • poster • arXiv:2503.16064
#19207

Preserving Clusters in Prompt Learning for Unsupervised Domain Adaptation

Long Tung Vuong, Hoang Phan, Vy Vo et al.

CVPR 2025 • poster • arXiv:2506.11493
#19208

EdgeMovingNet: Edge-preserving Point Cloud Reconstruction via Joint Geometry Features

Xinran Yang, Donghao Ji, Yuanqi Li et al.

CVPR 2025 • poster
#19209

InterMimic: Towards Universal Whole-Body Control for Physics-Based Human-Object Interactions

Sirui Xu, Hung Yu Ling, Yu-Xiong Wang et al.

CVPR 2025 • highlight • arXiv:2502.20390
#19210

CoMapGS: Covisibility Map-based Gaussian Splatting for Sparse Novel View Synthesis

Youngkyoon Jang, Eduardo Pérez-Pellitero

CVPR 2025 • poster • arXiv:2503.20998
#19211

SketchFusion: Learning Universal Sketch Features through Fusing Foundation Models

Subhadeep Koley, Tapas Kumar Dutta, Aneeshan Sain et al.

CVPR 2025 • poster • arXiv:2503.14129
#19212

EgoLife: Towards Egocentric Life Assistant

Jingkang Yang, Shuai Liu, Hongming Guo et al.

CVPR 2025 • poster • arXiv:2503.03803
#19213

AutoSSVH: Exploring Automated Frame Sampling for Efficient Self-Supervised Video Hashing

Niu Lian, Jun Li, Jinpeng Wang et al.

CVPR 2025 • poster • arXiv:2504.03587
#19214

PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation

Qiyao Xue, Xiangyu Yin, Boyuan Yang et al.

CVPR 2025 • poster • arXiv:2412.00596
#19215

MARVEL-40M+: Multi-Level Visual Elaboration for High-Fidelity Text-to-3D Content Creation

Sankalp Sinha, Mohammad Sadil Khan, Muhammad Usama et al.

CVPR 2025 • poster • arXiv:2411.17945
#19216

OpenMIBOOD: Open Medical Imaging Benchmarks for Out-Of-Distribution Detection

Max Gutbrod, David Rauber, Danilo Weber Nunes et al.

CVPR 2025 • poster • arXiv:2503.16247
#19217

TAROT: Towards Essentially Domain-Invariant Robustness with Theoretical Justification

Dongyoon Yang, Jihu Lee, Yongdai Kim

CVPR 2025 • poster • arXiv:2505.06580
#19218

Explaining in Diffusion: Explaining a Classifier with Diffusion Semantics

Tahira Kazimi, Ritika Allada, Pinar Yanardag

CVPR 2025 • poster
#19219

AnyMoLe: Any Character Motion In-betweening Leveraging Video Diffusion Models

Kwan Yun, Seokhyeon Hong, Chaelin Kim et al.

CVPR 2025 • poster • arXiv:2503.08417
#19220

Learning with Noisy Triplet Correspondence for Composed Image Retrieval

Shuxian Li, Changhao He, Xiting Liu et al.

CVPR 2025 • poster
#19221

Img-Diff: Contrastive Data Synthesis for Multimodal Large Language Models

Qirui Jiao, Daoyuan Chen, Yilun Huang et al.

CVPR 2025 • poster • arXiv:2408.04594
#19222

When Domain Generalization meets Generalized Category Discovery: An Adaptive Task-Arithmetic Driven Approach

Vaibhav Rathore, Shubhranil B, Saikat Dutta et al.

CVPR 2025 • poster • arXiv:2503.14897
#19223

HOT3D: Hand and Object Tracking in 3D from Egocentric Multi-View Videos

Prithviraj Banerjee, Sindi Shkodrani, Pierre Moulon et al.

CVPR 2025 • highlight • arXiv:2411.19167
#19224

DViN: Dynamic Visual Routing Network for Weakly Supervised Referring Expression Comprehension

Xiaofu Chen, Yaxin Luo, Luo et al.

CVPR 2025 • poster
#19225

Multi-View Pose-Agnostic Change Localization with Zero Labels

Chamuditha Jayanga Galappaththige, Jason Lai, Lloyd Windrim et al.

CVPR 2025 • poster • arXiv:2412.03911
#19226

FinePhys: Fine-grained Human Action Generation by Explicitly Incorporating Physical Laws for Effective Skeletal Guidance

Dian Shao, Mingfei Shi, Shengda Xu et al.

CVPR 2025 • poster • arXiv:2505.13437
#19227

HVI: A New Color Space for Low-light Image Enhancement

Qingsen Yan, Yixu Feng, Cheng Zhang et al.

CVPR 2025 • poster • arXiv:2502.20272
#19228

Contact-Aware Refinement of Human Pose Pseudo-Ground Truth via Bioimpedance Sensing

Maria-Paola Forte, Nikos Athanasiou, Giulia Ballardini et al.

ICCV 2025 • poster • arXiv:2512.04862
#19229

LMO: Linear Mamba Operator for MRI Reconstruction

Wei Li, Jiawei Jiang, Jie Wu et al.

CVPR 2025 • poster
#19230

Curriculum Coarse-to-Fine Selection for High-IPC Dataset Distillation

Yanda Chen, Gongwei Chen, Miao Zhang et al.

CVPR 2025 • poster • arXiv:2503.18872
#19231

Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level

Andong Deng, Tongjia Chen, Shoubin Yu et al.

CVPR 2025 • poster • arXiv:2411.09921
#19232

CountLLM: Towards Generalizable Repetitive Action Counting via Large Language Model

Ziyu Yao, Xuxin Cheng, Zhiqi Huang et al.

CVPR 2025 • poster • arXiv:2503.17690
#19233

Low-Rank Adaptation in Multilinear Operator Networks for Security-Preserving Incremental Learning

Huu Binh Ta, Duc Nguyen, Quyen Tran et al.

CVPR 2025 • poster
#19234

T-FAKE: Synthesizing Thermal Images for Facial Landmarking

Philipp Flotho, Moritz Piening, Anna Kukleva et al.

CVPR 2025 • poster • arXiv:2408.15127
#19235

A Theory of Learning Unified Model via Knowledge Integration from Label Space Varying Domains

Dexuan Zhang, Thomas Westfechtel, Tatsuya Harada

CVPR 2025 • poster
#19236

Focal Split: Untethered Snapshot Depth from Differential Defocus

Junjie Luo, John Mamish, Alan Fu et al.

CVPR 2025 • poster • arXiv:2504.11202
#19237

Generative Hard Example Augmentation for Semantic Point Cloud Segmentation

Qi Zhang, Jibin Peng, Zhao Huang et al.

CVPR 2025 • poster
#19238

Localized Concept Erasure for Text-to-Image Diffusion Models Using Training-Free Gated Low-Rank Adaptation

Byung Hyun Lee, Sungjin Lim, Se Young Chun

CVPR 2025 • poster • arXiv:2503.12356
#19239

Continuous Space-Time Video Resampling with Invertible Motion Steganography

Yuantong Zhang, Zhenzhong Chen

CVPR 2025 • poster
#19240

Geometry-guided Online 3D Video Synthesis with Multi-View Temporal Consistency

Hyunho Ha, Lei Xiao, Christian Richardt et al.

CVPR 2025 • poster • arXiv:2505.18932
#19241

Everything to the Synthetic: Diffusion-driven Test-time Adaptation via Synthetic-Domain Alignment

Jiayi Guo, Zhao Junhao, Chaoqun Du et al.

CVPR 2025 • poster • arXiv:2406.04295
#19242

Free on the Fly: Enhancing Flexibility in Test-Time Adaptation with Online EM

Qiyuan Dai, Sibei Yang

CVPR 2025 • poster • arXiv:2507.06973
#19243

OralXrays-9: Towards Hospital-Scale Panoramic X-ray Anomaly Detection via Personalized Multi-Object Query-Aware Mining

Bingzhi Chen, Sisi Fu, Xiaocheng Fang et al.

CVPR 2025 • oral
#19244

Vid2Sim: Generalizable, Video-based Reconstruction of Appearance, Geometry and Physics for Mesh-free Simulation

Chuhao Chen, Zhiyang Dou, Chen Wang et al.

CVPR 2025 • poster • arXiv:2506.06440
#19245

Event Ellipsometer: Event-based Mueller-Matrix Video Imaging

Ryota Maeda, Yunseong Moon, Seung-Hwan Baek

CVPR 2025 • highlight • arXiv:2411.17313
#19246

WeatherGen: A Unified Diverse Weather Generator for LiDAR Point Clouds via Spider Mamba Diffusion

Yang Wu, Yun Zhu, Kaihua Zhang et al.

CVPR 2025 • poster • arXiv:2504.13561
#19247

SeCap: Self-Calibrating and Adaptive Prompts for Cross-view Person Re-Identification in Aerial-Ground Networks

Shining Wang, Yunlong Wang, Ruiqi Wu et al.

CVPR 2025 • highlight • arXiv:2503.06965
#19248

V2V3D: View-to-View Denoised 3D Reconstruction for Light Field Microscopy

Jiayin Zhao, Zhenqi Fu, Tao Yu et al.

CVPR 2025 • poster • arXiv:2504.07853
#19249

A Unified Framework for Heterogeneous Semi-supervised Learning

Marzi Heidari, Abdullah Alchihabi, Hao Yan et al.

CVPR 2025 • poster • arXiv:2503.00286
#19250

SLADE: Shielding against Dual Exploits in Large Vision-Language Models

Md Zarif Hossain, Ahmed Imteaj

CVPR 2025 • poster
#19251

MODA: Motion-Drift Augmentation for Inertial Human Motion Analysis

Yinghao Wu, Shihui Guo, Yipeng Qin

CVPR 2025 • poster
#19252

Learning to Filter Outlier Edges in Global SfM

Nicole Damblon, Marc Pollefeys, Daniel Barath

CVPR 2025 • highlight
#19253

Improving the Training of Data-Efficient GANs via Quality Aware Dynamic Discriminator Rejection Sampling

Zhaoyu Zhang, Yang Hua, Guanxiong Sun et al.

CVPR 2025 • poster
#19254

NLPrompt: Noise-Label Prompt Learning for Vision-Language Models

Bikang Pan, Qun Li, Xiaoying Tang et al.

CVPR 2025 • highlight • arXiv:2412.01256
#19255

No Pains, More Gains: Recycling Sub-Salient Patches for Efficient High-Resolution Image Recognition

Rong Qin, Xin Liu, Xingyu Liu et al.

CVPR 2025 • highlight
#19256

Where's the Liability in the Generative Era? Recovery-based Black-Box Detection of AI-Generated Content

Haoyue Bai, Yiyou Sun, Wei Cheng et al.

CVPR 2025 • poster • arXiv:2505.01008
#19257

Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Video Content

Qiuheng Wang, Yukai Shi, Jiarong Ou et al.

CVPR 2025 • poster • arXiv:2410.08260
#19258

VASparse: Towards Efficient Visual Hallucination Mitigation via Visual-Aware Token Sparsification

Xianwei Zhuang, Zhihong Zhu, Yuxin Xie et al.

CVPR 2025 • poster • arXiv:2501.06553
#19259

Dual-Interrelated Diffusion Model for Few-Shot Anomaly Image Generation

Ying Jin, Jinlong Peng, Qingdong He et al.

CVPR 2025 • poster • arXiv:2408.13509
#19260

CoMatcher: Multi-View Collaborative Feature Matching

Jintao Zhang, Zimin Xia, Mingyue Dong et al.

CVPR 2025 • poster • arXiv:2504.01872
#19261

PillarHist: A Quantization-aware Pillar Feature Encoder based on Height-aware Histogram

Sifan Zhou, Zhihang Yuan, Dawei Yang et al.

CVPR 2025 • poster
#19262

Towards Explicit Geometry-Reflectance Collaboration for Generalized LiDAR Segmentation in Adverse Weather

Longyu Yang, Ping Hu, Shangbo Yuan et al.

CVPR 2025 • poster • arXiv:2506.02396
#19263

Generalizable Object Keypoint Localization from Generative Priors

Dongkai Wang, Jiang Duan, Liangjian Wen et al.

CVPR 2025 • poster
#19264

Integrating Task-Specific and Universal Adapters for Pre-Trained Model-based Class-Incremental Learning

Yan Wang, Da-Wei Zhou, Han-Jia Ye

ICCV 2025 • poster • arXiv:2508.08165
#19265

Chebyshev Attention Depth Permutation Texture Network with Latent Texture Attribute Loss

Ravishankar Evani, Deepu Rajan, Shangbo Mao

CVPR 2025 • poster
#19266

Shining Yourself: High-Fidelity Ornaments Virtual Try-on with Diffusion Model

Yingmao Miao, Zhanpeng Huang, Rui Han et al.

CVPR 2025 • poster • arXiv:2503.16065
#19267

DreamCache: Finetuning-Free Lightweight Personalized Image Generation via Feature Caching

Emanuele Aiello, Umberto Michieli, Diego Valsesia et al.

CVPR 2025 • poster • arXiv:2411.17786
#19268

Linguistics-aware Masked Image Modeling for Self-supervised Scene Text Recognition

Yifei Zhang, Chang Liu, Jin Wei et al.

CVPR 2025 • poster • arXiv:2503.18746
#19269

EIDT-V: Exploiting Intersections in Diffusion Trajectories for Model-Agnostic, Zero-Shot, Training-Free Text-to-Video Generation

Diljeet Jagpal, Xi Chen, Vinay P. Namboodiri

CVPR 2025 • poster • arXiv:2504.06861
#19270

ReWind: Understanding Long Videos with Instructed Learnable Memory

Anxhelo Diko, Tinghuai Wang, Wassim Swaileh et al.

CVPR 2025 • poster • arXiv:2411.15556
#19271

ABBSPO: Adaptive Bounding Box Scaling and Symmetric Prior based Orientation Prediction for Detecting Aerial Image Objects

Woojin Lee, Hyugjae Chang, Jaeho Moon et al.

CVPR 2025 • poster • arXiv:2512.10031
#19272

Semantic-guided Cross-Modal Prompt Learning for Skeleton-based Zero-shot Action Recognition

Anqi Zhu, Jingmin Zhu, James Bailey et al.

CVPR 2025 • poster
#19273

VidHalluc: Evaluating Temporal Hallucinations in Multimodal Large Language Models for Video Understanding

Chaoyu Li, Eun Woo Im, Pooyan Fazli

CVPR 2025 • poster • arXiv:2412.03735
#19274

All-directional Disparity Estimation for Real-world QPD Images

Hongtao Yu, Shaohui Song, Lihu Sun et al.

CVPR 2025 • highlight
#19275

COBRA: COmBinatorial Retrieval Augmentation for Few-Shot Adaptation

Arnav Mohanty Das, Gantavya Bhatt, Lilly Kumari et al.

CVPR 2025 • poster • arXiv:2412.17684
#19276

MFogHub: Bridging Multi-Regional and Multi-Satellite Data for Global Marine Fog Detection and Forecasting

Mengqiu Xu, Kaixin Chen, Heng Guo et al.

CVPR 2025 • poster • arXiv:2505.10281
#19277

Dual Consolidation for Pre-Trained Model-Based Domain-Incremental Learning

Da-Wei Zhou, Zi-Wen Cai, Han-Jia Ye et al.

CVPR 2025 • poster • arXiv:2410.00911
#19278

Efficient Test-time Adaptive Object Detection via Sensitivity-Guided Pruning

Kunyu Wang, Xueyang Fu, Xin Lu et al.

CVPR 2025 • poster • arXiv:2506.02462
#19279

Alignment, Mining and Fusion: Representation Alignment with Hard Negative Mining and Selective Knowledge Fusion for Medical Visual Question Answering

Yuanhao Zou, Zhaozheng Yin

CVPR 2025 • poster • arXiv:2510.08791
#19280

Q-Bench-Video: Benchmark the Video Quality Understanding of LMMs

Zicheng Zhang, Ziheng Jia, Haoning Wu et al.

CVPR 2025 • poster • arXiv:2409.20063
#19281

MAD: Memory-Augmented Detection of 3D Objects

Ben Agro, Sergio Casas, Patrick Wang et al.

CVPR 2025 • poster
#19282

Training-free Neural Architecture Search through Variance of Knowledge of Deep Network Weights

Ondrej Tybl, Lukas Neumann

CVPR 2025 • poster
#19283

RAEncoder: A Label-Free Reversible Adversarial Examples Encoder for Dataset Intellectual Property Protection

Fan Xing, Zhuo Tian, Xuefeng Fan et al.

CVPR 2025 • poster
#19284

Towards Fine-Grained Interpretability: Counterfactual Explanations for Misclassification with Saliency Partition

Zhang Lintong, Kang Yin, Seong-Whan Lee

CVPR 2025 • poster • arXiv:2511.07974
#19285

Not Just Text: Uncovering Vision Modality Typographic Threats in Image Generation Models

Hao Cheng, Erjia Xiao, Jiayan Yang et al.

CVPR 2025 • poster • arXiv:2412.05538
#19286

Mamba-Reg: Vision Mamba Also Needs Registers

Feng Wang, Jiahao Wang, Sucheng Ren et al.

CVPR 2025 • poster
#19287

OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts

Yuxuan Wang, Yueqian Wang, Bo Chen et al.

CVPR 2025 • poster • arXiv:2503.22952
#19288

Imputation-free and Alignment-free: Incomplete Multi-view Clustering Driven by Consensus Semantic Learning

Yuzhuo Dai, Jiaqi Jin, Zhibin Dong et al.

CVPR 2025 • poster • arXiv:2505.11182
#19289

Autoregressive Sequential Pretraining for Visual Tracking

Shiyi Liang, Yifan Bai, Yihong Gong et al.

CVPR 2025 • poster
#19290

Number it: Temporal Grounding Videos like Flipping Manga

Yongliang Wu, Xinting Hu, Yuyang Sun et al.

CVPR 2025 • poster • arXiv:2411.10332
#19291

Uncertainty Meets Diversity: A Comprehensive Active Learning Framework for Indoor 3D Object Detection

Jiangyi Wang, Na Zhao

CVPR 2025 • poster • arXiv:2503.16125
#19292

SeriesBench: A Benchmark for Narrative-Driven Drama Series Understanding

Chenkai Zhang, Yiming Lei, Zeming Liu et al.

CVPR 2025 • poster • arXiv:2504.21435
#19293

GS-2DGS: Geometrically Supervised 2DGS for Reflective Object Reconstruction

Jinguang Tong, Xuesong Li, Fahira Afzal Maken et al.

CVPR 2025 • poster • arXiv:2506.13110
#19294

PanSplat: 4K Panorama Synthesis with Feed-Forward Gaussian Splatting

Cheng Zhang, Haofei Xu, Qianyi Wu et al.

CVPR 2025 • poster • arXiv:2412.12096
#19295

LEDiff: Latent Exposure Diffusion for HDR Generation

Chao Wang, Zhihao Xia, Thomas Leimkuehler et al.

CVPR 2025 • poster • arXiv:2412.14456
#19296

FloVD: Optical Flow Meets Video Diffusion Model for Enhanced Camera-Controlled Video Synthesis

Wonjoon Jin, Qi Dai, Chong Luo et al.

CVPR 2025 • poster • arXiv:2502.08244
#19297

NVILA: Efficient Frontier Visual Language Models

Zhijian Liu, Ligeng Zhu, Baifeng Shi et al.

CVPR 2025 • poster • arXiv:2412.04468
#19298

Fuzzy Multimodal Learning for Trusted Cross-modal Retrieval

Siyuan Duan, Yuan Sun, Dezhong Peng et al.

CVPR 2025 • poster
#19299

AniGS: Animatable Gaussian Avatar from a Single Image with Inconsistent Gaussian Reconstruction

Lingteng Qiu, Shenhao Zhu, Qi Zuo et al.

CVPR 2025 • poster • arXiv:2412.02684
#19300

Seeing More with Less: Human-like Representations in Vision Models

Andrey Gizdov, Shimon Ullman, Daniel Harari

CVPR 2025 • highlight
#19301

Disentangling Safe and Unsafe Image Corruptions via Anisotropy and Locality

Ramchandran Muthukumar, Ambar Pal, Jeremias Sulam et al.

CVPR 2025 • poster
#19302

ViCaS: A Dataset for Combining Holistic and Pixel-level Video Understanding using Captions with Grounded Segmentation

Ali Athar, Xueqing Deng, Liang-Chieh Chen

CVPR 2025 • poster • arXiv:2412.09754
#19303

IRIS: Inverse Rendering of Indoor Scenes from Low Dynamic Range Images

Chih-Hao Lin, Jia-Bin Huang, Zhengqin Li et al.

CVPR 2025 • poster • arXiv:2401.12977
#19304

Can Machines Understand Composition? Dataset and Benchmark for Photographic Image Composition Embedding and Understanding

Zhaoran Zhao, Peng Lu, Anran Zhang et al.

CVPR 2025 • highlight
#19305

Dense-SfM: Structure from Motion with Dense Consistent Matching

JongMin Lee, Sungjoo Yoo

CVPR 2025 • poster • arXiv:2501.14277
#19306

Let's Chorus: Partner-aware Hybrid Song-Driven 3D Head Animation

Xiumei Xie, Zikai Huang, Wenhao Xu et al.

CVPR 2025 • poster
#19307

SPARS3R: Semantic Prior Alignment and Regularization for Sparse 3D Reconstruction

Yutao Tang, Yuxiang Guo, Deming Li et al.

CVPR 2025 • poster • arXiv:2411.12592
#19308

Factored-NeuS: Reconstructing Surfaces, Illumination, and Materials of Possibly Glossy Objects

Yue Fan, Ningjing Fan, Ivan Skorokhodov et al.

CVPR 2025 • poster • arXiv:2305.17929
#19309

TransPixeler: Advancing Text-to-Video Generation with Transparency

Luozhou Wang, Yijun Li, ZhiFei Chen et al.

CVPR 2025 • poster • arXiv:2501.03006
#19310

FlashGS: Efficient 3D Gaussian Splatting for Large-scale and High-resolution Rendering

Guofeng Feng, Siyan Chen, Rong Fu et al.

CVPR 2025 • poster • arXiv:2408.07967
#19311

Variance-Based Membership Inference Attacks Against Large-Scale Image Captioning Models

Daniel Samira, Edan Habler, Yuval Elovici et al.

CVPR 2025 • poster
#19312

VLog: Video-Language Models by Generative Retrieval of Narration Vocabulary

Kevin Qinghong Lin, Mike Zheng Shou

CVPR 2025 • poster • arXiv:2503.09402
#19313

ERUPT: Efficient Rendering with Unposed Patch Transformer

Maxim Shugaev, Vincent Chen, Maxim Karrenbach et al.

CVPR 2025 • poster • arXiv:2503.24374
#19314

Improved Monocular Depth Prediction Using Distance Transform Over Pre-semantic Contours with Self-supervised Neural Networks

Marwane Hariat, Antoine Manzanera, David Filliat

CVPR 2025 • poster
#19315

CompGS: Unleashing 2D Compositionality for Compositional Text-to-3D via Dynamically Optimizing 3D Gaussians

Chongjian Ge, Chenfeng Xu, Yuanfeng Ji et al.

CVPR 2025 • poster • arXiv:2410.20723
#19316

FIRE: Robust Detection of Diffusion-Generated Images via Frequency-Guided Reconstruction Error

Beilin Chu, Xuan Xu, Xin Wang et al.

CVPR 2025 • poster • arXiv:2412.07140
#19317

BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature

Alejandro Lozano, Min Woo Sun, James Burgess et al.

CVPR 2025 • poster • arXiv:2501.07171
#19318

DPC: Dual-Prompt Collaboration for Tuning Vision-Language Models

Haoyang Li, Liang Wang, Chao Wang et al.

CVPR 2025 • poster • arXiv:2503.13443
#19319

Taxonomy-Aware Evaluation of Vision-Language Models

Vésteinn Snæbjarnarson, Kevin Du, Niklas Stoehr et al.

CVPR 2025 • poster • arXiv:2504.05457
#19320

TAPT: Test-Time Adversarial Prompt Tuning for Robust Inference in Vision-Language Models

Xin Wang, Kai Chen, Jiaming Zhang et al.

CVPR 2025 • poster • arXiv:2411.13136
#19321

Towards Practical Real-Time Neural Video Compression

Zhaoyang Jia, Bin Li, Jiahao Li et al.

CVPR 2025 • poster • arXiv:2502.20762
#19322

CDI: Copyrighted Data Identification in Diffusion Models

Jan Dubiński, Antoni Kowalczuk, Franziska Boenisch et al.

CVPR 2025 • poster • arXiv:2411.12858
#19323

Binarized Neural Network for Multi-spectral Image Fusion

Junming Hou, Xiaoyu Chen, Ran Ran et al.

CVPR 2025 • poster
#19324

GaussianIP: Identity-Preserving Realistic 3D Human Generation via Human-Centric Diffusion Prior

Zichen Tang, Yuan Yao, Miaomiao Cui et al.

CVPR 2025 • poster • arXiv:2503.11143
#19325

Sim-to-Real Causal Transfer: A Metric Learning Approach to Causally-Aware Interaction Representations

Ahmad Rahimi, Po-Chien Luan, Yuejiang Liu et al.

CVPR 2025 • poster • arXiv:2312.04540
#19326

Holmes-VAU: Towards Long-term Video Anomaly Understanding at Any Granularity

Huaxin Zhang, Xiaohao Xu, Xiang Wang et al.

CVPR 2025 • highlight • arXiv:2412.06171
#19327

MOS-Attack: A Scalable Multi-objective Adversarial Attack Framework

Ping Guo, Cheng Gong, Fei Liu et al.

CVPR 2025 • poster • arXiv:2501.07251
#19328

Weakly Supervised Semantic Segmentation via Progressive Confidence Region Expansion

Xiangfeng Xu, Pinyi Zhang, Wenxuan Huang et al.

CVPR 2025 • poster
#19329

Disentangled Pose and Appearance Guidance for Multi-Pose Generation

Tengfei Xiao, Yue Wu, Yuelong Li et al.

CVPR 2025 • poster
#19330

ByTheWay: Boost Your Text-to-Video Generation Model to Higher Quality in a Training-free Way

Jiazi Bu, Pengyang Ling, Pan Zhang et al.

CVPR 2025 • poster • arXiv:2410.06241
#19331

Learning Conditional Space-Time Prompt Distributions for Video Class-Incremental Learning

Xiaohan Zou, Wenchao Ma, Shu Zhao

CVPR 2025 • highlight
#19332

Convex Combination Star Shape Prior for Data-driven Image Semantic Segmentation

Xinyu Zhao, Jun Xie, Shengzhe Chen et al.

CVPR 2025 • poster
#19333

Rethinking Personalized Aesthetics Assessment: Employing Physique Aesthetics Assessment as An Exemplification

Haobin Zhong, Shuai He, Anlong Ming et al.

CVPR 2025 • highlight
#19334

Domain Adaptive Diabetic Retinopathy Grading with Model Absence and Flowing Data

Wenxin Su, Song Tang, Xiaofeng Liu et al.

CVPR 2025 • poster • arXiv:2412.01203
#19335

SCFlow2: Plug-and-Play Object Pose Refiner with Shape-Constraint Scene Flow

Qingyuan Wang, Rui Song, Jiaojiao Li et al.

CVPR 2025 • poster • arXiv:2504.09160
#19336

GoLF-NRT: Integrating Global Context and Local Geometry for Few-Shot View Synthesis

You Wang, Li Fang, Hao Zhu et al.

CVPR 2025 • poster • arXiv:2505.19813
#19337

SpatialLLM: A Compound 3D-Informed Design towards Spatially-Intelligent Large Multimodal Models

Wufei Ma, Luoxin Ye, Nessa McWeeney et al.

CVPR 2025 • highlight • arXiv:2505.00788
#19338

Semi-Supervised State-Space Model with Dynamic Stacking Filter for Real-World Video Deraining

Shangquan Sun, Wenqi Ren, Juxiang Zhou et al.

CVPR 2025 • poster • arXiv:2505.16811
#19339

EntropyMark: Towards More Harmless Backdoor Watermark via Entropy-based Constraint for Open-source Dataset Copyright Protection

Ming Sun, Rui Wang, Zixuan Zhu et al.

CVPR 2025 • poster
#19340

Rethinking the Adversarial Robustness of Multi-Exit Neural Networks in an Attack-Defense Game

Keyizhi Xu, Chi Zhang, Zhan Chen et al.

CVPR 2025 • poster
#19341

DrivingSphere: Building a High-fidelity 4D World for Closed-loop Simulation

Tianyi Yan, Dongming Wu, Wencheng Han et al.

CVPR 2025 • poster • arXiv:2411.11252
#19342

MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders

Jiajun Cao, Yuan Zhang, Tao Huang et al.

CVPR 2025 • poster • arXiv:2501.01709
#19343

EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance

Yang Yue, Yulin Wang, Haojun Jiang et al.

CVPR 2025 • poster • arXiv:2504.13065
#19344

Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass

Jianing "Jed" Yang, Alexander Sax, Kevin Liang et al.

CVPR 2025 • poster • arXiv:2501.13928
#19345

Pursuing Temporal-Consistent Video Virtual Try-On via Dynamic Pose Interaction

Dong Li, Wenqi Zhong, Wei Yu et al.

CVPR 2025 • poster • arXiv:2505.16980
#19346

A Unified Image-Dense Annotation Generation Model for Underwater Scenes

Hongkai Lin, Dingkang Liang, Zhenghao Qi et al.

CVPR 2025 • poster • arXiv:2503.21771
#19347

3D-SLNR: A Super Lightweight Neural Representation for Large-scale 3D Mapping

Chenhui Shi, Fulin Tang, Ning An et al.

CVPR 2025 • poster
#19348

STINR: Deciphering Spatial Transcriptomics via Implicit Neural Representation

Yisi Luo, Xile Zhao, Kai Ye et al.

CVPR 2025 • poster
#19349

Multi-Modal Contrastive Masked Autoencoders: A Two-Stage Progressive Pre-training Approach for RGBD Datasets

Muhammad Abdullah Jamal, Omid Mohareri

CVPR 2025 • poster
#19350

Font-Agent: Enhancing Font Understanding with Large Language Models

Yingxin Lai, Cuijie Xu, Haitian Shi et al.

CVPR 2025 • poster
#19351

RADIOv2.5: Improved Baselines for Agglomerative Vision Foundation Models

Greg Heinrich, Mike Ranzinger, Danny Yin et al.

CVPR 2025 • poster • arXiv:2412.07679
#19352

Stabilizing and Accelerating Autofocus with Expert Trajectory Regularized Deep Reinforcement Learning

Shouhang Zhu, Chenglin Li, Yuankun Jiang et al.

CVPR 2025 • poster
#19353

GeoMM: On Geodesic Perspective for Multi-modal Learning

Shibin Mei, Hang Wang, Bingbing Ni

CVPR 2025 • poster • arXiv:2505.11216
#19354

Breaking the Memory Barrier of Contrastive Loss via Tile-Based Strategy

Zesen Cheng, Hang Zhang, Kehan Li et al.

CVPR 2025 • highlight
#19355

GeoAvatar: Geometrically-Consistent Multi-Person Avatar Reconstruction from Sparse Multi-View Videos

Soohyun Lee, SeoYeon Kim, HeeKyung Lee et al.

CVPR 2025 • poster
#19356

Cross-Modal Distillation for 2D/3D Multi-Object Discovery from 2D Motion

Saad Lahlali, Sandra Kara, Hejer Ammar et al.

CVPR 2025 • poster • arXiv:2503.15022
#19357

MASH-VLM: Mitigating Action-Scene Hallucination in Video-LLMs through Disentangled Spatial-Temporal Representations

Kyungho Bae, Jinhyung Kim, Sihaeng Lee et al.

CVPR 2025 • highlight • arXiv:2503.15871
#19358

GPVK-VL: Geometry-Preserving Virtual Keyframes for Visual Localization under Large Viewpoint Changes

Yunxuan Li, Lei Fan, Xiaoying Xing et al.

CVPR 2025 • poster
#19359

Be More Specific: Evaluating Object-centric Realism in Synthetic Images

Anqi Liang, Ciprian Adrian Corneanu, Qianli Feng et al.

CVPR 2025 • poster
#19360

Bootstrap Your Own Views: Masked Ego-Exo Modeling for Fine-grained View-invariant Video Representations

Jungin Park, Jiyoung Lee, Kwanghoon Sohn

CVPR 2025 • poster • arXiv:2503.19706
#19361

CoSDH: Communication-Efficient Collaborative Perception via Supply-Demand Awareness and Intermediate-Late Hybridization

Junhao Xu, Yanan Zhang, Zhi Cai et al.

CVPR 2025 • poster • arXiv:2503.03430
#19362

Hierarchical Knowledge Prompt Tuning for Multi-task Test-Time Adaptation

Qiang Zhang, Mengsheng Zhao, Jiawei Liu et al.

CVPR 2025 • poster
#19363

DPSeg: Dual-Prompt Cost Volume Learning for Open-Vocabulary Semantic Segmentation

Ziyu Zhao, Xiaoguang Li, Lingjia Shi et al.

CVPR 2025 • poster • arXiv:2505.11676
#19364

Hazy Low-Quality Satellite Video Restoration Via Learning Optimal Joint Degradation Patterns and Continuous-Scale Super-Resolution Reconstruction

Ning Ni, Libao Zhang

CVPR 2025 • poster
#19365

Visual Prompting for One-shot Controllable Video Editing without Inversion

Zhengbo Zhang, Yuxi Zhou, Duo Peng et al.

CVPR 2025 • poster • arXiv:2504.14335
#19366

Segment Any Motion in Videos

Nan Huang, Wenzhao Zheng, Chenfeng Xu et al.

CVPR 2025 • poster • arXiv:2503.22268
#19367

Rethinking Few-Shot Adaptation of Vision-Language Models in Two Stages

Matteo Farina, Massimiliano Mancini, Giovanni Iacca et al.

CVPR 2025 • poster • arXiv:2503.11609
#19368

TAGA: Self-supervised Learning for Template-free Animatable Gaussian Articulated Model

Zhichao Zhai, Guikun Chen, Wenguan Wang et al.

CVPR 2025 • poster
#19369

MambaVO: Deep Visual Odometry Based on Sequential Matching Refinement and Training Smoothing

Shuo Wang, Wanting Li, Yongcai Wang et al.

CVPR 2025 • poster • arXiv:2412.20082
#19370

MoManipVLA: Transferring Vision-language-action Models for General Mobile Manipulation

Zhenyu Wu, Yuheng Zhou, Xiuwei Xu et al.

CVPR 2025 • poster • arXiv:2503.13446
#19371

Beyond Single-Modal Boundary: Cross-Modal Anomaly Detection through Visual Prototype and Harmonization

Kai Mao, Ping Wei, Yiyang Lian et al.

CVPR 2025 • poster
#19372

Augmenting Perceptual Super-Resolution via Image Quality Predictors

Fengjia Zhang, Samrudhdhi Rangrej, Tristan T Aumentado-Armstrong et al.

CVPR 2025 • poster • arXiv:2504.18524
#19373

Perception Tokens Enhance Visual Reasoning in Multimodal Language Models

Mahtab Bigverdi, Zelun Luo, Cheng-Yu Hsieh et al.

CVPR 2025 • poster • arXiv:2412.03548
#19374

ViKIENet: Towards Efficient 3D Object Detection with Virtual Key Instance Enhanced Network

Zhuochen Yu, Bijie Qiu, Andy W. H. Khong

CVPR 2025 • poster
#19375

From Head to Tail: Efficient Black-box Model Inversion Attack via Long-tailed Learning

Ziang Li, Hongguang Zhang, Juan Wang et al.

CVPR 2025 • poster • arXiv:2503.16266
#19376

Proximal Algorithm Unrolling: Flexible and Efficient Reconstruction Networks for Single-Pixel Imaging

Ping Wang, Lishun Wang, Gang Qu et al.

CVPR 2025 • poster • arXiv:2505.23180
#19377

Compositional Targeted Multi-Label Universal Perturbations

Hassan Mahmood, Ehsan Elhamifar

CVPR 2025 • poster
#19378

CGMatch: A Different Perspective of Semi-supervised Learning

Bo Cheng, Jueqing Lu, Yuan Tian et al.

CVPR 2025 • poster • arXiv:2503.02231
#19379

Leveraging Temporal Cues for Semi-Supervised Multi-View 3D Object Detection

Jinhyung Park, Navyata Sanghvi, Hiroki Adachi et al.

CVPR 2025 • poster
#19380

DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation

Jianzong Wu, Chao Tang, Jingbo Wang et al.

CVPR 2025 • poster • arXiv:2412.07589
#19381

CocoER: Aligning Multi-Level Feature by Competition and Coordination for Emotion Recognition

Xuli Shen, Hua Cai, Weilin Shen et al.

CVPR 2025 • poster
#19382

Dynamic Motion Blending for Versatile Motion Editing

Nan Jiang, Hongjie Li, Ziye Yuan et al.

CVPR 2025 • poster • arXiv:2503.20724
#19383

A Unified Approach to Interpreting Self-supervised Pre-training Methods for 3D Point Clouds via Interactions

Qiang Li, Jian Ruan, Fanghao Wu et al.

CVPR 2025 • highlight
#19384

SVFR: A Unified Framework for Generalized Video Face Restoration

Zhiyao Wang, Xu Chen, Chengming Xu et al.

CVPR 2025 • poster • arXiv:2501.01235
#19385

Decoupling Fine Detail and Global Geometry for Compressed Depth Map Super-Resolution

Huan Zheng, Wencheng Han, Jianbing Shen

CVPR 2025 • poster • arXiv:2411.03239
#19386

Multi-Scale Neighborhood Occupancy Masked Autoencoder for Self-Supervised Learning in LiDAR Point Clouds

Mohamed Abdelsamad, Michael Ulrich, Claudius Glaeser et al.

CVPR 2025 • poster • arXiv:2502.20316
#19387

CAP-Net: A Unified Network for 6D Pose and Size Estimation of Categorical Articulated Parts from a Single RGB-D Image

Jingshun Huang, Haitao Lin, Tianyu Wang et al.

CVPR 2025 • highlight • arXiv:2504.11230
#19388

Plug-and-Play PPO: An Adaptive Point Prompt Optimizer Making SAM Greater

Xueyu Liu, Rui Wang, Yexin Lai et al.

CVPR 2025 • poster
#19389

Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception

Ruotian Peng, Haiying He, Yake Wei et al.

CVPR 2025 • poster • arXiv:2504.06666
#19390

SCAP: Transductive Test-Time Adaptation via Supportive Clique-based Attribute Prompting

Chenyu Zhang, Kunlun Xu, Zichen Liu et al.

CVPR 2025 • poster • arXiv:2503.12866
#19391

Neuro-3D: Towards 3D Visual Decoding from EEG Signals

Zhanqiang Guo, Jiamin Wu, Yonghao Song et al.

CVPR 2025 • poster • arXiv:2411.12248
#19392

Adventurer: Optimizing Vision Mamba Architecture Designs for Efficiency

Feng Wang, Timing Yang, Yaodong Yu et al.

CVPR 2025 • poster • arXiv:2410.07599
#19393

WeGen: A Unified Model for Interactive Multimodal Generation as We Chat

Zhipeng Huang, Shaobin Zhuang, Canmiao Fu et al.

CVPR 2025 • poster • arXiv:2503.01115
#19394

Reducing Class-wise Confusion for Incremental Learning with Disentangled Manifolds

Huitong Chen, Yu Wang, Yan Fan et al.

CVPR 2025 • poster • arXiv:2503.17677
#19395

Beyond Words: Augmenting Discriminative Richness via Diffusions in Unsupervised Prompt Learning

Hairui Ren, Fan Tang, He Zhao et al.

CVPR 2025 • poster • arXiv:2504.11930
#19396

TokenMotion: Decoupled Motion Control via Token Disentanglement for Human-centric Video Generation

Ruineng Li, Daitao Xing, Huiming Sun et al.

CVPR 2025 • poster • arXiv:2504.08181
#19397

Task-Aware Clustering for Prompting Vision-Language Models

Fusheng Hao, Fengxiang He, Fuxiang Wu et al.

CVPR 2025 • poster
#19398

Hunyuan-Portrait: Implicit Condition Control for Enhanced Portrait Animation

Zunnan Xu, Zhentao Yu, Zixiang Zhou et al.

CVPR 2025 • poster
#19399

MeshArt: Generating Articulated Meshes with Structure-Guided Transformers

Daoyi Gao, Mohd Yawar Nihal Siddiqui, Lei Li et al.

CVPR 2025 • poster • arXiv:2412.11596
#19400

Non-Natural Image Understanding with Advancing Frequency-based Vision Encoders

Wang Lin, Qingsong Wang, Yueying Feng et al.

CVPR 2025 • poster