Most Cited CVPR "lifted model construction" Papers

5,589 papers found • Page 12 of 28

#2201

SAM-I2V: Upgrading SAM to Support Promptable Video Segmentation with Less than 0.2% Training Cost

Haiyang Mei, Pengyu Zhang, Mike Zheng Shou

CVPR 2025 • arXiv:2506.01304
4
citations
#2202

PTDiffusion: Free Lunch for Generating Optical Illusion Hidden Pictures with Phase-Transferred Diffusion Model

Xiang Gao, Shuai Yang, Jiaying Liu

CVPR 2025 • arXiv:2503.06186
4
citations
#2203

Evaluating Vision-Language Models as Evaluators in Path Planning

Mohamed Aghzal, Xiang Yue, Erion Plaku et al.

CVPR 2025 • arXiv:2411.18711
4
citations
#2204

Learning to Detect Objects from Multi-Agent LiDAR Scans without Manual Labels

Qiming Xia, Wenkai Lin, Haoen Xiang et al.

CVPR 2025 • arXiv:2503.08421
4
citations
#2205

Full-DoF Egomotion Estimation for Event Cameras Using Geometric Solvers

Ji Zhao, Banglei Guan, Zibin Liu et al.

CVPR 2025 • highlight • arXiv:2503.03307
4
citations
#2206

LatentHOI: On the Generalizable Hand Object Motion Generation with Latent Hand Diffusion

Muchen Li, Sammy Christen, Chengde Wan et al.

CVPR 2025
4
citations
#2207

Universal Robustness via Median Randomized Smoothing for Real-World Super-Resolution

Zakariya Chaouai, Mohamed Tamaazousti

CVPR 2024 • arXiv:2405.14934
4
citations
#2208

Learning to Highlight Audio by Watching Movies

Chao Huang, Ruohan Gao, J. M. F. Tsang et al.

CVPR 2025 • arXiv:2505.12154
4
citations
#2209

PointLoRA: Low-Rank Adaptation with Token Selection for Point Cloud Learning

Song Wang, Xiaolu Liu, Lingdong Kong et al.

CVPR 2025 • arXiv:2504.16023
4
citations
#2210

Locally Adaptive Neural 3D Morphable Models

Michail Tarasiou, Rolandos Alexandros Potamias, Eimear O'Sullivan et al.

CVPR 2024 • arXiv:2401.02937
4
citations
#2211

VoteFlow: Enforcing Local Rigidity in Self-Supervised Scene Flow

Yancong Lin, Shiming Wang, Liangliang Nan et al.

CVPR 2025 • arXiv:2503.22328
4
citations
#2212

GS-DiT: Advancing Video Generation with Dynamic 3D Gaussian Fields through Efficient Dense 3D Point Tracking

Weikang Bian, Zhaoyang Huang, Xiaoyu Shi et al.

CVPR 2025
4
citations
#2213

Are Images Indistinguishable to Humans Also Indistinguishable to Classifiers?

Zebin You, Xinyu Zhang, Hanzhong Guo et al.

CVPR 2025 • arXiv:2405.18029
4
citations
#2214

SVG-IR: Spatially-Varying Gaussian Splatting for Inverse Rendering

Hanxiao Sun, Yupeng Gao, Jin Xie et al.

CVPR 2025 • arXiv:2504.06815
4
citations
#2215

Enhancing Online Continual Learning with Plug-and-Play State Space Model and Class-Conditional Mixture of Discretization

Sihao Liu, Yibo Yang, Xiaojie Li et al.

CVPR 2025 • arXiv:2412.18177
4
citations
#2216

Radio Frequency Ray Tracing with Neural Object Representation for Enhanced RF Modeling

Xingyu Chen, Zihao Feng, Kun Qian et al.

CVPR 2025
4
citations
#2217

Segment Any-Quality Images with Generative Latent Space Enhancement

Guangqian Guo, Yong Guo, Xuehui Yu et al.

CVPR 2025 • arXiv:2503.12507
4
citations
#2218

MeshGen: Generating PBR Textured Mesh with Render-Enhanced Auto-Encoder and Generative Data Augmentation

Zilong Chen, Yikai Wang, Wenqiang Sun et al.

CVPR 2025 • highlight • arXiv:2505.04656
4
citations
#2219

Deterministic Image-to-Image Translation via Denoising Brownian Bridge Models with Dual Approximators

Bohan Xiao, Peiyong Wang, Qisheng He et al.

CVPR 2025 • arXiv:2512.23463
4
citations
#2220

X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization

Anna Kukleva, Fadime Sener, Edoardo Remelli et al.

CVPR 2024 • arXiv:2403.19811
4
citations
#2221

iSegMan: Interactive Segment-and-Manipulate 3D Gaussians

Yian Zhao, Wanshi Xu, Ruochong Zheng et al.

CVPR 2025 • arXiv:2505.11934
4
citations
#2222

Towards Generalizable Trajectory Prediction using Dual-Level Representation Learning and Adaptive Prompting

Kaouther Messaoud, Matthieu Cord, Alex Alahi

CVPR 2025 • arXiv:2501.04815
4
citations
#2223

FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression

Bo Tong, Bokai Lai, Yiyi Zhou et al.

CVPR 2025 • arXiv:2412.04317
4
citations
#2224

Unity in Diversity: Video Editing via Gradient-Latent Purification

Junyu Gao, Kunlin Yang, Xuan Yao et al.

CVPR 2025
4
citations
#2225

Robust-MVTON: Learning Cross-Pose Feature Alignment and Fusion for Robust Multi-View Virtual Try-On

Nannan Zhang, Yijiang Li, Dong Du et al.

CVPR 2025
4
citations
#2226

Context-Aware Multimodal Pretraining

Karsten Roth, Zeynep Akata, Dima Damen et al.

CVPR 2025 • highlight • arXiv:2411.15099
4
citations
#2227

Geometry in Style: 3D Stylization via Surface Normal Deformation

Nam Anh Dinh, Itai Lang, Hyunwoo Kim et al.

CVPR 2025 • arXiv:2503.23241
4
citations
#2228

Multiplane Prior Guided Few-Shot Aerial Scene Rendering

Zihan Gao, Licheng Jiao, Lingling Li et al.

CVPR 2024 • arXiv:2406.04961
4
citations
#2229

Neural Hierarchical Decomposition for Single Image Plant Modeling

Zhihao Liu, Zhanglin Cheng, Naoto Yokoya

CVPR 2025
4
citations
#2230

Anomaly Score: Evaluating Generative Models and Individual Generated Images based on Complexity and Vulnerability

Jaehui Hwang, Junghyuk Lee, Jong-Seok Lee

CVPR 2024 • arXiv:2312.10634
4
citations
#2231

TexVocab: Texture Vocabulary-conditioned Human Avatars

Yuxiao Liu, Zhe Li, Yebin Liu et al.

CVPR 2024 • arXiv:2404.00524
4
citations
#2232

OmniStereo: Real-time Omnidirectional Depth Estimation with Multiview Fisheye Cameras

Jiaxi Deng, Yushen Wang, Haitao Meng et al.

CVPR 2025
4
citations
#2233

SmartCLIP: Modular Vision-language Alignment with Identification Guarantees

Shaoan Xie, Lingjing Kong, Yujia Zheng et al.

CVPR 2025 • highlight • arXiv:2507.22264
4
citations
#2234

DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval

Leqi Shen, Guoqiang Gong, Tianxiang Hao et al.

CVPR 2025 • arXiv:2506.08887
4
citations
#2235

Real-Time Neural BRDF with Spherically Distributed Primitives

Yishun Dou, Zhong Zheng, Qiaoqiao Jin et al.

CVPR 2024 • arXiv:2310.08332
4
citations
#2236

LidarGait++: Learning Local Features and Size Awareness from LiDAR Point Clouds for 3D Gait Recognition

Chuanfu Shen, Rui Wang, Lixin Duan et al.

CVPR 2025
4
citations
#2237

Anatomically Constrained Implicit Face Models

Prashanth Chandran, Gaspard Zoss

CVPR 2024 • arXiv:2312.07538
4
citations
#2238

LineArt: A Knowledge-guided Training-free High-quality Appearance Transfer for Design Drawing with Diffusion Model

Xi Wang, Hongzhen Li, Heng Fang et al.

CVPR 2025 • arXiv:2412.11519
4
citations
#2239

Anomize: Better Open Vocabulary Video Anomaly Detection

Fei Li, Wenxuan Liu, Jingjing Chen et al.

CVPR 2025 • arXiv:2503.18094
4
citations
#2240

RELOCATE: A Simple Training-Free Baseline for Visual Query Localization Using Region-Based Representations

Savya Khosla, Sethuraman T V, Alexander G. Schwing et al.

CVPR 2025 • arXiv:2412.01826
4
citations
#2241

Reasoning to Attend: Try to Understand How <SEG> Token Works

Rui Qian, Xin Yin, Dejing Dou

CVPR 2025 • arXiv:2412.17741
4
citations
#2242

BG-Triangle: Bézier Gaussian Triangle for 3D Vectorization and Rendering

Minye Wu, Haizhao Dai, Kaixin Yao et al.

CVPR 2025 • arXiv:2503.13961
4
citations
#2243

Scalable Autoregressive Monocular Depth Estimation

Jinhong Wang, Jintai Chen, Jian Liu et al.

CVPR 2025 • arXiv:2411.11361
4
citations
#2244

Fine-grained Prototypical Voting with Heterogeneous Mixup for Semi-supervised 2D-3D Cross-modal Retrieval

Fan Zhang, Xian-Sheng Hua, Chong Chen et al.

CVPR 2024
4
citations
#2245

Event Fields: Capturing Light Fields at High Speed, Resolution, and Dynamic Range

Ziyuan Qu, Zihao Zou, Vivek Boominathan et al.

CVPR 2025 • highlight • arXiv:2412.06191
4
citations
#2246

One-Way Ticket: Time-Independent Unified Encoder for Distilling Text-to-Image Diffusion Models

Senmao Li, Lei Wang, Kai Wang et al.

CVPR 2025
4
citations
#2247

Omni-Q: Omni-Directional Scene Understanding for Unsupervised Visual Grounding

Sai Wang, Yutian Lin, Yu Wu

CVPR 2024
4
citations
#2248

Relation3D: Enhancing Relation Modeling for Point Cloud Instance Segmentation

Edward Loo, Jiacheng Deng

CVPR 2025 • arXiv:2506.17891
4
citations
#2249

HyperLoRA: Parameter-Efficient Adaptive Generation for Portrait Synthesis

Mengtian Li, Jinshu Chen, Wanquan Feng et al.

CVPR 2025 • highlight • arXiv:2503.16944
4
citations
#2250

LPSNet: End-to-End Human Pose and Shape Estimation with Lensless Imaging

Haoyang Ge, Qiao Feng, Hailong Jia et al.

CVPR 2024 • arXiv:2404.01941
4
citations
#2251

Learning 4D Panoptic Scene Graph Generation from Rich 2D Visual Scene

Shengqiong Wu, Hao Fei, Jingkang Yang et al.

CVPR 2025 • highlight • arXiv:2503.15019
4
citations
#2252

Dual-view X-ray Detection: Can AI Detect Prohibited Items from Dual-view X-ray Images like Humans?

Renshuai Tao, Haoyu Wang, Yuzhe Guo et al.

CVPR 2025 • arXiv:2411.18082
4
citations
#2253

ESCAPE: Equivariant Shape Completion via Anchor Point Encoding

Burak Bekci, Nassir Navab, Federico Tombari et al.

CVPR 2025 • arXiv:2412.00952
4
citations
#2254

GCC: Generative Color Constancy via Diffusing a Color Checker

Chen-Wei Chang, Cheng-De Fan, Chia-Che Chang et al.

CVPR 2025 • arXiv:2502.17435
4
citations
#2255

Tokenize Image Patches: Global Context Fusion for Effective Haze Removal in Large Images

Jiuchen Chen, Xinyu Yan, Qizhi Xu et al.

CVPR 2025 • arXiv:2504.09621
4
citations
#2256

Object-Centric Prompt-Driven Vision-Language-Action Model for Robotic Manipulation

Xiaoqi Li, Lingyun Xu, Mingxu Zhang et al.

CVPR 2025 • arXiv:2505.02166
4
citations
#2257

Secret Lies in Color: Enhancing AI-Generated Images Detection with Color Distribution Analysis

Zexi Jia, Chuanwei Huang, Yeshuang Zhu et al.

CVPR 2025
4
citations
#2258

3DFIRES: Few Image 3D REconstruction for Scenes with Hidden Surfaces

Linyi Jin, Nilesh Kulkarni, David Fouhey

CVPR 2024 • arXiv:2403.08768
4
citations
#2259

WISE: A Framework for Gigapixel Whole-Slide-Image Lossless Compression

Yu Mao, Jun Wang, Nan Guan et al.

CVPR 2025 • arXiv:2503.18074
4
citations
#2260

Common3D: Self-Supervised Learning of 3D Morphable Models for Common Objects in Neural Feature Space

Leonhard Sommer, Olaf Dünkel, Christian Theobalt et al.

CVPR 2025 • arXiv:2504.21749
4
citations
#2261

Enhancing Intrinsic Features for Debiasing via Investigating Class-Discerning Common Attributes in Bias-Contrastive Pair

Jeonghoon Park, Chaeyeon Chung, Jaegul Choo

CVPR 2024 • arXiv:2404.19250
4
citations
#2262

Towards All-in-One Medical Image Re-Identification

Yuan Tian, Kaiyuan Ji, Rongzhao Zhang et al.

CVPR 2025 • arXiv:2503.08173
4
citations
#2263

dFLMoE: Decentralized Federated Learning via Mixture of Experts for Medical Data Analysis

Luyuan Xie, Tianyu Luan, Wenyuan Cai et al.

CVPR 2025 • arXiv:2503.10412
4
citations
#2264

ATP: Adaptive Threshold Pruning for Efficient Data Encoding in Quantum Neural Networks

Mohamed Afane, Gabrielle Ebbrecht, Ying Wang et al.

CVPR 2025 • arXiv:2503.21815
4
citations
#2265

Steady Progress Beats Stagnation: Mutual Aid of Foundation and Conventional Models in Mixed Domain Semi-Supervised Medical Image Segmentation

Qinghe Ma, Jian Zhang, Zekun Li et al.

CVPR 2025 • arXiv:2503.16997
4
citations
#2266

Precise Event Spotting in Sports Videos: Solving Long-Range Dependency and Class Imbalance

Sanchayan Santra, Vishal Chudasama, Pankaj Wasnik et al.

CVPR 2025 • arXiv:2503.00147
4
citations
#2267

Interpretable Measures of Conceptual Similarity by Complexity-Constrained Descriptive Auto-Encoding

Alessandro Achille, Greg Ver Steeg, Tian Yu Liu et al.

CVPR 2024 • arXiv:2402.08919
4
citations
#2268

SURGEON: Memory-Adaptive Fully Test-Time Adaptation via Dynamic Activation Sparsity

Ke Ma, Jiaqi Tang, Bin Guo et al.

CVPR 2025 • highlight • arXiv:2503.20354
4
citations
#2269

Doppelgangers++: Improved Visual Disambiguation with Geometric 3D Features

Yuanbo Xiangli, Ruojin Cai, Hanyu Chen et al.

CVPR 2025 • highlight • arXiv:2412.05826
4
citations
#2270

MarkushGrapher: Joint Visual and Textual Recognition of Markush Structures

Lucas Morin, Valery Weber, Ahmed Nassar et al.

CVPR 2025 • arXiv:2503.16096
4
citations
#2271

MoEdit: On Learning Quantity Perception for Multi-object Image Editing

Yanfeng Li, Ka-Hou Chan, Yue Sun et al.

CVPR 2025 • arXiv:2503.10112
4
citations
#2272

VerbDiff: Text-Only Diffusion Models with Enhanced Interaction Awareness

SeungJu Cha, Kwanyoung Lee, Ye-Chan Kim et al.

CVPR 2025 • arXiv:2503.16406
4
citations
#2273

Prompt2Perturb (P2P): Text-Guided Diffusion-Based Adversarial Attack on Breast Ultrasound Images

Yasamin Medghalchi, Moein Heidari, Clayton Allard et al.

CVPR 2025 • arXiv:2412.09910
4
citations
#2274

PSBD: Prediction Shift Uncertainty Unlocks Backdoor Detection

Wei Li, Pin-Yu Chen, Sijia Liu et al.

CVPR 2025 • arXiv:2406.05826
4
citations
#2275

LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale

Joya Chen, Yiqi Lin, Ziyun Zeng et al.

CVPR 2025 • arXiv:2504.16030
4
citations
#2276

Learning Affine Correspondences by Integrating Geometric Constraints

Pengju Sun, Banglei Guan, Zhenbao Yu et al.

CVPR 2025 • arXiv:2504.04834
4
citations
#2277

SPIDeRS: Structured Polarization for Invisible Depth and Reflectance Sensing

Tomoki Ichikawa, Shohei Nobuhara, Ko Nishino

CVPR 2024 • arXiv:2312.04553
4
citations
#2278

Explaining the Implicit Neural Canvas: Connecting Pixels to Neurons by Tracing their Contributions

Namitha Padmanabhan, Matthew A Gwilliam, Pulkit Kumar et al.

CVPR 2024 • arXiv:2401.10217
4
citations
#2279

On Denoising Walking Videos for Gait Recognition

Dongyang Jin, Chao Fan, Jingzhe Ma et al.

CVPR 2025 • arXiv:2505.18582
4
citations
#2280

GIVEPose: Gradual Intra-class Variation Elimination for RGB-based Category-Level Object Pose Estimation

Ziqin Huang, Gu Wang, Chenyangguang Zhang et al.

CVPR 2025 • arXiv:2503.15110
4
citations
#2281

CASAGPT: Cuboid Arrangement and Scene Assembly for Interior Design

Weitao Feng, Hang Zhou, Jing Liao et al.

CVPR 2025 • highlight • arXiv:2504.19478
4
citations
#2282

DaCapo: Score Distillation as Stacked Bridge for Fast and High-quality 3D Editing

Yufei Huang, Bangyan Liao, Yuqi Hu et al.

CVPR 2025
4
citations
#2283

HOP: Heterogeneous Topology-based Multimodal Entanglement for Co-Speech Gesture Generation

Hongye Cheng, Tianyu Wang, Guangsi Shi et al.

CVPR 2025 • arXiv:2503.01175
4
citations
#2284

A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning

Xin Wen, Bingchen Zhao, Yilun Chen et al.

CVPR 2025 • arXiv:2503.06960
4
citations
#2285

Building Optimal Neural Architectures using Interpretable Knowledge

Keith Mills, Fred Han, Mohammad Salameh et al.

CVPR 2024 • arXiv:2403.13293
4
citations
#2286

OpenSDI: Spotting Diffusion-Generated Images in the Open World

Yabin Wang, Zhiwu Huang, Xiaopeng Hong

CVPR 2025 • arXiv:2503.19653
4
citations
#2287

Fractal Calibration for Long-tailed Object Detection

Konstantinos Alexandridis, Ismail Elezi, Jiankang Deng et al.

CVPR 2025 • arXiv:2410.11774
4
citations
#2288

FruitNinja: 3D Object Interior Texture Generation with Gaussian Splatting

Fangyu Wu, Yuhao Chen

CVPR 2025 • arXiv:2411.12089
4
citations
#2289

Test-Time Visual In-Context Tuning

Jiahao Xie, Alessio Tonioni, Nathalie Rauschmayr et al.

CVPR 2025 • arXiv:2503.21777
4
citations
#2290

Apply Hierarchical-Chain-of-Generation to Complex Attributes Text-to-3D Generation

Yiming Qin, Zhu Xu, Yang Liu

CVPR 2025 • arXiv:2505.05505
4
citations
#2291

USP-Gaussian: Unifying Spike-based Image Reconstruction, Pose Correction and Gaussian Splatting

Kang Chen, Jiyuan Zhang, Zecheng Hao et al.

CVPR 2025 • highlight • arXiv:2411.10504
4
citations
#2292

Symmetry Strikes Back: From Single-Image Symmetry Detection to 3D Generation

Xiang Li, Zixuan Huang, Anh Thai et al.

CVPR 2025 • highlight • arXiv:2411.17763
4
citations
#2293

Multi-Label Prototype Visual Spatial Search for Weakly Supervised Semantic Segmentation

Songsong Duan, Xi Yang, Nannan Wang

CVPR 2025 • highlight
4
citations
#2294

GraphMimic: Graph-to-Graphs Generative Modeling from Videos for Policy Learning

Guangyan Chen, Te Cui, Meiling Wang et al.

CVPR 2025
4
citations
#2295

BIGS: Bimanual Category-agnostic Interaction Reconstruction from Monocular Videos via 3D Gaussian Splatting

Jeongwan On, Kyeonghwan Gwak, Gunyoung Kang et al.

CVPR 2025 • arXiv:2504.09097
4
citations
#2296

DEAL: Data-Efficient Adversarial Learning for High-Quality Infrared Imaging

Zhu Liu, Zijun Wang, Jinyuan Liu et al.

CVPR 2025 • arXiv:2503.00905
4
citations
#2297

DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles

Rui Zhao, Weijia Mao, Mike Zheng Shou

CVPR 2025 • arXiv:2503.03651
4
citations
#2298

ZeroVO: Visual Odometry with Minimal Assumptions

Lei Lai, Zekai Yin, Eshed Ohn-Bar

CVPR 2025 • arXiv:2506.08005
4
citations
#2299

Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval

Davide Caffagni, Sara Sarto, Marcella Cornia et al.

CVPR 2025 • arXiv:2503.01980
4
citations
#2300

PatchVSR: Breaking Video Diffusion Resolution Limits with Patch-wise Video Super-Resolution

Shian Du, Menghan Xia, Chang Liu et al.

CVPR 2025 • arXiv:2509.26025
4
citations
#2301

Synchronized Video-to-Audio Generation via Mel Quantization-Continuum Decomposition

Juncheng Wang, Chao Xu, Cheng Yu et al.

CVPR 2025 • arXiv:2503.06984
4
citations
#2302

VideoHandles: Editing 3D Object Compositions in Videos Using Video Generative Priors

Juil Koo, Paul Guerrero, Chun-Hao P. Huang et al.

CVPR 2025 • arXiv:2503.01107
4
citations
#2303

Dynamic Derivation and Elimination: Audio Visual Segmentation with Enhanced Audio Semantics

Chen Liu, Liying Yang, Peike Li et al.

CVPR 2025 • arXiv:2503.12840
4
citations
#2304

AMR-Transformer: Enabling Efficient Long-range Interaction for Complex Neural Fluid Simulation

Zeyi Xu, Jinfan Liu, Kuangxu Chen et al.

CVPR 2025 • arXiv:2503.10257
4
citations
#2305

Enhancing Dataset Distillation via Non-Critical Region Refinement

Minh-Tuan Tran, Trung Le, Xuan-May Le et al.

CVPR 2025 • arXiv:2503.18267
4
citations
#2306

IDEA-Bench: How Far are Generative Models from Professional Designing?

Chen Liang, Lianghua Huang, Jingwu Fang et al.

CVPR 2025 • arXiv:2412.11767
4
citations
#2307

Focus-N-Fix: Region-Aware Fine-Tuning for Text-to-Image Generation

Xiaoying Xing, Avinab Saha, Junfeng He et al.

CVPR 2025 • highlight • arXiv:2501.06481
4
citations
#2308

Unbiased Video Scene Graph Generation via Visual and Semantic Dual Debiasing

Yanjun Li, Zhaoyang Li, Honghui Chen et al.

CVPR 2025 • arXiv:2503.00548
4
citations
#2309

Spatiotemporal Decoupling for Efficient Vision-Based Occupancy Forecasting

Jingyi Xu, Xieyuanli Chen, Junyi Ma et al.

CVPR 2025 • arXiv:2411.14169
4
citations
#2310

QuCOOP: A Versatile Framework for Solving Composite and Binary-Parametrised Problems on Quantum Annealers

Natacha Kuete Meli, Vladislav Golyanik, Marcel Seelbach Benkner et al.

CVPR 2025 • highlight • arXiv:2503.19718
4
citations
#2311

U-Know-DiffPAN: An Uncertainty-aware Knowledge Distillation Diffusion Framework with Details Enhancement for PAN-Sharpening

Sungpyo Kim, Jeonghyeok Do, Jaehyup Lee et al.

CVPR 2025 • arXiv:2412.06243
4
citations
#2312

Dual-Agent Optimization framework for Cross-Domain Few-Shot Segmentation

Zhaoyang Li, Yuan Wang, Wangkai Li et al.

CVPR 2025
4
citations
#2313

FreeScene: Mixed Graph Diffusion for 3D Scene Synthesis from Free Prompts

Tongyuan Bai, Wangyuanfan Bai, Dong Chen et al.

CVPR 2025 • arXiv:2506.02781
4
citations
#2314

Noise-Resistant Video Anomaly Detection via RGB Error-Guided Multiscale Predictive Coding and Dynamic Memory

Han Hu, Wenli Du, Peng Liao et al.

CVPR 2025
4
citations
#2315

GASP: Gaussian Avatars with Synthetic Priors

Jack Saunders, Charlie Hewitt, Yanan Jian et al.

CVPR 2025 • arXiv:2412.07739
4
citations
#2316

Floxels: Fast Unsupervised Voxel Based Scene Flow Estimation

David T. Hoffmann, Syed Haseeb Raza, Hanqiu Jiang et al.

CVPR 2025 • arXiv:2503.04718
4
citations
#2317

TKG-DM: Training-free Chroma Key Content Generation Diffusion Model

Ryugo Morita, Stanislav Frolov, Brian Bernhard Moser et al.

CVPR 2025 • highlight • arXiv:2411.15580
4
citations
#2318

Order-One Rolling Shutter Cameras

Marvin Anas Hahn, Kathlén Kohn, Orlando Marigliano et al.

CVPR 2025 • highlight • arXiv:2403.11295
4
citations
#2319

Channel Consistency Prior and Self-Reconstruction Strategy Based Unsupervised Image Deraining

Guanglu Dong, Tianheng Zheng, Yuanzhouhan Cao et al.

CVPR 2025 • arXiv:2503.18703
4
citations
#2320

Ev-3DOD: Pushing the Temporal Boundaries of 3D Object Detection with Event Cameras

Hoonhee Cho, Jae-Young Kang, Youngho Kim et al.

CVPR 2025 • highlight • arXiv:2502.19630
4
citations
#2321

ReRAW: RGB-to-RAW Image Reconstruction via Stratified Sampling for Efficient Object Detection on the Edge

Radu Berdan, Beril Besbinar, Christoph Reinders et al.

CVPR 2025 • arXiv:2503.03782
4
citations
#2322

SeNM-VAE: Semi-Supervised Noise Modeling with Hierarchical Variational Autoencoder

Dihan Zheng, Yihang Zou, Xiaowen Zhang et al.

CVPR 2024 • arXiv:2403.17502
4
citations
#2323

Learning to Remove Wrinkled Transparent Film with Polarized Prior

Jiaqi Tang, Ruizheng Wu, Xiaogang Xu et al.

CVPR 2024 • arXiv:2403.04368
4
citations
#2324

MARBLE: Material Recomposition and Blending in CLIP-Space

Ta-Ying Cheng, Prafull Sharma, Mark Boss et al.

CVPR 2025 • arXiv:2506.05313
4
citations
#2325

On the Out-Of-Distribution Generalization of Large Multimodal Models

Xingxuan Zhang, Jiansheng Li, Wenjing Chu et al.

CVPR 2025
4
citations
#2326

Memories of Forgotten Concepts

Matan Rusanovsky, Shimon Malnick, Amir Jevnisek et al.

CVPR 2025 • highlight • arXiv:2412.00782
4
citations
#2327

FaceLift: Semi-supervised 3D Facial Landmark Localization

David Ferman, Pablo Garrido, Gaurav Bharaj

CVPR 2024 • arXiv:2405.19646
4
citations
#2328

Mono3DVLT: Monocular-Video-Based 3D Visual Language Tracking

Hongkai Wei, Yang Yang, Shijie Sun et al.

CVPR 2025
4
citations
#2329

HyperSeg: Hybrid Segmentation Assistant with Fine-grained Visual Perceiver

Cong Wei, Haoxian Tan, Yujie Zhong et al.

CVPR 2025
4
citations
#2330

Enhanced OoD Detection through Cross-Modal Alignment of Multi-Modal Representations

Jeonghyeon Kim, Sangheum Hwang

CVPR 2025 • arXiv:2503.18817
4
citations
#2331

Pose Adapted Shape Learning for Large-Pose Face Reenactment

Gee-Sern Hsu, Jie-Ying Zhang, Yu-Hsiang Huang et al.

CVPR 2024
4
citations
#2332

Multi-modal Medical Diagnosis via Large-small Model Collaboration

Wanyi Chen, Zihua Zhao, Jiangchao Yao et al.

CVPR 2025
4
citations
#2333

Improving Subject-Driven Image Synthesis with Subject-Agnostic Guidance

Kelvin C.K. Chan, Yang Zhao, Xuhui Jia et al.

CVPR 2024 • arXiv:2405.01356
4
citations
#2334

URWKV: Unified RWKV Model with Multi-state Perspective for Low-light Image Restoration

Rui Xu, Yuzhen Niu, Yuezhou Li et al.

CVPR 2025 • arXiv:2505.23068
4
citations
#2335

Spectral State Space Model for Rotation-Invariant Visual Representation Learning

Sahar Dastani, Ali Bahri, Moslem Yazdanpanah et al.

CVPR 2025 • arXiv:2503.06369
4
citations
#2336

Atom-Level Optical Chemical Structure Recognition with Limited Supervision

Martijn Oldenhof, Edward De Brouwer, Adam Arany et al.

CVPR 2024 • arXiv:2404.01743
4
citations
#2337

BiM-VFI: Bidirectional Motion Field-Guided Frame Interpolation for Video with Non-uniform Motions

Wonyong Seo, Jihyong Oh, Munchurl Kim

CVPR 2025 • arXiv:2412.11365
4
citations
#2338

POCE: Primal Policy Optimization with Conservative Estimation for Multi-constraint Offline Reinforcement Learning

Jiayi Guan, Li Shen, Ao Zhou et al.

CVPR 2024
4
citations
#2339

Morpheus: Text-Driven 3D Gaussian Splat Shape and Color Stylization

Jamie Wynn, Zawar Qureshi, Jakub Powierza et al.

CVPR 2025 • arXiv:2503.02009
4
citations
#2340

SnowMaster: Comprehensive Real-world Image Desnowing via MLLM with Multi-Model Feedback Optimization

Jianyu Lai, Sixiang Chen, Yunlong Lin et al.

CVPR 2025
4
citations
#2341

Enhanced then Progressive Fusion with View Graph for Multi-View Clustering

Zhibin Dong, Meng Liu, Siwei Wang et al.

CVPR 2025
4
citations
#2342

GENIUS: A Generative Framework for Universal Multimodal Search

Sungyeon Kim, Xinliang Zhu, Xiaofan Lin et al.

CVPR 2025 • arXiv:2503.19868
4
citations
#2343

DistinctAD: Distinctive Audio Description Generation in Contexts

Bo Fang, Wenhao Wu, Qiangqiang Wu et al.

CVPR 2025 • highlight • arXiv:2411.18180
4
citations
#2344

SphereUFormer: A U-Shaped Transformer for Spherical 360 Perception

Yaniv Benny, Lior Wolf

CVPR 2025 • arXiv:2412.06968
4
citations
#2345

BiLoRA: Almost-Orthogonal Parameter Spaces for Continual Learning

Hao Zhu, Yifei Zhang, Junhao Dong et al.

CVPR 2025
4
citations
#2346

Ges3ViG: Incorporating Pointing Gestures into Language-Based 3D Visual Grounding for Embodied Reference Understanding

Atharv Mahesh Mane, Dulanga Weerakoon, Vigneshwaran Subbaraju et al.

CVPR 2025 • arXiv:2504.09623
4
citations
#2347

SSHNet: Unsupervised Cross-modal Homography Estimation via Problem Reformulation and Split Optimization

Junchen Yu, Siyuan Cao, Runmin Zhang et al.

CVPR 2025 • highlight • arXiv:2409.17993
4
citations
#2348

UniAP: Unifying Inter- and Intra-Layer Automatic Parallelism by Mixed Integer Quadratic Programming

Hao Lin, Ke Wu, Jie Li et al.

CVPR 2025 • arXiv:2307.16375
4
citations
#2349

Understanding Multi-Task Activities from Single-Task Videos

Yuhan Shen, Ehsan Elhamifar

CVPR 2025 • highlight
4
citations
#2350

End-to-End Implicit Neural Representations for Classification

Alexander Gielisse, Jan van Gemert

CVPR 2025 • arXiv:2503.18123
4
citations
#2351

SAT-HMR: Real-Time Multi-Person 3D Mesh Estimation via Scale-Adaptive Tokens

Chi Su, Xiaoxuan Ma, Jiajun Su et al.

CVPR 2025 • arXiv:2411.19824
4
citations
#2352

Learning Audio-guided Video Representation with Gated Attention for Video-Text Retrieval

Boseung Jeong, Jicheol Park, Sungyeon Kim et al.

CVPR 2025 • arXiv:2504.02397
4
citations
#2353

Single Domain Generalization for Few-Shot Counting via Universal Representation Matching

Xianing Chen, Si Huo, Borui Jiang et al.

CVPR 2025 • arXiv:2505.16778
4
citations
#2354

Balanced Direction from Multifarious Choices: Arithmetic Meta-Learning for Domain Generalization

Xiran Wang, Jian Zhang, Lei Qi et al.

CVPR 2025 • arXiv:2503.18987
4
citations
#2355

Taste More, Taste Better: Diverse Data and Strong Model Boost Semi-Supervised Crowd Counting

Maochen Yang, Zekun Li, Jian Zhang et al.

CVPR 2025 • arXiv:2503.17984
4
citations
#2356

Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning

Sherry X. Chen, Misha Sra, Pradeep Sen

CVPR 2025 • arXiv:2503.18406
4
citations
#2357

Protecting Your Video Content: Disrupting Automated Video-based LLM Annotations

Haitong Liu, Kuofeng Gao, Yang Bai et al.

CVPR 2025 • arXiv:2503.21824
4
citations
#2358

Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors

Zhengfei Kuang, Tianyuan Zhang, Kai Zhang et al.

CVPR 2025 • arXiv:2411.17249
4
citations
#2359

SAM-REF: Introducing Image-Prompt Synergy during Interaction for Detail Enhancement in the Segment Anything Model

Chongkai Yu, Ting Liu, Li Anqi et al.

CVPR 2025 • arXiv:2408.11535
3
citations
#2360

WALT3D: Generating Realistic Training Data from Time-Lapse Imagery for Reconstructing Dynamic Objects Under Occlusion

Khiem Vuong, N. Dinesh Reddy, Robert Tamburo et al.

CVPR 2024 • arXiv:2403.19022
3
citations
#2361

PSA-SSL: Pose and Size-aware Self-Supervised Learning on LiDAR Point Clouds

Barza Nisar, Steven L. Waslander

CVPR 2025 • arXiv:2503.13914
3
citations
#2362

ASHiTA: Automatic Scene-grounded HIerarchical Task Analysis

Yun Chang, Leonor Fermoselle, Duy Ta et al.

CVPR 2025 • arXiv:2504.06553
3
citations
#2363

Enhancing Diversity for Data-free Quantization

Kai Zhao, Zhihao Zhuang, Miao Zhang et al.

CVPR 2025
3
citations
#2364

Sufficient Invariant Learning for Distribution Shift

Taero Kim, Subeen Park, Sungjun Lim et al.

CVPR 2025 • arXiv:2210.13533
3
citations
#2365

CAPE: CAM as a Probabilistic Ensemble for Enhanced DNN Interpretation

Townim Chowdhury, Kewen Liao, Vu Minh Hieu Phan et al.

CVPR 2024 • arXiv:2404.02388
3
citations
#2366

Tracktention: Leveraging Point Tracking to Attend Videos Faster and Better

Zihang Lai, Andrea Vedaldi

CVPR 2025 • highlight • arXiv:2503.19904
3
citations
#2367

Noise Diffusion for Enhancing Semantic Faithfulness in Text-to-Image Synthesis

Boming Miao, Chunxiao Li, Xiaoxiao Wang et al.

CVPR 2025 • arXiv:2411.16503
3
citations
#2368

Parallel Sequence Modeling via Generalized Spatial Propagation Network

Hongjun Wang, Wonmin Byeon, Jiarui Xu et al.

CVPR 2025 • arXiv:2501.12381
3
citations
#2369

Reproducible Vision-Language Models Meet Concepts Out of Pre-Training

Ziliang Chen, Xin Huang, Xiaoxuan Fan et al.

CVPR 2025
3
citations
#2370

LOD-GS: Achieving Levels of Detail using Scalable Gaussian Soup

Jianxiong Shen, Yue Qian, Xiaohang Zhan

CVPR 2025
3
citations
#2371

GroomLight: Hybrid Inverse Rendering for Relightable Human Hair Appearance Modeling

Yang Zheng, Menglei Chai, Delio Vicini et al.

CVPR 2025 • arXiv:2503.10597
3
citations
#2372

Reasoning in Visual Navigation of End-to-end Trained Agents: A Dynamical Systems Approach

Steeven Janny, Hervé Poirier, Leonid Antsfeld et al.

CVPR 2025 • highlight • arXiv:2503.08306
3
citations
#2373

Ground-V: Teaching VLMs to Ground Complex Instructions in Pixels

Yongshuo Zong, Qin Zhang, Dongsheng An et al.

CVPR 2025 • arXiv:2505.13788
3
citations
#2374

Unleashing the Potential of Consistency Learning for Detecting and Grounding Multi-Modal Media Manipulation

Yiheng Li, Yang Yang, Zichang Tan et al.

CVPR 2025 • arXiv:2506.05890
3
citations
#2375

GG-SSMs: Graph-Generating State Space Models

Nikola Zubic, Davide Scaramuzza

CVPR 2025
3
citations
#2376

SemanticDraw: Towards Real-Time Interactive Content Creation from Image Diffusion Models

Jaerin Lee, Daniel Jung, Kanggeon Lee et al.

CVPR 2025 • arXiv:2403.09055
3
citations
#2377

HalLoc: Token-level Localization of Hallucinations for Vision Language Models

Eunkyu Park, Minyeong Kim, Gunhee Kim

CVPR 2025 • arXiv:2506.10286
3
citations
#2378

ATA: Adaptive Transformation Agent for Text-Guided Subject-Position Variable Background Inpainting

Yizhe Tang, Zhimin Sun, Yuzhen Du et al.

CVPR 2025
3
citations
#2379

PMA: Towards Parameter-Efficient Point Cloud Understanding via Point Mamba Adapter

Yaohua Zha, Yanzi Wang, Hang Guo et al.

CVPR 2025 • arXiv:2505.20941
3
citations
#2380

4D-Fly: Fast 4D Reconstruction from a Single Monocular Video

Diankun Wu, Fangfu Liu, Yi-Hsin Hung et al.

CVPR 2025
3
citations
#2381

Scaling Down Text Encoders of Text-to-Image Diffusion Models

Lifu Wang, Daqing Liu, Xinchen Liu et al.

CVPR 2025 • arXiv:2503.19897
3
citations
#2382

FaceCom: Towards High-fidelity 3D Facial Shape Completion via Optimization and Inpainting Guidance

Yinglong Li, Hongyu Wu, Wang et al.

CVPR 2024 • arXiv:2406.02074
3
citations
#2383

Simpler Diffusion: 1.5 FID on ImageNet512 with Pixel-space Diffusion

Emiel Hoogeboom, Thomas Mensink, Jonathan Heek et al.

CVPR 2025
3
citations
#2384

FFR: Frequency Feature Rectification for Weakly Supervised Semantic Segmentation

Ziqian Yang, Xinqiao Zhao, Xiaolei Wang et al.

CVPR 2025
3
citations
#2385

Action Detail Matters: Refining Video Recognition with Local Action Queries

Mengmeng Wang, Zeyi Huang, Xiangjie Kong et al.

CVPR 2025
3
citations
#2386

From Laboratory to Real World: A New Benchmark Towards Privacy-Preserved Visible-Infrared Person Re-Identification

Yan Jiang, Hao Yu, Xu Cheng et al.

CVPR 2025
3
citations
#2387

LibraGrad: Balancing Gradient Flow for Universally Better Vision Transformer Attributions

Faridoun Mehri, Mahdieh Baghshah, Mohammad Taher Pilehvar

CVPR 2025 • arXiv:2411.16760
3
citations
#2388

DFM: Differentiable Feature Matching for Anomaly Detection

Wu Sheng, Yimi Wang, Xudong Liu et al.

CVPR 2025
3
citations
#2389

LogoSP: Local-global Grouping of Superpoints for Unsupervised Semantic Segmentation of 3D Point Clouds

Zihui Zhang, Weisheng Dai, Hongtao Wen et al.

CVPR 2025 • arXiv:2506.07857
3
citations
#2390

Previously on ... From Recaps to Story Summarization

Aditya Kumar Singh, Dhruv Srivastava, Makarand Tapaswi

CVPR 2024
3
citations
#2391

Reconstruction-free Cascaded Adaptive Compressive Sensing

Chenxi Qiu, Tao Yue, Xuemei Hu

CVPR 2024
3
citations
#2392

Degrees of Freedom Matter: Inferring Dynamics from Point Trajectories

Yan Zhang, Sergey Prokudin, Marko Mihajlovic et al.

CVPR 2024 • arXiv:2406.03625
3
citations
#2393

Towards Understanding and Quantifying Uncertainty for Text-to-Image Generation

Gianni Franchi, Nacim Belkhir, Dat Nguyen et al.

CVPR 2025 • arXiv:2412.03178
3
citations
#2394

Conditional Balance: Improving Multi-Conditioning Trade-Offs in Image Generation

Nadav Z. Cohen, Oron Nir, Ariel Shamir

CVPR 2025 • arXiv:2412.19853
3
citations
#2395

On the Generalization of Handwritten Text Recognition Models

Carlos Garrido-Munoz, Jorge Calvo-Zaragoza

CVPR 2025 • arXiv:2411.17332
3
citations
#2396

Distilled Datamodel with Reverse Gradient Matching

Jingwen Ye, Ruonan Yu, Songhua Liu et al.

CVPR 2024 • arXiv:2404.14006
3
citations
#2397

StickMotion: Generating 3D Human Motions by Drawing a Stickman

Tao Wang, Zhihua Wu, Qiaozhi He et al.

CVPR 2025 • arXiv:2503.04829
3
citations
#2398

A Regularization-Guided Equivariant Approach for Image Restoration

Yulu Bai, Jiahong Fu, Qi Xie et al.

CVPR 2025 • arXiv:2505.19799
3
citations
#2399

SFDM: Robust Decomposition of Geometry and Reflectance for Realistic Face Rendering from Sparse-view Images

Daisheng Jin, Jiangbei Hu, Baixin Xu et al.

CVPR 2025 • arXiv:2312.06085
3
citations
#2400

Finsler Multi-Dimensional Scaling: Manifold Learning for Asymmetric Dimensionality Reduction and Embedding

Thomas Dagès, Simon Weber, Ya-Wei Eileen Lin et al.

CVPR 2025 • arXiv:2503.18010
3
citations