Most Cited CVPR &quot;interpretable neural networks&quot; Papers

#2602

Residual Learning in Diffusion Models

Junyu Zhang, Daochang Liu, Eunbyung Park et al.

CVPR 2024highlight

CVPR 2025posterarXiv:2503.23606

#2603

Blurry-Edges: Photon-Limited Depth Estimation from Defocused Boundaries

Wei Xu, Charlie Wagner, Junjie Luo et al.

CVPR 2025posterarXiv:2412.11365

#2604

BiM-VFI: Bidirectional Motion Field-Guided Frame Interpolation for Video with Non-uniform Motions

Wonyong Seo, Jihyong Oh, Munchurl Kim

CVPR 2025posterarXiv:2504.05499

#2605

Few-shot Personalized Scanpath Prediction

Ruoyu Xue, Jingyi Xu, Sounak Mondal et al.

CVPR 2025posterarXiv:2506.01591

#2606

Silence is Golden: Leveraging Adversarial Examples to Nullify Audio Control in LDM-based Talking-Head Generation

Yuan Gan, Jiaxu Miao, Yunze Wang et al.

CVPR 2025posterarXiv:2508.15127

#2607

Towards Source-Free Machine Unlearning

Sk Miraj Ahmed, Umit Basaran, Dripta S. Raychaudhuri et al.

CVPR 2024posterarXiv:2309.13855

#2608

Adaptive Softassign via Hadamard-Equipped Sinkhorn

Binrui Shen, Qiang Niu, Shengxin Zhu

CVPR 2025posterarXiv:2505.01237

#2609

CAV-MAE Sync: Improving Contrastive Audio-Visual Mask Autoencoders via Fine-Grained Alignment

Edson Araujo, Andrew Rouditchenko, Yuan Gong et al.

CVPR 2025posterarXiv:2506.02893

#2610

Dense Match Summarization for Faster Two-view Estimation

Jonathan Astermark, Anders Heyden, Viktor Larsson

CVPR 2025highlightarXiv:2503.00605

#2611

GenVDM: Generating Vector Displacement Maps From a Single Image

Yuezhi Yang, Qimin Chen, Vladimir G. Kim et al.

CVPR 2025posterarXiv:2503.22462

#2612

SemAlign3D: Semantic Correspondence between RGB-Images through Aligning 3D Object-Class Representations

Krispin Wandel, Hesheng Wang

CVPR 2024posterarXiv:2311.10339

#2613

A2XP: Towards Private Domain Generalization

Geunhyeok Yu, Hyoseok Hwang

#2614

Rethinking Correspondence-based Category-Level Object Pose Estimation

Huan Ren, Wenfei Yang, Shifeng Zhang et al.

#2615

Boosting the Dual-Stream Architecture in Ultra-High Resolution Segmentation with Resolution-Biased Uncertainty Estimation

Rong Qin, Xingyu Liu, Jinglei Shi et al.

CVPR 2025posterarXiv:2503.12535

#2616

SPC-GS: Gaussian Splatting with Semantic-Prompt Consistency for Indoor Open-World Free-view Synthesis from Sparse Inputs

Guibiao Liao, Qing Li, Zhenyu Bao et al.

CVPR 2025posterarXiv:2503.10526

#2617

NeighborRetr: Balancing Hub Centrality in Cross-Modal Retrieval

Zengrong Lin, Zheng Wang, Tianwen Qian et al.

#2618

SEC-Prompt:SEmantic Complementary Prompting for Few-Shot Class-Incremental Learning

Ye Liu, Meng Yang

CVPR 2024posterarXiv:2307.08727

#2619

Learning to Count without Annotations

Lukas Knobel, Tengda Han, Yuki Asano

#2620

Reasoning Mamba: Hypergraph-Guided Region Relation Calculating for Weakly Supervised Affordance Grounding

Yuxuan Wang, Aming Wu, Muli Yang et al.

#2621

DiffCAM: Data-Driven Saliency Maps by Capturing Feature Differences

Xingjian Li, Qiming Zhao, Neelesh Bisht et al.

CVPR 2025posterarXiv:2408.11535

#2622

SAM-REF: Introducing Image-Prompt Synergy during Interaction for Detail Enhancement in the Segment Anything Model

Chongkai Yu, Ting Liu, Li Anqi et al.

CVPR 2025posterarXiv:2402.08784

#2623

Preconditioners for the Stochastic Training of Neural Fields

Shin-Fang Chng, Hemanth Saratchandran, Simon Lucey

CVPR 2025posterarXiv:2411.16773

#2624

MICAS: Multi-grained In-Context Adaptive Sampling for 3D Point Cloud Processing

Feifei Shao, Ping Liu, Zhao Wang et al.

CVPR 2025posterarXiv:2504.03800

#2625

Decision SpikeFormer: Spike-Driven Transformer for Decision Making

Wei Huang, Qinying Gu, Nanyang Ye

CVPR 2025posterarXiv:2505.17475

#2626

PoseBH: Prototypical Multi-Dataset Training Beyond Human Pose Estimation

Uyoung Jeong, Jonathan Freer, Seungryul Baek et al.

CVPR 2025posterarXiv:2503.21772

#2627

LOCORE: Image Re-ranking with Long-Context Sequence Modeling

Zilin Xiao, Pavel Suma, Ayush Sachdeva et al.

CVPR 2025posterarXiv:2506.04174

#2628

FlexGS: Train Once, Deploy Everywhere with Many-in-One Flexible 3D Gaussian Splatting

Hengyu Liu, Yuehao Wang, Chenxin Li et al.

CVPR 2025posterarXiv:2506.11131

#2629

Segment This Thing: Foveated Tokenization for Efficient Point-Prompted Segmentation

Tanner Schmidt, Richard Newcombe

CVPR 2025highlightarXiv:2503.14359

#2630

ImViD: Immersive Volumetric Videos for Enhanced VR Engagement

Zhengxian Yang, Shi Pan, Shengqi Wang et al.

CVPR 2025posterarXiv:2405.18560

#2631

Potential Field Based Deep Metric Learning

Shubhang Bhatnagar, Narendra Ahuja

#2632

Deep Fair Multi-View Clustering with Attention KAN

HaiMing Xu, Qianqian Wang, Boyue Wang et al.

#2633

DeepCompress-ViT: Rethinking Model Compression to Enhance Efficiency of Vision Transformers at the Edge

Sabbir Ahmed, Abdullah Al Arafat, Deniz Najafi et al.

CVPR 2025posterarXiv:2503.19295

#2634

Exploring Semantic Feature Discrimination for Perceptual Image Super-Resolution and Opinion-Unaware No-Reference Image Quality Assessment

Guanglu Dong, Xiangyu Liao, Mingyang Li et al.

CVPR 2025posterarXiv:2503.06746

#2635

Color Alignment in Diffusion

Ka Chun SHUM, Binh-Son Hua, Thanh Nguyen et al.

#2636

PhyS-EdiT: Physics-aware Semantic Image Editing with Text Description

Ziqi Cai, Shuchen Weng, Yifei Xia et al.

CVPR 2025highlightarXiv:2506.09343

#2637

CheckManual: A New Challenge and Benchmark for Manual-based Appliance Manipulation

Yuxing Long, Jiyao Zhang, Mingjie Pan et al.

CVPR 2025posterarXiv:2601.07377

#2638

Learning Dynamic Collaborative Network for Semi-supervised 3D Vessel Segmentation

Jiao Xu, Xin Chen, Lihe Zhang

CVPR 2025posterarXiv:2503.03115

#2639

NTR-Gaussian: Nighttime Dynamic Thermal Reconstruction with 4D Gaussian Splatting Based on Thermodynamics

Kun Yang, Yuxiang Liu, Zeyu Cui et al.

CVPR 2025highlightarXiv:2504.01383

#2640

v-CLR: View-Consistent Learning for Open-World Instance Segmentation

Chang-Bin Zhang, Jinhong Ni, Yujie Zhong et al.

CVPR 2025posterarXiv:2504.01466

#2641

Mesh Mamba: A Unified State Space Model for Saliency Prediction in Non-Textured and Textured Meshes

Kaiwei Zhang, Dandan Zhu, Xiongkuo Min et al.

CVPR 2025posterarXiv:2504.09606

#2642

Early-Bird Diffusion: Investigating and Leveraging Timestep-Aware Early-Bird Tickets in Diffusion Models for Efficient Training

Lexington Whalen, Zhenbang Du, Haoran You et al.

CVPR 2025posterarXiv:2412.00124

#2643

Auto-Encoded Supervision for Perceptual Image Super-Resolution

MinKyu Lee, Sangeek Hyun, Woojin Jun et al.

#2644

Adapting Text-to-Image Generation with Feature Difference Instruction for Generic Image Restoration

Chao Wang, Hehe Fan, Huichen Yang et al.

CVPR 2025posterarXiv:2504.02862

#2645

Towards Understanding How Knowledge Evolves in Large Vision-Language Models

Sudong Wang, Yunjian Zhang, Yao Zhu et al.

CVPR 2025posterarXiv:2412.08859

#2646

ViUniT: Visual Unit Tests for More Robust Visual Programming

Artemis Panagopoulou, Honglu Zhou, silvio savarese et al.

CVPR 2025posterarXiv:2505.03097

#2647

Not All Parameters Matter: Masking Diffusion Models for Enhancing Generation Ability

Lei Wang, Senmao Li, Fei Yang et al.

CVPR 2025posterarXiv:2505.07300

#2648

L-SWAG: Layer-Sample Wise Activation with Gradients Information for Zero-Shot NAS on Vision Transformers

Sofia Casarin, Sergio Escalera, Oswald Lanz

CVPR 2025posterarXiv:2503.07446

#2649

EigenGS Representation: From Eigenspace to Gaussian Image Space

LO-WEI TAI, Ching-En Ching En, Li et al.

CVPR 2025posterarXiv:2503.22138

#2650

Enhancing Dance-to-Music Generation via Negative Conditioning Latent Diffusion Model

Changchang Sun, Gaowen Liu, Charles Fleming et al.

CVPR 2025posterarXiv:2503.08422

#2651

JiSAM: Alleviate Labeling Burden and Corner Case Problems in Autonomous Driving via Minimal Real-World Data

Runjian Chen, Wenqi Shao, Bo Zhang et al.

#2652

Hand-held Object Reconstruction from RGB Video with Dynamic Interaction

Shijian Jiang, Qi Ye, Rengan Xie et al.

CVPR 2025posterarXiv:2504.19514

#2653

FSBench: A Figure Skating Benchmark for Advancing Artistic Sports Understanding

Rong Gao, Xin Liu, Zhuozhao Hu et al.

#2654

DTOS: Dynamic Time Object Sensing with Large Multimodal Model

Jirui Tian, Jinrong Zhang, Shenglan Liu et al.

#2655

Incomplete Multi-modal Brain Tumor Segmentation via Learnable Sorting State Space Model

Zheyu Zhang, Yayuan Lu, Feipeng Ma et al.

CVPR 2025posterarXiv:2505.22427

#2656

RC-AutoCalib: An End-to-End Radar-Camera Automatic Calibration Network

Van-Tin Luu, Yong-Lin Cai, Vu-Hoang Tran et al.

#2657

A Unified, Resilient, and Explainable Adversarial Patch Detector

Vishesh Kumar, Akshay Agarwal

CVPR 2025posterarXiv:2505.00998

#2658

Deterministic-to-Stochastic Diverse Latent Feature Mapping for Human Motion Synthesis

Hua Yu, Weiming Liu, Gui Xu et al.

CVPR 2025posterarXiv:2503.22163

#2659

T-CIL: Temperature Scaling using Adversarial Perturbation for Calibration in Class-Incremental Learning

Seong-Hyeon Hwang, Minsu Kim, Steven Euijong Whang

#2660

Implicit Correspondence Learning for Image-to-Point Cloud Registration

Xinjun Li, Wenfei Yang, Jiacheng Deng et al.

#2661

Argus: A Compact and Versatile Foundation Model for Vision

Weiming Zhuang, Chen Chen, Zhizhong Li et al.

#2662

POMP: Physics-constrainable Motion Generative Model through Phase Manifolds

Bin Ji, Ye Pan, zhimeng Liu et al.

CVPR 2025posterarXiv:2412.06146

#2663

Homogeneous Dynamics Space for Heterogeneous Humans

Xinpeng Liu, Junxuan Liang, Chenshuo Zhang et al.

#2664

Separation of Powers: On Segregating Knowledge from Observation in LLM-enabled Knowledge-based Visual Question Answering

Zhen Yang, Zhuo Tao, Qi Chen et al.

CVPR 2025posterarXiv:2507.23569

#2665

Gaussian Splatting Feature Fields for (Privacy-Preserving) Visual Localization

Maxime Pietrantoni, Gabriela Csurka, Torsten Sattler

CVPR 2025posterarXiv:2503.02388

#2666

PIDLoc: Cross-View Pose Optimization Network Inspired by PID Controllers

Wooju Lee, Juhye Park, Dasol Hong et al.

#2667

SinGS: Animatable Single-Image Human Gaussian Splats with Kinematic Priors

Yufan Wu, Xuanhong Chen, Wen Li et al.

CVPR 2025posterarXiv:2503.12053

#2668

Ferret: An Efficient Online Continual Learning Framework under Varying Memory Constraints

Yuhao Zhou, Yuxin Tian, Jindi Lv et al.

CVPR 2025posterarXiv:2503.18483

#2669

Explaining Domain Shifts in Language: Concept Erasing for Interpretable Image Classification

Zequn Zeng, Yudi Su, Jianqiao Sun et al.

CVPR 2025posterarXiv:2503.17117

#2670

A New Statistical Model of Star Speckles for Learning to Detect and Characterize Exoplanets in Direct Imaging Observations

Theo Bodrito, Olivier Flasseur, Julien Mairal et al.

CVPR 2025highlightarXiv:2506.10182

#2671

Improving Personalized Search with Regularized Low-Rank Parameter Updates

Fiona Ryan, Josef Sivic, Fabian Caba Heilbron et al.

CVPR 2025posterarXiv:2503.08601

#2672

LiSu: A Dataset and Method for LiDAR Surface Normal Estimation

Dušan Malić, Christian Fruhwirth-Reisinger, Samuel Schulter et al.

CVPR 2025highlightarXiv:2505.09393

#2673

UMotion: Uncertainty-driven Human Motion Estimation from Inertial and Ultra-wideband Units

Huakun Liu, Hiroki Ota, Xin Wei et al.

CVPR 2025posterarXiv:2502.04369

#2674

HSI: A Holistic Style Injector for Arbitrary Style Transfer

Shuhao Zhang, Hui Kang, Yang Liu et al.

CVPR 2025posterarXiv:2406.10197

#2675

Composing Parts for Expressive Object Generation

Harsh Rangwani, Aishwarya Agarwal, Kuldeep Kulkarni et al.

CVPR 2025posterarXiv:2412.05279

#2676

Perturb-and-Revise: Flexible 3D Editing with Generative Trajectories

Susung Hong, Johanna Suvi Karras, Ricardo Martin et al.

CVPR 2025posterarXiv:2507.01721

#2677

Soft Self-labeling and Potts Relaxations for Weakly-supervised Segmentation

Zhongwen Zhang, Yuri Boykov

#2678

The Photographer's Eye: Teaching Multimodal Large Language Models to See, and Critique Like Photographers

Daiqing Qi, Handong Zhao, Jing Shi et al.

#2679

VSNet: Focusing on the Linguistic Characteristics of Sign Language

Yuhao Li, Xinyue Chen, Hongkai Li et al.

CVPR 2025posterarXiv:2503.08382

#2680

Twinner: Shining Light on Digital Twins in a Few Snaps

Jesus Zarzar, Tom Monnier, Roman Shapovalov et al.

#2681

Incorporating Dense Knowledge Alignment into Unified Multimodal Representation Models

Yuhao Cui, Xinxing Zu, Wenhua Zhang et al.

CVPR 2025highlightarXiv:2505.00502

#2682

Towards Scalable Human-aligned Benchmark for Text-guided Image Editing

Suho Ryu, Kihyun Kim, Eugene Baek et al.

CVPR 2025posterarXiv:2503.16916

#2683

Temporal Action Detection Model Compression by Progressive Block Drop

Xiaoyong Chen, Yong Guo, Jiaming Liang et al.

CVPR 2025highlightarXiv:2411.18159

#2684

Type-R: Automatically Retouching Typos for Text-to-Image Generation

Wataru Shimoda, Naoto Inoue, Daichi Haraguchi et al.

CVPR 2025posterarXiv:2503.06517

#2685

Instance-wise Supervision-level Optimization in Active Learning

Shinnosuke Matsuo, Riku Togashi, Ryoma Bise et al.

CVPR 2025posterarXiv:2505.04668

#2686

SGCR: Spherical Gaussians for Efficient 3D Curve Reconstruction

Xinran Yang, Donghao Ji, Yuanqi Li et al.

#2687

Language-Assisted Debiasing and Smoothing for Foundation Model-Based Semi-Supervised Learning

Na Zheng, Xuemeng Song, Xue Dong et al.

#2688

MaDCoW: Marginal Distortion Correction for Wide-Angle Photography with Arbitrary Objects

Kevin Zhang, Jia-Bin Huang, Jose Echevarria et al.

#2689

RCP-Bench: Benchmarking Robustness for Collaborative Perception Under Diverse Corruptions

Shihang Du, Sanqing Qu, Tianhang Wang et al.

CVPR 2025posterarXiv:2504.06801

#2690

MonoPlace3D: Learning 3D-Aware Object Placement for 3D Monocular Detection

Rishubh Parihar, Srinjay Sarkar, Sarthak Vora et al.

#2691

Fitted Neural Lossless Image Compression

Zhe Zhang, Zhenzhong Chen, Shan Liu

#2692

Minimal Interaction Seperated Tuning: A New Paradigm for Visual Adaptation

Ningyuan Tang, Minghao Fu, Jianxin Wu

CVPR 2025highlightarXiv:2503.09491

#2693

DAMM-Diffusion: Learning Divergence-Aware Multi-Modal Diffusion Model for Nanoparticles Distribution Prediction

Junjie Zhou, Shouju Wang, Yuxia Tang et al.

CVPR 2024posterarXiv:2403.14430

#2694

Ranking Distillation for Open-Ended Video Question Answering with Insufficient Labels

Tianming Liang, Chaolei Tan, Beihao Xia et al.

#2695

Graph Neural Network Combining Event Stream and Periodic Aggregation for Low-Latency Event-based Vision

Manon Dampfhoffer, Thomas Mesquida, Damien Joubert et al.

CVPR 2025posterarXiv:2504.06389

#2696

SemiDAViL: Semi-supervised Domain Adaptation with Vision-Language Guidance for Semantic Segmentation

Hritam Basak, Zhaozheng Yin

CVPR 2025posterarXiv:2412.18609

#2697

Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models

Jinhui Yi, Syed Talal Wasim, Yanan Luo et al.

#2698

A Semantic Knowledge Complementarity based Decoupling Framework for Semi-supervised Class-imbalanced Medical Image Segmentation

Zheng Zhang, Guanchun Yin, Bo Zhang et al.

#2699

ONDA-Pose: Occlusion-Aware Neural Domain Adaptation for Self-Supervised 6D Object Pose Estimation

Tao Tan, Qiulei Dong

CVPR 2025posterarXiv:2503.18134

#2700

An Image-like Diffusion Method for Human-Object Interaction Detection

Xiaofei Hui, Haoxuan Qu, Hossein Rahmani et al.

CVPR 2025posterarXiv:2503.06900

#2701

DirectTriGS: Triplane-based Gaussian Splatting Field Representation for 3D Generation

Xiaoliang Ju, Hongsheng Li

CVPR 2025posterarXiv:2504.11786

#2702

DART: Disease-aware Image-Text Alignment and Self-correcting Re-alignment for Trustworthy Radiology Report Generation

Sang-Jun Park, Keun-Soo Heo, Dong-Hee Shin et al.

CVPR 2025posterarXiv:2503.18507

#2703

Can Text-to-Video Generation help Video-Language Alignment?

Luca Zanella, Massimiliano Mancini, Willi Menapace et al.

#2704

Multi-modal Topology-embedded Graph Learning for Spatially Resolved Genes Prediction from Pathology Images with Prior Gene Similarity Information

Hang Shi, Chi Changxi, Peng Wan et al.

CVPR 2025highlightarXiv:2503.04119

#2705

SCSA: A Plug-and-Play Semantic Continuous-Sparse Attention for Arbitrary Semantic Style Transfer

Chunnan Shang, Zhizhong Wang, Hongwei Wang et al.

CVPR 2025posterarXiv:2406.20099

#2706

Odd-One-Out: Anomaly Detection by Comparing with Neighbors

Ankan Kumar Bhunia, Changjian Li, Hakan Bilen

#2707

Named Entity Driven Zero-Shot Image Manipulation

Zhida Feng, Li Chen, Jing Tian et al.

#2708

NTClick: Achieving Precise Interactive Segmentation With Noise-tolerant Clicks

Chenyi Zhang, Ting Liu, Xiaochao Qu et al.

CVPR 2024posterarXiv:2404.07448

#2709

Transferable and Principled Efficiency for Open-Vocabulary Segmentation

Jingxuan Xu, Wuyang Chen, Yao Zhao et al.

#2710

Adapting to Observation Length of Trajectory Prediction via Contrastive Learning

Ruiqi Qiu, JUN GONG, Xinyu Zhang et al.

#2711

PersonaHOI: Effortlessly Improving Face Personalization in Human-Object Interaction Generation

Xinting Hu, Haoran Wang, Jan Lenssen et al.

#2712

Beyond Human Perception: Understanding Multi-Object World from Monocular View

Keyu Guo, Yongle Huang, Shijie Sun et al.

#2713

High-quality Point Cloud Oriented Normal Estimation via Hybrid Angular and Euclidean Distance Encoding

Yuanqi Li, Jingcheng Huang, Hongshen Wang et al.

CVPR 2025posterarXiv:2412.04456

#2714

HeatFormer: A Neural Optimizer for Multiview Human Mesh Recovery

Yuto Matsubara, Ko Nishino

#2715

Active Event-based Stereo Vision

Jianing Li, Yunjian Zhang, Haiqian Han et al.

#2716

EASEMVC:Efficient Dual Selection Mechanism for Deep Multi-View Clustering

Baili Xiao, Zhibin Dong, KE LIANG et al.

CVPR 2025posterarXiv:2503.20011

#2717

Hyperdimensional Uncertainty Quantification for Multimodal Uncertainty Fusion in Autonomous Vehicles Perception

Luke Chen, Junyao Wang, Trier Mortlock et al.

CVPR 2025posterarXiv:2505.04915

#2718

GlyphMastero: A Glyph Encoder for High-Fidelity Scene Text Editing

Tong Wang, Ting Liu, Xiaochao Qu et al.

CVPR 2025posterarXiv:2504.02011

#2719

Random Conditioning with Distillation for Data-Efficient Diffusion Model Compression

Dohyun Kim, Sehwan Park, GeonHee Han et al.

CVPR 2025posterarXiv:2504.05925

#2720

SVLTA: Benchmarking Vision-Language Temporal Alignment via Synthetic Video Situation

Hao Du, Bo Wu, Yan Lu et al.

CVPR 2025posterarXiv:2503.00260

#2721

Seeing A 3D World in A Grain of Sand

Yufan Zhang, Yu Ji, Yu Guo et al.

CVPR 2025posterarXiv:2502.02187

#2722

ShapeShifter: 3D Variations Using Multiscale and Sparse Point-Voxel Diffusion

Nissim Maruani, Wang Yifan, Matthew Fisher et al.

CVPR 2025posterarXiv:2504.10007

#2723

Balancing Two Classifiers via A Simplex ETF Structure for Model Calibration

Jiani Ni, He Zhao, Jintong Gao et al.

CVPR 2025posterarXiv:2505.13091

#2724

Touch2Shape: Touch-Conditioned 3D Diffusion for Shape Exploration and Reconstruction

Yuanbo Wang, Zhaoxuan Zhang, Jiajin Qiu et al.

#2725

Attraction Diminishing and Distributing for Few-Shot Class-Incremental Learning

Li-Jun Zhao, Zhen-Duo Chen, Yongxin Wang et al.

#2726

Pose-Guided Temporal Enhancement for Robust Low-Resolution Hand Reconstruction

Kaixin Fan, Pengfei Ren, Jingyu Wang et al.

#2727

CSC-PA: Cross-image Semantic Correlation via Prototype Attentions for Single-network Semi-supervised Breast Tumor Segmentation

Zhenhui Ding, Guilian Chen, Qin Zhang et al.

#2728

Attribute-Missing Multi-view Graph Clustering

Bowen Zhao, Qianqian Wang, Zhengming Ding et al.

#2729

Meta-Learning Hyperparameters for Parameter Efficient Fine-Tuning

Zichen Tian, Yaoyao Liu, Qianru Sun

#2730

Anchor-Aware Similarity Cohesion in Target Frames Enables Predicting Temporal Moment Boundaries in 2D

Jiawei Tan, Hongxing Wang, Junwu Weng et al.

CVPR 2025posterarXiv:2505.10671

#2731

GA3CE: Unconstrained 3D Gaze Estimation with Gaze-Aware 3D Context Encoding

Yuki Kawana, Shintaro Shiba, Quan Kong et al.

#2732

Link-based Contrastive Learning for One-Shot Unsupervised Domain Adaptation

Yue Zhang, Mingyue Bin, Yuyang Zhang et al.

#2733

Blind Bitstream-corrupted Video Recovery via Metadata-guided Diffusion Model

Shuyun Wang, Hu Zhang, Xin Shen et al.

#2734

Directional Label Diffusion Model for Learning from Noisy Labels

Senyu Hou, Gaoxia Jiang, Jia Zhang et al.

#2735

IM-Portrait: Learning 3D-aware Video Diffusion for Photorealistic Talking Heads from Monocular VideosC

Yuan Li, Ziqian Bai, Feitong Tan et al.

#2736

Flexible Group Count Enables Hassle-Free Structured Pruning

Jiamu Zhang, Shaochen Zhong, Andrew Ye et al.

#2737

Align-A-Video: Deterministic Reward Tuning of Image Diffusion Models for Consistent Video Editing

Shengzhi Wang, Yingkang Zhong, Jiangchuan Mu et al.

#2738

EvOcc: Accurate Semantic Occupancy for Automated Driving Using Evidence Theory

Jonas Kälble, Sascha Wirges, Maxim Tatarchenko et al.

#2739

PIAD: Pose and Illumination agnostic Anomaly Detection

Kaichen Yang, Junjie Cao, Zeyu Bai et al.

CVPR 2025posterarXiv:2504.02828

#2740

Concept Lancet: Image Editing with Compositional Representation Transplant

Jinqi Luo, Tianjiao Ding, Kwan Ho Ryan Chan et al.

CVPR 2025posterarXiv:2502.04074

#2741

3D Prior Is All You Need: Cross-Task Few-shot 2D Gaze Estimation

Yihua Cheng, Hengfei Wang, Zhongqun Zhang et al.

#2742

Dual-Granularity Semantic Guided Sparse Routing Diffusion Model for General Pansharpening

Yinghui Xing, Qu Li Tao, Shizhou Zhang et al.

CVPR 2025highlightarXiv:2503.10000

#2743

MetricGrids: Arbitrary Nonlinear Approximation with Elementary Metric Grids based Implicit Neural Representation

Shu Wang, Yanbo Gao, Shuai Li et al.

CVPR 2025posterarXiv:2412.05984

#2744

Nested Diffusion Models Using Hierarchical Latent Priors

Xiao Zhang, Ruoxi Jiang, Rebecca Willett et al.

CVPR 2025posterarXiv:2504.15397

#2745

MirrorVerse: Pushing Diffusion Models to Realistically Reflect the World

Ankit Dhiman, Manan Shah, R. Venkatesh Babu

#2746

Adapting Dense Matching for Homography Estimation with Grid-based Acceleration

Kaining Zhang, Yuxin Deng, Jiayi Ma et al.

#2747

Multi-modal Contrastive Learning with Negative Sampling Calibration for Phenotypic Drug Discovery

Jiahua Rao, Hanjing Lin, Leyu Chen et al.

CVPR 2025posterarXiv:2503.17095

#2748

FFaceNeRF: Few-shot Face Editing in Neural Radiance Fields

Kwan Yun, Chaelin Kim, Hangyeul Shin et al.

CVPR 2025highlightarXiv:2503.09968

#2749

Style Evolving along Chain-of-Thought for Unknown-Domain Object Detection

Zihao Zhang, Aming Wu, Yahong Han

CVPR 2025highlightarXiv:2504.17828

#2750

VEU-Bench: Towards Comprehensive Understanding of Video Editing

Bozheng Li, Yongliang Wu, YI LU et al.

CVPR 2025posterarXiv:2407.07052

#2751

Latent Space Imaging

Matheus Souza, Yidan Zheng, Kaizhang Kang et al.

#2752

WISNet: Pseudo Label Generation on Unbalanced and Patch Annotated Waste Images

Shifan Zhang, Hongzi Zhu, Yinan He et al.

#2753

AdaptCMVC: Robust Adaption to Incremental Views in Continual Multi-view Clustering

Jing Wang, Songhe Feng, Kristoffer Knutsen Wickstrøm et al.

#2754

DynPose: Largely Improving the Efficiency of Human Pose Estimation by a Simple Dynamic Framework

Yalong Xu, Lin Zhao, Chen Gong et al.

CVPR 2025posterarXiv:2504.14875

#2755

ReSpec: Relevance and Specificity Grounded Online Filtering for Learning on Video-Text Data Streams

Chris Dongjoo Kim, Jihwan Moon, Sangwoo Moon et al.

#2756

Deep Video Inverse Tone Mapping Based on Temporal Clues

Yuyao Ye, Ning Zhang, Yang Zhao et al.

CVPR 2025posterarXiv:2502.20499

#2757

Data Distributional Properties As Inductive Bias for Systematic Generalization

Felipe del Rio, Alain Raymond, Daniel Florea et al.

CVPR 2025highlightarXiv:2412.00133

#2758

ETAP: Event-based Tracking of Any Point

Friedhelm Hamann, Daniel Gehrig, Filbert Febryanto et al.

CVPR 2025posterarXiv:2411.16199

#2759

VIRES: Video Instance Repainting via Sketch and Text Guided Generation

Shuchen Weng, Haojie Zheng, Peixuan Zhang et al.

CVPR 2025posterarXiv:2504.15616

#2760

SocialMOIF: Multi-Order Intention Fusion for Pedestrian Trajectory Prediction

Kai Chen, Xiaodong Zhao, Yujie Huang et al.

#2761

Automatic Spectral Calibration of Hyperspectral Images: Method, Dataset and Benchmark

Zhuoran Du, Shaodi You, Cheng Cheng et al.

#2762

VRetouchEr: Learning Cross-frame Feature Interdependence with Imperfection Flow for Face Retouching in Videos

Wen Xue, Le Jiang, Lianxin Xie et al.

#2763

Video Language Model Pretraining with Spatio-temporal Masking

Yue Wu, Zhaobo Qi, Junshu Sun et al.

#2764

Take the Bull by the Horns: Learning to Segment Hard Samples

Yuan Guo, Jingyu Kong, Yu Wang et al.

CVPR 2025posterarXiv:2506.09473

#2765

Provoking Multi-modal Few-Shot LVLM via Exploration-Exploitation In-Context Learning

Cheng Chen, Yunpeng Zhai, Yifan Zhao et al.

CVPR 2025posterarXiv:2501.04666

#2766

Enhancing Virtual Try-On with Synthetic Pairs and Error-Aware Noise Scheduling

Nannan Li, Kevin Shih, Bryan A. Plummer

CVPR 2025posterarXiv:2409.03745

#2767

ArtiFade: Learning to Generate High-quality Subject from Blemished Images

Shuya Yang, Shaozhe Hao, Yukang Cao et al.

#2768

GroundingFace: Fine-grained Face Understanding via Pixel Grounding Multimodal Large Language Model

Yue Han, Jiangning Zhang, Junwei Zhu et al.

CVPR 2025posterarXiv:2504.08902

#2769

LookingGlass: Generative Anamorphoses via Laplacian Pyramid Warping

Pascal Chang, Sergio Sancho, Jingwei Tang et al.

CVPR 2025posterarXiv:2504.17261

#2770

Symbolic Representation for Any-to-Any Generative Tasks

Jiaqi Chen, Xiaoye Zhu, Yue Wang et al.

#2771

Leveraging Global Stereo Consistency for Category-Level Shape and 6D Pose Estimation from Stereo Images

Junning Qiu, Minglei Lu, Fei Wang et al.

#2772

Self-Supervised Cross-View Correspondence with Predictive Cycle Consistency

Alan Baade, Changan Chen

CVPR 2025posterarXiv:2503.13241

#2773

Sampling Innovation-Based Adaptive Compressive Sensing

Zhifu Tian, Tao Hu, Chaoyang Niu et al.

CVPR 2025posterarXiv:2503.00861

#2774

Zero-Shot Head Swapping in Real-World Scenarios

Sohyun Jeong, Taewoong Kang, Hyojin Jang et al.

CVPR 2025highlightarXiv:2503.06012

#2775

End-to-End HOI Reconstruction Transformer with Graph-based Encoding

Zhenrong Wang, Qi Zheng, Sihan Ma et al.

CVPR 2025posterarXiv:2505.12685

#2776

Mamba-Adaptor: State Space Model Adaptor for Visual Recognition

Fei Xie, Jiahao Nie, Yujin Tang et al.

#2777

GeoDepth: From Point-to-Depth to Plane-to-Depth Modeling for Self-Supervised Monocular Depth Estimation

Haifeng Wu, Shuhang Gu, Lixin Duan et al.

#2778

Vision-Guided Action: Enhancing 3D Human Motion Prediction with Gaze-informed Affordance in 3D Scenes

Ting Yu, Yi Lin, Jun Yu et al.

#2779

Advancing Manga Analysis: Comprehensive Segmentation Annotations for the Manga109 Dataset

Minshan Xie, Jian Lin, Hanyuan Liu et al.

#2780

Customized Condition Controllable Generation for Video Soundtrack

Fan Qi, KunSheng Ma, Changsheng Xu

CVPR 2025highlightarXiv:2503.23094

#2781

FRAME: Floor-aligned Representation for Avatar Motion from Egocentric Video

Andrea Boscolo Camiletto, Jian Wang, Eduardo Alvarado et al.

CVPR 2025posterarXiv:2504.19581

#2782

SAMBLE: Shape-Specific Point Cloud Sampling for an Optimal Trade-Off Between Local Detail and Global Uniformity

Chengzhi Wu, Yuxin Wan, Hao Fu et al.

#2783

GaPT-DAR: Category-level Garments Pose Tracking via Integrated 2D Deformation and 3D Reconstruction

Li Zhang, mingliang xu, Jianan Wang et al.

#2784

Revisiting Fairness in Multitask Learning: A Performance-Driven Approach for Variance Reduction

Xiaohan Qin, Xiaoxing Wang, Junchi Yan

#2785

EnliveningGS: Active Locomotion of 3DGS

Siyuan Shen, Tianjia Shao, Kun Zhou et al.

CVPR 2025highlightarXiv:2504.02199

#2786

ESC: Erasing Space Concept for Knowledge Deletion

Tae-Young Lee, Sundong Park, Minwoo Jeon et al.

CVPR 2025posterarXiv:2503.18637

#2787

Unbiasing through Textual Descriptions: Mitigating Representation Bias in Video Benchmarks

Nina Shvetsova, Arsha Nagrani, Bernt Schiele et al.

#2788

CroCoDL: Cross-device Collaborative Dataset for Localization

Hermann Blum, Alessandro Mercurio, Joshua O'Reilly et al.

CVPR 2025highlightarXiv:2505.21377

#2789

Empowering Vector Graphics with Consistently Arbitrary Viewing and View-dependent Visibility

Yidi Li, Jun Xiao, Zhengda Lu et al.

CVPR 2025posterarXiv:2412.02631

#2790

Sharp-It: A Multi-view to Multi-view Diffusion Model for 3D Synthesis and Manipulation

Yiftach Edelstein, Or Patashnik, Dana Cohen-Bar et al.

CVPR 2025highlightarXiv:2503.21099

#2791

Learning Class Prototypes for Unified Sparse-Supervised 3D Object Detection

Yun Zhu, Le Hui, Hang Yang et al.

#2792

F^3OCUS - Federated Finetuning of Vision-Language Foundation Models with Optimal Client Layer Updating Strategy via Multi-objective Meta-Heuristics

Pramit Saha, Felix Wagner, Divyanshu Mishra et al.

CVPR 2025posterarXiv:2407.02165

#2793

WildAvatar: Learning In-the-wild 3D Avatars from the Web

Zihao Huang, Shoukang Hu, Guangcong Wang et al.

#2794

ToonerGAN: Reinforcing GANs for Obfuscating Automated Facial Indexing

Kartik Thakral, Shashikant Prasad, Stuti Aswani et al.

CVPR 2025posterarXiv:2412.09723

#2795

MAC-Ego3D: Multi-Agent Gaussian Consensus for Real-Time Collaborative Ego-Motion and Photorealistic 3D Reconstruction

Xiaohao Xu, Feng Xue, Shibo Zhao et al.

#2796

Black Hole-Driven Identity Absorbing in Diffusion Models

Muhammad Shaheryar, Jong Taek Lee, Soon Ki Jung

CVPR 2025highlightarXiv:2503.00643

#2797

Deep Change Monitoring: A Hyperbolic Representative Learning Framework and a Dataset for Long-term Fine-grained Tree Change Detection

Yante Li, Hanwen Qi, Haoyu Chen et al.

CVPR 2025posterarXiv:2503.19794

#2798

PAVE: Patching and Adapting Video Large Language Models

Zhuoming Liu, Yiquan Li, Khoi D Nguyen et al.

#2799

Polarized Color Screen Matting

Kenji Enomoto, Scott Cohen, Brian Price et al.

CVPR 2025posterarXiv:2503.05333

#2800

PhysicsGen: Can Generative Models Learn from Images to Predict Complex Physical Relations?

Martin Spitznagel, Jan Vaillant, Janis Keuper