Most Cited CVPR &quot;watermark removal attack&quot; Papers

CVPR 2024arXiv:2405.07933

#2802

Authentic Hand Avatar from a Phone Scan via Universal Hand Model

Gyeongsik Moon, Weipeng Xu, Rohan Joshi et al.

CVPR 2024arXiv:2406.04999

#2803

ProMotion: Prototypes As Motion Learners

Yawen Lu, Dongfang Liu, Qifan Wang et al.

CVPR 2025arXiv:2412.07293

#2804

EventSplat: 3D Gaussian Splatting from Moving Event Cameras for Real-time Rendering

Toshiya Yura, Ashkan Mirzaei, Igor Gilitschenski

CVPR 2025arXiv:2411.17385

#2805

DepthCues: Evaluating Monocular Depth Perception in Large Vision Models

Duolikun Danier, Mehmet Aygun, Changjian Li et al.

CVPR 2024arXiv:2311.17833

#2806

DiG-IN: Diffusion Guidance for Investigating Networks - Uncovering Classifier Differences Neuron Visualisations and Visual Counterfactual Explanations

Maximilian Augustin, Yannic Neuhaus, Matthias Hein

CVPR 2025arXiv:2412.12096

#2807

PanSplat: 4K Panorama Synthesis with Feed-Forward Gaussian Splatting

Cheng Zhang, Haofei Xu, Qianyi Wu et al.

CVPR 2025highlightarXiv:2505.24315

#2808

InteractAnything: Zero-shot Human Object Interaction Synthesis via LLM Feedback and Object Affordance Parsing

Jinlu Zhang, Yixin Chen, Zan Wang et al.

CVPR 2025arXiv:2503.17074

#2809

Zero-Shot Styled Text Image Generation, but Make It Autoregressive

Vittorio Pippi, Fabio Quattrini, Silvia Cascianelli et al.

CVPR 2024arXiv:2303.15368

#2810

2S-UDF: A Novel Two-stage UDF Learning Method for Robust Non-watertight Model Reconstruction from Multi-view Images

Junkai Deng, Fei Hou, Xuhui Chen et al.

CVPR 2024arXiv:2303.16990

#2811

What When and Where? Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions

Brian Chen, Nina Shvetsova, Andrew Rouditchenko et al.

CVPR 2024arXiv:2307.04760

#2812

Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos

Sagnik Majumder, Ziad Al-Halah, Kristen Grauman

CVPR 2024highlightarXiv:2405.06216

#2813

Event-based Structure-from-Orbit

Ethan Elms, Yasir Latif, Tae Ha Park et al.

CVPR 2024arXiv:2404.00678

#2814

OmniSDF: Scene Reconstruction using Omnidirectional Signed Distance Functions and Adaptive Binoctrees

Hakyeong Kim, Andreas Meuleman, Hyeonjoong Jang et al.

CVPR 2024arXiv:2404.01225

#2815

SurMo: Surface-based 4D Motion Modeling for Dynamic Human Rendering

Tao Hu, Fangzhou Hong, Ziwei Liu

CVPR 2024arXiv:2405.02608

#2816

UnSAMFlow: Unsupervised Optical Flow Guided by Segment Anything Model

Shuai Yuan, Lei Luo, Zhuo Hui et al.

CVPR 2024arXiv:2404.19696

#2817

Naturally Supervised 3D Visual Grounding with Language-Regularized Concept Learners

Chun Feng, Joy Hsu, Weiyu Liu et al.

CVPR 2025arXiv:2504.06675

#2818

Probability Density Geodesics in Image Diffusion Latent Space

Qingtao Yu, Jaskirat Singh, Zhaoyuan Yang et al.

CVPR 2025arXiv:2411.09911

#2819

DiffFNO: Diffusion Fourier Neural Operator

Xiaoyi Liu, Hao Tang

CVPR 2024arXiv:2405.12978

#2820

Personalized Residuals for Concept-Driven Text-to-Image Generation

Cusuh Ham, Matthew Fisher, James Hays et al.

CVPR 2025arXiv:2408.00754

#2821

Coarse Correspondences Boost Spatial-Temporal Reasoning in Multimodal Language Model

Benlin Liu, Yuhao Dong, Yiqin Wang et al.

CVPR 2024arXiv:2312.03442

#2822

High-Quality Facial Geometry and Appearance Capture at Home

Yuxuan Han, Junfeng Lyu, Feng Xu

CVPR 2024arXiv:2404.13024

#2823

BANF: Band-Limited Neural Fields for Levels of Detail Reconstruction

Ahan Shabanov, Shrisudhan Govindarajan, Cody Reading et al.

CVPR 2024arXiv:2405.12509

#2824

Active Object Detection with Knowledge Aggregation and Distillation from Large Models

Dejie Yang, Yang Liu

CVPR 2024arXiv:2402.17210

#2825

Purified and Unified Steganographic Network

GuoBiao Li, Sheng Li, Zicong Luo et al.

CVPR 2024arXiv:2403.20254

#2826

Benchmarking the Robustness of Temporal Action Detection Models Against Temporal Corruptions

Runhao Zeng, Xiaoyong Chen, Jiaming Liang et al.

CVPR 2024arXiv:2404.16123

#2827

FairDeDup: Detecting and Mitigating Vision-Language Fairness Disparities in Semantic Dataset Deduplication

Eric Slyman, Stefan Lee, Scott Cohen et al.

CVPR 2025arXiv:2411.18042

#2828

HyperGLM: HyperGraph for Video Scene Graph Generation and Anticipation

Trong-Thuan Nguyen, Pha Nguyen, Jackson Cothren et al.

#2829

SAM2Object: Consolidating View Consistency via SAM2 for Zero-Shot 3D Instance Segmentation

Jihuai Zhao, Junbao Zhuo, Jiansheng Chen et al.

CVPR 2024arXiv:2403.07244

#2830

Time-Efficient Light-Field Acquisition Using Coded Aperture and Events

Shuji Habuchi, Keita Takahashi, Chihiro Tsutake et al.

CVPR 2024arXiv:2405.10037

#2831

Bilateral Event Mining and Complementary for Event Stream Super-Resolution

Zhilin Huang, Quanmin Liang, Yijie Yu et al.

CVPR 2025highlightarXiv:2505.00690

#2832

Towards Autonomous Micromobility through Scalable Urban Simulation

Wayne Wu, Honglin He, Chaoyuan Zhang et al.

CVPR 2025arXiv:2503.13214

#2833

A General Adaptive Dual-level Weighting Mechanism for Remote Sensing Pansharpening

Jie Huang, Haorui Chen, Jiaxuan Ren et al.

CVPR 2024arXiv:2403.19235

#2834

DreamSalon: A Staged Diffusion Framework for Preserving Identity-Context in Editable Face Generation

Haonan Lin

CVPR 2025highlightarXiv:2504.14687

#2835

Seurat: From Moving Points to Depth

Seokju Cho, Gabriel Huang, Seungryong Kim et al.

CVPR 2025arXiv:2504.06120

#2836

Hyperbolic Category Discovery

Yuanpei Liu, Zhenqi He, Kai Han

CVPR 2025arXiv:2410.20723

#2837

CompGS: Unleashing 2D Compositionality for Compositional Text-to-3D via Dynamically Optimizing 3D Gaussians

Chongjian GE, Chenfeng Xu, Yuanfeng Ji et al.

CVPR 2025arXiv:2412.03103

#2838

MultiGO: Towards Multi-level Geometry Learning for Monocular 3D Textured Human Reconstruction

Gangjian Zhang, Nanjie Yao, Shunsi Zhang et al.

CVPR 2025arXiv:2412.05725

#2839

Black Swan: Abductive and Defeasible Video Reasoning in Unpredictable Events

Aditya Chinchure, Sahithya Ravi, Raymond Ng et al.

CVPR 2025arXiv:2503.17267

#2840

Physical Plausibility-aware Trajectory Prediction via Locomotion Embodiment

Hiromu Taketsugu, Takeru Oba, Takahiro Maeda et al.

CVPR 2024arXiv:2312.01964

#2841

Semantics-aware Motion Retargeting with Vision-Language Models

Haodong Zhang, ZhiKe Chen, Haocheng Xu et al.

CVPR 2025arXiv:2504.17695

#2842

PICO: Reconstructing 3D People In Contact with Objects

Alpár Cseke, Shashank Tripathi, Sai Kumar Dwivedi et al.

CVPR 2024arXiv:2002.07756

#2843

Hierarchical Correlation Clustering and Tree Preserving Embedding

Morteza Haghir Chehreghani, Mostafa Haghir Chehreghani

CVPR 2024arXiv:2404.00330

#2844

Memory-Scalable and Simplified Functional Map Learning

Robin Magnet, Maks Ovsjanikov

CVPR 2025highlightarXiv:2502.20625

#2845

T2ICount: Enhancing Cross-modal Understanding for Zero-Shot Counting

Yifei Qian, Zhongliang Guo, Bowen Deng et al.

CVPR 2024arXiv:2310.11696

#2846

MOHO: Learning Single-view Hand-held Object Reconstruction with Multi-view Occlusion-Aware Supervision

Chenyangguang Zhang, Guanlong Jiao, Yan Di et al.

CVPR 2025arXiv:2503.13130

#2847

ChainHOI: Joint-based Kinematic Chain Modeling for Human-Object Interaction Generation

Ling-An Zeng, Guohong Huang, Yi-Lin Wei et al.

CVPR 2025arXiv:2503.19357

#2848

Correcting Deviations from Normality: A Reformulated Diffusion Model for Multi-Class Unsupervised Anomaly Detection

Farzad Beizaee, Gregory A. Lodygensky, Christian Desrosiers et al.

CVPR 2025highlightarXiv:2412.15211

#2849

Generative Multiview Relighting for 3D Reconstruction under Extreme Illumination Variation

Hadi Alzayer, Philipp Henzler, Jonathan T. Barron et al.

CVPR 2024arXiv:2311.12796

#2850

Physics-guided Shape-from-Template: Monocular Video Perception through Neural Surrogate Models

David Stotko, Nils Wandel, Reinhard Klein

#2851

Fusing Personal and Environmental Cues for Identification and Segmentation of First-Person Camera Wearers in Third-Person Views

Ziwei Zhao, Yuchen Wang, Chuhua Wang

CVPR 2025arXiv:2501.01633

#2852

ACE: Anti-Editing Concept Erasure in Text-to-Image Models

Zihao Wang, Yuxiang Wei, Fan Li et al.

CVPR 2025arXiv:2502.18290

#2853

Stealthy Backdoor Attack in Self-Supervised Learning Vision Encoders for Large Vision Language Models

Zhaoyi Liu, Huan Zhang

CVPR 2025highlightarXiv:2412.16212

#2854

ManiVideo: Generating Hand-Object Manipulation Video with Dexterous and Generalizable Grasping

Youxin Pang, Ruizhi Shao, Jiajun Zhang et al.

CVPR 2025arXiv:2408.17135

#2855

TIMotion: Temporal and Interactive Framework for Efficient Human-Human Motion Generation

Yabiao Wang, Shuo Wang, Jiangning Zhang et al.

CVPR 2024arXiv:2403.20249

#2856

Relation Rectification in Diffusion Model

Yinwei Wu, Xingyi Yang, Xinchao Wang

CVPR 2025arXiv:2506.11036

#2857

Human-centered Interactive Learning via MLLMs for Text-to-Image Person Re-identification

Yang Qin, Chao Chen, Zhihang Fu et al.

CVPR 2025arXiv:2503.03272

#2858

Towards Effective and Sparse Adversarial Attack on Spiking Neural Networks via Breaking Invisible Surrogate Gradients

Li Lun, Kunyu Feng, Qinglong Ni et al.

CVPR 2024arXiv:2404.01543

#2859

Efficient 3D Implicit Head Avatar with Mesh-anchored Hash Table Blendshapes

Ziqian Bai, Feitong Tan, Sean Fanello et al.

CVPR 2025arXiv:2412.11457

#2860

MOVIS: Enhancing Multi-Object Novel View Synthesis for Indoor Scenes

Ruijie Lu, Yixin Chen, Junfeng Ni et al.

CVPR 2025arXiv:2501.08326

#2861

Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks

Miran Heo, Min-Hung Chen, De-An Huang et al.

CVPR 2025arXiv:2411.16308

#2862

An End-to-End Robust Point Cloud Semantic Segmentation Network with Single-Step Conditional Diffusion Models

Wentao Qu, Jing Wang, Yongshun Gong et al.

CVPR 2025arXiv:2503.18589

#2863

Unified Uncertainty-Aware Diffusion for Multi-Agent Trajectory Modeling

Guillem Capellera, Antonio Rubio, Luis Ferraz et al.

CVPR 2025arXiv:2503.19009

#2864

Video-ColBERT: Contextualized Late Interaction for Text-to-Video Retrieval

Arun Reddy, Alexander Martin, Eugene Yang et al.

CVPR 2024arXiv:2402.18929

#2865

Navigating Beyond Dropout: An Intriguing Solution towards Generalizable Image Super Resolution

Hongjun Wang, Jiyuan Chen, Yinqiang Zheng et al.

CVPR 2025arXiv:2410.11666

#2866

DORNet: A Degradation Oriented and Regularized Network for Blind Depth Super-Resolution

Zhengxue Wang, Zhiqiang Yan, Jinshan Pan et al.

CVPR 2025arXiv:2503.15835

#2867

BARD-GS: Blur-Aware Reconstruction of Dynamic Scenes via Gaussian Splatting

Yiren Lu, Yunlai Zhou, Disheng Liu et al.

CVPR 2024arXiv:2403.17719

#2868

Resolution Limit of Single-Photon LiDAR

Stanley H. Chan, Hashan K Weerasooriya, Weijian Zhang et al.

CVPR 2025arXiv:2503.02187

#2869

h-Edit: Effective and Flexible Diffusion-Based Editing via Doob's h-Transform

Toan Nguyen, Kien Do, Duc Kieu et al.

CVPR 2025highlightarXiv:2503.18223

#2870

MammAlps: A Multi-view Video Behavior Monitoring Dataset of Wild Mammals in the Swiss Alps

Valentin Gabeff, Haozhe Qi, Brendan Flaherty et al.

CVPR 2025arXiv:2501.07574

#2871

UnCommon Objects in 3D

Xingchen Liu, Piyush Tayal, Jianyuan Wang et al.

CVPR 2025arXiv:2412.09511

#2872

GEAL: Generalizable 3D Affordance Learning with Cross-Modal Consistency

Dongyue Lu, Lingdong Kong, Tianxin Huang et al.

CVPR 2024arXiv:2403.04245

#2873

A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition

Yusheng Dai, HangChen, Jun Du et al.

CVPR 2024arXiv:2403.13647

#2874

Meta-Point Learning and Refining for Category-Agnostic Pose Estimation

Junjie Chen, Jiebin Yan, Yuming Fang et al.

CVPR 2025arXiv:2506.01558

#2875

SAM2-LOVE: Segment Anything Model 2 in Language-aided Audio-Visual Scenes

Yuji Wang, Haoran Xu, Yong Liu et al.

#2876

Motion Diversification Networks

Hee Jae Kim, Eshed Ohn-Bar

CVPR 2024arXiv:2402.08876

#2877

DUDF: Differentiable Unsigned Distance Fields with Hyperbolic Scaling

Miguel Fainstein, Viviana Siless, Emmanuel Iarussi

CVPR 2025arXiv:2503.14140

#2878

Marten: Visual Question Answering with Mask Generation for Multi-modal Document Understanding

Zining Wang, Tongkun Guan, Pei Fu et al.

CVPR 2024arXiv:2404.00931

#2879

GOV-NeSF: Generalizable Open-Vocabulary Neural Semantic Fields

Fangyin Wei, Hanlin Chen, Gim Hee Lee

CVPR 2025arXiv:2411.17687

#2880

GenDeg: Diffusion-based Degradation Synthesis for Generalizable All-In-One Image Restoration

Sudarshan Rajagopalan, Nithin Gopalakrishnan Nair, Jay Paranjape et al.

CVPR 2025arXiv:2410.13360

#2881

RAP: Retrieval-Augmented Personalization for Multimodal Large Language Models

Haoran Hao, Jiaming Han, Changsheng Li et al.

CVPR 2024arXiv:2311.17951

#2882

C3Net: Compound Conditioned ControlNet for Multimodal Content Generation

Juntao Zhang, Yuehuai LIU, Yu-Wing Tai et al.

CVPR 2025arXiv:2503.18107

#2883

PanoGS: Gaussian-based Panoptic Segmentation for 3D Open Vocabulary Scene Understanding

Hongjia Zhai, Hai Li, Zhenzhe Li et al.

CVPR 2025arXiv:2412.10061

#2884

Quaffure: Real-Time Quasi-Static Neural Hair Simulation

Tuur Stuyck, Gene Wei-Chin Lin, Egor Larionov et al.

CVPR 2025arXiv:2504.04566

#2885

DyCON: Dynamic Uncertainty-aware Consistency and Contrastive Learning for Semi-supervised Medical Image Segmentation

Maregu Assefa, Muzammal Naseer, IYYAKUTTI IYAPPAN GANAPATHI et al.

CVPR 2025arXiv:2504.20378

#2886

Sparse2DGS: Geometry-Prioritized Gaussian Splatting for Surface Reconstruction from Sparse Views

Jiang Wu, Rui Li, Yu Zhu et al.

CVPR 2024arXiv:2403.12961

#2887

TexTile: A Differentiable Metric for Texture Tileability

Carlos Rodriguez-Pardo, Dan Casas, Elena Garces et al.

CVPR 2025arXiv:2503.02491

#2888

Joint Out-of-Distribution Filtering and Data Discovery Active Learning

Sebastian Schmidt, Leonard Schenk, Leo Schwinn et al.

CVPR 2024arXiv:2404.03296

#2889

AdaBM: On-the-Fly Adaptive Bit Mapping for Image Super-Resolution

Cheeun Hong, Kyoung Mu Lee

#2890

Generative Zero-Shot Composed Image Retrieval

Lan Wang, Wei Ao, Vishnu Naresh Boddeti et al.

CVPR 2024arXiv:2403.14870

#2891

VidLA: Video-Language Alignment at Scale

Mamshad Nayeem Rizve, Fan Fei, Jayakrishnan Unnikrishnan et al.

CVPR 2025arXiv:2505.24816

#2892

CL-LoRA: Continual Low-Rank Adaptation for Rehearsal-Free Class-Incremental Learning

Jiangpeng He, Zhihao Duan, Fengqing Zhu

CVPR 2025arXiv:2412.02254

#2893

ProbPose: A Probabilistic Approach to 2D Human Pose Estimation

Miroslav Purkrábek, Jiri Matas

CVPR 2024highlightarXiv:2308.14740

#2894

Total Selfie: Generating Full-Body Selfies

Bowei Chen, Brian Curless, Ira Kemelmacher-Shlizerman et al.

CVPR 2025arXiv:2411.19415

#2895

AMO Sampler: Enhancing Text Rendering with Overshooting

Xixi Hu, Keyang Xu, Bo Liu et al.

CVPR 2025arXiv:2412.10122

#2896

The Art of Deception: Color Visual Illusions and Diffusion Models

Alexandra Gomez-Villa, Kai Wang, C.Alejandro Parraga et al.

CVPR 2025arXiv:2503.14945

#2897

Generating Multimodal Driving Scenes via Next-Scene Prediction

Yanhao Wu, Haoyang Zhang, Tianwei Lin et al.

CVPR 2025arXiv:2311.15965

#2898

FALCON: Fairness Learning via Contrastive Attention Approach to Continual Semantic Scene Understanding

Thanh-Dat Truong, Utsav Prabhu, Bhiksha Raj et al.

CVPR 2025highlightarXiv:2412.12087

#2899

Instruction-based Image Manipulation by Watching How Things Move

Mingdeng Cao, Xuaner Zhang, Yinqiang Zheng et al.

CVPR 2025arXiv:2403.12922

#2900

Contextual AD Narration with Interleaved Multimodal Sequence

Hanlin Wang, Zhan Tong, Kecheng Zheng et al.

CVPR 2025arXiv:2505.07843

#2901

PosterO: Structuring Layout Trees to Enable Language Models in Generalized Content-Aware Layout Generation

HsiaoYuan Hsu, Yuxin Peng

CVPR 2025arXiv:2506.16201

#2902

FlowRAM: Grounding Flow Matching Policy with Region-Aware Mamba Framework for Robotic Manipulation

Sen Wang, Le Wang, Sanping Zhou et al.

CVPR 2025arXiv:2503.23538

#2903

Enhancing Creative Generation on Stable Diffusion-based Models

Jiyeon Han, Dahee Kwon, Gayoung Lee et al.

CVPR 2025arXiv:2504.01503

#2904

Luminance-GS: Adapting 3D Gaussian Splatting to Challenging Lighting Conditions with View-Adaptive Curve Adjustment

Ziteng Cui, Xuangeng Chu, Tatsuya Harada

CVPR 2025arXiv:2412.15341

#2905

Efficient Fine-Tuning and Concept Suppression for Pruned Diffusion Models

Reza Shirkavand, Peiran Yu, Shangqian Gao et al.

CVPR 2025arXiv:2503.06514

#2906

GFlowVLM: Enhancing Multi-step Reasoning in Vision-Language Models with Generative Flow Networks

Haoqiang Kang, Enna Sachdeva, Piyush Gupta et al.

CVPR 2024highlightarXiv:2403.14366

#2907

SurroundSDF: Implicit 3D Scene Understanding Based on Signed Distance Field

Lizhe Liu, Bohua Wang, Hongwei Xie et al.

CVPR 2024arXiv:2404.04848

#2908

Task-Aware Encoder Control for Deep Video Compression

Xingtong Ge, Jixiang Luo, XINJIE ZHANG et al.

CVPR 2024highlightarXiv:2312.17161

#2909

Restoration by Generation with Constrained Priors

Zheng Ding, Xuaner Zhang, Zhuowen Tu et al.

CVPR 2025arXiv:2503.06873

#2910

Interactive Medical Image Analysis with Concept-based Similarity Reasoning

Ta Duc Huy, Sen Kim Tran, Phan Nguyen et al.

CVPR 2024arXiv:2312.07937

#2911

BOTH2Hands: Inferring 3D Hands from Both Text Prompts and Body Dynamics

Wenqian Zhang, Molin Huang, Yuxuan Zhou et al.

CVPR 2024arXiv:2212.08251

#2912

Task-Adaptive Saliency Guidance for Exemplar-free Class Incremental Learning

Xialei Liu, Jiang-Tian Zhai, Andrew Bagdanov et al.

CVPR 2025arXiv:2501.00603

#2913

DiC: Rethinking Conv3x3 Designs in Diffusion Models

Yuchuan Tian, Jing Han, Chengcheng Wang et al.

CVPR 2024highlightarXiv:2405.12759

#2914

Cross-spectral Gated-RGB Stereo Depth Estimation

Samuel Brucker, Stefanie Walz, Mario Bijelic et al.

CVPR 2025arXiv:2503.02394

#2915

BHViT: Binarized Hybrid Vision Transformer

Tian Gao, Yu Zhang, Zhiyuan Zhang et al.

CVPR 2024arXiv:2404.05621

#2916

MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning

Matteo Farina, Massimiliano Mancini, Elia Cunegatti et al.

CVPR 2025arXiv:2405.16240

#2917

AFL: A Single-Round Analytic Approach for Federated Learning with Pre-trained Models

Run He, Kai Tong, Di Fang et al.

CVPR 2025arXiv:2409.13222

#2918

3D-GSW: 3D Gaussian Splatting for Robust Watermarking

Youngdong Jang, Hyunje Park, Feng Yang et al.

#2919

Mask^2DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation

Tianhao Qi, Jianlong Yuan, Wanquan Feng et al.

CVPR 2025arXiv:2501.11175

#2920

ProKeR: A Kernel Perspective on Few-Shot Adaptation of Large Vision-Language Models

Yassir Bendou, Amine Ouasfi, Vincent Gripon et al.

CVPR 2024arXiv:2403.09480

#2921

What Sketch Explainability Really Means for Downstream Tasks?

Hmrishav Bandyopadhyay, Pinaki Nath Chowdhury, Ayan Kumar Bhunia et al.

#2922

The Change You Want To Detect: Semantic Change Detection In Earth Observation With Hybrid Data Generationf

Yanis Benidir, Nicolas Gonthier, Clement Mallet

#2923

Diffusion-FOF: Single-View Clothed Human Reconstruction via Diffusion-Based Fourier Occupancy Field

Yuanzhen Li, Fei LUO, Chunxia Xiao

CVPR 2025arXiv:2411.15540

#2924

Optical-Flow Guided Prompt Optimization for Coherent Video Generation

Hyelin Nam, Jaemin Kim, Dohun Lee et al.

CVPR 2025arXiv:2406.09390

#2925

LLAVIDAL: A Large LAnguage VIsion Model for Daily Activities of Living

Dominick Reilly, Rajatsubhra Chakraborty, Arkaprava Sinha et al.

CVPR 2024arXiv:2403.01773

#2926

Improving Out-of-Distribution Generalization in Graphs via Hierarchical Semantic Environments

Yinhua Piao, Sangseon Lee, Yijingxiu Lu et al.

CVPR 2025arXiv:2505.05473

#2927

DiffusionSfM: Predicting Structure and Motion via Ray Origin and Endpoint Diffusion

Qitao Zhao, Amy Lin, Jeff Tan et al.

CVPR 2025arXiv:2503.22201

#2928

Multi-modal Knowledge Distillation-based Human Trajectory Forecasting

Jaewoo Jeong, Seohee Lee, Daehee Park et al.

CVPR 2024arXiv:2306.16772

#2929

Learning from Synthetic Human Group Activities

Che-Jui Chang, Danrui Li, Deep Patel et al.

#2930

Tartan IMU: A Light Foundation Model for Inertial Positioning in Robotics

Shibo Zhao, Sifan Zhou, Raphael Blanchard et al.

CVPR 2025highlightarXiv:2412.01027

#2931

Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation

Bolin Lai, Felix Juefei-Xu, Miao Liu et al.

CVPR 2025arXiv:2409.06214

#2932

Towards Generalizable Scene Change Detection

Jae-Woo KIM, Ue-Hwan Kim

CVPR 2025highlightarXiv:2405.20216

#2933

Boost Your Human Image Generation Model via Direct Preference Optimization

Sanghyeon Na, Yonggyu Kim, Hyunjoon Lee

CVPR 2025arXiv:2505.00045

#2934

Noise Modeling in One Hour: Minimizing Preparation Efforts for Self-supervised Low-Light RAW Image Denoising

Feiran Li, Haiyang Jiang, Daisuke Iso

CVPR 2025arXiv:2406.19827

#2935

Towards Stable and Storage-efficient Dataset Distillation: Matching Convexified Trajectory

Wenliang Zhong, Haoyu Tang, Qinghai Zheng et al.

CVPR 2025arXiv:2504.02451

#2936

ConMo: Controllable Motion Disentanglement and Recomposition for Zero-Shot Motion Transfer

Jiayi Gao, Zijin Yin, Changcheng Hua et al.

CVPR 2024arXiv:2403.19326

#2937

MedBN: Robust Test-Time Adaptation against Malicious Test Samples

Hyejin Park, Jeongyeon Hwang, Sunung Mun et al.

CVPR 2025arXiv:2503.20826

#2938

Exploring CLIP's Dense Knowledge for Weakly Supervised Semantic Segmentation

Zhiwei Yang, Yucong Meng, Kexue Fu et al.

CVPR 2024arXiv:2404.07985

#2939

WaveMo: Learning Wavefront Modulations to See Through Scattering

Mingyang Xie, Haiyun Guo, Brandon Y. Feng et al.

CVPR 2024arXiv:2405.19902

#2940

Learning Discriminative Dynamics with Label Corruption for Noisy Label Detection

Suyeon Kim, Dongha Lee, SeongKu Kang et al.

CVPR 2025arXiv:2503.01323

#2941

CacheQuant: Comprehensively Accelerated Diffusion Models

Xuewen Liu, Zhikai Li, Qingyi Gu

CVPR 2025arXiv:2411.18936

#2942

Self-Cross Diffusion Guidance for Text-to-Image Synthesis of Similar Subjects

Weimin Qiu, Jieke Wang, Meng Tang

CVPR 2024arXiv:2404.00842

#2943

An N-Point Linear Solver for Line and Motion Estimation with Event Cameras

Ling Gao, Daniel Gehrig, Hang Su et al.

CVPR 2024arXiv:2311.11995

#2944

BrainWash: A Poisoning Attack to Forget in Continual Learning

Ali Abbasi, Parsa Nooralinejad, Hamed Pirsiavash et al.

#2945

Cross-Dimension Affinity Distillation for 3D EM Neuron Segmentation

Xiaoyu Liu, Miaomiao Cai, Yinda Chen et al.

CVPR 2024arXiv:2403.17360

#2946

Activity-Biometrics: Person Identification from Daily Activities

Shehreen Azad, Yogesh S. Rawat

CVPR 2024arXiv:2404.10966

#2947

Domain-Specific Block Selection and Paired-View Pseudo-Labeling for Online Test-Time Adaptation

Yeonguk Yu, Sungho Shin, Seunghyeok Back et al.

CVPR 2025arXiv:2412.06978

#2948

Edge-SD-SR: Low Latency and Parameter Efficient On-device Super-Resolution with Stable Diffusion via Bidirectional Conditioning

Isma Hadji, Mehdi Noroozi, Victor Escorcia et al.

CVPR 2025arXiv:2410.14379

#2949

AnomalyNCD: Towards Novel Anomaly Class Discovery in Industrial Scenarios

Ziming Huang, Xurui Li, Haotian Liu et al.

CVPR 2025arXiv:2412.21206

#2950

PERSE: Personalized 3D Generative Avatars from A Single Portrait

Hyunsoo Cha, Inhee Lee, Hanbyul Joo

CVPR 2024arXiv:2212.09298

#2951

From a Bird's Eye View to See: Joint Camera and Subject Registration without the Camera Calibration

Zekun Qian, Ruize Han, Wei Feng et al.

CVPR 2024arXiv:2312.08591

#2952

Joint2Human: High-Quality 3D Human Generation via Compact Spherical Embedding of 3D Joints

Muxin Zhang, Qiao Feng, Zhuo Su et al.

CVPR 2024arXiv:2404.05661

#2953

Automatic Controllable Colorization via Imagination

Xiaoyan Cong, Yue Wu, Qifeng Chen et al.

CVPR 2024arXiv:2404.10880

#2954

HumMUSS: Human Motion Understanding using State Space Models

Arnab Mondal, Stefano Alletto, Denis Tome

CVPR 2024arXiv:2405.10286

#2955

FFF: Fixing Flawed Foundations in Contrastive Pre-Training Results in Very Strong Vision-Language Models

Adrian Bulat, Yassine Ouali, Georgios Tzimiropoulos

CVPR 2025arXiv:2412.04146

#2956

AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models

Xinghui Li, Qichao Sun, Pengze Zhang et al.

#2957

Multirate Neural Image Compression with Adaptive Lattice Vector Quantization

Hao Xu, Xiaolin Wu, Xi Zhang

CVPR 2025highlight

CVPR 2025arXiv:2406.01591

#2958

DeNVeR: Deformable Neural Vessel Representations for Unsupervised Video Vessel Segmentation

Chun-Hung Wu, Shih-Hong Chen, Chih Yao Hu et al.

CVPR 2024arXiv:2403.13548

#2959

Diversity-aware Channel Pruning for StyleGAN Compression

Jiwoo Chung, Sangeek Hyun, Sang-Heon Shim et al.

CVPR 2024arXiv:2403.15330

#2960

Selectively Informative Description can Reduce Undesired Embedding Entanglements in Text-to-Image Personalization

Jimyeong Kim, Jungwon Park, Wonjong Rhee

CVPR 2025arXiv:2503.00881

#2961

Evolving High-Quality Rendering and Reconstruction in a Unified Framework with Contribution-Adaptive Regularization

You Shen, Zhipeng Zhang, Xinyang Li et al.

CVPR 2025arXiv:2503.21457

#2962

FaceBench: A Multi-View Multi-Level Facial Attribute VQA Dataset for Benchmarking Face Perception MLLMs

Xiaoqin Wang, Xusen Ma, Xianxu Hou et al.

CVPR 2025arXiv:2503.19916

#2963

EventFly: Event Camera Perception from Ground to the Sky

Lingdong Kong, Dongyue Lu, Xiang Xu et al.

CVPR 2024highlightarXiv:2312.06716

#2964

Deciphering ‘What’ and ‘Where’ Visual Pathways from Spectral Clustering of Layer-Distributed Neural Representations

Xiao Zhang, David Yunis, Michael Maire

CVPR 2025arXiv:2411.16832

#2965

Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing

Hanhui Wang, Yihua Zhang, Ruizheng Bai et al.

CVPR 2024arXiv:2404.00913

#2966

LLaMA-Excitor: General Instruction Tuning via Indirect Feature Interaction

Bo Zou, Chao Yang, Yu Qiao et al.

CVPR 2025highlightarXiv:2503.20519

#2967

MAR-3D: Progressive Masked Auto-regressor for High-Resolution 3D Generation

Jinnan Chen, Lingting Zhu, Zeyu HU et al.

CVPR 2024arXiv:2405.00340

#2968

NC-SDF: Enhancing Indoor Scene Reconstruction Using Neural SDFs with View-Dependent Normal Compensation

Ziyi Chen, Xiaolong Wu, Yu Zhang

CVPR 2025arXiv:2503.04953

#2969

Spectral Informed Mamba for Robust Point Cloud Processing

Ali Bahri, Moslem Yazdanpanah, Mehrdad Noori et al.

#2970

Focusing on Tracks for Online Multi-Object Tracking

Kyujin Shim, Kangwook Ko, YuJin Yang et al.

CVPR 2025arXiv:2502.05741

#2971

Linear Attention Modeling for Learned Image Compression

Donghui Feng, Zhengxue Cheng, Shen Wang et al.

CVPR 2025arXiv:2405.18839

#2972

MEGA: Masked Generative Autoencoder for Human Mesh Recovery

Guénolé Fiche, Simon Leglaive, Xavier Alameda-Pineda et al.

CVPR 2025arXiv:2411.16969

#2973

ZoomLDM: Latent Diffusion Model for Multi-scale Image Generation

Srikar Yellapragada, Alexandros Graikos, Kostas Triaridis et al.

CVPR 2025highlightarXiv:2411.10825

#2974

ARM: Appearance Reconstruction Model for Relightable 3D Generation

Xiang Feng, Chang Yu, Zoubin Bi et al.

CVPR 2024arXiv:2404.01636

#2975

Learning to Control Camera Exposure via Reinforcement Learning

Kyunghyun Lee, Ukcheol Shin, Byeong-Uk Lee

CVPR 2025arXiv:2505.23694

#2976

DA-VPT: Semantic-Guided Visual Prompt Tuning for Vision Transformers

Li Ren, Chen Chen, Liqiang Wang et al.

#2977

Relational Matching for Weakly Semi-Supervised Oriented Object Detection

Wenhao Wu, Hau San Wong, Si Wu et al.

CVPR 2024arXiv:2302.04871

#2978

In-N-Out: Faithful 3D GAN Inversion with Volumetric Decomposition for Face Editing

Yiran Xu, Zhixin Shu, Cameron Smith et al.

CVPR 2025arXiv:2503.07819

#2979

POp-GS: Next Best View in 3D-Gaussian Splatting with P-Optimality

Joey Wilson, Marcelino M. de Almeida, Sachit Mahajan et al.

CVPR 2024arXiv:2402.18372

#2980

FedUV: Uniformity and Variance for Heterogeneous Federated Learning

Ha Min Son, Moon-Hyun Kim, Tai-Myoung Chung et al.

CVPR 2025arXiv:2504.00387

#2981

Scene4U: Hierarchical Layered 3D Scene Reconstruction from Single Panoramic Image for Your Immerse Exploration

Zilong Huang, Jun He, Junyan Ye et al.

CVPR 2025arXiv:2503.13110

#2982

DTGBrepGen: A Novel B-rep Generative Model through Decoupling Topology and Geometry

Jing Li, Yihang Fu, Falai Chen

#2983

General Point Model Pretraining with Autoencoding and Autoregressive

Zhe Li, Zhangyang Gao, Cheng Tan et al.

CVPR 2025arXiv:2504.00665

#2984

Monocular and Generalizable Gaussian Talking Head Animation

Shengjie Gong, Haojie Li, Jiapeng Tang et al.

CVPR 2025arXiv:2503.18746

#2985

Linguistics-aware Masked Image Modeling for Self-supervised Scene Text Recognition

Yifei Zhang, Chang Liu, Jin Wei et al.

CVPR 2024arXiv:2403.16897

#2986

Make-It-Vivid: Dressing Your Animatable Biped Cartoon Characters from Text

Junshu Tang, Yanhong Zeng, Ke Fan et al.

CVPR 2025arXiv:2411.14723

#2987

Effective SAM Combination for Open-Vocabulary Semantic Segmentation

Minhyeok Lee, Suhwan Cho, Jungho Lee et al.

CVPR 2025arXiv:2503.04639

#2988

Enhancing SAM with Efficient Prompting and Preference Optimization for Semi-supervised Medical Image Segmentation

Aishik Konwer, Zhijian Yang, Erhan Bas et al.

CVPR 2024arXiv:2310.17154

#2989

Deep Imbalanced Regression via Hierarchical Classification Adjustment

Haipeng Xiong, Angela Yao

CVPR 2025arXiv:2503.16707

#2990

Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding

Jinlong Li, Cristiano Saltori, Fabio Poiesi et al.

CVPR 2025arXiv:2503.15931

#2991

DnLUT: Ultra-Efficient Color Image Denoising via Channel-Aware Lookup Tables

Sidi Yang, Binxiao Huang, Yulun Zhang et al.

CVPR 2024arXiv:2403.11674

#2992

Towards Generalizing to Unseen Domains with Few Labels

Chamuditha Jayanga Galappaththige, Sanoojan Baliah, Malitha Gunawardhana et al.

CVPR 2025arXiv:2501.06184

#2993

PEACE: Empowering Geologic Map Holistic Understanding with MLLMs

Yangyu Huang, Tianyi Gao, Haoran Xu et al.

CVPR 2025highlightarXiv:2503.07635

#2994

Cross-modal Causal Relation Alignment for Video Question Grounding

weixing chen, Yang Liu, Binglin Chen et al.

CVPR 2024arXiv:2404.19294

#2995

Masked Spatial Propagation Network for Sparsity-Adaptive Depth Refinement

Jinyoung Jun, Jae-Han Lee, Chang-Su Kim

CVPR 2025arXiv:2505.15185

#2996

MonoSplat: Generalizable 3D Gaussian Splatting from Monocular Depth Foundation Models

Yifan Liu, Keyu Fan, Weihao Yu et al.

CVPR 2024highlightarXiv:2312.00057

#2997

VA3: Virtually Assured Amplification Attack on Probabilistic Copyright Protection for Text-to-Image Generative Models

Xiang Li, Qianli Shen, Kenji Kawaguchi

#2998

Adversarially Robust Few-shot Learning via Parameter Co-distillation of Similarity and Class Concept Learners

Junhao Dong, Piotr Koniusz, Junxi Chen et al.

CVPR 2025arXiv:2506.10966

#2999

GENMANIP: LLM-driven Simulation for Generalizable Instruction-Following Manipulation

Ning Gao, Yilun Chen, Shuai Yang et al.

CVPR 2024highlightarXiv:2302.09585

#3000

StreamingFlow: Streaming Occupancy Forecasting with Asynchronous Multi-modal Data Streams via Neural Ordinary Differential Equation

Yining Shi, Kun JIANG, Ke Wang et al.