Most Cited CVPR &quot;black-box detection&quot; Papers

CVPR 2025arXiv:2412.11457

#1602

MOVIS: Enhancing Multi-Object Novel View Synthesis for Indoor Scenes

Ruijie Lu, Yixin Chen, Junfeng Ni et al.

CVPR 2025arXiv:2412.00832

#1603

EventGPT: Event Stream Understanding with Multimodal Large Language Models

shaoyu liu, Jianing Li, guanghui zhao et al.

CVPR 2024arXiv:2403.19501

#1604

RELI11D: A Comprehensive Multimodal Human Motion Dataset and Method

Ming Yan, Yan Zhang, Shuqiang Cai et al.

CVPR 2025arXiv:2503.14021

#1605

MP-GUI: Modality Perception with MLLMs for GUI Understanding

Ziwei Wang, Weizhi Chen, Leyang Yang et al.

CVPR 2024highlightarXiv:2401.15261

#1606

Vanishing-Point-Guided Video Semantic Segmentation of Driving Scenes

Diandian Guo, Deng-Ping Fan, Tongyu Lu et al.

CVPR 2025highlightarXiv:2503.18454

#1607

InPO: Inversion Preference Optimization with Reparametrized DDIM for Efficient Diffusion Model Alignment

Yunhong Lu, Qichao Wang, Hengyuan Cao et al.

CVPR 2024arXiv:2308.06699

#1608

Neural Super-Resolution for Real-time Rendering with Radiance Demodulation

Jia Li, Ziling Chen, Xiaolong Wu et al.

CVPR 2025arXiv:2411.17687

#1609

GenDeg: Diffusion-based Degradation Synthesis for Generalizable All-In-One Image Restoration

Sudarshan Rajagopalan, Nithin Gopalakrishnan Nair, Jay Paranjape et al.

CVPR 2025arXiv:2504.06675

#1610

Probability Density Geodesics in Image Diffusion Latent Space

Qingtao Yu, Jaskirat Singh, Zhaoyuan Yang et al.

CVPR 2025arXiv:2503.23283

#1611

Language Guided Concept Bottleneck Models for Interpretable Continual Learning

Lu Yu, HaoYu Han, Zhe Tao et al.

CVPR 2025highlightarXiv:2503.12096

#1612

O-TPT: Orthogonality Constraints for Calibrating Test-time Prompt Tuning in Vision-Language Models

Ashshak Sharifdeen, Muhammad Akhtar Munir, Sanoojan Baliah et al.

CVPR 2025arXiv:2504.04956

#1613

REWIND: Real-Time Egocentric Whole-Body Motion Diffusion with Exemplar-Based Identity Conditioning

Jihyun Lee, Weipeng Xu, Alexander Richard et al.

CVPR 2024highlightarXiv:2312.08128

#1614

Clockwork Diffusion: Efficient Generation With Model-Step Distillation

Amirhossein Habibian, Amir Ghodrati, Noor Fathima et al.

CVPR 2025arXiv:2503.17940

#1615

FisherTune: Fisher-Guided Robust Tuning of Vision Foundation Models for Domain Generalized Segmentation

Dong Zhao, Jinlong Li, Shuang Wang et al.

#1616

Motion Diversification Networks

Hee Jae Kim, Eshed Ohn-Bar

CVPR 2024arXiv:2002.07756

#1617

Hierarchical Correlation Clustering and Tree Preserving Embedding

Morteza Haghir Chehreghani, Mostafa Haghir Chehreghani

CVPR 2025arXiv:2412.03103

#1618

MultiGO: Towards Multi-level Geometry Learning for Monocular 3D Textured Human Reconstruction

Gangjian Zhang, Nanjie Yao, Shunsi Zhang et al.

CVPR 2025arXiv:2409.19702

#1619

RNG: Relightable Neural Gaussians

Jiahui Fan, Fujun Luan, Jian Yang et al.

CVPR 2025arXiv:2503.19009

#1620

Video-ColBERT: Contextualized Late Interaction for Text-to-Video Retrieval

Arun Reddy, Alexander Martin, Eugene Yang et al.

CVPR 2024arXiv:2404.16123

#1621

FairDeDup: Detecting and Mitigating Vision-Language Fairness Disparities in Semantic Dataset Deduplication

Eric Slyman, Stefan Lee, Scott Cohen et al.

CVPR 2025arXiv:2504.12717

#1622

Post-pre-training for Modality Alignment in Vision-Language Foundation Models

Shin'ya Yamaguchi, Dewei Feng, Sekitoshi Kanai et al.

CVPR 2024arXiv:2311.17833

#1623

DiG-IN: Diffusion Guidance for Investigating Networks - Uncovering Classifier Differences Neuron Visualisations and Visual Counterfactual Explanations

Maximilian Augustin, Yannic Neuhaus, Matthias Hein

CVPR 2025arXiv:2501.08326

#1624

Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks

Miran Heo, Min-Hung Chen, De-An Huang et al.

CVPR 2024arXiv:2406.04155

#1625

Improving Physics-Augmented Continuum Neural Radiance Field-Based Geometry-Agnostic System Identification with Lagrangian Particle Optimization

Takuhiro Kaneko

CVPR 2024arXiv:2405.02608

#1626

UnSAMFlow: Unsupervised Optical Flow Guided by Segment Anything Model

Shuai Yuan, Lei Luo, Zhuo Hui et al.

#1627

TurboSL: Dense Accurate and Fast 3D by Neural Inverse Structured Light

Parsa Mirdehghan, Maxx Wu, Wenzheng Chen et al.

CVPR 2025arXiv:2503.22912

#1628

DIFFER: Disentangling Identity Features via Semantic Cues for Clothes-Changing Person Re-ID

Xin Liang, Yogesh S. Rawat

CVPR 2025highlightarXiv:2504.14687

#1629

Seurat: From Moving Points to Depth

Seokju Cho, Gabriel Huang, Seungryong Kim et al.

CVPR 2025highlightarXiv:2412.02317

#1630

HumanRig: Learning Automatic Rigging for Humanoid Character in a Large Scale Dataset

Zedong Chu, Feng Xiong, Meiduo Liu et al.

CVPR 2025highlightarXiv:2412.00175

#1631

Circumventing Shortcuts in Audio-visual Deepfake Detection Datasets with Unsupervised Learning

Stefan Smeu, Dragos-Alexandru Boldisor, Dan Oneata et al.

CVPR 2024arXiv:2403.10988

#1632

Boosting Flow-based Generative Super-Resolution Models via Learned Prior

Li-Yuan Tsao, Yi-Chen Lo, Chia-Che Chang et al.

CVPR 2025arXiv:2412.09511

#1633

GEAL: Generalizable 3D Affordance Learning with Cross-Modal Consistency

Dongyue Lu, Lingdong Kong, Tianxin Huang et al.

CVPR 2025arXiv:2408.14468

#1634

K-Sort Arena: Efficient and Reliable Benchmarking for Generative Models via K-wise Human Preferences

Zhikai Li, Xuewen Liu, Dongrong Joe Fu et al.

CVPR 2025arXiv:2411.17696

#1635

ScribbleLight: Single Image Indoor Relighting with Scribbles

Jun Myeong Choi, Annie N. Wang, Pieter Peers et al.

CVPR 2024arXiv:2405.12509

#1636

Active Object Detection with Knowledge Aggregation and Distillation from Large Models

Dejie Yang, Yang Liu

CVPR 2025arXiv:2412.00927

#1637

VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation

Weiming Ren, Huan Yang, Jie Min et al.

CVPR 2025highlightarXiv:2412.03451

#1638

PlanarSplatting: Accurate Planar Surface Reconstruction in 3 Minutes

Bin Tan, Rui Yu, Yujun Shen et al.

CVPR 2025arXiv:2502.20985

#1639

LesionLocator: Zero-Shot Universal Tumor Segmentation and Tracking in 3D Whole-Body Imaging

Maximilian Rokuss, Yannick Kirchhoff, Seval Akbal et al.

CVPR 2025arXiv:2503.23284

#1640

SketchVideo: Sketch-based Video Generation and Editing

Feng-Lin Liu, Hongbo Fu, Xintao Wang et al.

#1641

Mamba4D: Efficient 4D Point Cloud Video Understanding with Disentangled Spatial-Temporal State Space Models

Jiuming Liu, Jinru Han, Lihao Liu et al.

CVPR 2024arXiv:2404.00330

#1642

Memory-Scalable and Simplified Functional Map Learning

Robin Magnet, Maks Ovsjanikov

CVPR 2025arXiv:2412.15341

#1643

Efficient Fine-Tuning and Concept Suppression for Pruned Diffusion Models

Reza Shirkavand, Peiran Yu, Shangqian Gao et al.

CVPR 2025arXiv:2411.16832

#1644

Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing

Hanhui Wang, Yihua Zhang, Ruizheng Bai et al.

CVPR 2025arXiv:2503.21457

#1645

FaceBench: A Multi-View Multi-Level Facial Attribute VQA Dataset for Benchmarking Face Perception MLLMs

Xiaoqin Wang, Xusen Ma, Xianxu Hou et al.

CVPR 2024arXiv:2404.05661

#1646

Automatic Controllable Colorization via Imagination

Xiaoyan Cong, Yue Wu, Qifeng Chen et al.

CVPR 2025arXiv:2412.16939

#1647

Image Quality Assessment: Investigating Causal Perceptual Effects with Abductive Counterfactual Inference

Wenhao Shen, Mingliang Zhou, Yu Chen et al.

CVPR 2024arXiv:2403.19326

#1648

MedBN: Robust Test-Time Adaptation against Malicious Test Samples

Hyejin Park, Jeongyeon Hwang, Sunung Mun et al.

CVPR 2024arXiv:2403.04245

#1649

A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition

Yusheng Dai, HangChen, Jun Du et al.

CVPR 2025arXiv:2410.13360

#1650

RAP: Retrieval-Augmented Personalization for Multimodal Large Language Models

Haoran Hao, Jiaming Han, Changsheng Li et al.

CVPR 2025arXiv:2501.01633

#1651

ACE: Anti-Editing Concept Erasure in Text-to-Image Models

Zihao Wang, Yuxiang Wei, Fan Li et al.

CVPR 2025arXiv:2411.19415

#1652

AMO Sampler: Enhancing Text Rendering with Overshooting

Xixi Hu, Keyang Xu, Bo Liu et al.

CVPR 2025highlightarXiv:2412.12087

#1653

Instruction-based Image Manipulation by Watching How Things Move

Mingdeng Cao, Xuaner Zhang, Yinqiang Zheng et al.

CVPR 2025arXiv:2311.15965

#1654

FALCON: Fairness Learning via Contrastive Attention Approach to Continual Semantic Scene Understanding

Thanh-Dat Truong, Utsav Prabhu, Bhiksha Raj et al.

CVPR 2025arXiv:2412.06978

#1655

Edge-SD-SR: Low Latency and Parameter Efficient On-device Super-Resolution with Stable Diffusion via Bidirectional Conditioning

Isma Hadji, Mehdi Noroozi, Victor Escorcia et al.

CVPR 2025arXiv:2503.15835

#1656

BARD-GS: Blur-Aware Reconstruction of Dynamic Scenes via Gaussian Splatting

Yiren Lu, Yunlai Zhou, Disheng Liu et al.

CVPR 2025arXiv:2409.13222

#1657

3D-GSW: 3D Gaussian Splatting for Robust Watermarking

Youngdong Jang, Hyunje Park, Feng Yang et al.

CVPR 2025arXiv:2410.11666

#1658

DORNet: A Degradation Oriented and Regularized Network for Blind Depth Super-Resolution

Zhengxue Wang, Zhiqiang Yan, Jinshan Pan et al.

CVPR 2025arXiv:2504.04708

#1659

SapiensID: Foundation for Human Recognition

Minchul Kim, Dingqiang Ye, Yiyang Su et al.

CVPR 2024arXiv:2403.16141

#1660

Entity-NeRF: Detecting and Removing Moving Entities in Urban Scenes

Takashi Otonari, Satoshi Ikehata, Kiyoharu Aizawa

CVPR 2024arXiv:2403.13548

#1661

Diversity-aware Channel Pruning for StyleGAN Compression

Jiwoo Chung, Sangeek Hyun, Sang-Heon Shim et al.

#1662

Focusing on Tracks for Online Multi-Object Tracking

Kyujin Shim, Kangwook Ko, YuJin Yang et al.

CVPR 2025arXiv:2501.06903

#1663

Synthetic Prior for Few-Shot Drivable Head Avatar Inversion

Wojciech Zielonka, Stephan J. Garbin, Alexandros Lattas et al.

CVPR 2025arXiv:2411.09911

#1664

DiffFNO: Diffusion Fourier Neural Operator

Xiaoyi Liu, Hao Tang

CVPR 2025arXiv:2504.08710

#1665

Hypergraph Vision Transformers: Images are More than Nodes, More than Edges

Joshua Fixelle

CVPR 2024arXiv:2405.10286

#1666

FFF: Fixing Flawed Foundations in Contrastive Pre-Training Results in Very Strong Vision-Language Models

Adrian Bulat, Yassine Ouali, Georgios Tzimiropoulos

CVPR 2025arXiv:2504.02555

#1667

Noise Calibration and Spatial-Frequency Interactive Network for STEM Image Enhancement

Hesong Li, Ziqi Wu, Ruiwen Shao et al.

CVPR 2025highlightarXiv:2501.11515

#1668

UltraFusion: Ultra High Dynamic Imaging using Exposure Fusion

Zixuan Chen, Yujin Wang, Xin Cai et al.

CVPR 2025highlightarXiv:2405.02700

#1669

Unveiling Differences in Generative Models: A Scalable Differential Clustering Approach

Jingwei Zhang, Mohammad Jalali, Cheuk Ting Li et al.

CVPR 2024arXiv:2405.19902

#1670

Learning Discriminative Dynamics with Label Corruption for Noisy Label Detection

Suyeon Kim, Dongha Lee, SeongKu Kang et al.

CVPR 2025arXiv:2503.18595

#1671

Adaptive Unimodal Regulation for Balanced Multimodal Information Acquisition

Chengxiang Huang, Yake Wei, Zequn Yang et al.

CVPR 2025arXiv:2504.20378

#1672

Sparse2DGS: Geometry-Prioritized Gaussian Splatting for Surface Reconstruction from Sparse Views

Jiang Wu, Rui Li, Yu Zhu et al.

CVPR 2025arXiv:2503.02491

#1673

Joint Out-of-Distribution Filtering and Data Discovery Active Learning

Sebastian Schmidt, Leonard Schenk, Leo Schwinn et al.

#1674

Generative Zero-Shot Composed Image Retrieval

Lan Wang, Wei Ao, Vishnu Naresh Boddeti et al.

CVPR 2024arXiv:2404.04848

#1675

Task-Aware Encoder Control for Deep Video Compression

Xingtong Ge, Jixiang Luo, XINJIE ZHANG et al.

CVPR 2025highlightarXiv:2411.17662

#1676

RoboPEPP: Vision-Based Robot Pose and Joint Angle Estimation through Embedding Predictive Pre-Training

Raktim Gautam Goswami, Prashanth Krishnamurthy, Yann LeCun et al.

CVPR 2025arXiv:2505.07843

#1677

PosterO: Structuring Layout Trees to Enable Language Models in Generalized Content-Aware Layout Generation

HsiaoYuan Hsu, Yuxin Peng

CVPR 2025arXiv:2503.16942

#1678

Re-HOLD: Video Hand Object Interaction Reenactment via adaptive Layout-instructed Diffusion Model

Yingying Fan, Quanwei Yang, Kaisiyuan Wang et al.

CVPR 2025arXiv:2506.16201

#1679

FlowRAM: Grounding Flow Matching Policy with Region-Aware Mamba Framework for Robotic Manipulation

Sen Wang, Le Wang, Sanping Zhou et al.

CVPR 2025arXiv:2503.13110

#1680

DTGBrepGen: A Novel B-rep Generative Model through Decoupling Topology and Geometry

Jing Li, Yihang Fu, Falai Chen

#1681

Mask^2DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation

Tianhao Qi, Jianlong Yuan, Wanquan Feng et al.

CVPR 2025highlightarXiv:2503.16964

#1682

DroneSplat: 3D Gaussian Splatting for Robust 3D Reconstruction from In-the-Wild Drone Imagery

Jiadong Tang, Yu Gao, Dianyi Yang et al.

CVPR 2024arXiv:2403.14870

#1683

VidLA: Video-Language Alignment at Scale

Mamshad Nayeem Rizve, Fan Fei, Jayakrishnan Unnikrishnan et al.

#1684

ROD-MLLM: Towards More Reliable Object Detection in Multimodal Large Language Models

Heng Yin, Yuqiang Ren, Ke Yan et al.

CVPR 2025arXiv:2504.17695

#1685

PICO: Reconstructing 3D People In Contact with Objects

Alpár Cseke, Shashank Tripathi, Sai Kumar Dwivedi et al.

CVPR 2025arXiv:2412.04533

#1686

Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation

Yongkang Li, Tianheng Cheng, Bin Feng et al.

#1687

Tartan IMU: A Light Foundation Model for Inertial Positioning in Robotics

Shibo Zhao, Sifan Zhou, Raphael Blanchard et al.

CVPR 2025arXiv:2503.02394

#1688

BHViT: Binarized Hybrid Vision Transformer

Tian Gao, Yu Zhang, Zhiyuan Zhang et al.

CVPR 2025arXiv:2501.11309

#1689

Finer-CAM: Spotting the Difference Reveals Finer Details for Visual Explanation

Ziheng Zhang, Jianyang Gu, Arpita Chowdhury et al.

CVPR 2025highlightarXiv:2503.18223

#1690

MammAlps: A Multi-view Video Behavior Monitoring Dataset of Wild Mammals in the Swiss Alps

Valentin Gabeff, Haozhe Qi, Brendan Flaherty et al.

CVPR 2025arXiv:2501.11175

#1691

ProKeR: A Kernel Perspective on Few-Shot Adaptation of Large Vision-Language Models

Yassir Bendou, Amine Ouasfi, Vincent Gripon et al.

#1692

The Change You Want To Detect: Semantic Change Detection In Earth Observation With Hybrid Data Generationf

Yanis Benidir, Nicolas Gonthier, Clement Mallet

CVPR 2024arXiv:2310.10700

#1693

PELA: Learning Parameter-Efficient Models with Low-Rank Approximation

Yangyang Guo, Guangzhi Wang, Mohan Kankanhalli

CVPR 2025arXiv:2503.16394

#1694

Do Visual Imaginations Improve Vision-and-Language Navigation Agents?

Akhil Perincherry, Jacob Krantz, Stefan Lee

CVPR 2025highlightarXiv:2405.20216

#1695

Boost Your Human Image Generation Model via Direct Preference Optimization

Sanghyeon Na, Yonggyu Kim, Hyunjoon Lee

CVPR 2024arXiv:2310.17154

#1696

Deep Imbalanced Regression via Hierarchical Classification Adjustment

Haipeng Xiong, Angela Yao

CVPR 2025arXiv:2503.20826

#1697

Exploring CLIP's Dense Knowledge for Weakly Supervised Semantic Segmentation

Zhiwei Yang, Yucong Meng, Kexue Fu et al.

CVPR 2025arXiv:2411.17386

#1698

vesselFM: A Foundation Model for Universal 3D Blood Vessel Segmentation

Bastian Wittmann, Yannick Wattenberg, Tamaz Amiranashvili et al.

CVPR 2025arXiv:2503.01087

#1699

Rashomon Sets for Prototypical-Part Networks: Editing Interpretable Models in Real-Time

Jon Donnelly, Zhicheng Guo, Alina Jade Barnett et al.

CVPR 2025arXiv:2411.18936

#1700

Self-Cross Diffusion Guidance for Text-to-Image Synthesis of Similar Subjects

Weimin Qiu, Jieke Wang, Meng Tang

CVPR 2025arXiv:2405.16240

#1701

AFL: A Single-Round Analytic Approach for Federated Learning with Pre-trained Models

Run He, Kai Tong, Di Fang et al.

CVPR 2024arXiv:2212.08251

#1702

Task-Adaptive Saliency Guidance for Exemplar-free Class Incremental Learning

Xialei Liu, Jiang-Tian Zhai, Andrew Bagdanov et al.

#1703

Adversarially Robust Few-shot Learning via Parameter Co-distillation of Similarity and Class Concept Learners

Junhao Dong, Piotr Koniusz, Junxi Chen et al.

CVPR 2025arXiv:2501.07256

#1704

EdgeTAM: On-Device Track Anything Model

Chong Zhou, Chenchen Zhu, Yunyang Xiong et al.

#1705

General Point Model Pretraining with Autoencoding and Autoregressive

Zhe Li, Zhangyang Gao, Cheng Tan et al.

CVPR 2024arXiv:2404.00254

#1706

Clustering for Protein Representation Learning

Ruijie Quan, Wenguan Wang, Fan Ma et al.

CVPR 2024arXiv:2404.11139

#1707

GeoReF: Geometric Alignment Across Shape Variation for Category-level Object Pose Refinement

Linfang Zheng, Tze Ho Elden Tse, Chen Wang et al.

CVPR 2025arXiv:2412.09191

#1708

RAD: Region-Aware Diffusion Models for Image Inpainting

Sora Kim, Sungho Suh, Minsik Lee

#1709

Cross-Dimension Affinity Distillation for 3D EM Neuron Segmentation

Xiaoyu Liu, Miaomiao Cai, Yinda Chen et al.

CVPR 2025arXiv:2503.19359

#1710

Show and Segment: Universal Medical Image Segmentation via In-Context Learning

Yunhe Gao, Di Liu, Zhuowei Li et al.

CVPR 2024arXiv:2403.13647

#1711

Meta-Point Learning and Refining for Category-Agnostic Pose Estimation

Junjie Chen, Jiebin Yan, Yuming Fang et al.

#1712

Diffusion-FOF: Single-View Clothed Human Reconstruction via Diffusion-Based Fourier Occupancy Field

Yuanzhen Li, Fei LUO, Chunxia Xiao

CVPR 2024arXiv:2403.19235

#1713

DreamSalon: A Staged Diffusion Framework for Preserving Identity-Context in Editable Face Generation

Haonan Lin

CVPR 2024highlightarXiv:2405.06216

#1714

Event-based Structure-from-Orbit

Ethan Elms, Yasir Latif, Tae Ha Park et al.

CVPR 2025highlightarXiv:2412.04077

#1715

SoMA: Singular Value Decomposed Minor Components Adaptation for Domain Generalizable Representation Learning

Seokju Yun, Seunghye Chae, Dongheon Lee et al.

CVPR 2025arXiv:2506.11036

#1716

Human-centered Interactive Learning via MLLMs for Text-to-Image Person Re-identification

Yang Qin, Chao Chen, Zhihang Fu et al.

CVPR 2025arXiv:2407.08027

#1717

Fish-Vista: A Multi-Purpose Dataset for Understanding & Identification of Traits from Images

Kazi Sajeed Mehrab, M. Maruf, Arka Daw et al.

CVPR 2024highlightarXiv:2302.09585

#1718

StreamingFlow: Streaming Occupancy Forecasting with Asynchronous Multi-modal Data Streams via Neural Ordinary Differential Equation

Yining Shi, Kun JIANG, Ke Wang et al.

CVPR 2025highlightarXiv:2409.16434

#1719

Lessons and Insights from a Unifying Study of Parameter-Efficient Fine-Tuning (PEFT) in Visual Recognition

Zheda Mai, Ping Zhang, Cheng-Hao Tu et al.

CVPR 2024arXiv:2406.08960

#1720

AirPlanes: Accurate Plane Estimation via 3D-Consistent Embeddings

Jamie Watson, Filippo Aleotti, Mohamed Sayed et al.

CVPR 2025arXiv:2505.15185

#1721

MonoSplat: Generalizable 3D Gaussian Splatting from Monocular Depth Foundation Models

Yifan Liu, Keyu Fan, Weihao Yu et al.

#1722

Cross Initialization for Face Personalization of Text-to-Image Models

Lianyu Pang, Jian Yin, Haoran Xie et al.

CVPR 2024arXiv:2403.12202

#1723

DeCoTR: Enhancing Depth Completion with 2D and 3D Attentions

Yunxiao Shi, Manish Singh, Hong Cai et al.

CVPR 2025arXiv:2411.16064

#1724

Multi-Granularity Class Prototype Topology Distillation for Class-Incremental Source-Free Unsupervised Domain Adaptation

Peihua Deng, Jiehua Zhang, Xichun Sheng et al.

CVPR 2025arXiv:2412.00837

#1725

AniMer: Animal Pose and Shape Estimation Using Family Aware Transformer

Jin Lyu, Tianyi Zhu, Yi Gu et al.

CVPR 2024arXiv:2403.07244

#1726

Time-Efficient Light-Field Acquisition Using Coded Aperture and Events

Shuji Habuchi, Keita Takahashi, Chihiro Tsutake et al.

#1727

Multirate Neural Image Compression with Adaptive Lattice Vector Quantization

Hao Xu, Xiaolin Wu, Xi Zhang

CVPR 2025highlight

CVPR 2025arXiv:2504.00387

#1728

Scene4U: Hierarchical Layered 3D Scene Reconstruction from Single Panoramic Image for Your Immerse Exploration

Zilong Huang, Jun He, Junyan Ye et al.

#1729

Relational Matching for Weakly Semi-Supervised Oriented Object Detection

Wenhao Wu, Hau San Wong, Si Wu et al.

CVPR 2024arXiv:2302.04871

#1730

In-N-Out: Faithful 3D GAN Inversion with Volumetric Decomposition for Face Editing

Yiran Xu, Zhixin Shu, Cameron Smith et al.

CVPR 2025arXiv:2411.17030

#1731

g3D-LF: Generalizable 3D-Language Feature Fields for Embodied Tasks

Zihan Wang, Gim Hee Lee

CVPR 2025arXiv:2501.00603

#1732

DiC: Rethinking Conv3x3 Designs in Diffusion Models

Yuchuan Tian, Jing Han, Chengcheng Wang et al.

CVPR 2025arXiv:2503.18314

#1733

LoTUS: Large-Scale Machine Unlearning with a Taste of Uncertainty

Christoforos N. Spartalis, Theodoros Semertzidis, Efstratios Gavves et al.

CVPR 2025highlightarXiv:2502.20625

#1734

T2ICount: Enhancing Cross-modal Understanding for Zero-Shot Counting

Yifei Qian, Zhongliang Guo, Bowen Deng et al.

CVPR 2025arXiv:2412.11509

#1735

Skip Tuning: Pre-trained Vision-Language Models are Effective and Efficient Adapters Themselves

Shihan Wu, Ji Zhang, Pengpeng Zeng et al.

CVPR 2025arXiv:2412.01553

#1736

SfM-Free 3D Gaussian Splatting via Hierarchical Training

Bo Ji, Angela Yao

CVPR 2024arXiv:2404.00842

#1737

An N-Point Linear Solver for Line and Motion Estimation with Event Cameras

Ling Gao, Daniel Gehrig, Hang Su et al.

#1738

AVF-MAE++: Scaling Affective Video Facial Masked Autoencoders via Efficient Audio-Visual Self-Supervised Learning

Xuecheng Wu, Heli Sun, Yifan Wang et al.

CVPR 2025arXiv:2412.13047

#1739

Gaussian Splatting for Efficient Satellite Image Photogrammetry

Luca Savant Aira, Gabriele Facciolo, Thibaud Ehret

CVPR 2025arXiv:2406.01591

#1740

DeNVeR: Deformable Neural Vessel Representations for Unsupervised Video Vessel Segmentation

Chun-Hung Wu, Shih-Hong Chen, Chih Yao Hu et al.

CVPR 2025arXiv:2412.02254

#1741

ProbPose: A Probabilistic Approach to 2D Human Pose Estimation

Miroslav Purkrábek, Jiri Matas

CVPR 2025arXiv:2503.17928

#1742

Debiasing Multimodal Large Language Models via Noise-Aware Preference Optimization

zefeng zhang, Hengzhu Tang, Jiawei Sheng et al.

CVPR 2025arXiv:2504.01503

#1743

Luminance-GS: Adapting 3D Gaussian Splatting to Challenging Lighting Conditions with View-Adaptive Curve Adjustment

Ziteng Cui, Xuangeng Chu, Tatsuya Harada

CVPR 2024arXiv:2404.00301

#1744

Monocular Identity-Conditioned Facial Reflectance Reconstruction

Xingyu Ren, Jiankang Deng, Yuhao Cheng et al.

CVPR 2024arXiv:2404.01591

#1745

Language Model Guided Interpretable Video Action Reasoning

Ning Wang, Guangming Zhu, Hongsheng Li et al.

CVPR 2025arXiv:2502.20256

#1746

The Computer Vision Foundation

Yancheng Cai, Fei Yin, Dounia Hammou et al.

CVPR 2025arXiv:2412.04432

#1747

Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation

Yuying Ge, Yizhuo Li, Yixiao Ge et al.

CVPR 2025arXiv:2405.18840

#1748

Parameter-efficient Fine-tuning in Hyperspherical Space for Open-vocabulary Semantic Segmentation

Zelin Peng, Zhengqin Xu, Zhilin Zeng et al.

CVPR 2024arXiv:2403.08262

#1749

BiTT: Bi-directional Texture Reconstruction of Interacting Two Hands from a Single Image

Minje Kim, Tae-Kyun Kim

CVPR 2024arXiv:2312.05889

#1750

SuperPrimitive: Scene Reconstruction at a Primitive Level

Kirill Mazur, Gwangbin Bae, Andrew J. Davison

CVPR 2025highlightarXiv:2503.07635

#1751

Cross-modal Causal Relation Alignment for Video Question Grounding

weixing chen, Yang Liu, Binglin Chen et al.

CVPR 2025arXiv:2504.06120

#1752

Hyperbolic Category Discovery

Yuanpei Liu, Zhenqi He, Kai Han

CVPR 2024arXiv:2404.02889

#1753

Steganographic Passport: An Owner and User Verifiable Credential for Deep Model IP Protection Without Retraining

Qi Cui, Ruohan Meng, Chaohui Xu et al.

CVPR 2025arXiv:2503.06621

#1754

Dynamic Updates for Language Adaptation in Visual-Language Tracking

Xiaohai Li, Bineng Zhong, Qihua Liang et al.

CVPR 2025arXiv:2503.18513

#1755

LookCloser: Frequency-aware Radiance Field for Tiny-Detail Scene

Xiaoyu Zhang, Weihong Pan, Chong Bao et al.

CVPR 2024arXiv:2406.17219

#1756

Facial Identity Anonymization via Intrinsic and Extrinsic Attention Distraction

Zhenzhong Kuang, Xiaochen Yang, Yingjie Shen et al.

CVPR 2025arXiv:2406.19827

#1757

Towards Stable and Storage-efficient Dataset Distillation: Matching Convexified Trajectory

Wenliang Zhong, Haoyu Tang, Qinghai Zheng et al.

CVPR 2025highlightarXiv:2503.18420

#1758

Panorama Generation From NFoV Image Done Right

Dian Zheng, Cheng Zhang, Xiao-Ming Wu et al.

CVPR 2025arXiv:2502.20249

#1759

Enhancing 3D Gaze Estimation in the Wild using Weak Supervision with Gaze Following Labels

Pierre Vuillecard, Jean-marc Odobez

CVPR 2024arXiv:2312.03420

#1760

Artist-Friendly Relightable and Animatable Neural Heads

Yingyan Xu, Prashanth Chandran, Sebastian Weiss et al.

CVPR 2025arXiv:2412.17630

#1761

Detail-Preserving Latent Diffusion for Stable Shadow Removal

Jiamin Xu, Yuxin Zheng, Zelong Li et al.

CVPR 2024arXiv:2308.15692

#1762

Intriguing Properties of Diffusion Models: An Empirical Study of the Natural Attack Capability in Text-to-Image Generative Models

Takami Sato, Justin Yue, Nanze Chen et al.

CVPR 2024arXiv:2406.01843

#1763

L-MAGIC: Language Model Assisted Generation of Images with Coherence

zhipeng cai, Matthias Mueller, Reiner Birkl et al.

CVPR 2025arXiv:2409.06214

#1764

Towards Generalizable Scene Change Detection

Jae-Woo KIM, Ue-Hwan Kim

CVPR 2025arXiv:2505.06166

#1765

DiffLocks: Generating 3D Hair from a Single Image using Diffusion Models

Radu Alexandru Rosu, Keyu Wu, Yao Feng et al.

CVPR 2025arXiv:2503.01725

#1766

HarmonySet: A Comprehensive Dataset for Understanding Video-Music Semantic Alignment and Temporal Synchronization

Zitang Zhou, Ke Mei, Yu Lu et al.

CVPR 2025arXiv:2412.02071

#1767

Progress-Aware Video Frame Captioning

Zihui Xue, Joungbin An, Xitong Yang et al.

CVPR 2025arXiv:2503.24210

#1768

DiET-GS: Diffusion Prior and Event Stream-Assisted Motion Deblurring 3D Gaussian Splatting

Seungjun Lee, Gim Hee Lee

CVPR 2024arXiv:2305.17368

#1769

Instance-based Max-margin for Practical Few-shot Recognition

Minghao Fu, Ke Zhu

CVPR 2025highlightarXiv:2505.24315

#1770

InteractAnything: Zero-shot Human Object Interaction Synthesis via LLM Feedback and Object Affordance Parsing

Jinlu Zhang, Yixin Chen, Zan Wang et al.

CVPR 2025arXiv:2505.24816

#1771

CL-LoRA: Continual Low-Rank Adaptation for Rehearsal-Free Class-Incremental Learning

Jiangpeng He, Zhihao Duan, Fengqing Zhu

CVPR 2025highlightarXiv:2502.20162

#1772

Gradient-Guided Annealing for Domain Generalization

Aristotelis Ballas, Christos Diou

CVPR 2025highlightarXiv:2503.04919

#1773

FirePlace: Geometric Refinements of LLM Common Sense Reasoning for 3D Object Placement

Ian Huang, Yanan Bao, Karen Truong et al.

CVPR 2024arXiv:2309.04437

#1774

Single View Refractive Index Tomography with Neural Fields

Brandon Zhao, Aviad Levis, Liam Connor et al.

CVPR 2024arXiv:2403.01231

#1775

Benchmarking Segmentation Models with Mask-Preserved Attribute Editing

Zijin Yin, Kongming Liang, Bing Li et al.

CVPR 2024arXiv:2312.04552

#1776

Generating Illustrated Instructions

Sachit Menon, Ishan Misra, Rohit Girdhar

CVPR 2025arXiv:2411.11909

#1777

SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization

Hongrui Jia, Chaoya Jiang, Haiyang Xu et al.

CVPR 2024arXiv:2406.06813

#1778

Stable Neighbor Denoising for Source-free Domain Adaptive Segmentation

Dong Zhao, Shuang Wang, Qi Zang et al.

CVPR 2025arXiv:2403.12922

#1779

Contextual AD Narration with Interleaved Multimodal Sequence

Hanlin Wang, Zhan Tong, Kecheng Zheng et al.

CVPR 2025arXiv:2507.06928

#1780

Adaptive Part Learning for Fine-Grained Generalized Category Discovery: A Plug-and-Play Enhancement

Qiyuan Dai, Hanzhuo Huang, Yu Wu et al.

CVPR 2025arXiv:2411.05738

#1781

StdGEN: Semantic-Decomposed 3D Character Generation from Single Images

Yuze He, Yanning Zhou, Wang Zhao et al.

CVPR 2025arXiv:2503.01291

#1782

SemGeoMo: Dynamic Contextual Human Motion Generation with Semantic and Geometric Guidance

Peishan Cong, Ziyi Wang, Yuexin Ma et al.

CVPR 2025arXiv:2411.18552

#1783

FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion

Haosen Yang, Adrian Bulat, Isma Hadji et al.

CVPR 2024arXiv:2403.00939

#1784

G3DR: Generative 3D Reconstruction in ImageNet

Pradyumna Reddy, Ismail Elezi, Jiankang Deng

CVPR 2024arXiv:2311.11995

#1785

BrainWash: A Poisoning Attack to Forget in Continual Learning

Ali Abbasi, Parsa Nooralinejad, Hamed Pirsiavash et al.

CVPR 2024arXiv:2403.02041

#1786

A Generative Approach for Wikipedia-Scale Visual Entity Recognition

Mathilde Caron, Ahmet Iscen, Alireza Fathi et al.

CVPR 2024arXiv:2404.01123

#1787

CLIPtone: Unsupervised Learning for Text-based Image Tone Adjustment

Hyeongmin Lee, Kyoungkook Kang, Jungseul Ok et al.

CVPR 2025arXiv:2408.17135

#1788

TIMotion: Temporal and Interactive Framework for Efficient Human-Human Motion Generation

Yabiao Wang, Shuo Wang, Jiangning Zhang et al.

CVPR 2025arXiv:2408.16266

#1789

Inversion Circle Interpolation: Diffusion-based Image Augmentation for Data-scarce Classification

Yanghao Wang, Long Chen

CVPR 2025arXiv:2503.06514

#1790

GFlowVLM: Enhancing Multi-step Reasoning in Vision-Language Models with Generative Flow Networks

Haoqiang Kang, Enna Sachdeva, Piyush Gupta et al.

CVPR 2025arXiv:2509.09555

#1791

InterAct: Advancing Large-Scale Versatile 3D Human-Object Interaction Generation

Sirui Xu, Dongting Li, Yucheng Zhang et al.

CVPR 2025arXiv:2503.18434

#1792

A Simple yet Effective Layout Token in Large Language Models for Document Understanding

Zhaoqing Zhu, Chuwei Luo, Zirui Shao et al.

CVPR 2025arXiv:2502.20678

#1793

STPro: Spatial and Temporal Progressive Learning for Weakly Supervised Spatio-Temporal Grounding

Aaryan Garg, Akash Kumar, Yogesh S. Rawat

CVPR 2024arXiv:2402.18862

#1794

Towards Backward-Compatible Continual Learning of Image Compression

Zhihao Duan, Ming Lu, Justin Yang et al.

CVPR 2025arXiv:2412.09680

#1795

PBR-NeRF: Inverse Rendering with Physics-Based Neural Fields

Sean Wu, Shamik Basu, Tim Broedermann et al.

CVPR 2025arXiv:2504.02451

#1796

ConMo: Controllable Motion Disentanglement and Recomposition for Zero-Shot Motion Transfer

Jiayi Gao, Zijin Yin, Changcheng Hua et al.

CVPR 2025arXiv:2504.02764

#1797

Scene Splatter: Momentum 3D Scene Generation from Single Image with Video Diffusion Model

Shengjun Zhang, Jinzhao Li, Xin Fei et al.

#1798

When Visual Grounding Meets Gigapixel-level Large-scale Scenes: Benchmark and Approach

TAO MA, Bing Bai, Haozhe Lin et al.

#1799

Robust Multimodal Survival Prediction with Conditional Latent Differentiation Variational AutoEncoder

Junjie Zhou, Jiao Tang, Yingli Zuo et al.

CVPR 2025arXiv:2505.04270

#1800

Object-Shot Enhanced Grounding Network for Egocentric Video

Yisen Feng, Haoyu Zhang, Meng Liu et al.