Most Cited CVPR "black-box detection" Papers

5,589 papers found • Page 9 of 28

#1601

Towards Transformer-Based Aligned Generation with Self-Coherence Guidance

Shulei Wang, Wang Lin, Hai Huang et al.

CVPR 2025arXiv:2503.17675
9
citations
#1602

MOVIS: Enhancing Multi-Object Novel View Synthesis for Indoor Scenes

Ruijie Lu, Yixin Chen, Junfeng Ni et al.

CVPR 2025arXiv:2412.11457
9
citations
#1603

EventGPT: Event Stream Understanding with Multimodal Large Language Models

shaoyu liu, Jianing Li, guanghui zhao et al.

CVPR 2025arXiv:2412.00832
9
citations
#1604

RELI11D: A Comprehensive Multimodal Human Motion Dataset and Method

Ming Yan, Yan Zhang, Shuqiang Cai et al.

CVPR 2024arXiv:2403.19501
9
citations
#1605

MP-GUI: Modality Perception with MLLMs for GUI Understanding

Ziwei Wang, Weizhi Chen, Leyang Yang et al.

CVPR 2025arXiv:2503.14021
9
citations
#1606

Vanishing-Point-Guided Video Semantic Segmentation of Driving Scenes

Diandian Guo, Deng-Ping Fan, Tongyu Lu et al.

CVPR 2024highlightarXiv:2401.15261
9
citations
#1607

InPO: Inversion Preference Optimization with Reparametrized DDIM for Efficient Diffusion Model Alignment

Yunhong Lu, Qichao Wang, Hengyuan Cao et al.

CVPR 2025highlightarXiv:2503.18454
9
citations
#1608

Neural Super-Resolution for Real-time Rendering with Radiance Demodulation

Jia Li, Ziling Chen, Xiaolong Wu et al.

CVPR 2024arXiv:2308.06699
9
citations
#1609

GenDeg: Diffusion-based Degradation Synthesis for Generalizable All-In-One Image Restoration

Sudarshan Rajagopalan, Nithin Gopalakrishnan Nair, Jay Paranjape et al.

CVPR 2025arXiv:2411.17687
9
citations
#1610

Probability Density Geodesics in Image Diffusion Latent Space

Qingtao Yu, Jaskirat Singh, Zhaoyuan Yang et al.

CVPR 2025arXiv:2504.06675
9
citations
#1611

Language Guided Concept Bottleneck Models for Interpretable Continual Learning

Lu Yu, HaoYu Han, Zhe Tao et al.

CVPR 2025arXiv:2503.23283
9
citations
#1612

O-TPT: Orthogonality Constraints for Calibrating Test-time Prompt Tuning in Vision-Language Models

Ashshak Sharifdeen, Muhammad Akhtar Munir, Sanoojan Baliah et al.

CVPR 2025highlightarXiv:2503.12096
9
citations
#1613

REWIND: Real-Time Egocentric Whole-Body Motion Diffusion with Exemplar-Based Identity Conditioning

Jihyun Lee, Weipeng Xu, Alexander Richard et al.

CVPR 2025arXiv:2504.04956
9
citations
#1614

Clockwork Diffusion: Efficient Generation With Model-Step Distillation

Amirhossein Habibian, Amir Ghodrati, Noor Fathima et al.

CVPR 2024highlightarXiv:2312.08128
9
citations
#1615

FisherTune: Fisher-Guided Robust Tuning of Vision Foundation Models for Domain Generalized Segmentation

Dong Zhao, Jinlong Li, Shuang Wang et al.

CVPR 2025arXiv:2503.17940
9
citations
#1616

Motion Diversification Networks

Hee Jae Kim, Eshed Ohn-Bar

CVPR 2024
9
citations
#1617

Hierarchical Correlation Clustering and Tree Preserving Embedding

Morteza Haghir Chehreghani, Mostafa Haghir Chehreghani

CVPR 2024arXiv:2002.07756
9
citations
#1618

MultiGO: Towards Multi-level Geometry Learning for Monocular 3D Textured Human Reconstruction

Gangjian Zhang, Nanjie Yao, Shunsi Zhang et al.

CVPR 2025arXiv:2412.03103
9
citations
#1619

RNG: Relightable Neural Gaussians

Jiahui Fan, Fujun Luan, Jian Yang et al.

CVPR 2025arXiv:2409.19702
9
citations
#1620

Video-ColBERT: Contextualized Late Interaction for Text-to-Video Retrieval

Arun Reddy, Alexander Martin, Eugene Yang et al.

CVPR 2025arXiv:2503.19009
9
citations
#1621

FairDeDup: Detecting and Mitigating Vision-Language Fairness Disparities in Semantic Dataset Deduplication

Eric Slyman, Stefan Lee, Scott Cohen et al.

CVPR 2024arXiv:2404.16123
9
citations
#1622

Post-pre-training for Modality Alignment in Vision-Language Foundation Models

Shin'ya Yamaguchi, Dewei Feng, Sekitoshi Kanai et al.

CVPR 2025arXiv:2504.12717
9
citations
#1623

DiG-IN: Diffusion Guidance for Investigating Networks - Uncovering Classifier Differences Neuron Visualisations and Visual Counterfactual Explanations

Maximilian Augustin, Yannic Neuhaus, Matthias Hein

CVPR 2024arXiv:2311.17833
9
citations
#1624

Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks

Miran Heo, Min-Hung Chen, De-An Huang et al.

CVPR 2025arXiv:2501.08326
9
citations
#1625

Improving Physics-Augmented Continuum Neural Radiance Field-Based Geometry-Agnostic System Identification with Lagrangian Particle Optimization

Takuhiro Kaneko

CVPR 2024arXiv:2406.04155
9
citations
#1626

UnSAMFlow: Unsupervised Optical Flow Guided by Segment Anything Model

Shuai Yuan, Lei Luo, Zhuo Hui et al.

CVPR 2024arXiv:2405.02608
9
citations
#1627

TurboSL: Dense Accurate and Fast 3D by Neural Inverse Structured Light

Parsa Mirdehghan, Maxx Wu, Wenzheng Chen et al.

CVPR 2024
9
citations
#1628

DIFFER: Disentangling Identity Features via Semantic Cues for Clothes-Changing Person Re-ID

Xin Liang, Yogesh S. Rawat

CVPR 2025arXiv:2503.22912
9
citations
#1629

Seurat: From Moving Points to Depth

Seokju Cho, Gabriel Huang, Seungryong Kim et al.

CVPR 2025highlightarXiv:2504.14687
9
citations
#1630

HumanRig: Learning Automatic Rigging for Humanoid Character in a Large Scale Dataset

Zedong Chu, Feng Xiong, Meiduo Liu et al.

CVPR 2025highlightarXiv:2412.02317
9
citations
#1631

Circumventing Shortcuts in Audio-visual Deepfake Detection Datasets with Unsupervised Learning

Stefan Smeu, Dragos-Alexandru Boldisor, Dan Oneata et al.

CVPR 2025highlightarXiv:2412.00175
9
citations
#1632

Boosting Flow-based Generative Super-Resolution Models via Learned Prior

Li-Yuan Tsao, Yi-Chen Lo, Chia-Che Chang et al.

CVPR 2024arXiv:2403.10988
9
citations
#1633

GEAL: Generalizable 3D Affordance Learning with Cross-Modal Consistency

Dongyue Lu, Lingdong Kong, Tianxin Huang et al.

CVPR 2025arXiv:2412.09511
9
citations
#1634

K-Sort Arena: Efficient and Reliable Benchmarking for Generative Models via K-wise Human Preferences

Zhikai Li, Xuewen Liu, Dongrong Joe Fu et al.

CVPR 2025arXiv:2408.14468
9
citations
#1635

ScribbleLight: Single Image Indoor Relighting with Scribbles

Jun Myeong Choi, Annie N. Wang, Pieter Peers et al.

CVPR 2025arXiv:2411.17696
9
citations
#1636

Active Object Detection with Knowledge Aggregation and Distillation from Large Models

Dejie Yang, Yang Liu

CVPR 2024arXiv:2405.12509
9
citations
#1637

VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation

Weiming Ren, Huan Yang, Jie Min et al.

CVPR 2025arXiv:2412.00927
9
citations
#1638

PlanarSplatting: Accurate Planar Surface Reconstruction in 3 Minutes

Bin Tan, Rui Yu, Yujun Shen et al.

CVPR 2025highlightarXiv:2412.03451
9
citations
#1639

LesionLocator: Zero-Shot Universal Tumor Segmentation and Tracking in 3D Whole-Body Imaging

Maximilian Rokuss, Yannick Kirchhoff, Seval Akbal et al.

CVPR 2025arXiv:2502.20985
9
citations
#1640

SketchVideo: Sketch-based Video Generation and Editing

Feng-Lin Liu, Hongbo Fu, Xintao Wang et al.

CVPR 2025arXiv:2503.23284
9
citations
#1641

Mamba4D: Efficient 4D Point Cloud Video Understanding with Disentangled Spatial-Temporal State Space Models

Jiuming Liu, Jinru Han, Lihao Liu et al.

CVPR 2025
9
citations
#1642

Memory-Scalable and Simplified Functional Map Learning

Robin Magnet, Maks Ovsjanikov

CVPR 2024arXiv:2404.00330
9
citations
#1643

Efficient Fine-Tuning and Concept Suppression for Pruned Diffusion Models

Reza Shirkavand, Peiran Yu, Shangqian Gao et al.

CVPR 2025arXiv:2412.15341
8
citations
#1644

Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing

Hanhui Wang, Yihua Zhang, Ruizheng Bai et al.

CVPR 2025arXiv:2411.16832
8
citations
#1645

FaceBench: A Multi-View Multi-Level Facial Attribute VQA Dataset for Benchmarking Face Perception MLLMs

Xiaoqin Wang, Xusen Ma, Xianxu Hou et al.

CVPR 2025arXiv:2503.21457
8
citations
#1646

Automatic Controllable Colorization via Imagination

Xiaoyan Cong, Yue Wu, Qifeng Chen et al.

CVPR 2024arXiv:2404.05661
8
citations
#1647

Image Quality Assessment: Investigating Causal Perceptual Effects with Abductive Counterfactual Inference

Wenhao Shen, Mingliang Zhou, Yu Chen et al.

CVPR 2025arXiv:2412.16939
8
citations
#1648

MedBN: Robust Test-Time Adaptation against Malicious Test Samples

Hyejin Park, Jeongyeon Hwang, Sunung Mun et al.

CVPR 2024arXiv:2403.19326
8
citations
#1649

A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition

Yusheng Dai, HangChen, Jun Du et al.

CVPR 2024arXiv:2403.04245
8
citations
#1650

RAP: Retrieval-Augmented Personalization for Multimodal Large Language Models

Haoran Hao, Jiaming Han, Changsheng Li et al.

CVPR 2025arXiv:2410.13360
8
citations
#1651

ACE: Anti-Editing Concept Erasure in Text-to-Image Models

Zihao Wang, Yuxiang Wei, Fan Li et al.

CVPR 2025arXiv:2501.01633
8
citations
#1652

AMO Sampler: Enhancing Text Rendering with Overshooting

Xixi Hu, Keyang Xu, Bo Liu et al.

CVPR 2025arXiv:2411.19415
8
citations
#1653

Instruction-based Image Manipulation by Watching How Things Move

Mingdeng Cao, Xuaner Zhang, Yinqiang Zheng et al.

CVPR 2025highlightarXiv:2412.12087
8
citations
#1654

FALCON: Fairness Learning via Contrastive Attention Approach to Continual Semantic Scene Understanding

Thanh-Dat Truong, Utsav Prabhu, Bhiksha Raj et al.

CVPR 2025arXiv:2311.15965
8
citations
#1655

Edge-SD-SR: Low Latency and Parameter Efficient On-device Super-Resolution with Stable Diffusion via Bidirectional Conditioning

Isma Hadji, Mehdi Noroozi, Victor Escorcia et al.

CVPR 2025arXiv:2412.06978
8
citations
#1656

BARD-GS: Blur-Aware Reconstruction of Dynamic Scenes via Gaussian Splatting

Yiren Lu, Yunlai Zhou, Disheng Liu et al.

CVPR 2025arXiv:2503.15835
8
citations
#1657

3D-GSW: 3D Gaussian Splatting for Robust Watermarking

Youngdong Jang, Hyunje Park, Feng Yang et al.

CVPR 2025arXiv:2409.13222
8
citations
#1658

DORNet: A Degradation Oriented and Regularized Network for Blind Depth Super-Resolution

Zhengxue Wang, Zhiqiang Yan, Jinshan Pan et al.

CVPR 2025arXiv:2410.11666
8
citations
#1659

SapiensID: Foundation for Human Recognition

Minchul Kim, Dingqiang Ye, Yiyang Su et al.

CVPR 2025arXiv:2504.04708
8
citations
#1660

Entity-NeRF: Detecting and Removing Moving Entities in Urban Scenes

Takashi Otonari, Satoshi Ikehata, Kiyoharu Aizawa

CVPR 2024arXiv:2403.16141
8
citations
#1661

Diversity-aware Channel Pruning for StyleGAN Compression

Jiwoo Chung, Sangeek Hyun, Sang-Heon Shim et al.

CVPR 2024arXiv:2403.13548
8
citations
#1662

Focusing on Tracks for Online Multi-Object Tracking

Kyujin Shim, Kangwook Ko, YuJin Yang et al.

CVPR 2025
8
citations
#1663

Synthetic Prior for Few-Shot Drivable Head Avatar Inversion

Wojciech Zielonka, Stephan J. Garbin, Alexandros Lattas et al.

CVPR 2025arXiv:2501.06903
8
citations
#1664

DiffFNO: Diffusion Fourier Neural Operator

Xiaoyi Liu, Hao Tang

CVPR 2025arXiv:2411.09911
8
citations
#1665

Hypergraph Vision Transformers: Images are More than Nodes, More than Edges

Joshua Fixelle

CVPR 2025arXiv:2504.08710
8
citations
#1666

FFF: Fixing Flawed Foundations in Contrastive Pre-Training Results in Very Strong Vision-Language Models

Adrian Bulat, Yassine Ouali, Georgios Tzimiropoulos

CVPR 2024arXiv:2405.10286
8
citations
#1667

Noise Calibration and Spatial-Frequency Interactive Network for STEM Image Enhancement

Hesong Li, Ziqi Wu, Ruiwen Shao et al.

CVPR 2025arXiv:2504.02555
8
citations
#1668

UltraFusion: Ultra High Dynamic Imaging using Exposure Fusion

Zixuan Chen, Yujin Wang, Xin Cai et al.

CVPR 2025highlightarXiv:2501.11515
8
citations
#1669

Unveiling Differences in Generative Models: A Scalable Differential Clustering Approach

Jingwei Zhang, Mohammad Jalali, Cheuk Ting Li et al.

CVPR 2025highlightarXiv:2405.02700
8
citations
#1670

Learning Discriminative Dynamics with Label Corruption for Noisy Label Detection

Suyeon Kim, Dongha Lee, SeongKu Kang et al.

CVPR 2024arXiv:2405.19902
8
citations
#1671

Adaptive Unimodal Regulation for Balanced Multimodal Information Acquisition

Chengxiang Huang, Yake Wei, Zequn Yang et al.

CVPR 2025arXiv:2503.18595
8
citations
#1672

Sparse2DGS: Geometry-Prioritized Gaussian Splatting for Surface Reconstruction from Sparse Views

Jiang Wu, Rui Li, Yu Zhu et al.

CVPR 2025arXiv:2504.20378
8
citations
#1673

Joint Out-of-Distribution Filtering and Data Discovery Active Learning

Sebastian Schmidt, Leonard Schenk, Leo Schwinn et al.

CVPR 2025arXiv:2503.02491
8
citations
#1674

Generative Zero-Shot Composed Image Retrieval

Lan Wang, Wei Ao, Vishnu Naresh Boddeti et al.

CVPR 2025
8
citations
#1675

Task-Aware Encoder Control for Deep Video Compression

Xingtong Ge, Jixiang Luo, XINJIE ZHANG et al.

CVPR 2024arXiv:2404.04848
8
citations
#1676

RoboPEPP: Vision-Based Robot Pose and Joint Angle Estimation through Embedding Predictive Pre-Training

Raktim Gautam Goswami, Prashanth Krishnamurthy, Yann LeCun et al.

CVPR 2025highlightarXiv:2411.17662
8
citations
#1677

PosterO: Structuring Layout Trees to Enable Language Models in Generalized Content-Aware Layout Generation

HsiaoYuan Hsu, Yuxin Peng

CVPR 2025arXiv:2505.07843
8
citations
#1678

Re-HOLD: Video Hand Object Interaction Reenactment via adaptive Layout-instructed Diffusion Model

Yingying Fan, Quanwei Yang, Kaisiyuan Wang et al.

CVPR 2025arXiv:2503.16942
8
citations
#1679

FlowRAM: Grounding Flow Matching Policy with Region-Aware Mamba Framework for Robotic Manipulation

Sen Wang, Le Wang, Sanping Zhou et al.

CVPR 2025arXiv:2506.16201
8
citations
#1680

DTGBrepGen: A Novel B-rep Generative Model through Decoupling Topology and Geometry

Jing Li, Yihang Fu, Falai Chen

CVPR 2025arXiv:2503.13110
8
citations
#1681

Mask^2DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation

Tianhao Qi, Jianlong Yuan, Wanquan Feng et al.

CVPR 2025
8
citations
#1682

DroneSplat: 3D Gaussian Splatting for Robust 3D Reconstruction from In-the-Wild Drone Imagery

Jiadong Tang, Yu Gao, Dianyi Yang et al.

CVPR 2025highlightarXiv:2503.16964
8
citations
#1683

VidLA: Video-Language Alignment at Scale

Mamshad Nayeem Rizve, Fan Fei, Jayakrishnan Unnikrishnan et al.

CVPR 2024arXiv:2403.14870
8
citations
#1684

ROD-MLLM: Towards More Reliable Object Detection in Multimodal Large Language Models

Heng Yin, Yuqiang Ren, Ke Yan et al.

CVPR 2025
8
citations
#1685

PICO: Reconstructing 3D People In Contact with Objects

Alpár Cseke, Shashank Tripathi, Sai Kumar Dwivedi et al.

CVPR 2025arXiv:2504.17695
8
citations
#1686

Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation

Yongkang Li, Tianheng Cheng, Bin Feng et al.

CVPR 2025arXiv:2412.04533
8
citations
#1687

Tartan IMU: A Light Foundation Model for Inertial Positioning in Robotics

Shibo Zhao, Sifan Zhou, Raphael Blanchard et al.

CVPR 2025
8
citations
#1688

BHViT: Binarized Hybrid Vision Transformer

Tian Gao, Yu Zhang, Zhiyuan Zhang et al.

CVPR 2025arXiv:2503.02394
8
citations
#1689

Finer-CAM: Spotting the Difference Reveals Finer Details for Visual Explanation

Ziheng Zhang, Jianyang Gu, Arpita Chowdhury et al.

CVPR 2025arXiv:2501.11309
8
citations
#1690

MammAlps: A Multi-view Video Behavior Monitoring Dataset of Wild Mammals in the Swiss Alps

Valentin Gabeff, Haozhe Qi, Brendan Flaherty et al.

CVPR 2025highlightarXiv:2503.18223
8
citations
#1691

ProKeR: A Kernel Perspective on Few-Shot Adaptation of Large Vision-Language Models

Yassir Bendou, Amine Ouasfi, Vincent Gripon et al.

CVPR 2025arXiv:2501.11175
8
citations
#1692

The Change You Want To Detect: Semantic Change Detection In Earth Observation With Hybrid Data Generationf

Yanis Benidir, Nicolas Gonthier, Clement Mallet

CVPR 2025
8
citations
#1693

PELA: Learning Parameter-Efficient Models with Low-Rank Approximation

Yangyang Guo, Guangzhi Wang, Mohan Kankanhalli

CVPR 2024arXiv:2310.10700
8
citations
#1694

Do Visual Imaginations Improve Vision-and-Language Navigation Agents?

Akhil Perincherry, Jacob Krantz, Stefan Lee

CVPR 2025arXiv:2503.16394
8
citations
#1695

Boost Your Human Image Generation Model via Direct Preference Optimization

Sanghyeon Na, Yonggyu Kim, Hyunjoon Lee

CVPR 2025highlightarXiv:2405.20216
8
citations
#1696

Deep Imbalanced Regression via Hierarchical Classification Adjustment

Haipeng Xiong, Angela Yao

CVPR 2024arXiv:2310.17154
8
citations
#1697

Exploring CLIP's Dense Knowledge for Weakly Supervised Semantic Segmentation

Zhiwei Yang, Yucong Meng, Kexue Fu et al.

CVPR 2025arXiv:2503.20826
8
citations
#1698

vesselFM: A Foundation Model for Universal 3D Blood Vessel Segmentation

Bastian Wittmann, Yannick Wattenberg, Tamaz Amiranashvili et al.

CVPR 2025arXiv:2411.17386
8
citations
#1699

Rashomon Sets for Prototypical-Part Networks: Editing Interpretable Models in Real-Time

Jon Donnelly, Zhicheng Guo, Alina Jade Barnett et al.

CVPR 2025arXiv:2503.01087
8
citations
#1700

Self-Cross Diffusion Guidance for Text-to-Image Synthesis of Similar Subjects

Weimin Qiu, Jieke Wang, Meng Tang

CVPR 2025arXiv:2411.18936
8
citations
#1701

AFL: A Single-Round Analytic Approach for Federated Learning with Pre-trained Models

Run He, Kai Tong, Di Fang et al.

CVPR 2025arXiv:2405.16240
8
citations
#1702

Task-Adaptive Saliency Guidance for Exemplar-free Class Incremental Learning

Xialei Liu, Jiang-Tian Zhai, Andrew Bagdanov et al.

CVPR 2024arXiv:2212.08251
8
citations
#1703

Adversarially Robust Few-shot Learning via Parameter Co-distillation of Similarity and Class Concept Learners

Junhao Dong, Piotr Koniusz, Junxi Chen et al.

CVPR 2024
8
citations
#1704

EdgeTAM: On-Device Track Anything Model

Chong Zhou, Chenchen Zhu, Yunyang Xiong et al.

CVPR 2025arXiv:2501.07256
8
citations
#1705

General Point Model Pretraining with Autoencoding and Autoregressive

Zhe Li, Zhangyang Gao, Cheng Tan et al.

CVPR 2024
8
citations
#1706

Clustering for Protein Representation Learning

Ruijie Quan, Wenguan Wang, Fan Ma et al.

CVPR 2024arXiv:2404.00254
8
citations
#1707

GeoReF: Geometric Alignment Across Shape Variation for Category-level Object Pose Refinement

Linfang Zheng, Tze Ho Elden Tse, Chen Wang et al.

CVPR 2024arXiv:2404.11139
8
citations
#1708

RAD: Region-Aware Diffusion Models for Image Inpainting

Sora Kim, Sungho Suh, Minsik Lee

CVPR 2025arXiv:2412.09191
8
citations
#1709

Cross-Dimension Affinity Distillation for 3D EM Neuron Segmentation

Xiaoyu Liu, Miaomiao Cai, Yinda Chen et al.

CVPR 2024
8
citations
#1710

Show and Segment: Universal Medical Image Segmentation via In-Context Learning

Yunhe Gao, Di Liu, Zhuowei Li et al.

CVPR 2025arXiv:2503.19359
8
citations
#1711

Meta-Point Learning and Refining for Category-Agnostic Pose Estimation

Junjie Chen, Jiebin Yan, Yuming Fang et al.

CVPR 2024arXiv:2403.13647
8
citations
#1712

Diffusion-FOF: Single-View Clothed Human Reconstruction via Diffusion-Based Fourier Occupancy Field

Yuanzhen Li, Fei LUO, Chunxia Xiao

CVPR 2024
8
citations
#1713

DreamSalon: A Staged Diffusion Framework for Preserving Identity-Context in Editable Face Generation

Haonan Lin

CVPR 2024arXiv:2403.19235
8
citations
#1714

Event-based Structure-from-Orbit

Ethan Elms, Yasir Latif, Tae Ha Park et al.

CVPR 2024highlightarXiv:2405.06216
8
citations
#1715

SoMA: Singular Value Decomposed Minor Components Adaptation for Domain Generalizable Representation Learning

Seokju Yun, Seunghye Chae, Dongheon Lee et al.

CVPR 2025highlightarXiv:2412.04077
8
citations
#1716

Human-centered Interactive Learning via MLLMs for Text-to-Image Person Re-identification

Yang Qin, Chao Chen, Zhihang Fu et al.

CVPR 2025arXiv:2506.11036
8
citations
#1717

Fish-Vista: A Multi-Purpose Dataset for Understanding & Identification of Traits from Images

Kazi Sajeed Mehrab, M. Maruf, Arka Daw et al.

CVPR 2025arXiv:2407.08027
8
citations
#1718

StreamingFlow: Streaming Occupancy Forecasting with Asynchronous Multi-modal Data Streams via Neural Ordinary Differential Equation

Yining Shi, Kun JIANG, Ke Wang et al.

CVPR 2024highlightarXiv:2302.09585
8
citations
#1719

Lessons and Insights from a Unifying Study of Parameter-Efficient Fine-Tuning (PEFT) in Visual Recognition

Zheda Mai, Ping Zhang, Cheng-Hao Tu et al.

CVPR 2025highlightarXiv:2409.16434
8
citations
#1720

AirPlanes: Accurate Plane Estimation via 3D-Consistent Embeddings

Jamie Watson, Filippo Aleotti, Mohamed Sayed et al.

CVPR 2024arXiv:2406.08960
8
citations
#1721

MonoSplat: Generalizable 3D Gaussian Splatting from Monocular Depth Foundation Models

Yifan Liu, Keyu Fan, Weihao Yu et al.

CVPR 2025arXiv:2505.15185
8
citations
#1722

Cross Initialization for Face Personalization of Text-to-Image Models

Lianyu Pang, Jian Yin, Haoran Xie et al.

CVPR 2024
8
citations
#1723

DeCoTR: Enhancing Depth Completion with 2D and 3D Attentions

Yunxiao Shi, Manish Singh, Hong Cai et al.

CVPR 2024arXiv:2403.12202
8
citations
#1724

Multi-Granularity Class Prototype Topology Distillation for Class-Incremental Source-Free Unsupervised Domain Adaptation

Peihua Deng, Jiehua Zhang, Xichun Sheng et al.

CVPR 2025arXiv:2411.16064
8
citations
#1725

AniMer: Animal Pose and Shape Estimation Using Family Aware Transformer

Jin Lyu, Tianyi Zhu, Yi Gu et al.

CVPR 2025arXiv:2412.00837
8
citations
#1726

Time-Efficient Light-Field Acquisition Using Coded Aperture and Events

Shuji Habuchi, Keita Takahashi, Chihiro Tsutake et al.

CVPR 2024arXiv:2403.07244
8
citations
#1727

Multirate Neural Image Compression with Adaptive Lattice Vector Quantization

Hao Xu, Xiaolin Wu, Xi Zhang

CVPR 2025highlight
8
citations
#1728

Scene4U: Hierarchical Layered 3D Scene Reconstruction from Single Panoramic Image for Your Immerse Exploration

Zilong Huang, Jun He, Junyan Ye et al.

CVPR 2025arXiv:2504.00387
8
citations
#1729

Relational Matching for Weakly Semi-Supervised Oriented Object Detection

Wenhao Wu, Hau San Wong, Si Wu et al.

CVPR 2024
8
citations
#1730

In-N-Out: Faithful 3D GAN Inversion with Volumetric Decomposition for Face Editing

Yiran Xu, Zhixin Shu, Cameron Smith et al.

CVPR 2024arXiv:2302.04871
8
citations
#1731

g3D-LF: Generalizable 3D-Language Feature Fields for Embodied Tasks

Zihan Wang, Gim Hee Lee

CVPR 2025arXiv:2411.17030
8
citations
#1732

DiC: Rethinking Conv3x3 Designs in Diffusion Models

Yuchuan Tian, Jing Han, Chengcheng Wang et al.

CVPR 2025arXiv:2501.00603
8
citations
#1733

LoTUS: Large-Scale Machine Unlearning with a Taste of Uncertainty

Christoforos N. Spartalis, Theodoros Semertzidis, Efstratios Gavves et al.

CVPR 2025arXiv:2503.18314
8
citations
#1734

T2ICount: Enhancing Cross-modal Understanding for Zero-Shot Counting

Yifei Qian, Zhongliang Guo, Bowen Deng et al.

CVPR 2025highlightarXiv:2502.20625
8
citations
#1735

Skip Tuning: Pre-trained Vision-Language Models are Effective and Efficient Adapters Themselves

Shihan Wu, Ji Zhang, Pengpeng Zeng et al.

CVPR 2025arXiv:2412.11509
8
citations
#1736

SfM-Free 3D Gaussian Splatting via Hierarchical Training

Bo Ji, Angela Yao

CVPR 2025arXiv:2412.01553
8
citations
#1737

An N-Point Linear Solver for Line and Motion Estimation with Event Cameras

Ling Gao, Daniel Gehrig, Hang Su et al.

CVPR 2024arXiv:2404.00842
8
citations
#1738

AVF-MAE++: Scaling Affective Video Facial Masked Autoencoders via Efficient Audio-Visual Self-Supervised Learning

Xuecheng Wu, Heli Sun, Yifan Wang et al.

CVPR 2025
7
citations
#1739

Gaussian Splatting for Efficient Satellite Image Photogrammetry

Luca Savant Aira, Gabriele Facciolo, Thibaud Ehret

CVPR 2025arXiv:2412.13047
7
citations
#1740

DeNVeR: Deformable Neural Vessel Representations for Unsupervised Video Vessel Segmentation

Chun-Hung Wu, Shih-Hong Chen, Chih Yao Hu et al.

CVPR 2025arXiv:2406.01591
7
citations
#1741

ProbPose: A Probabilistic Approach to 2D Human Pose Estimation

Miroslav Purkrábek, Jiri Matas

CVPR 2025arXiv:2412.02254
7
citations
#1742

Debiasing Multimodal Large Language Models via Noise-Aware Preference Optimization

zefeng zhang, Hengzhu Tang, Jiawei Sheng et al.

CVPR 2025arXiv:2503.17928
7
citations
#1743

Luminance-GS: Adapting 3D Gaussian Splatting to Challenging Lighting Conditions with View-Adaptive Curve Adjustment

Ziteng Cui, Xuangeng Chu, Tatsuya Harada

CVPR 2025arXiv:2504.01503
7
citations
#1744

Monocular Identity-Conditioned Facial Reflectance Reconstruction

Xingyu Ren, Jiankang Deng, Yuhao Cheng et al.

CVPR 2024arXiv:2404.00301
7
citations
#1745

Language Model Guided Interpretable Video Action Reasoning

Ning Wang, Guangming Zhu, Hongsheng Li et al.

CVPR 2024arXiv:2404.01591
7
citations
#1746

The Computer Vision Foundation

Yancheng Cai, Fei Yin, Dounia Hammou et al.

CVPR 2025arXiv:2502.20256
7
citations
#1747

Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation

Yuying Ge, Yizhuo Li, Yixiao Ge et al.

CVPR 2025arXiv:2412.04432
7
citations
#1748

Parameter-efficient Fine-tuning in Hyperspherical Space for Open-vocabulary Semantic Segmentation

Zelin Peng, Zhengqin Xu, Zhilin Zeng et al.

CVPR 2025arXiv:2405.18840
7
citations
#1749

BiTT: Bi-directional Texture Reconstruction of Interacting Two Hands from a Single Image

Minje Kim, Tae-Kyun Kim

CVPR 2024arXiv:2403.08262
7
citations
#1750

SuperPrimitive: Scene Reconstruction at a Primitive Level

Kirill Mazur, Gwangbin Bae, Andrew J. Davison

CVPR 2024arXiv:2312.05889
7
citations
#1751

Cross-modal Causal Relation Alignment for Video Question Grounding

weixing chen, Yang Liu, Binglin Chen et al.

CVPR 2025highlightarXiv:2503.07635
7
citations
#1752

Hyperbolic Category Discovery

Yuanpei Liu, Zhenqi He, Kai Han

CVPR 2025arXiv:2504.06120
7
citations
#1753

Steganographic Passport: An Owner and User Verifiable Credential for Deep Model IP Protection Without Retraining

Qi Cui, Ruohan Meng, Chaohui Xu et al.

CVPR 2024arXiv:2404.02889
7
citations
#1754

Dynamic Updates for Language Adaptation in Visual-Language Tracking

Xiaohai Li, Bineng Zhong, Qihua Liang et al.

CVPR 2025arXiv:2503.06621
7
citations
#1755

LookCloser: Frequency-aware Radiance Field for Tiny-Detail Scene

Xiaoyu Zhang, Weihong Pan, Chong Bao et al.

CVPR 2025arXiv:2503.18513
7
citations
#1756

Facial Identity Anonymization via Intrinsic and Extrinsic Attention Distraction

Zhenzhong Kuang, Xiaochen Yang, Yingjie Shen et al.

CVPR 2024arXiv:2406.17219
7
citations
#1757

Towards Stable and Storage-efficient Dataset Distillation: Matching Convexified Trajectory

Wenliang Zhong, Haoyu Tang, Qinghai Zheng et al.

CVPR 2025arXiv:2406.19827
7
citations
#1758

Panorama Generation From NFoV Image Done Right

Dian Zheng, Cheng Zhang, Xiao-Ming Wu et al.

CVPR 2025highlightarXiv:2503.18420
7
citations
#1759

Enhancing 3D Gaze Estimation in the Wild using Weak Supervision with Gaze Following Labels

Pierre Vuillecard, Jean-marc Odobez

CVPR 2025arXiv:2502.20249
7
citations
#1760

Artist-Friendly Relightable and Animatable Neural Heads

Yingyan Xu, Prashanth Chandran, Sebastian Weiss et al.

CVPR 2024arXiv:2312.03420
7
citations
#1761

Detail-Preserving Latent Diffusion for Stable Shadow Removal

Jiamin Xu, Yuxin Zheng, Zelong Li et al.

CVPR 2025arXiv:2412.17630
7
citations
#1762

Intriguing Properties of Diffusion Models: An Empirical Study of the Natural Attack Capability in Text-to-Image Generative Models

Takami Sato, Justin Yue, Nanze Chen et al.

CVPR 2024arXiv:2308.15692
7
citations
#1763

L-MAGIC: Language Model Assisted Generation of Images with Coherence

zhipeng cai, Matthias Mueller, Reiner Birkl et al.

CVPR 2024arXiv:2406.01843
7
citations
#1764

Towards Generalizable Scene Change Detection

Jae-Woo KIM, Ue-Hwan Kim

CVPR 2025arXiv:2409.06214
7
citations
#1765

DiffLocks: Generating 3D Hair from a Single Image using Diffusion Models

Radu Alexandru Rosu, Keyu Wu, Yao Feng et al.

CVPR 2025arXiv:2505.06166
7
citations
#1766

HarmonySet: A Comprehensive Dataset for Understanding Video-Music Semantic Alignment and Temporal Synchronization

Zitang Zhou, Ke Mei, Yu Lu et al.

CVPR 2025arXiv:2503.01725
7
citations
#1767

Progress-Aware Video Frame Captioning

Zihui Xue, Joungbin An, Xitong Yang et al.

CVPR 2025arXiv:2412.02071
7
citations
#1768

DiET-GS: Diffusion Prior and Event Stream-Assisted Motion Deblurring 3D Gaussian Splatting

Seungjun Lee, Gim Hee Lee

CVPR 2025arXiv:2503.24210
7
citations
#1769

Instance-based Max-margin for Practical Few-shot Recognition

Minghao Fu, Ke Zhu

CVPR 2024arXiv:2305.17368
7
citations
#1770

InteractAnything: Zero-shot Human Object Interaction Synthesis via LLM Feedback and Object Affordance Parsing

Jinlu Zhang, Yixin Chen, Zan Wang et al.

CVPR 2025highlightarXiv:2505.24315
7
citations
#1771

CL-LoRA: Continual Low-Rank Adaptation for Rehearsal-Free Class-Incremental Learning

Jiangpeng He, Zhihao Duan, Fengqing Zhu

CVPR 2025arXiv:2505.24816
7
citations
#1772

Gradient-Guided Annealing for Domain Generalization

Aristotelis Ballas, Christos Diou

CVPR 2025highlightarXiv:2502.20162
7
citations
#1773

FirePlace: Geometric Refinements of LLM Common Sense Reasoning for 3D Object Placement

Ian Huang, Yanan Bao, Karen Truong et al.

CVPR 2025highlightarXiv:2503.04919
7
citations
#1774

Single View Refractive Index Tomography with Neural Fields

Brandon Zhao, Aviad Levis, Liam Connor et al.

CVPR 2024arXiv:2309.04437
7
citations
#1775

Benchmarking Segmentation Models with Mask-Preserved Attribute Editing

Zijin Yin, Kongming Liang, Bing Li et al.

CVPR 2024arXiv:2403.01231
7
citations
#1776

Generating Illustrated Instructions

Sachit Menon, Ishan Misra, Rohit Girdhar

CVPR 2024arXiv:2312.04552
7
citations
#1777

SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization

Hongrui Jia, Chaoya Jiang, Haiyang Xu et al.

CVPR 2025arXiv:2411.11909
7
citations
#1778

Stable Neighbor Denoising for Source-free Domain Adaptive Segmentation

Dong Zhao, Shuang Wang, Qi Zang et al.

CVPR 2024arXiv:2406.06813
7
citations
#1779

Contextual AD Narration with Interleaved Multimodal Sequence

Hanlin Wang, Zhan Tong, Kecheng Zheng et al.

CVPR 2025arXiv:2403.12922
7
citations
#1780

Adaptive Part Learning for Fine-Grained Generalized Category Discovery: A Plug-and-Play Enhancement

Qiyuan Dai, Hanzhuo Huang, Yu Wu et al.

CVPR 2025arXiv:2507.06928
7
citations
#1781

StdGEN: Semantic-Decomposed 3D Character Generation from Single Images

Yuze He, Yanning Zhou, Wang Zhao et al.

CVPR 2025arXiv:2411.05738
7
citations
#1782

SemGeoMo: Dynamic Contextual Human Motion Generation with Semantic and Geometric Guidance

Peishan Cong, Ziyi Wang, Yuexin Ma et al.

CVPR 2025arXiv:2503.01291
7
citations
#1783

FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion

Haosen Yang, Adrian Bulat, Isma Hadji et al.

CVPR 2025arXiv:2411.18552
7
citations
#1784

G3DR: Generative 3D Reconstruction in ImageNet

Pradyumna Reddy, Ismail Elezi, Jiankang Deng

CVPR 2024arXiv:2403.00939
7
citations
#1785

BrainWash: A Poisoning Attack to Forget in Continual Learning

Ali Abbasi, Parsa Nooralinejad, Hamed Pirsiavash et al.

CVPR 2024arXiv:2311.11995
7
citations
#1786

A Generative Approach for Wikipedia-Scale Visual Entity Recognition

Mathilde Caron, Ahmet Iscen, Alireza Fathi et al.

CVPR 2024arXiv:2403.02041
7
citations
#1787

CLIPtone: Unsupervised Learning for Text-based Image Tone Adjustment

Hyeongmin Lee, Kyoungkook Kang, Jungseul Ok et al.

CVPR 2024arXiv:2404.01123
7
citations
#1788

TIMotion: Temporal and Interactive Framework for Efficient Human-Human Motion Generation

Yabiao Wang, Shuo Wang, Jiangning Zhang et al.

CVPR 2025arXiv:2408.17135
7
citations
#1789

Inversion Circle Interpolation: Diffusion-based Image Augmentation for Data-scarce Classification

Yanghao Wang, Long Chen

CVPR 2025arXiv:2408.16266
7
citations
#1790

GFlowVLM: Enhancing Multi-step Reasoning in Vision-Language Models with Generative Flow Networks

Haoqiang Kang, Enna Sachdeva, Piyush Gupta et al.

CVPR 2025arXiv:2503.06514
7
citations
#1791

InterAct: Advancing Large-Scale Versatile 3D Human-Object Interaction Generation

Sirui Xu, Dongting Li, Yucheng Zhang et al.

CVPR 2025arXiv:2509.09555
7
citations
#1792

A Simple yet Effective Layout Token in Large Language Models for Document Understanding

Zhaoqing Zhu, Chuwei Luo, Zirui Shao et al.

CVPR 2025arXiv:2503.18434
7
citations
#1793

STPro: Spatial and Temporal Progressive Learning for Weakly Supervised Spatio-Temporal Grounding

Aaryan Garg, Akash Kumar, Yogesh S. Rawat

CVPR 2025arXiv:2502.20678
7
citations
#1794

Towards Backward-Compatible Continual Learning of Image Compression

Zhihao Duan, Ming Lu, Justin Yang et al.

CVPR 2024arXiv:2402.18862
7
citations
#1795

PBR-NeRF: Inverse Rendering with Physics-Based Neural Fields

Sean Wu, Shamik Basu, Tim Broedermann et al.

CVPR 2025arXiv:2412.09680
7
citations
#1796

ConMo: Controllable Motion Disentanglement and Recomposition for Zero-Shot Motion Transfer

Jiayi Gao, Zijin Yin, Changcheng Hua et al.

CVPR 2025arXiv:2504.02451
7
citations
#1797

Scene Splatter: Momentum 3D Scene Generation from Single Image with Video Diffusion Model

Shengjun Zhang, Jinzhao Li, Xin Fei et al.

CVPR 2025arXiv:2504.02764
7
citations
#1798

When Visual Grounding Meets Gigapixel-level Large-scale Scenes: Benchmark and Approach

TAO MA, Bing Bai, Haozhe Lin et al.

CVPR 2024
7
citations
#1799

Robust Multimodal Survival Prediction with Conditional Latent Differentiation Variational AutoEncoder

Junjie Zhou, Jiao Tang, Yingli Zuo et al.

CVPR 2025
7
citations
#1800

Object-Shot Enhanced Grounding Network for Egocentric Video

Yisen Feng, Haoyu Zhang, Meng Liu et al.

CVPR 2025arXiv:2505.04270
7
citations