Most Cited CVPR "low inference latency" Papers

5,589 papers found • Page 25 of 28

#4801

Revisiting Adversarial Training at Scale

Zeyu Wang, Xianhang Li, Hongru Zhu et al.

CVPR 2024 • arXiv:2401.04727
#4802

PersonaBooth: Personalized Text-to-Motion Generation

Boeun Kim, Hea In Jeong, JungHoon Sung et al.

CVPR 2025 • arXiv:2503.07390
#4803

G-FARS: Gradient-Field-based Auto-Regressive Sampling for 3D Part Grouping

Junfeng Cheng, Tania Stathaki

CVPR 2024 • arXiv:2405.06828
#4804

DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models

Keda Tao, Can Qin, Haoxuan You et al.

CVPR 2025 • arXiv:2411.15024
#4805

Make Pixels Dance: High-Dynamic Video Generation

Yan Zeng, Guoqiang Wei, Jiani Zheng et al.

CVPR 2024 • arXiv:2311.10982
#4806

Masked AutoDecoder is Effective Multi-Task Vision Generalist

Han Qiu, Jiaxing Huang, Peng Gao et al.

CVPR 2024 • arXiv:2403.07692
#4807

Generative Multi-modal Models are Good Class Incremental Learners

Xusheng Cao, Haori Lu, Linlan Huang et al.

CVPR 2024 • arXiv:2403.18383
#4808

Deciphering ‘What’ and ‘Where’ Visual Pathways from Spectral Clustering of Layer-Distributed Neural Representations

Xiao Zhang, David Yunis, Michael Maire

CVPR 2024 • highlight • arXiv:2312.06716
#4809

CRISP: Object Pose and Shape Estimation with Test-Time Adaptation

Jingnan Shi, Rajat Talak, Harry Zhang et al.

CVPR 2025 • highlight • arXiv:2412.01052
#4810

Yo’Chameleon: Personalized Vision and Language Generation

Thao Nguyen, Krishna Kumar Singh, Jing Shi et al.

CVPR 2025
#4811

LLaMA-Excitor: General Instruction Tuning via Indirect Feature Interaction

Bo Zou, Chao Yang, Yu Qiao et al.

CVPR 2024 • arXiv:2404.00913
#4812

EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Language Models

Sijie Cheng, Zhicheng Guo, Jingwen Wu et al.

CVPR 2024 • highlight • arXiv:2311.15596
#4813

Learning Temporally Consistent Video Depth from Video Diffusion Priors

Jiahao Shao, Yuanbo Yang, Hongyu Zhou et al.

CVPR 2025 • arXiv:2406.01493
#4814

Sketch Down the FLOPs: Towards Efficient Networks for Human Sketch

Aneeshan Sain, Subhajit Maity, Pinaki Nath Chowdhury et al.

CVPR 2025 • arXiv:2505.23763
#4815

EchoTraffic: Enhancing Traffic Anomaly Understanding with Audio-Visual Insights

Zhenghao Xing, Hao Chen, Binzhu Xie et al.

CVPR 2025
#4816

Probabilistic Speech-Driven 3D Facial Motion Synthesis: New Benchmarks, Methods, and Applications

Karren Yang, Anurag Ranjan, Jen-Hao Rick Chang et al.

CVPR 2024
#4817

From Feature to Gaze: A Generalizable Replacement of Linear Layer for Gaze Estimation

Yiwei Bao, Feng Lu

CVPR 2024 • highlight
#4818

OSLoPrompt: Bridging Low-Supervision Challenges and Open-Set Domain Generalization in CLIP

Mohamad Hassan N C, Divyam Gupta, Mainak Singha et al.

CVPR 2025 • arXiv:2503.16106
#4819

NC-SDF: Enhancing Indoor Scene Reconstruction Using Neural SDFs with View-Dependent Normal Compensation

Ziyi Chen, Xiaolong Wu, Yu Zhang

CVPR 2024 • arXiv:2405.00340
#4820

Temporal Alignment-Free Video Matching for Few-shot Action Recognition

SuBeen Lee, WonJun Moon, Hyun Seok Seong et al.

CVPR 2025 • arXiv:2504.05956
#4821

Language Models as Black-Box Optimizers for Vision-Language Models

Shihong Liu, Samuel Yu, Zhiqiu Lin et al.

CVPR 2024 • arXiv:2309.05950
#4822

Transferable Structural Sparse Adversarial Attack Via Exact Group Sparsity Training

Di Ming, Peng Ren, Yunlong Wang et al.

CVPR 2024
#4823

IRGS: Inter-Reflective Gaussian Splatting with 2D Gaussian Ray Tracing

Chun Gu, Xiaofei Wei, Zixuan Zeng et al.

CVPR 2025 • arXiv:2412.15867
#4824

Practical Solutions to the Relative Pose of Three Calibrated Cameras

Charalambos Tzamos, Viktor Kocur, Yaqing Ding et al.

CVPR 2025 • arXiv:2303.16078
#4825

Holistic Autonomous Driving Understanding by Bird’s-Eye-View Injected Multi-Modal Large Models

Xinpeng Ding, Jianhua Han, Hang Xu et al.

CVPR 2024 • arXiv:2401.00988
#4826

MultimodalStudio: A Heterogeneous Sensor Dataset and Framework for Neural Rendering across Multiple Imaging Modalities

Federico Lincetto, Gianluca Agresti, Mattia Rossi et al.

CVPR 2025 • arXiv:2503.19673
#4827

ASH: Animatable Gaussian Splats for Efficient and Photoreal Human Rendering

Haokai Pang, Heming Zhu, Adam Kortylewski et al.

CVPR 2024 • arXiv:2312.05941
#4828

Exploring Timeline Control for Facial Motion Generation

Yifeng Ma, Jinwei Qi, Chaonan Ji et al.

CVPR 2025 • arXiv:2505.20861
#4829

Structured 3D Latents for Scalable and Versatile 3D Generation

Jianfeng Xiang, Zelong Lv, Sicheng Xu et al.

CVPR 2025 • highlight • arXiv:2412.01506
#4830

Equivariant Plug-and-Play Image Reconstruction

Matthieu Terris, Thomas Moreau, Nelly Pustelnik et al.

CVPR 2024 • arXiv:2312.01831
#4831

DiffusionAvatars: Deferred Diffusion for High-fidelity 3D Head Avatars

Tobias Kirschstein, Simon Giebenhain, Matthias Nießner

CVPR 2024 • arXiv:2311.18635
#4832

Movie Weaver: Tuning-Free Multi-Concept Video Personalization with Anchored Prompts

Feng Liang, Haoyu Ma, Zecheng He et al.

CVPR 2025 • arXiv:2502.07802
#4833

Learning Background Prompts to Discover Implicit Knowledge for Open Vocabulary Object Detection

Jiaming Li, Jiacheng Zhang, Jichang Li et al.

CVPR 2024 • arXiv:2406.00510
#4834

Enhancing Facial Privacy Protection via Weakening Diffusion Purification

Ali Salar, Qing Liu, Yingli Tian et al.

CVPR 2025 • arXiv:2503.10350
#4835

Addressing Background Context Bias in Few-Shot Segmentation through Iterative Modulation

Lanyun Zhu, Tianrun Chen, Jianxiong Yin et al.

CVPR 2024
#4836

Learned Representation-Guided Diffusion Models for Large-Image Generation

Alexandros Graikos, Srikar Yellapragada, Minh-Quan Le et al.

CVPR 2024 • arXiv:2312.07330
#4837

AdaCM^2: On Understanding Extremely Long-Term Video with Adaptive Cross-Modality Memory Reduction

Yuanbin Man, Ying Huang, Chengming Zhang et al.

CVPR 2025 • highlight • arXiv:2411.12593
#4838

OVFoodSeg: Elevating Open-Vocabulary Food Image Segmentation via Image-Informed Textual Representation

Xiongwei Wu, Sicheng Yu, Ee-Peng Lim et al.

CVPR 2024 • arXiv:2404.01409
#4839

AVFF: Audio-Visual Feature Fusion for Video Deepfake Detection

Trevine Oorloff, Surya Koppisetti, Nicolo Bonettini et al.

CVPR 2024 • arXiv:2406.02951
#4840

CaKDP: Category-aware Knowledge Distillation and Pruning Framework for Lightweight 3D Object Detection

Haonan Zhang, Longjun Liu, Yuqi Huang et al.

CVPR 2024
#4841

Friendly Sharpness-Aware Minimization

Tao Li, Pan Zhou, Zhengbao He et al.

CVPR 2024 • arXiv:2403.12350
#4842

CustomListener: Text-guided Responsive Interaction for User-friendly Listening Head Generation

Xi Liu, Ying Guo, Cheng Zhen et al.

CVPR 2024 • arXiv:2403.00274
#4843

Brain Decodes Deep Nets

Huzheng Yang, James Gee, Jianbo Shi

CVPR 2024 • highlight • arXiv:2312.01280
#4844

MoSAR: Monocular Semi-Supervised Model for Avatar Reconstruction using Differentiable Shading

Abdallah Dib, Luiz Gustavo Hafemann, Emeline Got et al.

CVPR 2024 • arXiv:2312.13091
#4845

Point2CAD: Reverse Engineering CAD Models from 3D Point Clouds

Yujia Liu, Anton Obukhov, Jan D. Wegner et al.

CVPR 2024 • highlight • arXiv:2312.04962
#4846

REWIND: Real-Time Egocentric Whole-Body Motion Diffusion with Exemplar-Based Identity Conditioning

Jian Wang, Zhe Cao, Diogo Luvizon et al.

CVPR 2024
#4847

A Unified Framework for Microscopy Defocus Deblur with Multi-Pyramid Transformer and Contrastive Learning

Yuelin Zhang, Pengyu Zheng, Wanquan Yan et al.

CVPR 2024 • arXiv:2403.02611
#4848

Structure Matters: Tackling the Semantic Discrepancy in Diffusion Models for Image Inpainting

Haipeng Liu, Yang Wang, Biao Qian et al.

CVPR 2024 • arXiv:2403.19898
#4849

Video Depth Anything: Consistent Depth Estimation for Super-Long Videos

Sili Chen, Hengkai Guo, Shengnan Zhu et al.

CVPR 2025 • highlight • arXiv:2501.12375
#4850

Misalignment-Robust Frequency Distribution Loss for Image Transformation

Zhangkai Ni, Juncheng Wu, Zian Wang et al.

CVPR 2024 • arXiv:2402.18192
#4851

Good, Cheap, and Fast: Overfitted Image Compression with Wasserstein Distortion

Jona Ballé, Luca Versari, Emilien Dupont et al.

CVPR 2025 • highlight • arXiv:2412.00505
#4852

WildlifeMapper: Aerial Image Analysis for Multi-Species Detection and Identification

Satish Kumar, Bowen Zhang, Chandrakanth Gudavalli et al.

CVPR 2024
#4853

SDSTrack: Self-Distillation Symmetric Adapter Learning for Multi-Modal Visual Object Tracking

Xiaojun Hou, Jiazheng Xing, Yijie Qian et al.

CVPR 2024 • arXiv:2403.16002
#4854

Consistent Normal Orientation for 3D Point Clouds via Least Squares on Delaunay Graph

Rao Fu, Jianmin Zheng, Liang Yu

CVPR 2025
#4855

BioX-CPath: Biologically-driven Explainable Diagnostics for Multistain IHC Computational Pathology

Amaya Gallagher-Syed, Henry Senior, Omnia Alwazzan et al.

CVPR 2025 • arXiv:2503.20880
#4856

SchurVINS: Schur Complement-Based Lightweight Visual Inertial Navigation System

Yunfei Fan, Tianyu Zhao, Guidong Wang

CVPR 2024 • arXiv:2312.01616
#4857

MACE: Mass Concept Erasure in Diffusion Models

Shilin Lu, Zilan Wang, Leyang Li et al.

CVPR 2024 • arXiv:2403.06135
#4858

DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations

Tianhao Qi, Shancheng Fang, Yanze Wu et al.

CVPR 2024 • highlight • arXiv:2403.06951
#4859

Descriptor-In-Pixel: Point-Feature Tracking For Pixel Processor Arrays

Laurie Bose, Piotr Dudek, Jianing Chen

CVPR 2025
#4860

Learning Degradation-unaware Representation with Prior-based Latent Transformations for Blind Face Restoration

Lianxin Xie, csbingbing zheng, Wen Xue et al.

CVPR 2024
#4861

360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model

Qian Wang, Weiqi Li, Chong Mou et al.

CVPR 2024 • arXiv:2401.06578
#4862

Alpha Invariance: On Inverse Scaling Between Distance and Volume Density in Neural Radiance Fields

Joshua Ahn, Haochen Wang, Raymond A. Yeh et al.

CVPR 2024 • arXiv:2404.02155
#4863

Countering Personalized Text-to-Image Generation with Influence Watermarks

Hanwen Liu, Zhicheng Sun, Yadong Mu

CVPR 2024
#4864

Teeth-SEG: An Efficient Instance Segmentation Framework for Orthodontic Treatment based on Multi-Scale Aggregation and Anthropic Prior Knowledge

Bo Zou, Shaofeng Wang, Hao Liu et al.

CVPR 2024
#4865

Interpreting Object-level Foundation Models via Visual Precision Search

Ruoyu Chen, Siyuan Liang, Jingzhi Li et al.

CVPR 2025 • highlight • arXiv:2411.16198
#4866

T-VSL: Text-Guided Visual Sound Source Localization in Mixtures

Tanvir Mahmud, Yapeng Tian, Diana Marculescu

CVPR 2024 • arXiv:2404.01751
#4867

LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes

Xiang Xu, Lingdong Kong, Hui Shuai et al.

CVPR 2025 • arXiv:2501.04004
#4868

TAET: Two-Stage Adversarial Equalization Training on Long-Tailed Distributions

Wang Yu-Hang, Junkang Guo, Aolei Liu et al.

CVPR 2025
#4869

ANIM: Accurate Neural Implicit Model for Human Reconstruction from a single RGB-D Image

Marco Pesavento, Yuanlu Xu, Nikolaos Sarafianos et al.

CVPR 2024 • arXiv:2403.10357
#4870

AIM-Fair: Advancing Algorithmic Fairness via Selectively Fine-Tuning Biased Models with Contextual Synthetic Data

Zengqun Zhao, Ziquan Liu, Yu Cao et al.

CVPR 2025 • arXiv:2503.05665
#4871

vid-TLDR: Training Free Token Merging for Light-weight Video Transformer

Joonmyung Choi, Sanghyeok Lee, Jaewon Chu et al.

CVPR 2024 • arXiv:2403.13347
#4872

Initialization Matters for Adversarial Transfer Learning

Andong Hua, Jindong Gu, Zhiyu Xue et al.

CVPR 2024 • arXiv:2312.05716
#4873

RoadSocial: A Diverse VideoQA Dataset and Benchmark for Road Event Understanding from Social Video Narratives

Chirag Parikh, Deepti Rawat, Rakshitha R. T. et al.

CVPR 2025 • arXiv:2503.21459
#4874

MindBridge: A Cross-Subject Brain Decoding Framework

Shizun Wang, Songhua Liu, Zhenxiong Tan et al.

CVPR 2024 • highlight • arXiv:2404.07850
#4875

Loopy-SLAM: Dense Neural SLAM with Loop Closures

Lorenzo Liso, Erik Sandström, Vladimir Yugay et al.

CVPR 2024 • arXiv:2402.09944
#4876

Weakly-Supervised Audio-Visual Video Parsing with Prototype-based Pseudo-Labeling

Kranthi Kumar Rachavarapu, Kalyan Ramakrishnan, A. N. Rajagopalan

CVPR 2024
#4877

MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models

Sanjoy Chowdhury, Sayan Nag, Joseph K J et al.

CVPR 2024 • highlight • arXiv:2406.04673
#4878

InstaGen: Enhancing Object Detection by Training on Synthetic Dataset

Chengjian Feng, Yujie Zhong, Zequn Jie et al.

CVPR 2024 • arXiv:2402.05937
#4879

Narrative Action Evaluation with Prompt-Guided Multimodal Interaction

Shiyi Zhang, Sule Bai, Guangyi Chen et al.

CVPR 2024 • arXiv:2404.14471
#4880

DeconfuseTrack: Dealing with Confusion for Multi-Object Tracking

Cheng Huang, Shoudong Han, Mengyu He et al.

CVPR 2024
#4881

ChatPose: Chatting about 3D Human Pose

Yao Feng, Jing Lin, Sai Kumar Dwivedi et al.

CVPR 2024 • arXiv:2311.18836
#4882

Modality-agnostic Domain Generalizable Medical Image Segmentation by Multi-Frequency in Multi-Scale Attention

Ju-Hyeon Nam, Nur Suriza Syazwany, Su Jung Kim et al.

CVPR 2024 • arXiv:2405.06284
#4883

NC-TTT: A Noise Constrastive Approach for Test-Time Training

David Osowiechi, Gustavo Vargas Hakim, Mehrdad Noori et al.

CVPR 2024 • highlight
#4884

JRDB-PanoTrack: An Open-world Panoptic Segmentation and Tracking Robotic Dataset in Crowded Human Environments

Duy Tho Le, Chenhui Gou, Stavya Datta et al.

CVPR 2024 • arXiv:2404.01686
#4885

Perturbing Attention Gives You More Bang for the Buck: Subtle Imaging Perturbations That Efficiently Fool Customized Diffusion Models

Jingyao Xu, Yuetong Lu, Yandong Li et al.

CVPR 2024 • arXiv:2404.15081
#4886

ESCAPE: Encoding Super-keypoints for Category-Agnostic Pose Estimation

Khoi D Nguyen, Chen Li, Gim Hee Lee

CVPR 2024
#4887

Task-Specific Gradient Adaptation for Few-Shot One-Class Classification

Yunlong Li, Xiabi Liu, Liyuan Pan et al.

CVPR 2025
#4888

Minimal Perspective Autocalibration

Andrea Porfiri Dal Cin, Timothy Duff, Luca Magri et al.

CVPR 2024 • arXiv:2405.05605
#4889

ReGenNet: Towards Human Action-Reaction Synthesis

Liang Xu, Yizhou Zhou, Yichao Yan et al.

CVPR 2024 • arXiv:2403.11882
#4890

RGBD Objects in the Wild: Scaling Real-World 3D Object Learning from RGB-D Videos

Hongchi Xia, Yang Fu, Sifei Liu et al.

CVPR 2024 • arXiv:2401.12592
#4891

Aligning and Prompting Everything All at Once for Universal Visual Perception

Yunhang Shen, Chaoyou Fu, Peixian Chen et al.

CVPR 2024 • arXiv:2312.02153
#4892

ZONE: Zero-Shot Instruction-Guided Local Editing

Shanglin Li, Bohan Zeng, Yutang Feng et al.

CVPR 2024 • arXiv:2312.16794
#4893

Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption

Buzhen Huang, Chen Li, Chongyang Xu et al.

CVPR 2024 • arXiv:2404.11291
#4894

Label Propagation for Zero-shot Classification with Vision-Language Models

Vladan Stojnić, Yannis Kalantidis, Giorgos Tolias

CVPR 2024 • arXiv:2404.04072
#4895

IQ-VFI: Implicit Quadratic Motion Estimation for Video Frame Interpolation

Mengshun Hu, Kui Jiang, Zhihang Zhong et al.

CVPR 2024
#4896

Part-aware Unified Representation of Language and Skeleton for Zero-shot Action Recognition

Anqi Zhu, Qiuhong Ke, Mingming Gong et al.

CVPR 2024 • arXiv:2406.13327
#4897

Diffusion-ES: Gradient-free Planning with Diffusion for Autonomous and Instruction-guided Driving

Brian Yang, Huangyuan Su, Nikolaos Gkanatsios et al.

CVPR 2024
#4898

Parametric Point Cloud Completion for Polygonal Surface Reconstruction

Zhaiyu Chen, Yuqing Wang, Liangliang Nan et al.

CVPR 2025 • arXiv:2503.08363
#4899

Structured Model Probing: Empowering Efficient Transfer Learning by Structured Regularization

Zhi-Fan Wu, Chaojie Mao, Xue Wang et al.

CVPR 2024
#4900

TinyFusion: Diffusion Transformers Learned Shallow

Gongfan Fang, Kunjun Li, Xinyin Ma et al.

CVPR 2025 • highlight • arXiv:2412.01199
#4901

Poly-Autoregressive Prediction for Modeling Interactions

Neerja Thakkar, Tara Sadjadpour, Jathushan Rajasegaran et al.

CVPR 2025 • arXiv:2502.08646
#4902

CrossSDF: 3D Reconstruction of Thin Structures From Cross-Sections

Thomas Walker, Salvatore Esposito, Daniel Rebain et al.

CVPR 2025 • arXiv:2412.04120
#4903

Decentralized Diffusion Models

David McAllister, Matthew Tancik, Jiaming Song et al.

CVPR 2025 • arXiv:2501.05450
#4904

DiSciPLE: Learning Interpretable Programs for Scientific Visual Discovery

Utkarsh Mall, Cheng Perng Phoo, Mia Chiquier et al.

CVPR 2025 • arXiv:2502.10060
#4905

CRKD: Enhanced Camera-Radar Object Detection with Cross-modality Knowledge Distillation

Lingjun Zhao, Jingyu Song, Katherine Skinner

CVPR 2024 • arXiv:2403.19104
#4906

Investigating the Role of Weight Decay in Enhancing Nonconvex SGD

Tao Sun, Yuhao Huang, Li Shen et al.

CVPR 2025
#4907

UVGS: Reimagining Unstructured 3D Gaussian Splatting using UV Mapping

Aashish Rai, Dilin Wang, Mihir Jain et al.

CVPR 2025 • arXiv:2502.01846
#4908

Z-Magic: Zero-shot Multiple Attributes Guided Image Creator

Yingying Deng, Xiangyu He, Fan Tang et al.

CVPR 2025 • arXiv:2503.12124
#4909

CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models

Felix Taubner, Ruihang Zhang, Mathieu Tuli et al.

CVPR 2025 • arXiv:2412.12093
#4910

MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction

Jingcheng Ni, Yuxin Guo, Yichen Liu et al.

CVPR 2025 • arXiv:2502.11663
#4911

Towards Understanding Cross and Self-Attention in Stable Diffusion for Text-Guided Image Editing

Bingyan Liu, Chengyu Wang, Tingfeng Cao et al.

CVPR 2024 • arXiv:2403.03431
#4912

ALIEN: Implicit Neural Representations for Human Motion Prediction under Arbitrary Latency

Dong Wei, Xiaoning Sun, Xizhan Gao et al.

CVPR 2025 • highlight
#4913

Two by Two: Learning Multi-Task Pairwise Objects Assembly for Generalizable Robot Manipulation

Yu Qi, Yuanchen Ju, Tianming Wei et al.

CVPR 2025 • arXiv:2504.06961
#4914

GIF: Generative Inspiration for Face Recognition at Scale

Mohammad Saadabadi Saadabadi, Sahar Rahimi Malakshan, Ali Dabouei et al.

CVPR 2025
#4915

TULIP: Transformer for Upsampling of LiDAR Point Clouds

Bin Yang, Patrick Pfreundschuh, Roland Siegwart et al.

CVPR 2024 • arXiv:2312.06733
#4916

AVQACL: A Novel Benchmark for Audio-Visual Question Answering Continual Learning

Kaixuan Wu, Xinde Li, Xinglin Li et al.

CVPR 2025
#4917

Science-T2I: Addressing Scientific Illusions in Image Synthesis

Jialuo Li, Wenhao Chai, Xingyu Fu et al.

CVPR 2025 • arXiv:2504.13129
#4918

Neuron: Learning Context-Aware Evolving Representations for Zero-Shot Skeleton Action Recognition

Yang Chen, Jingcai Guo, Song Guo et al.

CVPR 2025 • arXiv:2411.11288
#4919

Navigating Image Restoration with VAR’s Distribution Alignment Prior

Siyang Wang, Naishan Zheng, Jie Huang et al.

CVPR 2025 • arXiv:2412.21063
#4920

Incremental Residual Concept Bottleneck Models

Chenming Shang, Shiji Zhou, Hengyuan Zhang et al.

CVPR 2024 • arXiv:2404.08978
#4921

HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation

Hermann Kumbong, Xian Liu, Tsung-Yi Lin et al.

CVPR 2025 • arXiv:2506.04421
#4922

Minding Fuzzy Regions: A Data-driven Alternating Learning Paradigm for Stable Lesion Segmentation

Lexin Fang, Yunyang Xu, Xiang Ma et al.

CVPR 2025 • arXiv:2503.11140
#4923

LumiNet: Latent Intrinsics Meets Diffusion Models for Indoor Scene Relighting

Xiaoyan Xing, Konrad Groh, Sezer Karaoglu et al.

CVPR 2025 • arXiv:2412.00177
#4924

ProjAttacker: A Configurable Physical Adversarial Attack for Face Recognition via Projector

Yuanwei Liu, Hui Wei, Chengyu Jia et al.

CVPR 2025
#4925

Efficient Dataset Distillation via Minimax Diffusion

Jianyang Gu, Saeed Vahidian, Vyacheslav Kungurtsev et al.

CVPR 2024 • arXiv:2311.15529
#4926

DiffusionSfM: Predicting Structure and Motion via Ray Origin and Endpoint Diffusion

Qitao Zhao, Amy Lin, Jeff Tan et al.

CVPR 2025 • arXiv:2505.05473
#4927

DUSt3R: Geometric 3D Vision Made Easy

Shuzhe Wang, Vincent Leroy, Yohann Cabon et al.

CVPR 2024 • arXiv:2312.14132
#4928

Towards Improved Text-Aligned Codebook Learning: Multi-Hierarchical Codebook-Text Alignment with Long Text

Guotao Liang, Baoquan Zhang, Zhiyuan Wen et al.

CVPR 2025 • highlight • arXiv:2503.01261
#4929

StyleMaster: Stylize Your Video with Artistic Generation and Translation

Zixuan Ye, Huijuan Huang, Xintao Wang et al.

CVPR 2025 • arXiv:2412.07744
#4930

Divide and Conquer: Heterogeneous Noise Integration for Diffusion-based Adversarial Purification

Gaozheng Pei, Shaojie Lyu, Gong Chen et al.

CVPR 2025 • arXiv:2503.01407
#4931

Foley-Flow: Coordinated Video-to-Audio Generation with Masked Audio-Visual Alignment and Dynamic Conditional Flows

Shentong Mo, Yibing Song

CVPR 2025
#4932

DL2G: Degradation-guided Local-to-Global Restoration for Eyeglass Reflection Removal

Yizhilv, Xiao Lu, Hong Ding et al.

CVPR 2025
#4933

Efficient Decoupled Feature 3D Gaussian Splatting via Hierarchical Compression

Zhenqi Dai, Ting Liu, Yanning Zhang

CVPR 2025
#4934

Viewpoint Rosetta Stone: Unlocking Unpaired Ego-Exo Videos for View-invariant Representation Learning

Mi Luo, Zihui Xue, Alex Dimakis et al.

CVPR 2025
#4935

BlenderGym: Benchmarking Foundational Model Systems for Graphics Editing

Yunqi Gu, Ian Huang, Jihyeon Je et al.

CVPR 2025 • highlight • arXiv:2504.01786
#4936

NightCC: Nighttime Color Constancy via Adaptive Channel Masking

Shuwei Li, Robby T. Tan

CVPR 2024
#4937

AdMiT: Adaptive Multi-Source Tuning in Dynamic Environments

Xiangyu Chang, Fahim Faisal Niloy, Sk Miraj Ahmed et al.

CVPR 2025
#4938

Accelerating Diffusion Transformer via Increment-Calibrated Caching with Channel-Aware Singular Value Decomposition

Zhiyuan Chen, Keyi Li, Yifan Jia et al.

CVPR 2025 • arXiv:2505.05829
#4939

Hearing Hands: Generating Sounds from Physical Interactions in 3D Scenes

Yiming Dou, Wonseok Oh, Yuqing Luo et al.

CVPR 2025 • arXiv:2506.09989
#4940

Fortifying Federated Learning Towards Trustworthiness via Auditable Data Valuation and Verifiable Client Contribution

Naveen Kumar Kummari, Ranjeet Ranjan Jha, Krishna Mohan Chalavadi et al.

CVPR 2025
#4941

RoomPainter: View-Integrated Diffusion for Consistent Indoor Scene Texturing

Zhipeng Huang, Wangbo Yu, Xinhua Cheng et al.

CVPR 2025 • arXiv:2412.16778
#4942

DIV-FF: Dynamic Image-Video Feature Fields For Environment Understanding in Egocentric Videos

Lorenzo Mur-Labadia, Jose J. Guerrero, Ruben Martinez-Cantin

CVPR 2025 • highlight • arXiv:2503.08344
#4943

A Simple Data Augmentation for Feature Distribution Skewed Federated Learning

Yunlu Yan, Huazhu Fu, Yuexiang Li et al.

CVPR 2025 • arXiv:2306.09363
#4944

Enhancing Video Super-Resolution via Implicit Resampling-based Alignment

Kai Xu, Ziwei Yu, Xin Wang et al.

CVPR 2024 • highlight • arXiv:2305.00163
#4945

Domain Generalization in CLIP via Learning with Diverse Text Prompts

Changsong Wen, Zelin Peng, Yu Huang et al.

CVPR 2025
#4946

Alias-Free Latent Diffusion Models: Improving Fractional Shift Equivariance of Diffusion Latent Space

Yifan Zhou, Zeqi Xiao, Shuai Yang et al.

CVPR 2025 • arXiv:2503.09419
#4947

Weakly Supervised Contrastive Adversarial Training for Learning Robust Features from Semi-supervised Data

Lilin Zhang, Chengpei Wu, Ning Yang

CVPR 2025 • arXiv:2503.11032
#4948

Graph-Embedded Structure-Aware Perceptual Hashing for Neural Network Protection and Piracy Detection

Ruiheng Liu, Haozhe Chen, Boyao Zhao et al.

CVPR 2025
#4949

Beyond Local Sharpness: Communication-Efficient Global Sharpness-aware Minimization for Federated Learning

Debora Caldarola, Pietro Cagnasso, Barbara Caputo et al.

CVPR 2025 • arXiv:2412.03752
#4950

Learning Physics From Video: Unsupervised Physical Parameter Estimation for Continuous Dynamical Systems

Alejandro Castañeda Garcia, Jan Warchocki, Jan van Gemert et al.

CVPR 2025 • arXiv:2410.01376
#4951

Detect-and-Guide: Self-regulation of Diffusion Models for Safe Text-to-Image Generation via Guideline Token Optimization

Feifei Li, Mi Zhang, Yiming Sun et al.

CVPR 2025 • arXiv:2503.15197
#4952

Teller: Real-Time Streaming Audio-Driven Portrait Animation with Autoregressive Motion Generation

Dingcheng Zhen, Shunshun Yin, Shiyang Qin et al.

CVPR 2025 • arXiv:2503.18429
#4953

AnyMap: Learning a General Camera Model for Structure-from-Motion with Unknown Distortion in Dynamic Scenes

Andrea Porfiri Dal Cin, Georgi Dikov, Jihong Ju et al.

CVPR 2025
#4954

K-LoRA: Unlocking Training-Free Fusion of Any Subject and Style LoRAs

Ziheng Ouyang, Zhen Li, Qibin Hou

CVPR 2025 • arXiv:2502.18461
#4955

Seeking Consistent Flat Minima for Better Domain Generalization via Refining Loss Landscapes

Aodi Li, Liansheng Zhuang, Xiao Long et al.

CVPR 2025 • arXiv:2412.13573
#4956

CarPlanner: Consistent Auto-regressive Trajectory Planning for Large-Scale Reinforcement Learning in Autonomous Driving

Dongkun Zhang, Jiaming Liang, Ke Guo et al.

CVPR 2025 • arXiv:2502.19908
#4957

FASTer: Focal token Acquiring-and-Scaling Transformer for Long-term 3D Object Detection

Chenxu Dang, Pei An, Xinmin Zhang et al.

CVPR 2025 • arXiv:2503.01899
#4958

UCM-VeID V2: A Richer Dataset and A Pre-training Method for UAV Cross-Modality Vehicle Re-Identification

Xingyue Liu, Jiahao Qi, Chen Chen et al.

CVPR 2025
#4959

Unboxed: Geometrically and Temporally Consistent Video Outpainting

Zhongrui Yu, Martina Megaro-Boldini, Robert Sumner et al.

CVPR 2025
#4960

Less is More: Efficient Model Merging with Binary Task Switch

Biqing Qi, Fangyuan Li, Zhen Wang et al.

CVPR 2025 • highlight • arXiv:2412.00054
#4961

Adversarial Text to Continuous Image Generation

Kilichbek Haydarov, Aashiq Muhamed, Xiaoqian Shen et al.

CVPR 2024
#4962

Visual Lexicon: Rich Image Features in Language Space

XuDong Wang, Xingyi Zhou, Alireza Fathi et al.

CVPR 2025 • arXiv:2412.06774
#4963

Seeing Speech and Sound: Distinguishing and Locating Audio Sources in Visual Scenes

Hyeonggon Ryu, Seongyu Kim, Joon Chung et al.

CVPR 2025
#4964

Continual SFT Matches Multimodal RLHF with Negative Supervision

Ke Zhu, Yu Wang, Yanpeng Sun et al.

CVPR 2025 • arXiv:2411.14797
#4965

Recurrent Feature Mining and Keypoint Mixup Padding for Category-Agnostic Pose Estimation

Junjie Chen, Weilong Chen, Yifan Zuo et al.

CVPR 2025 • arXiv:2503.21140
#4966

Cross-Modal 3D Representation with Multi-View Images and Point Clouds

Ziyang Zhou, Pinghui Wang, Zi Liang et al.

CVPR 2025
#4967

Heterogeneous Skeleton-Based Action Representation Learning

Xiaoyan Ma, Jidong Kuang, Hongsong Wang et al.

CVPR 2025 • arXiv:2506.03481
#4968

DeformCL: Learning Deformable Centerline Representation for Vessel Extraction in 3D Medical Image

Ziwei Zhao, Zhixing Zhang, Yuhang Liu et al.

CVPR 2025 • arXiv:2506.05820
#4969

Once-Tuning-Multiple-Variants: Tuning Once and Expanded as Multiple Vision-Language Model Variants

Chong Yu, Tao Chen, Zhongxue Gan

CVPR 2025
#4970

Seeing is Not Believing: Adversarial Natural Object Optimization for Hard-Label 3D Scene Attacks

Daizong Liu, Wei Hu

CVPR 2025
#4971

HomoGen: Enhanced Video Inpainting via Homography Propagation and Diffusion

Ding Ding, Yueming Pan, Ruoyu Feng et al.

CVPR 2025
#4972

Towards Continual Universal Segmentation

Zihan Lin, Zilei Wang, Xu Wang

CVPR 2025
#4973

HiPART: Hierarchical Pose AutoRegressive Transformer for Occluded 3D Human Pose Estimation

Hongwei Zheng, Han Li, Wenrui Dai et al.

CVPR 2025 • arXiv:2503.23331
#4974

Decoupled Motion Expression Video Segmentation

Hao Fang, Runmin Cong, Xiankai Lu et al.

CVPR 2025
#4975

Exploring Contextual Attribute Density in Referring Expression Counting

Zhicheng Wang, Zhiyu Pan, Zhan Peng et al.

CVPR 2025 • arXiv:2503.12460
#4976

Enhancing Vision-Language Compositional Understanding with Multimodal Synthetic Data

Haoxin Li, Boyang Li

CVPR 2025 • arXiv:2503.01167
#4977

Efficient Depth Estimation for Unstable Stereo Camera Systems on AR Glasses

Yongfan Liu, Hyoukjun Kwon

CVPR 2025 • arXiv:2411.10013
#4978

FSHNet: Fully Sparse Hybrid Network for 3D Object Detection

Shuai Liu, Mingyue Cui, Boyang Li et al.

CVPR 2025 • arXiv:2506.03714
#4979

Mixture of Submodules for Domain Adaptive Person Search

Minsu Kim, Seungryong Kim, Kwanghoon Sohn

CVPR 2025
#4980

Unsupervised Discovery of Facial Landmarks and Head Pose

Satyajit Tourani, Siddharth Tourani, Arif Mahmood et al.

CVPR 2025
#4981

InceptionNeXt: When Inception Meets ConvNeXt

Weihao Yu, Pan Zhou, Shuicheng Yan et al.

CVPR 2024 • arXiv:2303.16900
#4982

Dynamic Integration of Task-Specific Adapters for Class Incremental Learning

Jiashuo Li, Shaokun Wang, Bo Qian et al.

CVPR 2025 • arXiv:2409.14983
#4983

DiverseFlow: Sample-Efficient Diverse Mode Coverage in Flows

Mashrur M. Morshed, Vishnu Naresh Boddeti

CVPR 2025 • arXiv:2504.07894
#4984

Test-time Augmentation Improves Efficiency in Conformal Prediction

Divya M Shanmugam, Helen Lu, Swami Sankaranarayanan et al.

CVPR 2025 • arXiv:2505.22764
#4985

GREAT: Geometry-Intention Collaborative Inference for Open-Vocabulary 3D Object Affordance Grounding

Yawen Shao, Wei Zhai, Yuhang Yang et al.

CVPR 2025 • arXiv:2411.19626
#4986

Robotic Visual Instruction

Yanbang Li, ZiYang Gong, Haoyang Li et al.

CVPR 2025 • arXiv:2505.00693
#4987

Learned Binocular-Encoding Optics for RGBD Imaging Using Joint Stereo and Focus Cues

Yuhui Liu, Liangxun Ou, Qiang Fu et al.

CVPR 2025
#4988

Dual Diffusion for Unified Image Generation and Understanding

Zijie Li, Henry Li, Yichun Shi et al.

CVPR 2025 • arXiv:2501.00289
#4989

Learning Compatible Multi-Prize Subnetworks for Asymmetric Retrieval

Yushuai Sun, Zikun Zhou, Dongmei Jiang et al.

CVPR 2025 • arXiv:2504.11879
#4990

Commonsense Video Question Answering through Video-Grounded Entailment Tree Reasoning

Huabin Liu, Filip Ilievski, Cees G. M. Snoek

CVPR 2025 • arXiv:2501.05069
#4991

Opportunistic Single-Photon Time of Flight

Sotiris Nousias, Mian Wei, Howard Xiao et al.

CVPR 2025
#4992

Enduring, Efficient and Robust Trajectory Prediction Attack in Autonomous Driving via Optimization-Driven Multi-Frame Perturbation Framework

Yi Yu, Weizhen Han, Libing Wu et al.

CVPR 2025 • highlight
#4993

Improving Autoregressive Visual Generation with Cluster-Oriented Token Prediction

Teng Hu, Jiangning Zhang, Ran Yi et al.

CVPR 2025 • arXiv:2501.00880
#4994

UNEM: UNrolled Generalized EM for Transductive Few-Shot Learning

Long Zhou, Fereshteh Shakeri, Aymen Sadraoui et al.

CVPR 2025 • arXiv:2412.16739
#4995

Query Efficient Black-Box Visual Prompting with Subspace Learning

Haozhen Zhang, Zhaogeng Liu, Hualin Zhang et al.

CVPR 2025
#4996

Volumetric Surfaces: Representing Fuzzy Geometries with Layered Meshes

Stefano Esposito, Anpei Chen, Christian Reiser et al.

CVPR 2025 • arXiv:2409.02482
#4997

Fingerprinting Denoising Diffusion Probabilistic Models

Huan Teng, Yuhui Quan, Chengyu Wang et al.

CVPR 2025
#4998

MegaSynth: Scaling Up 3D Scene Reconstruction with Synthesized Data

Hanwen Jiang, Zexiang Xu, Desai Xie et al.

CVPR 2025 • arXiv:2412.14166
#4999

Flash-Split: 2D Reflection Removal with Flash Cues and Latent Diffusion Separation

Tianfu Wang, Mingyang Xie, Haoming Cai et al.

CVPR 2025 • arXiv:2501.00637
#5000

AdaDARE-gamma: Balancing Stability and Plasticity in Multi-modal LLMs through Efficient Adaptation

Jingyi Xie, Jintao Yang, Zhunchen Luo et al.

CVPR 2025