Most Cited CVPR "low inference latency" Papers

5,589 papers found

#4801

Revisiting Adversarial Training at Scale

Zeyu Wang, Xianhang Li, Hongru Zhu et al.

CVPR 2024 arXiv:2401.04727

#4802

PersonaBooth: Personalized Text-to-Motion Generation

Boeun Kim, Hea In Jeong, JungHoon Sung et al.

CVPR 2025 arXiv:2503.07390

#4803

G-FARS: Gradient-Field-based Auto-Regressive Sampling for 3D Part Grouping

Junfeng Cheng, Tania Stathaki

CVPR 2024 arXiv:2405.06828

#4804

DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models

Keda Tao, Can Qin, Haoxuan You et al.

CVPR 2025 arXiv:2411.15024

#4805

Make Pixels Dance: High-Dynamic Video Generation

Yan Zeng, Guoqiang Wei, Jiani Zheng et al.

CVPR 2024 arXiv:2311.10982

#4806

Masked AutoDecoder is Effective Multi-Task Vision Generalist

Han Qiu, Jiaxing Huang, Peng Gao et al.

CVPR 2024 arXiv:2403.07692

#4807

Generative Multi-modal Models are Good Class Incremental Learners

Xusheng Cao, Haori Lu, Linlan Huang et al.

CVPR 2024 arXiv:2403.18383

#4808

Deciphering ‘What’ and ‘Where’ Visual Pathways from Spectral Clustering of Layer-Distributed Neural Representations

Xiao Zhang, David Yunis, Michael Maire

CVPR 2024 highlight arXiv:2312.06716

#4809

CRISP: Object Pose and Shape Estimation with Test-Time Adaptation

Jingnan Shi, Rajat Talak, Harry Zhang et al.

CVPR 2025 highlight arXiv:2412.01052

#4810

Yo’Chameleon: Personalized Vision and Language Generation

Thao Nguyen, Krishna Kumar Singh, Jing Shi et al.

CVPR 2025

#4811

LLaMA-Excitor: General Instruction Tuning via Indirect Feature Interaction

Bo Zou, Chao Yang, Yu Qiao et al.

CVPR 2024 arXiv:2404.00913

#4812

EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Language Models

Sijie Cheng, Zhicheng Guo, Jingwen Wu et al.

CVPR 2024 highlight arXiv:2311.15596

#4813

Learning Temporally Consistent Video Depth from Video Diffusion Priors

Jiahao Shao, Yuanbo Yang, Hongyu Zhou et al.

CVPR 2025 arXiv:2406.01493

#4814

Sketch Down the FLOPs: Towards Efficient Networks for Human Sketch

Aneeshan Sain, Subhajit Maity, Pinaki Nath Chowdhury et al.

CVPR 2025 arXiv:2505.23763

#4815

EchoTraffic: Enhancing Traffic Anomaly Understanding with Audio-Visual Insights

Zhenghao Xing, Hao Chen, Binzhu Xie et al.

CVPR 2025

#4816

Probabilistic Speech-Driven 3D Facial Motion Synthesis: New Benchmarks, Methods, and Applications

Karren Yang, Anurag Ranjan, Jen-Hao Rick Chang et al.

CVPR 2024

#4817

From Feature to Gaze: A Generalizable Replacement of Linear Layer for Gaze Estimation

Yiwei Bao, Feng Lu

CVPR 2024 highlight

#4818

OSLoPrompt: Bridging Low-Supervision Challenges and Open-Set Domain Generalization in CLIP

Mohamad Hassan N C, Divyam Gupta, Mainak Singha et al.

CVPR 2025 arXiv:2503.16106

#4819

NC-SDF: Enhancing Indoor Scene Reconstruction Using Neural SDFs with View-Dependent Normal Compensation

Ziyi Chen, Xiaolong Wu, Yu Zhang

CVPR 2024 arXiv:2405.00340

#4820

Temporal Alignment-Free Video Matching for Few-shot Action Recognition

SuBeen Lee, WonJun Moon, Hyun Seok Seong et al.

CVPR 2025 arXiv:2504.05956

#4821

Language Models as Black-Box Optimizers for Vision-Language Models

Shihong Liu, Samuel Yu, Zhiqiu Lin et al.

CVPR 2024 arXiv:2309.05950

#4822

Transferable Structural Sparse Adversarial Attack Via Exact Group Sparsity Training

Di Ming, Peng Ren, Yunlong Wang et al.

CVPR 2024

#4823

IRGS: Inter-Reflective Gaussian Splatting with 2D Gaussian Ray Tracing

Chun Gu, Xiaofei Wei, Zixuan Zeng et al.

CVPR 2025 arXiv:2412.15867

#4824

Practical Solutions to the Relative Pose of Three Calibrated Cameras

Charalambos Tzamos, Viktor Kocur, Yaqing Ding et al.

CVPR 2025 arXiv:2303.16078

#4825

Holistic Autonomous Driving Understanding by Bird’s-Eye-View Injected Multi-Modal Large Models

Xinpeng Ding, Jianhua Han, Hang Xu et al.

CVPR 2024 arXiv:2401.00988

#4826

MultimodalStudio: A Heterogeneous Sensor Dataset and Framework for Neural Rendering across Multiple Imaging Modalities

Federico Lincetto, Gianluca Agresti, Mattia Rossi et al.

CVPR 2025 arXiv:2503.19673

#4827

ASH: Animatable Gaussian Splats for Efficient and Photoreal Human Rendering

Haokai Pang, Heming Zhu, Adam Kortylewski et al.

CVPR 2024 arXiv:2312.05941

#4828

Exploring Timeline Control for Facial Motion Generation

Yifeng Ma, Jinwei Qi, Chaonan Ji et al.

CVPR 2025 arXiv:2505.20861

#4829

Structured 3D Latents for Scalable and Versatile 3D Generation

Jianfeng XIANG, Zelong Lv, Sicheng Xu et al.

CVPR 2025 highlight arXiv:2412.01506

#4830

Equivariant Plug-and-Play Image Reconstruction

Matthieu Terris, Thomas Moreau, Nelly Pustelnik et al.

CVPR 2024 arXiv:2312.01831

#4831

DiffusionAvatars: Deferred Diffusion for High-fidelity 3D Head Avatars

Tobias Kirschstein, Simon Giebenhain, Matthias Nießner

CVPR 2024 arXiv:2311.18635

#4832

Movie Weaver: Tuning-Free Multi-Concept Video Personalization with Anchored Prompts

Feng Liang, Haoyu Ma, Zecheng He et al.

CVPR 2025 arXiv:2502.07802

#4833

Learning Background Prompts to Discover Implicit Knowledge for Open Vocabulary Object Detection

Jiaming Li, Jiacheng Zhang, Jichang Li et al.

CVPR 2024 arXiv:2406.00510

#4834

Enhancing Facial Privacy Protection via Weakening Diffusion Purification

Ali Salar, Qing Liu, Yingli Tian et al.

CVPR 2025 arXiv:2503.10350

#4835

Addressing Background Context Bias in Few-Shot Segmentation through Iterative Modulation

Lanyun Zhu, Tianrun Chen, Jianxiong Yin et al.

CVPR 2024

#4836

Learned Representation-Guided Diffusion Models for Large-Image Generation

Alexandros Graikos, Srikar Yellapragada, Minh-Quan Le et al.

CVPR 2024 arXiv:2312.07330

#4837

AdaCM^2: On Understanding Extremely Long-Term Video with Adaptive Cross-Modality Memory Reduction

Yuanbin Man, Ying Huang, Chengming Zhang et al.

CVPR 2025 highlight arXiv:2411.12593

#4838

OVFoodSeg: Elevating Open-Vocabulary Food Image Segmentation via Image-Informed Textual Representation

Xiongwei Wu, Sicheng Yu, Ee-Peng Lim et al.

CVPR 2024 arXiv:2404.01409

#4839

AVFF: Audio-Visual Feature Fusion for Video Deepfake Detection

Trevine Oorloff, Surya Koppisetti, Nicolo Bonettini et al.

CVPR 2024 arXiv:2406.02951

#4840

CaKDP: Category-aware Knowledge Distillation and Pruning Framework for Lightweight 3D Object Detection

Haonan Zhang, Longjun Liu, Yuqi Huang et al.

CVPR 2024

#4841

Friendly Sharpness-Aware Minimization

Tao Li, Pan Zhou, Zhengbao He et al.

CVPR 2024 arXiv:2403.12350

#4842

CustomListener: Text-guided Responsive Interaction for User-friendly Listening Head Generation

Xi Liu, Ying Guo, Cheng Zhen et al.

CVPR 2024 arXiv:2403.00274

#4843

Brain Decodes Deep Nets

Huzheng Yang, James Gee, Jianbo Shi

CVPR 2024 highlight arXiv:2312.01280

#4844

MoSAR: Monocular Semi-Supervised Model for Avatar Reconstruction using Differentiable Shading

Abdallah Dib, Luiz Gustavo Hafemann, Emeline Got et al.

CVPR 2024 arXiv:2312.13091

#4845

Point2CAD: Reverse Engineering CAD Models from 3D Point Clouds

Yujia Liu, Anton Obukhov, Jan D. Wegner et al.

CVPR 2024 highlight arXiv:2312.04962

#4846

REWIND: Real-Time Egocentric Whole-Body Motion Diffusion with Exemplar-Based Identity Conditioning

Jian Wang, Zhe Cao, Diogo Luvizon et al.

CVPR 2024

#4847

A Unified Framework for Microscopy Defocus Deblur with Multi-Pyramid Transformer and Contrastive Learning

Yuelin Zhang, Pengyu Zheng, Wanquan Yan et al.

CVPR 2024 arXiv:2403.02611

#4848

Structure Matters: Tackling the Semantic Discrepancy in Diffusion Models for Image Inpainting

Haipeng Liu, Yang Wang, Biao Qian et al.

CVPR 2024 arXiv:2403.19898

#4849

Video Depth Anything: Consistent Depth Estimation for Super-Long Videos

Sili Chen, Hengkai Guo, Shengnan Zhu et al.

CVPR 2025 highlight arXiv:2501.12375

#4850

Misalignment-Robust Frequency Distribution Loss for Image Transformation

Zhangkai Ni, Juncheng Wu, Zian Wang et al.

CVPR 2024 arXiv:2402.18192

#4851

Good, Cheap, and Fast: Overfitted Image Compression with Wasserstein Distortion

Jona Ballé, Luca Versari, Emilien Dupont et al.

CVPR 2025 highlight arXiv:2412.00505

#4852

WildlifeMapper: Aerial Image Analysis for Multi-Species Detection and Identification

Satish Kumar, Bowen Zhang, Chandrakanth Gudavalli et al.

CVPR 2024

#4853

SDSTrack: Self-Distillation Symmetric Adapter Learning for Multi-Modal Visual Object Tracking

Xiaojun Hou, Jiazheng Xing, Yijie Qian et al.

CVPR 2024 arXiv:2403.16002

#4854

Consistent Normal Orientation for 3D Point Clouds via Least Squares on Delaunay Graph

Rao Fu, Jianmin Zheng, Liang Yu

CVPR 2025

#4855

BioX-CPath: Biologically-driven Explainable Diagnostics for Multistain IHC Computational Pathology

Amaya Gallagher-Syed, Henry Senior, Omnia Alwazzan et al.

CVPR 2025 arXiv:2503.20880

#4856

SchurVINS: Schur Complement-Based Lightweight Visual Inertial Navigation System

Yunfei Fan, Tianyu Zhao, Guidong Wang

CVPR 2024 arXiv:2312.01616

#4857

MACE: Mass Concept Erasure in Diffusion Models

Shilin Lu, Zilan Wang, Leyang Li et al.

CVPR 2024 arXiv:2403.06135

#4858

DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations

Tianhao Qi, Shancheng Fang, Yanze Wu et al.

CVPR 2024 highlight arXiv:2403.06951

#4859

Descriptor-In-Pixel: Point-Feature Tracking For Pixel Processor Arrays

Laurie Bose, Piotr Dudek, Jianing Chen

CVPR 2025

#4860

Learning Degradation-unaware Representation with Prior-based Latent Transformations for Blind Face Restoration

Lianxin Xie, csbingbing zheng, Wen Xue et al.

CVPR 2024

#4861

360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model

Qian Wang, Weiqi Li, Chong Mou et al.

CVPR 2024 arXiv:2401.06578

#4862

Alpha Invariance: On Inverse Scaling Between Distance and Volume Density in Neural Radiance Fields

Joshua Ahn, Haochen Wang, Raymond A. Yeh et al.

CVPR 2024 arXiv:2404.02155

#4863

Countering Personalized Text-to-Image Generation with Influence Watermarks

Hanwen Liu, Zhicheng Sun, Yadong Mu

CVPR 2024

#4864

Teeth-SEG: An Efficient Instance Segmentation Framework for Orthodontic Treatment based on Multi-Scale Aggregation and Anthropic Prior Knowledge

Bo Zou, Shaofeng Wang, Hao Liu et al.

CVPR 2024

#4865

Interpreting Object-level Foundation Models via Visual Precision Search

Ruoyu Chen, Siyuan Liang, Jingzhi Li et al.

CVPR 2025 highlight arXiv:2411.16198

#4866

T-VSL: Text-Guided Visual Sound Source Localization in Mixtures

Tanvir Mahmud, Yapeng Tian, Diana Marculescu

CVPR 2024 arXiv:2404.01751

#4867

LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes

Xiang Xu, Lingdong Kong, hui shuai et al.

CVPR 2025 arXiv:2501.04004

#4868

TAET: Two-Stage Adversarial Equalization Training on Long-Tailed Distributions

Wang Yu-Hang, Junkang Guo, Aolei Liu et al.

CVPR 2025

#4869

ANIM: Accurate Neural Implicit Model for Human Reconstruction from a single RGB-D Image

Marco Pesavento, Yuanlu Xu, Nikolaos Sarafianos et al.

CVPR 2024 arXiv:2403.10357

#4870

AIM-Fair: Advancing Algorithmic Fairness via Selectively Fine-Tuning Biased Models with Contextual Synthetic Data

Zengqun Zhao, Ziquan Liu, Yu Cao et al.

CVPR 2025 arXiv:2503.05665

#4871

vid-TLDR: Training Free Token Merging for Light-weight Video Transformer

Joonmyung Choi, Sanghyeok Lee, Jaewon Chu et al.

CVPR 2024 arXiv:2403.13347

#4872

Initialization Matters for Adversarial Transfer Learning

Andong Hua, Jindong Gu, Zhiyu Xue et al.

CVPR 2024 arXiv:2312.05716

#4873

RoadSocial: A Diverse VideoQA Dataset and Benchmark for Road Event Understanding from Social Video Narratives

Chirag Parikh, Deepti Rawat, Rakshitha R. T. et al.

CVPR 2025 arXiv:2503.21459

#4874

MindBridge: A Cross-Subject Brain Decoding Framework

Shizun Wang, Songhua Liu, Zhenxiong Tan et al.

CVPR 2024 highlight arXiv:2404.07850

#4875

Loopy-SLAM: Dense Neural SLAM with Loop Closures

Lorenzo Liso, Erik Sandström, Vladimir Yugay et al.

CVPR 2024 arXiv:2402.09944

#4876

Weakly-Supervised Audio-Visual Video Parsing with Prototype-based Pseudo-Labeling

Kranthi Kumar Rachavarapu, Kalyan Ramakrishnan, A. N. Rajagopalan

CVPR 2024

#4877

MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models

Sanjoy Chowdhury, Sayan Nag, Joseph K J et al.

CVPR 2024 highlight arXiv:2406.04673

#4878

InstaGen: Enhancing Object Detection by Training on Synthetic Dataset

Chengjian Feng, Yujie Zhong, Zequn Jie et al.

CVPR 2024 arXiv:2402.05937

#4879

Narrative Action Evaluation with Prompt-Guided Multimodal Interaction

Shiyi Zhang, Sule Bai, Guangyi Chen et al.

CVPR 2024 arXiv:2404.14471

#4880

DeconfuseTrack: Dealing with Confusion for Multi-Object Tracking

Cheng Huang, Shoudong Han, Mengyu He et al.

CVPR 2024

#4881

ChatPose: Chatting about 3D Human Pose

Yao Feng, Jing Lin, Sai Kumar Dwivedi et al.

CVPR 2024 arXiv:2311.18836

#4882

Modality-agnostic Domain Generalizable Medical Image Segmentation by Multi-Frequency in Multi-Scale Attention

Ju-Hyeon Nam, Nur Suriza Syazwany, Su Jung Kim et al.

CVPR 2024 arXiv:2405.06284

#4883

NC-TTT: A Noise Contrastive Approach for Test-Time Training

David OSOWIECHI, Gustavo Vargas Hakim, Mehrdad Noori et al.

CVPR 2024 highlight

#4884

JRDB-PanoTrack: An Open-world Panoptic Segmentation and Tracking Robotic Dataset in Crowded Human Environments

Duy Tho Le, Chenhui Gou, Stavya Datta et al.

CVPR 2024 arXiv:2404.01686

#4885

Perturbing Attention Gives You More Bang for the Buck: Subtle Imaging Perturbations That Efficiently Fool Customized Diffusion Models

Jingyao Xu, Yuetong Lu, Yandong Li et al.

CVPR 2024 arXiv:2404.15081

#4886

ESCAPE: Encoding Super-keypoints for Category-Agnostic Pose Estimation

Khoi D Nguyen, Chen Li, Gim Hee Lee

CVPR 2024

#4887

Task-Specific Gradient Adaptation for Few-Shot One-Class Classification

Yunlong Li, Xiabi Liu, Liyuan Pan et al.

CVPR 2025

#4888

Minimal Perspective Autocalibration

Andrea Porfiri Dal Cin, Timothy Duff, Luca Magri et al.

CVPR 2024 arXiv:2405.05605

#4889

ReGenNet: Towards Human Action-Reaction Synthesis

Liang Xu, Yizhou Zhou, Yichao Yan et al.

CVPR 2024 arXiv:2403.11882

#4890

RGBD Objects in the Wild: Scaling Real-World 3D Object Learning from RGB-D Videos

Hongchi Xia, Yang Fu, Sifei Liu et al.

CVPR 2024 arXiv:2401.12592

#4891

Aligning and Prompting Everything All at Once for Universal Visual Perception

Yunhang Shen, Chaoyou Fu, Peixian Chen et al.

CVPR 2024 arXiv:2312.02153

#4892

ZONE: Zero-Shot Instruction-Guided Local Editing

Shanglin Li, Bohan Zeng, Yutang Feng et al.

CVPR 2024 arXiv:2312.16794

#4893

Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption

Buzhen Huang, Chen Li, Chongyang Xu et al.

CVPR 2024 arXiv:2404.11291

#4894

Label Propagation for Zero-shot Classification with Vision-Language Models

Vladan Stojnić, Yannis Kalantidis, Giorgos Tolias

CVPR 2024 arXiv:2404.04072

#4895

IQ-VFI: Implicit Quadratic Motion Estimation for Video Frame Interpolation

Mengshun Hu, Kui Jiang, Zhihang Zhong et al.

CVPR 2024

#4896

Part-aware Unified Representation of Language and Skeleton for Zero-shot Action Recognition

Anqi Zhu, Qiuhong Ke, Mingming Gong et al.

CVPR 2024 arXiv:2406.13327

#4897

Diffusion-ES: Gradient-free Planning with Diffusion for Autonomous and Instruction-guided Driving

Brian Yang, Huangyuan Su, Nikolaos Gkanatsios et al.

CVPR 2024

#4898

Parametric Point Cloud Completion for Polygonal Surface Reconstruction

Zhaiyu Chen, Yuqing Wang, Liangliang Nan et al.

CVPR 2025 arXiv:2503.08363

#4899

Structured Model Probing: Empowering Efficient Transfer Learning by Structured Regularization

Zhi-Fan Wu, Chaojie Mao, Xue Wang et al.

CVPR 2024

#4900

TinyFusion: Diffusion Transformers Learned Shallow

Gongfan Fang, Kunjun Li, Xinyin Ma et al.

CVPR 2025 highlight arXiv:2412.01199

#4901

Poly-Autoregressive Prediction for Modeling Interactions

Neerja Thakkar, Tara Sadjadpour, Jathushan Rajasegaran et al.

CVPR 2025 arXiv:2502.08646

#4902

CrossSDF: 3D Reconstruction of Thin Structures From Cross-Sections

Thomas Walker, Salvatore Esposito, Daniel Rebain et al.

CVPR 2025 arXiv:2412.04120

#4903

Decentralized Diffusion Models

David McAllister, Matthew Tancik, Jiaming Song et al.

CVPR 2025 arXiv:2501.05450

#4904

DiSciPLE: Learning Interpretable Programs for Scientific Visual Discovery

Utkarsh Mall, Cheng Perng Phoo, Mia Chiquier et al.

CVPR 2025 arXiv:2502.10060

#4905

CRKD: Enhanced Camera-Radar Object Detection with Cross-modality Knowledge Distillation

Lingjun Zhao, Jingyu Song, Katherine Skinner

CVPR 2024 arXiv:2403.19104

#4906

Investigating the Role of Weight Decay in Enhancing Nonconvex SGD

Tao Sun, Yuhao Huang, Li Shen et al.

CVPR 2025

#4907

UVGS: Reimagining Unstructured 3D Gaussian Splatting using UV Mapping

Aashish Rai, Dilin Wang, Mihir Jain et al.

CVPR 2025 arXiv:2502.01846

#4908

Z-Magic: Zero-shot Multiple Attributes Guided Image Creator

Yingying Deng, Xiangyu He, Fan Tang et al.

CVPR 2025 arXiv:2503.12124

#4909

CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models

Felix Taubner, Ruihang Zhang, Mathieu Tuli et al.

CVPR 2025 arXiv:2412.12093

#4910

MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction

Jingcheng Ni, Yuxin Guo, Yichen Liu et al.

CVPR 2025 arXiv:2502.11663

#4911

Towards Understanding Cross and Self-Attention in Stable Diffusion for Text-Guided Image Editing

Bingyan Liu, Chengyu Wang, Tingfeng Cao et al.

CVPR 2024 arXiv:2403.03431

#4912

ALIEN: Implicit Neural Representations for Human Motion Prediction under Arbitrary Latency

Dong Wei, Xiaoning Sun, Xizhan Gao et al.

CVPR 2025 highlight

#4913

Two by Two: Learning Multi-Task Pairwise Objects Assembly for Generalizable Robot Manipulation

Yu Qi, Yuanchen Ju, Tianming Wei et al.

CVPR 2025 arXiv:2504.06961

#4914

GIF: Generative Inspiration for Face Recognition at Scale

Mohammad Saadabadi Saadabadi, Sahar Rahimi Malakshan, Ali Dabouei et al.

CVPR 2025

#4915

TULIP: Transformer for Upsampling of LiDAR Point Clouds

Bin Yang, Patrick Pfreundschuh, Roland Siegwart et al.

CVPR 2024 arXiv:2312.06733

#4916

AVQACL: A Novel Benchmark for Audio-Visual Question Answering Continual Learning

Kaixuan Wu, Xinde Li, Xinglin Li et al.

CVPR 2025

#4917

Science-T2I: Addressing Scientific Illusions in Image Synthesis

Jialuo Li, Wenhao Chai, XINGYU FU et al.

CVPR 2025 arXiv:2504.13129

#4918

Neuron: Learning Context-Aware Evolving Representations for Zero-Shot Skeleton Action Recognition

Yang Chen, Jingcai Guo, Song Guo et al.

CVPR 2025 arXiv:2411.11288

#4919

Navigating Image Restoration with VAR’s Distribution Alignment Prior

Siyang Wang, Naishan Zheng, Jie Huang et al.

CVPR 2025 arXiv:2412.21063

#4920

Incremental Residual Concept Bottleneck Models

Chenming Shang, Shiji Zhou, Hengyuan Zhang et al.

CVPR 2024 arXiv:2404.08978

#4921

HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation

Hermann Kumbong, Xian Liu, Tsung-Yi Lin et al.

CVPR 2025 arXiv:2506.04421

#4922

Minding Fuzzy Regions: A Data-driven Alternating Learning Paradigm for Stable Lesion Segmentation

Lexin Fang, Yunyang Xu, Xiang Ma et al.

CVPR 2025 arXiv:2503.11140

#4923

LumiNet: Latent Intrinsics Meets Diffusion Models for Indoor Scene Relighting

Xiaoyan Xing, Konrad Groh, Sezer Karaoglu et al.

CVPR 2025 arXiv:2412.00177

#4924

ProjAttacker: A Configurable Physical Adversarial Attack for Face Recognition via Projector

Yuanwei Liu, Hui Wei, Chengyu Jia et al.

CVPR 2025

#4925

Efficient Dataset Distillation via Minimax Diffusion

Jianyang Gu, Saeed Vahidian, Vyacheslav Kungurtsev et al.

CVPR 2024 arXiv:2311.15529

#4926

DiffusionSfM: Predicting Structure and Motion via Ray Origin and Endpoint Diffusion

Qitao Zhao, Amy Lin, Jeff Tan et al.

CVPR 2025 arXiv:2505.05473

#4927

DUSt3R: Geometric 3D Vision Made Easy

Shuzhe Wang, Vincent Leroy, Yohann Cabon et al.

CVPR 2024 arXiv:2312.14132

#4928

Towards Improved Text-Aligned Codebook Learning: Multi-Hierarchical Codebook-Text Alignment with Long Text

Guotao Liang, Baoquan Zhang, Zhiyuan Wen et al.

CVPR 2025 highlight arXiv:2503.01261

#4929

StyleMaster: Stylize Your Video with Artistic Generation and Translation

Zixuan Ye, Huijuan Huang, Xintao Wang et al.

CVPR 2025 arXiv:2412.07744

#4930

Divide and Conquer: Heterogeneous Noise Integration for Diffusion-based Adversarial Purification

Gaozheng Pei, Shaojie Lyu, Gong Chen et al.

CVPR 2025 arXiv:2503.01407

#4931

Foley-Flow: Coordinated Video-to-Audio Generation with Masked Audio-Visual Alignment and Dynamic Conditional Flows

Shentong Mo, Yibing Song

CVPR 2025

#4932

DL2G: Degradation-guided Local-to-Global Restoration for Eyeglass Reflection Removal

Yizhilv, Xiao Lu, Hong Ding et al.

CVPR 2025

#4933

Efficient Decoupled Feature 3D Gaussian Splatting via Hierarchical Compression

Zhenqi Dai, Ting Liu, Yanning Zhang

CVPR 2025

#4934

Viewpoint Rosetta Stone: Unlocking Unpaired Ego-Exo Videos for View-invariant Representation Learning

Mi Luo, Zihui Xue, Alex Dimakis et al.

CVPR 2025

#4935

BlenderGym: Benchmarking Foundational Model Systems for Graphics Editing

Yunqi Gu, Ian Huang, Jihyeon Je et al.

CVPR 2025 highlight arXiv:2504.01786

#4936

NightCC: Nighttime Color Constancy via Adaptive Channel Masking

Shuwei Li, Robby T. Tan

CVPR 2024

#4937

AdMiT: Adaptive Multi-Source Tuning in Dynamic Environments

Xiangyu Chang, Fahim Faisal Niloy, Sk Miraj Ahmed et al.

CVPR 2025

#4938

Accelerating Diffusion Transformer via Increment-Calibrated Caching with Channel-Aware Singular Value Decomposition

Zhiyuan Chen, Keyi Li, Yifan Jia et al.

CVPR 2025 arXiv:2505.05829

#4939

Hearing Hands: Generating Sounds from Physical Interactions in 3D Scenes

Yiming Dou, Wonseok Oh, Yuqing Luo et al.

CVPR 2025 arXiv:2506.09989

#4940

Fortifying Federated Learning Towards Trustworthiness via Auditable Data Valuation and Verifiable Client Contribution

Naveen Kumar Kummari, Ranjeet Ranjan Jha, Krishna Mohan Chalavadi et al.

CVPR 2025

#4941

RoomPainter: View-Integrated Diffusion for Consistent Indoor Scene Texturing

Zhipeng Huang, Wangbo Yu, Xinhua Cheng et al.

CVPR 2025 arXiv:2412.16778

#4942

DIV-FF: Dynamic Image-Video Feature Fields For Environment Understanding in Egocentric Videos

Lorenzo Mur-Labadia, Jose J. Guerrero, Ruben Martinez-Cantin

CVPR 2025 highlight arXiv:2503.08344

#4943

A Simple Data Augmentation for Feature Distribution Skewed Federated Learning

Yunlu Yan, Huazhu Fu, Yuexiang Li et al.

CVPR 2025 arXiv:2306.09363

#4944

Enhancing Video Super-Resolution via Implicit Resampling-based Alignment

Kai Xu, Ziwei Yu, Xin Wang et al.

CVPR 2024 highlight arXiv:2305.00163

#4945

Domain Generalization in CLIP via Learning with Diverse Text Prompts

Changsong Wen, Zelin Peng, Yu Huang et al.

CVPR 2025

#4946

Alias-Free Latent Diffusion Models: Improving Fractional Shift Equivariance of Diffusion Latent Space

Yifan Zhou, Zeqi Xiao, Shuai Yang et al.

CVPR 2025 arXiv:2503.09419

#4947

Weakly Supervised Contrastive Adversarial Training for Learning Robust Features from Semi-supervised Data

Lilin Zhang, Chengpei Wu, Ning Yang

CVPR 2025 arXiv:2503.11032

#4948

Graph-Embedded Structure-Aware Perceptual Hashing for Neural Network Protection and Piracy Detection

Ruiheng Liu, Haozhe Chen, Boyao Zhao et al.

CVPR 2025

#4949

Beyond Local Sharpness: Communication-Efficient Global Sharpness-aware Minimization for Federated Learning

Debora Caldarola, Pietro Cagnasso, Barbara Caputo et al.

CVPR 2025 arXiv:2412.03752

#4950

Learning Physics From Video: Unsupervised Physical Parameter Estimation for Continuous Dynamical Systems

Alejandro Castañeda Garcia, Jan Warchocki, Jan van Gemert et al.

CVPR 2025 arXiv:2410.01376

#4951

Detect-and-Guide: Self-regulation of Diffusion Models for Safe Text-to-Image Generation via Guideline Token Optimization

Feifei Li, Mi Zhang, Yiming Sun et al.

CVPR 2025 arXiv:2503.15197

#4952

Teller: Real-Time Streaming Audio-Driven Portrait Animation with Autoregressive Motion Generation

Dingcheng Zhen, Shunshun Yin, Shiyang Qin et al.

CVPR 2025 arXiv:2503.18429

#4953

AnyMap: Learning a General Camera Model for Structure-from-Motion with Unknown Distortion in Dynamic Scenes

Andrea Porfiri Dal Cin, Georgi Dikov, Jihong Ju et al.

CVPR 2025

#4954

K-LoRA: Unlocking Training-Free Fusion of Any Subject and Style LoRAs

Ziheng Ouyang, Zhen Li, Qibin Hou

CVPR 2025 arXiv:2502.18461

#4955

Seeking Consistent Flat Minima for Better Domain Generalization via Refining Loss Landscapes

Aodi Li, Liansheng Zhuang, Xiao Long et al.

CVPR 2025 arXiv:2412.13573

#4956

CarPlanner: Consistent Auto-regressive Trajectory Planning for Large-Scale Reinforcement Learning in Autonomous Driving

Dongkun Zhang, Jiaming Liang, Ke Guo et al.

CVPR 2025 arXiv:2502.19908

#4957

FASTer: Focal token Acquiring-and-Scaling Transformer for Long-term 3D Object Detection

Chenxu Dang, Pei An, Xinmin Zhang et al.

CVPR 2025 arXiv:2503.01899

#4958

UCM-VeID V2: A Richer Dataset and A Pre-training Method for UAV Cross-Modality Vehicle Re-Identification

Xingyue Liu, Jiahao Qi, Chen Chen et al.

CVPR 2025

#4959

Unboxed: Geometrically and Temporally Consistent Video Outpainting

Zhongrui Yu, Martina Megaro-Boldini, Robert Sumner et al.

CVPR 2025

#4960

Less is More: Efficient Model Merging with Binary Task Switch

Biqing Qi, Fangyuan Li, Zhen Wang et al.

CVPR 2025 highlight arXiv:2412.00054

#4961

Adversarial Text to Continuous Image Generation

Kilichbek Haydarov, Aashiq Muhamed, Xiaoqian Shen et al.

CVPR 2024

#4962

Visual Lexicon: Rich Image Features in Language Space

XuDong Wang, Xingyi Zhou, Alireza Fathi et al.

CVPR 2025 arXiv:2412.06774

#4963

Seeing Speech and Sound: Distinguishing and Locating Audio Sources in Visual Scenes

Hyeonggon Ryu, Seongyu Kim, Joon Chung et al.

CVPR 2025

#4964

Continual SFT Matches Multimodal RLHF with Negative Supervision

Ke Zhu, Yu Wang, Yanpeng Sun et al.

CVPR 2025 arXiv:2411.14797

#4965

Recurrent Feature Mining and Keypoint Mixup Padding for Category-Agnostic Pose Estimation

Junjie Chen, Weilong Chen, Yifan Zuo et al.

CVPR 2025 arXiv:2503.21140

#4966

Cross-Modal 3D Representation with Multi-View Images and Point Clouds

Ziyang Zhou, Pinghui Wang, Zi Liang et al.

CVPR 2025

#4967

Heterogeneous Skeleton-Based Action Representation Learning

Xiaoyan Ma, jidong kuang, Hongsong Wang et al.

CVPR 2025 arXiv:2506.03481

#4968

DeformCL: Learning Deformable Centerline Representation for Vessel Extraction in 3D Medical Image

Ziwei Zhao, Zhixing Zhang, Yuhang Liu et al.

CVPR 2025 arXiv:2506.05820

#4969

Once-Tuning-Multiple-Variants: Tuning Once and Expanded as Multiple Vision-Language Model Variants

Chong Yu, Tao Chen, Zhongxue Gan

CVPR 2025

#4970

Seeing is Not Believing: Adversarial Natural Object Optimization for Hard-Label 3D Scene Attacks

Daizong Liu, Wei Hu

CVPR 2025

#4971

HomoGen: Enhanced Video Inpainting via Homography Propagation and Diffusion

Ding Ding, Yueming Pan, Ruoyu Feng et al.

CVPR 2025

#4972

Towards Continual Universal Segmentation

Zihan Lin, Zilei Wang, Xu Wang

CVPR 2025

#4973

HiPART: Hierarchical Pose AutoRegressive Transformer for Occluded 3D Human Pose Estimation

Hongwei Zheng, Han Li, Wenrui Dai et al.

CVPR 2025 arXiv:2503.23331

#4974

Decoupled Motion Expression Video Segmentation

Hao Fang, Runmin Cong, Xiankai Lu et al.

CVPR 2025

#4975

Exploring Contextual Attribute Density in Referring Expression Counting

Zhicheng Wang, Zhiyu Pan, Zhan Peng et al.

CVPR 2025 arXiv:2503.12460

#4976

Enhancing Vision-Language Compositional Understanding with Multimodal Synthetic Data

Haoxin Li, Boyang Li

CVPR 2025 arXiv:2503.01167

#4977

Efficient Depth Estimation for Unstable Stereo Camera Systems on AR Glasses

Yongfan Liu, Hyoukjun Kwon

CVPR 2025 arXiv:2411.10013

#4978

FSHNet: Fully Sparse Hybrid Network for 3D Object Detection

Shuai Liu, Mingyue Cui, Boyang Li et al.

CVPR 2025 arXiv:2506.03714

#4979

Mixture of Submodules for Domain Adaptive Person Search

Minsu Kim, Seungryong Kim, Kwanghoon Sohn

CVPR 2025

#4980

Unsupervised Discovery of Facial Landmarks and Head Pose

Satyajit Tourani, Siddharth Tourani, Arif Mahmood et al.

CVPR 2025

#4981

InceptionNeXt: When Inception Meets ConvNeXt

Weihao Yu, Pan Zhou, Shuicheng Yan et al.

CVPR 2024 arXiv:2303.16900

#4982

Dynamic Integration of Task-Specific Adapters for Class Incremental Learning

Jiashuo Li, Shaokun Wang, Bo Qian et al.

CVPR 2025 arXiv:2409.14983

#4983

DiverseFlow: Sample-Efficient Diverse Mode Coverage in Flows

Mashrur M. Morshed, Vishnu Naresh Boddeti

CVPR 2025 arXiv:2504.07894

#4984

Test-time Augmentation Improves Efficiency in Conformal Prediction

Divya M Shanmugam, Helen Lu, Swami Sankaranarayanan et al.

CVPR 2025 arXiv:2505.22764

#4985

GREAT: Geometry-Intention Collaborative Inference for Open-Vocabulary 3D Object Affordance Grounding

Yawen Shao, Wei Zhai, Yuhang Yang et al.

CVPR 2025 arXiv:2411.19626

#4986

Robotic Visual Instruction

Yanbang Li, ZiYang Gong, Haoyang Li et al.

CVPR 2025 arXiv:2505.00693

#4987

Learned Binocular-Encoding Optics for RGBD Imaging Using Joint Stereo and Focus Cues

Yuhui Liu, Liangxun Ou, Qiang Fu et al.

CVPR 2025

#4988

Dual Diffusion for Unified Image Generation and Understanding

Zijie Li, Henry Li, Yichun Shi et al.

CVPR 2025 arXiv:2501.00289

#4989

Learning Compatible Multi-Prize Subnetworks for Asymmetric Retrieval

Yushuai Sun, Zikun Zhou, Dongmei Jiang et al.

CVPR 2025 arXiv:2504.11879

#4990

Commonsense Video Question Answering through Video-Grounded Entailment Tree Reasoning

Huabin Liu, Filip Ilievski, Cees G. M. Snoek

CVPR 2025 arXiv:2501.05069

#4991

Opportunistic Single-Photon Time of Flight

Sotiris Nousias, Mian Wei, Howard Xiao et al.

CVPR 2025

#4992

Enduring, Efficient and Robust Trajectory Prediction Attack in Autonomous Driving via Optimization-Driven Multi-Frame Perturbation Framework

Yi Yu, Weizhen Han, Libing Wu et al.

CVPR 2025 highlight

#4993

Improving Autoregressive Visual Generation with Cluster-Oriented Token Prediction

Teng Hu, Jiangning Zhang, Ran Yi et al.

CVPR 2025 arXiv:2501.00880

#4994

UNEM: UNrolled Generalized EM for Transductive Few-Shot Learning

Long Zhou, Fereshteh Shakeri, Aymen Sadraoui et al.

CVPR 2025 arXiv:2412.16739

#4995

Query Efficient Black-Box Visual Prompting with Subspace Learning

Haozhen Zhang, Zhaogeng Liu, Hualin Zhang et al.

CVPR 2025

#4996

Volumetric Surfaces: Representing Fuzzy Geometries with Layered Meshes

Stefano Esposito, Anpei Chen, Christian Reiser et al.

CVPR 2025 arXiv:2409.02482

#4997

Fingerprinting Denoising Diffusion Probabilistic Models

Huan Teng, Yuhui Quan, Chengyu Wang et al.

CVPR 2025

#4998

MegaSynth: Scaling Up 3D Scene Reconstruction with Synthesized Data

Hanwen Jiang, Zexiang Xu, Desai Xie et al.

CVPR 2025 arXiv:2412.14166

#4999

Flash-Split: 2D Reflection Removal with Flash Cues and Latent Diffusion Separation

Tianfu Wang, Mingyang Xie, Haoming Cai et al.

CVPR 2025 arXiv:2501.00637

#5000

AdaDARE-gamma: Balancing Stability and Plasticity in Multi-modal LLMs through Efficient Adaptation

Jingyi Xie, Jintao Yang, Zhunchen Luo et al.

CVPR 2025
