Most Cited CVPR "anomaly score refinement" Papers

5,589 papers found • Page 25 of 28

#4801

PersonaBooth: Personalized Text-to-Motion Generation

Boeun Kim, Hea In Jeong, JungHoon Sung et al.

CVPR 2025posterarXiv:2503.07390
#4802

Make Pixels Dance: High-Dynamic Video Generation

Yan Zeng, Guoqiang Wei, Jiani Zheng et al.

CVPR 2024posterarXiv:2311.10982
#4803

Masked AutoDecoder is Effective Multi-Task Vision Generalist

Han Qiu, Jiaxing Huang, Peng Gao et al.

CVPR 2024posterarXiv:2403.07692
#4804

DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models

Keda Tao, Can Qin, Haoxuan You et al.

CVPR 2025posterarXiv:2411.15024
#4805

Generative Multi-modal Models are Good Class Incremental Learners

Xusheng Cao, Haori Lu, Linlan Huang et al.

CVPR 2024poster
#4806

Deciphering ‘What’ and ‘Where’ Visual Pathways from Spectral Clustering of Layer-Distributed Neural Representations

Xiao Zhang, David Yunis, Michael Maire

CVPR 2024highlightarXiv:2312.06716
#4807

CRISP: Object Pose and Shape Estimation with Test-Time Adaptation

Jingnan Shi, Rajat Talak, Harry Zhang et al.

CVPR 2025highlightarXiv:2412.01052
#4808

LLaMA-Excitor: General Instruction Tuning via Indirect Feature Interaction

Bo Zou, Chao Yang, Yu Qiao et al.

CVPR 2024posterarXiv:2404.00913
#4809

EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Language Models

Sijie Cheng, Zhicheng Guo, Jingwen Wu et al.

CVPR 2024highlightarXiv:2311.15596
#4810

Yo’Chameleon: Personalized Vision and Language Generation

Thao Nguyen, Krishna Kumar Singh, Jing Shi et al.

CVPR 2025poster
#4811

Learning Temporally Consistent Video Depth from Video Diffusion Priors

Jiahao Shao, Yuanbo Yang, Hongyu Zhou et al.

CVPR 2025posterarXiv:2406.01493
#4812

Sketch Down the FLOPs: Towards Efficient Networks for Human Sketch

Aneeshan Sain, Subhajit Maity, Pinaki Nath Chowdhury et al.

CVPR 2025posterarXiv:2505.23763
#4813

Probabilistic Speech-Driven 3D Facial Motion Synthesis: New Benchmarks Methods and Applications

Karren Yang, Anurag Ranjan, Jen-Hao Rick Chang et al.

CVPR 2024poster
#4814

From Feature to Gaze: A Generalizable Replacement of Linear Layer for Gaze Estimation

Yiwei Bao, Feng Lu

CVPR 2024highlight
#4815

EchoTraffic: Enhancing Traffic Anomaly Understanding with Audio-Visual Insights

Zhenghao Xing, Hao Chen, Binzhu Xie et al.

CVPR 2025poster
#4816

NC-SDF: Enhancing Indoor Scene Reconstruction Using Neural SDFs with View-Dependent Normal Compensation

Ziyi Chen, Xiaolong Wu, Yu Zhang

CVPR 2024posterarXiv:2405.00340
#4817

OSLoPrompt: Bridging Low-Supervision Challenges and Open-Set Domain Generalization in CLIP

Mohamad Hassan N C, Divyam Gupta, Mainak Singha et al.

CVPR 2025posterarXiv:2503.16106
#4818

Language Models as Black-Box Optimizers for Vision-Language Models

Shihong Liu, Samuel Yu, Zhiqiu Lin et al.

CVPR 2024posterarXiv:2309.05950
#4819

Transferable Structural Sparse Adversarial Attack Via Exact Group Sparsity Training

Di Ming, Peng Ren, Yunlong Wang et al.

CVPR 2024poster
#4820

Temporal Alignment-Free Video Matching for Few-shot Action Recognition

SuBeen Lee, WonJun Moon, Hyun Seok Seong et al.

CVPR 2025posterarXiv:2504.05956
#4821

IRGS: Inter-Reflective Gaussian Splatting with 2D Gaussian Ray Tracing

Chun Gu, Xiaofei Wei, Zixuan Zeng et al.

CVPR 2025posterarXiv:2412.15867
#4822

Holistic Autonomous Driving Understanding by Bird’s-Eye-View Injected Multi-Modal Large Models

Xinpeng Ding, Jianhua Han, Hang Xu et al.

CVPR 2024posterarXiv:2401.00988
#4823

Practical Solutions to the Relative Pose of Three Calibrated Cameras

Charalambos Tzamos, Viktor Kocur, Yaqing Ding et al.

CVPR 2025posterarXiv:2303.16078
#4824

ASH: Animatable Gaussian Splats for Efficient and Photoreal Human Rendering

Haokai Pang, Heming Zhu, Adam Kortylewski et al.

CVPR 2024posterarXiv:2312.05941
#4825

MultimodalStudio: A Heterogeneous Sensor Dataset and Framework for Neural Rendering across Multiple Imaging Modalities

Federico Lincetto, Gianluca Agresti, Mattia Rossi et al.

CVPR 2025posterarXiv:2503.19673
#4826

Exploring Timeline Control for Facial Motion Generation

Yifeng Ma, Jinwei Qi, Chaonan Ji et al.

CVPR 2025posterarXiv:2505.20861
#4827

Equivariant Plug-and-Play Image Reconstruction

Matthieu Terris, Thomas Moreau, Nelly Pustelnik et al.

CVPR 2024posterarXiv:2312.01831
#4828

DiffusionAvatars: Deferred Diffusion for High-fidelity 3D Head Avatars

Tobias Kirschstein, Simon Giebenhain, Matthias Nießner

CVPR 2024posterarXiv:2311.18635
#4829

Structured 3D Latents for Scalable and Versatile 3D Generation

Jianfeng XIANG, Zelong Lv, Sicheng Xu et al.

CVPR 2025highlightarXiv:2412.01506
#4830

Learning Background Prompts to Discover Implicit Knowledge for Open Vocabulary Object Detection

Jiaming Li, Jiacheng Zhang, Jichang Li et al.

CVPR 2024posterarXiv:2406.00510
#4831

Movie Weaver: Tuning-Free Multi-Concept Video Personalization with Anchored Prompts

Feng Liang, Haoyu Ma, Zecheng He et al.

CVPR 2025posterarXiv:2502.07802
#4832

Addressing Background Context Bias in Few-Shot Segmentation through Iterative Modulation

Lanyun Zhu, Tianrun Chen, Jianxiong Yin et al.

CVPR 2024poster
#4833

Learned Representation-Guided Diffusion Models for Large-Image Generation

Alexandros Graikos, Srikar Yellapragada, Minh-Quan Le et al.

CVPR 2024posterarXiv:2312.07330
#4834

Enhancing Facial Privacy Protection via Weakening Diffusion Purification

Ali Salar, Qing Liu, Yingli Tian et al.

CVPR 2025posterarXiv:2503.10350
#4835

OVFoodSeg: Elevating Open-Vocabulary Food Image Segmentation via Image-Informed Textual Representation

Xiongwei Wu, Sicheng Yu, Ee-Peng Lim et al.

CVPR 2024posterarXiv:2404.01409
#4836

AVFF: Audio-Visual Feature Fusion for Video Deepfake Detection

Trevine Oorloff, Surya Koppisetti, Nicolo Bonettini et al.

CVPR 2024posterarXiv:2406.02951
#4837

CaKDP: Category-aware Knowledge Distillation and Pruning Framework for Lightweight 3D Object Detection

Haonan Zhang, Longjun Liu, Yuqi Huang et al.

CVPR 2024poster
#4838

Friendly Sharpness-Aware Minimization

Tao Li, Pan Zhou, Zhengbao He et al.

CVPR 2024posterarXiv:2403.12350
#4839

CustomListener: Text-guided Responsive Interaction for User-friendly Listening Head Generation

Xi Liu, Ying Guo, Cheng Zhen et al.

CVPR 2024posterarXiv:2403.00274
#4840

Brain Decodes Deep Nets

Huzheng Yang, James Gee, Jianbo Shi

CVPR 2024highlightarXiv:2312.01280
#4841

MoSAR: Monocular Semi-Supervised Model for Avatar Reconstruction using Differentiable Shading

Abdallah Dib, Luiz Gustavo Hafemann, Emeline Got et al.

CVPR 2024posterarXiv:2312.13091
#4842

Point2CAD: Reverse Engineering CAD Models from 3D Point Clouds

Yujia Liu, Anton Obukhov, Jan D. Wegner et al.

CVPR 2024highlightarXiv:2312.04962
#4843

REWIND: Real-Time Egocentric Whole-Body Motion Diffusion with Exemplar-Based Identity Conditioning

Jian Wang, Zhe Cao, Diogo Luvizon et al.

CVPR 2024poster
#4844

A Unified Framework for Microscopy Defocus Deblur with Multi-Pyramid Transformer and Contrastive Learning

Yuelin Zhang, Pengyu Zheng, Wanquan Yan et al.

CVPR 2024posterarXiv:2403.02611
#4845

Structure Matters: Tackling the Semantic Discrepancy in Diffusion Models for Image Inpainting

Haipeng Liu, Yang Wang, Biao Qian et al.

CVPR 2024posterarXiv:2403.19898
#4846

AdaCM^2: On Understanding Extremely Long-Term Video with Adaptive Cross-Modality Memory Reduction

Yuanbin Man, Ying Huang, Chengming Zhang et al.

CVPR 2025highlightarXiv:2411.12593
#4847

Misalignment-Robust Frequency Distribution Loss for Image Transformation

Zhangkai Ni, Juncheng Wu, Zian Wang et al.

CVPR 2024posterarXiv:2402.18192
#4848

Video Depth Anything: Consistent Depth Estimation for Super-Long Videos

Sili Chen, Hengkai Guo, Shengnan Zhu et al.

CVPR 2025highlightarXiv:2501.12375
#4849

WildlifeMapper: Aerial Image Analysis for Multi-Species Detection and Identification

Satish Kumar, Bowen Zhang, Chandrakanth Gudavalli et al.

CVPR 2024poster
#4850

Good, Cheap, and Fast: Overfitted Image Compression with Wasserstein Distortion

Jona Ballé, Luca Versari, Emilien Dupont et al.

CVPR 2025highlightarXiv:2412.00505
#4851

SDSTrack: Self-Distillation Symmetric Adapter Learning for Multi-Modal Visual Object Tracking

Xiaojun Hou, Jiazheng Xing, Yijie Qian et al.

CVPR 2024posterarXiv:2403.16002
#4852

Consistent Normal Orientation for 3D Point Clouds via Least Squares on Delaunay Graph

Rao Fu, Jianmin Zheng, Liang Yu

CVPR 2025poster
#4853

SchurVINS: Schur Complement-Based Lightweight Visual Inertial Navigation System

Yunfei Fan, Tianyu Zhao, Guidong Wang

CVPR 2024posterarXiv:2312.01616
#4854

BioX-CPath: Biologically-driven Explainable Diagnostics for Multistain IHC Computational Pathology

Amaya Gallagher-Syed, Henry Senior, Omnia Alwazzan et al.

CVPR 2025posterarXiv:2503.20880
#4855

MACE: Mass Concept Erasure in Diffusion Models

Shilin Lu, Zilan Wang, Leyang Li et al.

CVPR 2024posterarXiv:2403.06135
#4856

DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations

Tianhao Qi, Shancheng Fang, Yanze Wu et al.

CVPR 2024highlightarXiv:2403.06951
#4857

Learning Degradation-unaware Representation with Prior-based Latent Transformations for Blind Face Restoration

Lianxin Xie, csbingbing zheng, Wen Xue et al.

CVPR 2024poster
#4858

360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model

Qian Wang, Weiqi Li, Chong Mou et al.

CVPR 2024posterarXiv:2401.06578
#4859

Alpha Invariance: On Inverse Scaling Between Distance and Volume Density in Neural Radiance Fields

Joshua Ahn, Haochen Wang, Raymond A. Yeh et al.

CVPR 2024posterarXiv:2404.02155
#4860

Descriptor-In-Pixel : Point-Feature Tracking For Pixel Processor Arrays

Laurie Bose, Piotr Dudek, Jianing Chen

CVPR 2025poster
#4861

Countering Personalized Text-to-Image Generation with Influence Watermarks

Hanwen Liu, Zhicheng Sun, Yadong Mu

CVPR 2024poster
#4862

Teeth-SEG: An Efficient Instance Segmentation Framework for Orthodontic Treatment based on Multi-Scale Aggregation and Anthropic Prior Knowledge

Bo Zou, Shaofeng Wang, Hao Liu et al.

CVPR 2024poster
#4863

Interpreting Object-level Foundation Models via Visual Precision Search

Ruoyu Chen, Siyuan Liang, Jingzhi Li et al.

CVPR 2025highlightarXiv:2411.16198
#4864

T-VSL: Text-Guided Visual Sound Source Localization in Mixtures

Tanvir Mahmud, Yapeng Tian, Diana Marculescu

CVPR 2024posterarXiv:2404.01751
#4865

LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes

Xiang Xu, Lingdong Kong, hui shuai et al.

CVPR 2025posterarXiv:2501.04004
#4866

ANIM: Accurate Neural Implicit Model for Human Reconstruction from a single RGB-D Image

Marco Pesavento, Yuanlu Xu, Nikolaos Sarafianos et al.

CVPR 2024posterarXiv:2403.10357
#4867

TAET: Two-Stage Adversarial Equalization Training on Long-Tailed Distributions

Wang Yu-Hang, Junkang Guo, Aolei Liu et al.

CVPR 2025poster
#4868

vid-TLDR: Training Free Token Merging for Light-weight Video Transformer

Joonmyung Choi, Sanghyeok Lee, Jaewon Chu et al.

CVPR 2024posterarXiv:2403.13347
#4869

Initialization Matters for Adversarial Transfer Learning

Andong Hua, Jindong Gu, Zhiyu Xue et al.

CVPR 2024posterarXiv:2312.05716
#4870

AIM-Fair: Advancing Algorithmic Fairness via Selectively Fine-Tuning Biased Models with Contextual Synthetic Data

Zengqun Zhao, Ziquan Liu, Yu Cao et al.

CVPR 2025posterarXiv:2503.05665
#4871

MindBridge: A Cross-Subject Brain Decoding Framework

Shizun Wang, Songhua Liu, Zhenxiong Tan et al.

CVPR 2024highlightarXiv:2404.07850
#4872

Loopy-SLAM: Dense Neural SLAM with Loop Closures

Lorenzo Liso, Erik Sandström, Vladimir Yugay et al.

CVPR 2024posterarXiv:2402.09944
#4873

Weakly-Supervised Audio-Visual Video Parsing with Prototype-based Pseudo-Labeling

Kranthi Kumar Rachavarapu, Kalyan Ramakrishnan, A. N. Rajagopalan

CVPR 2024poster
#4874

MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models

Sanjoy Chowdhury, Sayan Nag, Joseph K J et al.

CVPR 2024highlightarXiv:2406.04673
#4875

InstaGen: Enhancing Object Detection by Training on Synthetic Dataset

Chengjian Feng, Yujie Zhong, Zequn Jie et al.

CVPR 2024posterarXiv:2402.05937
#4876

Narrative Action Evaluation with Prompt-Guided Multimodal Interaction

Shiyi Zhang, Sule Bai, Guangyi Chen et al.

CVPR 2024posterarXiv:2404.14471
#4877

DeconfuseTrack: Dealing with Confusion for Multi-Object Tracking

Cheng Huang, Shoudong Han, Mengyu He et al.

CVPR 2024poster
#4878

ChatPose: Chatting about 3D Human Pose

Yao Feng, Jing Lin, Sai Kumar Dwivedi et al.

CVPR 2024posterarXiv:2311.18836
#4879

Modality-agnostic Domain Generalizable Medical Image Segmentation by Multi-Frequency in Multi-Scale Attention

Ju-Hyeon Nam, Nur Suriza Syazwany, Su Jung Kim et al.

CVPR 2024posterarXiv:2405.06284
#4880

NC-TTT: A Noise Constrastive Approach for Test-Time Training

David OSOWIECHI, Gustavo Vargas Hakim, Mehrdad Noori et al.

CVPR 2024highlight
#4881

RoadSocial: A Diverse VideoQA Dataset and Benchmark for Road Event Understanding from Social Video Narratives

Chirag Parikh, Deepti Rawat, Rakshitha R. T. et al.

CVPR 2025posterarXiv:2503.21459
#4882

Perturbing Attention Gives You More Bang for the Buck: Subtle Imaging Perturbations That Efficiently Fool Customized Diffusion Models

Jingyao Xu, Yuetong Lu, Yandong Li et al.

CVPR 2024posterarXiv:2404.15081
#4883

ESCAPE: Encoding Super-keypoints for Category-Agnostic Pose Estimation

Khoi D Nguyen, Chen Li, Gim Hee Lee

CVPR 2024poster
#4884

Task-Specific Gradient Adaptation for Few-Shot One-Class Classification

Yunlong Li, Xiabi Liu, Liyuan Pan et al.

CVPR 2025poster
#4885

Minimal Perspective Autocalibration

Andrea Porfiri Dal Cin, Timothy Duff, Luca Magri et al.

CVPR 2024posterarXiv:2405.05605
#4886

ReGenNet: Towards Human Action-Reaction Synthesis

Liang Xu, Yizhou Zhou, Yichao Yan et al.

CVPR 2024posterarXiv:2403.11882
#4887

RGBD Objects in the Wild: Scaling Real-World 3D Object Learning from RGB-D Videos

Hongchi Xia, Yang Fu, Sifei Liu et al.

CVPR 2024posterarXiv:2401.12592
#4888

Aligning and Prompting Everything All at Once for Universal Visual Perception

Yunhang Shen, Chaoyou Fu, Peixian Chen et al.

CVPR 2024posterarXiv:2312.02153
#4889

ZONE: Zero-Shot Instruction-Guided Local Editing

Shanglin Li, Bohan Zeng, Yutang Feng et al.

CVPR 2024posterarXiv:2312.16794
#4890

Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption

Buzhen Huang, Chen Li, Chongyang Xu et al.

CVPR 2024posterarXiv:2404.11291
#4891

Label Propagation for Zero-shot Classification with Vision-Language Models

Vladan Stojnić, Yannis Kalantidis, Giorgos Tolias

CVPR 2024posterarXiv:2404.04072
#4892

IQ-VFI: Implicit Quadratic Motion Estimation for Video Frame Interpolation

Mengshun Hu, Kui Jiang, Zhihang Zhong et al.

CVPR 2024poster
#4893

Part-aware Unified Representation of Language and Skeleton for Zero-shot Action Recognition

Anqi Zhu, Qiuhong Ke, Mingming Gong et al.

CVPR 2024posterarXiv:2406.13327
#4894

Diffusion-ES: Gradient-free Planning with Diffusion for Autonomous and Instruction-guided Driving

Brian Yang, Huangyuan Su, Nikolaos Gkanatsios et al.

CVPR 2024poster
#4895

Parametric Point Cloud Completion for Polygonal Surface Reconstruction

Zhaiyu Chen, Yuqing Wang, Liangliang Nan et al.

CVPR 2025posterarXiv:2503.08363
#4896

Structured Model Probing: Empowering Efficient Transfer Learning by Structured Regularization

Zhi-Fan Wu, Chaojie Mao, Xue Wang et al.

CVPR 2024poster
#4897

CRKD: Enhanced Camera-Radar Object Detection with Cross-modality Knowledge Distillation

Lingjun Zhao, Jingyu Song, Katherine Skinner

CVPR 2024posterarXiv:2403.19104
#4898

CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models

Felix Taubner, Ruihang Zhang, Mathieu Tuli et al.

CVPR 2025posterarXiv:2412.12093
#4899

Towards Understanding Cross and Self-Attention in Stable Diffusion for Text-Guided Image Editing

Bingyan Liu, Chengyu Wang, Tingfeng Cao et al.

CVPR 2024posterarXiv:2403.03431
#4900

TULIP: Transformer for Upsampling of LiDAR Point Clouds

Bin Yang, Patrick Pfreundschuh, Roland Siegwart et al.

CVPR 2024posterarXiv:2312.06733
#4901

Incremental Residual Concept Bottleneck Models

Chenming Shang, Shiji Zhou, Hengyuan Zhang et al.

CVPR 2024posterarXiv:2404.08978
#4902

HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation

Hermann Kumbong, Xian Liu, Tsung-Yi Lin et al.

CVPR 2025posterarXiv:2506.04421
#4903

Efficient Dataset Distillation via Minimax Diffusion

Jianyang Gu, Saeed Vahidian, Vyacheslav Kungurtsev et al.

CVPR 2024posterarXiv:2311.15529
#4904

DUSt3R: Geometric 3D Vision Made Easy

Shuzhe Wang, Vincent Leroy, Yohann Cabon et al.

CVPR 2024posterarXiv:2312.14132
#4905

Enhancing Video Super-Resolution via Implicit Resampling-based Alignment

Kai Xu, Ziwei Yu, Xin Wang et al.

CVPR 2024highlightarXiv:2305.00163
#4906

Towards Improved Text-Aligned Codebook Learning: Multi-Hierarchical Codebook-Text Alignment with Long Text

Guotao liang, Baoquan Zhang, Zhiyuan Wen et al.

CVPR 2025highlightarXiv:2503.01261
#4907

StyleMaster: Stylize Your Video with Artistic Generation and Translation

Zixuan Ye, Huijuan Huang, Xintao Wang et al.

CVPR 2025posterarXiv:2412.07744
#4908

Divide and Conquer: Heterogeneous Noise Integration for Diffusion-based Adversarial Purification

Gaozheng Pei, Shaojie Lyu, Gong Chen et al.

CVPR 2025posterarXiv:2503.01407
#4909

Foley-Flow: Coordinated Video-to-Audio Generation with Masked Audio-Visual Alignment and Dynamic Conditional Flows

Shentong Mo, Yibing Song

CVPR 2025poster
#4910

DL2G: Degradation-guided Local-to-Global Restoration for Eyeglass Reflection Removal

Yizhilv, Xiao Lu, Hong Ding et al.

CVPR 2025poster
#4911

Efficient Decoupled Feature 3D Gaussian Splatting via Hierarchical Compression

Zhenqi Dai, Ting Liu, Yanning Zhang

CVPR 2025poster
#4912

BlenderGym: Benchmarking Foundational Model Systems for Graphics Editing

Yunqi Gu, Ian Huang, Jihyeon Je et al.

CVPR 2025highlightarXiv:2504.01786
#4913

ProjAttacker: A Configurable Physical Adversarial Attack for Face Recognition via Projector

Yuanwei Liu, Hui Wei, Chengyu Jia et al.

CVPR 2025poster
#4914

LumiNet: Latent Intrinsics Meets Diffusion Models for Indoor Scene Relighting

Xiaoyan Xing, Konrad Groh, Sezer Karaoglu et al.

CVPR 2025posterarXiv:2412.00177
#4915

Accelerating Diffusion Transformer via Increment-Calibrated Caching with Channel-Aware Singular Value Decomposition

Zhiyuan Chen, Keyi Li, Yifan Jia et al.

CVPR 2025posterarXiv:2505.05829
#4916

Hearing Hands: Generating Sounds from Physical Interactions in 3D Scenes

Yiming Dou, Wonseok Oh, Yuqing Luo et al.

CVPR 2025posterarXiv:2506.09989
#4917

Minding Fuzzy Regions: A Data-driven Alternating Learning Paradigm for Stable Lesion Segmentation

Lexin Fang, Yunyang Xu, Xiang Ma et al.

CVPR 2025posterarXiv:2503.11140
#4918

RoomPainter: View-Integrated Diffusion for Consistent Indoor Scene Texturing

Zhipeng Huang, Wangbo Yu, Xinhua Cheng et al.

CVPR 2025posterarXiv:2412.16778
#4919

Science-T2I: Addressing Scientific Illusions in Image Synthesis

Jialuo Li, Wenhao Chai, XINGYU FU et al.

CVPR 2025posterarXiv:2504.13129
#4920

A Simple Data Augmentation for Feature Distribution Skewed Federated Learning

Yunlu Yan, Huazhu Fu, Yuexiang Li et al.

CVPR 2025posterarXiv:2306.09363
#4921

Adversarial Text to Continuous Image Generation

Kilichbek Haydarov, Aashiq Muhamed, Xiaoqian Shen et al.

CVPR 2024poster
#4922

Domain Generalization in CLIP via Learning with Diverse Text Prompts

Changsong Wen, Zelin Peng, Yu Huang et al.

CVPR 2025poster
#4923

Alias-Free Latent Diffusion Models: Improving Fractional Shift Equivariance of Diffusion Latent Space

Yifan Zhou, Zeqi Xiao, Shuai Yang et al.

CVPR 2025posterarXiv:2503.09419
#4924

AVQACL: A Novel Benchmark for Audio-Visual Question Answering Continual Learning

Kaixuan Wu, Xinde Li, Xinglin Li et al.

CVPR 2025poster
#4925

Weakly Supervised Contrastive Adversarial Training for Learning Robust Features from Semi-supervised Data

Lilin Zhang, Chengpei Wu, Ning Yang

CVPR 2025posterarXiv:2503.11032
#4926

MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction

Jingcheng Ni, Yuxin Guo, Yichen Liu et al.

CVPR 2025posterarXiv:2502.11663
#4927

Graph-Embedded Structure-Aware Perceptual Hashing for Neural Network Protection and Piracy Detection

Ruiheng Liu, Haozhe Chen, Boyao Zhao et al.

CVPR 2025poster
#4928

Teller: Real-Time Streaming Audio-Driven Portrait Animation with Autoregressive Motion Generation

Dingcheng Zhen, Shunshun Yin, Shiyang Qin et al.

CVPR 2025posterarXiv:2503.18429
#4929

Z-Magic: Zero-shot Multiple Attributes Guided Image Creator

Yingying Deng, Xiangyu He, Fan Tang et al.

CVPR 2025posterarXiv:2503.12124
#4930

K-LoRA: Unlocking Training-Free Fusion of Any Subject and Style LoRAs

Ziheng Ouyang, Zhen Li, Qibin Hou

CVPR 2025posterarXiv:2502.18461
#4931

UVGS: Reimagining Unstructured 3D Gaussian Splatting using UV Mapping

Aashish Rai, Dilin Wang, Mihir Jain et al.

CVPR 2025posterarXiv:2502.01846
#4932

CarPlanner: Consistent Auto-regressive Trajectory Planning for Large-Scale Reinforcement Learning in Autonomous Driving

Dongkun Zhang, Jiaming Liang, Ke Guo et al.

CVPR 2025posterarXiv:2502.19908
#4933

LayoutFormer: Hierarchical Text Detection Towards Scene Text Understanding

Min Liang, Jia-Wei Ma, Xiaobin Zhu et al.

CVPR 2024poster
#4934

UCM-VeID V2: A Richer Dataset and A Pre-training Method for UAV Cross-Modality Vehicle Re-Identification

Xingyue Liu, Jiahao Qi, Chen Chen et al.

CVPR 2025poster
#4935

Unboxed: Geometrically and Temporally Consistent Video Outpainting

Zhongrui Yu, Martina Megaro-Boldini, Robert Sumner et al.

CVPR 2025poster
#4936

Less is More: Efficient Model Merging with Binary Task Switch

Biqing Qi, Fangyuan Li, Zhen Wang et al.

CVPR 2025highlightarXiv:2412.00054
#4937

DiSciPLE: Learning Interpretable Programs for Scientific Visual Discovery

Utkarsh Mall, Cheng Perng Phoo, Mia Chiquier et al.

CVPR 2025posterarXiv:2502.10060
#4938

InceptionNeXt: When Inception Meets ConvNeXt

Weihao Yu, Pan Zhou, Shuicheng Yan et al.

CVPR 2024posterarXiv:2303.16900
#4939

Visual Lexicon: Rich Image Features in Language Space

XuDong Wang, Xingyi Zhou, Alireza Fathi et al.

CVPR 2025posterarXiv:2412.06774
#4940

Continual SFT Matches Multimodal RLHF with Negative Supervision

Ke Zhu, Yu Wang, Yanpeng Sun et al.

CVPR 2025posterarXiv:2411.14797
#4941

Poly-Autoregressive Prediction for Modeling Interactions

Neerja Thakkar, Tara Sadjadpour, Jathushan Rajasegaran et al.

CVPR 2025posterarXiv:2502.08646
#4942

TinyFusion: Diffusion Transformers Learned Shallow

Gongfan Fang, Kunjun Li, Xinyin Ma et al.

CVPR 2025highlightarXiv:2412.01199
#4943

Cross-Modal 3D Representation with Multi-View Images and Point Clouds

Ziyang Zhou, Pinghui Wang, Zi Liang et al.

CVPR 2025poster
#4944

DeformCL: Learning Deformable Centerline Representation for Vessel Extraction in 3D Medical Image

Ziwei Zhao, Zhixing Zhang, Yuhang Liu et al.

CVPR 2025posterarXiv:2506.05820
#4945

CrossSDF: 3D Reconstruction of Thin Structures From Cross-Sections

Thomas Walker, Salvatore Esposito, Daniel Rebain et al.

CVPR 2025posterarXiv:2412.04120
#4946

Decentralized Diffusion Models

David McAllister, Matthew Tancik, Jiaming Song et al.

CVPR 2025posterarXiv:2501.05450
#4947

ALIEN: Implicit Neural Representations for Human Motion Prediction under Arbitrary Latency

Dong Wei, Xiaoning Sun, Xizhan Gao et al.

CVPR 2025highlight
#4948

Towards Continual Universal Segmentation

Zihan Lin, Zilei Wang, Xu Wang

CVPR 2025poster
#4949

HiPART: Hierarchical Pose AutoRegressive Transformer for Occluded 3D Human Pose Estimation

Hongwei Zheng, Han Li, Wenrui Dai et al.

CVPR 2025posterarXiv:2503.23331
#4950

Decoupled Motion Expression Video Segmentation

Hao Fang, Runmin Cong, Xiankai Lu et al.

CVPR 2025poster
#4951

Two by Two: Learning Multi-Task Pairwise Objects Assembly for Generalizable Robot Manipulation

Yu Qi, Yuanchen Ju, Tianming Wei et al.

CVPR 2025posterarXiv:2504.06961
#4952

GIF: Generative Inspiration for Face Recognition at Scale

Mohammad Saadabadi Saadabadi, Sahar Rahimi Malakshan, Ali Dabouei et al.

CVPR 2025poster
#4953

Exploring Contextual Attribute Density in Referring Expression Counting

Zhicheng Wang, Zhiyu Pan, Zhan Peng et al.

CVPR 2025posterarXiv:2503.12460
#4954

Neuron: Learning Context-Aware Evolving Representations for Zero-Shot Skeleton Action Recognition

Yang Chen, Jingcai Guo, Song Guo et al.

CVPR 2025posterarXiv:2411.11288
#4955

Navigating Image Restoration with VAR’s Distribution Alignment Prior

Siyang Wang, Naishan Zheng, Jie Huang et al.

CVPR 2025posterarXiv:2412.21063
#4956

FSHNet: Fully Sparse Hybrid Network for 3D Object Detection

Shuai Liu, Mingyue Cui, Boyang Li et al.

CVPR 2025posterarXiv:2506.03714
#4957

Mixture of Submodules for Domain Adaptive Person Search

Minsu Kim, Seungryong Kim, Kwanghoon Sohn

CVPR 2025poster
#4958

Unsupervised Discovery of Facial Landmarks and Head Pose

Satyajit Tourani, Siddharth Tourani, Arif Mahmood et al.

CVPR 2025poster
#4959

Fairy: Fast Parallelized Instruction-Guided Video-to-Video Synthesis

Bichen Wu, Ching-Yao Chuang, Xiaoyan Wang et al.

CVPR 2024posterarXiv:2312.13834
#4960

Dynamic Integration of Task-Specific Adapters for Class Incremental Learning

Jiashuo Li, Shaokun Wang, Bo Qian et al.

CVPR 2025posterarXiv:2409.14983
#4961

DiverseFlow: Sample-Efficient Diverse Mode Coverage in Flows

Mashrur M. Morshed, Vishnu Naresh Boddeti

CVPR 2025posterarXiv:2504.07894
#4962

DiffusionSfM: Predicting Structure and Motion via Ray Origin and Endpoint Diffusion

Qitao Zhao, Amy Lin, Jeff Tan et al.

CVPR 2025posterarXiv:2505.05473
#4963

Test-time Augmentation Improves Efficiency in Conformal Prediction

Divya M Shanmugam, Helen Lu, Swami Sankaranarayanan et al.

CVPR 2025posterarXiv:2505.22764
#4964

GREAT: Geometry-Intention Collaborative Inference for Open-Vocabulary 3D Object Affordance Grounding

Yawen Shao, Wei Zhai, Yuhang Yang et al.

CVPR 2025posterarXiv:2411.19626
#4965

Viewpoint Rosetta Stone: Unlocking Unpaired Ego-Exo Videos for View-invariant Representation Learning

Mi Luo, Zihui Xue, Alex Dimakis et al.

CVPR 2025poster
#4966

Dual Diffusion for Unified Image Generation and Understanding

Zijie Li, Henry Li, Yichun Shi et al.

CVPR 2025posterarXiv:2501.00289
#4967

Learning Compatible Multi-Prize Subnetworks for Asymmetric Retrieval

Yushuai Sun, Zikun Zhou, Dongmei Jiang et al.

CVPR 2025posterarXiv:2504.11879
#4968

Commonsense Video Question Answering through Video-Grounded Entailment Tree Reasoning

Huabin Liu, Filip Ilievski, Cees G. M. Snoek

CVPR 2025posterarXiv:2501.05069
#4969

Opportunistic Single-Photon Time of Flight

Sotiris Nousias, Mian Wei, Howard Xiao et al.

CVPR 2025poster
#4970

Enduring, Efficient and Robust Trajectory Prediction Attack in Autonomous Driving via Optimization-Driven Multi-Frame Perturbation Framework

Yi Yu, Weizhen Han, Libing Wu et al.

CVPR 2025highlight
#4971

AdMiT: Adaptive Multi-Source Tuning in Dynamic Environments

Xiangyu Chang, Fahim Faisal Niloy, Sk Miraj Ahmed et al.

CVPR 2025poster
#4972

UNEM: UNrolled Generalized EM for Transductive Few-Shot Learning

Long Zhou, Fereshteh Shakeri, Aymen Sadraoui et al.

CVPR 2025posterarXiv:2412.16739
#4973

Fortifying Federated Learning Towards Trustworthiness via Auditable Data Valuation and Verifiable Client Contribution

Naveen Kumar Kummari, Ranjeet Ranjan Jha, Krishna Mohan Chalavadi et al.

CVPR 2025poster
#4974

Volumetric Surfaces: Representing Fuzzy Geometries with Layered Meshes

Stefano Esposito, Anpei Chen, Christian Reiser et al.

CVPR 2025posterarXiv:2409.02482
#4975

DIV-FF: Dynamic Image-Video Feature Fields For Environment Understanding in Egocentric Videos

Lorenzo Mur-Labadia, Jose J. Guerrero, Ruben Martinez-Cantin

CVPR 2025highlightarXiv:2503.08344
#4976

MegaSynth: Scaling Up 3D Scene Reconstruction with Synthesized Data

Hanwen Jiang, Zexiang Xu, Desai Xie et al.

CVPR 2025posterarXiv:2412.14166
#4977

Flash-Split: 2D Reflection Removal with Flash Cues and Latent Diffusion Separation

Tianfu Wang, Mingyang Xie, Haoming Cai et al.

CVPR 2025posterarXiv:2501.00637
#4978

IndoorGS: Geometric Cues Guided Gaussian Splatting for Indoor Scene Reconstruction

Cong Ruan, Yuesong Wang, Bin Zhang et al.

CVPR 2025poster
#4979

Pose Priors from Language Models

Sanjay Subramanian, Evonne Ng, Lea Müller et al.

CVPR 2025posterarXiv:2405.03689
#4980

Generative Densification: Learning to Densify Gaussians for High-Fidelity Generalizable 3D Reconstruction

Seungtae Nam, Xiangyu Sun, Gyeongjin Kang et al.

CVPR 2025highlightarXiv:2412.06234
#4981

Towards Optimizing Large-Scale Multi-Graph Matching in Bioimaging

Max Kahl, Sebastian Stricker, Lisa Hutschenreiter et al.

CVPR 2025poster
#4982

Test-Time Backdoor Detection for Object Detection Models

Hangtao Zhang, Yichen Wang, Shihui Yan et al.

CVPR 2025posterarXiv:2503.15293
#4983

Classifier-Free Guidance Inside the Attraction Basin May Cause Memorization

Anubhav Jain, Yuya Kobayashi, Takashi Shibuya et al.

CVPR 2025posterarXiv:2411.16738
#4984

Beyond Local Sharpness: Communication-Efficient Global Sharpness-aware Minimization for Federated Learning

Debora Caldarola, Pietro Cagnasso, Barbara Caputo et al.

CVPR 2025posterarXiv:2412.03752
#4985

Image Quality Assessment: From Human to Machine Preference

Chunyi Li, Yuan Tian, Xiaoyue Ling et al.

CVPR 2025highlightarXiv:2503.10078
#4986

Learning Physics From Video: Unsupervised Physical Parameter Estimation for Continuous Dynamical Systems

Alejandro Castañeda Garcia, Jan Warchocki, Jan van Gemert et al.

CVPR 2025posterarXiv:2410.01376
#4987

MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision

Ruicheng Wang, Sicheng Xu, Cassie Lee Dai et al.

CVPR 2025posterarXiv:2410.19115
#4988

Detect-and-Guide: Self-regulation of Diffusion Models for Safe Text-to-Image Generation via Guideline Token Optimization

Feifei Li, Mi Zhang, Yiming Sun et al.

CVPR 2025posterarXiv:2503.15197
#4989

Reconciling Stochastic and Deterministic Strategies for Zero-shot Image Restoration using Diffusion Model in Dual

Chong Wang, Lanqing Guo, Zixuan Fu et al.

CVPR 2025posterarXiv:2503.01288
#4990

Harnessing Frequency Spectrum Insights for Image Copyright Protection Against Diffusion Models

Zhenguang Liu, Chao Shuai, Shaojing Fan et al.

CVPR 2025posterarXiv:2503.11071
#4991

Gain from Neighbors: Boosting Model Robustness in the Wild via Adversarial Perturbations Toward Neighboring Classes

Zhou Yang, Mingtao Feng, Tao Huang et al.

CVPR 2025poster
#4992

M^3-VOS: Multi-Phase, Multi-Transition, and Multi-Scenery Video Object Segmentation

Zixuan Chen, Jiaxin Li, Junxuan Liang et al.

CVPR 2025posterarXiv:2412.13803
#4993

Enhancing Creative Generation on Stable Diffusion-based Models

Jiyeon Han, Dahee Kwon, Gayoung Lee et al.

CVPR 2025posterarXiv:2503.23538
#4994

EquiPose: Exploiting Permutation Equivariance for Relative Camera Pose Estimation

Yuzhen Liu, Qiulei Dong

CVPR 2025poster
#4995

AnyMap: Learning a General Camera Model for Structure-from-Motion with Unknown Distortion in Dynamic Scenes

Andrea Porfiri Dal Cin, Georgi Dikov, Jihong Ju et al.

CVPR 2025poster
#4996

Visual Consensus Prompting for Co-Salient Object Detection

Jie Wang, Nana Yu, Zihao Zhang et al.

CVPR 2025posterarXiv:2504.14254
#4997

Seeking Consistent Flat Minima for Better Domain Generalization via Refining Loss Landscapes

Aodi Li, Liansheng Zhuang, Xiao Long et al.

CVPR 2025posterarXiv:2412.13573
#4998

DiffusionDrive: Truncated Diffusion Model for End-to-End Autonomous Driving

Bencheng Liao, Shaoyu Chen, haoran yin et al.

CVPR 2025highlightarXiv:2411.15139
#4999

Classifier-guided CLIP Distillation for Unsupervised Multi-label Classification

Dongseob Kim, Hyunjung Shim

CVPR 2025posterarXiv:2503.16873
#5000

FASTer: Focal token Acquiring-and-Scaling Transformer for Long-term 3D Objection Detection

Chenxu Dang, Pei An, Xinmin Zhang et al.

CVPR 2025posterarXiv:2503.01899