Most Cited CVPR "long-form qa" Papers

5,589 papers found • Page 20 of 28

#3801

From SAM to CAMs: Exploring Segment Anything Model for Weakly Supervised Semantic Segmentation

Hyeokjun Kweon, Kuk-Jin Yoon

CVPR 2024 • poster
#3802

T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation

Kaiyue Sun, Kaiyi Huang, Xian Liu et al.

CVPR 2025 • poster • arXiv:2407.14505
#3803

Dragin3D: Image Editing by Dragging in 3D Space

Weiran Guang, Xiaoguang Gu, Mengqi Huang et al.

CVPR 2025 • poster
#3804

CoMatcher: Multi-View Collaborative Feature Matching

Jintao Zhang, Zimin Xia, Mingyue Dong et al.

CVPR 2025 • poster • arXiv:2504.01872
#3805

DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation

Wang Zhao, Yan-Pei Cao, Jiale Xu et al.

CVPR 2025 • poster • arXiv:2412.15200
#3806

Vlogger: Make Your Dream A Vlog

Shaobin Zhuang, Kunchang Li, Xinyuan Chen et al.

CVPR 2024 • poster • arXiv:2401.09414
#3807

EpiDiff: Enhancing Multi-View Synthesis via Localized Epipolar-Constrained Diffusion

Zehuan Huang, Hao Wen, Junting Dong et al.

CVPR 2024 • poster • arXiv:2312.06725
#3808

IPoD: Implicit Field Learning with Point Diffusion for Generalizable 3D Object Reconstruction from Single RGB-D Images

Yushuang Wu, Luyue Shi, Junhao Cai et al.

CVPR 2024 • highlight • arXiv:2404.00269
#3809

MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

Bo He, Hengduo Li, Young Kyun Jang et al.

CVPR 2024 • poster • arXiv:2404.05726
#3810

SVGDreamer: Text Guided SVG Generation with Diffusion Model

XiMing Xing, Chuang Wang, Haitao Zhou et al.

CVPR 2024 • poster • arXiv:2312.16476
#3811

Dual-Interrelated Diffusion Model for Few-Shot Anomaly Image Generation

Ying Jin, Jinlong Peng, Qingdong He et al.

CVPR 2025 • poster • arXiv:2408.13509
#3812

Dual Prototype Attention for Unsupervised Video Object Segmentation

Suhwan Cho, Minhyeok Lee, Seunghoon Lee et al.

CVPR 2024 • poster • arXiv:2211.12036
#3813

R-Cyclic Diffuser: Reductive and Cyclic Latent Diffusion for 3D Clothed Human Digitalization

Kennard Chan, Fayao Liu, Guosheng Lin et al.

CVPR 2024 • poster
#3814

Contrastive Mean-Shift Learning for Generalized Category Discovery

Sua Choi, Dahyun Kang, Minsu Cho

CVPR 2024 • poster • arXiv:2404.09451
#3815

VASparse: Towards Efficient Visual Hallucination Mitigation via Visual-Aware Token Sparsification

Xianwei Zhuang, Zhihong Zhu, Yuxin Xie et al.

CVPR 2025 • poster • arXiv:2501.06553
#3816

Panacea: Panoramic and Controllable Video Generation for Autonomous Driving

Yuqing Wen, Yucheng Zhao, Yingfei Liu et al.

CVPR 2024 • poster • arXiv:2408.07605
#3817

Flexible Biometrics Recognition: Bridging the Multimodality Gap through Attention Alignment and Prompt Tuning

Leslie Ching Ow Tiong, Dick Sigmund, Chen-Hui Chan et al.

CVPR 2024 • poster
#3818

Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Video Content

Qiuheng Wang, Yukai Shi, Jiarong Ou et al.

CVPR 2025 • poster • arXiv:2410.08260
#3819

AvatarArtist: Open-Domain 4D Avatarization

Hongyu Liu, Xuan Wang, Ziyu Wan et al.

CVPR 2025 • poster • arXiv:2503.19906
#3820

Omni-Scene: Omni-Gaussian Representation for Ego-Centric Sparse-View Scene Reconstruction

Dongxu Wei, Zhiqi Li, Peidong Liu

CVPR 2025 • poster • arXiv:2412.06273
#3821

Towards Variable and Coordinated Holistic Co-Speech Motion Generation

Yifei Liu, Qiong Cao, Yandong Wen et al.

CVPR 2024 • poster • arXiv:2404.00368
#3822

Where's the Liability in the Generative Era? Recovery-based Black-Box Detection of AI-Generated Content

Haoyue Bai, Yiyou Sun, Wei Cheng et al.

CVPR 2025 • poster • arXiv:2505.01008
#3823

TacoDepth: Towards Efficient Radar-Camera Depth Estimation with One-stage Fusion

Yiran Wang, Jiaqi Li, Chaoyi Hong et al.

CVPR 2025 • poster • arXiv:2504.11773
#3824

VSRD: Instance-Aware Volumetric Silhouette Rendering for Weakly Supervised 3D Object Detection

Zihua Liu, Hiroki Sakuma, Masatoshi Okutomi

CVPR 2024 • poster • arXiv:2404.00149
#3825

Class Incremental Learning with Multi-Teacher Distillation

Haitao Wen, Lili Pan, Yu Dai et al.

CVPR 2024 • poster
#3826

Parameter Efficient Self-Supervised Geospatial Domain Adaptation

Linus Scheibenreif, Michael Mommert, Damian Borth

CVPR 2024 • poster
#3827

ALGM: Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Transformers

Narges Norouzi, Svetlana Orlova, Daan de Geus et al.

CVPR 2024 • poster • arXiv:2406.09936
#3828

No Pains, More Gains: Recycling Sub-Salient Patches for Efficient High-Resolution Image Recognition

Rong Qin, Xin Liu, Xingyu Liu et al.

CVPR 2025 • highlight
#3829

Beyond Seen Primitive Concepts and Attribute-Object Compositional Learning

Nirat Saini, Khoi Pham, Abhinav Shrivastava

CVPR 2024 • poster
#3830

Scaling Laws of Synthetic Images for Model Training ... for Now

Lijie Fan, Kaifeng Chen, Dilip Krishnan et al.

CVPR 2024 • poster • arXiv:2312.04567
#3831

MEGA: Masked Generative Autoencoder for Human Mesh Recovery

Guénolé Fiche, Simon Leglaive, Xavier Alameda-Pineda et al.

CVPR 2025 • poster • arXiv:2405.18839
#3832

NLPrompt: Noise-Label Prompt Learning for Vision-Language Models

Bikang Pan, Qun Li, Xiaoying Tang et al.

CVPR 2025 • highlight • arXiv:2412.01256
#3833

Improving the Training of Data-Efficient GANs via Quality Aware Dynamic Discriminator Rejection Sampling

Zhaoyu Zhang, Yang Hua, Guanxiong Sun et al.

CVPR 2025 • poster
#3834

VideoDirector: Precise Video Editing via Text-to-Video Models

Yukun Wang, Longguang Wang, Zhiyuan Ma et al.

CVPR 2025 • poster • arXiv:2411.17592
#3835

UDiFF: Generating Conditional Unsigned Distance Fields with Optimal Wavelet Diffusion

Junsheng Zhou, Weiqi Zhang, Baorui Ma et al.

CVPR 2024 • poster • arXiv:2404.06851
#3836

Learning Group Activity Features Through Person Attribute Prediction

Chihiro Nakatani, Hiroaki Kawashima, Norimichi Ukita

CVPR 2024 • poster • arXiv:2403.02753
#3837

MICap: A Unified Model for Identity-Aware Movie Descriptions

Haran Raajesh, Naveen Reddy Desanur, Zeeshan Khan et al.

CVPR 2024 • poster • arXiv:2405.11483
#3838

UFineBench: Towards Text-based Person Retrieval with Ultra-fine Granularity

Jialong Zuo, Hanyu Zhou, Ying Nie et al.

CVPR 2024 • poster • arXiv:2312.03441
#3839

Test-Time Zero-Shot Temporal Action Localization

Benedetta Liberatori, Alessandro Conti, Paolo Rota et al.

CVPR 2024 • poster • arXiv:2404.05426
#3840

FreeU: Free Lunch in Diffusion U-Net

Chenyang Si, Ziqi Huang, Yuming Jiang et al.

CVPR 2024 • poster • arXiv:2309.11497
#3841

Learning to Filter Outlier Edges in Global SfM

Nicole Damblon, Marc Pollefeys, Daniel Barath

CVPR 2025 • highlight
#3842

Towards Text-guided 3D Scene Composition

Qihang Zhang, Chaoyang Wang, Aliaksandr Siarohin et al.

CVPR 2024 • poster • arXiv:2312.08885
#3843

Instance-aware Exploration-Verification-Exploitation for Instance ImageGoal Navigation

Xiaohan Lei, Min Wang, Wengang Zhou et al.

CVPR 2024 • poster • arXiv:2402.17587
#3844

AnyScene: Customized Image Synthesis with Composited Foreground

Ruidong Chen, Lanjun Wang, Weizhi Nie et al.

CVPR 2024 • poster
#3845

Learning SO(3)-Invariant Semantic Correspondence via Local Shape Transform

Chunghyun Park, Seungwook Kim, Jaesik Park et al.

CVPR 2024 • poster • arXiv:2404.11156
#3846

Color Shift Estimation-and-Correction for Image Enhancement

Yiyu Li, Ke Xu, Gerhard Hancke et al.

CVPR 2024 • poster • arXiv:2405.17725
#3847

MODA: Motion-Drift Augmentation for Inertial Human Motion Analysis

Yinghao Wu, Shihui Guo, Yipeng Qin

CVPR 2025 • poster
#3848

SLADE: Shielding against Dual Exploits in Large Vision-Language Models

Md Zarif Hossain, Ahmed Imteaj

CVPR 2025 • poster
#3849

TE-TAD: Towards Full End-to-End Temporal Action Detection via Time-Aligned Coordinate Expression

Ho-Joong Kim, Jung-Ho Hong, Heejo Kong et al.

CVPR 2024 • poster • arXiv:2404.02405
#3850

Endow SAM with Keen Eyes: Temporal-spatial Prompt Learning for Video Camouflaged Object Detection

Wenjun Hui, Zhenfeng Zhu, Shuai Zheng et al.

CVPR 2024 • poster
#3851

NICE: Neurogenesis Inspired Contextual Encoding for Replay-free Class Incremental Learning

Mustafa B Gurbuz, Jean Moorman, Constantine Dovrolis

CVPR 2024 • poster
#3852

Taming Mode Collapse in Score Distillation for Text-to-3D Generation

Peihao Wang, Dejia Xu, Zhiwen Fan et al.

CVPR 2024 • poster • arXiv:2401.00909
#3853

DiN: Diffusion Model for Robust Medical VQA with Semantic Noisy Labels

Erjian Guo, Zhen Zhao, Zicheng Wang et al.

CVPR 2025 • poster • arXiv:2503.18536
#3854

A Unified Framework for Heterogeneous Semi-supervised Learning

Marzi Heidari, Abdullah Alchihabi, Hao Yan et al.

CVPR 2025 • poster • arXiv:2503.00286
#3855

V2V3D: View-to-View Denoised 3D Reconstruction for Light Field Microscopy

Jiayin Zhao, Zhenqi Fu, Tao Yu et al.

CVPR 2025 • poster • arXiv:2504.07853
#3856

FocusMAE: Gallbladder Cancer Detection from Ultrasound Videos with Focused Masked Autoencoders

Soumen Basu, Mayuna Gupta, Chetan Madan et al.

CVPR 2024 • poster • arXiv:2403.08848
#3857

Towards Universal AI-Generated Image Detection by Variational Information Bottleneck Network

Haifeng Zhang, Qinghui He, Xiuli Bi et al.

CVPR 2025 • poster
#3858

Noisy One-point Homographies are Surprisingly Good

Yaqing Ding, Jonathan Astermark, Magnus Oskarsson et al.

CVPR 2024 • poster
#3859

CSTA: CNN-based Spatiotemporal Attention for Video Summarization

Jaewon Son, Jaehun Park, Kwangsu Kim

CVPR 2024 • poster • arXiv:2405.11905
#3860

SRTube: Video-Language Pre-Training with Action-Centric Video Tube Features and Semantic Role Labeling

Juhee Lee, Jewon Kang

CVPR 2024 • poster
#3861

SUGAR: Pre-training 3D Visual Representations for Robotics

Shizhe Chen, Ricardo Garcia Pinel, Ivan Laptev et al.

CVPR 2024 • poster • arXiv:2404.01491
#3862

SeCap: Self-Calibrating and Adaptive Prompts for Cross-view Person Re-Identification in Aerial-Ground Networks

Shining Wang, Yunlong Wang, Ruiqi Wu et al.

CVPR 2025 • highlight • arXiv:2503.06965
#3863

SnAG: Scalable and Accurate Video Grounding

Fangzhou Mu, Sicheng Mo, Yin Li

CVPR 2024 • poster • arXiv:2404.02257
#3864

GLaMM: Pixel Grounding Large Multimodal Model

Hanoona Rasheed, Muhammad Maaz, Sahal Shaji Mullappilly et al.

CVPR 2024 • poster • arXiv:2311.03356
#3865

Spectral Informed Mamba for Robust Point Cloud Processing

Ali Bahri, Moslem Yazdanpanah, Mehrdad Noori et al.

CVPR 2025 • poster • arXiv:2503.04953
#3866

ManiFPT: Defining and Analyzing Fingerprints of Generative Models

Hae Jin Song, Mahyar Khayatkhoei, Wael AbdAlmageed

CVPR 2024 • poster • arXiv:2402.10401
#3867

WeatherGen: A Unified Diverse Weather Generator for LiDAR Point Clouds via Spider Mamba Diffusion

Yang Wu, Yun Zhu, Kaihua Zhang et al.

CVPR 2025 • poster • arXiv:2504.13561
#3868

Hiding Images in Diffusion Models by Editing Learned Score Functions

Haoyu Chen, Yunqiao Yang, Nan Zhong et al.

CVPR 2025 • poster • arXiv:2503.18459
#3869

Event Ellipsometer: Event-based Mueller-Matrix Video Imaging

Ryota Maeda, Yunseong Moon, Seung-Hwan Baek

CVPR 2025 • highlight • arXiv:2411.17313
#3870

A Physics-Informed Blur Learning Framework for Imaging Systems

Liqun Chen, Yuxuan Li, Jun Dai et al.

CVPR 2025 • poster
#3871

Self-Calibrating Vicinal Risk Minimisation for Model Calibration

Jiawei Liu, Changkun Ye, Ruikai Cui et al.

CVPR 2024 • poster
#3872

Cinematic Behavior Transfer via NeRF-based Differentiable Filming

Xuekun Jiang, Anyi Rao, Jingbo Wang et al.

CVPR 2024 • poster • arXiv:2311.17754
#3873

Finding Lottery Tickets in Vision Models via Data-driven Spectral Foresight Pruning

Leonardo Iurada, Marco Ciccone, Tatiana Tommasi

CVPR 2024 • poster • arXiv:2406.01820
#3874

CORE-MPI: Consistency Object Removal with Embedding MultiPlane Image

Donggeun Yoon, Donghyeon Cho

CVPR 2024 • poster
#3875

ScoreHypo: Probabilistic Human Mesh Estimation with Hypothesis Scoring

Yuan Xu, Xiaoxuan Ma, Jiajun Su et al.

CVPR 2024 • poster
#3876

HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models

Nataniel Ruiz, Yuanzhen Li, Varun Jampani et al.

CVPR 2024 • poster • arXiv:2307.06949
#3877

Intelligent Grimm - Open-ended Visual Storytelling via Latent Diffusion Models

Chang Liu, Haoning Wu, Yujie Zhong et al.

CVPR 2024 • poster • arXiv:2306.00973
#3878

UniMix: Towards Domain Adaptive and Generalizable LiDAR Semantic Segmentation in Adverse Weather

Haimei Zhao, Jing Zhang, Zhuo Chen et al.

CVPR 2024 • poster • arXiv:2404.05145
#3879

LaMPilot: An Open Benchmark Dataset for Autonomous Driving with Language Model Programs

Yunsheng Ma, Can Cui, Xu Cao et al.

CVPR 2024 • poster • arXiv:2312.04372
#3880

Dynamic Policy-Driven Adaptive Multi-Instance Learning for Whole Slide Image Classification

Tingting Zheng, Kui Jiang, Hongxun Yao

CVPR 2024 • highlight • arXiv:2403.07939
#3881

BilevelPruning: Unified Dynamic and Static Channel Pruning for Convolutional Neural Networks

Shangqian Gao, Yanfu Zhang, Feihu Huang et al.

CVPR 2024 • poster
#3882

Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models

Jiayi Guo, Xingqian Xu, Yifan Pu et al.

CVPR 2024 • poster • arXiv:2312.04410
#3883

Editable Scene Simulation for Autonomous Driving via Collaborative LLM-Agents

Yuxi Wei, Zi Wang, Yifan Lu et al.

CVPR 2024 • highlight • arXiv:2402.05746
#3884

HUNet: Homotopy Unfolding Network for Image Compressive Sensing

Feiyang Shen, Hongping Gan

CVPR 2025 • poster
#3885

Learning Continuous 3D Words for Text-to-Image Generation

Ta-Ying Cheng, Matheus Gadelha, Thibault Groueix et al.

CVPR 2024 • poster • arXiv:2402.08654
#3886

Content-Adaptive Non-Local Convolution for Remote Sensing Pansharpening

Yule Duan, Xiao Wu, Haoyu Deng et al.

CVPR 2024 • poster • arXiv:2404.07543
#3887

A Conditional Denoising Diffusion Probabilistic Model for Point Cloud Upsampling

Wentao Qu, Yuantian Shao, Lingwu Meng et al.

CVPR 2024 • poster • arXiv:2312.02719
#3888

Vid2Sim: Generalizable, Video-based Reconstruction of Appearance, Geometry and Physics for Mesh-free Simulation

Chuhao Chen, Zhiyang Dou, Chen Wang et al.

CVPR 2025 • poster • arXiv:2506.06440
#3889

QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge

Xuan Shen, Weize Ma, Jing Liu et al.

CVPR 2025 • poster • arXiv:2503.16709
#3890

APISR: Anime Production Inspired Real-World Anime Super-Resolution

Boyang Wang, Fengyu Yang, Xihang Yu et al.

CVPR 2024 • poster • arXiv:2403.01598
#3891

Instruct-ReID: A Multi-purpose Person Re-identification Task with Instructions

Weizhen He, Yiheng Deng, Shixiang Tang et al.

CVPR 2024 • poster • arXiv:2306.07520
#3892

Device-Wise Federated Network Pruning

Shangqian Gao, Junyi Li, Zeyu Zhang et al.

CVPR 2024 • poster
#3893

SPOT: Self-Training with Patch-Order Permutation for Object-Centric Learning with Autoregressive Transformers

Ioannis Kakogeorgiou, Spyros Gidaris, Konstantinos Karantzalos et al.

CVPR 2024 • highlight • arXiv:2312.00648
#3894

OralXrays-9: Towards Hospital-Scale Panoramic X-ray Anomaly Detection via Personalized Multi-Object Query-Aware Mining

Bingzhi Chen, Sisi Fu, Xiaocheng Fang et al.

CVPR 2025 • oral
#3895

MRC-Net: 6-DoF Pose Estimation with MultiScale Residual Correlation

Yuelong Li, Yafei Mao, Raja Bala et al.

CVPR 2024 • poster • arXiv:2403.08019
#3896

Free on the Fly: Enhancing Flexibility in Test-Time Adaptation with Online EM

Qiyuan Dai, Sibei Yang

CVPR 2025 • poster • arXiv:2507.06973
#3897

Progress-Aware Online Action Segmentation for Egocentric Procedural Task Videos

Yuhan Shen, Ehsan Elhamifar

CVPR 2024 • poster
#3898

MOHO: Learning Single-view Hand-held Object Reconstruction with Multi-view Occlusion-Aware Supervision

Chenyangguang Zhang, Guanlong Jiao, Yan Di et al.

CVPR 2024 • poster • arXiv:2310.11696
#3899

Everything to the Synthetic: Diffusion-driven Test-time Adaptation via Synthetic-Domain Alignment

Jiayi Guo, Junhao Zhao, Chaoqun Du et al.

CVPR 2025 • poster • arXiv:2406.04295
#3900

Faces that Speak: Jointly Synthesising Talking Face and Speech from Text

Youngjoon Jang, Jihoon Kim, Junseok Ahn et al.

CVPR 2024 • poster • arXiv:2405.10272
#3901

Learning to Segment Referred Objects from Narrated Egocentric Videos

Yuhan Shen, Huiyu Wang, Xitong Yang et al.

CVPR 2024 • poster
#3902

EGTR: Extracting Graph from Transformer for Scene Graph Generation

Jinbae Im, JeongYeon Nam, Nokyung Park et al.

CVPR 2024 • poster • arXiv:2404.02072
#3903

Distributionally Generative Augmentation for Fair Facial Attribute Classification

Fengda Zhang, Qianpei He, Kun Kuang et al.

CVPR 2024 • poster • arXiv:2403.06606
#3904

PikeLPN: Mitigating Overlooked Inefficiencies of Low-Precision Neural Networks

Marina Neseem, Conor McCullough, Randy Hsin et al.

CVPR 2024 • poster • arXiv:2404.00103
#3905

Geometry-guided Online 3D Video Synthesis with Multi-View Temporal Consistency

Hyunho Ha, Lei Xiao, Christian Richardt et al.

CVPR 2025 • poster • arXiv:2505.18932
#3906

THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models

Prannay Kaul, Zhizhong Li, Hao Yang et al.

CVPR 2024 • poster • arXiv:2405.05256
#3907

UniVS: Unified and Universal Video Segmentation with Prompts as Queries

Minghan Li, Shuai Li, Xindong Zhang et al.

CVPR 2024 • poster • arXiv:2402.18115
#3908

Satellite to GroundScape - Large-scale Consistent Ground View Generation from Satellite Views

Ningli Xu, Rongjun Qin

CVPR 2025 • poster • arXiv:2504.15786
#3909

Continuous Space-Time Video Resampling with Invertible Motion Steganography

Yuantong Zhang, Zhenzhong Chen

CVPR 2025 • poster
#3910

Inlier Confidence Calibration for Point Cloud Registration

Yongzhe Yuan, Yue Wu, Xiaolong Fan et al.

CVPR 2024 • poster
#3911

CLIP-BEVFormer: Enhancing Multi-View Image-Based BEV Detector with Ground Truth Flow

Chenbin Pan, Burhan Yaman, Senem Velipasalar et al.

CVPR 2024 • poster • arXiv:2403.08919
#3912

Localized Concept Erasure for Text-to-Image Diffusion Models Using Training-Free Gated Low-Rank Adaptation

Byung Hyun Lee, Sungjin Lim, Se Young Chun

CVPR 2025 • poster • arXiv:2503.12356
#3913

ADFactory: An Effective Framework for Generalizing Optical Flow with NeRF

Han Ling, Quansen Sun, Yinghui Sun et al.

CVPR 2024 • poster • arXiv:2311.04246
#3914

3D Building Reconstruction from Monocular Remote Sensing Images with Multi-level Supervisions

Weijia Li, Haote Yang, Zhenghao Hu et al.

CVPR 2024 • poster • arXiv:2404.04823
#3915

In Search of a Data Transformation That Accelerates Neural Field Training

Junwon Seo, Sangyoon Lee, Kwang In Kim et al.

CVPR 2024 • poster • arXiv:2311.17094
#3916

FastMAC: Stochastic Spectral Sampling of Correspondence Graph

Yifei Zhang, Hao Zhao, Hongyang Li et al.

CVPR 2024 • poster • arXiv:2403.08770
#3917

Generative Hard Example Augmentation for Semantic Point Cloud Segmentation

Qi Zhang, Jibin Peng, Zhao Huang et al.

CVPR 2025 • poster
#3918

Focal Split: Untethered Snapshot Depth from Differential Defocus

Junjie Luo, John Mamish, Alan Fu et al.

CVPR 2025 • poster • arXiv:2504.11202
#3919

DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture

Qianlong Xiang, Miao Zhang, Yuzhang Shang et al.

CVPR 2025 • poster • arXiv:2409.03550
#3920

PeLK: Parameter-efficient Large Kernel ConvNets with Peripheral Convolution

Honghao Chen, Xiangxiang Chu, Yongjian Ren et al.

CVPR 2024 • poster • arXiv:2403.07589
#3921

Towards Generalizing to Unseen Domains with Few Labels

Chamuditha Jayanga Galappaththige, Sanoojan Baliah, Malitha Gunawardhana et al.

CVPR 2024 • poster • arXiv:2403.11674
#3922

Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts

Jialin Wu, Xia Hu, Yaqing Wang et al.

CVPR 2024 • highlight • arXiv:2312.00968
#3923

Improving Semantic Correspondence with Viewpoint-Guided Spherical Maps

Octave Mariotti, Oisin Mac Aodha, Hakan Bilen

CVPR 2024 • poster • arXiv:2312.13216
#3924

A Theory of Learning Unified Model via Knowledge Integration from Label Space Varying Domains

Dexuan Zhang, Thomas Westfechtel, Tatsuya Harada

CVPR 2025 • poster
#3925

Learning Degradation-Independent Representations for Camera ISP Pipelines

Yanhui Guo, Fangzhou Luo, Xiaolin Wu

CVPR 2024 • poster • arXiv:2307.00761
#3926

T-FAKE: Synthesizing Thermal Images for Facial Landmarking

Philipp Flotho, Moritz Piening, Anna Kukleva et al.

CVPR 2025 • poster • arXiv:2408.15127
#3927

A Subspace-Constrained Tyler's Estimator and its Applications to Structure from Motion

Feng Yu, Teng Zhang, Gilad Lerman

CVPR 2024 • poster • arXiv:2404.11590
#3928

Low-Rank Adaptation in Multilinear Operator Networks for Security-Preserving Incremental Learning

Huu Binh Ta, Duc Nguyen, Quyen Tran et al.

CVPR 2025 • poster
#3929

Low-Resource Vision Challenges for Foundation Models

Yunhua Zhang, Hazel Doughty, Cees G. M. Snoek

CVPR 2024 • poster • arXiv:2401.04716
#3930

Population Normalization for Federated Learning

Zhuoyao Wang, Fan Yi, Peizhu Gong et al.

CVPR 2025 • poster
#3931

MetaShadow: Object-Centered Shadow Detection, Removal, and Synthesis

Tianyu Wang, Jianming Zhang, Haitian Zheng et al.

CVPR 2025 • poster • arXiv:2412.02635
#3932

Pay Attention to the Foreground in Object-Centric Learning

Pinzhuo Tian, Shengjie Yang, Hang Yu et al.

CVPR 2025 • poster
#3933

Low-Latency Neural Stereo Streaming

Qiqi Hou, Farzad Farhadzadeh, Amir Said et al.

CVPR 2024 • poster • arXiv:2403.17879
#3934

Hierarchical Compact Clustering Attention (COCA) for Unsupervised Object-Centric Learning

Can Küçüksözen, Yucel Yemez

CVPR 2025 • poster • arXiv:2505.02071
#3935

CountLLM: Towards Generalizable Repetitive Action Counting via Large Language Model

Ziyu Yao, Xuxin Cheng, Zhiqi Huang et al.

CVPR 2025 • poster • arXiv:2503.17690
#3936

Your Transferability Barrier is Fragile: Free-Lunch for Transferring the Non-Transferable Learning

Ziming Hong, Li Shen, Tongliang Liu

CVPR 2024 • highlight
#3937

ARTrackV2: Prompting Autoregressive Tracker Where to Look and How to Describe

Yifan Bai, Zeyang Zhao, Yihong Gong et al.

CVPR 2024 • poster • arXiv:2312.17133
#3938

DPHMs: Diffusion Parametric Head Models for Depth-based Tracking

Jiapeng Tang, Angela Dai, Yinyu Nie et al.

CVPR 2024 • poster • arXiv:2312.01068
#3939

MaxQ: Multi-Axis Query for N:M Sparsity Network

Jingyang Xiang, Siqi Li, Junhao Chen et al.

CVPR 2024 • poster
#3940

MatAnyone: Stable Video Matting with Consistent Memory Propagation

Peiqing Yang, Shangchen Zhou, Jixin Zhao et al.

CVPR 2025 • poster • arXiv:2501.14677
#3941

The Mirrored Influence Hypothesis: Efficient Data Influence Estimation by Harnessing Forward Passes

Myeongseob Ko, Feiyang Kang, Weiyan Shi et al.

CVPR 2024 • poster • arXiv:2402.08922
#3942

Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models

Haoning Wu, Zicheng Zhang, Erli Zhang et al.

CVPR 2024 • poster • arXiv:2311.06783
#3943

FG^2: Fine-Grained Cross-View Localization by Fine-Grained Feature Matching

Zimin Xia, Alex Alahi

CVPR 2025 • poster • arXiv:2503.18725
#3944

Efficient Scene Recovery Using Luminous Flux Prior

ZhongYu Li, Lei Zhang

CVPR 2024 • poster
#3945

Structured Gradient-based Interpretations via Norm-Regularized Adversarial Training

Shizhan Gong, Qi Dou, Farzan Farnia

CVPR 2024 • poster • arXiv:2404.04647
#3946

Revisiting Global Translation Estimation with Feature Tracks

Peilin Tao, Hainan Cui, Mengqi Rong et al.

CVPR 2024 • poster
#3947

Forgery-aware Adaptive Transformer for Generalizable Synthetic Image Detection

Huan Liu, Zichang Tan, Chuangchuang Tan et al.

CVPR 2024 • poster • arXiv:2312.16649
#3948

RORem: Training a Robust Object Remover with Human-in-the-Loop

Ruibin Li, Tao Yang, Song Guo et al.

CVPR 2025 • poster • arXiv:2501.00740
#3949

MeaCap: Memory-Augmented Zero-shot Image Captioning

Zequn Zeng, Yan Xie, Hao Zhang et al.

CVPR 2024 • poster • arXiv:2403.03715
#3950

MuseChat: A Conversational Music Recommendation System for Videos

Zhikang Dong, Bin Chen, Xiulong Liu et al.

CVPR 2024 • highlight • arXiv:2310.06282
#3951

Novel View Synthesis with View-Dependent Effects from a Single Image

Juan Luis Gonzalez Bello, Munchurl Kim

CVPR 2024 • poster • arXiv:2312.08071
#3952

Orchestrate Latent Expertise: Advancing Online Continual Learning with Multi-Level Supervision and Reverse Self-Distillation

Hongwei Yan, Liyuan Wang, Kaisheng Ma et al.

CVPR 2024 • poster • arXiv:2404.00417
#3953

DisCo: Disentangled Control for Realistic Human Dance Generation

Tan Wang, Linjie Li, Kevin Lin et al.

CVPR 2024 • poster • arXiv:2307.00040
#3954

Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level

Andong Deng, Tongjia Chen, Shoubin Yu et al.

CVPR 2025 • poster • arXiv:2411.09921
#3955

Suppress and Rebalance: Towards Generalized Multi-Modal Face Anti-Spoofing

Xun Lin, Shuai Wang, Rizhao Cai et al.

CVPR 2024 • highlight • arXiv:2402.19298
#3956

Constructing and Exploring Intermediate Domains in Mixed Domain Semi-supervised Medical Image Segmentation

Qinghe Ma, Jian Zhang, Lei Qi et al.

CVPR 2024 • poster • arXiv:2404.08951
#3957

LAMP: Learn A Motion Pattern for Few-Shot Video Generation

Rui-Qi Wu, Liangyu Chen, Tong Yang et al.

CVPR 2024 • poster
#3958

Curriculum Coarse-to-Fine Selection for High-IPC Dataset Distillation

Yanda Chen, Gongwei Chen, Miao Zhang et al.

CVPR 2025 • poster • arXiv:2503.18872
#3959

PixelLM: Pixel Reasoning with Large Multimodal Model

Zhongwei Ren, Zhicheng Huang, Yunchao Wei et al.

CVPR 2024 • poster • arXiv:2312.02228
#3960

Towards CLIP-driven Language-free 3D Visual Grounding via 2D-3D Relational Enhancement and Consistency

Yuqi Zhang, Han Luo, Yinjie Lei

CVPR 2024 • poster
#3961

iKUN: Speak to Trackers without Retraining

Yunhao Du, Cheng Lei, Zhicheng Zhao et al.

CVPR 2024 • poster • arXiv:2312.16245
#3962

Neural Fields as Distributions: Signal Processing Beyond Euclidean Space

Daniel Rebain, Soroosh Yazdani, Kwang Moo Yi et al.

CVPR 2024 • poster
#3963

Weakly-Supervised Emotion Transition Learning for Diverse 3D Co-speech Gesture Generation

Xingqun Qi, Jiahao Pan, Peng Li et al.

CVPR 2024 • poster • arXiv:2311.17532
#3964

LaRE^2: Latent Reconstruction Error Based Method for Diffusion-Generated Image Detection

Yunpeng Luo, Junlong Du, Ke Yan et al.

CVPR 2024 • poster • arXiv:2403.17465
#3965

Stratified Avatar Generation from Sparse Observations

Han Feng, Wenchao Ma, Quankai Gao et al.

CVPR 2024 • poster • arXiv:2405.20786
#3966

Few-shot Learner Parameterization by Diffusion Time-steps

Zhongqi Yue, Pan Zhou, Richang Hong et al.

CVPR 2024 • poster • arXiv:2403.02649
#3967

Global and Hierarchical Geometry Consistency Priors for Few-shot NeRFs in Indoor Scenes

Xiaotian Sun, Qingshan Xu, Xinjie Yang et al.

CVPR 2024 • poster
#3968

Lookahead Exploration with Neural Radiance Representation for Continuous Vision-Language Navigation

Zihan Wang, Xiangyang Li, Jiahao Yang et al.

CVPR 2024 • highlight • arXiv:2404.01943
#3969

Compressed 3D Gaussian Splatting for Accelerated Novel View Synthesis

Simon Niedermayr, Josef Stumpfegger, Rüdiger Westermann

CVPR 2024 • poster • arXiv:2401.02436
#3970

The STVchrono Dataset: Towards Continuous Change Recognition in Time

Yanjun Sun, Yue Qiu, Mariia Khan et al.

CVPR 2024 • poster
#3971

IEEE Computer Society

CVPR 2025
#3972

Unleashing Channel Potential: Space-Frequency Selection Convolution for SAR Object Detection

Ke Li, Di Wang, Zhangyuan Hu et al.

CVPR 2024 • poster
#3973

Motion Blur Decomposition with Cross-shutter Guidance

Xiang Ji, Haiyang Jiang, Yinqiang Zheng

CVPR 2024 • poster • arXiv:2404.01120
#3974

LMO: Linear Mamba Operator for MRI Reconstruction

Wei Li, Jiawei Jiang, Jie Wu et al.

CVPR 2025 • poster
#3975

Digital Twin Catalog: A Large-Scale Photorealistic 3D Object Digital Twin Dataset

Zhao Dong, Ka Chen, Zhaoyang Lv et al.

CVPR 2025 • highlight • arXiv:2504.08541
#3976

HVI: A New Color Space for Low-light Image Enhancement

Qingsen Yan, Yixu Feng, Cheng Zhang et al.

CVPR 2025 • poster • arXiv:2502.20272
#3977

Sparse Point Cloud Patches Rendering via Splitting 2D Gaussians

Changfeng Ma, Ran Bi, Jie Guo et al.

CVPR 2025 • poster • arXiv:2505.09413
#3978

FinePhys: Fine-grained Human Action Generation by Explicitly Incorporating Physical Laws for Effective Skeletal Guidance

Dian Shao, Mingfei Shi, Shengda Xu et al.

CVPR 2025 • poster • arXiv:2505.13437
#3979

LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge

Gongwei Chen, Leyang Shen, Rui Shao et al.

CVPR 2024 • poster • arXiv:2311.11860
#3980

Pixel-Aligned Language Model

Jiarui Xu, Xingyi Zhou, Shen Yan et al.

CVPR 2024 • poster
#3981

Eclipse: Disambiguating Illumination and Materials using Unintended Shadows

Dor Verbin, Ben Mildenhall, Peter Hedman et al.

CVPR 2024 • poster • arXiv:2305.16321
#3982

ConvoFusion: Multi-Modal Conversational Diffusion for Co-Speech Gesture Synthesis

Muhammad Hamza Mughal, Rishabh Dabral, Ikhsanul Habibie et al.

CVPR 2024 • poster • arXiv:2403.17936
#3983

2S-UDF: A Novel Two-stage UDF Learning Method for Robust Non-watertight Model Reconstruction from Multi-view Images

Junkai Deng, Fei Hou, Xuhui Chen et al.

CVPR 2024 • poster • arXiv:2303.15368
#3984

ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models

Lukas Höllein, Aljaž Božič, Norman Müller et al.

CVPR 2024 • poster • arXiv:2403.01807
#3985

Cross-modal Information Flow in Multimodal Large Language Models

Zhi Zhang, Srishti Yadav, Fengze Han et al.

CVPR 2025 • poster • arXiv:2411.18620
#3986

Unsupervised Learning of Category-Level 3D Pose from Object-Centric Videos

Leonhard Sommer, Artur Jesslen, Eddy Ilg et al.

CVPR 2024 • poster • arXiv:2407.04384
#3987

OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels

Meng Lou, Yizhou Yu

CVPR 2025 • poster • arXiv:2502.20087
#3988

Multi-View Pose-Agnostic Change Localization with Zero Labels

Chamuditha Jayanga Galappaththige, Jason Lai, Lloyd Windrim et al.

CVPR 2025 • poster • arXiv:2412.03911
#3989

Taming Stable Diffusion for Text to 360 Panorama Image Generation

Cheng Zhang, Qianyi Wu, Camilo Cruz Gambardella et al.

CVPR 2024 • highlight • arXiv:2404.07949
#3990

DViN: Dynamic Visual Routing Network for Weakly Supervised Referring Expression Comprehension

Xiaofu Chen, Yaxin Luo, Luo et al.

CVPR 2025 • poster
#3991

CAMEL: CAusal Motion Enhancement Tailored for Lifting Text-driven Video Editing

Guiwei Zhang, Tianyu Zhang, Guanglin Niu et al.

CVPR 2024 • poster
#3992

Prior-free 3D Object Tracking

Xiuqiang Song, Li Jin, Zhengxian Zhang et al.

CVPR 2025 • highlight
#3993

Conical Visual Concentration for Efficient Large Vision-Language Models

Long Xing, Qidong Huang, Xiaoyi Dong et al.

CVPR 2025 • poster
#3994

DuPL: Dual Student with Trustworthy Progressive Learning for Robust Weakly Supervised Semantic Segmentation

Yuanchen Wu, Xichen Ye, Kequan Yang et al.

CVPR 2024 • poster • arXiv:2403.11184
#3995

A Physics-informed Low-rank Deep Neural Network for Blind and Universal Lens Aberration Correction

Jin Gong, Runzhao Yang, Weihang Zhang et al.

CVPR 2024 • poster
#3996

NAPGuard: Towards Detecting Naturalistic Adversarial Patches

Siyang Wu, Jiakai Wang, Jiejie Zhao et al.

CVPR 2024 • poster
#3997

Descriptor and Word Soups: Overcoming the Parameter Efficiency Accuracy Tradeoff for Out-of-Distribution Few-shot Learning

Christopher Liao, Theodoros Tsiligkaridis, Brian Kulis

CVPR 2024 • poster • arXiv:2311.13612
#3998

A Stealthy Wrongdoer: Feature-Oriented Reconstruction Attack against Split Learning

Xiaoyang Xu, Mengda Yang, Wenzhe Yi et al.

CVPR 2024 • poster • arXiv:2405.04115
#3999

Bootstrapping SparseFormers from Vision Foundation Models

Ziteng Gao, Zhan Tong, Kevin Qinghong Lin et al.

CVPR 2024 • poster • arXiv:2312.01987
#4000

BiomedCoOp: Learning to Prompt for Biomedical Vision-Language Models

Taha Koleilat, Hojat Asgariandehkordi, Hassan Rivaz et al.

CVPR 2025 • poster • arXiv:2411.15232