Most Cited CVPR "sparse neural networks" Papers

CVPR 2025 highlight arXiv:2412.03968

#2002

Exact: Exploring Space-Time Perceptive Clues for Weakly Supervised Satellite Image Time Series Semantic Segmentation

Hao Zhu, Yan Zhu, Jiayu Xiao et al.

CVPR 2024 poster arXiv:2403.16937

#2003

Hyperspherical Classification with Dynamic Label-to-Prototype Assignment

Mohammad Saadabadi Saadabadi, Ali Dabouei, Sahar Rahimi Malakshan et al.

CVPR 2024 poster arXiv:2404.02242

#2004

Towards Robust 3D Pose Transfer with Adversarial Learning

Haoyu Chen, Hao Tang, Ehsan Adeli et al.

CVPR 2025 poster arXiv:2504.18856

#2005

Multi-Resolution Pathology-Language Pre-training Model with Text-Guided Visual Representation

Shahad Albastaki, Anabia Sohail, Iyyakutti Iyappan Ganapathi et al.

CVPR 2025 poster arXiv:2411.09998

#2006

Adaptive Non-Uniform Timestep Sampling for Accelerating Diffusion Model Training

Myunsoo Kim, Donghyeon Ki, Seong-Woong Shim et al.

CVPR 2024 poster arXiv:2403.17638

#2007

Learning with Unreliability: Fast Few-shot Voxel Radiance Fields with Relative Geometric Consistency

Xu Yingjie, Bangzhen Liu, Hao Tang et al.

CVPR 2024 poster arXiv:2210.05248

#2008

Self-supervised Debiasing Using Low Rank Regularization

Geon Yeong Park, Chanyong Jung, Sangmin Lee et al.

CVPR 2025 poster arXiv:2501.01589

#2009

D^3-Human: Dynamic Disentangled Digital Human from Monocular Video

Honghu Chen, Bo Peng, Yunfan Tao et al.

#2010

Audio-Visual Semantic Graph Network for Audio-Visual Event Localization

Liang Liu, Shuaiyong Li, Yongqiang Zhu

#2011

On the Zero-shot Adversarial Robustness of Vision-Language Models: A Truly Zero-shot and Training-free Approach

Baoshun Tong, Hanjiang Lai, Yan Pan et al.

#2012

Detect Any Mirrors: Boosting Learning Reliability on Large-Scale Unlabeled Data with an Iterative Data Engine

Zhaohu Xing, Lihao Liu, Yijun Yang et al.

CVPR 2024 poster arXiv:2404.00252

#2013

Learned Scanpaths Aid Blind Panoramic Video Quality Assessment

Kanglong Fan, Wen Wen, Mu Li et al.

#2014

AniMo: Species-Aware Model for Text-Driven Animal Motion Generation

Xuan Wang, Kai Ruan, Xing Zhang et al.

CVPR 2025 poster arXiv:2412.18565

#2015

3DEnhancer: Consistent Multi-View Diffusion for 3D Enhancement

Yihang Luo, Shangchen Zhou, Yushi Lan et al.

CVPR 2025 highlight arXiv:2503.06956

#2016

LaTexBlend: Scaling Multi-concept Customized Generation with Latent Textual Blending

Jian Jin, Zhenbo Yu, Yang Shen et al.

#2017

Zero-shot 3D Question Answering via Voxel-based Dynamic Token Compression

Hsiang-Wei Huang, Fu-Chen Chen, Wenhao Chai et al.

#2018

Understanding Fine-tuning CLIP for Open-vocabulary Semantic Segmentation in Hyperbolic Space

Zelin Peng, Zhengqin Xu, Zhilin Zeng et al.

#2019

HeMoRa: Unsupervised Heuristic Consensus Sampling for Robust Point Cloud Registration

Shaocheng Yan, Yiming Wang, Kaiyan Zhao et al.

CVPR 2024 poster arXiv:2311.18695

#2020

Seg2Reg: Differentiable 2D Segmentation to 1D Regression Rendering for 360 Room Layout Reconstruction

Cheng Sun, Wei-En Tai, Yu-Lin Shih et al.

CVPR 2025 poster arXiv:2412.10084

#2021

ProbeSDF: Light Field Probes For Neural Surface Reconstruction

Briac Toussaint, Diego Thomas, Jean-Sébastien Franco

CVPR 2024 poster arXiv:2211.14309

#2022

FutureHuman3D: Forecasting Complex Long-Term 3D Human Behavior from Video Observations

Christian Diller, Thomas Funkhouser, Angela Dai

CVPR 2025 highlight arXiv:2411.15678

#2023

Towards RAW Object Detection in Diverse Conditions

Zhong-Yu Li, Xin Jin, Bo-Yuan Sun et al.

CVPR 2025 poster arXiv:2410.13924

#2024

ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding

Guangda Ji, Silvan Weder, Francis Engelmann et al.

CVPR 2025 poster arXiv:2412.16645

#2025

Complementary Advantages: Exploiting Cross-Field Frequency Correlation for NIR-Assisted Image Denoising

Yuchen Wang, Hongyuan Wang, Lizhi Wang et al.

CVPR 2025 poster arXiv:2411.18654

#2026

AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward

Haonan Han, Xiangzuo Wu, Huan Liao et al.

CVPR 2025 poster arXiv:2502.19894

#2027

High-Fidelity Relightable Monocular Portrait Animation with Lighting-Controllable Video Diffusion Model

Mingtao Guo, Guanyu Xing, Yanli Liu

CVPR 2025 poster arXiv:2404.14414

#2028

Removing Reflections from RAW Photos

Eric Kee, Adam Pikielny, Kevin Blackburn-Matzen et al.

CVPR 2025 poster arXiv:2505.01428

#2029

Multi-party Collaborative Attention Control for Image Customization

Han Yang, Chuanguang Yang, Qiuli Wang et al.

CVPR 2025 poster arXiv:2411.19756

#2030

DeSplat: Decomposed Gaussian Splatting for Distractor-Free Rendering

Yihao Wang, Marcus Klasson, Matias Turkulainen et al.

CVPR 2025 poster arXiv:2504.01515

#2031

Training-free Dense-Aligned Diffusion Guidance for Modular Conditional Image Synthesis

Zixuan Wang, Duo Peng, Feng Chen et al.

#2032

TSAM: Temporal SAM Augmented with Multimodal Prompts for Referring Audio-Visual Segmentation

Abduljalil Radman, Jorma Laaksonen

#2033

A Polarization-Aided Transformer for Image Deblurring via Motion Vector Decomposition

Duosheng Chen, Shihao Zhou, Jinshan Pan et al.

CVPR 2025 highlight

CVPR 2025 poster arXiv:2411.12951

#2034

On the Consistency of Video Large Language Models in Temporal Comprehension

Minjoon Jung, Junbin Xiao, Byoung-Tak Zhang et al.

CVPR 2025 poster arXiv:2411.16799

#2035

One is Plenty: A Polymorphic Feature Interpreter for Immutable Heterogeneous Collaborative Perception

Yuchen Xia, Quan Yuan, Guiyang Luo et al.

CVPR 2025 highlight arXiv:2504.05046

#2036

MotionPRO: Exploring the Role of Pressure in Human MoCap and Beyond

Shenghao Ren, Yi Lu, Jiayi Huang et al.

CVPR 2024 poster arXiv:2308.10638

#2037

SCULPT: Shape-Conditioned Unpaired Learning of Pose-dependent Clothed and Textured Human Meshes

Soubhik Sanyal, Partha Ghosh, Jinlong Yang et al.

#2038

Flow-Guided Online Stereo Rectification for Wide Baseline Stereo

Anush Kumar, Fahim Mannan, Omid Hosseini Jafari et al.

#2039

ShapeWalk: Compositional Shape Editing Through Language-Guided Chains

Habib Slim, Mohamed Elhoseiny

CVPR 2025 poster arXiv:2503.18359

#2040

Context-Enhanced Memory-Refined Transformer for Online Action Detection

Zhanzhong Pang, Fadime Sener, Angela Yao

CVPR 2025 poster arXiv:2504.10857

#2041

ZeroGrasp: Zero-Shot Shape Reconstruction Enabled Robotic Grasping

Shun Iwase, Muhammad Zubair Irshad, Katherine Liu et al.

CVPR 2025 poster arXiv:2412.19712

#2042

From Elements to Design: A Layered Approach for Automatic Graphic Design Composition

Jiawei Lin, Shizhao Sun, Danqing Huang et al.

CVPR 2024 poster arXiv:2404.12322

#2043

Generalizable Face Landmarking Guided by Conditional Face Warping

Jiayi Liang, Haotian Liu, Hongteng Xu et al.

CVPR 2025 poster arXiv:2506.07865

#2044

FreeGave: 3D Physics Learning from Dynamic Videos by Gaussian Velocity

Jinxi Li, Ziyang Song, Siyuan Zhou et al.

CVPR 2025 poster arXiv:2503.00591

#2045

AesthetiQ: Enhancing Graphic Layout Design via Aesthetic-Aware Preference Alignment of Multi-modal Large Language Models

Sohan Patnaik, Rishabh Jain, Balaji Krishnamurthy et al.

CVPR 2024 poster arXiv:2310.12153

#2046

Probabilistic Sampling of Balanced K-Means using Adiabatic Quantum Computing

Jan-Nico Zaech, Martin Danelljan, Tolga Birdal et al.

#2047

Dynamic Stereotype Theory Induced Micro-expression Recognition with Oriented Deformation

Bohao Zhang, Xuejiao Wang, Changbo Wang et al.

CVPR 2025 poster arXiv:2406.16321

#2048

Mosaic of Modalities: A Comprehensive Benchmark for Multimodal Graph Learning

Jing Zhu, Yuhang Zhou, Shengyi Qian et al.

CVPR 2025 poster arXiv:2412.05278

#2049

Birth and Death of a Rose

Chen Geng, Yunzhi Zhang, Shangzhe Wu et al.

#2050

Enhancing Testing-Time Robustness for Trusted Multi-View Classification in the Wild

Wei Liu, Yufei Chen, Xiaodong Yue

CVPR 2024 poster arXiv:2403.06846

#2051

DiaLoc: An Iterative Approach to Embodied Dialog Localization

Chao Zhang, Mohan Li, Ignas Budvytis et al.

CVPR 2025 poster arXiv:2503.18695

#2052

OCRT: Boosting Foundation Models in the Open World with Object-Concept-Relation Triad

Luyao Tang, Chaoqi Chen, Yuxuan Yuan et al.

CVPR 2025 poster arXiv:2408.07790

#2053

Cropper: Vision-Language Model for Image Cropping through In-Context Learning

Seung Hyun Lee, Jijun Jiang, Yiran Xu et al.

CVPR 2025 poster arXiv:2405.16414

#2054

Robust Message Embedding via Attention Flow-Based Steganography

Huayuan Ye, Shenzhuo Zhang, Shiqi Jiang et al.

CVPR 2025 highlight arXiv:2410.10604

#2055

Multi-modal Vision Pre-training for Medical Image Analysis

Shaohao Rui, Lingzhi Chen, Zhenyu Tang et al.

CVPR 2025 highlight arXiv:2412.03937

#2056

AIpparel: A Multimodal Foundation Model for Digital Garments

Kiyohiro Nakayama, Jan Ackermann, Timur Levent Kesdogan et al.

CVPR 2024 poster arXiv:2403.17761

#2057

Makeup Prior Models for 3D Facial Makeup Estimation and Applications

Xingchao Yang, Takafumi Taketomi, Yuki Endo et al.

CVPR 2024 poster arXiv:2401.10219

#2058

Edit One for All: Interactive Batch Image Editing

Thao Nguyen, Utkarsh Ojha, Yuheng Li et al.

CVPR 2024 poster arXiv:2403.11812

#2059

Aerial Lifting: Neural Urban Semantic and Building Instance Lifting from Aerial Imagery

Yuqi Zhang, Guanying Chen, Jiaxing Chen et al.

CVPR 2025 highlight arXiv:2412.19637

#2060

ReNeg: Learning Negative Embedding with Reward Guidance

Xiaomin Li, Yixuan Liu, Takashi Isobe et al.

CVPR 2025 poster arXiv:2506.18557

#2061

Object-aware Sound Source Localization via Audio-Visual Scene Understanding

Sung Jin Um, Dongjin Kim, Sangmin Lee et al.

CVPR 2025 poster arXiv:2411.15231

#2062

IterIS: Iterative Inference-Solving Alignment for LoRA Merging

Hongxu Chen, Zhen Wang, Runshi Li et al.

CVPR 2025 poster arXiv:2504.17813

#2063

CLOC: Contrastive Learning for Ordinal Classification with Multi-Margin N-pair Loss

Dileepa Pitawela, Gustavo Carneiro, Hsiang-Ting Chen

CVPR 2025 poster arXiv:2503.04565

#2064

Omnidirectional Multi-Object Tracking

Kai Luo, Hao Shi, Sheng Wu et al.

CVPR 2024 poster arXiv:2311.16304

#2065

Robust Self-calibration of Focal Lengths from the Fundamental Matrix

Viktor Kocur, Daniel Kyselica, Zuzana Kukelova

CVPR 2024 highlight arXiv:2401.13296

#2066

Visual Objectification in Films: Towards a New AI Task for Video Interpretation

Julie Tores, Lucile Sassatelli, Hui-Yin Wu et al.

CVPR 2025 poster arXiv:2503.22168

#2067

Spatial Transport Optimization by Repositioning Attention Map for Training-Free Text-to-Image Synthesis

Woojung Han, Yeonkyung Lee, Chanyoung Kim et al.

CVPR 2025 highlight arXiv:2412.06767

#2068

MAtCha Gaussians: Atlas of Charts for High-Quality Geometry and Photorealism From Sparse Views

Antoine Guédon, Tomoki Ichikawa, Kohei Yamashita et al.

CVPR 2025 poster arXiv:2506.10966

#2069

GENMANIP: LLM-driven Simulation for Generalizable Instruction-Following Manipulation

Ning Gao, Yilun Chen, Shuai Yang et al.

CVPR 2024 poster arXiv:2311.17902

#2070

Language-conditioned Detection Transformer

Jang Hyun Cho, Philipp Krähenbühl

CVPR 2025 highlight arXiv:2506.11543

#2071

FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation

Zhuguanyu Wu, Shihe Wang, Jiayi Zhang et al.

CVPR 2025 poster arXiv:2411.16173

#2072

SALOVA: Segment-Augmented Long Video Assistant for Targeted Retrieval and Routing in Long-Form Video Analysis

Junho Kim, Hyunjun Kim, Hosu Lee et al.

#2073

Detection-Friendly Nonuniformity Correction: A Union Framework for Infrared UAV Target Detection

Houzhang Fang, Xiaolin Wang, Zengyang Li et al.

CVPR 2025 highlight

CVPR 2025 poster arXiv:2502.19739

#2074

LUCAS: Layered Universal Codec Avatars

Di Liu, Teng Deng, Giljoo Nam et al.

CVPR 2025 poster arXiv:2503.15404

#2075

Improving Adversarial Transferability on Vision Transformers via Forward Propagation Refinement

Yuchen Ren, Zhengyu Zhao, Chenhao Lin et al.

CVPR 2025 poster arXiv:2501.08303

#2076

Advancing Semantic Future Prediction through Multimodal Visual Sequence Transformers

Efstathios Karypidis, Ioannis Kakogeorgiou, Spyros Gidaris et al.

CVPR 2025 poster arXiv:2411.07765

#2077

Novel View Synthesis with Pixel-Space Diffusion Models

Noam Elata, Bahjat Kawar, Yaron Ostrovsky-Berman et al.

#2078

Pos3R: 6D Pose Estimation for Unseen Objects Made Easy

Weijian Deng, Dylan Campbell, Chunyi Sun et al.

CVPR 2025 poster arXiv:2503.24382

#2079

Free360: Layered Gaussian Splatting for Unbounded 360-Degree View Synthesis from Extremely Sparse and Unposed Views

Chong Bao, Xiyu Zhang, Zehao Yu et al.

CVPR 2024 poster arXiv:2403.07560

#2080

Unleashing Network Potentials for Semantic Scene Completion

Fengyun Wang, Qianru Sun, Dong Zhang et al.

CVPR 2025 highlight arXiv:2503.21076

#2081

KAC: Kolmogorov-Arnold Classifier for Continual Learning

Yusong Hu, Zichen Liang, Fei Yang et al.

CVPR 2025 poster arXiv:2503.02101

#2082

Generalized Diffusion Detector: Mining Robust Features from Diffusion Models for Domain-Generalized Detection

Boyong He, Yuxiang Ji, Qianwen Ye et al.

CVPR 2025 poster arXiv:2503.12042

#2083

Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie Dubbing

Zhedong Zhang, Liang Li, Chenggang Yan et al.

#2084

Efficient Hyperparameter Optimization with Adaptive Fidelity Identification

Jiantong Jiang, Zeyi Wen, Atif Mansoor et al.

CVPR 2025 highlight arXiv:2503.09962

#2085

Modeling Thousands of Human Annotators for Generalizable Text-to-Image Person Re-identification

Jiayu Jiang, Changxing Ding, Wentao Tan et al.

CVPR 2025 poster arXiv:2503.19191

#2086

FDS: Frequency-Aware Denoising Score for Text-Guided Latent Diffusion Image Editing

Yufan Ren, Zicong Jiang, Tong Zhang et al.

CVPR 2025 poster arXiv:2503.00548

#2087

Unbiased Video Scene Graph Generation via Visual and Semantic Dual Debiasing

Yanjun Li, Zhaoyang Li, Honghui Chen et al.

CVPR 2025 poster arXiv:2406.05826

#2088

PSBD: Prediction Shift Uncertainty Unlocks Backdoor Detection

Wei Li, Pin-Yu Chen, Sijia Liu et al.

CVPR 2025 poster arXiv:2412.09910

#2089

Prompt2Perturb (P2P): Text-Guided Diffusion-Based Adversarial Attack on Breast Ultrasound Images

Yasamin Medghalchi, Moein Heidari, Clayton Allard et al.

CVPR 2025 poster arXiv:2412.07739

#2090

GASP: Gaussian Avatars with Synthetic Priors

Jack Saunders, Charlie Hewitt, Yanan Jian et al.

CVPR 2025 poster arXiv:2504.04834

#2091

Learning Affine Correspondences by Integrating Geometric Constraints

Pengju Sun, Banglei Guan, Zhenbao Yu et al.

CVPR 2025 poster arXiv:2504.16030

#2092

LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale

Joya Chen, Yiqi Lin, Ziyun Zeng et al.

CVPR 2025 poster arXiv:2505.23068

#2093

URWKV: Unified RWKV Model with Multi-state Perspective for Low-light Image Restoration

Rui Xu, Yuzhen Niu, Yuezhou Li et al.

CVPR 2024 poster arXiv:2404.01743

#2094

Atom-Level Optical Chemical Structure Recognition with Limited Supervision

Martijn Oldenhof, Edward De Brouwer, Adam Arany et al.

CVPR 2025 highlight arXiv:2505.21943

#2095

Point-to-Region Loss for Semi-Supervised Point-Based Crowd Counting

Wei Lin, Chenyang Zhao, Antoni B. Chan

#2096

Multi-modal Medical Diagnosis via Large-small Model Collaboration

Wanyi Chen, Zihua Zhao, Jiangchao Yao et al.

CVPR 2024 poster arXiv:2405.19646

#2097

FaceLift: Semi-supervised 3D Facial Landmark Localization

David Ferman, Pablo Garrido, Gaurav Bharaj

CVPR 2025 poster arXiv:2412.05161

#2098

DNF: Unconditional 4D Generation with Dictionary-based Neural Fields

Xinyi Zhang, Naiqi Li, Angela Dai

CVPR 2025 poster arXiv:2503.15110

#2099

GIVEPose: Gradual Intra-class Variation Elimination for RGB-based Category-Level Object Pose Estimation

Ziqin Huang, Gu Wang, Chenyangguang Zhang et al.

#2100

Simplification Is All You Need against Out-of-Distribution Overconfidence

Keke Tang, Chao Hou, Weilong Peng et al.

CVPR 2025 poster arXiv:2503.22328

#2101

VoteFlow: Enforcing Local Rigidity in Self-Supervised Scene Flow

Yancong Lin, Shiming Wang, Liangliang Nan et al.

CVPR 2025 poster arXiv:2504.16023

#2102

PointLoRA: Low-Rank Adaptation with Token Selection for Point Cloud Learning

Song Wang, Xiaolu Liu, Lingdong Kong et al.

CVPR 2024 poster arXiv:2405.01356

#2103

Improving Subject-Driven Image Synthesis with Subject-Agnostic Guidance

Kelvin C.K. Chan, Yang Zhao, Xuhui Jia et al.

#2104

Radio Frequency Ray Tracing with Neural Object Representation for Enhanced RF Modeling

Xingyu Chen, Zihao Feng, Kun Qian et al.

#2105

DaCapo: Score Distillation as Stacked Bridge for Fast and High-quality 3D Editing

Yufei Huang, Bangyan Liao, Yuqi Hu et al.

CVPR 2025 highlight arXiv:2412.20651

#2106

Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis

Yousef Yeganeh, Ioannis Charisiadis, Marta Hasny et al.

CVPR 2025 poster arXiv:2403.10344

#2107

ViiNeuS: Volumetric Initialization for Implicit Neural Surface Reconstruction of Urban Scenes with Limited Image Overlap

Hala Djeghim, Nathan Piasco, Moussab Bennehar et al.

CVPR 2025 poster arXiv:2506.08210

#2108

A Comprehensive Study of Decoder-Only LLMs for Text-to-Image Generation

Andrew Z Wang, Songwei Ge, Tero Karras et al.

CVPR 2025 poster arXiv:2503.01175

#2109

HOP: Heterogeneous Topology-based Multimodal Entanglement for Co-Speech Gesture Generation

Hongye Cheng, Tianyu Wang, Guangsi Shi et al.

#2110

Dual-Agent Optimization framework for Cross-Domain Few-Shot Segmentation

Zhaoyang Li, Yuan Wang, Wangkai Li et al.

CVPR 2025 poster arXiv:2503.04006

#2111

DSV-LFS: Unifying LLM-Driven Semantic Cues with Visual Features for Robust Few-Shot Segmentation

Amin Karimi, Charalambos Poullis

CVPR 2025 highlight arXiv:2411.18180

#2112

DistinctAD: Distinctive Audio Description Generation in Contexts

Bo Fang, Wenhao Wu, Qiangqiang Wu et al.

CVPR 2025 poster arXiv:2404.10620

#2113

PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction

Sinisa Stekovic, Arslan Artykov, Stefan Ainetter et al.

CVPR 2024 poster arXiv:2405.19819

#2114

Gated Fields: Learning Scene Reconstruction from Gated Videos

Andrea Ramazzina, Stefanie Walz, Pragyan Dahal et al.

#2115

One-Way Ticket: Time-Independent Unified Encoder for Distilling Text-to-Image Diffusion Models

Senmao Li, Lei Wang, Kai Wang et al.

CVPR 2025 poster arXiv:2503.19653

#2116

OpenSDI: Spotting Diffusion-Generated Images in the Open World

Yabin Wang, Zhiwu Huang, Xiaopeng Hong

CVPR 2025 poster arXiv:2503.21777

#2117

Test-Time Visual In-Context Tuning

Jiahao Xie, Alessio Tonioni, Nathalie Rauschmayr et al.

CVPR 2025 highlight arXiv:2409.17993

#2118

SSHNet: Unsupervised Cross-modal Homography Estimation via Problem Reformulation and Split Optimization

Junchen Yu, Siyuan Cao, Runmin Zhang et al.

CVPR 2025 poster arXiv:2503.16406

#2119

VerbDiff: Text-Only Diffusion Models with Enhanced Interaction Awareness

SeungJu Cha, Kwanyoung Lee, Ye-Chan Kim et al.

CVPR 2025 poster arXiv:2503.01653

#2120

Distilled Prompt Learning for Incomplete Multimodal Survival Prediction

Yingxue Xu, Fengtao Zhou, Chenyu Zhao et al.

CVPR 2025 poster arXiv:2505.05505

#2121

Apply Hierarchical-Chain-of-Generation to Complex Attributes Text-to-3D Generation

Yiming Qin, Zhu Xu, Yang Liu

CVPR 2025 poster arXiv:2307.16375

#2122

UniAP: Unifying Inter- and Intra-Layer Automatic Parallelism by Mixed Integer Quadratic Programming

Hao Lin, Ke Wu, Jie Li et al.

CVPR 2025 highlight arXiv:2411.17763

#2123

Symmetry Strikes Back: From Single-Image Symmetry Detection to 3D Generation

Xiang Li, Zixuan Huang, Anh Thai et al.

CVPR 2025 poster arXiv:2503.01980

#2124

Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval

Davide Caffagni, Sara Sarto, Marcella Cornia et al.

CVPR 2024 poster arXiv:2311.14097

#2125

ACT-Diffusion: Efficient Adversarial Consistency Training for One-step Diffusion Models

Fei Kong, Jinhao Duan, Lichao Sun et al.

CVPR 2025 poster arXiv:2503.18055

#2126

PolarFree: Polarization-based Reflection-Free Imaging

Mingde Yao, Menglu Wang, King Man Tam et al.

CVPR 2025 highlight arXiv:2503.19718

#2127

QuCOOP: A Versatile Framework for Solving Composite and Binary-Parametrised Problems on Quantum Annealers

Natacha Kuete Meli, Vladislav Golyanik, Marcel Seelbach Benkner et al.

CVPR 2025 poster arXiv:2504.06815

#2128

SVG-IR: Spatially-Varying Gaussian Splatting for Inverse Rendering

Hanxiao Sun, Yupeng Gao, Jin Xie et al.

CVPR 2025 poster arXiv:2503.18406

#2129

Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning

Sherry X. Chen, Misha Sra, Pradeep Sen

CVPR 2025 poster arXiv:2411.12089

#2130

FruitNinja: 3D Object Interior Texture Generation with Gaussian Splatting

Fangyu Wu, Yuhao Chen

CVPR 2025 poster arXiv:2503.00063

#2131

NoPain: No-box Point Cloud Attack via Optimal Transport Singular Boundary

Zezeng Li, Xiaoyu Du, Na Lei et al.

CVPR 2025 poster arXiv:2506.05313

#2132

MARBLE: Material Recomposition and Blending in CLIP-Space

Ta-Ying Cheng, Prafull Sharma, Mark Boss et al.

#2133

Noise-Resistant Video Anomaly Detection via RGB Error-Guided Multiscale Predictive Coding and Dynamic Memory

Han Hu, Wenli Du, Peng Liao et al.

CVPR 2025 highlight arXiv:2411.15580

#2134

TKG-DM: Training-free Chroma Key Content Generation Diffusion Model

Ryugo Morita, Stanislav Frolov, Brian Bernhard Moser et al.

CVPR 2025 poster arXiv:2503.13961

#2135

BG-Triangle: Bézier Gaussian Triangle for 3D Vectorization and Rendering

Minye Wu, Haizhao Dai, Kaixin Yao et al.

CVPR 2025 poster arXiv:2506.08005

#2136

ZeroVO: Visual Odometry with Minimal Assumptions

Lei Lai, Zekai Yin, Eshed Ohn-Bar

CVPR 2025 poster arXiv:2503.04718

#2137

Floxels: Fast Unsupervised Voxel Based Scene Flow Estimation

David T. Hoffmann, Syed Haseeb Raza, Hanqiu Jiang et al.

CVPR 2025 highlight arXiv:2503.16944

#2138

HyperLoRA: Parameter-Efficient Adaptive Generation for Portrait Synthesis

Mengtian Li, Jinshu Chen, Wanquan Feng et al.

CVPR 2025 poster arXiv:2503.18094

#2139

Anomize: Better Open Vocabulary Video Anomaly Detection

Fei Li, Wenxuan Liu, Jingjing Chen et al.

CVPR 2025 poster arXiv:2412.01826

#2140

RELOCATE: A Simple Training-Free Baseline for Visual Query Localization Using Region-Based Representations

Savya Khosla, Sethuraman T V, Alexander G. Schwing et al.

CVPR 2025 highlight arXiv:2412.06191

#2141

Event Fields: Capturing Light Fields at High Speed, Resolution, and Dynamic Range

Ziyuan Qu, Zihao Zou, Vivek Boominathan et al.

CVPR 2025 poster arXiv:2411.14169

#2142

Spatiotemporal Decoupling for Efficient Vision-Based Occupancy Forecasting

Jingyi Xu, Xieyuanli Chen, Junyi Ma et al.

CVPR 2024 poster arXiv:2407.04260

#2143

Efficient Detection of Long Consistent Cycles and its Application to Distributed Synchronization

Shaohan Li, Yunpeng Shi, Gilad Lerman

CVPR 2025 poster arXiv:2503.16096

#2144

MarkushGrapher: Joint Visual and Textual Recognition of Markush Structures

Lucas Morin, Valery Weber, Ahmed Nassar et al.

#2145

SlowFormer: Adversarial Attack on Compute and Energy Consumption of Efficient Vision Transformers

Navaneet K L, Soroush Abbasi Koohpayegani, Essam Sleiman et al.

CVPR 2025 poster arXiv:2503.18817

#2146

Enhanced OoD Detection through Cross-Modal Alignment of Multi-Modal Representations

Jeonghyeon Kim, Sangheum Hwang

CVPR 2025 poster arXiv:2503.19868

#2147

GENIUS: A Generative Framework for Universal Multimodal Search

Sungyeon Kim, Xinliang Zhu, Xiaofan Lin et al.

CVPR 2025 poster arXiv:2412.07767

#2148

Learning Visual Generative Priors without Text

Shuailei Ma, Kecheng Zheng, Ying Wei et al.

#2149

HyperSeg: Hybrid Segmentation Assistant with Fine-grained Visual Perceiver

Cong Wei, Haoxian Tan, Yujie Zhong et al.

CVPR 2025 highlight arXiv:2412.00782

#2150

Memories of Forgotten Concepts

Matan Rusanovsky, Shimon Malnick, Amir Jevnisek et al.

CVPR 2025 poster arXiv:2412.06968

#2151

SphereUFormer: A U-Shaped Transformer for Spherical 360 Perception

Yaniv Benny, Lior Wolf

CVPR 2025 poster arXiv:2505.02166

#2152

Object-Centric Prompt-Driven Vision-Language-Action Model for Robotic Manipulation

Xiaoqi Li, Lingyun Xu, Mingxu Zhang et al.

CVPR 2024 poster arXiv:2405.14934

#2153

Universal Robustness via Median Randomized Smoothing for Real-World Super-Resolution

Zakariya Chaouai, Mohamed Tamaazousti

CVPR 2025 poster arXiv:2411.19824

#2154

SAT-HMR: Real-Time Multi-Person 3D Mesh Estimation via Scale-Adaptive Tokens

Chi Su, Xiaoxuan Ma, Jiajun Su et al.

CVPR 2025 poster arXiv:2503.02009

#2155

Morpheus: Text-Driven 3D Gaussian Splat Shape and Color Stylization

Jamie Wynn, Zawar Qureshi, Jakub Powierza et al.

CVPR 2025 poster arXiv:2504.09621

#2156

Tokenize Image Patches: Global Context Fusion for Effective Haze Removal in Large Images

Jiuchen Chen, Xinyu Yan, Qizhi Xu et al.

#2157

SnowMaster: Comprehensive Real-world Image Desnowing via MLLM with Multi-Model Feedback Optimization

Jianyu Lai, Sixiang Chen, Yunlong Lin et al.

CVPR 2025 poster arXiv:2503.16997

#2158

Steady Progress Beats Stagnation: Mutual Aid of Foundation and Conventional Models in Mixed Domain Semi-Supervised Medical Image Segmentation

Qinghe Ma, Jian Zhang, Zekun Li et al.

#2159

LidarGait++: Learning Local Features and Size Awareness from LiDAR Point Clouds for 3D Gait Recognition

Chuanfu Shen, Rui Wang, Lixin Duan et al.

CVPR 2025 poster arXiv:2503.10112

#2160

MoEdit: On Learning Quantity Perception for Multi-object Image Editing

Yanfeng Li, Ka-Hou Chan, Yue Sun et al.

#2161

Understanding Multi-Task Activities from Single-Task Videos

Yuhan Shen, Ehsan Elhamifar

CVPR 2025 highlight

CVPR 2024 poster arXiv:2403.08768

#2162

3DFIRES: Few Image 3D REconstruction for Scenes with Hidden Surfaces

Linyi Jin, Nilesh Kulkarni, David Fouhey

CVPR 2025 poster arXiv:2505.16778

#2163

Single Domain Generalization for Few-Shot Counting via Universal Representation Matching

Xianing Chen, Si Huo, Borui Jiang et al.

CVPR 2025 highlight arXiv:2504.10676

#2164

H-MoRe: Learning Human-centric Motion Representation for Action Analysis

Zhanbo Huang, Xiaoming Liu, Yu Kong

CVPR 2025 poster arXiv:2412.07696

#2165

SimVS: Simulating World Inconsistencies for Robust View Synthesis

Alex Trevithick, Roni Paiss, Philipp Henzler et al.

CVPR 2025 poster arXiv:2411.17249

#2166

Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors

Zhengfei Kuang, Tianyuan Zhang, Kai Zhang et al.

CVPR 2024 poster arXiv:2404.19250

#2167

Enhancing Intrinsic Features for Debiasing via Investigating Class-Discerning Common Attributes in Bias-Contrastive Pair

Jeonghoon Park, Chaeyeon Chung, Jaegul Choo

CVPR 2025 poster arXiv:2503.21824

#2168

Protecting Your Video Content: Disrupting Automated Video-based LLM Annotations

Haitong Liu, Kuofeng Gao, Yang Bai et al.

CVPR 2025 poster arXiv:2504.01472

#2169

ANNEXE: Unified Analyzing, Answering, and Pixel Grounding for Egocentric Interaction

Yuejiao Su, Yi Wang, Qiongyang Hu et al.

CVPR 2025 highlight arXiv:2411.15099

#2170

Context-Aware Multimodal Pretraining

Karsten Roth, Zeynep Akata, Dima Damen et al.

CVPR 2024 poster arXiv:2403.19811

#2171

X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization

Anna Kukleva, Fadime Sener, Edoardo Remelli et al.

CVPR 2025 poster arXiv:2411.18711

#2172

Evaluating Vision-Language Models as Evaluators in Path Planning

Mohamed Aghzal, Xiang Yue, Erion Plaku et al.

CVPR 2025 poster arXiv:2506.02781

#2173

FreeScene: Mixed Graph Diffusion for 3D Scene Synthesis from Free Prompts

Tongyuan Bai, Wangyuanfan Bai, Dong Chen et al.

#2174

LatentHOI: On the Generalizable Hand Object Motion Generation with Latent Hand Diffusion

Muchen Li, Sammy Christen, Chengde Wan et al.

CVPR 2025 poster arXiv:2411.13549

#2175

Generating 3D-Consistent Videos from Unposed Internet Photos

Gene Chou, Kai Zhang, Sai Bi et al.

CVPR 2025 poster arXiv:2504.03639

#2176

Shape My Moves: Text-Driven Shape-Aware Synthesis of Human Motions

Ting-Hsuan Liao, Yi Zhou, Yu Shen et al.

CVPR 2025 poster arXiv:2504.01019

#2177

MixerMDM: Learnable Composition of Human Motion Diffusion Models

Pablo Ruiz-Ponce, German Barquero, Cristina Palmero et al.

CVPR 2025 poster arXiv:2503.12507

#2178

Segment Any-Quality Images with Generative Latent Space Enhancement

Guangqian Guo, Yong Guo, Xuehui Yu et al.

CVPR 2025 highlight arXiv:2504.19478

#2179

CASAGPT: Cuboid Arrangement and Scene Assembly for Interior Design

Weitao Feng, Hang Zhou, Jing Liao et al.

CVPR 2024 poster arXiv:2406.04961

#2180

Multiplane Prior Guided Few-Shot Aerial Scene Rendering

Zihan Gao, Licheng Jiao, Lingling Li et al.

#2181

Pose Adapted Shape Learning for Large-Pose Face Reenactment

Gee-Sern Hsu, Jie-Ying Zhang, Yu-Hsiang Huang et al.

CVPR 2024 poster arXiv:2312.10634

#2182

Anomaly Score: Evaluating Generative Models and Individual Generated Images based on Complexity and Vulnerability

Jaehui Hwang, Junghyuk Lee, Jong-Seok Lee

CVPR 2025 poster arXiv:2505.21755

#2183

FRAMES-VQA: Benchmarking Fine-Tuning Robustness across Multi-Modal Shifts in Visual Question Answering

Chengyue Huang, Brisa Maneechotesuwan, Shivang Chopra et al.

CVPR 2024 poster arXiv:2404.00524

#2184

TexVocab: Texture Vocabulary-conditioned Human Avatars

Yuxiao Liu, Zhe Li, Yebin Liu et al.

CVPR 2024 poster arXiv:2403.04368

#2185

Learning to Remove Wrinkled Transparent Film with Polarized Prior

Jiaqi Tang, Ruizheng Wu, Xiaogang Xu et al.

CVPR 2024 poster arXiv:2310.08332

#2186

Real-Time Neural BRDF with Spherically Distributed Primitives

Yishun Dou, Zhong Zheng, Qiaoqiao Jin et al.

CVPR 2025 poster arXiv:2505.11707

#2187

Attend to Not Attended: Structure-then-Detail Token Merging for Post-training DiT Acceleration

Haipeng Fang, Sheng Tang, Juan Cao et al.

CVPR 2025 poster arXiv:2412.11785

#2188

InterDyn: Controllable Interactive Dynamics with Video Diffusion Models

Rick Akkerman, Haiwen Feng, Michael J. Black et al.

CVPR 2025 poster arXiv:2503.15898

#2189

Reconstructing In-the-Wild Open-Vocabulary Human-Object Interactions

Boran Wen, Dingbang Huang, Zichen Zhang et al.

CVPR 2024 poster arXiv:2312.07538

#2190

Anatomically Constrained Implicit Face Models

Prashanth Chandran, Gaspard Zoss

CVPR 2025 highlight arXiv:2403.11295

#2191

Order-One Rolling Shutter Cameras

Marvin Anas Hahn, Kathlén Kohn, Orlando Marigliano et al.

CVPR 2025 poster arXiv:2503.06186

#2192

PTDiffusion: Free Lunch for Generating Optical Illusion Hidden Pictures with Phase-Transferred Diffusion Model

Xiang Gao, Shuai Yang, Jiaying Liu

#2193

POCE: Primal Policy Optimization with Conservative Estimation for Multi-constraint Offline Reinforcement Learning

Jiayi Guan, Li Shen, Ao Zhou et al.

CVPR 2025 poster arXiv:2503.18703

#2194

Channel Consistency Prior and Self-Reconstruction Strategy Based Unsupervised Image Deraining

Guanglu Dong, Tianheng Zheng, Yuanzhouhan Cao et al.

CVPR 2025 poster arXiv:2503.03782

#2195

ReRAW: RGB-to-RAW Image Reconstruction via Stratified Sampling for Efficient Object Detection on the Edge

Radu Berdan, Beril Besbinar, Christoph Reinders et al.

CVPR 2025 highlight arXiv:2503.03307

#2196

Full-DoF Egomotion Estimation for Event Cameras Using Geometric Solvers

Ji Zhao, Banglei Guan, Zibin Liu et al.

CVPR 2025 poster arXiv:2512.23463

#2197

Deterministic Image-to-Image Translation via Denoising Brownian Bridge Models with Dual Approximators

Bohan Xiao, Peiyong Wang, Qisheng He et al.

CVPR 2024 poster arXiv:2404.01941

#2198

LPSNet: End-to-End Human Pose and Shape Estimation with Lensless Imaging

Haoyang Ge, Qiao Feng, Hailong Jia et al.

CVPR 2025 poster arXiv:2501.04815

#2199

Towards Generalizable Trajectory Prediction using Dual-Level Representation Learning and Adaptive Prompting

Kaouther Messaoud, Matthieu Cord, Alex Alahi

#2200

Unity in Diversity: Video Editing via Gradient-Latent Purification

Junyu Gao, Kunlin Yang, Xuan Yao et al.