Most Cited CVPR &quot;temporal graph metrics&quot; Papers

CVPR 2025posterarXiv:2503.18137

#2002

TCFG: Tangential Damping Classifier-free Guidance

Mingi Kwon, Shin seong Kim, Jaeseok Jeong et al.

CVPR 2025posterarXiv:2403.14539

#2003

Robust 3D Shape Reconstruction in Zero-Shot from a Single Image in the Wild

Junhyeong Cho, Kim Youwang, Hunmin Yang et al.

CVPR 2025posterarXiv:2503.10247

#2004

Interpretable Image Classification via Non-parametric Part Prototype Learning

Zhijie Zhu, Lei Fan, Maurice Pagnucco et al.

CVPR 2025posterarXiv:2503.24129

#2005

It’s a (Blind) Match! Towards Vision-Language Correspondence without Parallel Data

Dominik Schnaus, Nikita Araslanov, Daniel Cremers

CVPR 2025posterarXiv:2411.10818

#2006

FlipSketch: Flipping Static Drawings to Text-Guided Sketch Animations

Hmrishav Bandyopadhyay, Yi-Zhe Song

CVPR 2025highlightarXiv:2505.05309

#2007

Augmented Deep Contexts for Spatially Embedded Video Coding

Yifan Bian, Chuanbo Tang, Li Li et al.

CVPR 2025posterarXiv:2501.06184

#2008

PEACE: Empowering Geologic Map Holistic Understanding with MLLMs

Yangyu Huang, Tianyi Gao, Haoran Xu et al.

#2009

Rethinking Spiking Self-Attention Mechanism: Implementing α-XNOR Similarity Calculation in Spiking Transformers

Yichen Xiao, Shuai Wang, Dehao Zhang et al.

CVPR 2024posterarXiv:2403.16258

#2010

Laplacian-guided Entropy Model in Neural Codec with Blur-dissipated Synthesis

Atefeh Khoshkhahtinat, Ali Zafari, Piyush Mehta et al.

CVPR 2025posterarXiv:2502.03629

#2011

RealEdit: Reddit Edits As a Large-scale Empirical Dataset for Image Transformations

Peter Sushko, Ayana Bharadwaj, Zhi Yang Lim et al.

CVPR 2025highlightarXiv:2412.04464

#2012

DualPM: Dual Posed-Canonical Point Maps for 3D Shape and Pose Reconstruction

Ben Kaye, Tomas Jakab, Shangzhe Wu et al.

CVPR 2024highlightarXiv:2312.04529

#2013

Diffusion Reflectance Map: Single-Image Stochastic Inverse Rendering of Illumination and Reflectance

Yuto Enyo, Ko Nishino

CVPR 2025posterarXiv:2505.22859

#2014

4DTAM: Non-Rigid Tracking and Mapping via Dynamic Surface Gaussians

Hidenobu Matsuki, Gwangbin Bae, Andrew J. Davison

CVPR 2024posterarXiv:2403.11380

#2015

Boosting Order-Preserving and Transferability for Neural Architecture Search: a Joint Architecture Refined Search and Fine-tuning Approach

Beichen Zhang, Xiaoxing Wang, Xiaohan Qin et al.

CVPR 2024posterarXiv:2404.00680

#2016

Learning to Rank Patches for Unbiased Image Redundancy Reduction

Yang Luo, Zhineng Chen, Peng Zhou et al.

CVPR 2025posterarXiv:2502.19781

#2017

RANGE: Retrieval Augmented Neural Fields for Multi-Resolution Geo-Embeddings

Aayush Dhakal, Srikumar Sastry, Subash Khanal et al.

CVPR 2025posterarXiv:2504.00996

#2018

TurboFill: Adapting Few-step Text-to-image Model for Fast Image Inpainting

Liangbin Xie, Daniil Pakhomov, Zhonghao Wang et al.

CVPR 2024posterarXiv:2401.07114

#2019

Revisiting Sampson Approximations for Geometric Estimation Problems

Felix Rydell, Angelica Torres, Viktor Larsson

#2020

FluxSpace: Disentangled Semantic Editing in Rectified Flow Models

Yusuf Dalva, Kavana Venkatesh, Pinar Yanardag

CVPR 2025posterarXiv:2412.00071

#2021

COAP: Memory-Efficient Training with Correlation-Aware Gradient Projection

Jinqi Xiao, Shen Sang, Tiancheng Zhi et al.

CVPR 2025posterarXiv:2503.21766

#2022

Stable-SCore: A Stable Registration-based Framework for 3D Shape Correspondence

Haolin Liu, Xiaohang Zhan, Zizheng Yan et al.

CVPR 2024posterarXiv:2404.15263

#2023

Multi-Session SLAM with Differentiable Wide-Baseline Pose Optimization

Lahav Lipson, Jia Deng

CVPR 2025posterarXiv:2503.15185

#2024

3D Occupancy Prediction with Low-Resolution Queries via Prototype-aware View Transformation

Gyeongrok Oh, Sung June Kim, Heeju Ko et al.

CVPR 2025highlightarXiv:2503.13985

#2025

DefectFill: Realistic Defect Generation with Inpainting Diffusion Model for Visual Inspection

Jaewoo Song, Daemin Park, Kanghyun Baek et al.

CVPR 2025posterarXiv:2506.08210

#2026

A Comprehensive Study of Decoder-Only LLMs for Text-to-Image Generation

Andrew Z Wang, Songwei Ge, Tero Karras et al.

CVPR 2025posterarXiv:2501.11043

#2027

BF-STVSR: B-Splines and Fourier---Best Friends for High Fidelity Spatial-Temporal Video Super-Resolution

Eunjin Kim, HYEONJIN KIM, Kyong Hwan Jin et al.

CVPR 2025posterarXiv:2503.20172

#2028

Guiding Human-Object Interactions with Rich Geometry and Relations

Mengqing Xue, Yifei Liu, Ling Guo et al.

CVPR 2025highlightarXiv:2501.11319

#2029

StyleSSP: Sampling StartPoint Enhancement for Training-free Diffusion-based Method for Style Transfer

ruojun xu, Weijie Xi, Xiaodi Wang et al.

CVPR 2024posterarXiv:2309.05073

#2030

FreeMan: Towards Benchmarking 3D Human Pose Estimation under Real-World Conditions

Jiong WANG, Fengyu Yang, Bingliang Li et al.

#2031

Generative Sparse-View Gaussian Splatting

Hanyang Kong, Xingyi Yang, Xinchao Wang

CVPR 2025posterarXiv:2503.22201

#2032

Multi-modal Knowledge Distillation-based Human Trajectory Forecasting

Jaewoo Jeong, Seohee Lee, Daehee Park et al.

CVPR 2025posterarXiv:2507.17083

#2033

SDGOCC: Semantic and Depth-Guided Bird's-Eye View Transformation for 3D Multimodal Occupancy Prediction

ZaiPeng Duan, Xuzhong Hu, Pei An et al.

#2034

AlphaPre: Amplitude-Phase Disentanglement Model for Precipitation Nowcasting

Kenghong Lin, Baoquan Zhang, Demin Yu et al.

CVPR 2024posterarXiv:2403.19904

#2035

Fully Geometric Panoramic Localization

Junho Kim, Jiwon Jeong, Young Min Kim

CVPR 2025posterarXiv:2503.19913

#2036

PartRM: Modeling Part-Level Dynamics with Large Cross-State Reconstruction Model

Mingju Gao, Yike Pan, Huan-ang Gao et al.

CVPR 2025highlightarXiv:2502.07814

#2037

Satellite Observations Guided Diffusion Model for Accurate Meteorological States at Arbitrary Resolution

Siwei Tu, Ben Fei, Weidong Yang et al.

CVPR 2025posterarXiv:2503.13063

#2038

Federated Learning with Domain Shift Eraser

Zheng Wang, Zihui Wang, Zheng Wang et al.

#2039

D^2iT: Dynamic Diffusion Transformer for Accurate Image Generation

Weinan Jia, Mengqi Huang, Nan Chen et al.

CVPR 2025posterarXiv:2407.13772

#2040

GroupMamba: Efficient Group-Based Visual State Space Model

Abdelrahman Shaker, Syed Talal Wasim, Salman Khan et al.

CVPR 2025posterarXiv:2504.04744

#2041

Grounding 3D Object Affordance with Language Instructions, Visual Observations and Interactions

He Zhu, Quyu Kong, Kechun Xu et al.

CVPR 2025highlightarXiv:2504.12284

#2042

How Do I Do That? Synthesizing 3D Hand Motion and Contacts for Everyday Interactions

Aditya Prakash, Benjamin E Lundell, Dmitry Andreychuk et al.

CVPR 2024posterarXiv:2403.18469

#2043

Density-guided Translator Boosts Synthetic-to-Real Unsupervised Domain Adaptive Segmentation of 3D Point Clouds

Zhimin Yuan, Wankang Zeng, Yanfei Su et al.

CVPR 2025posterarXiv:2410.15980

#2044

Learning from Neighbors: Category Extrapolation for Long-Tail Learning

Shizhen Zhao, Xin Wen, Jiahui Liu et al.

CVPR 2025posterarXiv:2505.22079

#2045

Bringing CLIP to the Clinic: Dynamic Soft Labels and Negation-Aware Learning for Medical Analysis

Hanbin Ko, Chang Min Park

#2046

Audio-Visual Semantic Graph Network for Audio-Visual Event Localization

Liang Liu, Shuaiyong Li, Yongqiang Zhu

#2047

Detect Any Mirrors: Boosting Learning Reliability on Large-Scale Unlabeled Data with an Iterative Data Engine

Zhaohu Xing, Lihao Liu, Yijun Yang et al.

#2048

AniMo: Species-Aware Model for Text-Driven Animal Motion Generation

Xuan Wang, Kai Ruan, Xing Zhang et al.

CVPR 2025posterarXiv:2412.07696

#2049

SimVS: Simulating World Inconsistencies for Robust View Synthesis

Alex Trevithick, Roni Paiss, Philipp Henzler et al.

CVPR 2024posterarXiv:2404.00252

#2050

Learned Scanpaths Aid Blind Panoramic Video Quality Assessment

Kanglong FAN, Wen Wen, Mu Li et al.

CVPR 2025highlightarXiv:2505.21943

#2051

Point-to-Region Loss for Semi-Supervised Point-Based Crowd Counting

Wei Lin, Chenyang ZHAO, Antoni B. Chan

CVPR 2024posterarXiv:2403.10335

#2052

NECA: Neural Customizable Human Avatar

Junjin Xiao, Qing Zhang, Zhan Xu et al.

CVPR 2025posterarXiv:2503.09243

#2053

GarmentPile: Point-Level Visual Affordance Guided Retrieval and Adaptation for Cluttered Garments Manipulation

Ruihai Wu, Ziyu Zhu, Yuran Wang et al.

CVPR 2024posterarXiv:2404.19384

#2054

Pseudo Label Refinery for Unsupervised Domain Adaptation on Cross-dataset 3D Object Detection

Zhanwei Zhang, Minghao Chen, Shuai Xiao et al.

CVPR 2025posterarXiv:2504.03639

#2055

Shape My Moves: Text-Driven Shape-Aware Synthesis of Human Motions

Ting-Hsuan Liao, Yi Zhou, Yu Shen et al.

#2056

Zero-shot 3D Question Answering via Voxel-based Dynamic Token Compression

Hsiang-Wei Huang, Fu-Chen Chen, Wenhao Chai et al.

CVPR 2024posterarXiv:2311.18695

#2057

Seg2Reg: Differentiable 2D Segmentation to 1D Regression Rendering for 360 Room Layout Reconstruction

Cheng Sun, Wei-En Tai, Yu-Lin Shih et al.

#2058

EdgeDiff: Edge-aware Diffusion Network for Building Reconstruction from Point Clouds

Yujun Liu, Ruisheng Wang, Shangfeng Huang et al.

#2059

Boosting Adversarial Transferability through Augmentation in Hypothesis Space

Yu Guo, Weiquan Liu, Qingshan Xu et al.

#2060

HeMoRa: Unsupervised Heuristic Consensus Sampling for Robust Point Cloud Registration

Shaocheng Yan, Yiming Wang, Kaiyan Zhao et al.

CVPR 2024posterarXiv:2211.14309

#2061

FutureHuman3D: Forecasting Complex Long-Term 3D Human Behavior from Video Observations

Christian Diller, Thomas Funkhouser, Angela Dai

CVPR 2025posterarXiv:2412.16915

#2062

FADA: Fast Diffusion Avatar Synthesis with Mixed-Supervised Multi-CFG Distillation

Tianyun Zhong, Chao Liang, Jianwen Jiang et al.

CVPR 2025posterarXiv:2503.16572

#2063

Efficient ANN-Guided Distillation: Aligning Rate-based Features of Spiking Neural Networks through Hybrid Block-wise Replacement

Shu Yang, Chengting Yu, Lei Liu et al.

CVPR 2025posterarXiv:2504.06827

#2064

IAAO: Interactive Affordance Learning for Articulated Objects in 3D Environments

Can Zhang, Gim Hee Lee

CVPR 2025posterarXiv:2503.20824

#2065

Exploiting Temporal State Space Sharing for Video Semantic Segmentation

Hesham Syed, Yun Liu, Guolei Sun et al.

CVPR 2025highlightarXiv:2411.15678

#2066

Towards RAW Object Detection in Diverse Conditions

Zhong-Yu Li, Xin Jin, Bo-Yuan Sun et al.

CVPR 2025posterarXiv:2412.16645

#2067

Complementary Advantages: Exploiting Cross-Field Frequency Correlation for NIR-Assisted Image Denoising

Yuchen Wang, Hongyuan Wang, Lizhi Wang et al.

CVPR 2024posterarXiv:2405.07481

#2068

Text Grouping Adapter: Adapting Pre-trained Text Detector for Layout Analysis

Tianci Bi, Xiaoyi Zhang, Zhizheng Zhang et al.

CVPR 2025posterarXiv:2502.19894

#2069

High-Fidelity Relightable Monocular Portrait Animation with Lighting-Controllable Video Diffusion Model

Mingtao Guo, Guanyu Xing, Yanli Liu

CVPR 2025posterarXiv:2411.19756

#2070

DeSplat: Decomposed Gaussian Splatting for Distractor-Free Rendering

Yihao Wang, Marcus Klasson, Matias Turkulainen et al.

CVPR 2025posterarXiv:2501.01589

#2071

D^3-Human: Dynamic Disentangled Digital Human from Monocular Video

Honghu Chen, Bo Peng, Yunfan Tao et al.

CVPR 2024highlightarXiv:2405.02581

#2072

Stationary Representations: Optimally Approximating Compatibility and Implications for Improved Model Replacements

Niccolò Biondi, Federico Pernici, Simone Ricci et al.

CVPR 2025posterarXiv:2504.01515

#2073

Training-free Dense-Aligned Diffusion Guidance for Modular Conditional Image Synthesis

Zixuan Wang, DUO PENG, Feng Chen et al.

CVPR 2024posterarXiv:2308.10638

#2074

SCULPT: Shape-Conditioned Unpaired Learning of Pose-dependent Clothed and Textured Human Meshes

Soubhik Sanyal, Partha Ghosh, Jinlong Yang et al.

CVPR 2024posterarXiv:2404.02585

#2075

Unsegment Anything by Simulating Deformation

Jiahao Lu, Xingyi Yang, Xinchao Wang

#2076

ShapeWalk: Compositional Shape Editing Through Language-Guided Chains

Habib Slim, Mohamed Elhoseiny

CVPR 2025posterarXiv:2411.12951

#2077

On the Consistency of Video Large Language Models in Temporal Comprehension

Minjoon Jung, Junbin Xiao, Byoung-Tak Zhang et al.

CVPR 2025highlightarXiv:2504.05046

#2078

MotionPRO: Exploring the Role of Pressure in Human MoCap and Beyond

Shenghao Ren, Yi Lu, Jiayi Huang et al.

#2079

A Polarization-Aided Transformer for Image Deblurring via Motion Vector Decomposition

Duosheng Chen, Shihao Zhou, Jinshan Pan et al.

CVPR 2025highlight

CVPR 2025highlightarXiv:2504.05838

#2080

Mind the Trojan Horse: Image Prompt Adapter Enabling Scalable and Deceptive Jailbreaking

Junxi Chen, Junhao Dong, Xiaohua Xie

CVPR 2025posterarXiv:2411.16799

#2081

One is Plenty: A Polymorphic Feature Interpreter for Immutable Heterogeneous Collaborative Perception

Yuchen Xia, Quan Yuan, Guiyang Luo et al.

CVPR 2025posterarXiv:2503.12745

#2082

ProtoDepth: Unsupervised Continual Depth Completion with Prototypes

Patrick Rim, Hyoungseob Park, Suchisrit Gangopadhyay et al.

CVPR 2024posterarXiv:2311.14097

#2083

ACT-Diffusion: Efficient Adversarial Consistency Training for One-step Diffusion Models

Fei Kong, Jinhao Duan, Lichao Sun et al.

CVPR 2025posterarXiv:2503.00591

#2084

AesthetiQ: Enhancing Graphic Layout Design via Aesthetic-Aware Preference Alignment of Multi-modal Large Language Models

Sohan Patnaik, Rishabh Jain, Balaji Krishnamurthy et al.

CVPR 2025posterarXiv:2504.10857

#2085

ZeroGrasp: Zero-Shot Shape Reconstruction Enabled Robotic Grasping

Shun Iwase, Muhammad Zubair Irshad, Katherine Liu et al.

CVPR 2024posterarXiv:2403.16937

#2086

Hyperspherical Classification with Dynamic Label-to-Prototype Assignment

Mohammad Saadabadi Saadabadi, Ali Dabouei, Sahar Rahimi Malakshan et al.

CVPR 2025highlightarXiv:2412.20651

#2087

Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis

Yousef Yeganeh, Ioannis Charisiadis, Marta Hasny et al.

CVPR 2024posterarXiv:2403.06846

#2088

DiaLoc: An Iterative Approach to Embodied Dialog Localization

Chao Zhang, Mohan Li, Ignas Budvytis et al.

CVPR 2025posterarXiv:2502.06029

#2089

DiTASK: Multi-Task Fine-Tuning with Diffeomorphic Transformations

Krishna Sri Ipsit Mantri, Carola-Bibiane Schönlieb, Bruno Ribeiro et al.

CVPR 2025posterarXiv:2505.07539

#2090

GIFStream: 4D Gaussian-based Immersive Video with Feature Stream

Hao Li, Sicheng Li, Xiang Gao et al.

#2091

Intensity-Robust Autofocus for Spike Camera

Changqing Su, Zhiyuan Ye, Yongsheng Xiao et al.

CVPR 2025posterarXiv:2505.11800

#2092

Self-Learning Hyperspectral and Multispectral Image Fusion via Adaptive Residual Guided Subspace Diffusion Model

Jian Zhu, He Wang, Yang Xu et al.

CVPR 2024posterarXiv:2403.17761

#2093

Makeup Prior Models for 3D Facial Makeup Estimation and Applications

Xingchao Yang, Takafumi Taketomi, Yuki Endo et al.

CVPR 2025posterarXiv:2504.07095

#2094

Neural Motion Simulator Pushing the Limit of World Models in Reinforcement Learning

Chenjie Hao, Weyl Lu, Yifan Xu et al.

CVPR 2025posterarXiv:2406.16321

#2095

Mosaic of Modalities: A Comprehensive Benchmark for Multimodal Graph Learning

Jing Zhu, Yuhang Zhou, Shengyi Qian et al.

CVPR 2025posterarXiv:2506.07865

#2096

FreeGave: 3D Physics Learning from Dynamic Videos by Gaussian Velocity

Jinxi Li, Ziyang Song, Siyuan Zhou et al.

CVPR 2025posterarXiv:2405.16414

#2097

Robust Message Embedding via Attention Flow-Based Steganography

Huayuan Ye, Shenzhuo Zhang, Shiqi Jiang et al.

CVPR 2025highlightarXiv:2410.10604

#2098

Multi-modal Vision Pre-training for Medical Image Analysis

Shaohao Rui, Lingzhi Chen, Zhenyu Tang et al.

CVPR 2025highlightarXiv:2412.03937

#2099

AIpparel: A Multimodal Foundation Model for Digital Garments

Kiyohiro Nakayama, Jan Ackermann, Timur Levent Kesdogan et al.

CVPR 2025posterarXiv:2503.20418

#2100

ITA-MDT: Image-Timestep-Adaptive Masked Diffusion Transformer Framework for Image-Based Virtual Try-On

Ji Woo Hong, Tri Ton, Trung X. Pham et al.

#2101

UHD-processer: Unified UHD Image Restoration with Progressive Frequency Learning and Degradation-aware Prompts

Yidi Liu, Dong Li, Xueyang Fu et al.

CVPR 2025posterarXiv:2503.00063

#2102

NoPain: No-box Point Cloud Attack via Optimal Transport Singular Boundary

Zezeng Li, Xiaoyu Du, Na Lei et al.

#2103

Revisiting Source-Free Domain Adaptation: Insights into Representativeness, Generalization, and Variety

Ronghang Zhu, Mengxuan Hu, Weiming Zhuang et al.

CVPR 2025posterarXiv:2505.02148

#2104

Spotting the Unexpected (STU): A 3D LiDAR Dataset for Anomaly Segmentation in Autonomous Driving

Alexey Nekrasov, Malcolm Burdorf, Stewart Worrall et al.

CVPR 2025posterarXiv:2411.16173

#2105

SALOVA: Segment-Augmented Long Video Assistant for Targeted Retrieval and Routing in Long-Form Video Analysis

Junho Kim, Hyunjun Kim, Hosu Lee et al.

CVPR 2025highlightarXiv:2412.19637

#2106

ReNeg: Learning Negative Embedding with Reward Guidance

Xiaomin Li, yixuan liu, Takashi Isobe et al.

CVPR 2025posterarXiv:2504.17813

#2107

CLOC: Contrastive Learning for Ordinal Classification with Multi-Margin N-pair Loss

Dileepa Pitawela, Gustavo Carneiro, Hsiang-Ting Chen

CVPR 2024posterarXiv:2401.10219

#2108

Edit One for All: Interactive Batch Image Editing

Thao Nguyen, Utkarsh Ojha, Yuheng Li et al.

CVPR 2025posterarXiv:2412.01537

#2109

HandOS: 3D Hand Reconstruction in One Stage

Xingyu Chen, Zhuheng Song, Xiaoke Jiang et al.

CVPR 2025posterarXiv:2503.21659

#2110

InteractionMap: Improving Online Vectorized HDMap Construction with Interaction

Kuang Wu, Chuan Yang, Zhanbin Li

CVPR 2024posterarXiv:2403.11812

#2111

Aerial Lifting: Neural Urban Semantic and Building Instance Lifting from Aerial Imagery

Yuqi Zhang, Guanying Chen, Jiaxing Chen et al.

CVPR 2025posterarXiv:2408.15503

#2112

RoboSense: Large-scale Dataset and Benchmark for Egocentric Robot Perception and Navigation in Crowded and Unstructured Environments

Haisheng Su, Feixiang Song, CONG MA et al.

#2113

Efficient Hyperparameter Optimization with Adaptive Fidelity Identification

Jiantong Jiang, Zeyi Wen, Atif Mansoor et al.

CVPR 2025posterarXiv:2503.12077

#2114

V-Stylist: Video Stylization via Collaboration and Reflection of MLLM Agents

Zhengrong Yue, Shaobin Zhuang, Kunchang Li et al.

#2115

Binarized Neural Network for Multi-spectral Image Fusion

Junming Hou, Xiaoyu Chen, Ran Ran et al.

#2116

Enhancing Testing-Time Robustness for Trusted Multi-View Classification in the Wild

Wei Liu, Yufei Chen, Xiaodong Yue

#2117

Pos3R: 6D Pose Estimation for Unseen Objects Made Easy

Weijian Deng, Dylan Campbell, Chunyi Sun et al.

CVPR 2025posterarXiv:2412.01792

#2118

CTRL-D: Controllable Dynamic 3D Scene Editing with Personalized 2D Diffusion

Kai He, Chin-Hsuan Wu, Igor Gilitschenski

CVPR 2025posterarXiv:2503.04565

#2119

Omnidirectional Multi-Object Tracking

Kai Luo, Hao Shi, Sheng Wu et al.

CVPR 2024posterarXiv:2403.20031

#2120

A Unified Framework for Human-centric Point Cloud Video Understanding

Yiteng Xu, Kecheng Ye, xiao han et al.

CVPR 2025posterarXiv:2506.10966

#2121

GENMANIP: LLM-driven Simulation for Generalizable Instruction-Following Manipulation

Ning Gao, Yilun Chen, Shuai Yang et al.

CVPR 2025posterarXiv:2412.10084

#2122

ProbeSDF: Light Field Probes For Neural Surface Reconstruction

Briac Toussaint, Diego Thomas, Jean-Sébastien Franco

CVPR 2025posterarXiv:2501.08303

#2123

Advancing Semantic Future Prediction through Multimodal Visual Sequence Transformers

Efstathios Karypidis, Ioannis Kakogeorgiou, Spyros Gidaris et al.

#2124

Dynamic Stereotype Theory Induced Micro-expression Recognition with Oriented Deformation

Bohao Zhang, Xuejiao Wang, Changbo Wang et al.

CVPR 2024posterarXiv:2311.16304

#2125

Robust Self-calibration of Focal Lengths from the Fundamental Matrix

Viktor Kocur, Daniel Kyselica, Zuzana Kukelova

CVPR 2025posterarXiv:2503.19191

#2126

FDS: Frequency-Aware Denoising Score for Text-Guided Latent Diffusion Image Editing

Yufan Ren, Zicong Jiang, Tong Zhang et al.

CVPR 2025posterarXiv:2503.02101

#2127

Generalized Diffusion Detector: Mining Robust Features from Diffusion Models for Domain-Generalized Detection

Boyong He, Yuxiang Ji, Qianwen Ye et al.

CVPR 2025posterarXiv:2503.18695

#2128

OCRT: Boosting Foundation Models in the Open World with Object-Concept-Relation Triad

Luyao Tang, Chaoqi Chen, Yuxuan Yuan et al.

CVPR 2024highlightarXiv:2401.13296

#2129

Visual Objectification in Films: Towards a New AI Task for Video Interpretation

Julie Tores, Lucile Sassatelli, Hui-Yin Wu et al.

CVPR 2025posterarXiv:2412.07767

#2130

Learning Visual Generative Priors without Text

Shuailei Ma, Kecheng Zheng, Ying Wei et al.

CVPR 2025highlightarXiv:2412.06767

#2131

MAtCha Gaussians: Atlas of Charts for High-Quality Geometry and Photorealism From Sparse Views

Antoine Guédon, Tomoki Ichikawa, Kohei Yamashita et al.

#2132

HUSH: Holistic Panoramic 3D Scene Understanding using Spherical Harmonics

Jongsung Lee, HARIN PARK, Byeong-Uk Lee et al.

CVPR 2025posterarXiv:2502.19739

#2133

LUCAS: Layered Universal Codec Avatars

Di Liu, Teng Deng, Giljoo Nam et al.

CVPR 2025posterarXiv:2503.15404

#2134

Improving Adversarial Transferability on Vision Transformers via Forward Propagation Refinement

Yuchen Ren, Zhengyu Zhao, Chenhao Lin et al.

#2135

Detection-Friendly Nonuniformity Correction: A Union Framework for Infrared UAV Target Detection

Houzhang Fang, Xiaolin Wang, Zengyang Li et al.

CVPR 2025highlight

CVPR 2025posterarXiv:2503.18359

#2136

Context-Enhanced Memory-Refined Transformer for Online Action Detection

Zhanzhong Pang, Fadime Sener, Angela Yao

#2137

Projecting Trackable Thermal Patterns for Dynamic Computer Vision

Mark Sheinin, Aswin C. Sankaranarayanan, Srinivasa G. Narasimhan

CVPR 2025highlightarXiv:2503.09962

#2138

Modeling Thousands of Human Annotators for Generalizable Text-to-Image Person Re-identification

Jiayu Jiang, Changxing Ding, Wentao Tan et al.

CVPR 2025posterarXiv:2503.12042

#2139

Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie Dubbing

Zhedong Zhang, Liang Li, Chenggang Yan et al.

#2140

On the Zero-shot Adversarial Robustness of Vision-Language Models: A Truly Zero-shot and Training-free Approach

Baoshun Tong, Hanjiang Lai, Yan Pan et al.

CVPR 2025posterarXiv:2412.18565

#2141

3DEnhancer: Consistent Multi-View Diffusion for 3D Enhancement

Yihang Luo, Shangchen Zhou, Yushi Lan et al.

CVPR 2025posterarXiv:2412.01244

#2142

Concept Replacer: Replacing Sensitive Concepts in Diffusion Models via Precision Localization

lingyun zhang, Yu Xie, Yanwei Fu et al.

#2143

Understanding Fine-tuning CLIP for Open-vocabulary Semantic Segmentation in Hyperbolic Space

Zelin Peng, Zhengqin Xu, Zhilin Zeng et al.

CVPR 2025posterarXiv:2504.04156

#2144

CoMBO: Conflict Mitigation via Branched Optimization for Class Incremental Segmentation

Kai Fang, Anqi Zhang, Guangyu Gao et al.

CVPR 2025highlightarXiv:2503.04459

#2145

Question-Aware Gaussian Experts for Audio-Visual Question Answering

Hongyeob Kim, Inyoung Jung, Dayoon Suh et al.

CVPR 2025posterarXiv:2502.19754

#2146

Finding Local Diffusion Schrödinger Bridge using Kolmogorov-Arnold Network

Xingyu Qiu, Mengying Yang, Xinghua Ma et al.

CVPR 2025posterarXiv:2503.01309

#2147

OnlineAnySeg: Online Zero-Shot 3D Segmentation by Visual Foundation Model Guided 2D Mask Merging

Yijie Tang, Jiazhao Zhang, Yuqing Lan et al.

CVPR 2024posterarXiv:2401.14349

#2148

Learning to Navigate Efficiently and Precisely in Real Environments

Guillaume Bono, Hervé Poirier, Leonid Antsfeld et al.

CVPR 2025posterarXiv:2506.18557

#2149

Object-aware Sound Source Localization via Audio-Visual Scene Understanding

Sung Jin Um, Dongjin Kim, Sangmin Lee et al.

CVPR 2025posterarXiv:2406.13378

#2150

PanDA: Towards Panoramic Depth Anything with Unlabeled Panoramas and Mobius Spatial Augmentation

Zidong Cao, Jinjing Zhu, Weiming Zhang et al.

CVPR 2025posterarXiv:2408.07790

#2151

Cropper: Vision-Language Model for Image Cropping through In-Context Learning

Seung Hyun Lee, Jijun jiang, Yiran Xu et al.

CVPR 2025posterarXiv:2503.22168

#2152

Spatial Transport Optimization by Repositioning Attention Map for Training-Free Text-to-Image Synthesis

Woojung Han, Yeonkyung Lee, Chanyoung Kim et al.

CVPR 2025posterarXiv:2503.19377

#2153

Interpretable Generative Models through Post-hoc Concept Bottlenecks

Akshay R. Kulkarni, Ge Yan, Chung-En Sun et al.

CVPR 2025posterarXiv:2503.01653

#2154

Distilled Prompt Learning for Incomplete Multimodal Survival Prediction

Yingxue Xu, Fengtao ZHOU, Chenyu Zhao et al.

CVPR 2025posterarXiv:2504.08449

#2155

Ego4o: Egocentric Human Motion Capture and Understanding from Multi-Modal Input

Jian Wang, Rishabh Dabral, Diogo Luvizon et al.

CVPR 2025posterarXiv:2411.15231

#2156

IterIS: Iterative Inference-Solving Alignment for LoRA Merging

Hongxu chen, Zhen Wang, Runshi Li et al.

CVPR 2024posterarXiv:2406.18540

#2157

Fully Exploiting Every Real Sample: SuperPixel Sample Gradient Model Stealing

Yunlong Zhao, Xiaoheng Deng, Yijing Liu et al.

CVPR 2025posterarXiv:2411.18654

#2158

AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward

Haonan Han, Xiangzuo Wu, Huan Liao et al.

CVPR 2025posterarXiv:2412.02993

#2159

EchoONE: Segmenting Multiple Echocardiography Planes in One Model

Jiongtong Hu, Wei Zhuo, Jun Cheng et al.

CVPR 2025posterarXiv:2503.17080

#2160

Seeing What Matters: Empowering CLIP with Patch Generation-to-Selection

Gensheng Pei, Tao Chen, Yujia Wang et al.

CVPR 2025posterarXiv:2503.24382

#2161

Free360: Layered Gaussian Splatting for Unbounded 360-Degree View Synthesis from Extremely Sparse and Unposed Views

Chong Bao, Xiyu Zhang, Zehao Yu et al.

CVPR 2025posterarXiv:2502.20208

#2162

4Deform: Neural Surface Deformation for Robust Shape Interpolation

Lu Sang, Zehranaz Canfes, Dongliang Cao et al.

CVPR 2025posterarXiv:2503.05484

#2163

DecoupledGaussian: Object-Scene Decoupling for Physics-Based Interaction

Miaowei Wang, Yibo Zhang, Rui Ma et al.

CVPR 2024posterarXiv:2405.14934

#2164

Universal Robustness via Median Randomized Smoothing for Real-World Super-Resolution

Zakariya Chaouai, Mohamed Tamaazousti

CVPR 2025posterarXiv:2505.01428

#2165

Multi-party Collaborative Attention Control for Image Customization

Han Yang, Chuanguang Yang, Qiuli Wang et al.

CVPR 2025posterarXiv:2310.11439

#2166

From Alexnet to Transformers: Measuring the Non-linearity of Deep Neural Networks with Affine Optimal Transport

Quentin Bouniot, Ievgen Redko, Anton Mallasto et al.

CVPR 2025highlightarXiv:2503.18682

#2167

Hardware-Rasterized Ray-Based Gaussian Splatting

Samuel Rota Bulò, Lorenzo Porzi, Nemanja Bartolovic et al.

#2168

Flow-Guided Online Stereo Rectification for Wide Baseline Stereo

Anush Kumar, Fahim Mannan, Omid Hosseini Jafari et al.

CVPR 2025posterarXiv:2404.14414

#2169

Removing Reflections from RAW Photos

Eric Kee, Adam Pikielny, Kevin Blackburn-Matzen et al.

CVPR 2025posterarXiv:2411.15432

#2170

Lifelong Knowledge Editing for Vision Language Models with Low-Rank Mixture-of-Experts

Qizhou Chen, Chengyu Wang, Dakan Wang et al.

#2171

Learning Heterogeneous Tissues with Mixture of Experts for Gigapixel Whole Slide Images

Junxian Wu, Minheng Chen, Xinyi Ke et al.

CVPR 2025posterarXiv:2310.14356

#2172

Semantic and Expressive Variations in Image Captions Across Languages

Andre Ye, Sebastin Santy, Jena D. Hwang et al.

CVPR 2025highlightarXiv:2412.11441

#2173

UIBDiffusion: Universal Imperceptible Backdoor Attack for Diffusion Models

Yuning Han, Bingyin Zhao, Rui Chu et al.

CVPR 2025posterarXiv:2504.02244

#2174

SocialGesture: Delving into Multi-person Gesture Understanding

Xu Cao, Pranav Virupaksha, Wenqi Jia et al.

#2175

TFCustom: Customized Image Generation with Time-Aware Frequency Feature Guidance

Mushui Liu, Dong She, Qihan Huang et al.

CVPR 2025highlight

CVPR 2025posterarXiv:2412.01316

#2176

Long Video Diffusion Generation with Segmented Cross-Attention and Content-Rich Video Data Curation

Xin Yan, Yuxuan Cai, Qiuyue Wang et al.

CVPR 2024posterarXiv:2405.19646

#2177

FaceLift: Semi-supervised 3D Facial Landmark Localization

David Ferman, Pablo Garrido, Gaurav Bharaj

CVPR 2025posterarXiv:2412.15050

#2178

Uni-Renderer: Unifying Rendering and Inverse Rendering Via Dual Stream Diffusion

ZhiFei Chen, Tianshuo Xu, Wenhang Ge et al.

CVPR 2025posterarXiv:2411.15553

#2179

Improving Transferable Targeted Attacks with Feature Tuning Mixup

Kaisheng Liang, Xuelong Dai, Yanjie Li et al.

CVPR 2024posterarXiv:2403.07560

#2180

Unleashing Network Potentials for Semantic Scene Completion

Fengyun Wang, Qianru Sun, Dong Zhang et al.

CVPR 2025posterarXiv:2412.19712

#2181

From Elements to Design: A Layered Approach for Automatic Graphic Design Composition

Jiawei Lin, Shizhao Sun, Danqing Huang et al.

CVPR 2025posterarXiv:2503.13957

#2182

DiffVsgg: Diffusion-Driven Online Video Scene Graph Generation

Mu Chen, Liulei Li, Wenguan Wang et al.

#2183

Open-World Objectness Modeling Unifies Novel Object Detection

Shan Zhang, Yao Ni, Jinhao Du et al.

CVPR 2024posterarXiv:2404.05675

#2184

Normalizing Flows on the Product Space of SO(3) Manifolds for Probabilistic Human Pose Modeling

Olaf Dünkel, Tim Salzmann, Florian Pfaff

CVPR 2024posterarXiv:2404.12322

#2185

Generalizable Face Landmarking Guided by Conditional Face Warping

Jiayi Liang, Haotian Liu, Hongteng Xu et al.

CVPR 2025posterarXiv:2505.04109

#2186

One2Any: One-Reference 6D Pose Estimation for Any Object

Mengya Liu, Siyuan Li, Ajad Chhatkuli et al.

#2187

JTD-UAV: MLLM-Enhanced Joint Tracking and Description Framework for Anti-UAV Systems

Yifan Wang, Jian Zhao, Zhaoxin Fan et al.

CVPR 2025posterarXiv:2412.16158

#2188

HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding

Chenxin Tao, Shiqian Su, Xizhou Zhu et al.

CVPR 2025posterarXiv:2412.05278

#2189

Birth and Death of a Rose

Chen Geng, Yunzhi Zhang, Shangzhe Wu et al.

CVPR 2024posterarXiv:2404.02242

#2190

Towards Robust 3D Pose Transfer with Adversarial Learning

Haoyu Chen, Hao Tang, Ehsan Adeli et al.

CVPR 2024posterarXiv:2312.02136

#2191

BerfScene: Bev-conditioned Equivariant Radiance Fields for Infinite 3D Scene Generation

Qihang Zhang, Yinghao Xu, Yujun Shen et al.

CVPR 2025highlightarXiv:2502.20126

#2192

FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute

Sotiris Anagnostidis, Gregor Bachmann, Yeongmin Kim et al.

#2193

Adversarial Domain Prompt Tuning and Generation for Single Domain Generalization

Zhipeng Xu, De Cheng, XINYANG JIANG et al.

CVPR 2025posterarXiv:2504.18856

#2194

Multi-Resolution Pathology-Language Pre-training Model with Text-Guided Visual Representation

Shahad Albastaki, Anabia Sohail, IYYAKUTTI IYAPPAN GANAPATHI et al.

CVPR 2024posterarXiv:2403.17638

#2195

Learning with Unreliability: Fast Few-shot Voxel Radiance Fields with Relative Geometric Consistency

Xu Yingjie, Bangzhen Liu, Hao Tang et al.

#2196

NightAdapter: Learning a Frequency Adapter for Generalizable Night-time Scene Segmentation

Qi Bi, Jingjun Yi, Huimin Huang et al.

CVPR 2024posterarXiv:2310.12153

#2197

Probabilistic Sampling of Balanced K-Means using Adiabatic Quantum Computing

Jan-Nico Zaech, Martin Danelljan, Tolga Birdal et al.

CVPR 2025posterarXiv:2411.09998

#2198

Adaptive Non-Uniform Timestep Sampling for Accelerating Diffusion Model Training

Myunsoo Kim, Donghyeon Ki, Seong-Woong Shim et al.

CVPR 2024posterarXiv:2308.12831

#2199

EFormer: Enhanced Transformer towards Semantic-Contour Features of Foreground for Portraits Matting

Zitao Wang, Qiguang Miao, Yue Xi et al.

CVPR 2025posterarXiv:2503.18150

#2200

LongDiff: Training-Free Long Video Generation in One Go

Zhuoling Li, Hossein Rahmani, Qiuhong Ke et al.