Most Cited CVPR "temporal graph metrics" Papers

5,589 papers found • Page 11 of 28

#2001

TSAM: Temporal SAM Augmented with Multimodal Prompts for Referring Audio-Visual Segmentation

Abduljalil Radman, Jorma Laaksonen

CVPR 2025 • poster
6 citations
#2002

TCFG: Tangential Damping Classifier-free Guidance

Mingi Kwon, Shin seong Kim, Jaeseok Jeong et al.

CVPR 2025 • poster • arXiv:2503.18137
6 citations
#2003

Robust 3D Shape Reconstruction in Zero-Shot from a Single Image in the Wild

Junhyeong Cho, Kim Youwang, Hunmin Yang et al.

CVPR 2025 • poster • arXiv:2403.14539
6 citations
#2004

Interpretable Image Classification via Non-parametric Part Prototype Learning

Zhijie Zhu, Lei Fan, Maurice Pagnucco et al.

CVPR 2025 • poster • arXiv:2503.10247
6 citations
#2005

It’s a (Blind) Match! Towards Vision-Language Correspondence without Parallel Data

Dominik Schnaus, Nikita Araslanov, Daniel Cremers

CVPR 2025 • poster • arXiv:2503.24129
6 citations
#2006

FlipSketch: Flipping Static Drawings to Text-Guided Sketch Animations

Hmrishav Bandyopadhyay, Yi-Zhe Song

CVPR 2025 • poster • arXiv:2411.10818
6 citations
#2007

Augmented Deep Contexts for Spatially Embedded Video Coding

Yifan Bian, Chuanbo Tang, Li Li et al.

CVPR 2025 • highlight • arXiv:2505.05309
6 citations
#2008

PEACE: Empowering Geologic Map Holistic Understanding with MLLMs

Yangyu Huang, Tianyi Gao, Haoran Xu et al.

CVPR 2025 • poster • arXiv:2501.06184
6 citations
#2009

Rethinking Spiking Self-Attention Mechanism: Implementing α-XNOR Similarity Calculation in Spiking Transformers

Yichen Xiao, Shuai Wang, Dehao Zhang et al.

CVPR 2025 • poster
6 citations
#2010

Laplacian-guided Entropy Model in Neural Codec with Blur-dissipated Synthesis

Atefeh Khoshkhahtinat, Ali Zafari, Piyush Mehta et al.

CVPR 2024 • poster • arXiv:2403.16258
6 citations
#2011

RealEdit: Reddit Edits As a Large-scale Empirical Dataset for Image Transformations

Peter Sushko, Ayana Bharadwaj, Zhi Yang Lim et al.

CVPR 2025 • poster • arXiv:2502.03629
6 citations
#2012

DualPM: Dual Posed-Canonical Point Maps for 3D Shape and Pose Reconstruction

Ben Kaye, Tomas Jakab, Shangzhe Wu et al.

CVPR 2025 • highlight • arXiv:2412.04464
6 citations
#2013

Diffusion Reflectance Map: Single-Image Stochastic Inverse Rendering of Illumination and Reflectance

Yuto Enyo, Ko Nishino

CVPR 2024 • highlight • arXiv:2312.04529
6 citations
#2014

4DTAM: Non-Rigid Tracking and Mapping via Dynamic Surface Gaussians

Hidenobu Matsuki, Gwangbin Bae, Andrew J. Davison

CVPR 2025 • poster • arXiv:2505.22859
6 citations
#2015

Boosting Order-Preserving and Transferability for Neural Architecture Search: a Joint Architecture Refined Search and Fine-tuning Approach

Beichen Zhang, Xiaoxing Wang, Xiaohan Qin et al.

CVPR 2024 • poster • arXiv:2403.11380
6 citations
#2016

Learning to Rank Patches for Unbiased Image Redundancy Reduction

Yang Luo, Zhineng Chen, Peng Zhou et al.

CVPR 2024 • poster • arXiv:2404.00680
6 citations
#2017

RANGE: Retrieval Augmented Neural Fields for Multi-Resolution Geo-Embeddings

Aayush Dhakal, Srikumar Sastry, Subash Khanal et al.

CVPR 2025 • poster • arXiv:2502.19781
6 citations
#2018

TurboFill: Adapting Few-step Text-to-image Model for Fast Image Inpainting

Liangbin Xie, Daniil Pakhomov, Zhonghao Wang et al.

CVPR 2025 • poster • arXiv:2504.00996
6 citations
#2019

Revisiting Sampson Approximations for Geometric Estimation Problems

Felix Rydell, Angelica Torres, Viktor Larsson

CVPR 2024 • poster • arXiv:2401.07114
6 citations
#2020

FluxSpace: Disentangled Semantic Editing in Rectified Flow Models

Yusuf Dalva, Kavana Venkatesh, Pinar Yanardag

CVPR 2025 • poster
6 citations
#2021

COAP: Memory-Efficient Training with Correlation-Aware Gradient Projection

Jinqi Xiao, Shen Sang, Tiancheng Zhi et al.

CVPR 2025 • poster • arXiv:2412.00071
6 citations
#2022

Stable-SCore: A Stable Registration-based Framework for 3D Shape Correspondence

Haolin Liu, Xiaohang Zhan, Zizheng Yan et al.

CVPR 2025 • poster • arXiv:2503.21766
6 citations
#2023

Multi-Session SLAM with Differentiable Wide-Baseline Pose Optimization

Lahav Lipson, Jia Deng

CVPR 2024 • poster • arXiv:2404.15263
6 citations
#2024

3D Occupancy Prediction with Low-Resolution Queries via Prototype-aware View Transformation

Gyeongrok Oh, Sung June Kim, Heeju Ko et al.

CVPR 2025 • poster • arXiv:2503.15185
6 citations
#2025

DefectFill: Realistic Defect Generation with Inpainting Diffusion Model for Visual Inspection

Jaewoo Song, Daemin Park, Kanghyun Baek et al.

CVPR 2025 • highlight • arXiv:2503.13985
6 citations
#2026

A Comprehensive Study of Decoder-Only LLMs for Text-to-Image Generation

Andrew Z Wang, Songwei Ge, Tero Karras et al.

CVPR 2025 • poster • arXiv:2506.08210
6 citations
#2027

BF-STVSR: B-Splines and Fourier - Best Friends for High Fidelity Spatial-Temporal Video Super-Resolution

Eunjin Kim, Hyeonjin Kim, Kyong Hwan Jin et al.

CVPR 2025 • poster • arXiv:2501.11043
6 citations
#2028

Guiding Human-Object Interactions with Rich Geometry and Relations

Mengqing Xue, Yifei Liu, Ling Guo et al.

CVPR 2025 • poster • arXiv:2503.20172
6 citations
#2029

StyleSSP: Sampling StartPoint Enhancement for Training-free Diffusion-based Method for Style Transfer

Ruojun Xu, Weijie Xi, Xiaodi Wang et al.

CVPR 2025 • highlight • arXiv:2501.11319
6 citations
#2030

FreeMan: Towards Benchmarking 3D Human Pose Estimation under Real-World Conditions

Jiong Wang, Fengyu Yang, Bingliang Li et al.

CVPR 2024 • poster • arXiv:2309.05073
6 citations
#2031

Generative Sparse-View Gaussian Splatting

Hanyang Kong, Xingyi Yang, Xinchao Wang

CVPR 2025 • poster
6 citations
#2032

Multi-modal Knowledge Distillation-based Human Trajectory Forecasting

Jaewoo Jeong, Seohee Lee, Daehee Park et al.

CVPR 2025 • poster • arXiv:2503.22201
6 citations
#2033

SDGOCC: Semantic and Depth-Guided Bird's-Eye View Transformation for 3D Multimodal Occupancy Prediction

ZaiPeng Duan, Xuzhong Hu, Pei An et al.

CVPR 2025 • poster • arXiv:2507.17083
6 citations
#2034

AlphaPre: Amplitude-Phase Disentanglement Model for Precipitation Nowcasting

Kenghong Lin, Baoquan Zhang, Demin Yu et al.

CVPR 2025 • poster
6 citations
#2035

Fully Geometric Panoramic Localization

Junho Kim, Jiwon Jeong, Young Min Kim

CVPR 2024 • poster • arXiv:2403.19904
6 citations
#2036

PartRM: Modeling Part-Level Dynamics with Large Cross-State Reconstruction Model

Mingju Gao, Yike Pan, Huan-ang Gao et al.

CVPR 2025 • poster • arXiv:2503.19913
6 citations
#2037

Satellite Observations Guided Diffusion Model for Accurate Meteorological States at Arbitrary Resolution

Siwei Tu, Ben Fei, Weidong Yang et al.

CVPR 2025 • highlight • arXiv:2502.07814
6 citations
#2038

Federated Learning with Domain Shift Eraser

Zheng Wang, Zihui Wang, Zheng Wang et al.

CVPR 2025 • poster • arXiv:2503.13063
6 citations
#2039

D^2iT: Dynamic Diffusion Transformer for Accurate Image Generation

Weinan Jia, Mengqi Huang, Nan Chen et al.

CVPR 2025 • poster
6 citations
#2040

GroupMamba: Efficient Group-Based Visual State Space Model

Abdelrahman Shaker, Syed Talal Wasim, Salman Khan et al.

CVPR 2025 • poster • arXiv:2407.13772
6 citations
#2041

Grounding 3D Object Affordance with Language Instructions, Visual Observations and Interactions

He Zhu, Quyu Kong, Kechun Xu et al.

CVPR 2025 • poster • arXiv:2504.04744
6 citations
#2042

How Do I Do That? Synthesizing 3D Hand Motion and Contacts for Everyday Interactions

Aditya Prakash, Benjamin E Lundell, Dmitry Andreychuk et al.

CVPR 2025 • highlight • arXiv:2504.12284
6 citations
#2043

Density-guided Translator Boosts Synthetic-to-Real Unsupervised Domain Adaptive Segmentation of 3D Point Clouds

Zhimin Yuan, Wankang Zeng, Yanfei Su et al.

CVPR 2024 • poster • arXiv:2403.18469
6 citations
#2044

Learning from Neighbors: Category Extrapolation for Long-Tail Learning

Shizhen Zhao, Xin Wen, Jiahui Liu et al.

CVPR 2025 • poster • arXiv:2410.15980
5 citations
#2045

Bringing CLIP to the Clinic: Dynamic Soft Labels and Negation-Aware Learning for Medical Analysis

Hanbin Ko, Chang Min Park

CVPR 2025 • poster • arXiv:2505.22079
5 citations
#2046

Audio-Visual Semantic Graph Network for Audio-Visual Event Localization

Liang Liu, Shuaiyong Li, Yongqiang Zhu

CVPR 2025 • poster
5 citations
#2047

Detect Any Mirrors: Boosting Learning Reliability on Large-Scale Unlabeled Data with an Iterative Data Engine

Zhaohu Xing, Lihao Liu, Yijun Yang et al.

CVPR 2025 • poster
5 citations
#2048

AniMo: Species-Aware Model for Text-Driven Animal Motion Generation

Xuan Wang, Kai Ruan, Xing Zhang et al.

CVPR 2025 • poster
5 citations
#2049

SimVS: Simulating World Inconsistencies for Robust View Synthesis

Alex Trevithick, Roni Paiss, Philipp Henzler et al.

CVPR 2025 • poster • arXiv:2412.07696
5 citations
#2050

Learned Scanpaths Aid Blind Panoramic Video Quality Assessment

Kanglong Fan, Wen Wen, Mu Li et al.

CVPR 2024 • poster • arXiv:2404.00252
5 citations
#2051

Point-to-Region Loss for Semi-Supervised Point-Based Crowd Counting

Wei Lin, Chenyang Zhao, Antoni B. Chan

CVPR 2025 • highlight • arXiv:2505.21943
5 citations
#2052

NECA: Neural Customizable Human Avatar

Junjin Xiao, Qing Zhang, Zhan Xu et al.

CVPR 2024 • poster • arXiv:2403.10335
5 citations
#2053

GarmentPile: Point-Level Visual Affordance Guided Retrieval and Adaptation for Cluttered Garments Manipulation

Ruihai Wu, Ziyu Zhu, Yuran Wang et al.

CVPR 2025 • poster • arXiv:2503.09243
5 citations
#2054

Pseudo Label Refinery for Unsupervised Domain Adaptation on Cross-dataset 3D Object Detection

Zhanwei Zhang, Minghao Chen, Shuai Xiao et al.

CVPR 2024 • poster • arXiv:2404.19384
5 citations
#2055

Shape My Moves: Text-Driven Shape-Aware Synthesis of Human Motions

Ting-Hsuan Liao, Yi Zhou, Yu Shen et al.

CVPR 2025 • poster • arXiv:2504.03639
5 citations
#2056

Zero-shot 3D Question Answering via Voxel-based Dynamic Token Compression

Hsiang-Wei Huang, Fu-Chen Chen, Wenhao Chai et al.

CVPR 2025 • poster
5 citations
#2057

Seg2Reg: Differentiable 2D Segmentation to 1D Regression Rendering for 360 Room Layout Reconstruction

Cheng Sun, Wei-En Tai, Yu-Lin Shih et al.

CVPR 2024 • poster • arXiv:2311.18695
5 citations
#2058

EdgeDiff: Edge-aware Diffusion Network for Building Reconstruction from Point Clouds

Yujun Liu, Ruisheng Wang, Shangfeng Huang et al.

CVPR 2025 • poster
5 citations
#2059

Boosting Adversarial Transferability through Augmentation in Hypothesis Space

Yu Guo, Weiquan Liu, Qingshan Xu et al.

CVPR 2025 • poster
5 citations
#2060

HeMoRa: Unsupervised Heuristic Consensus Sampling for Robust Point Cloud Registration

Shaocheng Yan, Yiming Wang, Kaiyan Zhao et al.

CVPR 2025 • poster
5 citations
#2061

FutureHuman3D: Forecasting Complex Long-Term 3D Human Behavior from Video Observations

Christian Diller, Thomas Funkhouser, Angela Dai

CVPR 2024 • poster • arXiv:2211.14309
5 citations
#2062

FADA: Fast Diffusion Avatar Synthesis with Mixed-Supervised Multi-CFG Distillation

Tianyun Zhong, Chao Liang, Jianwen Jiang et al.

CVPR 2025 • poster • arXiv:2412.16915
5 citations
#2063

Efficient ANN-Guided Distillation: Aligning Rate-based Features of Spiking Neural Networks through Hybrid Block-wise Replacement

Shu Yang, Chengting Yu, Lei Liu et al.

CVPR 2025 • poster • arXiv:2503.16572
5 citations
#2064

IAAO: Interactive Affordance Learning for Articulated Objects in 3D Environments

Can Zhang, Gim Hee Lee

CVPR 2025 • poster • arXiv:2504.06827
5 citations
#2065

Exploiting Temporal State Space Sharing for Video Semantic Segmentation

Hesham Syed, Yun Liu, Guolei Sun et al.

CVPR 2025 • poster • arXiv:2503.20824
5 citations
#2066

Towards RAW Object Detection in Diverse Conditions

Zhong-Yu Li, Xin Jin, Bo-Yuan Sun et al.

CVPR 2025 • highlight • arXiv:2411.15678
5 citations
#2067

Complementary Advantages: Exploiting Cross-Field Frequency Correlation for NIR-Assisted Image Denoising

Yuchen Wang, Hongyuan Wang, Lizhi Wang et al.

CVPR 2025 • poster • arXiv:2412.16645
5 citations
#2068

Text Grouping Adapter: Adapting Pre-trained Text Detector for Layout Analysis

Tianci Bi, Xiaoyi Zhang, Zhizheng Zhang et al.

CVPR 2024 • poster • arXiv:2405.07481
5 citations
#2069

High-Fidelity Relightable Monocular Portrait Animation with Lighting-Controllable Video Diffusion Model

Mingtao Guo, Guanyu Xing, Yanli Liu

CVPR 2025 • poster • arXiv:2502.19894
5 citations
#2070

DeSplat: Decomposed Gaussian Splatting for Distractor-Free Rendering

Yihao Wang, Marcus Klasson, Matias Turkulainen et al.

CVPR 2025 • poster • arXiv:2411.19756
5 citations
#2071

D^3-Human: Dynamic Disentangled Digital Human from Monocular Video

Honghu Chen, Bo Peng, Yunfan Tao et al.

CVPR 2025 • poster • arXiv:2501.01589
5 citations
#2072

Stationary Representations: Optimally Approximating Compatibility and Implications for Improved Model Replacements

Niccolò Biondi, Federico Pernici, Simone Ricci et al.

CVPR 2024 • highlight • arXiv:2405.02581
5 citations
#2073

Training-free Dense-Aligned Diffusion Guidance for Modular Conditional Image Synthesis

Zixuan Wang, Duo Peng, Feng Chen et al.

CVPR 2025 • poster • arXiv:2504.01515
5 citations
#2074

SCULPT: Shape-Conditioned Unpaired Learning of Pose-dependent Clothed and Textured Human Meshes

Soubhik Sanyal, Partha Ghosh, Jinlong Yang et al.

CVPR 2024 • poster • arXiv:2308.10638
5 citations
#2075

Unsegment Anything by Simulating Deformation

Jiahao Lu, Xingyi Yang, Xinchao Wang

CVPR 2024 • poster • arXiv:2404.02585
5 citations
#2076

ShapeWalk: Compositional Shape Editing Through Language-Guided Chains

Habib Slim, Mohamed Elhoseiny

CVPR 2024 • poster
5 citations
#2077

On the Consistency of Video Large Language Models in Temporal Comprehension

Minjoon Jung, Junbin Xiao, Byoung-Tak Zhang et al.

CVPR 2025 • poster • arXiv:2411.12951
5 citations
#2078

MotionPRO: Exploring the Role of Pressure in Human MoCap and Beyond

Shenghao Ren, Yi Lu, Jiayi Huang et al.

CVPR 2025 • highlight • arXiv:2504.05046
5 citations
#2079

A Polarization-Aided Transformer for Image Deblurring via Motion Vector Decomposition

Duosheng Chen, Shihao Zhou, Jinshan Pan et al.

CVPR 2025 • highlight
5 citations
#2080

Mind the Trojan Horse: Image Prompt Adapter Enabling Scalable and Deceptive Jailbreaking

Junxi Chen, Junhao Dong, Xiaohua Xie

CVPR 2025 • highlight • arXiv:2504.05838
5 citations
#2081

One is Plenty: A Polymorphic Feature Interpreter for Immutable Heterogeneous Collaborative Perception

Yuchen Xia, Quan Yuan, Guiyang Luo et al.

CVPR 2025 • poster • arXiv:2411.16799
5 citations
#2082

ProtoDepth: Unsupervised Continual Depth Completion with Prototypes

Patrick Rim, Hyoungseob Park, Suchisrit Gangopadhyay et al.

CVPR 2025 • poster • arXiv:2503.12745
5 citations
#2083

ACT-Diffusion: Efficient Adversarial Consistency Training for One-step Diffusion Models

Fei Kong, Jinhao Duan, Lichao Sun et al.

CVPR 2024 • poster • arXiv:2311.14097
5 citations
#2084

AesthetiQ: Enhancing Graphic Layout Design via Aesthetic-Aware Preference Alignment of Multi-modal Large Language Models

Sohan Patnaik, Rishabh Jain, Balaji Krishnamurthy et al.

CVPR 2025 • poster • arXiv:2503.00591
5 citations
#2085

ZeroGrasp: Zero-Shot Shape Reconstruction Enabled Robotic Grasping

Shun Iwase, Muhammad Zubair Irshad, Katherine Liu et al.

CVPR 2025 • poster • arXiv:2504.10857
5 citations
#2086

Hyperspherical Classification with Dynamic Label-to-Prototype Assignment

Mohammad Saadabadi Saadabadi, Ali Dabouei, Sahar Rahimi Malakshan et al.

CVPR 2024 • poster • arXiv:2403.16937
5 citations
#2087

Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis

Yousef Yeganeh, Ioannis Charisiadis, Marta Hasny et al.

CVPR 2025 • highlight • arXiv:2412.20651
5 citations
#2088

DiaLoc: An Iterative Approach to Embodied Dialog Localization

Chao Zhang, Mohan Li, Ignas Budvytis et al.

CVPR 2024 • poster • arXiv:2403.06846
5 citations
#2089

DiTASK: Multi-Task Fine-Tuning with Diffeomorphic Transformations

Krishna Sri Ipsit Mantri, Carola-Bibiane Schönlieb, Bruno Ribeiro et al.

CVPR 2025 • poster • arXiv:2502.06029
5 citations
#2090

GIFStream: 4D Gaussian-based Immersive Video with Feature Stream

Hao Li, Sicheng Li, Xiang Gao et al.

CVPR 2025 • poster • arXiv:2505.07539
5 citations
#2091

Intensity-Robust Autofocus for Spike Camera

Changqing Su, Zhiyuan Ye, Yongsheng Xiao et al.

CVPR 2024 • poster
5 citations
#2092

Self-Learning Hyperspectral and Multispectral Image Fusion via Adaptive Residual Guided Subspace Diffusion Model

Jian Zhu, He Wang, Yang Xu et al.

CVPR 2025 • poster • arXiv:2505.11800
5 citations
#2093

Makeup Prior Models for 3D Facial Makeup Estimation and Applications

Xingchao Yang, Takafumi Taketomi, Yuki Endo et al.

CVPR 2024 • poster • arXiv:2403.17761
5 citations
#2094

Neural Motion Simulator Pushing the Limit of World Models in Reinforcement Learning

Chenjie Hao, Weyl Lu, Yifan Xu et al.

CVPR 2025 • poster • arXiv:2504.07095
5 citations
#2095

Mosaic of Modalities: A Comprehensive Benchmark for Multimodal Graph Learning

Jing Zhu, Yuhang Zhou, Shengyi Qian et al.

CVPR 2025 • poster • arXiv:2406.16321
5 citations
#2096

FreeGave: 3D Physics Learning from Dynamic Videos by Gaussian Velocity

Jinxi Li, Ziyang Song, Siyuan Zhou et al.

CVPR 2025 • poster • arXiv:2506.07865
5 citations
#2097

Robust Message Embedding via Attention Flow-Based Steganography

Huayuan Ye, Shenzhuo Zhang, Shiqi Jiang et al.

CVPR 2025 • poster • arXiv:2405.16414
5 citations
#2098

Multi-modal Vision Pre-training for Medical Image Analysis

Shaohao Rui, Lingzhi Chen, Zhenyu Tang et al.

CVPR 2025 • highlight • arXiv:2410.10604
5 citations
#2099

AIpparel: A Multimodal Foundation Model for Digital Garments

Kiyohiro Nakayama, Jan Ackermann, Timur Levent Kesdogan et al.

CVPR 2025 • highlight • arXiv:2412.03937
5 citations
#2100

ITA-MDT: Image-Timestep-Adaptive Masked Diffusion Transformer Framework for Image-Based Virtual Try-On

Ji Woo Hong, Tri Ton, Trung X. Pham et al.

CVPR 2025 • poster • arXiv:2503.20418
5 citations
#2101

UHD-processer: Unified UHD Image Restoration with Progressive Frequency Learning and Degradation-aware Prompts

Yidi Liu, Dong Li, Xueyang Fu et al.

CVPR 2025 • poster
5 citations
#2102

NoPain: No-box Point Cloud Attack via Optimal Transport Singular Boundary

Zezeng Li, Xiaoyu Du, Na Lei et al.

CVPR 2025 • poster • arXiv:2503.00063
5 citations
#2103

Revisiting Source-Free Domain Adaptation: Insights into Representativeness, Generalization, and Variety

Ronghang Zhu, Mengxuan Hu, Weiming Zhuang et al.

CVPR 2025 • poster
5 citations
#2104

Spotting the Unexpected (STU): A 3D LiDAR Dataset for Anomaly Segmentation in Autonomous Driving

Alexey Nekrasov, Malcolm Burdorf, Stewart Worrall et al.

CVPR 2025 • poster • arXiv:2505.02148
5 citations
#2105

SALOVA: Segment-Augmented Long Video Assistant for Targeted Retrieval and Routing in Long-Form Video Analysis

Junho Kim, Hyunjun Kim, Hosu Lee et al.

CVPR 2025 • poster • arXiv:2411.16173
5 citations
#2106

ReNeg: Learning Negative Embedding with Reward Guidance

Xiaomin Li, Yixuan Liu, Takashi Isobe et al.

CVPR 2025 • highlight • arXiv:2412.19637
5 citations
#2107

CLOC: Contrastive Learning for Ordinal Classification with Multi-Margin N-pair Loss

Dileepa Pitawela, Gustavo Carneiro, Hsiang-Ting Chen

CVPR 2025 • poster • arXiv:2504.17813
5 citations
#2108

Edit One for All: Interactive Batch Image Editing

Thao Nguyen, Utkarsh Ojha, Yuheng Li et al.

CVPR 2024 • poster • arXiv:2401.10219
5 citations
#2109

HandOS: 3D Hand Reconstruction in One Stage

Xingyu Chen, Zhuheng Song, Xiaoke Jiang et al.

CVPR 2025 • poster • arXiv:2412.01537
5 citations
#2110

InteractionMap: Improving Online Vectorized HDMap Construction with Interaction

Kuang Wu, Chuan Yang, Zhanbin Li

CVPR 2025 • poster • arXiv:2503.21659
5 citations
#2111

Aerial Lifting: Neural Urban Semantic and Building Instance Lifting from Aerial Imagery

Yuqi Zhang, Guanying Chen, Jiaxing Chen et al.

CVPR 2024 • poster • arXiv:2403.11812
5 citations
#2112

RoboSense: Large-scale Dataset and Benchmark for Egocentric Robot Perception and Navigation in Crowded and Unstructured Environments

Haisheng Su, Feixiang Song, Cong Ma et al.

CVPR 2025 • poster • arXiv:2408.15503
5 citations
#2113

Efficient Hyperparameter Optimization with Adaptive Fidelity Identification

Jiantong Jiang, Zeyi Wen, Atif Mansoor et al.

CVPR 2024 • poster
5 citations
#2114

V-Stylist: Video Stylization via Collaboration and Reflection of MLLM Agents

Zhengrong Yue, Shaobin Zhuang, Kunchang Li et al.

CVPR 2025 • poster • arXiv:2503.12077
5 citations
#2115

Binarized Neural Network for Multi-spectral Image Fusion

Junming Hou, Xiaoyu Chen, Ran Ran et al.

CVPR 2025 • poster
5 citations
#2116

Enhancing Testing-Time Robustness for Trusted Multi-View Classification in the Wild

Wei Liu, Yufei Chen, Xiaodong Yue

CVPR 2025 • poster
5 citations
#2117

Pos3R: 6D Pose Estimation for Unseen Objects Made Easy

Weijian Deng, Dylan Campbell, Chunyi Sun et al.

CVPR 2025 • poster
5 citations
#2118

CTRL-D: Controllable Dynamic 3D Scene Editing with Personalized 2D Diffusion

Kai He, Chin-Hsuan Wu, Igor Gilitschenski

CVPR 2025 • poster • arXiv:2412.01792
5 citations
#2119

Omnidirectional Multi-Object Tracking

Kai Luo, Hao Shi, Sheng Wu et al.

CVPR 2025 • poster • arXiv:2503.04565
5 citations
#2120

A Unified Framework for Human-centric Point Cloud Video Understanding

Yiteng Xu, Kecheng Ye, Xiao Han et al.

CVPR 2024 • poster • arXiv:2403.20031
5 citations
#2121

GENMANIP: LLM-driven Simulation for Generalizable Instruction-Following Manipulation

Ning Gao, Yilun Chen, Shuai Yang et al.

CVPR 2025 • poster • arXiv:2506.10966
5 citations
#2122

ProbeSDF: Light Field Probes For Neural Surface Reconstruction

Briac Toussaint, Diego Thomas, Jean-Sébastien Franco

CVPR 2025 • poster • arXiv:2412.10084
5 citations
#2123

Advancing Semantic Future Prediction through Multimodal Visual Sequence Transformers

Efstathios Karypidis, Ioannis Kakogeorgiou, Spyros Gidaris et al.

CVPR 2025 • poster • arXiv:2501.08303
5 citations
#2124

Dynamic Stereotype Theory Induced Micro-expression Recognition with Oriented Deformation

Bohao Zhang, Xuejiao Wang, Changbo Wang et al.

CVPR 2025 • poster
5 citations
#2125

Robust Self-calibration of Focal Lengths from the Fundamental Matrix

Viktor Kocur, Daniel Kyselica, Zuzana Kukelova

CVPR 2024 • poster • arXiv:2311.16304
5 citations
#2126

FDS: Frequency-Aware Denoising Score for Text-Guided Latent Diffusion Image Editing

Yufan Ren, Zicong Jiang, Tong Zhang et al.

CVPR 2025 • poster • arXiv:2503.19191
5 citations
#2127

Generalized Diffusion Detector: Mining Robust Features from Diffusion Models for Domain-Generalized Detection

Boyong He, Yuxiang Ji, Qianwen Ye et al.

CVPR 2025 • poster • arXiv:2503.02101
5 citations
#2128

OCRT: Boosting Foundation Models in the Open World with Object-Concept-Relation Triad

Luyao Tang, Chaoqi Chen, Yuxuan Yuan et al.

CVPR 2025 • poster • arXiv:2503.18695
5 citations
#2129

Visual Objectification in Films: Towards a New AI Task for Video Interpretation

Julie Tores, Lucile Sassatelli, Hui-Yin Wu et al.

CVPR 2024 • highlight • arXiv:2401.13296
5 citations
#2130

Learning Visual Generative Priors without Text

Shuailei Ma, Kecheng Zheng, Ying Wei et al.

CVPR 2025 • poster • arXiv:2412.07767
5 citations
#2131

MAtCha Gaussians: Atlas of Charts for High-Quality Geometry and Photorealism From Sparse Views

Antoine Guédon, Tomoki Ichikawa, Kohei Yamashita et al.

CVPR 2025 • highlight • arXiv:2412.06767
5 citations
#2132

HUSH: Holistic Panoramic 3D Scene Understanding using Spherical Harmonics

Jongsung Lee, Harin Park, Byeong-Uk Lee et al.

CVPR 2025 • poster
5 citations
#2133

LUCAS: Layered Universal Codec Avatars

Di Liu, Teng Deng, Giljoo Nam et al.

CVPR 2025 • poster • arXiv:2502.19739
5 citations
#2134

Improving Adversarial Transferability on Vision Transformers via Forward Propagation Refinement

Yuchen Ren, Zhengyu Zhao, Chenhao Lin et al.

CVPR 2025 • poster • arXiv:2503.15404
5 citations
#2135

Detection-Friendly Nonuniformity Correction: A Union Framework for Infrared UAV Target Detection

Houzhang Fang, Xiaolin Wang, Zengyang Li et al.

CVPR 2025 • highlight
5 citations
#2136

Context-Enhanced Memory-Refined Transformer for Online Action Detection

Zhanzhong Pang, Fadime Sener, Angela Yao

CVPR 2025 • poster • arXiv:2503.18359
5 citations
#2137

Projecting Trackable Thermal Patterns for Dynamic Computer Vision

Mark Sheinin, Aswin C. Sankaranarayanan, Srinivasa G. Narasimhan

CVPR 2024 • poster
5 citations
#2138

Modeling Thousands of Human Annotators for Generalizable Text-to-Image Person Re-identification

Jiayu Jiang, Changxing Ding, Wentao Tan et al.

CVPR 2025 • highlight • arXiv:2503.09962
5 citations
#2139

Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie Dubbing

Zhedong Zhang, Liang Li, Chenggang Yan et al.

CVPR 2025 • poster • arXiv:2503.12042
5 citations
#2140

On the Zero-shot Adversarial Robustness of Vision-Language Models: A Truly Zero-shot and Training-free Approach

Baoshun Tong, Hanjiang Lai, Yan Pan et al.

CVPR 2025 • poster
5 citations
#2141

3DEnhancer: Consistent Multi-View Diffusion for 3D Enhancement

Yihang Luo, Shangchen Zhou, Yushi Lan et al.

CVPR 2025 • poster • arXiv:2412.18565
5 citations
#2142

Concept Replacer: Replacing Sensitive Concepts in Diffusion Models via Precision Localization

Lingyun Zhang, Yu Xie, Yanwei Fu et al.

CVPR 2025 • poster • arXiv:2412.01244
5 citations
#2143

Understanding Fine-tuning CLIP for Open-vocabulary Semantic Segmentation in Hyperbolic Space

Zelin Peng, Zhengqin Xu, Zhilin Zeng et al.

CVPR 2025 • poster
5 citations
#2144

CoMBO: Conflict Mitigation via Branched Optimization for Class Incremental Segmentation

Kai Fang, Anqi Zhang, Guangyu Gao et al.

CVPR 2025 • poster • arXiv:2504.04156
5 citations
#2145

Question-Aware Gaussian Experts for Audio-Visual Question Answering

Hongyeob Kim, Inyoung Jung, Dayoon Suh et al.

CVPR 2025 • highlight • arXiv:2503.04459
5 citations
#2146

Finding Local Diffusion Schrödinger Bridge using Kolmogorov-Arnold Network

Xingyu Qiu, Mengying Yang, Xinghua Ma et al.

CVPR 2025 • poster • arXiv:2502.19754
5 citations
#2147

OnlineAnySeg: Online Zero-Shot 3D Segmentation by Visual Foundation Model Guided 2D Mask Merging

Yijie Tang, Jiazhao Zhang, Yuqing Lan et al.

CVPR 2025 • poster • arXiv:2503.01309
5 citations
#2148

Learning to Navigate Efficiently and Precisely in Real Environments

Guillaume Bono, Hervé Poirier, Leonid Antsfeld et al.

CVPR 2024 • poster • arXiv:2401.14349
5 citations
#2149

Object-aware Sound Source Localization via Audio-Visual Scene Understanding

Sung Jin Um, Dongjin Kim, Sangmin Lee et al.

CVPR 2025 • poster • arXiv:2506.18557
5 citations
#2150

PanDA: Towards Panoramic Depth Anything with Unlabeled Panoramas and Mobius Spatial Augmentation

Zidong Cao, Jinjing Zhu, Weiming Zhang et al.

CVPR 2025 • poster • arXiv:2406.13378
5 citations
#2151

Cropper: Vision-Language Model for Image Cropping through In-Context Learning

Seung Hyun Lee, Jijun Jiang, Yiran Xu et al.

CVPR 2025 • poster • arXiv:2408.07790
5 citations
#2152

Spatial Transport Optimization by Repositioning Attention Map for Training-Free Text-to-Image Synthesis

Woojung Han, Yeonkyung Lee, Chanyoung Kim et al.

CVPR 2025 • poster • arXiv:2503.22168
5 citations
#2153

Interpretable Generative Models through Post-hoc Concept Bottlenecks

Akshay R. Kulkarni, Ge Yan, Chung-En Sun et al.

CVPR 2025 • poster • arXiv:2503.19377
5 citations
#2154

Distilled Prompt Learning for Incomplete Multimodal Survival Prediction

Yingxue Xu, Fengtao Zhou, Chenyu Zhao et al.

CVPR 2025 • poster • arXiv:2503.01653
5 citations
#2155

Ego4o: Egocentric Human Motion Capture and Understanding from Multi-Modal Input

Jian Wang, Rishabh Dabral, Diogo Luvizon et al.

CVPR 2025 • poster • arXiv:2504.08449
5 citations
#2156

IterIS: Iterative Inference-Solving Alignment for LoRA Merging

Hongxu Chen, Zhen Wang, Runshi Li et al.

CVPR 2025 • poster • arXiv:2411.15231
5 citations
#2157

Fully Exploiting Every Real Sample: SuperPixel Sample Gradient Model Stealing

Yunlong Zhao, Xiaoheng Deng, Yijing Liu et al.

CVPR 2024 • poster • arXiv:2406.18540
5 citations
#2158

AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward

Haonan Han, Xiangzuo Wu, Huan Liao et al.

CVPR 2025 • poster • arXiv:2411.18654
5 citations
#2159

EchoONE: Segmenting Multiple Echocardiography Planes in One Model

Jiongtong Hu, Wei Zhuo, Jun Cheng et al.

CVPR 2025 • poster • arXiv:2412.02993
5 citations
#2160

Seeing What Matters: Empowering CLIP with Patch Generation-to-Selection

Gensheng Pei, Tao Chen, Yujia Wang et al.

CVPR 2025 • poster • arXiv:2503.17080
5 citations
#2161

Free360: Layered Gaussian Splatting for Unbounded 360-Degree View Synthesis from Extremely Sparse and Unposed Views

Chong Bao, Xiyu Zhang, Zehao Yu et al.

CVPR 2025 • poster • arXiv:2503.24382
5 citations
#2162

4Deform: Neural Surface Deformation for Robust Shape Interpolation

Lu Sang, Zehranaz Canfes, Dongliang Cao et al.

CVPR 2025 • poster • arXiv:2502.20208
5 citations
#2163

DecoupledGaussian: Object-Scene Decoupling for Physics-Based Interaction

Miaowei Wang, Yibo Zhang, Rui Ma et al.

CVPR 2025 • poster • arXiv:2503.05484
5 citations
#2164

Universal Robustness via Median Randomized Smoothing for Real-World Super-Resolution

Zakariya Chaouai, Mohamed Tamaazousti

CVPR 2024 • poster • arXiv:2405.14934
5 citations
#2165

Multi-party Collaborative Attention Control for Image Customization

Han Yang, Chuanguang Yang, Qiuli Wang et al.

CVPR 2025 • poster • arXiv:2505.01428
5 citations
#2166

From Alexnet to Transformers: Measuring the Non-linearity of Deep Neural Networks with Affine Optimal Transport

Quentin Bouniot, Ievgen Redko, Anton Mallasto et al.

CVPR 2025 • poster • arXiv:2310.11439
5 citations
#2167

Hardware-Rasterized Ray-Based Gaussian Splatting

Samuel Rota Bulò, Lorenzo Porzi, Nemanja Bartolovic et al.

CVPR 2025 • highlight • arXiv:2503.18682
5 citations
#2168

Flow-Guided Online Stereo Rectification for Wide Baseline Stereo

Anush Kumar, Fahim Mannan, Omid Hosseini Jafari et al.

CVPR 2024 • poster
5 citations
#2169

Removing Reflections from RAW Photos

Eric Kee, Adam Pikielny, Kevin Blackburn-Matzen et al.

CVPR 2025 • poster • arXiv:2404.14414
5 citations
#2170

Lifelong Knowledge Editing for Vision Language Models with Low-Rank Mixture-of-Experts

Qizhou Chen, Chengyu Wang, Dakan Wang et al.

CVPR 2025 • poster • arXiv:2411.15432
5 citations
#2171

Learning Heterogeneous Tissues with Mixture of Experts for Gigapixel Whole Slide Images

Junxian Wu, Minheng Chen, Xinyi Ke et al.

CVPR 2025 • poster
5 citations
#2172

Semantic and Expressive Variations in Image Captions Across Languages

Andre Ye, Sebastin Santy, Jena D. Hwang et al.

CVPR 2025 • poster • arXiv:2310.14356
5 citations
#2173

UIBDiffusion: Universal Imperceptible Backdoor Attack for Diffusion Models

Yuning Han, Bingyin Zhao, Rui Chu et al.

CVPR 2025 • highlight • arXiv:2412.11441
5 citations
#2174

SocialGesture: Delving into Multi-person Gesture Understanding

Xu Cao, Pranav Virupaksha, Wenqi Jia et al.

CVPR 2025 • poster • arXiv:2504.02244
5 citations
#2175

TFCustom: Customized Image Generation with Time-Aware Frequency Feature Guidance

Mushui Liu, Dong She, Qihan Huang et al.

CVPR 2025 • highlight
5 citations
#2176

Long Video Diffusion Generation with Segmented Cross-Attention and Content-Rich Video Data Curation

Xin Yan, Yuxuan Cai, Qiuyue Wang et al.

CVPR 2025 • poster • arXiv:2412.01316
5 citations
#2177

FaceLift: Semi-supervised 3D Facial Landmark Localization

David Ferman, Pablo Garrido, Gaurav Bharaj

CVPR 2024 • poster • arXiv:2405.19646
5 citations
#2178

Uni-Renderer: Unifying Rendering and Inverse Rendering Via Dual Stream Diffusion

ZhiFei Chen, Tianshuo Xu, Wenhang Ge et al.

CVPR 2025 • poster • arXiv:2412.15050
5 citations
#2179

Improving Transferable Targeted Attacks with Feature Tuning Mixup

Kaisheng Liang, Xuelong Dai, Yanjie Li et al.

CVPR 2025 • poster • arXiv:2411.15553
5 citations
#2180

Unleashing Network Potentials for Semantic Scene Completion

Fengyun Wang, Qianru Sun, Dong Zhang et al.

CVPR 2024 • poster • arXiv:2403.07560
5 citations
#2181

From Elements to Design: A Layered Approach for Automatic Graphic Design Composition

Jiawei Lin, Shizhao Sun, Danqing Huang et al.

CVPR 2025 • poster • arXiv:2412.19712
5 citations
#2182

DiffVsgg: Diffusion-Driven Online Video Scene Graph Generation

Mu Chen, Liulei Li, Wenguan Wang et al.

CVPR 2025 • poster • arXiv:2503.13957
5 citations
#2183

Open-World Objectness Modeling Unifies Novel Object Detection

Shan Zhang, Yao Ni, Jinhao Du et al.

CVPR 2025 • poster
5 citations
#2184

Normalizing Flows on the Product Space of SO(3) Manifolds for Probabilistic Human Pose Modeling

Olaf Dünkel, Tim Salzmann, Florian Pfaff

CVPR 2024 • poster • arXiv:2404.05675
5 citations
#2185

Generalizable Face Landmarking Guided by Conditional Face Warping

Jiayi Liang, Haotian Liu, Hongteng Xu et al.

CVPR 2024 • poster • arXiv:2404.12322
5 citations
#2186

One2Any: One-Reference 6D Pose Estimation for Any Object

Mengya Liu, Siyuan Li, Ajad Chhatkuli et al.

CVPR 2025 • poster • arXiv:2505.04109
5 citations
#2187

JTD-UAV: MLLM-Enhanced Joint Tracking and Description Framework for Anti-UAV Systems

Yifan Wang, Jian Zhao, Zhaoxin Fan et al.

CVPR 2025 • poster
5 citations
#2188

HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding

Chenxin Tao, Shiqian Su, Xizhou Zhu et al.

CVPR 2025 • poster • arXiv:2412.16158
5 citations
#2189

Birth and Death of a Rose

Chen Geng, Yunzhi Zhang, Shangzhe Wu et al.

CVPR 2025 • poster • arXiv:2412.05278
5 citations
#2190

Towards Robust 3D Pose Transfer with Adversarial Learning

Haoyu Chen, Hao Tang, Ehsan Adeli et al.

CVPR 2024 • poster • arXiv:2404.02242
5 citations
#2191

BerfScene: Bev-conditioned Equivariant Radiance Fields for Infinite 3D Scene Generation

Qihang Zhang, Yinghao Xu, Yujun Shen et al.

CVPR 2024 • poster • arXiv:2312.02136
5 citations
#2192

FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute

Sotiris Anagnostidis, Gregor Bachmann, Yeongmin Kim et al.

CVPR 2025 • highlight • arXiv:2502.20126
5 citations
#2193

Adversarial Domain Prompt Tuning and Generation for Single Domain Generalization

Zhipeng Xu, De Cheng, Xinyang Jiang et al.

CVPR 2025 • poster
5 citations
#2194

Multi-Resolution Pathology-Language Pre-training Model with Text-Guided Visual Representation

Shahad Albastaki, Anabia Sohail, Iyyakutti Iyappan Ganapathi et al.

CVPR 2025 • poster • arXiv:2504.18856
5 citations
#2195

Learning with Unreliability: Fast Few-shot Voxel Radiance Fields with Relative Geometric Consistency

Xu Yingjie, Bangzhen Liu, Hao Tang et al.

CVPR 2024 • poster • arXiv:2403.17638
5 citations
#2196

NightAdapter: Learning a Frequency Adapter for Generalizable Night-time Scene Segmentation

Qi Bi, Jingjun Yi, Huimin Huang et al.

CVPR 2025 • poster
5 citations
#2197

Probabilistic Sampling of Balanced K-Means using Adiabatic Quantum Computing

Jan-Nico Zaech, Martin Danelljan, Tolga Birdal et al.

CVPR 2024 • poster • arXiv:2310.12153
5 citations
#2198

Adaptive Non-Uniform Timestep Sampling for Accelerating Diffusion Model Training

Myunsoo Kim, Donghyeon Ki, Seong-Woong Shim et al.

CVPR 2025 • poster • arXiv:2411.09998
5 citations
#2199

EFormer: Enhanced Transformer towards Semantic-Contour Features of Foreground for Portraits Matting

Zitao Wang, Qiguang Miao, Yue Xi et al.

CVPR 2024 • poster • arXiv:2308.12831
5 citations
#2200

LongDiff: Training-Free Long Video Generation in One Go

Zhuoling Li, Hossein Rahmani, Qiuhong Ke et al.

CVPR 2025 • poster • arXiv:2503.18150
5 citations