Most Cited CVPR "markov chain estimation" Papers

5,589 papers found • Page 17 of 28

#3201

Compositional Video Understanding with Spatiotemporal Structure-based Transformers

Hoyeoung Yun, Jinwoo Ahn, Minseo Kim et al.

CVPR 2024poster
#3202

RADIOv2.5: Improved Baselines for Agglomerative Vision Foundation Models

Greg Heinrich, Mike Ranzinger, Danny Yin et al.

CVPR 2025posterarXiv:2412.07679
#3203

Make-It-Vivid: Dressing Your Animatable Biped Cartoon Characters from Text

Junshu Tang, Yanhong Zeng, Ke Fan et al.

CVPR 2024posterarXiv:2403.16897
#3204

Font-Agent: Enhancing Font Understanding with Large Language Models

Yingxin Lai, Cuijie Xu, Haitian Shi et al.

CVPR 2025poster
#3205

Multi-Modal Contrastive Masked Autoencoders: A Two-Stage Progressive Pre-training Approach for RGBD Datasets

Muhammad Abdullah Jamal, Omid Mohareri

CVPR 2025poster
#3206

Coherent Temporal Synthesis for Incremental Action Segmentation

Guodong Ding, Hans Golong, Angela Yao

CVPR 2024posterarXiv:2403.06102
#3207

Person in Place: Generating Associative Skeleton-Guidance Maps for Human-Object Interaction Image Editing

ChangHee Yang, ChanHee Kang, Kyeongbo Kong et al.

CVPR 2024poster
#3208

MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling

Yifang Men, Yuan Yao, Miaomiao Cui et al.

CVPR 2025posterarXiv:2409.16160
#3209

Estimating Extreme 3D Image Rotations using Cascaded Attention

Shay Dekel, Yosi Keller, Martin Čadík

CVPR 2024poster
#3210

Towards Real-World HDR Video Reconstruction: A Large-Scale Benchmark Dataset and A Two-Stage Alignment Network

Yong Shu, Liquan Shen, Xiangyu Hu et al.

CVPR 2024posterarXiv:2405.00244
#3211

Photo-SLAM: Real-time Simultaneous Localization and Photorealistic Mapping for Monocular Stereo and RGB-D Cameras

Huajian Huang, Longwei Li, Hui Cheng et al.

CVPR 2024posterarXiv:2311.16728
#3212

Attention Calibration for Disentangled Text-to-Image Personalization

Yanbing Zhang, Mengping Yang, Qin Zhou et al.

CVPR 2024posterarXiv:2403.18551
#3213

SmartMask: Context Aware High-Fidelity Mask Generation for Fine-grained Object Insertion and Layout Control

Jaskirat Singh, Jianming Zhang, Qing Liu et al.

CVPR 2024posterarXiv:2312.05039
#3214

GraCo: Granularity-Controllable Interactive Segmentation

Yian Zhao, Kehan Li, Zesen Cheng et al.

CVPR 2024highlightarXiv:2405.00587
#3215

Segment Every Out-of-Distribution Object

Wenjie Zhao, Jia Li, Xin Dong et al.

CVPR 2024posterarXiv:2311.16516
#3216

Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution

Shangchen Zhou, Peiqing Yang, Jianyi Wang et al.

CVPR 2024highlightarXiv:2312.06640
#3217

Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild

Fanghua Yu, Jinjin Gu, Zheyuan Li et al.

CVPR 2024posterarXiv:2401.13627
#3218

Masked and Shuffled Blind Spot Denoising for Real-World Images

Hamadi Chihaoui, Paolo Favaro

CVPR 2024posterarXiv:2404.09389
#3219

STINR: Deciphering Spatial Transcriptomics via Implicit Neural Representation

Yisi Luo, Xile Zhao, Kai Ye et al.

CVPR 2025poster
#3220

Open-Vocabulary Object 6D Pose Estimation

Jaime Corsetti, Davide Boscaini, Changjae Oh et al.

CVPR 2024highlightarXiv:2312.00690
#3221

Generative Region-Language Pretraining for Open-Ended Object Detection

Chuang Lin, Yi Jiang, Lizhen Qu et al.

CVPR 2024posterarXiv:2403.10191
#3222

3D-SLNR: A Super Lightweight Neural Representation for Large-scale 3D Mapping

Chenhui Shi, Fulin Tang, Ning An et al.

CVPR 2025poster
#3223

Boosting Diffusion Models with Moving Average Sampling in Frequency Domain

Yurui Qian, Qi Cai, Yingwei Pan et al.

CVPR 2024posterarXiv:2403.17870
#3224

R-SCoRe: Revisiting Scene Coordinate Regression for Robust Large-Scale Visual Localization

Xudong Jiang, Fangjinhua Wang, Silvano Galliani et al.

CVPR 2025posterarXiv:2501.01421
#3225

A Unified Image-Dense Annotation Generation Model for Underwater Scenes

Hongkai Lin, Dingkang Liang, Zhenghao Qi et al.

CVPR 2025posterarXiv:2503.21771
#3226

Discovering Syntactic Interaction Clues for Human-Object Interaction Detection

Jinguo Luo, Weihong Ren, Weibo Jiang et al.

CVPR 2024poster
#3227

Quantifying Uncertainty in Motion Prediction with Variational Bayesian Mixture

Juanwu Lu, Can Cui, Yunsheng Ma et al.

CVPR 2024posterarXiv:2404.03789
#3228

Generative Latent Coding for Ultra-Low Bitrate Image Compression

Zhaoyang Jia, Jiahao Li, Bin Li et al.

CVPR 2024posterarXiv:2512.20194
#3229

Selectively Informative Description can Reduce Undesired Embedding Entanglements in Text-to-Image Personalization

Jimyeong Kim, Jungwon Park, Wonjong Rhee

CVPR 2024posterarXiv:2403.15330
#3230

FactCheXcker: Mitigating Measurement Hallucinations in Chest X-ray Report Generation Models

Alice Heiman, Xiaoman Zhang, Emma Chen et al.

CVPR 2025posterarXiv:2411.18672
#3231

SG-PGM: Partial Graph Matching Network with Semantic Geometric Fusion for 3D Scene Graph Alignment and Its Downstream Tasks

Yaxu Xie, Alain Pagani, Didier Stricker

CVPR 2024posterarXiv:2403.19474
#3232

WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model

Zongjian Li, Bin Lin, Yang Ye et al.

CVPR 2025posterarXiv:2411.17459
#3233

Pursuing Temporal-Consistent Video Virtual Try-On via Dynamic Pose Interaction

Dong Li, Wenqi Zhong, Wei Yu et al.

CVPR 2025posterarXiv:2505.16980
#3234

Back to 3D: Few-Shot 3D Keypoint Detection with Back-Projected 2D Features

Thomas Wimmer, Peter Wonka, Maks Ovsjanikov

CVPR 2024posterarXiv:2311.18113
#3235

Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

Zeyi Sun, Ye Fang, Tong Wu et al.

CVPR 2024posterarXiv:2312.03818
#3236

DemoFusion: Democratising High-Resolution Image Generation With No $$$

Ruoyi DU, Dongliang Chang, Timothy Hospedales et al.

CVPR 2024posterarXiv:2311.16973
#3237

Activity-Biometrics: Person Identification from Daily Activities

Shehreen Azad, Yogesh S. Rawat

CVPR 2024posterarXiv:2403.17360
#3238

Holoported Characters: Real-time Free-viewpoint Rendering of Humans from Sparse RGB Cameras

Ashwath Shetty, Marc Habermann, Guoxing Sun et al.

CVPR 2024posterarXiv:2312.07423
#3239

Neighbor Relations Matter in Video Scene Detection

Jiawei Tan, Hongxing Wang, Jiaxin Li et al.

CVPR 2024poster
#3240

Temporally Consistent Object-Centric Learning by Contrasting Slots

Anna Manasyan, Maximilian Seitzer, Filip Radovic et al.

CVPR 2025posterarXiv:2412.14295
#3241

Fast ODE-based Sampling for Diffusion Models in Around 5 Steps

Zhenyu Zhou, Defang Chen, Can Wang et al.

CVPR 2024highlightarXiv:2312.00094
#3242

Referring Image Editing: Object-level Image Editing via Referring Expressions

Chang Liu, Xiangtai Li, Henghui Ding

CVPR 2024poster
#3243

InNeRF360: Text-Guided 3D-Consistent Object Inpainting on 360-degree Neural Radiance Fields

Dongqing Wang, Tong Zhang, Alaa Abboud et al.

CVPR 2024posterarXiv:2305.15094
#3244

SET: Spectral Enhancement for Tiny Object Detection

Huixin Sun, Runqi Wang, Yanjing Li et al.

CVPR 2025poster
#3245

Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass

Jianing "Jed" Yang, Alexander Sax, Kevin Liang et al.

CVPR 2025posterarXiv:2501.13928
#3246

From-Ground-To-Objects: Coarse-to-Fine Self-supervised Monocular Depth Estimation of Dynamic Objects with Ground Contact Prior

Jaeho Moon, Juan Luis Gonzalez Bello, Byeongjun Kwon et al.

CVPR 2024posterarXiv:2312.10118
#3247

Unsupervised Blind Image Deblurring Based on Self-Enhancement

Lufei Chen, Xiangpeng Tian, Shuhua Xiong et al.

CVPR 2024poster
#3248

Mask Grounding for Referring Image Segmentation

Yong Xien Chng, Henry Zheng, Yizeng Han et al.

CVPR 2024posterarXiv:2312.12198
#3249

SignGraph: A Sign Sequence is Worth Graphs of Nodes

Shiwei Gan, Yafeng Yin, Zhiwei Jiang et al.

CVPR 2024poster
#3250

Illumination Spectrum Estimation for Multispectral Images via Surface Reflectance Modeling and Spatial-Spectral Feature Generation

Hyejin Oh, Woo-Shik Kim, Sangyoon Lee et al.

CVPR 2025poster
#3251

Embracing Unimodal Aleatoric Uncertainty for Robust Multimodal Fusion

Zixian Gao, Xun Jiang, Xing Xu et al.

CVPR 2024poster
#3252

DGC-GNN: Leveraging Geometry and Color Cues for Visual Descriptor-Free 2D-3D Matching

Shuzhe Wang, Juho Kannala, Daniel Barath

CVPR 2024posterarXiv:2306.12547
#3253

Text-to-Image Diffusion Models are Great Sketch-Photo Matchmakers

Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain et al.

CVPR 2024posterarXiv:2403.07214
#3254

FreeDrag: Feature Dragging for Reliable Point-based Image Editing

Pengyang Ling, Lin Chen, Pan Zhang et al.

CVPR 2024posterarXiv:2307.04684
#3255

Towards More General Video-based Deepfake Detection through Facial Component Guided Adaptation for Foundation Model

Yue-Hua Han, Tai-Ming Huang, Kailung Hua et al.

CVPR 2025posterarXiv:2404.05583
#3256

EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance

Yang Yue, Yulin Wang, Haojun Jiang et al.

CVPR 2025posterarXiv:2504.13065
#3257

CamFreeDiff: Camera-free Image to Panorama Generation with Diffusion Model

Xiaoding Yuan, Shitao Tang, Kejie Li et al.

CVPR 2025posterarXiv:2407.07174
#3258

MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders

jiajun cao, Yuan Zhang, Tao Huang et al.

CVPR 2025posterarXiv:2501.01709
#3259

TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models

Yushi Huang, Ruihao Gong, Jing Liu et al.

CVPR 2024highlightarXiv:2311.16503
#3260

GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians

Shenhan Qian, Tobias Kirschstein, Liam Schoneveld et al.

CVPR 2024highlightarXiv:2312.02069
#3261

DrivingSphere: Building a High-fidelity 4D World for Closed-loop Simulation

Tianyi Yan, Dongming Wu, Wencheng Han et al.

CVPR 2025posterarXiv:2411.11252
#3262

Explaining CLIP's Performance Disparities on Data from Blind/Low Vision Users

Daniela Massiceti, Camilla Longden, Agnieszka Słowik et al.

CVPR 2024posterarXiv:2311.17315
#3263

MMCert: Provable Defense against Adversarial Attacks to Multi-modal Models

Yanting Wang, Hongye Fu, Wei Zou et al.

CVPR 2024posterarXiv:2403.19080
#3264

Rethinking the Adversarial Robustness of Multi-Exit Neural Networks in an Attack-Defense Game

Keyizhi Xu, Chi Zhang, Zhan Chen et al.

CVPR 2025poster
#3265

EntropyMark: Towards More Harmless Backdoor Watermark via Entropy-based Constraint for Open-source Dataset Copyright Protection

Ming Sun, Rui Wang, Zixuan Zhu et al.

CVPR 2025poster
#3266

DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data

Hanrong Ye, Dan Xu

CVPR 2024posterarXiv:2403.15389
#3267

Revisiting Spatial-Frequency Information Integration from a Hierarchical Perspective for Panchromatic and Multi-Spectral Image Fusion

Jiangtong Tan, Jie Huang, Naishan Zheng et al.

CVPR 2024poster
#3268

UniMamba: Unified Spatial-Channel Representation Learning with Group-Efficient Mamba for LiDAR-based 3D Object Detection

Xin Jin, Haisheng Su, Kai Liu et al.

CVPR 2025posterarXiv:2503.12009
#3269

FineSports: A Multi-person Hierarchical Sports Video Dataset for Fine-grained Action Understanding

Jinglin Xu, Guohao Zhao, Sibo Yin et al.

CVPR 2024poster
#3270

MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning

Matteo Farina, Massimiliano Mancini, Elia Cunegatti et al.

CVPR 2024posterarXiv:2404.05621
#3271

Unleashing Unlabeled Data: A Paradigm for Cross-View Geo-Localization

Guopeng Li, Ming Qian, Gui-Song Xia

CVPR 2024posterarXiv:2403.14198
#3272

VolFormer: Explore More Comprehensive Cube Interaction for Hyperspectral Image Restoration and Beyond

Dabing Yu, Zheng Gao

CVPR 2025poster
#3273

CURSOR: Scalable Mixed-Order Hypergraph Matching with CUR Decomposition

Qixuan Zheng, Ming Zhang, Hong Yan

CVPR 2024posterarXiv:2402.16594
#3274

FCS: Feature Calibration and Separation for Non-Exemplar Class Incremental Learning

Qiwei Li, Yuxin Peng, Jiahuan Zhou

CVPR 2024poster
#3275

GoLF-NRT: Integrating Global Context and Local Geometry for Few-Shot View Synthesis

You Wang, Li Fang, Hao Zhu et al.

CVPR 2025posterarXiv:2505.19813
#3276

Exploring Regional Clues in CLIP for Zero-Shot Semantic Segmentation

Yi Zhang, Meng-Hao Guo, Miao Wang et al.

CVPR 2024poster
#3277

GALA: Generating Animatable Layered Assets from a Single Scan

Taeksoo Kim, Byungjun Kim, Shunsuke Saito et al.

CVPR 2024posterarXiv:2401.12979
#3278

Improving Graph Contrastive Learning via Adaptive Positive Sampling

Jiaming Zhuo, Feiyang Qin, Can Cui et al.

CVPR 2024poster
#3279

Hearing Anything Anywhere

Mason Wang, Ryosuke Sawata, Samuel Clarke et al.

CVPR 2024posterarXiv:2406.07532
#3280

Dr2Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning

Chen Zhao, Shuming Liu, Karttikeya Mangalam et al.

CVPR 2024poster
#3281

SCFlow2: Plug-and-Play Object Pose Refiner with Shape-Constraint Scene Flow

Qingyuan Wang, Rui Song, Jiaojiao Li et al.

CVPR 2025posterarXiv:2504.09160
#3282

Diffusion-EDFs: Bi-equivariant Denoising Generative Modeling on SE(3) for Visual Robotic Manipulation

Hyunwoo Ryu, Jiwoo Kim, Hyunseok An et al.

CVPR 2024highlightarXiv:2309.02685
#3283

BOTH2Hands: Inferring 3D Hands from Both Text Prompts and Body Dynamics

Wenqian Zhang, Molin Huang, Yuxuan Zhou et al.

CVPR 2024posterarXiv:2312.07937
#3284

Theoretical Insights in Model Inversion Robustness and Conditional Entropy Maximization for Collaborative Inference Systems

Song Xia, Yi Yu, Wenhan Yang et al.

CVPR 2025highlightarXiv:2503.00383
#3285

Domain Adaptive Diabetic Retinopathy Grading with Model Absence and Flowing Data

Wenxin Su, Song Tang, Xiaofeng Liu et al.

CVPR 2025posterarXiv:2412.01203
#3286

Bayesian Exploration of Pre-trained Models for Low-shot Image Classification

Yibo Miao, Yu lei, Feng Zhou et al.

CVPR 2024posterarXiv:2404.00312
#3287

Benchmarking the Robustness of Temporal Action Detection Models Against Temporal Corruptions

Runhao Zeng, Xiaoyong Chen, Jiaming Liang et al.

CVPR 2024posterarXiv:2403.20254
#3288

Rethinking Personalized Aesthetics Assessment: Employing Physique Aesthetics Assessment as An Exemplification

Haobin Zhong, Shuai He, Anlong Ming et al.

CVPR 2025highlight
#3289

Frequency-Biased Synergistic Design for Image Compression and Compensation

Jiaming Liu, Qi Zheng, Zihao Liu et al.

CVPR 2025poster
#3290

RepKPU: Point Cloud Upsampling with Kernel Point Representation and Deformation

Yi Rong, Haoran Zhou, Kang Xia et al.

CVPR 2024poster
#3291

4K4D: Real-Time 4D View Synthesis at 4K Resolution

Zhen Xu, Sida Peng, Haotong Lin et al.

CVPR 2024posterarXiv:2310.11448
#3292

APT: Adaptive Personalized Training for Diffusion Models with Limited Data

JungWoo Chae, Jiyoon Kim, Jaewoong Choi et al.

CVPR 2025posterarXiv:2507.02687
#3293

Context-Guided Spatio-Temporal Video Grounding

Xin Gu, Heng Fan, Yan Huang et al.

CVPR 2024posterarXiv:2401.01578
#3294

Foundations of the Theory of Performance-Based Ranking

Sébastien Piérard, Anaïs Halin, Anthony Cioppa et al.

CVPR 2025posterarXiv:2412.04227
#3295

WISH: Weakly Supervised Instance Segmentation using Heterogeneous Labels

Hyeokjun Kweon, Kuk-Jin Yoon

CVPR 2025highlight
#3296

Convex Combination Star Shape Prior for Data-driven Image Semantic Segmentation

Xinyu Zhao, Jun Xie, Shengzhe Chen et al.

CVPR 2025poster
#3297

Learning Conditional Space-Time Prompt Distributions for Video Class-Incremental Learning

Xiaohan Zou, Wenchao Ma, Shu Zhao

CVPR 2025highlight
#3298

TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation

Sai Kumar Dwivedi, Yu Sun, Priyanka Patel et al.

CVPR 2024posterarXiv:2404.16752
#3299

Re-thinking Data Availability Attacks Against Deep Neural Networks

Bin Fang, Bo Li, Shuang Wu et al.

CVPR 2024poster
#3300

Rotation-Equivariant Self-Supervised Method in Image Denoising

Hanze Liu, Jiahong Fu, Qi Xie et al.

CVPR 2025posterarXiv:2505.19618
#3301

Logit Standardization in Knowledge Distillation

Shangquan Sun, Wenqi Ren, Jingzhi Li et al.

CVPR 2024highlightarXiv:2403.01427
#3302

A Unified Approach for Text- and Image-guided 4D Scene Generation

Yufeng Zheng, Xueting Li, Koki Nagano et al.

CVPR 2024posterarXiv:2311.16854
#3303

CONFORM: Contrast is All You Need for High-Fidelity Text-to-Image Diffusion Models

Tuna Han Salih Meral, Enis Simsar, Federico Tombari et al.

CVPR 2024posterarXiv:2312.06059
#3304

SPECAT: SPatial-spEctral Cumulative-Attention Transformer for High-Resolution Hyperspectral Image Reconstruction

Zhiyang Yao, Shuyang Liu, Xiaoyun Yuan et al.

CVPR 2024poster
#3305

Video-Based Human Pose Regression via Decoupled Space-Time Aggregation

Jijie He, Wenwu Yang

CVPR 2024posterarXiv:2403.19926
#3306

Neural Refinement for Absolute Pose Regression with Feature Synthesis

Shuai Chen, Yash Bhalgat, Xinghui Li et al.

CVPR 2024posterarXiv:2303.10087
#3307

VGGSfM: Visual Geometry Grounded Deep Structure From Motion

Jianyuan Wang, Nikita Karaev, Christian Rupprecht et al.

CVPR 2024highlight
#3308

ByTheWay: Boost Your Text-to-Video Generation Model to Higher Quality in a Training-free Way

Jiazi Bu, Pengyang Ling, Pan Zhang et al.

CVPR 2025posterarXiv:2410.06241
#3309

Boosting Image Restoration via Priors from Pre-trained Models

Xiaogang Xu, Shu Kong, Tao Hu et al.

CVPR 2024posterarXiv:2403.06793
#3310

CPP-Net: Embracing Multi-Scale Feature Fusion into Deep Unfolding CP-PPA Network for Compressive Sensing

Zhen Guo, Hongping Gan

CVPR 2024poster
#3311

MPDrive: Improving Spatial Understanding with Marker-Based Prompt Learning for Autonomous Driving

Zhi-Yuan Zhang, Xiaofan Li, Zhihao Xu et al.

CVPR 2025highlightarXiv:2504.00379
#3312

GenFlow: Generalizable Recurrent Flow for 6D Pose Refinement of Novel Objects

Sungphill Moon, Hyeontae Son, Dongcheol Hur et al.

CVPR 2024posterarXiv:2403.11510
#3313

PKU-DyMVHumans: A Multi-View Video Benchmark for High-Fidelity Dynamic Human Modeling

Xiaoyun Zheng, Liwei Liao, Xufeng Li et al.

CVPR 2024posterarXiv:2403.16080
#3314

Disentangled Pose and Appearance Guidance for Multi-Pose Generation

Tengfei Xiao, Yue Wu, Yuelong Li et al.

CVPR 2025poster
#3315

MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers

Yawar Siddiqui, Antonio Alliegro, Alexey Artemov et al.

CVPR 2024highlightarXiv:2311.15475
#3316

RadarDistill: Boosting Radar-based Object Detection Performance via Knowledge Distillation from LiDAR Features

Geonho Bang, Kwangjin Choi, Jisong Kim et al.

CVPR 2024posterarXiv:2403.05061
#3317

VI^3NR: Variance Informed Initialization for Implicit Neural Representations

Chamin Hewa Koneputugodage, Yizhak Ben-Shabat, Sameera Ramasinghe et al.

CVPR 2025poster
#3318

Task-Conditioned Adaptation of Visual Features in Multi-Task Policy Learning

Pierre Marza, Laetitia Matignon, Olivier Simonin et al.

CVPR 2024posterarXiv:2402.07739
#3319

Efficient Diffusion as Low Light Enhancer

Guanzhou Lan, Qianli Ma, YUQI YANG et al.

CVPR 2025posterarXiv:2410.12346
#3320

GliaNet: Adaptive Neural Network Structure Learning with Glia-Driven

Mengqiao Han, Liyuan Pan, Xiabi Liu

CVPR 2025poster
#3321

Weakly Supervised Semantic Segmentation via Progressive Confidence Region Expansion

Xiangfeng Xu, Pinyi Zhang, Wenxuan Huang et al.

CVPR 2025poster
#3322

EasyDrag: Efficient Point-based Manipulation on Diffusion Models

Xingzhong Hou, Boxiao Liu, Yi Zhang et al.

CVPR 2024poster
#3323

VidSeg: Training-free Video Semantic Segmentation based on Diffusion Models

Qian Wang, Abdelrahman Eldesokey, Mohit Mendiratta et al.

CVPR 2025poster
#3324

MOS-Attack: A Scalable Multi-objective Adversarial Attack Framework

Ping Guo, Cheng Gong, Fei Liu et al.

CVPR 2025posterarXiv:2501.07251
#3325

Learned Lossless Image Compression based on Bit Plane Slicing

Zhe Zhang, Huairui Wang, Zhenzhong Chen et al.

CVPR 2024poster
#3326

BEM: Balanced and Entropy-based Mix for Long-Tailed Semi-Supervised Learning

Hongwei Zheng, Linyuan Zhou, Han Li et al.

CVPR 2024posterarXiv:2404.01179
#3327

Dancing with Still Images: Video Distillation via Static-Dynamic Disentanglement

Ziyu Wang, Yue Xu, Cewu Lu et al.

CVPR 2024posterarXiv:2312.00362
#3328

SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models

Tongtian Yue, Jie Cheng, Longteng Guo et al.

CVPR 2024posterarXiv:2403.13263
#3329

Frequency-Adaptive Dilated Convolution for Semantic Segmentation

Linwei Chen, Lin Gu, Dezhi Zheng et al.

CVPR 2024highlightarXiv:2403.05369
#3330

TexTile: A Differentiable Metric for Texture Tileability

Carlos Rodriguez-Pardo, Dan Casas, Elena Garces et al.

CVPR 2024posterarXiv:2403.12961
#3331

MatSynth: A Modern PBR Materials Dataset

Giuseppe Vecchio, Valentin Deschaintre

CVPR 2024posterarXiv:2401.06056
#3332

Image Processing GNN: Breaking Rigidity in Super-Resolution

Yuchuan Tian, Hanting Chen, Chao Xu et al.

CVPR 2024poster
#3333

Holmes-VAU: Towards Long-term Video Anomaly Understanding at Any Granularity

Huaxin Zhang, Xiaohao Xu, Xiang Wang et al.

CVPR 2025highlightarXiv:2412.06171
#3334

ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation

Suraj Patni, Aradhye Agarwal, Chetan Arora

CVPR 2024posterarXiv:2403.18807
#3335

SuperLightNet: Lightweight Parameter Aggregation Network for Multimodal Brain Tumor Segmentation

Feng Yu, Jiacheng Cao, Li Liu et al.

CVPR 2025poster
#3336

Learning from Streaming Video with Orthogonal Gradients

Tengda Han, Dilara Gokay, Joseph Heyward et al.

CVPR 2025posterarXiv:2504.01961
#3337

HIMap: HybrId Representation Learning for End-to-end Vectorized HD Map Construction

Yi ZHOU, Hui Zhang, Jiaqian Yu et al.

CVPR 2024posterarXiv:2403.08639
#3338

Riemannian Multinomial Logistics Regression for SPD Neural Networks

Ziheng Chen, Yue Song, Gaowen Liu et al.

CVPR 2024posterarXiv:2305.11288
#3339

LED: A Large-scale Real-world Paired Dataset for Event Camera Denoising

Yuxing Duan

CVPR 2024posterarXiv:2405.19718
#3340

NoiseCollage: A Layout-Aware Text-to-Image Diffusion Model Based on Noise Cropping and Merging

Takahiro Shirakawa, Seiichi Uchida

CVPR 2024posterarXiv:2403.03485
#3341

OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM

Yutao Hu, Tianbin, Quanfeng Lu et al.

CVPR 2024posterarXiv:2402.09181
#3342

Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models

Daniel Geng, Inbum Park, Andrew Owens

CVPR 2024posterarXiv:2311.17919
#3343

Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding

Zhihao Yuan, Jinke Ren, Chun-Mei Feng et al.

CVPR 2024posterarXiv:2311.15383
#3344

Sim-to-Real Causal Transfer: A Metric Learning Approach to Causally-Aware Interaction Representations

Ahmad Rahimi, Po-Chien Luan, Yuejiang Liu et al.

CVPR 2025posterarXiv:2312.04540
#3345

MANTA: Diffusion Mamba for Efficient and Effective Stochastic Long-Term Dense Action Anticipation

Olga Zatsarynna, Emad Bahrami, Yazan Abu Farha et al.

CVPR 2025poster
#3346

Towards HDR and HFR Video from Rolling-Mixed-Bit Spikings

Yakun Chang, Yeliduosi Xiaokaiti, Yujia Liu et al.

CVPR 2024poster
#3347

Learn from View Correlation: An Anchor Enhancement Strategy for Multi-view Clustering

Suyuan Liu, KE LIANG, Zhibin Dong et al.

CVPR 2024poster
#3348

Passive Snapshot Coded Aperture Dual-Pixel RGB-D Imaging

Bhargav Ghanekar, Salman Siddique Khan, Pranav Sharma et al.

CVPR 2024posterarXiv:2402.18102
#3349

GaussianIP: Identity-Preserving Realistic 3D Human Generation via Human-Centric Diffusion Prior

Zichen Tang, Yuan Yao, Miaomiao Cui et al.

CVPR 2025posterarXiv:2503.11143
#3350

CDI: Copyrighted Data Identification in Diffusion Models

Jan Dubiński, Antoni Kowalczuk, Franziska Boenisch et al.

CVPR 2025posterarXiv:2411.12858
#3351

UniPAD: A Universal Pre-training Paradigm for Autonomous Driving

Honghui Yang, Sha Zhang, Di Huang et al.

CVPR 2024posterarXiv:2310.08370
#3352

ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations

Maitreya Patel, Changhoon Kim, Sheng Cheng et al.

CVPR 2024posterarXiv:2312.04655
#3353

Low-Res Leads the Way: Improving Generalization for Super-Resolution by Self-Supervised Learning

Haoyu Chen, Wenbo Li, Jinjin Gu et al.

CVPR 2024posterarXiv:2403.02601
#3354

Bridging Gait Recognition and Large Language Models Sequence Modeling

Shaopeng Yang, Jilong Wang, Saihui Hou et al.

CVPR 2025poster
#3355

Neural Video Compression with Feature Modulation

Jiahao Li, Bin Li, Yan Lu

CVPR 2024posterarXiv:2402.17414
#3356

Nearest is Not Dearest: Towards Practical Defense against Quantization-conditioned Backdoor Attacks

Boheng Li, Yishuo Cai, Haowei Li et al.

CVPR 2024posterarXiv:2405.12725
#3357

Dual DETRs for Multi-Label Temporal Action Detection

Yuhan Zhu, Guozhen Zhang, Jing Tan et al.

CVPR 2024posterarXiv:2404.00653
#3358

Discriminative Probing and Tuning for Text-to-Image Generation

Leigang Qu, Wenjie Wang, Yongqi Li et al.

CVPR 2024posterarXiv:2403.04321
#3359

GigaTraj: Predicting Long-term Trajectories of Hundreds of Pedestrians in Gigapixel Complex Scenes

Haozhe Lin, Chunyu Wei, Li He et al.

CVPR 2024poster
#3360

Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata

Dongsu Zhang, Francis Williams, Žan Gojčič et al.

CVPR 2024highlightarXiv:2406.08292
#3361

Towards Practical Real-Time Neural Video Compression

Zhaoyang Jia, Bin Li, Jiahao Li et al.

CVPR 2025posterarXiv:2502.20762
#3362

Comparing the Decision-Making Mechanisms by Transformers and CNNs via Explanation Methods

Mingqi Jiang, Saeed Khorram, Li Fuxin

CVPR 2024posterarXiv:2212.06872
#3363

Continual Segmentation with Disentangled Objectness Learning and Class Recognition

Yizheng Gong, Siyue Yu, Xiaoyang Wang et al.

CVPR 2024posterarXiv:2403.03477
#3364

Image Sculpting: Precise Object Editing with 3D Geometry Control

Jiraphon Yenphraphai, Xichen Pan, Sainan Liu et al.

CVPR 2024posterarXiv:2401.01702
#3365

Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model

Feng Liu, Shiwei Zhang, Xiaofeng Wang et al.

CVPR 2025highlightarXiv:2411.19108
#3366

Attribute-Guided Pedestrian Retrieval: Bridging Person Re-ID with Internal Attribute Variability

Yan Huang, Zhang Zhang, Qiang Wu et al.

CVPR 2024poster
#3367

Weakly Misalignment-free Adaptive Feature Alignment for UAVs-based Multimodal Object Detection

Chen Chen, Jiahao Qi, Xingyue Liu et al.

CVPR 2024poster
#3368

Cross-Rejective Open-Set SAR Image Registration

Shasha Mao, Shiming Lu, Zhaolong Du et al.

CVPR 2025poster
#3369

Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization

Deng Li, Aming Wu, Yaowei Wang et al.

CVPR 2024posterarXiv:2402.18447
#3370

EscherNet: A Generative Model for Scalable View Synthesis

Xin Kong, Shikun Liu, Xiaoyang Lyu et al.

CVPR 2024posterarXiv:2402.03908
#3371

TAPT: Test-Time Adversarial Prompt Tuning for Robust Inference in Vision-Language Models

Xin Wang, Kai Chen, Jiaming Zhang et al.

CVPR 2025posterarXiv:2411.13136
#3372

MVCPS-NeuS: Multi-view Constrained Photometric Stereo for Neural Surface Reconstruction

Hiroaki Santo, Fumio Okura, Yasuyuki Matsushita

CVPR 2024poster
#3373

OHTA: One-shot Hand Avatar via Data-driven Implicit Priors

Xiaozheng Zheng, Chao Wen, Zhuo Su et al.

CVPR 2024posterarXiv:2402.18969
#3374

E-GPS: Explainable Geometry Problem Solving via Top-Down Solver and Bottom-Up Generator

Wenjun Wu, Lingling Zhang, Jun Liu et al.

CVPR 2024poster
#3375

MultiPhys: Multi-Person Physics-aware 3D Motion Estimation

Nicolás Ugrinovic, Boxiao Pan, Georgios Pavlakos et al.

CVPR 2024posterarXiv:2404.11987
#3376

LMDrive: Closed-Loop End-to-End Driving with Large Language Models

Hao Shao, Yuxuan Hu, Letian Wang et al.

CVPR 2024posterarXiv:2312.07488
#3377

ID-Blau: Image Deblurring by Implicit Diffusion-based reBLurring AUgmentation

Jia-Hao Wu, Fu-Jen Tsai, Yan-Tsung Peng et al.

CVPR 2024posterarXiv:2312.10998
#3378

Taxonomy-Aware Evaluation of Vision-Language Models

Vésteinn Snæbjarnarson, Kevin Du, Niklas Stoehr et al.

CVPR 2025posterarXiv:2504.05457
#3379

GauHuman: Articulated Gaussian Splatting from Monocular Human Videos

Shoukang Hu, Tao Hu, Ziwei Liu

CVPR 2024posterarXiv:2312.02973
#3380

BEVNeXt: Reviving Dense BEV Frameworks for 3D Object Detection

Zhenxin Li, Shiyi Lan, Jose M. Alvarez et al.

CVPR 2024posterarXiv:2312.01696
#3381

AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents

Jieming Cui, Tengyu Liu, Nian Liu et al.

CVPR 2024posterarXiv:2403.12835
#3382

HumanNeRF-SE: A Simple yet Effective Approach to Animate HumanNeRF with Diverse Poses

Caoyuan Ma, Yu-Lun Liu, Zhixiang Wang et al.

CVPR 2024posterarXiv:2312.02232
#3383

SOAP: Vision-Centric 3D Semantic Scene Completion with Scene-Adaptive Decoder and Occluded Region-Aware View Projection

Hyo-Jun Lee, Yeong Jun Koh, Hanul Kim et al.

CVPR 2025poster
#3384

SAM-6D: Segment Anything Model Meets Zero-Shot 6D Object Pose Estimation

Jiehong Lin, lihua liu, Dekun Lu et al.

CVPR 2024posterarXiv:2311.15707
#3385

SurMo: Surface-based 4D Motion Modeling for Dynamic Human Rendering

Tao Hu, Fangzhou Hong, Ziwei Liu

CVPR 2024posterarXiv:2404.01225
#3386

LeftRefill: Filling Right Canvas based on Left Reference through Generalized Text-to-Image Diffusion Model

Chenjie Cao, Yunuo Cai, Qiaole Dong et al.

CVPR 2024posterarXiv:2305.11577
#3387

ManipLLM: Embodied Multimodal Large Language Model for Object-Centric Robotic Manipulation

Xiaoqi Li, Mingxu Zhang, Yiran Geng et al.

CVPR 2024posterarXiv:2312.16217
#3388

DPC: Dual-Prompt Collaboration for Tuning Vision-Language Models

Haoyang Li, Liang Wang, Chao Wang et al.

CVPR 2025posterarXiv:2503.13443
#3389

Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation

Wenxuan Wang, Tongtian Yue, Yisi Zhang et al.

CVPR 2024poster
#3390

PanoPose: Self-supervised Relative Pose Estimation for Panoramic Images

Diantao Tu, Hainan Cui, Xianwei Zheng et al.

CVPR 2024highlight
#3391

Mask4Align: Aligned Entity Prompting with Color Masks for Multi-Entity Localization Problems

Haoquan Zhang, Ronggang Huang, Yi Xie et al.

CVPR 2024poster
#3392

FedCS: Coreset Selection for Federated Learning

Chenhe Hao, Weiying Xie, Daixun Li et al.

CVPR 2025poster
#3393

Global and Local Prompts Cooperation via Optimal Transport for Federated Learning

Hongxia Li, Wei Huang, Jingya Wang et al.

CVPR 2024posterarXiv:2403.00041
#3394

VSCode: General Visual Salient and Camouflaged Object Detection with 2D Prompt Learning

Ziyang Luo, Nian Liu, Wangbo Zhao et al.

CVPR 2024posterarXiv:2311.15011
#3395

GraphI2P: Image-to-Point Cloud Registration with Exploring Pattern of Correspondence via Graph Learning

Lin Bie, Shouan Pan, Siqi Li et al.

CVPR 2025poster
#3396

Dense Optical Tracking: Connecting the Dots

Guillaume Le Moing, Jean Ponce, Cordelia Schmid

CVPR 2024highlightarXiv:2312.00786
#3397

Multi-agent Collaborative Perception via Motion-aware Robust Communication Network

Shixin Hong, Yu LIU, Zhi Li et al.

CVPR 2024poster
#3398

Ungeneralizable Examples

Jingwen Ye, Xinchao Wang

CVPR 2024posterarXiv:2404.14016
#3399

Language-only Training of Zero-shot Composed Image Retrieval

Geonmo Gu, Sanghyuk Chun, Wonjae Kim et al.

CVPR 2024poster
#3400

Causal-CoG: A Causal-Effect Look at Context Generation for Boosting Multi-modal Language Models

Shitian Zhao, Zhuowan Li, YadongLu et al.

CVPR 2024highlightarXiv:2312.06685