Most Cited 2024 "weighted-majority voting" Papers

#10201

A Versatile Framework for Continual Test-Time Domain Adaptation: Balancing Discriminability and Generalizability

Xu Yang, Xuan Chen, Moqi Li et al.

CVPR 2024
#10202

Efficient Solution of Point-Line Absolute Pose

Petr Hruby, Timothy Duff, Marc Pollefeys

CVPR 2024 • highlight • arXiv:2404.16552
#10203

SPIN: Simultaneous Perception, Interaction and Navigation

Shagun Uppal, Ananye Agarwal, Haoyu Xiong et al.

CVPR 2024 • arXiv:2405.07991
#10204

CAMixerSR: Only Details Need More "Attention"

Yan Wang, Yi Liu, Shijie Zhao et al.

CVPR 2024
#10205

FISBe: A Real-World Benchmark Dataset for Instance Segmentation of Long-Range Thin Filamentous Structures

Lisa Mais, Peter Hirsch, Claire Managan et al.

CVPR 2024 • arXiv:2404.00130
#10206

POPDG: Popular 3D Dance Generation with PopDanceSet

Zhenye Luo, Min Ren, Xuecai Hu et al.

CVPR 2024 • arXiv:2405.03178
#10207

RankMatch: Exploring the Better Consistency Regularization for Semi-supervised Semantic Segmentation

Huayu Mai, Rui Sun, Tianzhu Zhang et al.

CVPR 2024
#10208

CoDe: An Explicit Content Decoupling Framework for Image Restoration

Enxuan Gu, Hongwei Ge, Yong Guo

CVPR 2024
#10209

Masked Spatial Propagation Network for Sparsity-Adaptive Depth Refinement

Jinyoung Jun, Jae-Han Lee, Chang-Su Kim

CVPR 2024 • arXiv:2404.19294
#10210

D^4: Dataset Distillation via Disentangled Diffusion Model

Duo Su, Junjie Hou, Weizhi Gao et al.

CVPR 2024
#10211

An Empirical Study of the Generalization Ability of Lidar 3D Object Detectors to Unseen Domains

George Eskandar

CVPR 2024 • arXiv:2402.17562
#10212

MarkovGen: Structured Prediction for Efficient Text-to-Image Generation

Sadeep Jayasumana, Daniel Glasner, Srikumar Ramalingam et al.

CVPR 2024 • arXiv:2308.10997
#10213

Rethinking the Representation in Federated Unsupervised Learning with Non-IID Data

Xinting Liao, Weiming Liu, Chaochao Chen et al.

CVPR 2024 • arXiv:2403.16398
#10214

ViLa-MIL: Dual-scale Vision-Language Multiple Instance Learning for Whole Slide Image Classification

Jiangbo Shi, Chen Li, Tieliang Gong et al.

CVPR 2024 • arXiv:2502.08391
#10215

Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance

Phuc Nguyen, Tuan Duc Ngo, Evangelos Kalogerakis et al.

CVPR 2024 • arXiv:2312.10671
#10216

Learning to Localize Objects Improves Spatial Reasoning in Visual-LLMs

Kanchana Ranasinghe, Satya Narayan Shukla, Omid Poursaeed et al.

CVPR 2024 • arXiv:2404.07449
#10217

CaDeT: a Causal Disentanglement Approach for Robust Trajectory Prediction in Autonomous Driving

Mozhgan Pourkeshavarz, Junrui Zhang, Amir Rasouli

CVPR 2024
#10218

Continual Forgetting for Pre-trained Vision Models

Hongbo Zhao, Bolin Ni, Junsong Fan et al.

CVPR 2024 • arXiv:2403.11530
#10219

Boosting Neural Representations for Videos with a Conditional Decoder

Xinjie Zhang, Ren Yang, Dailan He et al.

CVPR 2024 • highlight • arXiv:2402.18152
#10220

Unsupervised Feature Learning with Emergent Data-Driven Prototypicality

Yunhui Guo, Youren Zhang, Yubei Chen et al.

CVPR 2024 • arXiv:2307.01421
#10221

Text-Guided 3D Face Synthesis - From Generation to Editing

Yunjie Wu, Yapeng Meng, Zhipeng Hu et al.

CVPR 2024 • arXiv:2312.00375
#10222

Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models

Huan Ling, Seung Wook Kim, Antonio Torralba et al.

CVPR 2024 • highlight • arXiv:2312.13763
#10223

IReNe: Instant Recoloring of Neural Radiance Fields

Alessio Mazzucchelli, Adrian Garcia-Garcia, Elena Garces et al.

CVPR 2024 • arXiv:2405.19876
#10224

Constrained Layout Generation with Factor Graphs

Mohammed Haroon Dupty, Yanfei Dong, Sicong Leng et al.

CVPR 2024 • arXiv:2404.00385
#10225

URHand: Universal Relightable Hands

Zhaoxi Chen, Gyeongsik Moon, Kaiwen Guo et al.

CVPR 2024 • arXiv:2401.05334
#10226

Neural Implicit Morphing of Face Images

Guilherme Schardong, Tiago Novello, Hallison Paz et al.

CVPR 2024 • arXiv:2308.13888
#10227

Distilling CLIP with Dual Guidance for Learning Discriminative Human Body Shape Representation

Feng Liu, Minchul Kim, Zhiyuan Ren et al.

CVPR 2024
#10228

Snapshot Lidar: Fourier Embedding of Amplitude and Phase for Single-Image Depth Reconstruction

Sarah Friday, Yunzi Shi, Yaswanth Kumar Cherivirala et al.

CVPR 2024
#10229

CARZero: Cross-Attention Alignment for Radiology Zero-Shot Classification

Haoran Lai, Qingsong Yao, Zihang Jiang et al.

CVPR 2024 • arXiv:2402.17417
#10230

MedM2G: Unifying Medical Multi-Modal Generation via Cross-Guided Diffusion with Visual Invariant

Chenlu Zhan, Gaoang Wang, Yu Lin et al.

CVPR 2024 • arXiv:2403.04290
#10231

GLID: Pre-training a Generalist Encoder-Decoder Vision Model

Jihao Liu, Jinliang Zheng, Yu Liu et al.

CVPR 2024 • arXiv:2404.07603
#10232

Your Image is My Video: Reshaping the Receptive Field via Image-To-Video Differentiable AutoAugmentation and Fusion

Sofia Casarin, Cynthia Ugwu, Sergio Escalera et al.

CVPR 2024 • arXiv:2403.15194
#10233

Kernel Adaptive Convolution for Scene Text Detection via Distance Map Prediction

Jinzhi Zheng, Heng Fan, Libo Zhang

CVPR 2024
#10234

NeRFCodec: Neural Feature Compression Meets Neural Radiance Fields for Memory-Efficient Scene Representation

Sicheng Li, Hao Li, Yiyi Liao et al.

CVPR 2024 • arXiv:2404.02185
#10235

Contrastive Denoising Score for Text-guided Latent Diffusion Image Editing

Hyelin Nam, Gihyun Kwon, Geon Yeong Park et al.

CVPR 2024 • arXiv:2311.18608
#10236

PlatoNeRF: 3D Reconstruction in Plato's Cave via Single-View Two-Bounce Lidar

Tzofi Klinghoffer, Xiaoyu Xiang, Siddharth Somasundaram et al.

CVPR 2024 • arXiv:2312.14239
#10237

DiffLoc: Diffusion Model for Outdoor LiDAR Localization

Wen Li, Yuyang Yang, Shangshu Yu et al.

CVPR 2024
#10238

Text-Guided Variational Image Generation for Industrial Anomaly Detection and Segmentation

Mingyu Lee, Jongwon Choi

CVPR 2024 • arXiv:2403.06247
#10239

Tackling the Singularities at the Endpoints of Time Intervals in Diffusion Models

Pengze Zhang, Hubery Yin, Chen Li et al.

CVPR 2024 • highlight • arXiv:2403.08381
#10240

Troika: Multi-Path Cross-Modal Traction for Compositional Zero-Shot Learning

Siteng Huang, Biao Gong, Yutong Feng et al.

CVPR 2024 • arXiv:2303.15230
#10241

Soften to Defend: Towards Adversarial Robustness via Self-Guided Label Refinement

Daiwei Yu, Zhuorong Li, Lina Wei et al.

CVPR 2024 • arXiv:2403.09101
#10242

LoCoNet: Long-Short Context Network for Active Speaker Detection

Xizi Wang, Feng Cheng, Gedas Bertasius

CVPR 2024 • arXiv:2301.08237
#10243

WinSyn: A High Resolution Testbed for Synthetic Data

Tom Kelly, John Femiani, Peter Wonka

CVPR 2024 • arXiv:2310.08471
#10244

Retrieval-Augmented Layout Transformer for Content-Aware Layout Generation

Daichi Horita, Naoto Inoue, Kotaro Kikuchi et al.

CVPR 2024 • arXiv:2311.13602
#10245

Wired Perspectives: Multi-View Wire Art Embraces Generative AI

Zhiyu Qu, Lan Yang, Honggang Zhang et al.

CVPR 2024 • arXiv:2311.15421
#10246

Small Scale Data-Free Knowledge Distillation

He Liu, Yikai Wang, Huaping Liu et al.

CVPR 2024 • arXiv:2406.07876
#10247

Transfer CLIP for Generalizable Image Denoising

Jun Cheng, Dong Liang, Shan Tan

CVPR 2024 • arXiv:2403.15132
#10248

Validating Privacy-Preserving Face Recognition under a Minimum Assumption

Hui Zhang, Xingbo Dong, YenLung Lai et al.

CVPR 2024
#10249

CLiC: Concept Learning in Context

Mehdi Safaee, Aryan Mikaeili, Or Patashnik et al.

CVPR 2024 • highlight • arXiv:2311.17083
#10250

IDGuard: Robust General Identity-centric POI Proactive Defense Against Face Editing Abuse

Yunshu Dai, Jianwei Fei, Fangjun Huang

CVPR 2024
#10251

Feature Re-Embedding: Towards Foundation Model-Level Performance in Computational Pathology

Wenhao Tang, Fengtao Zhou, Sheng Huang et al.

CVPR 2024 • arXiv:2402.17228
#10252

SpatialTracker: Tracking Any 2D Pixels in 3D Space

Yuxi Xiao, Qianqian Wang, Shangzhan Zhang et al.

CVPR 2024 • highlight • arXiv:2404.04319
#10253

TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models

Zhongwei Zhang, Fuchen Long, Yingwei Pan et al.

CVPR 2024 • arXiv:2403.17005
#10254

Perceptual Assessment and Optimization of HDR Image Rendering

Peibei Cao, Rafal Mantiuk, Kede Ma

CVPR 2024 • arXiv:2310.12877
#10255

Pose-Transformed Equivariant Network for 3D Point Trajectory Prediction

Ruixuan Yu, Jian Sun

CVPR 2024
#10256

Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications

Yuwen Xiong, Zhiqi Li, Yuntao Chen et al.

CVPR 2024 • highlight • arXiv:2401.06197
#10257

Multimodal Representation Learning by Alternating Unimodal Adaptation

Xiaohui Zhang, Jaehong Yoon, Mohit Bansal et al.

CVPR 2024 • arXiv:2311.10707
#10258

Compositional Video Understanding with Spatiotemporal Structure-based Transformers

Hoyeoung Yun, Jinwoo Ahn, Minseo Kim et al.

CVPR 2024
#10259

Make-It-Vivid: Dressing Your Animatable Biped Cartoon Characters from Text

Junshu Tang, Yanhong Zeng, Ke Fan et al.

CVPR 2024 • arXiv:2403.16897
#10260

Coherent Temporal Synthesis for Incremental Action Segmentation

Guodong Ding, Hans Golong, Angela Yao

CVPR 2024 • arXiv:2403.06102
#10261

Person in Place: Generating Associative Skeleton-Guidance Maps for Human-Object Interaction Image Editing

ChangHee Yang, ChanHee Kang, Kyeongbo Kong et al.

CVPR 2024
#10262

Estimating Extreme 3D Image Rotations using Cascaded Attention

Shay Dekel, Yosi Keller, Martin Čadík

CVPR 2024
#10263

Towards Real-World HDR Video Reconstruction: A Large-Scale Benchmark Dataset and A Two-Stage Alignment Network

Yong Shu, Liquan Shen, Xiangyu Hu et al.

CVPR 2024 • arXiv:2405.00244
#10264

Photo-SLAM: Real-time Simultaneous Localization and Photorealistic Mapping for Monocular, Stereo, and RGB-D Cameras

Huajian Huang, Longwei Li, Hui Cheng et al.

CVPR 2024 • arXiv:2311.16728
#10265

Attention Calibration for Disentangled Text-to-Image Personalization

Yanbing Zhang, Mengping Yang, Qin Zhou et al.

CVPR 2024 • arXiv:2403.18551
#10266

SmartMask: Context Aware High-Fidelity Mask Generation for Fine-grained Object Insertion and Layout Control

Jaskirat Singh, Jianming Zhang, Qing Liu et al.

CVPR 2024 • arXiv:2312.05039
#10267

GraCo: Granularity-Controllable Interactive Segmentation

Yian Zhao, Kehan Li, Zesen Cheng et al.

CVPR 2024 • highlight • arXiv:2405.00587
#10268

Segment Every Out-of-Distribution Object

Wenjie Zhao, Jia Li, Xin Dong et al.

CVPR 2024 • arXiv:2311.16516
#10269

Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution

Shangchen Zhou, Peiqing Yang, Jianyi Wang et al.

CVPR 2024 • highlight • arXiv:2312.06640
#10270

Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild

Fanghua Yu, Jinjin Gu, Zheyuan Li et al.

CVPR 2024 • arXiv:2401.13627
#10271

Masked and Shuffled Blind Spot Denoising for Real-World Images

Hamadi Chihaoui, Paolo Favaro

CVPR 2024 • arXiv:2404.09389
#10272

Open-Vocabulary Object 6D Pose Estimation

Jaime Corsetti, Davide Boscaini, Changjae Oh et al.

CVPR 2024 • highlight • arXiv:2312.00690
#10273

Generative Region-Language Pretraining for Open-Ended Object Detection

Chuang Lin, Yi Jiang, Lizhen Qu et al.

CVPR 2024 • arXiv:2403.10191
#10274

Boosting Diffusion Models with Moving Average Sampling in Frequency Domain

Yurui Qian, Qi Cai, Yingwei Pan et al.

CVPR 2024 • arXiv:2403.17870
#10275

Discovering Syntactic Interaction Clues for Human-Object Interaction Detection

Jinguo Luo, Weihong Ren, Weibo Jiang et al.

CVPR 2024
#10276

Quantifying Uncertainty in Motion Prediction with Variational Bayesian Mixture

Juanwu Lu, Can Cui, Yunsheng Ma et al.

CVPR 2024 • arXiv:2404.03789
#10277

Generative Latent Coding for Ultra-Low Bitrate Image Compression

Zhaoyang Jia, Jiahao Li, Bin Li et al.

CVPR 2024 • arXiv:2512.20194
#10278

Selectively Informative Description can Reduce Undesired Embedding Entanglements in Text-to-Image Personalization

Jimyeong Kim, Jungwon Park, Wonjong Rhee

CVPR 2024 • arXiv:2403.15330
#10279

SG-PGM: Partial Graph Matching Network with Semantic Geometric Fusion for 3D Scene Graph Alignment and Its Downstream Tasks

Yaxu Xie, Alain Pagani, Didier Stricker

CVPR 2024 • arXiv:2403.19474
#10280

Back to 3D: Few-Shot 3D Keypoint Detection with Back-Projected 2D Features

Thomas Wimmer, Peter Wonka, Maks Ovsjanikov

CVPR 2024 • arXiv:2311.18113
#10281

Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

Zeyi Sun, Ye Fang, Tong Wu et al.

CVPR 2024 • arXiv:2312.03818
#10282

DemoFusion: Democratising High-Resolution Image Generation With No $$$

Ruoyi Du, Dongliang Chang, Timothy Hospedales et al.

CVPR 2024 • arXiv:2311.16973
#10283

Activity-Biometrics: Person Identification from Daily Activities

Shehreen Azad, Yogesh S. Rawat

CVPR 2024 • arXiv:2403.17360
#10284

Holoported Characters: Real-time Free-viewpoint Rendering of Humans from Sparse RGB Cameras

Ashwath Shetty, Marc Habermann, Guoxing Sun et al.

CVPR 2024 • arXiv:2312.07423
#10285

Neighbor Relations Matter in Video Scene Detection

Jiawei Tan, Hongxing Wang, Jiaxin Li et al.

CVPR 2024
#10286

Fast ODE-based Sampling for Diffusion Models in Around 5 Steps

Zhenyu Zhou, Defang Chen, Can Wang et al.

CVPR 2024 • highlight • arXiv:2312.00094
#10287

Referring Image Editing: Object-level Image Editing via Referring Expressions

Chang Liu, Xiangtai Li, Henghui Ding

CVPR 2024
#10288

InNeRF360: Text-Guided 3D-Consistent Object Inpainting on 360-degree Neural Radiance Fields

Dongqing Wang, Tong Zhang, Alaa Abboud et al.

CVPR 2024 • arXiv:2305.15094
#10289

From-Ground-To-Objects: Coarse-to-Fine Self-supervised Monocular Depth Estimation of Dynamic Objects with Ground Contact Prior

Jaeho Moon, Juan Luis Gonzalez Bello, Byeongjun Kwon et al.

CVPR 2024 • arXiv:2312.10118
#10290

Unsupervised Blind Image Deblurring Based on Self-Enhancement

Lufei Chen, Xiangpeng Tian, Shuhua Xiong et al.

CVPR 2024
#10291

Mask Grounding for Referring Image Segmentation

Yong Xien Chng, Henry Zheng, Yizeng Han et al.

CVPR 2024 • arXiv:2312.12198
#10292

SignGraph: A Sign Sequence is Worth Graphs of Nodes

Shiwei Gan, Yafeng Yin, Zhiwei Jiang et al.

CVPR 2024
#10293

Embracing Unimodal Aleatoric Uncertainty for Robust Multimodal Fusion

Zixian Gao, Xun Jiang, Xing Xu et al.

CVPR 2024
#10294

DGC-GNN: Leveraging Geometry and Color Cues for Visual Descriptor-Free 2D-3D Matching

Shuzhe Wang, Juho Kannala, Daniel Barath

CVPR 2024 • arXiv:2306.12547
#10295

FreeDrag: Feature Dragging for Reliable Point-based Image Editing

Pengyang Ling, Lin Chen, Pan Zhang et al.

CVPR 2024 • arXiv:2307.04684
#10296

TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models

Yushi Huang, Ruihao Gong, Jing Liu et al.

CVPR 2024 • highlight • arXiv:2311.16503
#10297

GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians

Shenhan Qian, Tobias Kirschstein, Liam Schoneveld et al.

CVPR 2024 • highlight • arXiv:2312.02069
#10298

Explaining CLIP's Performance Disparities on Data from Blind/Low Vision Users

Daniela Massiceti, Camilla Longden, Agnieszka Słowik et al.

CVPR 2024 • arXiv:2311.17315
#10299

MMCert: Provable Defense against Adversarial Attacks to Multi-modal Models

Yanting Wang, Hongye Fu, Wei Zou et al.

CVPR 2024 • arXiv:2403.19080
#10300

DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data

Hanrong Ye, Dan Xu

CVPR 2024 • arXiv:2403.15389
#10301

Revisiting Spatial-Frequency Information Integration from a Hierarchical Perspective for Panchromatic and Multi-Spectral Image Fusion

Jiangtong Tan, Jie Huang, Naishan Zheng et al.

CVPR 2024
#10302

FineSports: A Multi-person Hierarchical Sports Video Dataset for Fine-grained Action Understanding

Jinglin Xu, Guohao Zhao, Sibo Yin et al.

CVPR 2024
#10303

MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning

Matteo Farina, Massimiliano Mancini, Elia Cunegatti et al.

CVPR 2024 • arXiv:2404.05621
#10304

Unleashing Unlabeled Data: A Paradigm for Cross-View Geo-Localization

Guopeng Li, Ming Qian, Gui-Song Xia

CVPR 2024 • arXiv:2403.14198
#10305

CURSOR: Scalable Mixed-Order Hypergraph Matching with CUR Decomposition

Qixuan Zheng, Ming Zhang, Hong Yan

CVPR 2024 • arXiv:2402.16594
#10306

FCS: Feature Calibration and Separation for Non-Exemplar Class Incremental Learning

Qiwei Li, Yuxin Peng, Jiahuan Zhou

CVPR 2024
#10307

GreedyViG: Dynamic Axial Graph Construction for Efficient Vision GNNs

Mustafa Munir, William Avery, Md Mostafijur Rahman et al.

CVPR 2024 • arXiv:2405.06849
#10308

Exploring Regional Clues in CLIP for Zero-Shot Semantic Segmentation

Yi Zhang, Meng-Hao Guo, Miao Wang et al.

CVPR 2024
#10309

GALA: Generating Animatable Layered Assets from a Single Scan

Taeksoo Kim, Byungjun Kim, Shunsuke Saito et al.

CVPR 2024 • arXiv:2401.12979
#10310

Improving Graph Contrastive Learning via Adaptive Positive Sampling

Jiaming Zhuo, Feiyang Qin, Can Cui et al.

CVPR 2024
#10311

Hearing Anything Anywhere

Mason Wang, Ryosuke Sawata, Samuel Clarke et al.

CVPR 2024 • arXiv:2406.07532
#10312

Dr2Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning

Chen Zhao, Shuming Liu, Karttikeya Mangalam et al.

CVPR 2024
#10313

Diffusion-EDFs: Bi-equivariant Denoising Generative Modeling on SE(3) for Visual Robotic Manipulation

Hyunwoo Ryu, Jiwoo Kim, Hyunseok An et al.

CVPR 2024 • highlight • arXiv:2309.02685
#10314

BOTH2Hands: Inferring 3D Hands from Both Text Prompts and Body Dynamics

Wenqian Zhang, Molin Huang, Yuxuan Zhou et al.

CVPR 2024 • arXiv:2312.07937
#10315

Bayesian Exploration of Pre-trained Models for Low-shot Image Classification

Yibo Miao, Yu Lei, Feng Zhou et al.

CVPR 2024 • arXiv:2404.00312
#10316

Benchmarking the Robustness of Temporal Action Detection Models Against Temporal Corruptions

Runhao Zeng, Xiaoyong Chen, Jiaming Liang et al.

CVPR 2024 • arXiv:2403.20254
#10317

RepKPU: Point Cloud Upsampling with Kernel Point Representation and Deformation

Yi Rong, Haoran Zhou, Kang Xia et al.

CVPR 2024
#10318

4K4D: Real-Time 4D View Synthesis at 4K Resolution

Zhen Xu, Sida Peng, Haotong Lin et al.

CVPR 2024 • arXiv:2310.11448
#10319

Context-Guided Spatio-Temporal Video Grounding

Xin Gu, Heng Fan, Yan Huang et al.

CVPR 2024 • arXiv:2401.01578
#10320

TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation

Sai Kumar Dwivedi, Yu Sun, Priyanka Patel et al.

CVPR 2024 • arXiv:2404.16752
#10321

Re-thinking Data Availability Attacks Against Deep Neural Networks

Bin Fang, Bo Li, Shuang Wu et al.

CVPR 2024
#10322

Logit Standardization in Knowledge Distillation

Shangquan Sun, Wenqi Ren, Jingzhi Li et al.

CVPR 2024 • highlight • arXiv:2403.01427
#10323

A Unified Approach for Text- and Image-guided 4D Scene Generation

Yufeng Zheng, Xueting Li, Koki Nagano et al.

CVPR 2024 • arXiv:2311.16854
#10324

CONFORM: Contrast is All You Need for High-Fidelity Text-to-Image Diffusion Models

Tuna Han Salih Meral, Enis Simsar, Federico Tombari et al.

CVPR 2024 • arXiv:2312.06059
#10325

SPECAT: SPatial-spEctral Cumulative-Attention Transformer for High-Resolution Hyperspectral Image Reconstruction

Zhiyang Yao, Shuyang Liu, Xiaoyun Yuan et al.

CVPR 2024
#10326

Video-Based Human Pose Regression via Decoupled Space-Time Aggregation

Jijie He, Wenwu Yang

CVPR 2024 • arXiv:2403.19926
#10327

Neural Refinement for Absolute Pose Regression with Feature Synthesis

Shuai Chen, Yash Bhalgat, Xinghui Li et al.

CVPR 2024 • arXiv:2303.10087
#10328

VGGSfM: Visual Geometry Grounded Deep Structure From Motion

Jianyuan Wang, Nikita Karaev, Christian Rupprecht et al.

CVPR 2024 • highlight
#10329

Boosting Image Restoration via Priors from Pre-trained Models

Xiaogang Xu, Shu Kong, Tao Hu et al.

CVPR 2024 • arXiv:2403.06793
#10330

CPP-Net: Embracing Multi-Scale Feature Fusion into Deep Unfolding CP-PPA Network for Compressive Sensing

Zhen Guo, Hongping Gan

CVPR 2024
#10331

GenFlow: Generalizable Recurrent Flow for 6D Pose Refinement of Novel Objects

Sungphill Moon, Hyeontae Son, Dongcheol Hur et al.

CVPR 2024 • arXiv:2403.11510
#10332

PKU-DyMVHumans: A Multi-View Video Benchmark for High-Fidelity Dynamic Human Modeling

Xiaoyun Zheng, Liwei Liao, Xufeng Li et al.

CVPR 2024 • arXiv:2403.16080
#10333

DiffCast: A Unified Framework via Residual Diffusion for Precipitation Nowcasting

Demin Yu, Xutao Li, Yunming Ye et al.

CVPR 2024 • arXiv:2312.06734
#10334

MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers

Yawar Siddiqui, Antonio Alliegro, Alexey Artemov et al.

CVPR 2024 • highlight • arXiv:2311.15475
#10335

RadarDistill: Boosting Radar-based Object Detection Performance via Knowledge Distillation from LiDAR Features

Geonho Bang, Kwangjin Choi, Jisong Kim et al.

CVPR 2024 • arXiv:2403.05061
#10336

Task-Conditioned Adaptation of Visual Features in Multi-Task Policy Learning

Pierre Marza, Laetitia Matignon, Olivier Simonin et al.

CVPR 2024 • arXiv:2402.07739
#10337

EasyDrag: Efficient Point-based Manipulation on Diffusion Models

Xingzhong Hou, Boxiao Liu, Yi Zhang et al.

CVPR 2024
#10338

Learned Lossless Image Compression based on Bit Plane Slicing

Zhe Zhang, Huairui Wang, Zhenzhong Chen et al.

CVPR 2024
#10339

BEM: Balanced and Entropy-based Mix for Long-Tailed Semi-Supervised Learning

Hongwei Zheng, Linyuan Zhou, Han Li et al.

CVPR 2024 • arXiv:2404.01179
#10340

Dancing with Still Images: Video Distillation via Static-Dynamic Disentanglement

Ziyu Wang, Yue Xu, Cewu Lu et al.

CVPR 2024 • arXiv:2312.00362
#10341

SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models

Tongtian Yue, Jie Cheng, Longteng Guo et al.

CVPR 2024 • arXiv:2403.13263
#10342

Frequency-Adaptive Dilated Convolution for Semantic Segmentation

Linwei Chen, Lin Gu, Dezhi Zheng et al.

CVPR 2024 • highlight • arXiv:2403.05369
#10343

TexTile: A Differentiable Metric for Texture Tileability

Carlos Rodriguez-Pardo, Dan Casas, Elena Garces et al.

CVPR 2024 • arXiv:2403.12961
#10344

MatSynth: A Modern PBR Materials Dataset

Giuseppe Vecchio, Valentin Deschaintre

CVPR 2024 • arXiv:2401.06056
#10345

Image Processing GNN: Breaking Rigidity in Super-Resolution

Yuchuan Tian, Hanting Chen, Chao Xu et al.

CVPR 2024
#10346

ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation

Suraj Patni, Aradhye Agarwal, Chetan Arora

CVPR 2024 • arXiv:2403.18807
#10347

Bi-Causal: Group Activity Recognition via Bidirectional Causality

Youliang Zhang, Wenxuan Liu, Danni Xu et al.

CVPR 2024
#10348

Riemannian Multinomial Logistics Regression for SPD Neural Networks

Ziheng Chen, Yue Song, Gaowen Liu et al.

CVPR 2024 • arXiv:2305.11288
#10349

LED: A Large-scale Real-world Paired Dataset for Event Camera Denoising

Yuxing Duan

CVPR 2024 • arXiv:2405.19718
#10350

NoiseCollage: A Layout-Aware Text-to-Image Diffusion Model Based on Noise Cropping and Merging

Takahiro Shirakawa, Seiichi Uchida

CVPR 2024 • arXiv:2403.03485
#10351

OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM

Yutao Hu, Tianbin, Quanfeng Lu et al.

CVPR 2024 • arXiv:2402.09181
#10352

Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models

Daniel Geng, Inbum Park, Andrew Owens

CVPR 2024 • arXiv:2311.17919
#10353

Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding

Zhihao Yuan, Jinke Ren, Chun-Mei Feng et al.

CVPR 2024 • arXiv:2311.15383
#10354

Towards HDR and HFR Video from Rolling-Mixed-Bit Spikings

Yakun Chang, Yeliduosi Xiaokaiti, Yujia Liu et al.

CVPR 2024
#10355

Learn from View Correlation: An Anchor Enhancement Strategy for Multi-view Clustering

Suyuan Liu, Ke Liang, Zhibin Dong et al.

CVPR 2024
#10356

Passive Snapshot Coded Aperture Dual-Pixel RGB-D Imaging

Bhargav Ghanekar, Salman Siddique Khan, Pranav Sharma et al.

CVPR 2024 • arXiv:2402.18102
#10357

UniPAD: A Universal Pre-training Paradigm for Autonomous Driving

Honghui Yang, Sha Zhang, Di Huang et al.

CVPR 2024 • arXiv:2310.08370
#10358

ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations

Maitreya Patel, Changhoon Kim, Sheng Cheng et al.

CVPR 2024 • arXiv:2312.04655
#10359

Low-Res Leads the Way: Improving Generalization for Super-Resolution by Self-Supervised Learning

Haoyu Chen, Wenbo Li, Jinjin Gu et al.

CVPR 2024 • arXiv:2403.02601
#10360

Neural Video Compression with Feature Modulation

Jiahao Li, Bin Li, Yan Lu

CVPR 2024 • arXiv:2402.17414
#10361

Nearest is Not Dearest: Towards Practical Defense against Quantization-conditioned Backdoor Attacks

Boheng Li, Yishuo Cai, Haowei Li et al.

CVPR 2024 • arXiv:2405.12725
#10362

Dual DETRs for Multi-Label Temporal Action Detection

Yuhan Zhu, Guozhen Zhang, Jing Tan et al.

CVPR 2024 • arXiv:2404.00653
#10363

Discriminative Probing and Tuning for Text-to-Image Generation

Leigang Qu, Wenjie Wang, Yongqi Li et al.

CVPR 2024 • arXiv:2403.04321
#10364

GigaTraj: Predicting Long-term Trajectories of Hundreds of Pedestrians in Gigapixel Complex Scenes

Haozhe Lin, Chunyu Wei, Li He et al.

CVPR 2024
#10365

Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata

Dongsu Zhang, Francis Williams, Žan Gojčič et al.

CVPR 2024 • highlight • arXiv:2406.08292
#10366

Comparing the Decision-Making Mechanisms by Transformers and CNNs via Explanation Methods

Mingqi Jiang, Saeed Khorram, Li Fuxin

CVPR 2024 • arXiv:2212.06872
#10367

Continual Segmentation with Disentangled Objectness Learning and Class Recognition

Yizheng Gong, Siyue Yu, Xiaoyang Wang et al.

CVPR 2024 • arXiv:2403.03477
#10368

Image Sculpting: Precise Object Editing with 3D Geometry Control

Jiraphon Yenphraphai, Xichen Pan, Sainan Liu et al.

CVPR 2024 • arXiv:2401.01702
#10369

Attribute-Guided Pedestrian Retrieval: Bridging Person Re-ID with Internal Attribute Variability

Yan Huang, Zhang Zhang, Qiang Wu et al.

CVPR 2024
#10370

Weakly Misalignment-free Adaptive Feature Alignment for UAVs-based Multimodal Object Detection

Chen Chen, Jiahao Qi, Xingyue Liu et al.

CVPR 2024
#10371

Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization

Deng Li, Aming Wu, Yaowei Wang et al.

CVPR 2024 • arXiv:2402.18447
#10372

EscherNet: A Generative Model for Scalable View Synthesis

Xin Kong, Shikun Liu, Xiaoyang Lyu et al.

CVPR 2024 • arXiv:2402.03908
#10373

MVCPS-NeuS: Multi-view Constrained Photometric Stereo for Neural Surface Reconstruction

Hiroaki Santo, Fumio Okura, Yasuyuki Matsushita

CVPR 2024
#10374

OHTA: One-shot Hand Avatar via Data-driven Implicit Priors

Xiaozheng Zheng, Chao Wen, Zhuo Su et al.

CVPR 2024 • arXiv:2402.18969
#10375

E-GPS: Explainable Geometry Problem Solving via Top-Down Solver and Bottom-Up Generator

Wenjun Wu, Lingling Zhang, Jun Liu et al.

CVPR 2024
#10376

MultiPhys: Multi-Person Physics-aware 3D Motion Estimation

Nicolás Ugrinovic, Boxiao Pan, Georgios Pavlakos et al.

CVPR 2024 • arXiv:2404.11987
#10377

LMDrive: Closed-Loop End-to-End Driving with Large Language Models

Hao Shao, Yuxuan Hu, Letian Wang et al.

CVPR 2024 • arXiv:2312.07488
#10378

ID-Blau: Image Deblurring by Implicit Diffusion-based reBLurring AUgmentation

Jia-Hao Wu, Fu-Jen Tsai, Yan-Tsung Peng et al.

CVPR 2024 • arXiv:2312.10998
#10379

GauHuman: Articulated Gaussian Splatting from Monocular Human Videos

Shoukang Hu, Tao Hu, Ziwei Liu

CVPR 2024 • arXiv:2312.02973
#10380

BEVNeXt: Reviving Dense BEV Frameworks for 3D Object Detection

Zhenxin Li, Shiyi Lan, Jose M. Alvarez et al.

CVPR 2024 • arXiv:2312.01696
#10381

AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents

Jieming Cui, Tengyu Liu, Nian Liu et al.

CVPR 2024 • arXiv:2403.12835
#10382

HumanNeRF-SE: A Simple yet Effective Approach to Animate HumanNeRF with Diverse Poses

Caoyuan Ma, Yu-Lun Liu, Zhixiang Wang et al.

CVPR 2024 • arXiv:2312.02232
#10383

SAM-6D: Segment Anything Model Meets Zero-Shot 6D Object Pose Estimation

Jiehong Lin, Lihua Liu, Dekun Lu et al.

CVPR 2024 • arXiv:2311.15707
#10384

SurMo: Surface-based 4D Motion Modeling for Dynamic Human Rendering

Tao Hu, Fangzhou Hong, Ziwei Liu

CVPR 2024 • arXiv:2404.01225
#10385

LeftRefill: Filling Right Canvas based on Left Reference through Generalized Text-to-Image Diffusion Model

Chenjie Cao, Yunuo Cai, Qiaole Dong et al.

CVPR 2024 • arXiv:2305.11577
#10386

ManipLLM: Embodied Multimodal Large Language Model for Object-Centric Robotic Manipulation

Xiaoqi Li, Mingxu Zhang, Yiran Geng et al.

CVPR 2024 • arXiv:2312.16217
#10387

Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation

Wenxuan Wang, Tongtian Yue, Yisi Zhang et al.

CVPR 2024
#10388

PanoPose: Self-supervised Relative Pose Estimation for Panoramic Images

Diantao Tu, Hainan Cui, Xianwei Zheng et al.

CVPR 2024 • highlight
#10389

Mask4Align: Aligned Entity Prompting with Color Masks for Multi-Entity Localization Problems

Haoquan Zhang, Ronggang Huang, Yi Xie et al.

CVPR 2024
#10390

Global and Local Prompts Cooperation via Optimal Transport for Federated Learning

Hongxia Li, Wei Huang, Jingya Wang et al.

CVPR 2024 • arXiv:2403.00041
#10391

VSCode: General Visual Salient and Camouflaged Object Detection with 2D Prompt Learning

Ziyang Luo, Nian Liu, Wangbo Zhao et al.

CVPR 2024 • arXiv:2311.15011
#10392

Dense Optical Tracking: Connecting the Dots

Guillaume Le Moing, Jean Ponce, Cordelia Schmid

CVPR 2024 • highlight • arXiv:2312.00786
#10393

Multi-agent Collaborative Perception via Motion-aware Robust Communication Network

Shixin Hong, Yu Liu, Zhi Li et al.

CVPR 2024
#10394

Ungeneralizable Examples

Jingwen Ye, Xinchao Wang

CVPR 2024 • arXiv:2404.14016
#10395

Language-only Training of Zero-shot Composed Image Retrieval

Geonmo Gu, Sanghyuk Chun, Wonjae Kim et al.

CVPR 2024
#10396

Causal-CoG: A Causal-Effect Look at Context Generation for Boosting Multi-modal Language Models

Shitian Zhao, Zhuowan Li, Yadong Lu et al.

CVPR 2024 • highlight • arXiv:2312.06685
#10397

Rapid Motor Adaptation for Robotic Manipulator Arms

Yichao Liang, Kevin Ellis, João F. Henriques

CVPR 2024 • arXiv:2312.04670
#10398

Instruct-Imagen: Image Generation with Multi-modal Instruction

Hexiang Hu, Kelvin C.K. Chan, Yu-Chuan Su et al.

CVPR 2024 • arXiv:2401.01952
#10399

Diffeomorphic Template Registration for Atmospheric Turbulence Mitigation

Dong Lao, Congli Wang, Alex Wong et al.

CVPR 2024 • highlight • arXiv:2405.03662
#10400

Adapting to Length Shift: FlexiLength Network for Trajectory Prediction

Yi Xu, Yun Fu

CVPR 2024 • arXiv:2404.00742