Most Cited 2024 "weighted-majority voting" Papers

12,324 papers found • Page 52 of 62

Filters:Most Cited 2024 weighted-majority voting Clear all

Conference

AAAI 2025 (3,028)COLM 2025 (418)CVPR 2025 (2,873)ICCV 2025 (2,701)ICLR 2025 (3,827)ICML 2025 (3,340)ISMAR 2025 (229)NEURIPS 2025 (5,858)AAAI 2024 (2,289)CVPR 2024 (2,716)ECCV 2024 (2,387)ICLR 2024 (2,297)ICML 2024 (2,635)

Paper Type

poster (24,624)paper (8,558)oral (1,594)spotlight (1,421)highlight (975)

#10201

A Versatile Framework for Continual Test-Time Domain Adaptation: Balancing Discriminability and Generalizability

Xu Yang, Xuan chen, Moqi Li et al.

CVPR 2024

#10202

Efficient Solution of Point-Line Absolute Pose

Petr Hruby, Timothy Duff, Marc Pollefeys

CVPR 2024highlightarXiv:2404.16552

#10203

SPIN: Simultaneous Perception Interaction and Navigation

Shagun Uppal, Ananye Agarwal, Haoyu Xiong et al.

CVPR 2024arXiv:2405.07991

#10204

CAMixerSR: Only Details Need More "Attention"

Yan Wang, Yi Liu, Shijie Zhao et al.

CVPR 2024

#10205

FISBe: A Real-World Benchmark Dataset for Instance Segmentation of Long-Range Thin Filamentous Structures

Lisa Mais, Peter Hirsch, Claire Managan et al.

CVPR 2024arXiv:2404.00130

#10206

POPDG: Popular 3D Dance Generation with PopDanceSet

Zhenye Luo, Min Ren, Xuecai Hu et al.

CVPR 2024arXiv:2405.03178

#10207

RankMatch: Exploring the Better Consistency Regularization for Semi-supervised Semantic Segmentation

Huayu Mai, Rui Sun, Tianzhu Zhang et al.

CVPR 2024

#10208

CoDe: An Explicit Content Decoupling Framework for Image Restoration

Enxuan Gu, Hongwei Ge, Yong Guo

CVPR 2024

#10209

Masked Spatial Propagation Network for Sparsity-Adaptive Depth Refinement

Jinyoung Jun, Jae-Han Lee, Chang-Su Kim

CVPR 2024arXiv:2404.19294

#10210

D^4: Dataset Distillation via Disentangled Diffusion Model

Duo Su, Junjie Hou, Weizhi Gao et al.

CVPR 2024

#10211

An Empirical Study of the Generalization Ability of Lidar 3D Object Detectors to Unseen Domains

George Eskandar

CVPR 2024arXiv:2402.17562

#10212

MarkovGen: Structured Prediction for Efficient Text-to-Image Generation

Sadeep Jayasumana, Daniel Glasner, Srikumar Ramalingam et al.

CVPR 2024arXiv:2308.10997

#10213

Rethinking the Representation in Federated Unsupervised Learning with Non-IID Data

Xinting Liao, Weiming Liu, Chaochao Chen et al.

CVPR 2024arXiv:2403.16398

#10214

ViLa-MIL: Dual-scale Vision-Language Multiple Instance Learning for Whole Slide Image Classification

Jiangbo Shi, Chen Li, Tieliang Gong et al.

CVPR 2024arXiv:2502.08391

#10215

Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance

Phuc Nguyen, Tuan Duc Ngo, Evangelos Kalogerakis et al.

CVPR 2024arXiv:2312.10671

#10216

Learning to Localize Objects Improves Spatial Reasoning in Visual-LLMs

Kanchana Ranasinghe, Satya Narayan Shukla, Omid Poursaeed et al.

CVPR 2024arXiv:2404.07449

#10217

CaDeT: a Causal Disentanglement Approach for Robust Trajectory Prediction in Autonomous Driving

Mozhgan Pourkeshavarz, Junrui Zhang, Amir Rasouli

CVPR 2024

#10218

Continual Forgetting for Pre-trained Vision Models

Hongbo Zhao, Bolin Ni, Junsong Fan et al.

CVPR 2024arXiv:2403.11530

#10219

Boosting Neural Representations for Videos with a Conditional Decoder

XINJIE ZHANG, Ren Yang, Dailan He et al.

CVPR 2024highlightarXiv:2402.18152

#10220

Unsupervised Feature Learning with Emergent Data-Driven Prototypicality

Yunhui Guo, Youren Zhang, Yubei Chen et al.

CVPR 2024arXiv:2307.01421

#10221

Text-Guided 3D Face Synthesis - From Generation to Editing

Yunjie Wu, Yapeng Meng, Zhipeng Hu et al.

CVPR 2024arXiv:2312.00375

#10222

Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models

Huan Ling, Seung Wook Kim, Antonio Torralba et al.

CVPR 2024highlightarXiv:2312.13763

#10223

IReNe: Instant Recoloring of Neural Radiance Fields

Alessio Mazzucchelli, Adrian Garcia-Garcia, Elena Garces et al.

CVPR 2024arXiv:2405.19876

#10224

Constrained Layout Generation with Factor Graphs

Mohammed Haroon Dupty, Yanfei Dong, Sicong Leng et al.

CVPR 2024arXiv:2404.00385

#10225

URHand: Universal Relightable Hands

Zhaoxi Chen, Gyeongsik Moon, Kaiwen Guo et al.

CVPR 2024arXiv:2401.05334

#10226

Neural Implicit Morphing of Face Images

Guilherme Schardong, Tiago Novello, Hallison Paz et al.

CVPR 2024arXiv:2308.13888

#10227

Distilling CLIP with Dual Guidance for Learning Discriminative Human Body Shape Representation

Feng Liu, Minchul Kim, Zhiyuan Ren et al.

CVPR 2024

#10228

Snapshot Lidar: Fourier Embedding of Amplitude and Phase for Single-Image Depth Reconstruction

Sarah Friday, Yunzi Shi, Yaswanth Kumar Cherivirala et al.

CVPR 2024

#10229

CARZero: Cross-Attention Alignment for Radiology Zero-Shot Classification

Haoran Lai, Qingsong Yao, Zihang Jiang et al.

CVPR 2024arXiv:2402.17417

#10230

MedM2G: Unifying Medical Multi-Modal Generation via Cross-Guided Diffusion with Visual Invariant

Chenlu Zhan, Gaoang Wang, Yu LIN et al.

CVPR 2024arXiv:2403.04290

#10231

GLID: Pre-training a Generalist Encoder-Decoder Vision Model

Jihao Liu, Jinliang Zheng, Yu Liu et al.

CVPR 2024arXiv:2404.07603

#10232

Your Image is My Video: Reshaping the Receptive Field via Image-To-Video Differentiable AutoAugmentation and Fusion

Sofia Casarin, Cynthia Ugwu, Sergio Escalera et al.

CVPR 2024arXiv:2403.15194

#10233

Kernel Adaptive Convolution for Scene Text Detection via Distance Map Prediction

Jinzhi Zheng, Heng Fan, Libo Zhang

CVPR 2024

#10234

NeRFCodec: Neural Feature Compression Meets Neural Radiance Fields for Memory-Efficient Scene Representation

Sicheng Li, Hao Li, Yiyi Liao et al.

CVPR 2024arXiv:2404.02185

#10235

Contrastive Denoising Score for Text-guided Latent Diffusion Image Editing

Hyelin Nam, Gihyun Kwon, Geon Yeong Park et al.

CVPR 2024arXiv:2311.18608

#10236

PlatoNeRF: 3D Reconstruction in Plato's Cave via Single-View Two-Bounce Lidar

Tzofi Klinghoffer, Xiaoyu Xiang, Siddharth Somasundaram et al.

CVPR 2024arXiv:2312.14239

#10237

DiffLoc: Diffusion Model for Outdoor LiDAR Localization

Wen Li, Yuyang Yang, Shangshu Yu et al.

CVPR 2024

#10238

Text-Guided Variational Image Generation for Industrial Anomaly Detection and Segmentation

Mingyu Lee, Jongwon Choi

CVPR 2024arXiv:2403.06247

#10239

Tackling the Singularities at the Endpoints of Time Intervals in Diffusion Models

Pengze Zhang, Hubery Yin, Chen Li et al.

CVPR 2024highlightarXiv:2403.08381

#10240

Troika: Multi-Path Cross-Modal Traction for Compositional Zero-Shot Learning

Siteng Huang, Biao Gong, Yutong Feng et al.

CVPR 2024arXiv:2303.15230

#10241

Soften to Defend: Towards Adversarial Robustness via Self-Guided Label Refinement

Daiwei Yu, Zhuorong Li, Lina Wei et al.

CVPR 2024arXiv:2403.09101

#10242

LoCoNet: Long-Short Context Network for Active Speaker Detection

Xizi Wang, Feng Cheng, Gedas Bertasius

CVPR 2024arXiv:2301.08237

#10243

WinSyn: : A High Resolution Testbed for Synthetic Data

Tom Kelly, John Femiani, Peter Wonka

CVPR 2024arXiv:2310.08471

#10244

Retrieval-Augmented Layout Transformer for Content-Aware Layout Generation

Daichi Horita, Naoto Inoue, Kotaro Kikuchi et al.

CVPR 2024arXiv:2311.13602

#10245

Wired Perspectives: Multi-View Wire Art Embraces Generative AI

Zhiyu Qu, LAN YANG, Honggang Zhang et al.

CVPR 2024arXiv:2311.15421

#10246

Small Scale Data-Free Knowledge Distillation

He Liu, Yikai Wang, Huaping Liu et al.

CVPR 2024arXiv:2406.07876

#10247

Transfer CLIP for Generalizable Image Denoising

Jun Cheng, Dong Liang, Shan Tan

CVPR 2024arXiv:2403.15132

#10248

Validating Privacy-Preserving Face Recognition under a Minimum Assumption

Hui Zhang, Xingbo Dong, YenLungLai et al.

CVPR 2024

#10249

CLiC: Concept Learning in Context

Mehdi Safaee, Aryan Mikaeili, Or Patashnik et al.

CVPR 2024highlightarXiv:2311.17083

#10250

IDGuard: Robust General Identity-centric POI Proactive Defense Against Face Editing Abuse

Yunshu Dai, Jianwei Fei, Fangjun Huang

CVPR 2024

#10251

Feature Re-Embedding: Towards Foundation Model-Level Performance in Computational Pathology

Wenhao Tang, Fengtao ZHOU, Sheng Huang et al.

CVPR 2024arXiv:2402.17228

#10252

SpatialTracker: Tracking Any 2D Pixels in 3D Space

Yuxi Xiao, Qianqian Wang, Shangzhan Zhang et al.

CVPR 2024highlightarXiv:2404.04319

#10253

TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models

Zhongwei Zhang, Fuchen Long, Yingwei Pan et al.

CVPR 2024arXiv:2403.17005

#10254

Perceptual Assessment and Optimization of HDR Image Rendering

Peibei Cao, Rafal Mantiuk, Kede Ma

CVPR 2024arXiv:2310.12877

#10255

Pose-Transformed Equivariant Network for 3D Point Trajectory Prediction

Ruixuan Yu, Jian Sun

CVPR 2024

#10256

Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications

Yuwen Xiong, Zhiqi Li, Yuntao Chen et al.

CVPR 2024highlightarXiv:2401.06197

#10257

Multimodal Representation Learning by Alternating Unimodal Adaptation

Xiaohui Zhang, Jaehong Yoon, Mohit Bansal et al.

CVPR 2024arXiv:2311.10707

#10258

Compositional Video Understanding with Spatiotemporal Structure-based Transformers

Hoyeoung Yun, Jinwoo Ahn, Minseo Kim et al.

CVPR 2024

#10259

Make-It-Vivid: Dressing Your Animatable Biped Cartoon Characters from Text

Junshu Tang, Yanhong Zeng, Ke Fan et al.

CVPR 2024arXiv:2403.16897

#10260

Coherent Temporal Synthesis for Incremental Action Segmentation

Guodong Ding, Hans Golong, Angela Yao

CVPR 2024arXiv:2403.06102

#10261

Person in Place: Generating Associative Skeleton-Guidance Maps for Human-Object Interaction Image Editing

ChangHee Yang, ChanHee Kang, Kyeongbo Kong et al.

CVPR 2024

#10262

Estimating Extreme 3D Image Rotations using Cascaded Attention

Shay Dekel, Yosi Keller, Martin Čadík

CVPR 2024

#10263

Towards Real-World HDR Video Reconstruction: A Large-Scale Benchmark Dataset and A Two-Stage Alignment Network

Yong Shu, Liquan Shen, Xiangyu Hu et al.

CVPR 2024arXiv:2405.00244

#10264

Photo-SLAM: Real-time Simultaneous Localization and Photorealistic Mapping for Monocular Stereo and RGB-D Cameras

Huajian Huang, Longwei Li, Hui Cheng et al.

CVPR 2024arXiv:2311.16728

#10265

Attention Calibration for Disentangled Text-to-Image Personalization

Yanbing Zhang, Mengping Yang, Qin Zhou et al.

CVPR 2024arXiv:2403.18551

#10266

SmartMask: Context Aware High-Fidelity Mask Generation for Fine-grained Object Insertion and Layout Control

Jaskirat Singh, Jianming Zhang, Qing Liu et al.

CVPR 2024arXiv:2312.05039

#10267

GraCo: Granularity-Controllable Interactive Segmentation

Yian Zhao, Kehan Li, Zesen Cheng et al.

CVPR 2024highlightarXiv:2405.00587

#10268

Segment Every Out-of-Distribution Object

Wenjie Zhao, Jia Li, Xin Dong et al.

CVPR 2024arXiv:2311.16516

#10269

Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution

Shangchen Zhou, Peiqing Yang, Jianyi Wang et al.

CVPR 2024highlightarXiv:2312.06640

#10270

Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild

Fanghua Yu, Jinjin Gu, Zheyuan Li et al.

CVPR 2024arXiv:2401.13627

#10271

Masked and Shuffled Blind Spot Denoising for Real-World Images

Hamadi Chihaoui, Paolo Favaro

CVPR 2024arXiv:2404.09389

#10272

Open-Vocabulary Object 6D Pose Estimation

Jaime Corsetti, Davide Boscaini, Changjae Oh et al.

CVPR 2024highlightarXiv:2312.00690

#10273

Generative Region-Language Pretraining for Open-Ended Object Detection

Chuang Lin, Yi Jiang, Lizhen Qu et al.

CVPR 2024arXiv:2403.10191

#10274

Boosting Diffusion Models with Moving Average Sampling in Frequency Domain

Yurui Qian, Qi Cai, Yingwei Pan et al.

CVPR 2024arXiv:2403.17870

#10275

Discovering Syntactic Interaction Clues for Human-Object Interaction Detection

Jinguo Luo, Weihong Ren, Weibo Jiang et al.

CVPR 2024

#10276

Quantifying Uncertainty in Motion Prediction with Variational Bayesian Mixture

Juanwu Lu, Can Cui, Yunsheng Ma et al.

CVPR 2024arXiv:2404.03789

#10277

Generative Latent Coding for Ultra-Low Bitrate Image Compression

Zhaoyang Jia, Jiahao Li, Bin Li et al.

CVPR 2024arXiv:2512.20194

#10278

Selectively Informative Description can Reduce Undesired Embedding Entanglements in Text-to-Image Personalization

Jimyeong Kim, Jungwon Park, Wonjong Rhee

CVPR 2024arXiv:2403.15330

#10279

SG-PGM: Partial Graph Matching Network with Semantic Geometric Fusion for 3D Scene Graph Alignment and Its Downstream Tasks

Yaxu Xie, Alain Pagani, Didier Stricker

CVPR 2024arXiv:2403.19474

#10280

Back to 3D: Few-Shot 3D Keypoint Detection with Back-Projected 2D Features

Thomas Wimmer, Peter Wonka, Maks Ovsjanikov

CVPR 2024arXiv:2311.18113

#10281

Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

Zeyi Sun, Ye Fang, Tong Wu et al.

CVPR 2024arXiv:2312.03818

#10282

DemoFusion: Democratising High-Resolution Image Generation With No $$$

Ruoyi DU, Dongliang Chang, Timothy Hospedales et al.

CVPR 2024arXiv:2311.16973

#10283

Activity-Biometrics: Person Identification from Daily Activities

Shehreen Azad, Yogesh S. Rawat

CVPR 2024arXiv:2403.17360

#10284

Holoported Characters: Real-time Free-viewpoint Rendering of Humans from Sparse RGB Cameras

Ashwath Shetty, Marc Habermann, Guoxing Sun et al.

CVPR 2024arXiv:2312.07423

#10285

Neighbor Relations Matter in Video Scene Detection

Jiawei Tan, Hongxing Wang, Jiaxin Li et al.

CVPR 2024

#10286

Fast ODE-based Sampling for Diffusion Models in Around 5 Steps

Zhenyu Zhou, Defang Chen, Can Wang et al.

CVPR 2024highlightarXiv:2312.00094

#10287

Referring Image Editing: Object-level Image Editing via Referring Expressions

Chang Liu, Xiangtai Li, Henghui Ding

CVPR 2024

#10288

InNeRF360: Text-Guided 3D-Consistent Object Inpainting on 360-degree Neural Radiance Fields

Dongqing Wang, Tong Zhang, Alaa Abboud et al.

CVPR 2024arXiv:2305.15094

#10289

From-Ground-To-Objects: Coarse-to-Fine Self-supervised Monocular Depth Estimation of Dynamic Objects with Ground Contact Prior

Jaeho Moon, Juan Luis Gonzalez Bello, Byeongjun Kwon et al.

CVPR 2024arXiv:2312.10118

#10290

Unsupervised Blind Image Deblurring Based on Self-Enhancement

Lufei Chen, Xiangpeng Tian, Shuhua Xiong et al.

CVPR 2024

#10291

Mask Grounding for Referring Image Segmentation

Yong Xien Chng, Henry Zheng, Yizeng Han et al.

CVPR 2024arXiv:2312.12198

#10292

SignGraph: A Sign Sequence is Worth Graphs of Nodes

Shiwei Gan, Yafeng Yin, Zhiwei Jiang et al.

CVPR 2024

#10293

Embracing Unimodal Aleatoric Uncertainty for Robust Multimodal Fusion

Zixian Gao, Xun Jiang, Xing Xu et al.

CVPR 2024

#10294

DGC-GNN: Leveraging Geometry and Color Cues for Visual Descriptor-Free 2D-3D Matching

Shuzhe Wang, Juho Kannala, Daniel Barath

CVPR 2024arXiv:2306.12547

#10295

FreeDrag: Feature Dragging for Reliable Point-based Image Editing

Pengyang Ling, Lin Chen, Pan Zhang et al.

CVPR 2024arXiv:2307.04684

#10296

TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models

Yushi Huang, Ruihao Gong, Jing Liu et al.

CVPR 2024highlightarXiv:2311.16503

#10297

GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians

Shenhan Qian, Tobias Kirschstein, Liam Schoneveld et al.

CVPR 2024highlightarXiv:2312.02069

#10298

Explaining CLIP's Performance Disparities on Data from Blind/Low Vision Users

Daniela Massiceti, Camilla Longden, Agnieszka Słowik et al.

CVPR 2024arXiv:2311.17315

#10299

MMCert: Provable Defense against Adversarial Attacks to Multi-modal Models

Yanting Wang, Hongye Fu, Wei Zou et al.

CVPR 2024arXiv:2403.19080

#10300

DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data

Hanrong Ye, Dan Xu

CVPR 2024arXiv:2403.15389

#10301

Revisiting Spatial-Frequency Information Integration from a Hierarchical Perspective for Panchromatic and Multi-Spectral Image Fusion

Jiangtong Tan, Jie Huang, Naishan Zheng et al.

CVPR 2024

#10302

FineSports: A Multi-person Hierarchical Sports Video Dataset for Fine-grained Action Understanding

Jinglin Xu, Guohao Zhao, Sibo Yin et al.

CVPR 2024

#10303

MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning

Matteo Farina, Massimiliano Mancini, Elia Cunegatti et al.

CVPR 2024arXiv:2404.05621

#10304

Unleashing Unlabeled Data: A Paradigm for Cross-View Geo-Localization

Guopeng Li, Ming Qian, Gui-Song Xia

CVPR 2024arXiv:2403.14198

#10305

CURSOR: Scalable Mixed-Order Hypergraph Matching with CUR Decomposition

Qixuan Zheng, Ming Zhang, Hong Yan

CVPR 2024arXiv:2402.16594

#10306

FCS: Feature Calibration and Separation for Non-Exemplar Class Incremental Learning

Qiwei Li, Yuxin Peng, Jiahuan Zhou

CVPR 2024

#10307

GreedyViG: Dynamic Axial Graph Construction for Efficient Vision GNNs

Mustafa Munir, William Avery, Md Mostafijur Rahman et al.

CVPR 2024arXiv:2405.06849

#10308

Exploring Regional Clues in CLIP for Zero-Shot Semantic Segmentation

Yi Zhang, Meng-Hao Guo, Miao Wang et al.

CVPR 2024

#10309

GALA: Generating Animatable Layered Assets from a Single Scan

Taeksoo Kim, Byungjun Kim, Shunsuke Saito et al.

CVPR 2024arXiv:2401.12979

#10310

Improving Graph Contrastive Learning via Adaptive Positive Sampling

Jiaming Zhuo, Feiyang Qin, Can Cui et al.

CVPR 2024

#10311

Hearing Anything Anywhere

Mason Wang, Ryosuke Sawata, Samuel Clarke et al.

CVPR 2024arXiv:2406.07532

#10312

Dr2Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning

Chen Zhao, Shuming Liu, Karttikeya Mangalam et al.

CVPR 2024

#10313

Diffusion-EDFs: Bi-equivariant Denoising Generative Modeling on SE(3) for Visual Robotic Manipulation

Hyunwoo Ryu, Jiwoo Kim, Hyunseok An et al.

CVPR 2024highlightarXiv:2309.02685

#10314

BOTH2Hands: Inferring 3D Hands from Both Text Prompts and Body Dynamics

Wenqian Zhang, Molin Huang, Yuxuan Zhou et al.

CVPR 2024arXiv:2312.07937

#10315

Bayesian Exploration of Pre-trained Models for Low-shot Image Classification

Yibo Miao, Yu lei, Feng Zhou et al.

CVPR 2024arXiv:2404.00312

#10316

Benchmarking the Robustness of Temporal Action Detection Models Against Temporal Corruptions

Runhao Zeng, Xiaoyong Chen, Jiaming Liang et al.

CVPR 2024arXiv:2403.20254

#10317

RepKPU: Point Cloud Upsampling with Kernel Point Representation and Deformation

Yi Rong, Haoran Zhou, Kang Xia et al.

CVPR 2024

#10318

4K4D: Real-Time 4D View Synthesis at 4K Resolution

Zhen Xu, Sida Peng, Haotong Lin et al.

CVPR 2024arXiv:2310.11448

#10319

Context-Guided Spatio-Temporal Video Grounding

Xin Gu, Heng Fan, Yan Huang et al.

CVPR 2024arXiv:2401.01578

#10320

TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation

Sai Kumar Dwivedi, Yu Sun, Priyanka Patel et al.

CVPR 2024arXiv:2404.16752

#10321

Re-thinking Data Availability Attacks Against Deep Neural Networks

Bin Fang, Bo Li, Shuang Wu et al.

CVPR 2024

#10322

Logit Standardization in Knowledge Distillation

Shangquan Sun, Wenqi Ren, Jingzhi Li et al.

CVPR 2024highlightarXiv:2403.01427

#10323

A Unified Approach for Text- and Image-guided 4D Scene Generation

Yufeng Zheng, Xueting Li, Koki Nagano et al.

CVPR 2024arXiv:2311.16854

#10324

CONFORM: Contrast is All You Need for High-Fidelity Text-to-Image Diffusion Models

Tuna Han Salih Meral, Enis Simsar, Federico Tombari et al.

CVPR 2024arXiv:2312.06059

#10325

SPECAT: SPatial-spEctral Cumulative-Attention Transformer for High-Resolution Hyperspectral Image Reconstruction

Zhiyang Yao, Shuyang Liu, Xiaoyun Yuan et al.

CVPR 2024

#10326

Video-Based Human Pose Regression via Decoupled Space-Time Aggregation

Jijie He, Wenwu Yang

CVPR 2024arXiv:2403.19926

#10327

Neural Refinement for Absolute Pose Regression with Feature Synthesis

Shuai Chen, Yash Bhalgat, Xinghui Li et al.

CVPR 2024arXiv:2303.10087

#10328

VGGSfM: Visual Geometry Grounded Deep Structure From Motion

Jianyuan Wang, Nikita Karaev, Christian Rupprecht et al.

CVPR 2024highlight

#10329

Boosting Image Restoration via Priors from Pre-trained Models

Xiaogang Xu, Shu Kong, Tao Hu et al.

CVPR 2024arXiv:2403.06793

#10330

CPP-Net: Embracing Multi-Scale Feature Fusion into Deep Unfolding CP-PPA Network for Compressive Sensing

Zhen Guo, Hongping Gan

CVPR 2024

#10331

GenFlow: Generalizable Recurrent Flow for 6D Pose Refinement of Novel Objects

Sungphill Moon, Hyeontae Son, Dongcheol Hur et al.

CVPR 2024arXiv:2403.11510

#10332

PKU-DyMVHumans: A Multi-View Video Benchmark for High-Fidelity Dynamic Human Modeling

Xiaoyun Zheng, Liwei Liao, Xufeng Li et al.

CVPR 2024arXiv:2403.16080

#10333

DiffCast: A Unified Framework via Residual Diffusion for Precipitation Nowcasting

Demin Yu, Xutao Li, Yunming Ye et al.

CVPR 2024arXiv:2312.06734

#10334

MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers

Yawar Siddiqui, Antonio Alliegro, Alexey Artemov et al.

CVPR 2024highlightarXiv:2311.15475

#10335

RadarDistill: Boosting Radar-based Object Detection Performance via Knowledge Distillation from LiDAR Features

Geonho Bang, Kwangjin Choi, Jisong Kim et al.

CVPR 2024arXiv:2403.05061

#10336

Task-Conditioned Adaptation of Visual Features in Multi-Task Policy Learning

Pierre Marza, Laetitia Matignon, Olivier Simonin et al.

CVPR 2024arXiv:2402.07739

#10337

EasyDrag: Efficient Point-based Manipulation on Diffusion Models

Xingzhong Hou, Boxiao Liu, Yi Zhang et al.

CVPR 2024

#10338

Learned Lossless Image Compression based on Bit Plane Slicing

Zhe Zhang, Huairui Wang, Zhenzhong Chen et al.

CVPR 2024

#10339

BEM: Balanced and Entropy-based Mix for Long-Tailed Semi-Supervised Learning

Hongwei Zheng, Linyuan Zhou, Han Li et al.

CVPR 2024arXiv:2404.01179

#10340

Dancing with Still Images: Video Distillation via Static-Dynamic Disentanglement

Ziyu Wang, Yue Xu, Cewu Lu et al.

CVPR 2024arXiv:2312.00362

#10341

SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models

Tongtian Yue, Jie Cheng, Longteng Guo et al.

CVPR 2024arXiv:2403.13263

#10342

Frequency-Adaptive Dilated Convolution for Semantic Segmentation

Linwei Chen, Lin Gu, Dezhi Zheng et al.

CVPR 2024highlightarXiv:2403.05369

#10343

TexTile: A Differentiable Metric for Texture Tileability

Carlos Rodriguez-Pardo, Dan Casas, Elena Garces et al.

CVPR 2024arXiv:2403.12961

#10344

MatSynth: A Modern PBR Materials Dataset

Giuseppe Vecchio, Valentin Deschaintre

CVPR 2024arXiv:2401.06056

#10345

Image Processing GNN: Breaking Rigidity in Super-Resolution

Yuchuan Tian, Hanting Chen, Chao Xu et al.

CVPR 2024

#10346

ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation

Suraj Patni, Aradhye Agarwal, Chetan Arora

CVPR 2024arXiv:2403.18807

#10347

Bi-Causal: Group Activity Recognition via Bidirectional Causality

Youliang Zhang, Wenxuan Liu, danni xu et al.

CVPR 2024

#10348

Riemannian Multinomial Logistics Regression for SPD Neural Networks

Ziheng Chen, Yue Song, Gaowen Liu et al.

CVPR 2024arXiv:2305.11288

#10349

LED: A Large-scale Real-world Paired Dataset for Event Camera Denoising

Yuxing Duan

CVPR 2024arXiv:2405.19718

#10350

NoiseCollage: A Layout-Aware Text-to-Image Diffusion Model Based on Noise Cropping and Merging

Takahiro Shirakawa, Seiichi Uchida

CVPR 2024arXiv:2403.03485

#10351

OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM

Yutao Hu, Tianbin, Quanfeng Lu et al.

CVPR 2024arXiv:2402.09181

#10352

Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models

Daniel Geng, Inbum Park, Andrew Owens

CVPR 2024arXiv:2311.17919

#10353

Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding

Zhihao Yuan, Jinke Ren, Chun-Mei Feng et al.

CVPR 2024arXiv:2311.15383

#10354

Towards HDR and HFR Video from Rolling-Mixed-Bit Spikings

Yakun Chang, Yeliduosi Xiaokaiti, Yujia Liu et al.

CVPR 2024

#10355

Learn from View Correlation: An Anchor Enhancement Strategy for Multi-view Clustering

Suyuan Liu, KE LIANG, Zhibin Dong et al.

CVPR 2024

#10356

Passive Snapshot Coded Aperture Dual-Pixel RGB-D Imaging

Bhargav Ghanekar, Salman Siddique Khan, Pranav Sharma et al.

CVPR 2024arXiv:2402.18102

#10357

UniPAD: A Universal Pre-training Paradigm for Autonomous Driving

Honghui Yang, Sha Zhang, Di Huang et al.

CVPR 2024arXiv:2310.08370

#10358

ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations

Maitreya Patel, Changhoon Kim, Sheng Cheng et al.

CVPR 2024arXiv:2312.04655

#10359

Low-Res Leads the Way: Improving Generalization for Super-Resolution by Self-Supervised Learning

Haoyu Chen, Wenbo Li, Jinjin Gu et al.

CVPR 2024arXiv:2403.02601

#10360

Neural Video Compression with Feature Modulation

Jiahao Li, Bin Li, Yan Lu

CVPR 2024arXiv:2402.17414

#10361

Nearest is Not Dearest: Towards Practical Defense against Quantization-conditioned Backdoor Attacks

Boheng Li, Yishuo Cai, Haowei Li et al.

CVPR 2024arXiv:2405.12725

#10362

Dual DETRs for Multi-Label Temporal Action Detection

Yuhan Zhu, Guozhen Zhang, Jing Tan et al.

CVPR 2024arXiv:2404.00653

#10363

Discriminative Probing and Tuning for Text-to-Image Generation

Leigang Qu, Wenjie Wang, Yongqi Li et al.

CVPR 2024arXiv:2403.04321

#10364

GigaTraj: Predicting Long-term Trajectories of Hundreds of Pedestrians in Gigapixel Complex Scenes

Haozhe Lin, Chunyu Wei, Li He et al.

CVPR 2024

#10365

Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata

Dongsu Zhang, Francis Williams, Žan Gojčič et al.

CVPR 2024highlightarXiv:2406.08292

#10366

Comparing the Decision-Making Mechanisms by Transformers and CNNs via Explanation Methods

Mingqi Jiang, Saeed Khorram, Li Fuxin

CVPR 2024arXiv:2212.06872

#10367

Continual Segmentation with Disentangled Objectness Learning and Class Recognition

Yizheng Gong, Siyue Yu, Xiaoyang Wang et al.

CVPR 2024arXiv:2403.03477

#10368

Image Sculpting: Precise Object Editing with 3D Geometry Control

Jiraphon Yenphraphai, Xichen Pan, Sainan Liu et al.

CVPR 2024arXiv:2401.01702

#10369

Attribute-Guided Pedestrian Retrieval: Bridging Person Re-ID with Internal Attribute Variability

Yan Huang, Zhang Zhang, Qiang Wu et al.

CVPR 2024

#10370

Weakly Misalignment-free Adaptive Feature Alignment for UAVs-based Multimodal Object Detection

Chen Chen, Jiahao Qi, Xingyue Liu et al.

CVPR 2024

#10371

Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization

Deng Li, Aming Wu, Yaowei Wang et al.

CVPR 2024arXiv:2402.18447

#10372

EscherNet: A Generative Model for Scalable View Synthesis

Xin Kong, Shikun Liu, Xiaoyang Lyu et al.

CVPR 2024arXiv:2402.03908

#10373

MVCPS-NeuS: Multi-view Constrained Photometric Stereo for Neural Surface Reconstruction

Hiroaki Santo, Fumio Okura, Yasuyuki Matsushita

CVPR 2024

#10374

OHTA: One-shot Hand Avatar via Data-driven Implicit Priors

Xiaozheng Zheng, Chao Wen, Zhuo Su et al.

CVPR 2024arXiv:2402.18969

#10375

E-GPS: Explainable Geometry Problem Solving via Top-Down Solver and Bottom-Up Generator

Wenjun Wu, Lingling Zhang, Jun Liu et al.

CVPR 2024

#10376

MultiPhys: Multi-Person Physics-aware 3D Motion Estimation

Nicolás Ugrinovic, Boxiao Pan, Georgios Pavlakos et al.

CVPR 2024arXiv:2404.11987

#10377

LMDrive: Closed-Loop End-to-End Driving with Large Language Models

Hao Shao, Yuxuan Hu, Letian Wang et al.

CVPR 2024arXiv:2312.07488

#10378

ID-Blau: Image Deblurring by Implicit Diffusion-based reBLurring AUgmentation

Jia-Hao Wu, Fu-Jen Tsai, Yan-Tsung Peng et al.

CVPR 2024arXiv:2312.10998

#10379

GauHuman: Articulated Gaussian Splatting from Monocular Human Videos

Shoukang Hu, Tao Hu, Ziwei Liu

CVPR 2024arXiv:2312.02973

#10380

BEVNeXt: Reviving Dense BEV Frameworks for 3D Object Detection

Zhenxin Li, Shiyi Lan, Jose M. Alvarez et al.

CVPR 2024arXiv:2312.01696

#10381

AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents

Jieming Cui, Tengyu Liu, Nian Liu et al.

CVPR 2024arXiv:2403.12835

#10382

HumanNeRF-SE: A Simple yet Effective Approach to Animate HumanNeRF with Diverse Poses

Caoyuan Ma, Yu-Lun Liu, Zhixiang Wang et al.

CVPR 2024arXiv:2312.02232

#10383

SAM-6D: Segment Anything Model Meets Zero-Shot 6D Object Pose Estimation

Jiehong Lin, lihua liu, Dekun Lu et al.

CVPR 2024arXiv:2311.15707

#10384

SurMo: Surface-based 4D Motion Modeling for Dynamic Human Rendering

Tao Hu, Fangzhou Hong, Ziwei Liu

CVPR 2024arXiv:2404.01225

#10385

LeftRefill: Filling Right Canvas based on Left Reference through Generalized Text-to-Image Diffusion Model

Chenjie Cao, Yunuo Cai, Qiaole Dong et al.

CVPR 2024arXiv:2305.11577

#10386

ManipLLM: Embodied Multimodal Large Language Model for Object-Centric Robotic Manipulation

Xiaoqi Li, Mingxu Zhang, Yiran Geng et al.

CVPR 2024arXiv:2312.16217

#10387

Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation

Wenxuan Wang, Tongtian Yue, Yisi Zhang et al.

CVPR 2024

#10388

PanoPose: Self-supervised Relative Pose Estimation for Panoramic Images

Diantao Tu, Hainan Cui, Xianwei Zheng et al.

CVPR 2024highlight

#10389

Mask4Align: Aligned Entity Prompting with Color Masks for Multi-Entity Localization Problems

Haoquan Zhang, Ronggang Huang, Yi Xie et al.

CVPR 2024

#10390

Global and Local Prompts Cooperation via Optimal Transport for Federated Learning

Hongxia Li, Wei Huang, Jingya Wang et al.

CVPR 2024arXiv:2403.00041

#10391

VSCode: General Visual Salient and Camouflaged Object Detection with 2D Prompt Learning

Ziyang Luo, Nian Liu, Wangbo Zhao et al.

CVPR 2024arXiv:2311.15011

#10392

Dense Optical Tracking: Connecting the Dots

Guillaume Le Moing, Jean Ponce, Cordelia Schmid

CVPR 2024highlightarXiv:2312.00786

#10393

Multi-agent Collaborative Perception via Motion-aware Robust Communication Network

Shixin Hong, Yu LIU, Zhi Li et al.

CVPR 2024

#10394

Ungeneralizable Examples

Jingwen Ye, Xinchao Wang

CVPR 2024arXiv:2404.14016

#10395

Language-only Training of Zero-shot Composed Image Retrieval

Geonmo Gu, Sanghyuk Chun, Wonjae Kim et al.

CVPR 2024

#10396

Causal-CoG: A Causal-Effect Look at Context Generation for Boosting Multi-modal Language Models

Shitian Zhao, Zhuowan Li, YadongLu et al.

CVPR 2024highlightarXiv:2312.06685

#10397

Rapid Motor Adaptation for Robotic Manipulator Arms

Yichao Liang, Kevin Ellis, João F. Henriques

CVPR 2024arXiv:2312.04670

#10398

Instruct-Imagen: Image Generation with Multi-modal Instruction

Hexiang Hu, Kelvin C.K. Chan, Yu-Chuan Su et al.

CVPR 2024arXiv:2401.01952

#10399

Diffeomorphic Template Registration for Atmospheric Turbulence Mitigation

Dong Lao, Congli Wang, Alex Wong et al.

CVPR 2024highlightarXiv:2405.03662

#10400

Adapting to Length Shift: FlexiLength Network for Trajectory Prediction

Yi Xu, Yun Fu

CVPR 2024arXiv:2404.00742

← Previous

1...50 51 52 53 54...62