Most Cited 2024 "ventral stream selectivity" Papers

12,324 papers found • Page 52 of 62

#10201

Boosting Neural Representations for Videos with a Conditional Decoder

XINJIE ZHANG, Ren Yang, Dailan He et al.

CVPR 2024highlightarXiv:2402.18152
#10202

Unsupervised Feature Learning with Emergent Data-Driven Prototypicality

Yunhui Guo, Youren Zhang, Yubei Chen et al.

CVPR 2024posterarXiv:2307.01421
#10203

Text-Guided 3D Face Synthesis - From Generation to Editing

Yunjie Wu, Yapeng Meng, Zhipeng Hu et al.

CVPR 2024posterarXiv:2312.00375
#10204

Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models

Huan Ling, Seung Wook Kim, Antonio Torralba et al.

CVPR 2024highlightarXiv:2312.13763
#10205

IReNe: Instant Recoloring of Neural Radiance Fields

Alessio Mazzucchelli, Adrian Garcia-Garcia, Elena Garces et al.

CVPR 2024posterarXiv:2405.19876
#10206

Constrained Layout Generation with Factor Graphs

Mohammed Haroon Dupty, Yanfei Dong, Sicong Leng et al.

CVPR 2024posterarXiv:2404.00385
#10207

URHand: Universal Relightable Hands

Zhaoxi Chen, Gyeongsik Moon, Kaiwen Guo et al.

CVPR 2024posterarXiv:2401.05334
#10208

Neural Implicit Morphing of Face Images

Guilherme Schardong, Tiago Novello, Hallison Paz et al.

CVPR 2024posterarXiv:2308.13888
#10209

Distilling CLIP with Dual Guidance for Learning Discriminative Human Body Shape Representation

Feng Liu, Minchul Kim, Zhiyuan Ren et al.

CVPR 2024poster
#10210

Snapshot Lidar: Fourier Embedding of Amplitude and Phase for Single-Image Depth Reconstruction

Sarah Friday, Yunzi Shi, Yaswanth Kumar Cherivirala et al.

CVPR 2024poster
#10211

CARZero: Cross-Attention Alignment for Radiology Zero-Shot Classification

Haoran Lai, Qingsong Yao, Zihang Jiang et al.

CVPR 2024posterarXiv:2402.17417
#10212

MedM2G: Unifying Medical Multi-Modal Generation via Cross-Guided Diffusion with Visual Invariant

Chenlu Zhan, Gaoang Wang, Yu LIN et al.

CVPR 2024posterarXiv:2403.04290
#10213

GLID: Pre-training a Generalist Encoder-Decoder Vision Model

Jihao Liu, Jinliang Zheng, Yu Liu et al.

CVPR 2024posterarXiv:2404.07603
#10214

Your Image is My Video: Reshaping the Receptive Field via Image-To-Video Differentiable AutoAugmentation and Fusion

Sofia Casarin, Cynthia Ugwu, Sergio Escalera et al.

CVPR 2024posterarXiv:2403.15194
#10215

Kernel Adaptive Convolution for Scene Text Detection via Distance Map Prediction

Jinzhi Zheng, Heng Fan, Libo Zhang

CVPR 2024poster
#10216

NeRFCodec: Neural Feature Compression Meets Neural Radiance Fields for Memory-Efficient Scene Representation

Sicheng Li, Hao Li, Yiyi Liao et al.

CVPR 2024posterarXiv:2404.02185
#10217

Contrastive Denoising Score for Text-guided Latent Diffusion Image Editing

Hyelin Nam, Gihyun Kwon, Geon Yeong Park et al.

CVPR 2024posterarXiv:2311.18608
#10218

PlatoNeRF: 3D Reconstruction in Plato's Cave via Single-View Two-Bounce Lidar

Tzofi Klinghoffer, Xiaoyu Xiang, Siddharth Somasundaram et al.

CVPR 2024posterarXiv:2312.14239
#10219

DiffLoc: Diffusion Model for Outdoor LiDAR Localization

Wen Li, Yuyang Yang, Shangshu Yu et al.

CVPR 2024poster
#10220

Text-Guided Variational Image Generation for Industrial Anomaly Detection and Segmentation

Mingyu Lee, Jongwon Choi

CVPR 2024posterarXiv:2403.06247
#10221

Tackling the Singularities at the Endpoints of Time Intervals in Diffusion Models

Pengze Zhang, Hubery Yin, Chen Li et al.

CVPR 2024highlightarXiv:2403.08381
#10222

Troika: Multi-Path Cross-Modal Traction for Compositional Zero-Shot Learning

Siteng Huang, Biao Gong, Yutong Feng et al.

CVPR 2024posterarXiv:2303.15230
#10223

Soften to Defend: Towards Adversarial Robustness via Self-Guided Label Refinement

Daiwei Yu, Zhuorong Li, Lina Wei et al.

CVPR 2024posterarXiv:2403.09101
#10224

LoCoNet: Long-Short Context Network for Active Speaker Detection

Xizi Wang, Feng Cheng, Gedas Bertasius

CVPR 2024posterarXiv:2301.08237
#10225

WinSyn: : A High Resolution Testbed for Synthetic Data

Tom Kelly, John Femiani, Peter Wonka

CVPR 2024posterarXiv:2310.08471
#10226

Retrieval-Augmented Layout Transformer for Content-Aware Layout Generation

Daichi Horita, Naoto Inoue, Kotaro Kikuchi et al.

CVPR 2024posterarXiv:2311.13602
#10227

Wired Perspectives: Multi-View Wire Art Embraces Generative AI

Zhiyu Qu, LAN YANG, Honggang Zhang et al.

CVPR 2024posterarXiv:2311.15421
#10228

Small Scale Data-Free Knowledge Distillation

He Liu, Yikai Wang, Huaping Liu et al.

CVPR 2024posterarXiv:2406.07876
#10229

Transfer CLIP for Generalizable Image Denoising

Jun Cheng, Dong Liang, Shan Tan

CVPR 2024posterarXiv:2403.15132
#10230

Validating Privacy-Preserving Face Recognition under a Minimum Assumption

Hui Zhang, Xingbo Dong, YenLungLai et al.

CVPR 2024poster
#10231

CLiC: Concept Learning in Context

Mehdi Safaee, Aryan Mikaeili, Or Patashnik et al.

CVPR 2024highlightarXiv:2311.17083
#10232

IDGuard: Robust General Identity-centric POI Proactive Defense Against Face Editing Abuse

Yunshu Dai, Jianwei Fei, Fangjun Huang

CVPR 2024poster
#10233

Feature Re-Embedding: Towards Foundation Model-Level Performance in Computational Pathology

Wenhao Tang, Fengtao ZHOU, Sheng Huang et al.

CVPR 2024posterarXiv:2402.17228
#10234

SpatialTracker: Tracking Any 2D Pixels in 3D Space

Yuxi Xiao, Qianqian Wang, Shangzhan Zhang et al.

CVPR 2024highlightarXiv:2404.04319
#10235

TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models

Zhongwei Zhang, Fuchen Long, Yingwei Pan et al.

CVPR 2024posterarXiv:2403.17005
#10236

Perceptual Assessment and Optimization of HDR Image Rendering

Peibei Cao, Rafal Mantiuk, Kede Ma

CVPR 2024posterarXiv:2310.12877
#10237

Pose-Transformed Equivariant Network for 3D Point Trajectory Prediction

Ruixuan Yu, Jian Sun

CVPR 2024poster
#10238

Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications

Yuwen Xiong, Zhiqi Li, Yuntao Chen et al.

CVPR 2024highlightarXiv:2401.06197
#10239

Multimodal Representation Learning by Alternating Unimodal Adaptation

Xiaohui Zhang, Jaehong Yoon, Mohit Bansal et al.

CVPR 2024posterarXiv:2311.10707
#10240

Compositional Video Understanding with Spatiotemporal Structure-based Transformers

Hoyeoung Yun, Jinwoo Ahn, Minseo Kim et al.

CVPR 2024poster
#10241

Make-It-Vivid: Dressing Your Animatable Biped Cartoon Characters from Text

Junshu Tang, Yanhong Zeng, Ke Fan et al.

CVPR 2024posterarXiv:2403.16897
#10242

Coherent Temporal Synthesis for Incremental Action Segmentation

Guodong Ding, Hans Golong, Angela Yao

CVPR 2024posterarXiv:2403.06102
#10243

Person in Place: Generating Associative Skeleton-Guidance Maps for Human-Object Interaction Image Editing

ChangHee Yang, ChanHee Kang, Kyeongbo Kong et al.

CVPR 2024poster
#10244

Estimating Extreme 3D Image Rotations using Cascaded Attention

Shay Dekel, Yosi Keller, Martin Čadík

CVPR 2024poster
#10245

Towards Real-World HDR Video Reconstruction: A Large-Scale Benchmark Dataset and A Two-Stage Alignment Network

Yong Shu, Liquan Shen, Xiangyu Hu et al.

CVPR 2024posterarXiv:2405.00244
#10246

Photo-SLAM: Real-time Simultaneous Localization and Photorealistic Mapping for Monocular Stereo and RGB-D Cameras

Huajian Huang, Longwei Li, Hui Cheng et al.

CVPR 2024posterarXiv:2311.16728
#10247

Attention Calibration for Disentangled Text-to-Image Personalization

Yanbing Zhang, Mengping Yang, Qin Zhou et al.

CVPR 2024posterarXiv:2403.18551
#10248

SmartMask: Context Aware High-Fidelity Mask Generation for Fine-grained Object Insertion and Layout Control

Jaskirat Singh, Jianming Zhang, Qing Liu et al.

CVPR 2024posterarXiv:2312.05039
#10249

GraCo: Granularity-Controllable Interactive Segmentation

Yian Zhao, Kehan Li, Zesen Cheng et al.

CVPR 2024highlightarXiv:2405.00587
#10250

Segment Every Out-of-Distribution Object

Wenjie Zhao, Jia Li, Xin Dong et al.

CVPR 2024posterarXiv:2311.16516
#10251

Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution

Shangchen Zhou, Peiqing Yang, Jianyi Wang et al.

CVPR 2024highlightarXiv:2312.06640
#10252

Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild

Fanghua Yu, Jinjin Gu, Zheyuan Li et al.

CVPR 2024posterarXiv:2401.13627
#10253

Masked and Shuffled Blind Spot Denoising for Real-World Images

Hamadi Chihaoui, Paolo Favaro

CVPR 2024posterarXiv:2404.09389
#10254

Open-Vocabulary Object 6D Pose Estimation

Jaime Corsetti, Davide Boscaini, Changjae Oh et al.

CVPR 2024highlightarXiv:2312.00690
#10255

Generative Region-Language Pretraining for Open-Ended Object Detection

Chuang Lin, Yi Jiang, Lizhen Qu et al.

CVPR 2024posterarXiv:2403.10191
#10256

Boosting Diffusion Models with Moving Average Sampling in Frequency Domain

Yurui Qian, Qi Cai, Yingwei Pan et al.

CVPR 2024posterarXiv:2403.17870
#10257

Discovering Syntactic Interaction Clues for Human-Object Interaction Detection

Jinguo Luo, Weihong Ren, Weibo Jiang et al.

CVPR 2024poster
#10258

Quantifying Uncertainty in Motion Prediction with Variational Bayesian Mixture

Juanwu Lu, Can Cui, Yunsheng Ma et al.

CVPR 2024posterarXiv:2404.03789
#10259

Generative Latent Coding for Ultra-Low Bitrate Image Compression

Zhaoyang Jia, Jiahao Li, Bin Li et al.

CVPR 2024posterarXiv:2512.20194
#10260

Selectively Informative Description can Reduce Undesired Embedding Entanglements in Text-to-Image Personalization

Jimyeong Kim, Jungwon Park, Wonjong Rhee

CVPR 2024posterarXiv:2403.15330
#10261

SG-PGM: Partial Graph Matching Network with Semantic Geometric Fusion for 3D Scene Graph Alignment and Its Downstream Tasks

Yaxu Xie, Alain Pagani, Didier Stricker

CVPR 2024posterarXiv:2403.19474
#10262

Back to 3D: Few-Shot 3D Keypoint Detection with Back-Projected 2D Features

Thomas Wimmer, Peter Wonka, Maks Ovsjanikov

CVPR 2024posterarXiv:2311.18113
#10263

Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

Zeyi Sun, Ye Fang, Tong Wu et al.

CVPR 2024posterarXiv:2312.03818
#10264

DemoFusion: Democratising High-Resolution Image Generation With No $$$

Ruoyi DU, Dongliang Chang, Timothy Hospedales et al.

CVPR 2024posterarXiv:2311.16973
#10265

Activity-Biometrics: Person Identification from Daily Activities

Shehreen Azad, Yogesh S. Rawat

CVPR 2024posterarXiv:2403.17360
#10266

Holoported Characters: Real-time Free-viewpoint Rendering of Humans from Sparse RGB Cameras

Ashwath Shetty, Marc Habermann, Guoxing Sun et al.

CVPR 2024posterarXiv:2312.07423
#10267

Neighbor Relations Matter in Video Scene Detection

Jiawei Tan, Hongxing Wang, Jiaxin Li et al.

CVPR 2024poster
#10268

Fast ODE-based Sampling for Diffusion Models in Around 5 Steps

Zhenyu Zhou, Defang Chen, Can Wang et al.

CVPR 2024highlightarXiv:2312.00094
#10269

Referring Image Editing: Object-level Image Editing via Referring Expressions

Chang Liu, Xiangtai Li, Henghui Ding

CVPR 2024poster
#10270

InNeRF360: Text-Guided 3D-Consistent Object Inpainting on 360-degree Neural Radiance Fields

Dongqing Wang, Tong Zhang, Alaa Abboud et al.

CVPR 2024posterarXiv:2305.15094
#10271

From-Ground-To-Objects: Coarse-to-Fine Self-supervised Monocular Depth Estimation of Dynamic Objects with Ground Contact Prior

Jaeho Moon, Juan Luis Gonzalez Bello, Byeongjun Kwon et al.

CVPR 2024posterarXiv:2312.10118
#10272

Unsupervised Blind Image Deblurring Based on Self-Enhancement

Lufei Chen, Xiangpeng Tian, Shuhua Xiong et al.

CVPR 2024poster
#10273

Mask Grounding for Referring Image Segmentation

Yong Xien Chng, Henry Zheng, Yizeng Han et al.

CVPR 2024posterarXiv:2312.12198
#10274

SignGraph: A Sign Sequence is Worth Graphs of Nodes

Shiwei Gan, Yafeng Yin, Zhiwei Jiang et al.

CVPR 2024poster
#10275

Embracing Unimodal Aleatoric Uncertainty for Robust Multimodal Fusion

Zixian Gao, Xun Jiang, Xing Xu et al.

CVPR 2024poster
#10276

DGC-GNN: Leveraging Geometry and Color Cues for Visual Descriptor-Free 2D-3D Matching

Shuzhe Wang, Juho Kannala, Daniel Barath

CVPR 2024posterarXiv:2306.12547
#10277

FreeDrag: Feature Dragging for Reliable Point-based Image Editing

Pengyang Ling, Lin Chen, Pan Zhang et al.

CVPR 2024posterarXiv:2307.04684
#10278

TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models

Yushi Huang, Ruihao Gong, Jing Liu et al.

CVPR 2024highlightarXiv:2311.16503
#10279

GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians

Shenhan Qian, Tobias Kirschstein, Liam Schoneveld et al.

CVPR 2024highlightarXiv:2312.02069
#10280

Explaining CLIP's Performance Disparities on Data from Blind/Low Vision Users

Daniela Massiceti, Camilla Longden, Agnieszka Słowik et al.

CVPR 2024posterarXiv:2311.17315
#10281

MMCert: Provable Defense against Adversarial Attacks to Multi-modal Models

Yanting Wang, Hongye Fu, Wei Zou et al.

CVPR 2024posterarXiv:2403.19080
#10282

DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data

Hanrong Ye, Dan Xu

CVPR 2024posterarXiv:2403.15389
#10283

Revisiting Spatial-Frequency Information Integration from a Hierarchical Perspective for Panchromatic and Multi-Spectral Image Fusion

Jiangtong Tan, Jie Huang, Naishan Zheng et al.

CVPR 2024poster
#10284

FineSports: A Multi-person Hierarchical Sports Video Dataset for Fine-grained Action Understanding

Jinglin Xu, Guohao Zhao, Sibo Yin et al.

CVPR 2024poster
#10285

MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning

Matteo Farina, Massimiliano Mancini, Elia Cunegatti et al.

CVPR 2024posterarXiv:2404.05621
#10286

Unleashing Unlabeled Data: A Paradigm for Cross-View Geo-Localization

Guopeng Li, Ming Qian, Gui-Song Xia

CVPR 2024posterarXiv:2403.14198
#10287

CURSOR: Scalable Mixed-Order Hypergraph Matching with CUR Decomposition

Qixuan Zheng, Ming Zhang, Hong Yan

CVPR 2024posterarXiv:2402.16594
#10288

FCS: Feature Calibration and Separation for Non-Exemplar Class Incremental Learning

Qiwei Li, Yuxin Peng, Jiahuan Zhou

CVPR 2024poster
#10289

GreedyViG: Dynamic Axial Graph Construction for Efficient Vision GNNs

Mustafa Munir, William Avery, Md Mostafijur Rahman et al.

CVPR 2024posterarXiv:2405.06849
#10290

Exploring Regional Clues in CLIP for Zero-Shot Semantic Segmentation

Yi Zhang, Meng-Hao Guo, Miao Wang et al.

CVPR 2024poster
#10291

GALA: Generating Animatable Layered Assets from a Single Scan

Taeksoo Kim, Byungjun Kim, Shunsuke Saito et al.

CVPR 2024posterarXiv:2401.12979
#10292

Improving Graph Contrastive Learning via Adaptive Positive Sampling

Jiaming Zhuo, Feiyang Qin, Can Cui et al.

CVPR 2024poster
#10293

Hearing Anything Anywhere

Mason Wang, Ryosuke Sawata, Samuel Clarke et al.

CVPR 2024posterarXiv:2406.07532
#10294

Dr2Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning

Chen Zhao, Shuming Liu, Karttikeya Mangalam et al.

CVPR 2024poster
#10295

Diffusion-EDFs: Bi-equivariant Denoising Generative Modeling on SE(3) for Visual Robotic Manipulation

Hyunwoo Ryu, Jiwoo Kim, Hyunseok An et al.

CVPR 2024highlightarXiv:2309.02685
#10296

BOTH2Hands: Inferring 3D Hands from Both Text Prompts and Body Dynamics

Wenqian Zhang, Molin Huang, Yuxuan Zhou et al.

CVPR 2024posterarXiv:2312.07937
#10297

Bayesian Exploration of Pre-trained Models for Low-shot Image Classification

Yibo Miao, Yu lei, Feng Zhou et al.

CVPR 2024posterarXiv:2404.00312
#10298

Benchmarking the Robustness of Temporal Action Detection Models Against Temporal Corruptions

Runhao Zeng, Xiaoyong Chen, Jiaming Liang et al.

CVPR 2024posterarXiv:2403.20254
#10299

RepKPU: Point Cloud Upsampling with Kernel Point Representation and Deformation

Yi Rong, Haoran Zhou, Kang Xia et al.

CVPR 2024poster
#10300

4K4D: Real-Time 4D View Synthesis at 4K Resolution

Zhen Xu, Sida Peng, Haotong Lin et al.

CVPR 2024posterarXiv:2310.11448
#10301

Context-Guided Spatio-Temporal Video Grounding

Xin Gu, Heng Fan, Yan Huang et al.

CVPR 2024posterarXiv:2401.01578
#10302

TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation

Sai Kumar Dwivedi, Yu Sun, Priyanka Patel et al.

CVPR 2024posterarXiv:2404.16752
#10303

Re-thinking Data Availability Attacks Against Deep Neural Networks

Bin Fang, Bo Li, Shuang Wu et al.

CVPR 2024poster
#10304

Logit Standardization in Knowledge Distillation

Shangquan Sun, Wenqi Ren, Jingzhi Li et al.

CVPR 2024highlightarXiv:2403.01427
#10305

A Unified Approach for Text- and Image-guided 4D Scene Generation

Yufeng Zheng, Xueting Li, Koki Nagano et al.

CVPR 2024posterarXiv:2311.16854
#10306

CONFORM: Contrast is All You Need for High-Fidelity Text-to-Image Diffusion Models

Tuna Han Salih Meral, Enis Simsar, Federico Tombari et al.

CVPR 2024posterarXiv:2312.06059
#10307

SPECAT: SPatial-spEctral Cumulative-Attention Transformer for High-Resolution Hyperspectral Image Reconstruction

Zhiyang Yao, Shuyang Liu, Xiaoyun Yuan et al.

CVPR 2024poster
#10308

Video-Based Human Pose Regression via Decoupled Space-Time Aggregation

Jijie He, Wenwu Yang

CVPR 2024posterarXiv:2403.19926
#10309

Neural Refinement for Absolute Pose Regression with Feature Synthesis

Shuai Chen, Yash Bhalgat, Xinghui Li et al.

CVPR 2024posterarXiv:2303.10087
#10310

VGGSfM: Visual Geometry Grounded Deep Structure From Motion

Jianyuan Wang, Nikita Karaev, Christian Rupprecht et al.

CVPR 2024highlight
#10311

Boosting Image Restoration via Priors from Pre-trained Models

Xiaogang Xu, Shu Kong, Tao Hu et al.

CVPR 2024posterarXiv:2403.06793
#10312

CPP-Net: Embracing Multi-Scale Feature Fusion into Deep Unfolding CP-PPA Network for Compressive Sensing

Zhen Guo, Hongping Gan

CVPR 2024poster
#10313

GenFlow: Generalizable Recurrent Flow for 6D Pose Refinement of Novel Objects

Sungphill Moon, Hyeontae Son, Dongcheol Hur et al.

CVPR 2024posterarXiv:2403.11510
#10314

PKU-DyMVHumans: A Multi-View Video Benchmark for High-Fidelity Dynamic Human Modeling

Xiaoyun Zheng, Liwei Liao, Xufeng Li et al.

CVPR 2024posterarXiv:2403.16080
#10315

DiffCast: A Unified Framework via Residual Diffusion for Precipitation Nowcasting

Demin Yu, Xutao Li, Yunming Ye et al.

CVPR 2024posterarXiv:2312.06734
#10316

MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers

Yawar Siddiqui, Antonio Alliegro, Alexey Artemov et al.

CVPR 2024highlightarXiv:2311.15475
#10317

RadarDistill: Boosting Radar-based Object Detection Performance via Knowledge Distillation from LiDAR Features

Geonho Bang, Kwangjin Choi, Jisong Kim et al.

CVPR 2024posterarXiv:2403.05061
#10318

Task-Conditioned Adaptation of Visual Features in Multi-Task Policy Learning

Pierre Marza, Laetitia Matignon, Olivier Simonin et al.

CVPR 2024posterarXiv:2402.07739
#10319

EasyDrag: Efficient Point-based Manipulation on Diffusion Models

Xingzhong Hou, Boxiao Liu, Yi Zhang et al.

CVPR 2024poster
#10320

Learned Lossless Image Compression based on Bit Plane Slicing

Zhe Zhang, Huairui Wang, Zhenzhong Chen et al.

CVPR 2024poster
#10321

BEM: Balanced and Entropy-based Mix for Long-Tailed Semi-Supervised Learning

Hongwei Zheng, Linyuan Zhou, Han Li et al.

CVPR 2024posterarXiv:2404.01179
#10322

Dancing with Still Images: Video Distillation via Static-Dynamic Disentanglement

Ziyu Wang, Yue Xu, Cewu Lu et al.

CVPR 2024posterarXiv:2312.00362
#10323

SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models

Tongtian Yue, Jie Cheng, Longteng Guo et al.

CVPR 2024posterarXiv:2403.13263
#10324

Frequency-Adaptive Dilated Convolution for Semantic Segmentation

Linwei Chen, Lin Gu, Dezhi Zheng et al.

CVPR 2024highlightarXiv:2403.05369
#10325

TexTile: A Differentiable Metric for Texture Tileability

Carlos Rodriguez-Pardo, Dan Casas, Elena Garces et al.

CVPR 2024posterarXiv:2403.12961
#10326

MatSynth: A Modern PBR Materials Dataset

Giuseppe Vecchio, Valentin Deschaintre

CVPR 2024posterarXiv:2401.06056
#10327

Image Processing GNN: Breaking Rigidity in Super-Resolution

Yuchuan Tian, Hanting Chen, Chao Xu et al.

CVPR 2024poster
#10328

ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation

Suraj Patni, Aradhye Agarwal, Chetan Arora

CVPR 2024posterarXiv:2403.18807
#10329

Bi-Causal: Group Activity Recognition via Bidirectional Causality

Youliang Zhang, Wenxuan Liu, danni xu et al.

CVPR 2024poster
#10330

Riemannian Multinomial Logistics Regression for SPD Neural Networks

Ziheng Chen, Yue Song, Gaowen Liu et al.

CVPR 2024posterarXiv:2305.11288
#10331

LED: A Large-scale Real-world Paired Dataset for Event Camera Denoising

Yuxing Duan

CVPR 2024posterarXiv:2405.19718
#10332

NoiseCollage: A Layout-Aware Text-to-Image Diffusion Model Based on Noise Cropping and Merging

Takahiro Shirakawa, Seiichi Uchida

CVPR 2024posterarXiv:2403.03485
#10333

OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM

Yutao Hu, Tianbin, Quanfeng Lu et al.

CVPR 2024posterarXiv:2402.09181
#10334

Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models

Daniel Geng, Inbum Park, Andrew Owens

CVPR 2024posterarXiv:2311.17919
#10335

Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding

Zhihao Yuan, Jinke Ren, Chun-Mei Feng et al.

CVPR 2024posterarXiv:2311.15383
#10336

Towards HDR and HFR Video from Rolling-Mixed-Bit Spikings

Yakun Chang, Yeliduosi Xiaokaiti, Yujia Liu et al.

CVPR 2024poster
#10337

Learn from View Correlation: An Anchor Enhancement Strategy for Multi-view Clustering

Suyuan Liu, KE LIANG, Zhibin Dong et al.

CVPR 2024poster
#10338

Passive Snapshot Coded Aperture Dual-Pixel RGB-D Imaging

Bhargav Ghanekar, Salman Siddique Khan, Pranav Sharma et al.

CVPR 2024posterarXiv:2402.18102
#10339

UniPAD: A Universal Pre-training Paradigm for Autonomous Driving

Honghui Yang, Sha Zhang, Di Huang et al.

CVPR 2024posterarXiv:2310.08370
#10340

ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations

Maitreya Patel, Changhoon Kim, Sheng Cheng et al.

CVPR 2024posterarXiv:2312.04655
#10341

Low-Res Leads the Way: Improving Generalization for Super-Resolution by Self-Supervised Learning

Haoyu Chen, Wenbo Li, Jinjin Gu et al.

CVPR 2024posterarXiv:2403.02601
#10342

Neural Video Compression with Feature Modulation

Jiahao Li, Bin Li, Yan Lu

CVPR 2024posterarXiv:2402.17414
#10343

Nearest is Not Dearest: Towards Practical Defense against Quantization-conditioned Backdoor Attacks

Boheng Li, Yishuo Cai, Haowei Li et al.

CVPR 2024posterarXiv:2405.12725
#10344

Dual DETRs for Multi-Label Temporal Action Detection

Yuhan Zhu, Guozhen Zhang, Jing Tan et al.

CVPR 2024posterarXiv:2404.00653
#10345

Discriminative Probing and Tuning for Text-to-Image Generation

Leigang Qu, Wenjie Wang, Yongqi Li et al.

CVPR 2024posterarXiv:2403.04321
#10346

GigaTraj: Predicting Long-term Trajectories of Hundreds of Pedestrians in Gigapixel Complex Scenes

Haozhe Lin, Chunyu Wei, Li He et al.

CVPR 2024poster
#10347

Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata

Dongsu Zhang, Francis Williams, Žan Gojčič et al.

CVPR 2024highlightarXiv:2406.08292
#10348

Comparing the Decision-Making Mechanisms by Transformers and CNNs via Explanation Methods

Mingqi Jiang, Saeed Khorram, Li Fuxin

CVPR 2024posterarXiv:2212.06872
#10349

Continual Segmentation with Disentangled Objectness Learning and Class Recognition

Yizheng Gong, Siyue Yu, Xiaoyang Wang et al.

CVPR 2024posterarXiv:2403.03477
#10350

Image Sculpting: Precise Object Editing with 3D Geometry Control

Jiraphon Yenphraphai, Xichen Pan, Sainan Liu et al.

CVPR 2024posterarXiv:2401.01702
#10351

Attribute-Guided Pedestrian Retrieval: Bridging Person Re-ID with Internal Attribute Variability

Yan Huang, Zhang Zhang, Qiang Wu et al.

CVPR 2024poster
#10352

Weakly Misalignment-free Adaptive Feature Alignment for UAVs-based Multimodal Object Detection

Chen Chen, Jiahao Qi, Xingyue Liu et al.

CVPR 2024poster
#10353

Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization

Deng Li, Aming Wu, Yaowei Wang et al.

CVPR 2024posterarXiv:2402.18447
#10354

EscherNet: A Generative Model for Scalable View Synthesis

Xin Kong, Shikun Liu, Xiaoyang Lyu et al.

CVPR 2024posterarXiv:2402.03908
#10355

MVCPS-NeuS: Multi-view Constrained Photometric Stereo for Neural Surface Reconstruction

Hiroaki Santo, Fumio Okura, Yasuyuki Matsushita

CVPR 2024poster
#10356

OHTA: One-shot Hand Avatar via Data-driven Implicit Priors

Xiaozheng Zheng, Chao Wen, Zhuo Su et al.

CVPR 2024posterarXiv:2402.18969
#10357

E-GPS: Explainable Geometry Problem Solving via Top-Down Solver and Bottom-Up Generator

Wenjun Wu, Lingling Zhang, Jun Liu et al.

CVPR 2024poster
#10358

MultiPhys: Multi-Person Physics-aware 3D Motion Estimation

Nicolás Ugrinovic, Boxiao Pan, Georgios Pavlakos et al.

CVPR 2024posterarXiv:2404.11987
#10359

LMDrive: Closed-Loop End-to-End Driving with Large Language Models

Hao Shao, Yuxuan Hu, Letian Wang et al.

CVPR 2024posterarXiv:2312.07488
#10360

ID-Blau: Image Deblurring by Implicit Diffusion-based reBLurring AUgmentation

Jia-Hao Wu, Fu-Jen Tsai, Yan-Tsung Peng et al.

CVPR 2024posterarXiv:2312.10998
#10361

GauHuman: Articulated Gaussian Splatting from Monocular Human Videos

Shoukang Hu, Tao Hu, Ziwei Liu

CVPR 2024posterarXiv:2312.02973
#10362

BEVNeXt: Reviving Dense BEV Frameworks for 3D Object Detection

Zhenxin Li, Shiyi Lan, Jose M. Alvarez et al.

CVPR 2024posterarXiv:2312.01696
#10363

AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents

Jieming Cui, Tengyu Liu, Nian Liu et al.

CVPR 2024posterarXiv:2403.12835
#10364

HumanNeRF-SE: A Simple yet Effective Approach to Animate HumanNeRF with Diverse Poses

Caoyuan Ma, Yu-Lun Liu, Zhixiang Wang et al.

CVPR 2024posterarXiv:2312.02232
#10365

SAM-6D: Segment Anything Model Meets Zero-Shot 6D Object Pose Estimation

Jiehong Lin, lihua liu, Dekun Lu et al.

CVPR 2024posterarXiv:2311.15707
#10366

SurMo: Surface-based 4D Motion Modeling for Dynamic Human Rendering

Tao Hu, Fangzhou Hong, Ziwei Liu

CVPR 2024posterarXiv:2404.01225
#10367

LeftRefill: Filling Right Canvas based on Left Reference through Generalized Text-to-Image Diffusion Model

Chenjie Cao, Yunuo Cai, Qiaole Dong et al.

CVPR 2024posterarXiv:2305.11577
#10368

ManipLLM: Embodied Multimodal Large Language Model for Object-Centric Robotic Manipulation

Xiaoqi Li, Mingxu Zhang, Yiran Geng et al.

CVPR 2024posterarXiv:2312.16217
#10369

Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation

Wenxuan Wang, Tongtian Yue, Yisi Zhang et al.

CVPR 2024poster
#10370

PanoPose: Self-supervised Relative Pose Estimation for Panoramic Images

Diantao Tu, Hainan Cui, Xianwei Zheng et al.

CVPR 2024highlight
#10371

Mask4Align: Aligned Entity Prompting with Color Masks for Multi-Entity Localization Problems

Haoquan Zhang, Ronggang Huang, Yi Xie et al.

CVPR 2024poster
#10372

Global and Local Prompts Cooperation via Optimal Transport for Federated Learning

Hongxia Li, Wei Huang, Jingya Wang et al.

CVPR 2024posterarXiv:2403.00041
#10373

VSCode: General Visual Salient and Camouflaged Object Detection with 2D Prompt Learning

Ziyang Luo, Nian Liu, Wangbo Zhao et al.

CVPR 2024posterarXiv:2311.15011
#10374

Dense Optical Tracking: Connecting the Dots

Guillaume Le Moing, Jean Ponce, Cordelia Schmid

CVPR 2024highlightarXiv:2312.00786
#10375

Multi-agent Collaborative Perception via Motion-aware Robust Communication Network

Shixin Hong, Yu LIU, Zhi Li et al.

CVPR 2024poster
#10376

Ungeneralizable Examples

Jingwen Ye, Xinchao Wang

CVPR 2024posterarXiv:2404.14016
#10377

Language-only Training of Zero-shot Composed Image Retrieval

Geonmo Gu, Sanghyuk Chun, Wonjae Kim et al.

CVPR 2024poster
#10378

Causal-CoG: A Causal-Effect Look at Context Generation for Boosting Multi-modal Language Models

Shitian Zhao, Zhuowan Li, YadongLu et al.

CVPR 2024highlightarXiv:2312.06685
#10379

Rapid Motor Adaptation for Robotic Manipulator Arms

Yichao Liang, Kevin Ellis, João F. Henriques

CVPR 2024posterarXiv:2312.04670
#10380

Instruct-Imagen: Image Generation with Multi-modal Instruction

Hexiang Hu, Kelvin C.K. Chan, Yu-Chuan Su et al.

CVPR 2024posterarXiv:2401.01952
#10381

Diffeomorphic Template Registration for Atmospheric Turbulence Mitigation

Dong Lao, Congli Wang, Alex Wong et al.

CVPR 2024highlightarXiv:2405.03662
#10382

Adapting to Length Shift: FlexiLength Network for Trajectory Prediction

Yi Xu, Yun Fu

CVPR 2024posterarXiv:2404.00742
#10383

CausalPC: Improving the Robustness of Point Cloud Classification by Causal Effect Identification

Yuanmin Huang, Mi Zhang, Daizong Ding et al.

CVPR 2024poster
#10384

LiSA: LiDAR Localization with Semantic Awareness

Bochun Yang, Zijun Li, Wen Li et al.

CVPR 2024highlight
#10385

Rethinking Prior Information Generation with CLIP for Few-Shot Segmentation

Jin Wang, Bingfeng Zhang, Jian Pang et al.

CVPR 2024posterarXiv:2405.08458
#10386

Adaptive Slot Attention: Object Discovery with Dynamic Slot Number

Ke Fan, Zechen Bai, Tianjun Xiao et al.

CVPR 2024posterarXiv:2406.09196
#10387

Learning Coupled Dictionaries from Unpaired Data for Image Super-Resolution

Longguang Wang, Juncheng Li, Yingqian Wang et al.

CVPR 2024poster
#10388

C3: High-Performance and Low-Complexity Neural Compression from a Single Image or Video

Hyunjik Kim, Matthias Bauer, Lucas Theis et al.

CVPR 2024posterarXiv:2312.02753
#10389

AttriHuman-3D: Editable 3D Human Avatar Generation with Attribute Decomposition and Indexing

Fan Yang, Tianyi Chen, XIAOSHENG HE et al.

CVPR 2024posterarXiv:2312.02209
#10390

iToF-flow-based High Frame Rate Depth Imaging

Yu Meng, Zhou Xue, Xu Chang et al.

CVPR 2024poster
#10391

Rethinking Human Motion Prediction with Symplectic Integral

Haipeng Chen, Kedi L yu, Zhenguang Liu et al.

CVPR 2024poster
#10392

Detector-Free Structure from Motion

Xingyi He, Jiaming Sun, Yifan Wang et al.

CVPR 2024posterarXiv:2306.15669
#10393

Holodeck: Language Guided Generation of 3D Embodied AI Environments

Yue Yang, Fan-Yun Sun, Luca Weihs et al.

CVPR 2024posterarXiv:2312.09067
#10394

DiVAS: Video and Audio Synchronization with Dynamic Frame Rates

Clara Maria Fernandez Labrador, Mertcan Akcay, Eitan Abecassis et al.

CVPR 2024poster
#10395

Benchmarking Audio Visual Segmentation for Long-Untrimmed Videos

Chen Liu, Peike Li, Qingtao Yu et al.

CVPR 2024poster
#10396

Inter-X: Towards Versatile Human-Human Interaction Analysis

Liang Xu, Xintao Lv, Yichao Yan et al.

CVPR 2024posterarXiv:2312.16051
#10397

Embodied Multi-Modal Agent trained by an LLM from a Parallel TextWorld

Yijun Yang, Tianyi Zhou, kanxue Li et al.

CVPR 2024posterarXiv:2311.16714
#10398

One-Shot Open Affordance Learning with Foundation Models

Gen Li, Deqing Sun, Laura Sevilla-Lara et al.

CVPR 2024posterarXiv:2311.17776
#10399

SpikingResformer: Bridging ResNet and Vision Transformer in Spiking Neural Networks

Xinyu Shi, Zecheng Hao, Zhaofei Yu

CVPR 2024posterarXiv:2403.14302
#10400

Tactile-Augmented Radiance Fields

Yiming Dou, Fengyu Yang, Yi Liu et al.

CVPR 2024posterarXiv:2405.04534