Most Cited 2024 Poster Papers

12,324 papers found • Page 38 of 62

#7401

FlashEval: Towards Fast and Accurate Evaluation of Text-to-image Diffusion Generative Models

LIn Zhao, Tianchen Zhao, Zinan Lin et al.

CVPR 2024posterarXiv:2403.16379
#7402

COLMAP-Free 3D Gaussian Splatting

Yang Fu, Sifei Liu, Amey Kulkarni et al.

CVPR 2024highlightarXiv:2312.07504
#7403

Personalized Residuals for Concept-Driven Text-to-Image Generation

Cusuh Ham, Matthew Fisher, James Hays et al.

CVPR 2024posterarXiv:2405.12978
#7404

Forecasting of 3D Whole-body Human Poses with Grasping Objects

yan haitao, Qiongjie Cui, Jiexin Xie et al.

CVPR 2024poster
#7405

Edge-Aware 3D Instance Segmentation Network with Intelligent Semantic Prior

Wonseok Roh, Hwanhee Jung, Giljoo Nam et al.

CVPR 2024poster
#7406

RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models

Ozgur Kara, Bariscan Kurtkaya, Hidir Yesiltepe et al.

CVPR 2024highlightarXiv:2312.04524
#7407

Generalizable Novel-View Synthesis using a Stereo Camera

Haechan Lee, Wonjoon Jin, Seung-Hwan Baek et al.

CVPR 2024posterarXiv:2404.13541
#7408

Don’t Drop Your Samples! Coherence-Aware Training Benefits Conditional Diffusion

Nicolas Dufour, Victor Besnier, Vicky Kalogeiton et al.

CVPR 2024highlight
#7409

Action-slot: Visual Action-centric Representations for Multi-label Atomic Activity Recognition in Traffic Scenes

Chi-Hsi Kung, 書緯 呂, Yi-Hsuan Tsai et al.

CVPR 2024posterarXiv:2311.17948
#7410

Diff-BGM: A Diffusion Model for Video Background Music Generation

Sizhe Li, Yiming Qin, Minghang Zheng et al.

CVPR 2024posterarXiv:2405.11913
#7411

Shadow-Enlightened Image Outpainting

Hang Yu, Ruilin Li, Shaorong Xie et al.

CVPR 2024poster
#7412

Specularity Factorization for Low-Light Enhancement

Saurabh Saini, P. J. Narayanan

CVPR 2024posterarXiv:2404.01998
#7413

Shallow-Deep Collaborative Learning for Unsupervised Visible-Infrared Person Re-Identification

Bin Yang, Jun Chen, Mang Ye

CVPR 2024poster
#7414

Attack To Defend: Exploiting Adversarial Attacks for Detecting Poisoned Models

Samar Fares, Karthik Nandakumar

CVPR 2024poster
#7415

LiDAR-Net: A Real-scanned 3D Point Cloud Dataset for Indoor Scenes

Yanwen Guo, Yuanqi Li, Dayong Ren et al.

CVPR 2024poster
#7416

Non-autoregressive Sequence-to-Sequence Vision-Language Models

Kunyu Shi, Qi Dong, Luis Goncalves et al.

CVPR 2024posterarXiv:2403.02249
#7417

No More Ambiguity in 360° Room Layout via Bi-Layout Estimation

Yu-Ju Tsai, Jin-Cheng Jhang, JINGJING ZHENG et al.

CVPR 2024posterarXiv:2404.09993
#7418

HUNTER: Unsupervised Human-centric 3D Detection via Transferring Knowledge from Synthetic Instances to Real Scenes

Yichen Yao, Zimo Jiang, YUJING SUN et al.

CVPR 2024posterarXiv:2403.02769
#7419

3D LiDAR Mapping in Dynamic Environments using a 4D Implicit Neural Representation

Xingguang Zhong, Yue Pan, Cyrill Stachniss et al.

CVPR 2024posterarXiv:2405.03388
#7420

See Say and Segment: Teaching LMMs to Overcome False Premises

Tsung-Han Wu, Giscard Biamby, David Chan et al.

CVPR 2024poster
#7421

Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation

Li Hu

CVPR 2024posterarXiv:2311.17117
#7422

Instance-Adaptive and Geometric-Aware Keypoint Learning for Category-Level 6D Object Pose Estimation

Xiao Lin, Wenfei Yang, Yuan Gao et al.

CVPR 2024posterarXiv:2403.19527
#7423

PostureHMR: Posture Transformation for 3D Human Mesh Recovery

Yu-Pei Song, Xiao WU, Zhaoquan Yuan et al.

CVPR 2024poster
#7424

WANDR: Intention-guided Human Motion Generation

Markos Diomataris, Nikos Athanasiou, Omid Taheri et al.

CVPR 2024posterarXiv:2404.15383
#7425

WWW: A Unified Framework for Explaining What Where and Why of Neural Networks by Interpretation of Neuron Concepts

Yong Hyun Ahn, Hyeon Bae Kim, Seong Tae Kim

CVPR 2024posterarXiv:2402.18956
#7426

SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation

Yuxuan Zhang, Yiren Song, Jiaming Liu et al.

CVPR 2024posterarXiv:2312.16272
#7427

Denoising Point Clouds in Latent Space via Graph Convolution and Invertible Neural Network

Aihua Mao, Biao Yan, Zijing Ma et al.

CVPR 2024poster
#7428

SwiftBrush: One-Step Text-to-Image Diffusion Model with Variational Score Distillation

Thuan Nguyen, Anh Tran

CVPR 2024posterarXiv:2312.05239
#7429

Looking 3D: Anomaly Detection with 2D-3D Alignment

Ankan Kumar Bhunia, Changjian Li, Hakan Bilen

CVPR 2024posterarXiv:2406.19393
#7430

EventPS: Real-Time Photometric Stereo Using an Event Camera

Bohan Yu, Jieji Ren, Jin Han et al.

CVPR 2024poster
#7431

Circuit Design and Efficient Simulation of Quantum Inner Product and Empirical Studies of Its Effect on Near-Term Hybrid Quantum-Classic Machine Learning

Hao Xiong, Yehui Tang, Xinyu Ye et al.

CVPR 2024poster
#7432

PSDPM: Prototype-based Secondary Discriminative Pixels Mining for Weakly Supervised Semantic Segmentation

Xinqiao Zhao, Ziqian Yang, Tianhong Dai et al.

CVPR 2024poster
#7433

Towards 3D Vision with Low-Cost Single-Photon Cameras

Fangzhou Mu, Carter Sifferman, Sacha Jungerman et al.

CVPR 2024posterarXiv:2403.17801
#7434

On Train-Test Class Overlap and Detection for Image Retrieval

Chull Hwan Song, Jooyoung Yoon, Taebaek Hwang et al.

CVPR 2024posterarXiv:2404.01524
#7435

ViTamin: Designing Scalable Vision Models in the Vision-Language Era

Jieneng Chen, Qihang Yu, Xiaohui Shen et al.

CVPR 2024posterarXiv:2404.02132
#7436

PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models

Yiming Zhang, Zhening Xing, Yanhong Zeng et al.

CVPR 2024posterarXiv:2312.13964
#7437

DemoCaricature: Democratising Caricature Generation with a Rough Sketch

Dar-Yen Chen, Ayan Kumar Bhunia, Subhadeep Koley et al.

CVPR 2024posterarXiv:2312.04364
#7438

Language-driven Object Fusion into Neural Radiance Fields with Pose-Conditioned Dataset Updates

Ka Chun SHUM, Jaeyeon Kim, Binh-Son Hua et al.

CVPR 2024posterarXiv:2309.11281
#7439

Unlocking the Potential of Pre-trained Vision Transformers for Few-Shot Semantic Segmentation through Relationship Descriptors

Ziqin Zhou, Hai-Ming Xu, Yangyang Shu et al.

CVPR 2024poster
#7440

Relightful Harmonization: Lighting-aware Portrait Background Replacement

Mengwei Ren, Wei Xiong, Jae Shin Yoon et al.

CVPR 2024posterarXiv:2312.06886
#7441

Taming the Tail in Class-Conditional GANs: Knowledge Sharing via Unconditional Training at Lower Resolutions

Saeed Khorram, Mingqi Jiang, Mohamad Shahbazi et al.

CVPR 2024posterarXiv:2402.17065
#7442

MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation

Yanhui Wang, Jianmin Bao, Wenming Weng et al.

CVPR 2024highlightarXiv:2311.18829
#7443

LightOctree: Lightweight 3D Spatially-Coherent Indoor Lighting Estimation

Xuecan Wang, Shibang Xiao, Xiaohui Liang

CVPR 2024posterarXiv:2404.03925
#7444

Bi-level Learning of Task-Specific Decoders for Joint Registration and One-Shot Medical Image Segmentation

Xin Fan, Xiaolin Wang, Jiaxin Gao et al.

CVPR 2024poster
#7445

FairCLIP: Harnessing Fairness in Vision-Language Learning

Yan Luo, MIN SHI, Muhammad Osama Khan et al.

CVPR 2024posterarXiv:2403.19949
#7446

ArGue: Attribute-Guided Prompt Tuning for Vision-Language Models

Xinyu Tian, Shu Zou, Zhaoyuan Yang et al.

CVPR 2024posterarXiv:2311.16494
#7447

MS-MANO: Enabling Hand Pose Tracking with Biomechanical Constraints

Pengfei Xie, Wenqiang Xu, Tutian Tang et al.

CVPR 2024posterarXiv:2404.10227
#7448

CGI-DM: Digital Copyright Authentication for Diffusion Models via Contrasting Gradient Inversion

Xiaoyu Wu, Yang Hua, Chumeng Liang et al.

CVPR 2024posterarXiv:2403.11162
#7449

Separate and Conquer: Decoupling Co-occurrence via Decomposition and Representation for Weakly Supervised Semantic Segmentation

Zhiwei Yang, Kexue Fu, Minghong Duan et al.

CVPR 2024posterarXiv:2402.18467
#7450

Telling Left from Right: Identifying Geometry-Aware Semantic Correspondence

Junyi Zhang, Charles Herrmann, Junhwa Hur et al.

CVPR 2024posterarXiv:2311.17034
#7451

MiKASA: Multi-Key-Anchor & Scene-Aware Transformer for 3D Visual Grounding

Chun-Peng Chang, Shaoxiang Wang, Alain Pagani et al.

CVPR 2024posterarXiv:2403.03077
#7452

RTracker: Recoverable Tracking via PN Tree Structured Memory

Yuqing Huang, Xin Li, Zikun Zhou et al.

CVPR 2024posterarXiv:2403.19242
#7453

Efficient Solution of Point-Line Absolute Pose

Petr Hruby, Timothy Duff, Marc Pollefeys

CVPR 2024highlightarXiv:2404.16552
#7454

SPIN: Simultaneous Perception Interaction and Navigation

Shagun Uppal, Ananye Agarwal, Haoyu Xiong et al.

CVPR 2024posterarXiv:2405.07991
#7455

CAMixerSR: Only Details Need More "Attention"

Yan Wang, Yi Liu, Shijie Zhao et al.

CVPR 2024poster
#7456

MarkovGen: Structured Prediction for Efficient Text-to-Image Generation

Sadeep Jayasumana, Daniel Glasner, Srikumar Ramalingam et al.

CVPR 2024posterarXiv:2308.10997
#7457

Constrained Layout Generation with Factor Graphs

Mohammed Haroon Dupty, Yanfei Dong, Sicong Leng et al.

CVPR 2024posterarXiv:2404.00385
#7458

Neural Implicit Morphing of Face Images

Guilherme Schardong, Tiago Novello, Hallison Paz et al.

CVPR 2024posterarXiv:2308.13888
#7459

Kernel Adaptive Convolution for Scene Text Detection via Distance Map Prediction

Jinzhi Zheng, Heng Fan, Libo Zhang

CVPR 2024poster
#7460

PlatoNeRF: 3D Reconstruction in Plato's Cave via Single-View Two-Bounce Lidar

Tzofi Klinghoffer, Xiaoyu Xiang, Siddharth Somasundaram et al.

CVPR 2024posterarXiv:2312.14239
#7461

Tackling the Singularities at the Endpoints of Time Intervals in Diffusion Models

Pengze Zhang, Hubery Yin, Chen Li et al.

CVPR 2024highlightarXiv:2403.08381
#7462

Troika: Multi-Path Cross-Modal Traction for Compositional Zero-Shot Learning

Siteng Huang, Biao Gong, Yutong Feng et al.

CVPR 2024posterarXiv:2303.15230
#7463

LoCoNet: Long-Short Context Network for Active Speaker Detection

Xizi Wang, Feng Cheng, Gedas Bertasius

CVPR 2024posterarXiv:2301.08237
#7464

WinSyn: : A High Resolution Testbed for Synthetic Data

Tom Kelly, John Femiani, Peter Wonka

CVPR 2024posterarXiv:2310.08471
#7465

SpatialTracker: Tracking Any 2D Pixels in 3D Space

Yuxi Xiao, Qianqian Wang, Shangzhan Zhang et al.

CVPR 2024highlightarXiv:2404.04319
#7466

Multimodal Representation Learning by Alternating Unimodal Adaptation

Xiaohui Zhang, Jaehong Yoon, Mohit Bansal et al.

CVPR 2024posterarXiv:2311.10707
#7467

Attention Calibration for Disentangled Text-to-Image Personalization

Yanbing Zhang, Mengping Yang, Qin Zhou et al.

CVPR 2024posterarXiv:2403.18551
#7468

Segment Every Out-of-Distribution Object

Wenjie Zhao, Jia Li, Xin Dong et al.

CVPR 2024posterarXiv:2311.16516
#7469

Open-Vocabulary Object 6D Pose Estimation

Jaime Corsetti, Davide Boscaini, Changjae Oh et al.

CVPR 2024highlightarXiv:2312.00690
#7470

Quantifying Uncertainty in Motion Prediction with Variational Bayesian Mixture

Juanwu Lu, Can Cui, Yunsheng Ma et al.

CVPR 2024posterarXiv:2404.03789
#7471

SG-PGM: Partial Graph Matching Network with Semantic Geometric Fusion for 3D Scene Graph Alignment and Its Downstream Tasks

Yaxu Xie, Alain Pagani, Didier Stricker

CVPR 2024posterarXiv:2403.19474
#7472

Fast ODE-based Sampling for Diffusion Models in Around 5 Steps

Zhenyu Zhou, Defang Chen, Can Wang et al.

CVPR 2024highlightarXiv:2312.00094
#7473

From-Ground-To-Objects: Coarse-to-Fine Self-supervised Monocular Depth Estimation of Dynamic Objects with Ground Contact Prior

Jaeho Moon, Juan Luis Gonzalez Bello, Byeongjun Kwon et al.

CVPR 2024posterarXiv:2312.10118
#7474

MMCert: Provable Defense against Adversarial Attacks to Multi-modal Models

Yanting Wang, Hongye Fu, Wei Zou et al.

CVPR 2024posterarXiv:2403.19080
#7475

Unleashing Unlabeled Data: A Paradigm for Cross-View Geo-Localization

Guopeng Li, Ming Qian, Gui-Song Xia

CVPR 2024posterarXiv:2403.14198
#7476

Hearing Anything Anywhere

Mason Wang, Ryosuke Sawata, Samuel Clarke et al.

CVPR 2024posterarXiv:2406.07532
#7477

Dr2Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning

Chen Zhao, Shuming Liu, Karttikeya Mangalam et al.

CVPR 2024poster
#7478

Video-Based Human Pose Regression via Decoupled Space-Time Aggregation

Jijie He, Wenwu Yang

CVPR 2024posterarXiv:2403.19926
#7479

Boosting Image Restoration via Priors from Pre-trained Models

Xiaogang Xu, Shu Kong, Tao Hu et al.

CVPR 2024posterarXiv:2403.06793
#7480

RadarDistill: Boosting Radar-based Object Detection Performance via Knowledge Distillation from LiDAR Features

Geonho Bang, Kwangjin Choi, Jisong Kim et al.

CVPR 2024posterarXiv:2403.05061
#7481

EasyDrag: Efficient Point-based Manipulation on Diffusion Models

Xingzhong Hou, Boxiao Liu, Yi Zhang et al.

CVPR 2024poster
#7482

Learned Lossless Image Compression based on Bit Plane Slicing

Zhe Zhang, Huairui Wang, Zhenzhong Chen et al.

CVPR 2024poster
#7483

BEM: Balanced and Entropy-based Mix for Long-Tailed Semi-Supervised Learning

Hongwei Zheng, Linyuan Zhou, Han Li et al.

CVPR 2024posterarXiv:2404.01179
#7484

TexTile: A Differentiable Metric for Texture Tileability

Carlos Rodriguez-Pardo, Dan Casas, Elena Garces et al.

CVPR 2024posterarXiv:2403.12961
#7485

MatSynth: A Modern PBR Materials Dataset

Giuseppe Vecchio, Valentin Deschaintre

CVPR 2024posterarXiv:2401.06056
#7486

LED: A Large-scale Real-world Paired Dataset for Event Camera Denoising

Yuxing Duan

CVPR 2024posterarXiv:2405.19718
#7487

Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models

Daniel Geng, Inbum Park, Andrew Owens

CVPR 2024posterarXiv:2311.17919
#7488

Learn from View Correlation: An Anchor Enhancement Strategy for Multi-view Clustering

Suyuan Liu, KE LIANG, Zhibin Dong et al.

CVPR 2024poster
#7489

Passive Snapshot Coded Aperture Dual-Pixel RGB-D Imaging

Bhargav Ghanekar, Salman Siddique Khan, Pranav Sharma et al.

CVPR 2024posterarXiv:2402.18102
#7490

ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations

Maitreya Patel, Changhoon Kim, Sheng Cheng et al.

CVPR 2024posterarXiv:2312.04655
#7491

Low-Res Leads the Way: Improving Generalization for Super-Resolution by Self-Supervised Learning

Haoyu Chen, Wenbo Li, Jinjin Gu et al.

CVPR 2024posterarXiv:2403.02601
#7492

Dual DETRs for Multi-Label Temporal Action Detection

Yuhan Zhu, Guozhen Zhang, Jing Tan et al.

CVPR 2024posterarXiv:2404.00653
#7493

GigaTraj: Predicting Long-term Trajectories of Hundreds of Pedestrians in Gigapixel Complex Scenes

Haozhe Lin, Chunyu Wei, Li He et al.

CVPR 2024poster
#7494

Image Sculpting: Precise Object Editing with 3D Geometry Control

Jiraphon Yenphraphai, Xichen Pan, Sainan Liu et al.

CVPR 2024posterarXiv:2401.01702
#7495

ID-Blau: Image Deblurring by Implicit Diffusion-based reBLurring AUgmentation

Jia-Hao Wu, Fu-Jen Tsai, Yan-Tsung Peng et al.

CVPR 2024posterarXiv:2312.10998
#7496

HumanNeRF-SE: A Simple yet Effective Approach to Animate HumanNeRF with Diverse Poses

Caoyuan Ma, Yu-Lun Liu, Zhixiang Wang et al.

CVPR 2024posterarXiv:2312.02232
#7497

LeftRefill: Filling Right Canvas based on Left Reference through Generalized Text-to-Image Diffusion Model

Chenjie Cao, Yunuo Cai, Qiaole Dong et al.

CVPR 2024posterarXiv:2305.11577
#7498

Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation

Wenxuan Wang, Tongtian Yue, Yisi Zhang et al.

CVPR 2024poster
#7499

Ungeneralizable Examples

Jingwen Ye, Xinchao Wang

CVPR 2024posterarXiv:2404.14016
#7500

Language-only Training of Zero-shot Composed Image Retrieval

Geonmo Gu, Sanghyuk Chun, Wonjae Kim et al.

CVPR 2024poster
#7501

Causal-CoG: A Causal-Effect Look at Context Generation for Boosting Multi-modal Language Models

Shitian Zhao, Zhuowan Li, YadongLu et al.

CVPR 2024highlightarXiv:2312.06685
#7502

Instruct-Imagen: Image Generation with Multi-modal Instruction

Hexiang Hu, Kelvin C.K. Chan, Yu-Chuan Su et al.

CVPR 2024posterarXiv:2401.01952
#7503

Diffeomorphic Template Registration for Atmospheric Turbulence Mitigation

Dong Lao, Congli Wang, Alex Wong et al.

CVPR 2024highlightarXiv:2405.03662
#7504

Holodeck: Language Guided Generation of 3D Embodied AI Environments

Yue Yang, Fan-Yun Sun, Luca Weihs et al.

CVPR 2024posterarXiv:2312.09067
#7505

SpikingResformer: Bridging ResNet and Vision Transformer in Spiking Neural Networks

Xinyu Shi, Zecheng Hao, Zhaofei Yu

CVPR 2024posterarXiv:2403.14302
#7506

Tactile-Augmented Radiance Fields

Yiming Dou, Fengyu Yang, Yi Liu et al.

CVPR 2024posterarXiv:2405.04534
#7507

KVQ: Kwai Video Quality Assessment for Short-form Videos

Yiting Lu, Xin Li, Yajing Pei et al.

CVPR 2024posterarXiv:2402.07220
#7508

Purified and Unified Steganographic Network

GuoBiao Li, Sheng Li, Zicong Luo et al.

CVPR 2024posterarXiv:2402.17210
#7509

Selective Hourglass Mapping for Universal Image Restoration Based on Diffusion Model

Dian Zheng, Xiao-Ming Wu, Shuzhou Yang et al.

CVPR 2024posterarXiv:2403.11157
#7510

L4D-Track: Language-to-4D Modeling Towards 6-DoF Tracking and Shape Reconstruction in 3D Point Cloud Stream

Jingtao Sun, Yaonan Wang, Mingtao Feng et al.

CVPR 2024poster
#7511

MAPSeg: Unified Unsupervised Domain Adaptation for Heterogeneous Medical Image Segmentation Based on 3D Masked Autoencoding and Pseudo-Labeling

Xuzhe Zhang, Yuhao Wu, Elsa Angelini et al.

CVPR 2024posterarXiv:2303.09373
#7512

Traffic Scene Parsing through the TSP6K Dataset

Peng-Tao Jiang, Yuqi Yang, Yang Cao et al.

CVPR 2024posterarXiv:2303.02835
#7513

DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

Muyang Li, Tianle Cai, Jiaxin Cao et al.

CVPR 2024highlightarXiv:2402.19481
#7514

MoReVQA: Exploring Modular Reasoning Models for Video Question Answering

Juhong Min, Shyamal Buch, Arsha Nagrani et al.

CVPR 2024posterarXiv:2404.06511
#7515

DIMAT: Decentralized Iterative Merging-And-Training for Deep Learning Models

Nastaran Saadati, Minh Pham, Nasla Saleem et al.

CVPR 2024posterarXiv:2404.08079
#7516

Text-Driven Image Editing via Learnable Regions

Yuanze Lin, Yi-Wen Chen, Yi-Hsuan Tsai et al.

CVPR 2024posterarXiv:2311.16432
#7517

Self-Supervised Representation Learning from Arbitrary Scenarios

Zhaowen Li, Yousong Zhu, Zhiyang Chen et al.

CVPR 2024poster
#7518

The Neglected Tails in Vision-Language Models

Shubham Parashar, Tian Liu, Zhiqiu Lin et al.

CVPR 2024posterarXiv:2401.12425
#7519

SODA: Bottleneck Diffusion Models for Representation Learning

Drew Hudson, Daniel Zoran, Mateusz Malinowski et al.

CVPR 2024posterarXiv:2311.17901
#7520

Enhancing the Power of OOD Detection via Sample-Aware Model Selection

Feng Xue, Zi He, Yuan Zhang et al.

CVPR 2024poster
#7521

Towards More Unified In-context Visual Understanding

Dianmo Sheng, Dongdong Chen, Zhentao Tan et al.

CVPR 2024posterarXiv:2312.02520
#7522

Fourier-basis Functions to Bridge Augmentation Gap: Rethinking Frequency Augmentation in Image Classification

Mei Vaish, Shunxin Wang, Nicola Strisciuglio

CVPR 2024posterarXiv:2403.01944
#7523

Molecular Data Programming: Towards Molecule Pseudo-labeling with Systematic Weak Supervision

Xin Juan, Kaixiong Zhou, Ninghao Liu et al.

CVPR 2024poster
#7524

Hyper-MD: Mesh Denoising with Customized Parameters Aware of Noise Intensity and Geometric Characteristics

Xingtao Wang, Hongliang Wei, Xiaopeng Fan et al.

CVPR 2024poster
#7525

Strong Transferable Adversarial Attacks via Ensembled Asymptotically Normal Distribution Learning

Zhengwei Fang, Rui Wang, Tao Huang et al.

CVPR 2024highlightarXiv:2209.11964
#7526

An Empirical Study of Scaling Law for Scene Text Recognition

Miao Rang, Zhenni Bi, Chuanjian Liu et al.

CVPR 2024poster
#7527

Differentiable Neural Surface Refinement for Modeling Transparent Objects

Weijian Deng, Dylan Campbell, Chunyi Sun et al.

CVPR 2024poster
#7528

Tune-An-Ellipse: CLIP Has Potential to Find What You Want

Jinheng Xie, Songhe Deng, Bing Li et al.

CVPR 2024highlight
#7529

OmniLocalRF: Omnidirectional Local Radiance Fields from Dynamic Videos

Dongyoung Choi, Hyeonjoong Jang, Min H. Kim

CVPR 2024posterarXiv:2404.00676
#7530

Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model

Kai Yang, Jian Tao, Jiafei Lyu et al.

CVPR 2024posterarXiv:2311.13231
#7531

XFeat: Accelerated Features for Lightweight Image Matching

Guilherme Potje, Felipe Cadar, André Araujo et al.

CVPR 2024posterarXiv:2404.19174
#7532

Looking Similar Sounding Different: Leveraging Counterfactual Cross-Modal Pairs for Audiovisual Representation Learning

Nikhil Singh, Chih-Wei Wu, Iroro Orife et al.

CVPR 2024posterarXiv:2304.05600
#7533

Taming Self-Training for Open-Vocabulary Object Detection

Shiyu Zhao, Samuel Schulter, Long Zhao et al.

CVPR 2024posterarXiv:2308.06412
#7534

FACT: Frame-Action Cross-Attention Temporal Modeling for Efficient Action Segmentation

Zijia Lu, Ehsan Elhamifar

CVPR 2024poster
#7535

GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians

Liangxiao Hu, Hongwen Zhang, Yuxiang Zhang et al.

CVPR 2024posterarXiv:2312.02134
#7536

FlowerFormer: Empowering Neural Architecture Encoding using a Flow-aware Graph Transformer

Dongyeong Hwang, Hyunju Kim, Sunwoo Kim et al.

CVPR 2024posterarXiv:2403.12821
#7537

Text2QR: Harmonizing Aesthetic Customization and Scanning Robustness for Text-Guided QR Code Generation

Guangyang Wu, Xiaohong Liu, Jun Jia et al.

CVPR 2024posterarXiv:2403.06452
#7538

ProMark: Proactive Diffusion Watermarking for Causal Attribution

Vishal Asnani, John Collomosse, Tu Bui et al.

CVPR 2024posterarXiv:2403.09914
#7539

DiffForensics: Leveraging Diffusion Prior to Image Forgery Detection and Localization

Zeqin Yu, Jiangqun Ni, Yuzhen Lin et al.

CVPR 2024poster
#7540

Neural Directional Encoding for Efficient and Accurate View-Dependent Appearance Modeling

Liwen Wu, Sai Bi, Zexiang Xu et al.

CVPR 2024highlightarXiv:2405.14847
#7541

SAFDNet: A Simple and Effective Network for Fully Sparse 3D Object Detection

Gang Zhang, Chen Junnan, Guohuan Gao et al.

CVPR 2024posterarXiv:2403.05817
#7542

Sheared Backpropagation for Fine-tuning Foundation Models

Zhiyuan Yu, Li Shen, Liang Ding et al.

CVPR 2024poster
#7543

On the Content Bias in Fréchet Video Distance

Songwei Ge, Aniruddha Mahapatra, Gaurav Parmar et al.

CVPR 2024posterarXiv:2404.12391
#7544

Multiview Aerial Visual RECognition (MAVREC): Can Multi-view Improve Aerial Visual Perception?

Aritra Dutta, Srijan Das, Jacob Nielsen et al.

CVPR 2024posterarXiv:2312.04548
#7545

CPGA: Coding Priors-Guided Aggregation Network for Compressed Video Quality Enhancement

Qiang Zhu, Jinhua Hao, Yukang Ding et al.

CVPR 2024posterarXiv:2403.10362
#7546

Video2Game: Real-time Interactive Realistic and Browser-Compatible Environment from a Single Video

Hongchi Xia, Chih-Hao Lin, Wei-Chiu Ma et al.

CVPR 2024poster
#7547

Identifying Important Group of Pixels using Interactions

Kosuke Sumiyasu, Kazuhiko Kawamoto, Hiroshi Kera

CVPR 2024posterarXiv:2401.03785
#7548

Are Conventional SNNs Really Efficient? A Perspective from Network Quantization

Guobin Shen, Dongcheng Zhao, Tenglong Li et al.

CVPR 2024highlight
#7549

CapHuman: Capture Your Moments in Parallel Universes

Chao Liang, Fan Ma, Linchao Zhu et al.

CVPR 2024posterarXiv:2402.00627
#7550

Self-Discovering Interpretable Diffusion Latent Directions for Responsible Text-to-Image Generation

Hang Li, Chengzhi Shen, Philip H.S. Torr et al.

CVPR 2024posterarXiv:2311.17216
#7551

Towards Automatic Power Battery Detection: New Challenge Benchmark Dataset and Baseline

Xiaoqi Zhao, Youwei Pang, Zhenyu Chen et al.

CVPR 2024posterarXiv:2312.02528
#7552

Infrared Small Target Detection with Scale and Location Sensitivity

Qiankun Liu, Rui Liu, Bolun Zheng et al.

CVPR 2024posterarXiv:2403.19366
#7553

SleepVST: Sleep Staging from Near-Infrared Video Signals using Pre-Trained Transformers

Jonathan F. Carter, Joao Jorge, Oliver Gibson et al.

CVPR 2024highlightarXiv:2404.03831
#7554

Empowering Resampling Operation for Ultra-High-Definition Image Enhancement with Model-Aware Guidance

Yu, Jie Huang, Li et al.

CVPR 2024poster
#7555

From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models

Rongjie Li, Songyang Zhang, Dahua Lin et al.

CVPR 2024posterarXiv:2404.00906
#7556

OAKINK2: A Dataset of Bimanual Hands-Object Manipulation in Complex Task Completion

Xinyu Zhan, Lixin Yang, Yifei Zhao et al.

CVPR 2024posterarXiv:2403.19417
#7557

Visual Point Cloud Forecasting enables Scalable Autonomous Driving

Zetong Yang, Li Chen, Yanan Sun et al.

CVPR 2024highlightarXiv:2312.17655
#7558

HumanNorm: Learning Normal Diffusion Model for High-quality and Realistic 3D Human Generation

Xin Huang, Ruizhi Shao, Qi Zhang et al.

CVPR 2024posterarXiv:2310.01406
#7559

HOLD: Category-agnostic 3D Reconstruction of Interacting Hands and Objects from Video

Zicong Fan, Maria Parelli, Maria Kadoglou et al.

CVPR 2024highlightarXiv:2311.18448
#7560

AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving

Mingfu Liang, Jong-Chyi Su, Samuel Schulter et al.

CVPR 2024posterarXiv:2403.17373
#7561

Attentive Illumination Decomposition Model for Multi-Illuminant White Balancing

Dongyoung Kim, Jinwoo Kim, Junsang Yu et al.

CVPR 2024posterarXiv:2402.18277
#7562

OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies

Lingdong Kong, Youquan Liu, Lai Xing Ng et al.

CVPR 2024highlightarXiv:2405.05259
#7563

Z*: Zero-shot Style Transfer via Attention Reweighting

Yingying Deng, Xiangyu He, Fan Tang et al.

CVPR 2024poster
#7564

G-HOP: Generative Hand-Object Prior for Interaction Reconstruction and Grasp Synthesis

Yufei Ye, Abhinav Gupta, Kris Kitani et al.

CVPR 2024posterarXiv:2404.12383
#7565

ConCon-Chi: Concept-Context Chimera Benchmark for Personalized Vision-Language Tasks

Andrea Rosasco, Stefano Berti, Giulia Pasquale et al.

CVPR 2024poster
#7566

Habitat Synthetic Scenes Dataset (HSSD-200): An Analysis of 3D Scene Scale and Realism Tradeoffs for ObjectGoal Navigation

Mukul Khanna, Yongsen Mao, Hanxiao Jiang et al.

CVPR 2024posterarXiv:2306.11290
#7567

KeyPoint Relative Position Encoding for Face Recognition

Minchul Kim, Feng Liu, Yiyang Su et al.

CVPR 2024posterarXiv:2403.14852
#7568

QDFormer: Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition

Xiang Li, Jinglu Wang, Xiaohao Xu et al.

CVPR 2024posterarXiv:2310.00132
#7569

From a Bird's Eye View to See: Joint Camera and Subject Registration without the Camera Calibration

Zekun Qian, Ruize Han, Wei Feng et al.

CVPR 2024posterarXiv:2212.09298
#7570

Joint2Human: High-Quality 3D Human Generation via Compact Spherical Embedding of 3D Joints

Muxin Zhang, Qiao Feng, Zhuo Su et al.

CVPR 2024posterarXiv:2312.08591
#7571

FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis

Feng Liang, Bichen Wu, Jialiang Wang et al.

CVPR 2024highlightarXiv:2312.17681
#7572

Accept the Modality Gap: An Exploration in the Hyperbolic Space

Sameera Ramasinghe, Violetta Shevchenko, Gil Avraham et al.

CVPR 2024highlight
#7573

Random Entangled Tokens for Adversarially Robust Vision Transformer

Huihui Gong, Minjing Dong, Siqi Ma et al.

CVPR 2024poster
#7574

OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition

Tongjia Chen, Hongshan Yu, Zhengeng Yang et al.

CVPR 2024posterarXiv:2312.00096
#7575

Continuous Pose for Monocular Cameras in Neural Implicit Representation

Qi Ma, Danda Paudel, Ajad Chhatkuli et al.

CVPR 2024posterarXiv:2311.17119
#7576

BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive Learning

Siyuan Liang, Mingli Zhu, Aishan Liu et al.

CVPR 2024highlightarXiv:2311.12075
#7577

A Pedestrian is Worth One Prompt: Towards Language Guidance Person Re-Identification

Zexian Yang, Dayan Wu, Chenming Wu et al.

CVPR 2024highlight
#7578

From SAM to CAMs: Exploring Segment Anything Model for Weakly Supervised Semantic Segmentation

Hyeokjun Kweon, Kuk-Jin Yoon

CVPR 2024poster
#7579

Vlogger: Make Your Dream A Vlog

Shaobin Zhuang, Kunchang Li, Xinyuan Chen et al.

CVPR 2024posterarXiv:2401.09414
#7580

EpiDiff: Enhancing Multi-View Synthesis via Localized Epipolar-Constrained Diffusion

Zehuan Huang, Hao Wen, Junting Dong et al.

CVPR 2024posterarXiv:2312.06725
#7581

IPoD: Implicit Field Learning with Point Diffusion for Generalizable 3D Object Reconstruction from Single RGB-D Images

Yushuang Wu, Luyue Shi, Junhao Cai et al.

CVPR 2024highlightarXiv:2404.00269
#7582

SVGDreamer: Text Guided SVG Generation with Diffusion Model

XiMing Xing, Chuang Wang, Haitao Zhou et al.

CVPR 2024posterarXiv:2312.16476
#7583

Dual Prototype Attention for Unsupervised Video Object Segmentation

Suhwan Cho, Minhyeok Lee, Seunghoon Lee et al.

CVPR 2024posterarXiv:2211.12036
#7584

Contrastive Mean-Shift Learning for Generalized Category Discovery

Sua Choi, Dahyun Kang, Minsu Cho

CVPR 2024posterarXiv:2404.09451
#7585

Panacea: Panoramic and Controllable Video Generation for Autonomous Driving

Yuqing Wen, Yucheng Zhao, Yingfei Liu et al.

CVPR 2024posterarXiv:2408.07605
#7586

Flexible Biometrics Recognition: Bridging the Multimodality Gap through Attention Alignment and Prompt Tuning

Leslie Ching Ow Tiong, Dick Sigmund, Chen-Hui Chan et al.

CVPR 2024poster
#7587

ALGM: Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Transformers

Narges Norouzi, Svetlana Orlova, Daan de Geus et al.

CVPR 2024posterarXiv:2406.09936
#7588

UDiFF: Generating Conditional Unsigned Distance Fields with Optimal Wavelet Diffusion

Junsheng Zhou, Weiqi Zhang, Baorui Ma et al.

CVPR 2024posterarXiv:2404.06851
#7589

UFineBench: Towards Text-based Person Retrieval with Ultra-fine Granularity

Jialong Zuo, Hanyu Zhou, Ying Nie et al.

CVPR 2024posterarXiv:2312.03441
#7590

Test-Time Zero-Shot Temporal Action Localization

Benedetta Liberatori, Alessandro Conti, Paolo Rota et al.

CVPR 2024posterarXiv:2404.05426
#7591

Towards Text-guided 3D Scene Composition

Qihang Zhang, Chaoyang Wang, Aliaksandr Siarohin et al.

CVPR 2024posterarXiv:2312.08885
#7592

Instance-aware Exploration-Verification-Exploitation for Instance ImageGoal Navigation

Xiaohan Lei, Min Wang, Wengang Zhou et al.

CVPR 2024posterarXiv:2402.17587
#7593

Learning SO(3)-Invariant Semantic Correspondence via Local Shape Transform

Chunghyun Park, Seungwook Kim, Jaesik Park et al.

CVPR 2024posterarXiv:2404.11156
#7594

Taming Mode Collapse in Score Distillation for Text-to-3D Generation

Peihao Wang, Dejia Xu, Zhiwen Fan et al.

CVPR 2024posterarXiv:2401.00909
#7595

SnAG: Scalable and Accurate Video Grounding

Fangzhou Mu, Sicheng Mo, Yin Li

CVPR 2024posterarXiv:2404.02257
#7596

GLaMM: Pixel Grounding Large Multimodal Model

Hanoona Rasheed, Muhammad Maaz, Sahal Shaji Mullappilly et al.

CVPR 2024posterarXiv:2311.03356
#7597

ManiFPT: Defining and Analyzing Fingerprints of Generative Models

Hae Jin Song, Mahyar Khayatkhoei, Wael AbdAlmageed

CVPR 2024posterarXiv:2402.10401
#7598

CORE-MPI: Consistency Object Removal with Embedding MultiPlane Image

Donggeun Yoon, Donghyeon Cho

CVPR 2024poster
#7599

ScoreHypo: Probabilistic Human Mesh Estimation with Hypothesis Scoring

Yuan Xu, Xiaoxuan Ma, Jiajun Su et al.

CVPR 2024poster
#7600

HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models

Nataniel Ruiz, Yuanzhen Li, Varun Jampani et al.

CVPR 2024posterarXiv:2307.06949