Most Cited CVPR "watermark accessibility" Papers

5,589 papers found • Page 24 of 28

#4601

MULAN: A Multi Layer Annotated Dataset for Controllable Text-to-Image Generation

Petru-Daniel Tudosiu, Yongxin Yang, Shifeng Zhang et al.

CVPR 2024posterarXiv:2404.02790
#4602

Collaborative Learning of Anomalies with Privacy (CLAP) for Unsupervised Video Anomaly Detection: A New Baseline

Anas Al-lahham, Muhammad Zaigham Zaheer, Nurbek Tastan et al.

CVPR 2024posterarXiv:2404.00847
#4603

DiverGen: Improving Instance Segmentation by Learning Wider Data Distribution with More Diverse Generative Data

Chengxiang Fan, Muzhi Zhu, Hao Chen et al.

CVPR 2024posterarXiv:2405.10185
#4604

SPAD: Spatially Aware Multi-View Diffusers

Yash Kant, Aliaksandr Siarohin, Ziyi Wu et al.

CVPR 2024poster
#4605

DiffPerformer: Iterative Learning of Consistent Latent Guidance for Diffusion-based Human Video Generation

Chenyang Wang, Zerong Zheng, Tao Yu et al.

CVPR 2024poster
#4606

Unsupervised Video Domain Adaptation with Masked Pre-Training and Collaborative Self-Training

Arun Reddy, William Paul, Corban Rivera et al.

CVPR 2024posterarXiv:2312.02914
#4607

RCBEVDet: Radar-camera Fusion in Bird's Eye View for 3D Object Detection

Zhiwei Lin, Zhe Liu, Zhongyu Xia et al.

CVPR 2024posterarXiv:2403.16440
#4608

FineParser: A Fine-grained Spatio-temporal Action Parser for Human-centric Action Quality Assessment

Jinglin Xu, Sibo Yin, Guohao Zhao et al.

CVPR 2024posterarXiv:2405.06887
#4609

SceneFun3D: Fine-Grained Functionality and Affordance Understanding in 3D Scenes

Alexandros Delitzas, Ayça Takmaz, Federico Tombari et al.

CVPR 2024poster
#4610

Do Vision and Language Encoders Represent the World Similarly?

Mayug Maniparambil, Raiymbek Akshulakov, YASSER ABDELAZIZ DAHOU DJILALI et al.

CVPR 2024posterarXiv:2401.05224
#4611

Construct to Associate: Cooperative Context Learning for Domain Adaptive Point Cloud Segmentation

Guangrui Li

CVPR 2024poster
#4612

Wavelet-based Fourier Information Interaction with Frequency Diffusion Adjustment for Underwater Image Restoration

Chen Zhao, Weiling Cai, Chenyu Dong et al.

CVPR 2024posterarXiv:2311.16845
#4613

Map-Relative Pose Regression for Visual Re-Localization

Shuai Chen, Tommaso Cavallari, Victor Adrian Prisacariu et al.

CVPR 2024highlightarXiv:2404.09884
#4614

Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis

Willi Menapace, Aliaksandr Siarohin, Ivan Skorokhodov et al.

CVPR 2024highlightarXiv:2402.14797
#4615

InstructDiffusion: A Generalist Modeling Interface for Vision Tasks

Zigang Geng, Binxin Yang, Tiankai Hang et al.

CVPR 2024posterarXiv:2309.03895
#4616

Towards General Robustness Verification of MaxPool-based Convolutional Neural Networks via Tightening Linear Approximation

Yuan Xiao, Shiqing Ma, Juan Zhai et al.

CVPR 2024posterarXiv:2406.00699
#4617

Overcoming Generic Knowledge Loss with Selective Parameter Update

Wenxuan Zhang, Paul Janson, Rahaf Aljundi et al.

CVPR 2024posterarXiv:2308.12462
#4618

Lane2Seq: Towards Unified Lane Detection via Sequence Generation

Kunyang Zhou

CVPR 2024posterarXiv:2402.17172
#4619

Diffusion Model Alignment Using Direct Preference Optimization

Bram Wallace, Meihua Dang, Rafael Rafailov et al.

CVPR 2024posterarXiv:2311.12908
#4620

SD4Match: Learning to Prompt Stable Diffusion Model for Semantic Matching

Xinghui Li, Jingyi Lu, Kai Han et al.

CVPR 2024posterarXiv:2310.17569
#4621

LTM: Lightweight Textured Mesh Extraction and Refinement of Large Unbounded Scenes for Efficient Storage and Real-time Rendering

Jaehoon Choi, Rajvi Shah, Qinbo Li et al.

CVPR 2024poster
#4622

Geometry Transfer for Stylizing Radiance Fields

Hyunyoung Jung, Seonghyeon Nam, Nikolaos Sarafianos et al.

CVPR 2024posterarXiv:2402.00863
#4623

Boosting Self-Supervision for Single-View Scene Completion via Knowledge Distillation

Keonhee Han, Dominik Muhle, Felix Wimbauer et al.

CVPR 2024posterarXiv:2404.07933
#4624

CrossKD: Cross-Head Knowledge Distillation for Object Detection

JiaBao Wang, yuming chen, Zhaohui Zheng et al.

CVPR 2024posterarXiv:2306.11369
#4625

TiNO-Edit: Timestep and Noise Optimization for Robust Diffusion-Based Image Editing

Sherry X. Chen, Yaron Vaxman, Elad Ben Baruch et al.

CVPR 2024posterarXiv:2404.11120
#4626

Leveraging Camera Triplets for Efficient and Accurate Structure-from-Motion

Lalit Manam, Venu Madhav Govindu

CVPR 2024poster
#4627

CG-HOI: Contact-Guided 3D Human-Object Interaction Generation

Christian Diller, Angela Dai

CVPR 2024posterarXiv:2311.16097
#4628

Resurrecting Old Classes with New Data for Exemplar-Free Continual Learning

Dipam Goswami, Albin Soutif, Yuyang Liu et al.

CVPR 2024posterarXiv:2405.19074
#4629

HOI-M^3: Capture Multiple Humans and Objects Interaction within Contextual Environment

Juze Zhang, Jingyan Zhang, Zining Song et al.

CVPR 2024highlight
#4630

PDF: A Probability-Driven Framework for Open World 3D Point Cloud Semantic Segmentation

Jinfeng Xu, Siyuan Yang, Xianzhi Li et al.

CVPR 2024posterarXiv:2404.00979
#4631

NeISF: Neural Incident Stokes Field for Geometry and Material Estimation

Chenhao Li, Taishi Ono, Takeshi Uemori et al.

CVPR 2024highlightarXiv:2311.13187
#4632

TransLoc4D: Transformer-based 4D Radar Place Recognition

Guohao Peng, Heshan Li, Yangyang Zhao et al.

CVPR 2024poster
#4633

Higher-order Relational Reasoning for Pedestrian Trajectory Prediction

Sungjune Kim, Hyung-gun Chi, Hyerin Lim et al.

CVPR 2024poster
#4634

DSGG: Dense Relation Transformer for an End-to-end Scene Graph Generation

Zeeshan Hayder, Xuming He

CVPR 2024posterarXiv:2403.14886
#4635

Design2Cloth: 3D Cloth Generation from 2D Masks

Jiali Zheng, Rolandos Alexandros Potamias, Stefanos Zafeiriou

CVPR 2024posterarXiv:2404.02686
#4636

S-DyRF: Reference-Based Stylized Radiance Fields for Dynamic Scenes

Xingyi Li, Zhiguo Cao, Yizheng Wu et al.

CVPR 2024posterarXiv:2403.06205
#4637

SatSynth: Augmenting Image-Mask Pairs through Diffusion Models for Aerial Semantic Segmentation

Aysim Toker, Marvin Eisenberger, Daniel Cremers et al.

CVPR 2024posterarXiv:2403.16605
#4638

Dual-Consistency Model Inversion for Non-Exemplar Class Incremental Learning

Zihuan Qiu, Yi Xu, Fanman Meng et al.

CVPR 2024poster
#4639

DS-NeRV: Implicit Neural Video Representation with Decomposed Static and Dynamic Codes

Hao Yan, Zhihui Ke, Xiaobo Zhou et al.

CVPR 2024posterarXiv:2403.15679
#4640

Rolling Shutter Correction with Intermediate Distortion Flow Estimation

Mingdeng Cao, Sidi Yang, Yujiu Yang et al.

CVPR 2024posterarXiv:2404.06350
#4641

Towards Transferable Targeted 3D Adversarial Attack in the Physical World

Yao Huang, Yinpeng Dong, Shouwei Ruan et al.

CVPR 2024posterarXiv:2312.09558
#4642

Hybrid Functional Maps for Crease-Aware Non-Isometric Shape Matching

Lennart Bastian, Yizheng Xie, Nassir Navab et al.

CVPR 2024posterarXiv:2312.03678
#4643

AnyDoor: Zero-shot Object-level Image Customization

Xi Chen, Lianghua Huang, Yu Liu et al.

CVPR 2024posterarXiv:2307.09481
#4644

Improving Training Efficiency of Diffusion Models via Multi-Stage Framework and Tailored Multi-Decoder Architecture

Huijie Zhang, Yifu Lu, Ismail Alkhouri et al.

CVPR 2024poster
#4645

Learnable Earth Parser: Discovering 3D Prototypes in Aerial Scans

Romain Loiseau, Elliot Vincent, Mathieu Aubry et al.

CVPR 2024posterarXiv:2304.09704
#4646

PIGEON: Predicting Image Geolocations

Lukas Haas, Michal Skreta, Silas Alberti et al.

CVPR 2024highlightarXiv:2307.05845
#4647

GPLD3D: Latent Diffusion of 3D Shape Generative Models by Enforcing Geometric and Physical Priors

Yuan Dong, Qi Zuo, Xiaodong Gu et al.

CVPR 2024poster
#4648

Synthesize Diagnose and Optimize: Towards Fine-Grained Vision-Language Understanding

Wujian Peng, Sicheng Xie, Zuyao You et al.

CVPR 2024poster
#4649

FlashEval: Towards Fast and Accurate Evaluation of Text-to-image Diffusion Generative Models

LIn Zhao, Tianchen Zhao, Zinan Lin et al.

CVPR 2024posterarXiv:2403.16379
#4650

COLMAP-Free 3D Gaussian Splatting

Yang Fu, Sifei Liu, Amey Kulkarni et al.

CVPR 2024highlightarXiv:2312.07504
#4651

Personalized Residuals for Concept-Driven Text-to-Image Generation

Cusuh Ham, Matthew Fisher, James Hays et al.

CVPR 2024posterarXiv:2405.12978
#4652

Forecasting of 3D Whole-body Human Poses with Grasping Objects

yan haitao, Qiongjie Cui, Jiexin Xie et al.

CVPR 2024poster
#4653

Edge-Aware 3D Instance Segmentation Network with Intelligent Semantic Prior

Wonseok Roh, Hwanhee Jung, Giljoo Nam et al.

CVPR 2024poster
#4654

RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models

Ozgur Kara, Bariscan Kurtkaya, Hidir Yesiltepe et al.

CVPR 2024highlightarXiv:2312.04524
#4655

Generalizable Novel-View Synthesis using a Stereo Camera

Haechan Lee, Wonjoon Jin, Seung-Hwan Baek et al.

CVPR 2024posterarXiv:2404.13541
#4656

Don’t Drop Your Samples! Coherence-Aware Training Benefits Conditional Diffusion

Nicolas Dufour, Victor Besnier, Vicky Kalogeiton et al.

CVPR 2024highlight
#4657

Action-slot: Visual Action-centric Representations for Multi-label Atomic Activity Recognition in Traffic Scenes

Chi-Hsi Kung, 書緯 呂, Yi-Hsuan Tsai et al.

CVPR 2024posterarXiv:2311.17948
#4658

Diff-BGM: A Diffusion Model for Video Background Music Generation

Sizhe Li, Yiming Qin, Minghang Zheng et al.

CVPR 2024posterarXiv:2405.11913
#4659

Shadow-Enlightened Image Outpainting

Hang Yu, Ruilin Li, Shaorong Xie et al.

CVPR 2024poster
#4660

Specularity Factorization for Low-Light Enhancement

Saurabh Saini, P. J. Narayanan

CVPR 2024posterarXiv:2404.01998
#4661

Shallow-Deep Collaborative Learning for Unsupervised Visible-Infrared Person Re-Identification

Bin Yang, Jun Chen, Mang Ye

CVPR 2024poster
#4662

Attack To Defend: Exploiting Adversarial Attacks for Detecting Poisoned Models

Samar Fares, Karthik Nandakumar

CVPR 2024poster
#4663

LiDAR-Net: A Real-scanned 3D Point Cloud Dataset for Indoor Scenes

Yanwen Guo, Yuanqi Li, Dayong Ren et al.

CVPR 2024poster
#4664

Non-autoregressive Sequence-to-Sequence Vision-Language Models

Kunyu Shi, Qi Dong, Luis Goncalves et al.

CVPR 2024posterarXiv:2403.02249
#4665

No More Ambiguity in 360° Room Layout via Bi-Layout Estimation

Yu-Ju Tsai, Jin-Cheng Jhang, JINGJING ZHENG et al.

CVPR 2024posterarXiv:2404.09993
#4666

HUNTER: Unsupervised Human-centric 3D Detection via Transferring Knowledge from Synthetic Instances to Real Scenes

Yichen Yao, Zimo Jiang, YUJING SUN et al.

CVPR 2024posterarXiv:2403.02769
#4667

3D LiDAR Mapping in Dynamic Environments using a 4D Implicit Neural Representation

Xingguang Zhong, Yue Pan, Cyrill Stachniss et al.

CVPR 2024posterarXiv:2405.03388
#4668

See Say and Segment: Teaching LMMs to Overcome False Premises

Tsung-Han Wu, Giscard Biamby, David Chan et al.

CVPR 2024poster
#4669

Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation

Li Hu

CVPR 2024posterarXiv:2311.17117
#4670

Instance-Adaptive and Geometric-Aware Keypoint Learning for Category-Level 6D Object Pose Estimation

Xiao Lin, Wenfei Yang, Yuan Gao et al.

CVPR 2024posterarXiv:2403.19527
#4671

PostureHMR: Posture Transformation for 3D Human Mesh Recovery

Yu-Pei Song, Xiao WU, Zhaoquan Yuan et al.

CVPR 2024poster
#4672

WANDR: Intention-guided Human Motion Generation

Markos Diomataris, Nikos Athanasiou, Omid Taheri et al.

CVPR 2024posterarXiv:2404.15383
#4673

WWW: A Unified Framework for Explaining What Where and Why of Neural Networks by Interpretation of Neuron Concepts

Yong Hyun Ahn, Hyeon Bae Kim, Seong Tae Kim

CVPR 2024posterarXiv:2402.18956
#4674

SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation

Yuxuan Zhang, Yiren Song, Jiaming Liu et al.

CVPR 2024posterarXiv:2312.16272
#4675

Denoising Point Clouds in Latent Space via Graph Convolution and Invertible Neural Network

Aihua Mao, Biao Yan, Zijing Ma et al.

CVPR 2024poster
#4676

SwiftBrush: One-Step Text-to-Image Diffusion Model with Variational Score Distillation

Thuan Nguyen, Anh Tran

CVPR 2024posterarXiv:2312.05239
#4677

Looking 3D: Anomaly Detection with 2D-3D Alignment

Ankan Kumar Bhunia, Changjian Li, Hakan Bilen

CVPR 2024posterarXiv:2406.19393
#4678

EventPS: Real-Time Photometric Stereo Using an Event Camera

Bohan Yu, Jieji Ren, Jin Han et al.

CVPR 2024poster
#4679

Circuit Design and Efficient Simulation of Quantum Inner Product and Empirical Studies of Its Effect on Near-Term Hybrid Quantum-Classic Machine Learning

Hao Xiong, Yehui Tang, Xinyu Ye et al.

CVPR 2024poster
#4680

PSDPM: Prototype-based Secondary Discriminative Pixels Mining for Weakly Supervised Semantic Segmentation

Xinqiao Zhao, Ziqian Yang, Tianhong Dai et al.

CVPR 2024poster
#4681

Towards 3D Vision with Low-Cost Single-Photon Cameras

Fangzhou Mu, Carter Sifferman, Sacha Jungerman et al.

CVPR 2024posterarXiv:2403.17801
#4682

On Train-Test Class Overlap and Detection for Image Retrieval

Chull Hwan Song, Jooyoung Yoon, Taebaek Hwang et al.

CVPR 2024posterarXiv:2404.01524
#4683

ViTamin: Designing Scalable Vision Models in the Vision-Language Era

Jieneng Chen, Qihang Yu, Xiaohui Shen et al.

CVPR 2024posterarXiv:2404.02132
#4684

PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models

Yiming Zhang, Zhening Xing, Yanhong Zeng et al.

CVPR 2024posterarXiv:2312.13964
#4685

DemoCaricature: Democratising Caricature Generation with a Rough Sketch

Dar-Yen Chen, Ayan Kumar Bhunia, Subhadeep Koley et al.

CVPR 2024posterarXiv:2312.04364
#4686

Language-driven Object Fusion into Neural Radiance Fields with Pose-Conditioned Dataset Updates

Ka Chun SHUM, Jaeyeon Kim, Binh-Son Hua et al.

CVPR 2024posterarXiv:2309.11281
#4687

Unlocking the Potential of Pre-trained Vision Transformers for Few-Shot Semantic Segmentation through Relationship Descriptors

Ziqin Zhou, Hai-Ming Xu, Yangyang Shu et al.

CVPR 2024poster
#4688

Relightful Harmonization: Lighting-aware Portrait Background Replacement

Mengwei Ren, Wei Xiong, Jae Shin Yoon et al.

CVPR 2024posterarXiv:2312.06886
#4689

Taming the Tail in Class-Conditional GANs: Knowledge Sharing via Unconditional Training at Lower Resolutions

Saeed Khorram, Mingqi Jiang, Mohamad Shahbazi et al.

CVPR 2024posterarXiv:2402.17065
#4690

MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation

Yanhui Wang, Jianmin Bao, Wenming Weng et al.

CVPR 2024highlightarXiv:2311.18829
#4691

LightOctree: Lightweight 3D Spatially-Coherent Indoor Lighting Estimation

Xuecan Wang, Shibang Xiao, Xiaohui Liang

CVPR 2024posterarXiv:2404.03925
#4692

Bi-level Learning of Task-Specific Decoders for Joint Registration and One-Shot Medical Image Segmentation

Xin Fan, Xiaolin Wang, Jiaxin Gao et al.

CVPR 2024poster
#4693

FairCLIP: Harnessing Fairness in Vision-Language Learning

Yan Luo, MIN SHI, Muhammad Osama Khan et al.

CVPR 2024posterarXiv:2403.19949
#4694

ArGue: Attribute-Guided Prompt Tuning for Vision-Language Models

Xinyu Tian, Shu Zou, Zhaoyuan Yang et al.

CVPR 2024posterarXiv:2311.16494
#4695

MS-MANO: Enabling Hand Pose Tracking with Biomechanical Constraints

Pengfei Xie, Wenqiang Xu, Tutian Tang et al.

CVPR 2024posterarXiv:2404.10227
#4696

CGI-DM: Digital Copyright Authentication for Diffusion Models via Contrasting Gradient Inversion

Xiaoyu Wu, Yang Hua, Chumeng Liang et al.

CVPR 2024posterarXiv:2403.11162
#4697

Separate and Conquer: Decoupling Co-occurrence via Decomposition and Representation for Weakly Supervised Semantic Segmentation

Zhiwei Yang, Kexue Fu, Minghong Duan et al.

CVPR 2024posterarXiv:2402.18467
#4698

Telling Left from Right: Identifying Geometry-Aware Semantic Correspondence

Junyi Zhang, Charles Herrmann, Junhwa Hur et al.

CVPR 2024posterarXiv:2311.17034
#4699

MiKASA: Multi-Key-Anchor & Scene-Aware Transformer for 3D Visual Grounding

Chun-Peng Chang, Shaoxiang Wang, Alain Pagani et al.

CVPR 2024posterarXiv:2403.03077
#4700

RTracker: Recoverable Tracking via PN Tree Structured Memory

Yuqing Huang, Xin Li, Zikun Zhou et al.

CVPR 2024posterarXiv:2403.19242
#4701

Efficient Solution of Point-Line Absolute Pose

Petr Hruby, Timothy Duff, Marc Pollefeys

CVPR 2024highlightarXiv:2404.16552
#4702

SPIN: Simultaneous Perception Interaction and Navigation

Shagun Uppal, Ananye Agarwal, Haoyu Xiong et al.

CVPR 2024posterarXiv:2405.07991
#4703

CAMixerSR: Only Details Need More "Attention"

Yan Wang, Yi Liu, Shijie Zhao et al.

CVPR 2024poster
#4704

MarkovGen: Structured Prediction for Efficient Text-to-Image Generation

Sadeep Jayasumana, Daniel Glasner, Srikumar Ramalingam et al.

CVPR 2024posterarXiv:2308.10997
#4705

Constrained Layout Generation with Factor Graphs

Mohammed Haroon Dupty, Yanfei Dong, Sicong Leng et al.

CVPR 2024posterarXiv:2404.00385
#4706

Neural Implicit Morphing of Face Images

Guilherme Schardong, Tiago Novello, Hallison Paz et al.

CVPR 2024posterarXiv:2308.13888
#4707

Kernel Adaptive Convolution for Scene Text Detection via Distance Map Prediction

Jinzhi Zheng, Heng Fan, Libo Zhang

CVPR 2024poster
#4708

PlatoNeRF: 3D Reconstruction in Plato's Cave via Single-View Two-Bounce Lidar

Tzofi Klinghoffer, Xiaoyu Xiang, Siddharth Somasundaram et al.

CVPR 2024posterarXiv:2312.14239
#4709

Tackling the Singularities at the Endpoints of Time Intervals in Diffusion Models

Pengze Zhang, Hubery Yin, Chen Li et al.

CVPR 2024highlightarXiv:2403.08381
#4710

Troika: Multi-Path Cross-Modal Traction for Compositional Zero-Shot Learning

Siteng Huang, Biao Gong, Yutong Feng et al.

CVPR 2024posterarXiv:2303.15230
#4711

LoCoNet: Long-Short Context Network for Active Speaker Detection

Xizi Wang, Feng Cheng, Gedas Bertasius

CVPR 2024posterarXiv:2301.08237
#4712

WinSyn: : A High Resolution Testbed for Synthetic Data

Tom Kelly, John Femiani, Peter Wonka

CVPR 2024posterarXiv:2310.08471
#4713

SpatialTracker: Tracking Any 2D Pixels in 3D Space

Yuxi Xiao, Qianqian Wang, Shangzhan Zhang et al.

CVPR 2024highlightarXiv:2404.04319
#4714

Multimodal Representation Learning by Alternating Unimodal Adaptation

Xiaohui Zhang, Jaehong Yoon, Mohit Bansal et al.

CVPR 2024posterarXiv:2311.10707
#4715

Attention Calibration for Disentangled Text-to-Image Personalization

Yanbing Zhang, Mengping Yang, Qin Zhou et al.

CVPR 2024posterarXiv:2403.18551
#4716

Segment Every Out-of-Distribution Object

Wenjie Zhao, Jia Li, Xin Dong et al.

CVPR 2024posterarXiv:2311.16516
#4717

Open-Vocabulary Object 6D Pose Estimation

Jaime Corsetti, Davide Boscaini, Changjae Oh et al.

CVPR 2024highlightarXiv:2312.00690
#4718

Quantifying Uncertainty in Motion Prediction with Variational Bayesian Mixture

Juanwu Lu, Can Cui, Yunsheng Ma et al.

CVPR 2024posterarXiv:2404.03789
#4719

SG-PGM: Partial Graph Matching Network with Semantic Geometric Fusion for 3D Scene Graph Alignment and Its Downstream Tasks

Yaxu Xie, Alain Pagani, Didier Stricker

CVPR 2024posterarXiv:2403.19474
#4720

Fast ODE-based Sampling for Diffusion Models in Around 5 Steps

Zhenyu Zhou, Defang Chen, Can Wang et al.

CVPR 2024highlightarXiv:2312.00094
#4721

From-Ground-To-Objects: Coarse-to-Fine Self-supervised Monocular Depth Estimation of Dynamic Objects with Ground Contact Prior

Jaeho Moon, Juan Luis Gonzalez Bello, Byeongjun Kwon et al.

CVPR 2024posterarXiv:2312.10118
#4722

MMCert: Provable Defense against Adversarial Attacks to Multi-modal Models

Yanting Wang, Hongye Fu, Wei Zou et al.

CVPR 2024posterarXiv:2403.19080
#4723

Unleashing Unlabeled Data: A Paradigm for Cross-View Geo-Localization

Guopeng Li, Ming Qian, Gui-Song Xia

CVPR 2024posterarXiv:2403.14198
#4724

Hearing Anything Anywhere

Mason Wang, Ryosuke Sawata, Samuel Clarke et al.

CVPR 2024posterarXiv:2406.07532
#4725

Dr2Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning

Chen Zhao, Shuming Liu, Karttikeya Mangalam et al.

CVPR 2024poster
#4726

Video-Based Human Pose Regression via Decoupled Space-Time Aggregation

Jijie He, Wenwu Yang

CVPR 2024posterarXiv:2403.19926
#4727

Boosting Image Restoration via Priors from Pre-trained Models

Xiaogang Xu, Shu Kong, Tao Hu et al.

CVPR 2024posterarXiv:2403.06793
#4728

RadarDistill: Boosting Radar-based Object Detection Performance via Knowledge Distillation from LiDAR Features

Geonho Bang, Kwangjin Choi, Jisong Kim et al.

CVPR 2024posterarXiv:2403.05061
#4729

EasyDrag: Efficient Point-based Manipulation on Diffusion Models

Xingzhong Hou, Boxiao Liu, Yi Zhang et al.

CVPR 2024poster
#4730

Learned Lossless Image Compression based on Bit Plane Slicing

Zhe Zhang, Huairui Wang, Zhenzhong Chen et al.

CVPR 2024poster
#4731

BEM: Balanced and Entropy-based Mix for Long-Tailed Semi-Supervised Learning

Hongwei Zheng, Linyuan Zhou, Han Li et al.

CVPR 2024posterarXiv:2404.01179
#4732

TexTile: A Differentiable Metric for Texture Tileability

Carlos Rodriguez-Pardo, Dan Casas, Elena Garces et al.

CVPR 2024posterarXiv:2403.12961
#4733

MatSynth: A Modern PBR Materials Dataset

Giuseppe Vecchio, Valentin Deschaintre

CVPR 2024posterarXiv:2401.06056
#4734

LED: A Large-scale Real-world Paired Dataset for Event Camera Denoising

Yuxing Duan

CVPR 2024posterarXiv:2405.19718
#4735

Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models

Daniel Geng, Inbum Park, Andrew Owens

CVPR 2024posterarXiv:2311.17919
#4736

Learn from View Correlation: An Anchor Enhancement Strategy for Multi-view Clustering

Suyuan Liu, KE LIANG, Zhibin Dong et al.

CVPR 2024poster
#4737

Passive Snapshot Coded Aperture Dual-Pixel RGB-D Imaging

Bhargav Ghanekar, Salman Siddique Khan, Pranav Sharma et al.

CVPR 2024posterarXiv:2402.18102
#4738

ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations

Maitreya Patel, Changhoon Kim, Sheng Cheng et al.

CVPR 2024posterarXiv:2312.04655
#4739

Low-Res Leads the Way: Improving Generalization for Super-Resolution by Self-Supervised Learning

Haoyu Chen, Wenbo Li, Jinjin Gu et al.

CVPR 2024posterarXiv:2403.02601
#4740

Dual DETRs for Multi-Label Temporal Action Detection

Yuhan Zhu, Guozhen Zhang, Jing Tan et al.

CVPR 2024posterarXiv:2404.00653
#4741

GigaTraj: Predicting Long-term Trajectories of Hundreds of Pedestrians in Gigapixel Complex Scenes

Haozhe Lin, Chunyu Wei, Li He et al.

CVPR 2024poster
#4742

Image Sculpting: Precise Object Editing with 3D Geometry Control

Jiraphon Yenphraphai, Xichen Pan, Sainan Liu et al.

CVPR 2024posterarXiv:2401.01702
#4743

ID-Blau: Image Deblurring by Implicit Diffusion-based reBLurring AUgmentation

Jia-Hao Wu, Fu-Jen Tsai, Yan-Tsung Peng et al.

CVPR 2024posterarXiv:2312.10998
#4744

HumanNeRF-SE: A Simple yet Effective Approach to Animate HumanNeRF with Diverse Poses

Caoyuan Ma, Yu-Lun Liu, Zhixiang Wang et al.

CVPR 2024posterarXiv:2312.02232
#4745

LeftRefill: Filling Right Canvas based on Left Reference through Generalized Text-to-Image Diffusion Model

Chenjie Cao, Yunuo Cai, Qiaole Dong et al.

CVPR 2024posterarXiv:2305.11577
#4746

Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation

Wenxuan Wang, Tongtian Yue, Yisi Zhang et al.

CVPR 2024poster
#4747

Ungeneralizable Examples

Jingwen Ye, Xinchao Wang

CVPR 2024posterarXiv:2404.14016
#4748

Language-only Training of Zero-shot Composed Image Retrieval

Geonmo Gu, Sanghyuk Chun, Wonjae Kim et al.

CVPR 2024poster
#4749

Causal-CoG: A Causal-Effect Look at Context Generation for Boosting Multi-modal Language Models

Shitian Zhao, Zhuowan Li, YadongLu et al.

CVPR 2024highlightarXiv:2312.06685
#4750

Instruct-Imagen: Image Generation with Multi-modal Instruction

Hexiang Hu, Kelvin C.K. Chan, Yu-Chuan Su et al.

CVPR 2024posterarXiv:2401.01952
#4751

Diffeomorphic Template Registration for Atmospheric Turbulence Mitigation

Dong Lao, Congli Wang, Alex Wong et al.

CVPR 2024highlightarXiv:2405.03662
#4752

Holodeck: Language Guided Generation of 3D Embodied AI Environments

Yue Yang, Fan-Yun Sun, Luca Weihs et al.

CVPR 2024posterarXiv:2312.09067
#4753

SpikingResformer: Bridging ResNet and Vision Transformer in Spiking Neural Networks

Xinyu Shi, Zecheng Hao, Zhaofei Yu

CVPR 2024posterarXiv:2403.14302
#4754

Tactile-Augmented Radiance Fields

Yiming Dou, Fengyu Yang, Yi Liu et al.

CVPR 2024posterarXiv:2405.04534
#4755

KVQ: Kwai Video Quality Assessment for Short-form Videos

Yiting Lu, Xin Li, Yajing Pei et al.

CVPR 2024posterarXiv:2402.07220
#4756

Purified and Unified Steganographic Network

GuoBiao Li, Sheng Li, Zicong Luo et al.

CVPR 2024posterarXiv:2402.17210
#4757

Selective Hourglass Mapping for Universal Image Restoration Based on Diffusion Model

Dian Zheng, Xiao-Ming Wu, Shuzhou Yang et al.

CVPR 2024posterarXiv:2403.11157
#4758

L4D-Track: Language-to-4D Modeling Towards 6-DoF Tracking and Shape Reconstruction in 3D Point Cloud Stream

Jingtao Sun, Yaonan Wang, Mingtao Feng et al.

CVPR 2024poster
#4759

MAPSeg: Unified Unsupervised Domain Adaptation for Heterogeneous Medical Image Segmentation Based on 3D Masked Autoencoding and Pseudo-Labeling

Xuzhe Zhang, Yuhao Wu, Elsa Angelini et al.

CVPR 2024posterarXiv:2303.09373
#4760

Traffic Scene Parsing through the TSP6K Dataset

Peng-Tao Jiang, Yuqi Yang, Yang Cao et al.

CVPR 2024posterarXiv:2303.02835
#4761

DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

Muyang Li, Tianle Cai, Jiaxin Cao et al.

CVPR 2024highlightarXiv:2402.19481
#4762

MoReVQA: Exploring Modular Reasoning Models for Video Question Answering

Juhong Min, Shyamal Buch, Arsha Nagrani et al.

CVPR 2024posterarXiv:2404.06511
#4763

DIMAT: Decentralized Iterative Merging-And-Training for Deep Learning Models

Nastaran Saadati, Minh Pham, Nasla Saleem et al.

CVPR 2024posterarXiv:2404.08079
#4764

Text-Driven Image Editing via Learnable Regions

Yuanze Lin, Yi-Wen Chen, Yi-Hsuan Tsai et al.

CVPR 2024posterarXiv:2311.16432
#4765

Self-Supervised Representation Learning from Arbitrary Scenarios

Zhaowen Li, Yousong Zhu, Zhiyang Chen et al.

CVPR 2024poster
#4766

The Neglected Tails in Vision-Language Models

Shubham Parashar, Tian Liu, Zhiqiu Lin et al.

CVPR 2024posterarXiv:2401.12425
#4767

SODA: Bottleneck Diffusion Models for Representation Learning

Drew Hudson, Daniel Zoran, Mateusz Malinowski et al.

CVPR 2024posterarXiv:2311.17901
#4768

Enhancing the Power of OOD Detection via Sample-Aware Model Selection

Feng Xue, Zi He, Yuan Zhang et al.

CVPR 2024poster
#4769

Towards More Unified In-context Visual Understanding

Dianmo Sheng, Dongdong Chen, Zhentao Tan et al.

CVPR 2024posterarXiv:2312.02520
#4770

Fourier-basis Functions to Bridge Augmentation Gap: Rethinking Frequency Augmentation in Image Classification

Mei Vaish, Shunxin Wang, Nicola Strisciuglio

CVPR 2024posterarXiv:2403.01944
#4771

Molecular Data Programming: Towards Molecule Pseudo-labeling with Systematic Weak Supervision

Xin Juan, Kaixiong Zhou, Ninghao Liu et al.

CVPR 2024poster
#4772

Hyper-MD: Mesh Denoising with Customized Parameters Aware of Noise Intensity and Geometric Characteristics

Xingtao Wang, Hongliang Wei, Xiaopeng Fan et al.

CVPR 2024poster
#4773

Strong Transferable Adversarial Attacks via Ensembled Asymptotically Normal Distribution Learning

Zhengwei Fang, Rui Wang, Tao Huang et al.

CVPR 2024highlightarXiv:2209.11964
#4774

An Empirical Study of Scaling Law for Scene Text Recognition

Miao Rang, Zhenni Bi, Chuanjian Liu et al.

CVPR 2024poster
#4775

Differentiable Neural Surface Refinement for Modeling Transparent Objects

Weijian Deng, Dylan Campbell, Chunyi Sun et al.

CVPR 2024poster
#4776

Tune-An-Ellipse: CLIP Has Potential to Find What You Want

Jinheng Xie, Songhe Deng, Bing Li et al.

CVPR 2024highlight
#4777

OmniLocalRF: Omnidirectional Local Radiance Fields from Dynamic Videos

Dongyoung Choi, Hyeonjoong Jang, Min H. Kim

CVPR 2024posterarXiv:2404.00676
#4778

Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model

Kai Yang, Jian Tao, Jiafei Lyu et al.

CVPR 2024posterarXiv:2311.13231
#4779

XFeat: Accelerated Features for Lightweight Image Matching

Guilherme Potje, Felipe Cadar, André Araujo et al.

CVPR 2024posterarXiv:2404.19174
#4780

Looking Similar Sounding Different: Leveraging Counterfactual Cross-Modal Pairs for Audiovisual Representation Learning

Nikhil Singh, Chih-Wei Wu, Iroro Orife et al.

CVPR 2024posterarXiv:2304.05600
#4781

Taming Self-Training for Open-Vocabulary Object Detection

Shiyu Zhao, Samuel Schulter, Long Zhao et al.

CVPR 2024posterarXiv:2308.06412
#4782

FACT: Frame-Action Cross-Attention Temporal Modeling for Efficient Action Segmentation

Zijia Lu, Ehsan Elhamifar

CVPR 2024poster
#4783

GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians

Liangxiao Hu, Hongwen Zhang, Yuxiang Zhang et al.

CVPR 2024posterarXiv:2312.02134
#4784

FlowerFormer: Empowering Neural Architecture Encoding using a Flow-aware Graph Transformer

Dongyeong Hwang, Hyunju Kim, Sunwoo Kim et al.

CVPR 2024posterarXiv:2403.12821
#4785

Text2QR: Harmonizing Aesthetic Customization and Scanning Robustness for Text-Guided QR Code Generation

Guangyang Wu, Xiaohong Liu, Jun Jia et al.

CVPR 2024posterarXiv:2403.06452
#4786

ProMark: Proactive Diffusion Watermarking for Causal Attribution

Vishal Asnani, John Collomosse, Tu Bui et al.

CVPR 2024posterarXiv:2403.09914
#4787

DiffForensics: Leveraging Diffusion Prior to Image Forgery Detection and Localization

Zeqin Yu, Jiangqun Ni, Yuzhen Lin et al.

CVPR 2024poster
#4788

Neural Directional Encoding for Efficient and Accurate View-Dependent Appearance Modeling

Liwen Wu, Sai Bi, Zexiang Xu et al.

CVPR 2024highlightarXiv:2405.14847
#4789

SAFDNet: A Simple and Effective Network for Fully Sparse 3D Object Detection

Gang Zhang, Chen Junnan, Guohuan Gao et al.

CVPR 2024posterarXiv:2403.05817
#4790

Sheared Backpropagation for Fine-tuning Foundation Models

Zhiyuan Yu, Li Shen, Liang Ding et al.

CVPR 2024poster
#4791

On the Content Bias in Fréchet Video Distance

Songwei Ge, Aniruddha Mahapatra, Gaurav Parmar et al.

CVPR 2024posterarXiv:2404.12391
#4792

Multiview Aerial Visual RECognition (MAVREC): Can Multi-view Improve Aerial Visual Perception?

Aritra Dutta, Srijan Das, Jacob Nielsen et al.

CVPR 2024posterarXiv:2312.04548
#4793

CPGA: Coding Priors-Guided Aggregation Network for Compressed Video Quality Enhancement

Qiang Zhu, Jinhua Hao, Yukang Ding et al.

CVPR 2024posterarXiv:2403.10362
#4794

Video2Game: Real-time Interactive Realistic and Browser-Compatible Environment from a Single Video

Hongchi Xia, Chih-Hao Lin, Wei-Chiu Ma et al.

CVPR 2024poster
#4795

Identifying Important Group of Pixels using Interactions

Kosuke Sumiyasu, Kazuhiko Kawamoto, Hiroshi Kera

CVPR 2024posterarXiv:2401.03785
#4796

Are Conventional SNNs Really Efficient? A Perspective from Network Quantization

Guobin Shen, Dongcheng Zhao, Tenglong Li et al.

CVPR 2024highlight
#4797

CapHuman: Capture Your Moments in Parallel Universes

Chao Liang, Fan Ma, Linchao Zhu et al.

CVPR 2024posterarXiv:2402.00627
#4798

Self-Discovering Interpretable Diffusion Latent Directions for Responsible Text-to-Image Generation

Hang Li, Chengzhi Shen, Philip H.S. Torr et al.

CVPR 2024posterarXiv:2311.17216
#4799

Towards Automatic Power Battery Detection: New Challenge Benchmark Dataset and Baseline

Xiaoqi Zhao, Youwei Pang, Zhenyu Chen et al.

CVPR 2024posterarXiv:2312.02528
#4800

Infrared Small Target Detection with Scale and Location Sensitivity

Qiankun Liu, Rui Liu, Bolun Zheng et al.

CVPR 2024posterarXiv:2403.19366