CVPR Highlight Papers


SpatialTracker: Tracking Any 2D Pixels in 3D Space

Yuxi Xiao, Qianqian Wang, Shangzhan Zhang et al.

CVPR 2024 • highlight • arXiv:2404.04319

SpecNeRF: Gaussian Directional Encoding for Specular Reflections

Li Ma, Vasu Agrawal, Haithem Turki et al.

CVPR 2024 • highlight • arXiv:2312.13102
31 citations

Spectral and Polarization Vision: Spectro-polarimetric Real-world Dataset

Yujin Jeon, Eunsue Choi, Youngchan Kim et al.

CVPR 2024 • highlight • arXiv:2311.17396

SPOT: Self-Training with Patch-Order Permutation for Object-Centric Learning with Autoregressive Transformers

Ioannis Kakogeorgiou, Spyros Gidaris, Konstantinos Karantzalos et al.

CVPR 2024 • highlight • arXiv:2312.00648

Stationary Representations: Optimally Approximating Compatibility and Implications for Improved Model Replacements

Niccolò Biondi, Federico Pernici, Simone Ricci et al.

CVPR 2024 • highlight • arXiv:2405.02581
5 citations

StreamingFlow: Streaming Occupancy Forecasting with Asynchronous Multi-modal Data Streams via Neural Ordinary Differential Equation

Yining Shi, Kun Jiang, Ke Wang et al.

CVPR 2024 • highlight • arXiv:2302.09585
8 citations

Strong Transferable Adversarial Attacks via Ensembled Asymptotically Normal Distribution Learning

Zhengwei Fang, Rui Wang, Tao Huang et al.

CVPR 2024 • highlight • arXiv:2209.11964

Style Injection in Diffusion: A Training-free Approach for Adapting Large-scale Diffusion Models for Style Transfer

Jiwoo Chung, Sangeek Hyun, Jae-Pil Heo

CVPR 2024 • highlight • arXiv:2312.09008
211 citations

Suppress and Rebalance: Towards Generalized Multi-Modal Face Anti-Spoofing

Xun Lin, Shuai Wang, Rizhao Cai et al.

CVPR 2024 • highlight • arXiv:2402.19298

SurroundSDF: Implicit 3D Scene Understanding Based on Signed Distance Field

Lizhe Liu, Bohua Wang, Hongwei Xie et al.

CVPR 2024 • highlight • arXiv:2403.14366

SVDinsTN: A Tensor Network Paradigm for Efficient Structure Search from Regularized Modeling Perspective

Yu-Bang Zheng, Xile Zhao, Junhua Zeng et al.

CVPR 2024 • highlight • arXiv:2305.14912

SwitchLight: Co-design of Physics-driven Architecture and Pre-training Framework for Human Portrait Relighting

Hoon Kim, Minje Jang, Wonjun Yoon et al.

CVPR 2024 • highlight • arXiv:2402.18848

Tackling the Singularities at the Endpoints of Time Intervals in Diffusion Models

Pengze Zhang, Hubery Yin, Chen Li et al.

CVPR 2024 • highlight • arXiv:2403.08381

Taming Stable Diffusion for Text to 360 Panorama Image Generation

Cheng Zhang, Qianyi Wu, Camilo Cruz Gambardella et al.

CVPR 2024 • highlight • arXiv:2404.07949

Task2Box: Box Embeddings for Modeling Asymmetric Task Relationships

Rangel Daroya, Aaron Sun, Subhransu Maji

CVPR 2024 • highlight • arXiv:2403.17173

Template Free Reconstruction of Human-object Interaction with Procedural Interaction Generation

Xianghui Xie, Bharat Lal Bhatnagar, Jan Lenssen et al.

CVPR 2024 • highlight • arXiv:2312.07063
24 citations

Text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval

Jiamian Wang, Guohao Sun, Pichao Wang et al.

CVPR 2024 • highlight • arXiv:2403.17998
63 citations

TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models

Yushi Huang, Ruihao Gong, Jing Liu et al.

CVPR 2024 • highlight • arXiv:2311.16503

The Devil is in the Fine-Grained Details: Evaluating Open-Vocabulary Object Detectors for Fine-Grained Understanding

Lorenzo Bianchi, Fabio Carrara, Nicola Messina et al.

CVPR 2024 • highlight • arXiv:2311.17518
26 citations

The More You See in 2D, the More You Perceive in 3D

Xinyang Han, Zelin Gao, Angjoo Kanazawa et al.

CVPR 2024 • highlight • arXiv:2404.03652

The Unreasonable Effectiveness of Pre-Trained Features for Camera Pose Refinement

Gabriele Trivigno, Carlo Masone, Barbara Caputo et al.

CVPR 2024 • highlight • arXiv:2404.10438
19 citations

Time-, Memory- and Parameter-Efficient Visual Adaptation

Otniel-Bogdan Mercea, Alexey Gritsenko, Cordelia Schmid et al.

CVPR 2024 • highlight • arXiv:2402.02887

Total-Decom: Decomposed 3D Scene Reconstruction with Minimal Interaction

Xiaoyang Lyu, Chirui Chang, Peng Dai et al.

CVPR 2024 • highlight • arXiv:2403.19314
12 citations

Total Selfie: Generating Full-Body Selfies

Bowei Chen, Brian Curless, Ira Kemelmacher-Shlizerman et al.

CVPR 2024 • highlight • arXiv:2308.14740

Towards Accurate Post-training Quantization for Diffusion Models

Changyuan Wang, Ziwei Wang, Xiuwei Xu et al.

CVPR 2024 • highlight • arXiv:2305.18723

Towards a Simultaneous and Granular Identity-Expression Control in Personalized Face Generation

Renshuai Liu, Bowen Ma, Wei Zhang et al.

CVPR 2024 • highlight • arXiv:2401.01207
32 citations

Towards Learning a Generalist Model for Embodied Navigation

Duo Zheng, Shijia Huang, Lin Zhao et al.

CVPR 2024 • highlight • arXiv:2312.02010
117 citations

Transductive Zero-Shot and Few-Shot CLIP

Ségolène Martin, Yunshi Huang, Fereshteh Shakeri et al.

CVPR 2024 • highlight • arXiv:2405.18437
32 citations

Tri-Modal Motion Retrieval by Learning a Joint Embedding Space

Kangning Yin, Shihao Zou, Yuxuan Ge et al.

CVPR 2024 • highlight • arXiv:2403.00691
14 citations

Tune-An-Ellipse: CLIP Has Potential to Find What You Want

Jinheng Xie, Songhe Deng, Bing Li et al.

CVPR 2024 • highlight

TutteNet: Injective 3D Deformations by Composition of 2D Mesh Deformations

Bo Sun, Thibault Groueix, Chen Song et al.

CVPR 2024 • highlight • arXiv:2406.12121
3 citations

Tyche: Stochastic In-Context Learning for Medical Image Segmentation

Marianne Rakic, Hallee Wong, Jose Javier Gonzalez Ortiz et al.

CVPR 2024 • highlight • arXiv:2401.13650
24 citations

UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs

Yanwu Xu, Yang Zhao, Zhisheng Xiao et al.

CVPR 2024 • highlight • arXiv:2311.09257

Unbiased Estimator for Distorted Conics in Camera Calibration

Chaehyeon Song, Jaeho Shin, Myung-Hwan Jeon et al.

CVPR 2024 • highlight • arXiv:2403.04583

Unbiased Faster R-CNN for Single-source Domain Generalized Object Detection

Yajing Liu, Shijun Zhou, Xiyao Liu et al.

CVPR 2024 • highlight • arXiv:2405.15225
37 citations

Uncertainty-aware Action Decoupling Transformer for Action Anticipation

Hongji Guo, Nakul Agarwal, Shao-Yuan Lo et al.

CVPR 2024 • highlight

Understanding Video Transformers via Universal Concept Discovery

Matthew Kowal, Achal Dave, Rares Andrei Ambrus et al.

CVPR 2024 • highlight • arXiv:2401.10831
17 citations

UniDepth: Universal Monocular Metric Depth Estimation

Luigi Piccinelli, Yung-Hsu Yang, Christos Sakaridis et al.

CVPR 2024 • highlight • arXiv:2403.18913

Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action

Jiasen Lu, Christopher Clark, Sangho Lee et al.

CVPR 2024 • highlight

Unifying Correspondence, Pose and NeRF for Generalized Pose-Free Novel View Synthesis

Sunghwan Hong, Jaewoo Jung, Heeseong Shin et al.

CVPR 2024 • highlight
28 citations

UniMODE: Unified Monocular 3D Object Detection

Zhuoling Li, Xiaogang Xu, Ser-Nam Lim et al.

CVPR 2024 • highlight

Unsupervised Keypoints from Pretrained Diffusion Models

Eric Hedlin, Gopal Sharma, Shweta Mahajan et al.

CVPR 2024 • highlight • arXiv:2312.00065
19 citations

Unsupervised Occupancy Learning from Sparse Point Cloud

Amine Ouasfi, Adnane Boukhayma

CVPR 2024 • highlight • arXiv:2404.02759

Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution

Shangchen Zhou, Peiqing Yang, Jianyi Wang et al.

CVPR 2024 • highlight • arXiv:2312.06640

VA3: Virtually Assured Amplification Attack on Probabilistic Copyright Protection for Text-to-Image Generative Models

Xiang Li, Qianli Shen, Kenji Kawaguchi

CVPR 2024 • highlight • arXiv:2312.00057

Vanishing-Point-Guided Video Semantic Segmentation of Driving Scenes

Diandian Guo, Deng-Ping Fan, Tongyu Lu et al.

CVPR 2024 • highlight • arXiv:2401.15261
9 citations

VBench: Comprehensive Benchmark Suite for Video Generative Models

Ziqi Huang, Yinan He, Jiashuo Yu et al.

CVPR 2024 • highlight • arXiv:2311.17982
996 citations

VecFusion: Vector Font Generation with Diffusion

Vikas Thamizharasan, Difan Liu, Shantanu Agarwal et al.

CVPR 2024 • highlight • arXiv:2312.10540

VGGSfM: Visual Geometry Grounded Deep Structure From Motion

Jianyuan Wang, Nikita Karaev, Christian Rupprecht et al.

CVPR 2024 • highlight

View-Category Interactive Sharing Transformer for Incomplete Multi-View Multi-Label Learning

Shilong Ou, Zhe Xue, Yawen Li et al.

CVPR 2024 • highlight