CVPR Highlight Papers
SpatialTracker: Tracking Any 2D Pixels in 3D Space
Yuxi Xiao, Qianqian Wang, Shangzhan Zhang et al.
SpecNeRF: Gaussian Directional Encoding for Specular Reflections
Li Ma, Vasu Agrawal, Haithem Turki et al.
Spectral and Polarization Vision: Spectro-polarimetric Real-world Dataset
Yujin Jeon, Eunsue Choi, Youngchan Kim et al.
SPOT: Self-Training with Patch-Order Permutation for Object-Centric Learning with Autoregressive Transformers
Ioannis Kakogeorgiou, Spyros Gidaris, Konstantinos Karantzalos et al.
Stationary Representations: Optimally Approximating Compatibility and Implications for Improved Model Replacements
Niccolò Biondi, Federico Pernici, Simone Ricci et al.
StreamingFlow: Streaming Occupancy Forecasting with Asynchronous Multi-modal Data Streams via Neural Ordinary Differential Equation
Yining Shi, Kun Jiang, Ke Wang et al.
Strong Transferable Adversarial Attacks via Ensembled Asymptotically Normal Distribution Learning
Zhengwei Fang, Rui Wang, Tao Huang et al.
Style Injection in Diffusion: A Training-free Approach for Adapting Large-scale Diffusion Models for Style Transfer
Jiwoo Chung, Sangeek Hyun, Jae-Pil Heo
Suppress and Rebalance: Towards Generalized Multi-Modal Face Anti-Spoofing
Xun Lin, Shuai Wang, Rizhao Cai et al.
SurroundSDF: Implicit 3D Scene Understanding Based on Signed Distance Field
Lizhe Liu, Bohua Wang, Hongwei Xie et al.
SVDinsTN: A Tensor Network Paradigm for Efficient Structure Search from Regularized Modeling Perspective
Yu-Bang Zheng, Xile Zhao, Junhua Zeng et al.
SwitchLight: Co-design of Physics-driven Architecture and Pre-training Framework for Human Portrait Relighting
Hoon Kim, Minje Jang, Wonjun Yoon et al.
Tackling the Singularities at the Endpoints of Time Intervals in Diffusion Models
Pengze Zhang, Hubery Yin, Chen Li et al.
Taming Stable Diffusion for Text to 360° Panorama Image Generation
Cheng Zhang, Qianyi Wu, Camilo Cruz Gambardella et al.
Task2Box: Box Embeddings for Modeling Asymmetric Task Relationships
Rangel Daroya, Aaron Sun, Subhransu Maji
Template Free Reconstruction of Human-object Interaction with Procedural Interaction Generation
Xianghui Xie, Bharat Lal Bhatnagar, Jan Lenssen et al.
Text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval
Jiamian Wang, Guohao Sun, Pichao Wang et al.
TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models
Yushi Huang, Ruihao Gong, Jing Liu et al.
The Devil is in the Fine-Grained Details: Evaluating Open-Vocabulary Object Detectors for Fine-Grained Understanding
Lorenzo Bianchi, Fabio Carrara, Nicola Messina et al.
The More You See in 2D, the More You Perceive in 3D
Xinyang Han, Zelin Gao, Angjoo Kanazawa et al.
The Unreasonable Effectiveness of Pre-Trained Features for Camera Pose Refinement
Gabriele Trivigno, Carlo Masone, Barbara Caputo et al.
Time-, Memory- and Parameter-Efficient Visual Adaptation
Otniel-Bogdan Mercea, Alexey Gritsenko, Cordelia Schmid et al.
Total-Decom: Decomposed 3D Scene Reconstruction with Minimal Interaction
Xiaoyang Lyu, Chirui Chang, Peng Dai et al.
Total Selfie: Generating Full-Body Selfies
Bowei Chen, Brian Curless, Ira Kemelmacher-Shlizerman et al.
Towards Accurate Post-training Quantization for Diffusion Models
Changyuan Wang, Ziwei Wang, Xiuwei Xu et al.
Towards a Simultaneous and Granular Identity-Expression Control in Personalized Face Generation
Renshuai Liu, Bowen Ma, Wei Zhang et al.
Towards Learning a Generalist Model for Embodied Navigation
Duo Zheng, Shijia Huang, Lin Zhao et al.
Transductive Zero-Shot and Few-Shot CLIP
Ségolène Martin, Yunshi Huang, Fereshteh Shakeri et al.
Tri-Modal Motion Retrieval by Learning a Joint Embedding Space
Kangning Yin, Shihao Zou, Yuxuan Ge et al.
Tune-An-Ellipse: CLIP Has Potential to Find What You Want
Jinheng Xie, Songhe Deng, Bing Li et al.
TutteNet: Injective 3D Deformations by Composition of 2D Mesh Deformations
Bo Sun, Thibault Groueix, Chen Song et al.
Tyche: Stochastic In-Context Learning for Medical Image Segmentation
Marianne Rakic, Hallee Wong, Jose Javier Gonzalez Ortiz et al.
UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs
Yanwu Xu, Yang Zhao, Zhisheng Xiao et al.
Unbiased Estimator for Distorted Conics in Camera Calibration
Chaehyeon Song, Jaeho Shin, Myung-Hwan Jeon et al.
Unbiased Faster R-CNN for Single-source Domain Generalized Object Detection
Yajing Liu, Shijun Zhou, Xiyao Liu et al.
Uncertainty-aware Action Decoupling Transformer for Action Anticipation
Hongji Guo, Nakul Agarwal, Shao-Yuan Lo et al.
Understanding Video Transformers via Universal Concept Discovery
Matthew Kowal, Achal Dave, Rares Andrei Ambrus et al.
UniDepth: Universal Monocular Metric Depth Estimation
Luigi Piccinelli, Yung-Hsu Yang, Christos Sakaridis et al.
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action
Jiasen Lu, Christopher Clark, Sangho Lee et al.
Unifying Correspondence, Pose and NeRF for Generalized Pose-Free Novel View Synthesis
Sunghwan Hong, Jaewoo Jung, Heeseong Shin et al.
UniMODE: Unified Monocular 3D Object Detection
Zhuoling Li, Xiaogang Xu, Ser-Nam Lim et al.
Unsupervised Keypoints from Pretrained Diffusion Models
Eric Hedlin, Gopal Sharma, Shweta Mahajan et al.
Unsupervised Occupancy Learning from Sparse Point Cloud
Amine Ouasfi, Adnane Boukhayma
Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution
Shangchen Zhou, Peiqing Yang, Jianyi Wang et al.
VA3: Virtually Assured Amplification Attack on Probabilistic Copyright Protection for Text-to-Image Generative Models
Xiang Li, Qianli Shen, Kenji Kawaguchi
Vanishing-Point-Guided Video Semantic Segmentation of Driving Scenes
Diandian Guo, Deng-Ping Fan, Tongyu Lu et al.
VBench: Comprehensive Benchmark Suite for Video Generative Models
Ziqi Huang, Yinan He, Jiashuo Yu et al.
VecFusion: Vector Font Generation with Diffusion
Vikas Thamizharasan, Difan Liu, Shantanu Agarwal et al.
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
Jianyuan Wang, Nikita Karaev, Christian Rupprecht et al.
View-Category Interactive Sharing Transformer for Incomplete Multi-View Multi-Label Learning
Shilong Ou, Zhe Xue, Yawen Li et al.