CVPR Highlight Papers
Order-One Rolling Shutter Cameras
Marvin Anas Hahn, Kathlén Kohn, Orlando Marigliano et al.
O-TPT: Orthogonality Constraints for Calibrating Test-time Prompt Tuning in Vision-Language Models
Ashshak Sharifdeen, Muhammad Akhtar Munir, Sanoojan Baliah et al.
Overcoming Shortcut Problem in VLM for Robust Out-of-Distribution Detection
Zhuo Xu, Xiang Xiang, Yifan Liang
Panorama Generation From NFoV Image Done Right
Dian Zheng, Cheng Zhang, Xiao-Ming Wu et al.
Parallelized Autoregressive Visual Generation
Yuqing Wang, Shuhuai Ren, Zhijie Lin et al.
PartGen: Part-level 3D Generation and Reconstruction with Multi-view Diffusion Models
Minghao Chen, Roman Shapovalov, Iro Laina et al.
Perceptually Accurate 3D Talking Head Generation: New Definitions, Speech-Mesh Representation, and Evaluation Metrics
Lee Chae-Yeon, Oh Hyun-Bin, Han EunGi et al.
PGC: Physics-Based Gaussian Cloth from a Single Pose
Michelle Guo, Matt Jen-Yuan Chiang, Igor Santesteban et al.
PhD: A ChatGPT-Prompted Visual Hallucination Evaluation Dataset
Jiazhen Liu, Yuhan Fu, Ruobing Xie et al.
Pippo: High-Resolution Multi-View Humans from a Single Image
Yash Kant, Ethan Weber, Jin Kyu Kim et al.
PlanarSplatting: Accurate Planar Surface Reconstruction in 3 Minutes
Bin Tan, Rui Yu, Yujun Shen et al.
Point-to-Region Loss for Semi-Supervised Point-Based Crowd Counting
Wei Lin, Chenyang Zhao, Antoni B. Chan
Polarized Color Screen Matting
Kenji Enomoto, Scott Cohen, Brian Price et al.
Prior-free 3D Object Tracking
Xiuqiang Song, Li Jin, Zhengxian Zhang et al.
Project-Probe-Aggregate: Efficient Fine-Tuning for Group Robustness
Beier Zhu, Jiequan Cui, Hanwang Zhang et al.
QuCOOP: A Versatile Framework for Solving Composite and Binary-Parametrised Problems on Quantum Annealers
Natacha Kuete Meli, Vladislav Golyanik, Marcel Seelbach Benkner et al.
Question-Aware Gaussian Experts for Audio-Visual Question Answering
Hongyeob Kim, Inyoung Jung, Dayoon Suh et al.
Realistic Test-Time Adaptation of Vision-Language Models
Maxime Zanella, Clément Fuchs, Christophe De Vleeschouwer et al.
Real-time Free-view Human Rendering from Sparse-view RGB Videos using Double Unprojected Textures
Guoxing Sun, Rishabh Dabral, Heming Zhu et al.
Real-time High-fidelity Gaussian Human Avatars with Position-based Interpolation of Spatially Distributed MLPs
Youyi Zhan, Tianjia Shao, Yin Yang et al.
Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval
Yuanmin Tang, Jue Zhang, Xiaoting Qin et al.
Reasoning in Visual Navigation of End-to-end Trained Agents: A Dynamical Systems Approach
Steeven Janny, Hervé Poirier, Leonid Antsfeld et al.
Reconstructing People, Places, and Cameras
Lea Müller, Hongsuk Choi, Anthony Zhang et al.
Reference-Based 3D-Aware Image Editing with Triplanes
Bahri Batuhan Bilecen, Yiğit Yalın, Ning Yu et al.
Relative Pose Estimation through Affine Corrections of Monocular Depth Priors
Yifan Yu, Shaohui Liu, Rémi Pautrat et al.
ReNeg: Learning Negative Embedding with Reward Guidance
Xiaomin Li, Yixuan Liu, Takashi Isobe et al.
Rethinking Personalized Aesthetics Assessment: Employing Physique Aesthetics Assessment as An Exemplification
Haobin Zhong, Shuai He, Anlong Ming et al.
Revealing Key Details to See Differences: A Novel Prototypical Perspective for Skeleton-based Action Recognition
Hongda Liu, Yunfan Liu, Min Ren et al.
Revisiting MAE Pre-training for 3D Medical Image Segmentation
Tassilo Wald, Constantin Ulrich, Stanislav Lukyanenko et al.
RGBAvatar: Reduced Gaussian Blendshapes for Online Modeling of Head Avatars
Linzhou Li, Yumeng Li, Yanlin Weng et al.
RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness
Tianyu Yu, Haoye Zhang, Qiming Li et al.
RoboPEPP: Vision-Based Robot Pose and Joint Angle Estimation through Embedding Predictive Pre-Training
Raktim Gautam Goswami, Prashanth Krishnamurthy, Yann LeCun et al.
RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins
Yao Mu, Tianxing Chen, Zanxin Chen et al.
ROLL: Robust Noisy Pseudo-label Learning for Multi-View Clustering with Noisy Correspondence
Yuan Sun, Yongxiang Li, Zhenwen Ren et al.
SACB-Net: Spatial-awareness Convolutions for Medical Image Registration
Xinxing Cheng, Tianyang Zhang, Wenqi Lu et al.
SaMam: Style-aware State Space Model for Arbitrary Image Style Transfer
Hongda Liu, Longguang Wang, Ye Zhang et al.
Samba: A Unified Mamba-based Framework for General Salient Object Detection
Jiahao He, Keren Fu, Xiaohong Liu et al.
SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation
Claudia Cuttano, Gabriele Trivigno, Gabriele Rosi et al.
Satellite Observations Guided Diffusion Model for Accurate Meteorological States at Arbitrary Resolution
Siwei Tu, Ben Fei, Weidong Yang et al.
Scaling Inference Time Compute for Diffusion Models
Nanye Ma, Shangyuan Tong, Haolin Jia et al.
Scaling Vision Pre-Training to 4K Resolution
Baifeng Shi, Boyi Li, Han Cai et al.
Scene-Centric Unsupervised Panoptic Segmentation
Oliver Hahn, Christoph Reich, Nikita Araslanov et al.
SCSA: A Plug-and-Play Semantic Continuous-Sparse Attention for Arbitrary Semantic Style Transfer
Chunnan Shang, Zhizhong Wang, Hongwei Wang et al.
SeCap: Self-Calibrating and Adaptive Prompts for Cross-view Person Re-Identification in Aerial-Ground Networks
Shining Wang, Yunlong Wang, Ruiqi Wu et al.
SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration
Jianyi Wang, Zhijie Lin, Meng Wei et al.
Seeing More with Less: Human-like Representations in Vision Models
Andrey Gizdov, Shimon Ullman, Daniel Harari
Self-Supervised Cross-View Correspondence with Predictive Cycle Consistency
Alan Baade, Changan Chen
Seurat: From Moving Points to Depth
Seokju Cho, Gabriel Huang, Seungryong Kim et al.
Shape Abstraction via Marching Differentiable Support Functions
Sunkyung Park, Jeongmin Lee, Dongjun Lee
Show and Tell: Visually Explainable Deep Neural Nets via Spatially-Aware Concept Bottleneck Models
Itay Benou, Tammy Riklin Raviv