CVPR Highlight Papers
PanoPose: Self-supervised Relative Pose Estimation for Panoramic Images
Diantao Tu, Hainan Cui, Xianwei Zheng et al.
PAPR in Motion: Seamless Point-level 3D Scene Interpolation
Shichong Peng, Yanshu Zhang, Ke Li
PerceptionGPT: Effectively Fusing Visual Perception into LLM
Renjie Pi, Lewei Yao, Jiahui Gao et al.
PhyScene: Physically Interactable 3D Scene Synthesis for Embodied AI
Yandan Yang, Baoxiong Jia, Peiyuan Zhi et al.
PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics
Tianyi Xie, Zeshun Zong, Yuxing Qiu et al.
PIGEON: Predicting Image Geolocations
Lukas Haas, Michal Skreta, Silas Alberti et al.
pix2gestalt: Amodal Segmentation by Synthesizing Wholes
Ege Ozguroglu, Ruoshi Liu, Dídac Surís et al.
PLACE: Adaptive Layout-Semantic Fusion for Semantic Image Synthesis
Zhengyao Lv, Yuxiang Wei, Wangmeng Zuo et al.
Point2CAD: Reverse Engineering CAD Models from 3D Point Clouds
Yujia Liu, Anton Obukhov, Jan D. Wegner et al.
Polos: Multimodal Metric Learning from Human Feedback for Image Captioning
Yuiga Wada, Kanta Kaneda, Daichi Saito et al.
Pose-Guided Self-Training with Two-Stage Clustering for Unsupervised Landmark Discovery
Siddharth Tourani, Ahmed Alwheibi, Arif Mahmood et al.
Predicated Diffusion: Predicate Logic-Based Attention Guidance for Text-to-Image Diffusion Models
Kota Sueyoshi, Takashi Matsubara
Programmable Motion Generation for Open-Set Motion Control Tasks
Hanchao Liu, Xiaohang Zhan, Shaoli Huang et al.
Progressive Divide-and-Conquer via Subsampling Decomposition for Accelerated MRI
Chong Wang, Lanqing Guo, Yufei Wang et al.
Putting the Object Back into Video Object Segmentation
Ho Kei Cheng, Seoung Wug Oh, Brian Price et al.
QUADify: Extracting Meshes with Pixel-level Details and Materials from Images
Maximilian Frühauf, Hayko Riemenschneider, Markus Gross et al.
Question Aware Vision Transformer for Multimodal Reasoning
Roy Ganz, Yair Kittenplon, Aviad Aberdam et al.
Rapid 3D Model Generation with Intuitive 3D Input
Tianrun Chen, Chaotao Ding, Shangzhan Zhang et al.
RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models
Ozgur Kara, Bariscan Kurtkaya, Hidir Yesiltepe et al.
Readout Guidance: Learning Control from Diffusion Features
Grace Luo, Trevor Darrell, Oliver Wang et al.
Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark
Ziyang Chen, Israel D. Gebru, Christian Richardt et al.
Real-time 3D-aware Portrait Video Relighting
Ziqi Cai, Kaiwen Jiang, Shu-Yu Chen et al.
Real-Time Simulated Avatar from Head-Mounted Sensors
Zhengyi Luo, Jinkun Cao, Rawal Khirodkar et al.
Referring Expression Counting
Siyang Dai, Jun Liu, Ngai-Man Cheung
Relightable and Animatable Neural Avatar from Sparse-View Video
Zhen Xu, Sida Peng, Chen Geng et al.
Residual Learning in Diffusion Models
Junyu Zhang, Daochang Liu, Eunbyung Park et al.
Restoration by Generation with Constrained Priors
Zheng Ding, Xuaner Zhang, Zhuowen Tu et al.
Rethinking FID: Towards a Better Evaluation Metric for Image Generation
Sadeep Jayasumana, Srikumar Ramalingam, Andreas Veit et al.
Rethinking Generalizable Face Anti-spoofing via Hierarchical Prototype-guided Distribution Refinement in Hyperbolic Space
Chengyang Hu, Ke-Yue Zhang, Taiping Yao et al.
RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D
Lingteng Qiu, Guanying Chen, Xiaodong Gu et al.
RobustSAM: Segment Anything Robustly on Degraded Images
Wei-Ting Chen, Yu Jiet Vong, Sy-Yen Kuo et al.
Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion
Zuoyue Li, Zhenqiang Li, Zhaopeng Cui et al.
Scaffold-GS: Structured 3D Gaussians for View-Adaptive Rendering
Tao Lu, Mulin Yu, Linning Xu et al.
Scaling Up Dynamic Human-Scene Interaction Modeling
Nan Jiang, Zhiyuan Zhang, Hongjie Li et al.
SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing
Zeyinzi Jiang, Chaojie Mao, Yulin Pan et al.
SceneTex: High-Quality Texture Synthesis for Indoor Scenes via Diffusion Priors
Dave Zhenyu Chen, Haoxuan Li, Hsin-Ying Lee et al.
SCINeRF: Neural Radiance Fields from a Snapshot Compressive Image
Yunhao Li, Xiaodong Wang, Ping Wang et al.
SDDGR: Stable Diffusion-based Deep Generative Replay for Class Incremental Object Detection
Junsu Kim, Hoseong Cho, Jihyeon Kim et al.
Selective-Stereo: Adaptive Frequency Information Selection for Stereo Matching
Xianqi Wang, Gangwei Xu, Hao Jia et al.
Self-Supervised Dual Contouring
Ramana Sundararaman, Roman Klokov, Maks Ovsjanikov
Self-Supervised Multi-Object Tracking with Path Consistency
Zijia Lu, Bing Shuai, Yanbei Chen et al.
Semantic-aware SAM for Point-Prompted Instance Segmentation
Zhaoyang Wei, Pengfei Chen, Xuehui Yu et al.
SG-BEV: Satellite-Guided BEV Fusion for Cross-View Semantic Segmentation
Junyan Ye, Qiyan Luo, Jinhua Yu et al.
SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection
Mingxuan Liu, Tyler Hayes, Elisa Ricci et al.
SIFU: Side-view Conditioned Implicit Function for Real-world Usable Clothed Human Reconstruction
Zechuan Zhang, Zongxin Yang, Yi Yang
SleepVST: Sleep Staging from Near-Infrared Video Signals using Pre-Trained Transformers
Jonathan F. Carter, Joao Jorge, Oliver Gibson et al.
SLICE: Stabilized LIME for Consistent Explanations for Image Classification
Revoti Prasad Bora, Kiran Raja, Philipp Terhörst et al.
SmartEdit: Exploring Complex Instruction-based Image Editing with Multimodal Large Language Models
Yuzhou Huang, Liangbin Xie, Xintao Wang et al.
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis
Willi Menapace, Aliaksandr Siarohin, Ivan Skorokhodov et al.
Spatial-Aware Regression for Keypoint Localization
Dongkai Wang, Shiliang Zhang