2024 Highlight Papers
324 papers found • Page 3 of 7
From Correspondences to Pose: Non-minimal Certifiably Optimal Relative Pose without Disambiguation
Javier Tirado-Garín, Javier Civera
From Feature to Gaze: A Generalizable Replacement of Linear Layer for Gaze Estimation
Yiwei Bao, Feng Lu
From Isolated Islands to Pangea: Unifying Semantic Space for Human Action Understanding
Yonglu Li, Xiaoqian Wu, Xinpeng Liu et al.
Frozen CLIP: A Strong Backbone for Weakly Supervised Semantic Segmentation
Bingfeng Zhang, Siyue Yu, Yunchao Wei et al.
GART: Gaussian Articulated Template Models
Jiahui Lei, Yufu Wang, Georgios Pavlakos et al.
GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians
Shenhan Qian, Tobias Kirschstein, Liam Schoneveld et al.
Gaussian-Flow: 4D Reconstruction with Dynamic 3D Gaussian Particle
Youtian Lin, Zuozhuo Dai, Siyu Zhu et al.
Gaussian Splatting SLAM
Hidenobu Matsuki, Riku Murai, Paul Kelly et al.
GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning
Ye Yuan, Xueting Li, Yangyi Huang et al.
Gear-NeRF: Free-Viewpoint Rendering and Tracking with Motion-aware Spatio-Temporal Sampling
Xinhang Liu, Yu-Wing Tai, Chi-Keung Tang et al.
Generalized Large-Scale Data Condensation via Various Backbone and Statistical Matching
Shitong Shao, Zeyuan Yin, Muxin Zhou et al.
Generalized Predictive Model for Autonomous Driving
Jiazhi Yang, Shenyuan Gao, Yihang Qiu et al.
General Object Foundation Model for Images and Videos at Scale
Junfeng Wu, Yi Jiang, Qihao Liu et al.
Generative Powers of Ten
Xiaojuan Wang, Janne Kontkanen, Brian Curless et al.
Geometrically-driven Aggregation for Zero-shot 3D Point Cloud Understanding
Guofeng Mei, Luigi Riz, Yiming Wang et al.
GP-NeRF: Generalized Perception NeRF for Context-Aware 3D Scene Understanding
Hao Li, Dingwen Zhang, Yalun Dai et al.
GPS-Gaussian: Generalizable Pixel-wise 3D Gaussian Splatting for Real-time Human Novel View Synthesis
Shunyuan Zheng, Boyao ZHOU, Ruizhi Shao et al.
GPT4Point: A Unified Framework for Point-Language Understanding and Generation
Zhangyang Qi, Ye Fang, Zeyi Sun et al.
GraCo: Granularity-Controllable Interactive Segmentation
Yian Zhao, Kehan Li, Zesen Cheng et al.
GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting
Chi Yan, Delin Qu, Dong Wang et al.
HandDiff: 3D Hand Pose Estimation with Diffusion on Image-Point Cloud
WENCAN CHENG, Hao Tang, Luc Van Gool et al.
HashPoint: Accelerated Point Searching and Sampling for Neural Rendering
Jiahao Ma, Miaomiao Liu, David Ahmedt-Aristizabal et al.
HHMR: Holistic Hand Mesh Recovery by Enhancing the Multimodal Controllability of Graph Diffusion Models
Mengcheng Li, Hongwen Zhang, Yuxiang Zhang et al.
HOI-M^3: Capture Multiple Humans and Objects Interaction within Contextual Environment
Juze Zhang, Jingyan Zhang, Zining Song et al.
HOLD: Category-agnostic 3D Reconstruction of Interacting Hands and Objects from Video
Zicong Fan, Maria Parelli, Maria Kadoglou et al.
Honeybee: Locality-enhanced Projector for Multimodal LLM
Junbum Cha, Woo-Young Kang, Jonghwan Mun et al.
Hourglass Tokenizer for Efficient Transformer-Based 3D Human Pose Estimation
Wenhao Li, Mengyuan Liu, Hong Liu et al.
HouseCat6D - A Large-Scale Multi-Modal Category Level 6D Object Perception Dataset with Household Objects in Realistic Scenarios
HyunJun Jung, Shun-Cheng Wu, Patrick Ruhkamp et al.
HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting
Xian Liu, Xiaohang Zhan, Jiaxiang Tang et al.
Human Motion Prediction Under Unexpected Perturbation
Jiangbei Yue, Baiyi Li, Julien Pettré et al.
H-ViT: A Hierarchical Vision Transformer for Deformable Image Registration
Morteza Ghahremani, Mohammad Khateri, Bailiang Jian et al.
HybridNeRF: Efficient Neural Rendering via Adaptive Volumetric Surfaces
Haithem Turki, Vasu Agrawal, Samuel Rota Bulò et al.
ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object
Chenshuang Zhang, Fei Pan, Junmo Kim et al.
Image Neural Field Diffusion Models
Yinbo Chen, Oliver Wang, Richard Zhang et al.
Implicit Event-RGBD Neural SLAM
Delin Qu, Chi Yan, Dong Wang et al.
Improved Baselines with Visual Instruction Tuning
Haotian Liu, Chunyuan Li, Yuheng Li et al.
In-Context Matting
He Guo, Zixuan Ye, Zhiguo Cao et al.
Insect-Foundation: A Foundation Model and Large-scale 1M Dataset for Visual Insect Understanding
Hoang-Quan Nguyen, Thanh-Dat Truong, Xuan-Bac Nguyen et al.
Investigating Compositional Challenges in Vision-Language Models for Visual Grounding
Yunan Zeng, Yan Huang, Jinjin Zhang et al.
IPoD: Implicit Field Learning with Point Diffusion for Generalizable 3D Object Reconstruction from Single RGB-D Images
Yushuang Wu, Luyue Shi, Junhao Cai et al.
IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection
Junbo Yin, Wenguan Wang, Runnan Chen et al.
Jack of All Tasks Master of Many: Designing General-Purpose Coarse-to-Fine Vision-Language Model
Shraman Pramanick, Guangxing Han, Rui Hou et al.
Koala: Key Frame-Conditioned Long Video-LLM
Reuben Tan, Ximeng Sun, Ping Hu et al.
LangSplat: 3D Language Gaussian Splatting
Minghan Qin, Wanhua Li, Jiawei ZHOU et al.
Latent Modulated Function for Computational Optimal Continuous Image Representation
Zongyao He, Zhi Jin
Learning Adaptive Spatial Coherent Correlations for Speech-Preserving Facial Expression Manipulation
Tianshui Chen, Jianman Lin, Zhijing Yang et al.
Learning Diffusion Texture Priors for Image Restoration
Tian Ye, Sixiang Chen, Wenhao Chai et al.
Learning without Exact Guidance: Updating Large-scale High-resolution Land Cover Maps from Low-resolution Historical Labels
Zhuohong Li, Wei He, Jiepan Li et al.
LeGO: Leveraging a Surface Deformation Network for Animatable Stylized Face Generation with One Example
Soyeon Yoon, Kwan Yun, Kwanggyoon Seo et al.
LidaRF: Delving into Lidar for Neural Radiance Field on Street Scenes
Shanlin Sun, Bingbing Zhuang, Ziyu Jiang et al.