3D Object Detection
Detecting objects in 3D space
Top Papers
DETRs Beat YOLOs on Real-time Object Detection
Yian Zhao, Wenyu Lv, Shangliang Xu et al.
Scaffold-GS: Structured 3D Gaussians for View-Adaptive Rendering
Tao Lu, Mulin Yu, Linning Xu et al.
Grounding Image Matching in 3D with MASt3R
Vincent Leroy, Yohann Cabon, Jerome Revaud
SplaTAM: Splat Track & Map 3D Gaussians for Dense RGB-D SLAM
Nikhil Keetha, Jay Karhade, Krishna Murthy Jatavallabhula et al.
Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting
Zeyu Yang, Hongye Yang, Zijie Pan et al.
FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects
Bowen Wen, Wei Yang, Jan Kautz et al.
GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting
Chi Yan, Delin Qu, Dong Wang et al.
Text-to-3D using Gaussian Splatting
Zilong Chen, Feng Wang, Yikai Wang et al.
SV3D: Novel Multi-view Synthesis and 3D Generation from a Single Image using Latent Video Diffusion
Vikram Voleti, Chun-Han Yao, Mark Boss et al.
DL3DV-10K: A Large-Scale Scene Dataset for Deep Learning-based 3D Vision
Lu Ling, Yichen Sheng, Zhi Tu et al.
GPS-Gaussian: Generalizable Pixel-wise 3D Gaussian Splatting for Real-time Human Novel View Synthesis
Shunyuan Zheng, Boyao ZHOU, Ruizhi Shao et al.
SweetDreamer: Aligning Geometric Priors in 2D diffusion for Consistent Text-to-3D
Weiyu LI, Rui Chen, Xuelin Chen et al.
Transcending Forgery Specificity with Latent Space Augmentation for Generalizable Deepfake Detection
Zhiyuan Yan, Yuhao Luo, Siwei Lyu et al.
Probing the 3D Awareness of Visual Foundation Models
Mohamed El Banani, Amit Raj, Kevis-kokitsi Maninis et al.
Pinwheel-shaped Convolution and Scale-based Dynamic Loss for Infrared Small Target Detection
Jiangnan Yang, Shuangli Liu, Jingjun Wu et al.
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Decoupled Video Diffusion
Wenqiang Sun, Shuo Chen, Fangfu Liu et al.
3D Geometry-Aware Deformable Gaussian Splatting for Dynamic View Synthesis
Zhicheng Lu, xiang guo, Le Hui et al.
OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation
Zhening Huang, Xiaoyang Wu, Xi Chen et al.
IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection
Junbo Yin, Wenguan Wang, Runnan Chen et al.
CG-SLAM: Efficient Dense RGB-D SLAM in a Consistent Uncertainty-aware 3D Gaussian Field
Jiarui Hu, Xianhao Chen, Boyin Feng et al.
RGBD GS-ICP SLAM
Seongbo Ha, Jiung Yeon, Hyeonwoo Yu
Frequency-Spatial Entanglement Learning for Camouflaged Object Detection
Yanguang Sun, Chunyan Xu, Jian Yang et al.
TRAM: Global Trajectory and Motion of 3D Humans from in-the-wild Videos
Yufu Wang, Ziyun Wang, Lingjie Liu et al.
Unifying 3D Vision-Language Understanding via Promptable Queries
ziyu zhu, Zhuofan Zhang, Xiaojian Ma et al.
MonoCD: Monocular 3D Object Detection with Complementary Depths
Longfei Yan, Pei Yan, Shengzhou Xiong et al.
Stereo4D: Learning How Things Move in 3D from Internet Stereo Videos
Linyi Jin, Richard Tucker, Zhengqi Li et al.
DQ-DETR: DETR with Dynamic Query for Tiny Object Detection
Yi-Xin Huang, Hou-I Liu, Hong-Han Shuai et al.
Controlling Space and Time with Diffusion Models
Daniel Watson, Saurabh Saxena, Lala Li et al.
Improving 2D Feature Representations by 3D-Aware Fine-Tuning
Yuanwen Yue, Anurag Das, Francis Engelmann et al.
Wonderland: Navigating 3D Scenes from a Single Image
Hanwen Liang, Junli Cao, Vidit Goel et al.
Towards Scalable 3D Anomaly Detection and Localization: A Benchmark via 3D Anomaly Synthesis and A Self-Supervised Learning Network
wenqiao Li, Xiaohao Xu, Yao Gu et al.
DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection
Lewei Yao, Renjie Pi, Jianhua Han et al.
ComboVerse: Compositional 3D Assets Creation Using Spatially-Aware Diffusion Guidance
Yongwei Chen, Tengfei Wang, Tong Wu et al.
FlashSplat: 2D to 3D Gaussian Splatting Segmentation Solved Optimally
Qiuhong Shen, Xingyi Yang, Xinchao Wang
HouseCat6D - A Large-Scale Multi-Modal Category Level 6D Object Perception Dataset with Household Objects in Realistic Scenarios
HyunJun Jung, Shun-Cheng Wu, Patrick Ruhkamp et al.
UnScene3D: Unsupervised 3D Instance Segmentation for Indoor Scenes
David Rozenberszki, Or Litany, Angela Dai
EgoLifter: Open-world 3D Segmentation for Egocentric Perception
Qiao Gu, Zhaoyang Lv, Duncan Frost et al.
Scene Adaptive Sparse Transformer for Event-based Object Detection
Yansong Peng, Li Hebei, Yueyi Zhang et al.
Unbiased Faster R-CNN for Single-source Domain Generalized Object Detection
Yajing Liu, Shijun Zhou, Xiyao Liu et al.
SplatAD: Real-Time Lidar and Camera Rendering with 3D Gaussian Splatting for Autonomous Driving
Georg Hess, Carl Lindström, Maryam Fatemi et al.
Scaling Diffusion Models to Real-World 3D LiDAR Scene Completion
Lucas Nunes, Rodrigo Marcuzzi, Benedikt Mersch et al.
GaussianOcc: Fully Self-supervised and Efficient 3D Occupancy Estimation with Gaussian Splatting
Wanshui Gan, Fang Liu, Hongbin Xu et al.
R3D-AD: Reconstruction via Diffusion for 3D Anomaly Detection
Zheyuan Zhou, Wang Le, Naiyu Fang et al.
UnO: Unsupervised Occupancy Fields for Perception and Forecasting
Ben Agro, Quinlan Sykora, Sergio Casas et al.
ADA-Track: End-to-End Multi-Camera 3D Multi-Object Tracking with Alternating Detection and Association
Shuxiao Ding, Lukas Schneider, Marius Cordts et al.
Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models
Yoad Tewel, Rinon Gal, Dvir Samuel et al.
AGILE3D: Attention Guided Interactive Multi-object 3D Segmentation
Yuanwen Yue, Sabarinath Mahadevan, Jonas Schult et al.
DrivingForward: Feed-forward 3D Gaussian Splatting for Driving Scene Reconstruction from Flexible Surround-view Input
Qijian Tian, Xin Tan, Yuan Xie et al.
Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal
YUXIN WANG, Qianyi Wu, Guofeng Zhang et al.
Towards Generalizable Multi-Object Tracking
Zheng Qin, Le Wang, Sanping Zhou et al.
SAM-guided Graph Cut for 3D Instance Segmentation
Haoyu Guo, He Zhu, Sida Peng et al.
RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering Assisted Distillation
Haiming Zhang, Xu Yan, Dongfeng Bai et al.
Open-World Human-Object Interaction Detection via Multi-modal Prompts
Jie Yang, Bingliang Li, Ailing Zeng et al.
CAT: Exploiting Inter-Class Dynamics for Domain Adaptive Object Detection
Mikhail Kennerley, Jian-Gang Wang, Bharadwaj Veeravalli et al.
3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination
Jianing "Jed" Yang, Xuweiyi Chen, Nikhil Madaan et al.
3D-HGS: 3D Half-Gaussian Splatting
Haolin Li, Jinyang Liu, Mario Sznaier et al.
DeSiRe-GS: 4D Street Gaussians for Static-Dynamic Decomposition and Surface Reconstruction for Urban Driving Scenes
Chensheng Peng, Chengwei Zhang, Yixiao Wang et al.
LaneCPP: Continuous 3D Lane Detection using Physical Priors
Maximilian Pittner, Joel Janai, Alexandru Paul Condurache
Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models
Zehan Wang, Ziang Zhang, Tianyu Pang et al.
I'M HOI: Inertia-aware Monocular Capture of 3D Human-Object Interactions
Chengfeng Zhao, Juze Zhang, Jiashen Du et al.
No Time to Train: Empowering Non-Parametric Networks for Few-shot 3D Scene Segmentation
Xiangyang Zhu, Renrui Zhang, Bowei He et al.
CAT-3DGS: A Context-Adaptive Triplane Approach to Rate-Distortion-Optimized 3DGS Compression
Yu-Ting Zhan, Cheng-Yuan Ho, He-Bi Yang et al.
FRED: Towards a Full Rotation-Equivariance in Aerial Image Object Detection
Chanho Lee, Jinsu Son, Hyounguk Shon et al.
Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion
Zuoyue Li, Zhenqiang Li, Zhaopeng Cui et al.
Multi-Object Tracking in the Dark
Xinzhe Wang, Kang Ma, Qiankun Liu et al.
CORE4D: A 4D Human-Object-Human Interaction Dataset for Collaborative Object REarrangement
Yun Liu, Chengwen Zhang, Ruofan Xing et al.
Motion2VecSets: 4D Latent Vector Set Diffusion for Non-rigid Shape Reconstruction and Tracking
Wei Cao, Chang Luo, Biao Zhang et al.
Instruct 4D-to-4D: Editing 4D Scenes as Pseudo-3D Scenes Using 2D Diffusion
Linzhan Mou, Jun-Kun Chen, Yu-Xiong Wang
Beyond the Contact: Discovering Comprehensive Affordance for 3D Objects from Pre-trained 2D Diffusion Models
Hyeonwoo Kim, Sookwan Han, Patrick Kwon et al.
Improving Single Domain-Generalized Object Detection: A Focus on Diversification and Alignment
Muhammad Sohail Danish, Muhammad Haris Khan, Muhammad Akhtar Munir et al.
MonoDiff: Monocular 3D Object Detection and Pose Estimation with Diffusion Models
Yasiru Ranasinghe, Deepti Hegde, Vishal M. Patel
SimDistill: Simulated Multi-Modal Distillation for BEV 3D Object Detection
Haimei Zhao, Qiming Zhang, Shanshan Zhao et al.
OpenSight: A Simple Open-Vocabulary Framework for LiDAR-Based Object Detection
Hu Zhang, xu jianhua, Tao Tang et al.
LISO: Lidar-only Self-Supervised 3D Object Detection
Stefan Baur, Frank Moosmann, Andreas Geiger
Doodle Your 3D: From Abstract Freehand Sketches to Precise 3D Shapes
Hmrishav Bandyopadhyay, Subhadeep Koley, Ayan Das et al.
Towards Robust 3D Object Detection with LiDAR and 4D Radar Fusion in Various Weather Conditions
Yujeong Chae, Hyeonseong Kim, Kuk-Jin Yoon
Open-Vocabulary Functional 3D Scene Graphs for Real-World Indoor Spaces
Chenyangguang Zhang, Alexandros Delitzas, Fangjinhua Wang et al.
MANUS: Markerless Grasp Capture using Articulated 3D Gaussians
Chandradeep Pokhariya, Ishaan Shah, Angela Xing et al.
FM-OV3D: Foundation Model-Based Cross-Modal Knowledge Blending for Open-Vocabulary 3D Detection
Dongmei Zhang, Chang Li, Renrui Zhang et al.
GAFusion: Adaptive Fusing LiDAR and Camera with Multiple Guidance for 3D Object Detection
Xiaotian Li, Baojie Fan, Jiandong Tian et al.
Nullu: Mitigating Object Hallucinations in Large Vision-Language Models via HalluSpace Projection
Le Yang, Ziwei Zheng, Boxu Chen et al.
Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation
Mohamed el amine Boudjoghra, Angela Dai, Jean Lahoud et al.
SAR3D: Autoregressive 3D Object Generation and Understanding via Multi-scale 3D VQVAE
YONGWEI CHEN, Yushi Lan, Shangchen Zhou et al.
Multi-Sensor Object Anomaly Detection: Unifying Appearance, Geometry, and Internal Properties
wenqiao Li, BoZhong Zheng, Xiaohao Xu et al.
Deep Orthogonal Hypersphere Compression for Anomaly Detection
Yunhe Zhang, Yan Sun, Jinyu Cai et al.
LSK3DNet: Towards Effective and Efficient 3D Perception with Large Sparse Kernels
Tuo Feng, Wenguan Wang, Fan Ma et al.
CVT-Occ: Cost Volume Temporal Fusion for 3D Occupancy Prediction
Zhangchen Ye, Tao Jiang, Chenfeng Xu et al.
DreamComposer: Controllable 3D Object Generation via Multi-View Conditions
Yunhan Yang, Yukun Huang, Xiaoyang Wu et al.
Decoupled Pseudo-labeling for Semi-Supervised Monocular 3D Object Detection
Jiacheng Zhang, Jiaming Li, Xiangru Lin et al.
Scalable 3D Registration via Truncated Entry-wise Absolute Residuals
Tianyu Huang, Liangzu Peng, Rene Vidal et al.
SEED: A Simple and Effective 3D DETR in Point Clouds
Zhe Liu, Jinghua Hou, Xiaoqing Ye et al.
Cubify Anything: Scaling Indoor 3D Object Detection
Justin Lazarow, David Griffiths, Gefen Kohavi et al.
Dense Projection for Anomaly Detection
Dazhi Fu, Zhao Zhang, Jicong Fan
Zero-Shot Aerial Object Detection with Visual Description Regularization
Chenyu Lin, Zhengqing Zang, Chenwei Tang et al.
Crowd-SAM:SAM as a smart annotator for object detection in crowded scenes
Zhi Cai, Yingjie Gao, Yaoyan Zheng et al.
HandDiff: 3D Hand Pose Estimation with Diffusion on Image-Point Cloud
WENCAN CHENG, Hao Tang, Luc Van Gool et al.
Enhancing 3D Object Detection with 2D Detection-Guided Query Anchors
Haoxuanye Ji, Pengpeng Liang, Erkang Cheng
3DRealCar: An In-the-wild RGB-D Car Dataset with 360-degree Views
Xiaobiao Du, Yida Wang, Haiyang Sun et al.
Language-Driven 6-DoF Grasp Detection Using Negative Prompt Guidance
Tien Toan Nguyen, Minh Nhat Nhat Vu, Baoru Huang et al.
SPEAL: Skeletal Prior Embedded Attention Learning for Cross-Source Point Cloud Registration
Kezheng Xiong, Maoji Zheng, Qingshan Xu et al.