3D Object Detection

CVPR 2024 arXiv:2312.02126

#2

SplaTAM: Splat Track & Map 3D Gaussians for Dense RGB-D SLAM

Nikhil Keetha, Jay Karhade, Krishna Murthy Jatavallabhula et al.

477

CVPR 2024 arXiv:2312.08344

#3

FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects

Bowen Wen, Wei Yang, Jan Kautz et al.

412

CVPR 2024 arXiv:2312.16256

#4

DL3DV-10K: A Large-Scale Scene Dataset for Deep Learning-based 3D Vision

Lu Ling, Yichen Sheng, Zhi Tu et al.

266

CVPR 2024 arXiv:2404.08636

#5

Probing the 3D Awareness of Visual Foundation Models

Mohamed El Banani, Amit Raj, Kevis-Kokitsi Maninis et al.

130

CVPR 2024 arXiv:2403.15241

#6

IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection

Junbo Yin, Wenguan Wang, Runnan Chen et al.

81

CVPR 2024 arXiv:2404.03181

#7

MonoCD: Monocular 3D Object Detection with Complementary Depths

Longfei Yan, Pei Yan, Shengzhou Xiong et al.

64

CVPR 2024 arXiv:2311.14897

#8

Towards Scalable 3D Anomaly Detection and Localization: A Benchmark via 3D Anomaly Synthesis and A Self-Supervised Learning Network

Wenqiao Li, Xiaohao Xu, Yao Gu et al.

50

CVPR 2024 arXiv:2404.09216

#9

DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection

Lewei Yao, Renjie Pi, Jianhua Han et al.

45

CVPR 2024 arXiv:2303.14541

#10

UnScene3D: Unsupervised 3D Instance Segmentation for Indoor Scenes

David Rozenberszki, Or Litany, Angela Dai

40

ECCV 2024 arXiv:2403.18118

#11

EgoLifter: Open-world 3D Segmentation for Egocentric Perception

Qiao Gu, Zhaoyang Lv, Duncan Frost et al.

3d gaussian representation, egocentric perception, open-world segmentation, segment anything model (+3 more)

40

CVPR 2024 arXiv:2404.01882

#12

Scene Adaptive Sparse Transformer for Event-based Object Detection

Yansong Peng, Li Hebei, Yueyi Zhang et al.

40

ECCV 2024 arXiv:2407.10862

#13

R3D-AD: Reconstruction via Diffusion for 3D Anomaly Detection

Zheyuan Zhou, Wang Le, Naiyu Fang et al.

36

ICLR 2024 arXiv:2306.00977

#14

AGILE3D: Attention Guided Interactive Multi-object 3D Segmentation

Yuanwen Yue, Sabarinath Mahadevan, Jonas Schult et al.

34

CVPR 2024 arXiv:2405.08909

#15

ADA-Track: End-to-End Multi-Camera 3D Multi-Object Tracking with Alternating Detection and Association

Shuxiao Ding, Lukas Schneider, Marius Cordts et al.

34

ECCV 2024 arXiv:2312.08372

#16

SAM-guided Graph Cut for 3D Instance Segmentation

Haoyu Guo, He Zhu, Sida Peng et al.

3d instance segmentation, multi-view image information, graph cut problem, superpoint graph (+4 more)

32

CVPR 2024 arXiv:2406.00429

#17

Towards Generalizable Multi-Object Tracking

Zheng Qin, Le Wang, Sanping Zhou et al.

32

CVPR 2024 arXiv:2403.19278

#18

CAT: Exploiting Inter-Class Dynamics for Domain Adaptive Object Detection

Mikhail Kennerley, Jian-Gang Wang, Bharadwaj Veeravalli et al.

31

ICML 2025 arXiv:2412.18605

#19

Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models

Zehan Wang, Ziang Zhang, Tianyu Pang et al.

28

CVPR 2024 arXiv:2405.06600

#20

Multi-Object Tracking in the Dark

Xinzhe Wang, Kang Ma, Qiankun Liu et al.

25

CVPR 2024 arXiv:2405.14497

#21

Improving Single Domain-Generalized Object Detection: A Focus on Diversification and Alignment

Muhammad Sohail Danish, Muhammad Haris Khan, Muhammad Akhtar Munir et al.

24

AAAI 2024 arXiv:2303.16818

#22

MonoDiff: Monocular 3D Object Detection and Pose Estimation with Diffusion Models

Yasiru Ranasinghe, Deepti Hegde, Vishal M. Patel

SimDistill: Simulated Multi-Modal Distillation for BEV 3D Object Detection

Haimei Zhao, Qiming Zhang, Shanshan Zhao et al.

3d object detection, multi-view camera, lidar-camera fusion, bird's-eye-view space (+4 more)

24

CVPR 2024 arXiv:2411.00340

#24

LISO: Lidar-only Self-Supervised 3D Object Detection

Stefan Baur, Frank Moosmann, Andreas Geiger

OpenSight: A Simple Open-Vocabulary Framework for LiDAR-Based Object Detection

Hu Zhang, Xu Jianhua, Tao Tang et al.

Towards Robust 3D Object Detection with LiDAR and 4D Radar Fusion in Various Weather Conditions

Yujeong Chae, Hyeonseong Kim, Kuk-Jin Yoon

GAFusion: Adaptive Fusing LiDAR and Camera with Multiple Guidance for 3D Object Detection

Xiaotian Li, Baojie Fan, Jiandong Tian et al.

22

CVPR 2025 arXiv:2411.16856

#28

Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation

Mohamed El Amine Boudjoghra, Angela Dai, Jean Lahoud et al.

SAR3D: Autoregressive 3D Object Generation and Understanding via Multi-scale 3D VQVAE

Yongwei Chen, Yushi Lan, Shangchen Zhou et al.

3d object generation, autoregressive models, vector-quantized variational autoencoder, multi-scale representation (+3 more)

20

CVPR 2025 arXiv:2412.14592

#30

Multi-Sensor Object Anomaly Detection: Unifying Appearance, Geometry, and Internal Properties

Wenqiao Li, Bozhong Zheng, Xiaohao Xu et al.

20

ECCV 2024 arXiv:2407.10749

#31

SEED: A Simple and Effective 3D DETR in Point Clouds

Zhe Liu, Jinghua Hou, Xiaoqing Ye et al.

3d object detection, detection transformers, point cloud processing, query selection mechanisms (+3 more)

19

CVPR 2024 arXiv:2403.17387

#32

Decoupled Pseudo-labeling for Semi-Supervised Monocular 3D Object Detection

Jiacheng Zhang, Jiaming Li, Xiangru Lin et al.

19

CVPR 2025 arXiv:2412.04458

#33

Cubify Anything: Scaling Indoor 3D Object Detection

Justin Lazarow, David Griffiths, Gefen Kohavi et al.

18

CVPR 2024 arXiv:2403.06093

#34

Enhancing 3D Object Detection with 2D Detection-Guided Query Anchors

Haoxuanye Ji, Pengpeng Liang, Erkang Cheng

17

CVPR 2024 arXiv:2404.03159

#35

HandDiff: 3D Hand Pose Estimation with Diffusion on Image-Point Cloud

Wencan Cheng, Hao Tang, Luc Van Gool et al.

17

CVPR 2024 arXiv:2404.14410

#36

Guess The Unseen: Dynamic 3D Scene Reconstruction from Partial 2D Glimpses

Inhee Lee, Byungjun Kim, Hanbyul Joo

16

CVPR 2024 arXiv:2404.16493

#37

Commonsense Prototype for Outdoor Unsupervised 3D Object Detection

Hai Wu, Shijia Zhao, Xun Huang et al.

16

AAAI 2025 arXiv:2412.11489

#38

HGSFusion: Radar-Camera Fusion with Hybrid Generation and Synchronization for 3D Object Detection

Zijian Gu, Jianwei Ma, Yan Huang et al.

14

ECCV 2024 arXiv:2403.13556

#39

Find n' Propagate: Open-Vocabulary 3D Object Detection in Urban Environments

Djamahl Etchegaray, Zi Helen Huang, Tatsuya Harada et al.

14

ICCV 2025 arXiv:2504.07958

#40

3D Open-Vocabulary Panoptic Segmentation with 2D-3D Vision-Language Distillation

Zihao Xiao, Longlong Jing, Shangxuan Wu et al.

Detect Anything 3D in the Wild

Hanxue Zhang, Haoran Jiang, Qingsong Yao et al.

3d object detection, zero-shot generalization, monocular inputs, foundation models (+3 more)

ECCV 2024 arXiv:2407.10753

#42

OPEN: Object-wise Position Embedding for Multi-view 3D Object Detection

Jinghua Hou, Tong Wang, Xiaoqing Ye et al.

ECCV 2024 arXiv:2407.05256

#43

Unlocking Textual and Visual Wisdom: Open-Vocabulary 3D Object Detection Enhanced by Comprehensive Guidance from Text and Image

Pengkun Jiao, Na Zhao, Jingjing Chen et al.

open-vocabulary 3d detection, vision-language models, hierarchical alignment, zero-shot discovery (+2 more)

ICLR 2024 arXiv:2402.08138

#44

H2O-SDF: Two-phase Learning for 3D Indoor Reconstruction using Object Surface Fields

Minyoung Park, Mirae Do, Yeon Jae Shin et al.

CVPR 2024 arXiv:2402.19144

#45

Weakly Supervised Monocular 3D Detection with a Single-View Image

Xueying Jiang, Sheng Jin, Lewei Lu et al.

AAAI 2025 arXiv:2409.01816

#46

GeoBEV: Learning Geometric BEV Representation for Multi-view 3D Object Detection

Jinqing Zhang, Yanan Zhang, Yunlong Qi et al.

CVPR 2024 arXiv:2312.04117

#47

RaCFormer: Towards High-Quality 3D Object Detection via Query-based Radar-Camera Fusion

Xiaomeng Chu, Jiajun Deng, Guoliang You et al.

LabelDistill: Label-guided Cross-modal Knowledge Distillation for Camera-based 3D Object Detection

Sanmin Kim, Youngseok Kim, Sihwan Hwang et al.

Instance Tracking in 3D Scenes from Egocentric Videos

Yunhan Zhao, Haoyu Ma, Shu Kong et al.

11

CVPR 2024 arXiv:2403.01414

#50

Unsigned Orthogonal Distance Fields: An Accurate Neural Implicit Representation for Diverse 3D Shapes

YuJie Lu, Long Wan, Nayu Ding et al.

CVPR 2024 arXiv:2404.01725

#51

Disentangled Pre-training for Human-Object Interaction Detection

Zhuolong Li, Xingao Li, Changxing Ding et al.

AAAI 2025 arXiv:2412.17297

#52

Revisiting Multimodal Fusion for 3D Anomaly Detection from an Architectural Perspective

Kaifang Long, Guoyang Xie, Lianbo Ma et al.

CVPR 2024 arXiv:2212.02081

#53

OctOcc: High-Resolution 3D Occupancy Prediction with Octree

Wenzhe Ouyang, Xiaolin Song, Bailan Feng et al.

YolOOD: Utilizing Object Detection Concepts for Multi-Label Out-of-Distribution Detection

Alon Zolfi, Guy Amit, Amit Baras et al.

CVPR 2025 arXiv:2502.04268

#55

Geometry-Guided Domain Generalization for Monocular 3D Object Detection

Fan Yang, Hui Chen, Yuwei He et al.

Point2RBox-v2: Rethinking Point-supervised Oriented Object Detection with Spatial Layout Among Instances

Yi Yu, Botao Ren, Peiyuan Zhang et al.

oriented object detection, weakly-supervised detection, point annotations, gaussian overlap loss (+4 more)

CVPR 2025 arXiv:2411.08402

#57

V2X-R: Cooperative LiDAR-4D Radar Fusion with Denoising Diffusion for 3D Object Detection

Xun Huang, Jinlong Wang, Qiming Xia et al.

AAAI 2024 arXiv:2407.09787

#58

Semi-supervised 3D Object Detection with PatchTeacher and PillarMix

Xiaopei Wu, Liang Peng, Liang Xie et al.

semi-supervised learning, 3d object detection, pseudo label generation, partial scene detection (+3 more)

9

NeurIPS 2025 arXiv:2411.17761

#59

OpenAD: Open-World Autonomous Driving Benchmark for 3D Object Detection

Zhongyu Xia, Jishuo Li, Zhiwei Lin et al.

8

AAAI 2024 arXiv:2312.16425

#60

In-Hand 3D Object Reconstruction from a Monocular RGB Video

Shijian Jiang, Qi Ye, Rengan Xie et al.

in-hand 3d reconstruction, monocular rgb video, implicit neural representations, occlusion elucidation (+4 more)

7

CVPR 2024 arXiv:2403.04198

#61

Weakly Supervised Few-Shot Object Detection with DETR

Chenbo Zhang, Yinglu Zhang, Lu Zhang et al.

CN-RMA: Combined Network with Ray Marching Aggregation for 3D Indoor Object Detection from Multi-view Images

Guanlin Shen, Jingwei Huang, Zhihua Hu et al.

7

CVPR 2024 arXiv:2404.00679

#63

Weak-to-Strong 3D Object Detection with X-Ray Distillation

Alexander Gambashidze, Aleksandr Dadukin, Maksim Golyadkin et al.

CVPR 2024 arXiv:2211.14456

#64

TetraSphere: A Neural Descriptor for O(3)-Invariant Point Cloud Analysis

Pavlo Melnyk, Andreas Robinson, Michael Felsberg et al.

ECCV 2024 arXiv:2407.11382

#65

Gaussian-Det: Learning Closed-Surface Gaussians for 3D Object Detection

Hongru Yan, Yu Zheng, Yueqi Duan

Segment, Lift and Fit: Automatic 3D Shape Labeling from 2D Prompts

Jianhao Li, Tianyu Sun, Zhongdao Wang et al.

3d shape prediction, 2d to 3d lifting, automatic 3d labeling, instance segmentation (+3 more)

ICCV 2025 arXiv:2505.23044

#67

Functionality Understanding and Segmentation in 3D Scenes

Jaime Corsetti, Francesco Giuliari, Alice Fasoli et al.

SpatialSplat: Efficient Semantic 3D from Sparse Unposed Images

Yu Sheng, Jiajun Deng, Xinran Zhang et al.

semantic 3d reconstruction, 3d gaussian primitives, feedforward 3d reconstruction, dual-field semantic representation (+4 more)

ICCV 2025 arXiv:2503.08407

#69

WildSeg3D: Segment Any 3D Objects in the Wild from 2D Images

Yansong Guo, Jie Hu, Yansong Qu et al.

3d object segmentation, multi-view alignment, feed-forward mechanism, interactive segmentation (+3 more)

ICCV 2025 arXiv:2408.00619

#70

Harnessing Uncertainty-aware Bounding Boxes for Unsupervised 3D Object Detection

Ruiyang Zhang, Hu Zhang, Zhedong Zheng

ECCV 2024 arXiv:2407.15354

#71

Towards RAW Object Detection in Diverse Conditions

Zhong-Yu Li, Xin Jin, Bo-Yuan Sun et al.

Pos3R: 6D Pose Estimation for Unseen Objects Made Easy

Weijian Deng, Dylan Campbell, Chunyi Sun et al.

Learning High-resolution Vector Representation from Multi-Camera Images for 3D Object Detection

Zhili Chen, Shuangjie Xu, Maosheng Ye et al.

3d object detection, bird's-eye-view representation, multi-camera images, vector representation (+3 more)

5

CVPR 2024 arXiv:2404.19384

#74

Omnidirectional Multi-Object Tracking

Kai Luo, Hao Shi, Sheng Wu et al.

Open-World Objectness Modeling Unifies Novel Object Detection

Shan Zhang, Yao Ni, Jinhao Du et al.

Pseudo Label Refinery for Unsupervised Domain Adaptation on Cross-dataset 3D Object Detection

Zhanwei Zhang, Minghao Chen, Shuai Xiao et al.

5

AAAI 2024 arXiv:2401.05011

#77

Dual-Perspective Knowledge Enrichment for Semi-supervised 3D Object Detection

Yucheng Han, Na Zhao, Weiling Chen et al.

semi-supervised 3d object detection, pseudo-label generation, teacher-student models, 3d data annotation (+4 more)

5

AAAI 2024 arXiv:2312.15449

#78

iDet3D: Towards Efficient Interactive Object Detection for LiDAR Point Clouds

Dongmin Choi, Wonwoo Cho, Kangyeol Kim et al.

interactive object detection, lidar point clouds, 3d annotation pipelines, negative click simulation (+4 more)

AAAI 2025 arXiv:2412.10712

#79

Towards Effective, Efficient and Unsupervised Social Event Detection in the Hyperbolic Space

Xiaoyan Yu, Yifan Wei, Shuaishuai Zhou et al.

CVPR 2025 arXiv:2504.21749

#80

Common3D: Self-Supervised Learning of 3D Morphable Models for Common Objects in Neural Feature Space

Leonhard Sommer, Olaf Dünkel, Christian Theobalt et al.

CVPR 2024 arXiv:2403.19220

#81

GeoAuxNet: Towards Universal 3D Representation Learning for Multi-sensor Point Clouds

Shengjun Zhang, Xin Fei, Yueqi Duan

CVPR 2024 arXiv:2405.02781

#82

Instantaneous Perception of Moving Objects in 3D

Di Liu, Bingbing Zhuang, Dimitris N. Metaxas et al.

CVPR 2024 arXiv:2403.19022

#83

WALT3D: Generating Realistic Training Data from Time-Lapse Imagery for Reconstructing Dynamic Objects Under Occlusion

Khiem Vuong, N. Dinesh Reddy, Robert Tamburo et al.

AAAI 2025 arXiv:2503.16811

#84

Seg2Box: 3D Object Detection by Point-Wise Semantics Supervision

Maoji Zheng, Ziyu Xu, Qiming Xia et al.

NeurIPS 2025 arXiv:2601.01676

#85

SSLFusion: Scale and Space Aligned Latent Fusion Model for Multimodal 3D Object Detection

Bonan Ding, Jin Xie, Jing Nie et al.

DALDet: Depth-Aware Learning Based Object Detection for Autonomous Driving

K. Hu, Tongbo Cao, Yuan Li et al.

LabelAny3D: Label Any Object 3D in the Wild

Jin Yao, Radowan Mahmud Redoy, Sebastian Elbaum et al.

monocular 3d detection, 3d bounding box annotation, analysis-by-synthesis framework, open-vocabulary detection (+4 more)

CVPR 2025 arXiv:2503.15211

#88

Details Matter for Indoor Open-vocabulary 3D Instance Segmentation

Sanghun Jung, Jingjing Zheng, Ke Zhang et al.

GO-N3RDet: Geometry Optimized NeRF-enhanced 3D Object Detector

Zechuan Li, Hongshan Yu, Yihao Ding et al.

neural radiance fields, 3d object detection, multi-view feature fusion, voxel representation (+3 more)

ICCV 2025 arXiv:2507.23567

#90

3D-MOOD: Lifting 2D to 3D for Monocular Open-Set Object Detection

Yung-Hsu Yang, Luigi Piccinelli, Mattia Segu et al.

ICCV 2025 arXiv:2503.16399

#91

SA-Occ: Satellite-Assisted 3D Occupancy Prediction in Real World

Chen Chen, Zhirui Wang, Taowei Sheng et al.

ICCV 2025 arXiv:2509.08388

#92

Semantic Causality-Aware Vision-Based 3D Occupancy Prediction

Dubing Chen, Huan Zheng, Yucheng Zhou et al.

3d semantic occupancy prediction, vision-based 3d reconstruction, 2d-to-3d transformation, semantic causality (+4 more)

CVPR 2025 arXiv:2409.18733

#93

Search and Detect: Training-Free Long Tail Object Detection via Web-Image Retrieval

Mankeerat Sidhu, Hetarth Chopra, Ansel Blume et al.

CVPR 2025 arXiv:2503.08352

#94

SP3D: Boosting Sparsely-Supervised 3D Object Detection via Accurate Cross-Modal Semantic Prompts

Shijia Zhao, Qiming Xia, Xusheng Guo et al.

Mitigating Ambiguities in 3D Classification with Gaussian Splatting

Ruiqi Zhang, Hao Zhu, Jingyi Zhao et al.

ICCV 2025 arXiv:2411.10715

#96

EVT: Efficient View Transformation for Multi-Modal 3D Object Detection

Yongjin Lee, Hyeon-Mun Jeong, Yurim Jeon et al.

CVPR 2025 arXiv:2503.20235

#97

Leveraging 3D Geometric Priors in 2D Rotation Symmetry Detection

Ahyun Seo, Minsu Cho

AAAI 2025 arXiv:2412.09050

#98

Interactive 3D Object Detection with Prompts

Ruifei Zhang, Xiangru Lin, Wei Zhang et al.

Rethinking Lanes and Points in Complex Scenarios for Monocular 3D Lane Detection

Yifan Chang, Junjie Huang, Xiaofeng Wang et al.

ContextHOI: Spatial Context Learning for Human-Object Interaction Detection

Mingda Jia, Liming Zhao, Ge Li et al.