🧬Vision Recognition

Object Detection

Detecting and localizing objects in images

100 papers4,977 total citations
Compare with other topics
Feb '24 Jan '26676 papers
Also includes: object detection, detection, 2d object detection, detector

Top Papers

#1

DETRs Beat YOLOs on Real-time Object Detection

Yian Zhao, Wenyu Lv, Shangliang Xu et al.

CVPR 2024
2,424
citations
#2

Efficient LoFTR: Semi-Dense Local Feature Matching with Sparse-Like Speed

Yifan Wang, Xingyi He, Sida Peng et al.

CVPR 2024
142
citations
#3

Pinwheel-shaped Convolution and Scale-based Dynamic Loss for Infrared Small Target Detection

Jiangnan Yang, Shuangli Liu, Jingjun Wu et al.

AAAI 2025
115
citations
#4

Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation

Sihan liu, Yiwei Ma, Xiaoqing Zhang et al.

CVPR 2024
89
citations
#5

CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition

Feng Lu, Xiangyuan Lan, Lijun Zhang et al.

CVPR 2024
75
citations
#6

Frequency-Spatial Entanglement Learning for Camouflaged Object Detection

Yanguang Sun, Chunyan Xu, Jian Yang et al.

ECCV 2024
68
citations
#7

DQ-DETR: DETR with Dynamic Query for Tiny Object Detection

Yi-Xin Huang, Hou-I Liu, Hong-Han Shuai et al.

ECCV 2024arXiv:2404.03507
tiny object detectiondetr-like methodsdynamic query selectionobject query adjustment+4
56
citations
#8

FBRT-YOLO: Faster and Better for Real-Time Aerial Image Detection

Yao Xiao, Tingfa Xu, Yu Xin et al.

AAAI 2025
55
citations
#9

PointOBB: Learning Oriented Object Detection via Single Point Supervision

Junwei Luo, Xue Yang, Yi Yu et al.

CVPR 2024
51
citations
#10

Few-Shot Object Detection with Foundation Models

Guangxing Han, Ser-Nam Lim

CVPR 2024
50
citations
#11

Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector

Yuqian Fu, Yu Wang, Yixuan Pan et al.

ECCV 2024
48
citations
#12

SDDGR: Stable Diffusion-based Deep Generative Replay for Class Incremental Object Detection

JUNSU KIM, Hoseong Cho, Jihyeon Kim et al.

CVPR 2024
47
citations
#13

DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection

Lewei Yao, Renjie Pi, Jianhua Han et al.

CVPR 2024
45
citations
#14

Boosting Object Detection with Zero-Shot Day-Night Domain Adaptation

Zhipeng Du, Miaojing Shi, Jiankang Deng

CVPR 2024
43
citations
#15

Point2RBox: Combine Knowledge from Synthetic Visual Patterns for End-to-end Oriented Object Detection with Single Point Supervision

Yi Yu, Xue Yang, Qingyun Li et al.

CVPR 2024
43
citations
#16

Towards Robust Event-guided Low-Light Image Enhancement: A Large-Scale Real-World Event-Image Dataset and Novel Approach

Guoqiang Liang, Kanghao Chen, Hangyu Li et al.

CVPR 2024
41
citations
#17

Scene Adaptive Sparse Transformer for Event-based Object Detection

Yansong Peng, Li Hebei, Yueyi Zhang et al.

CVPR 2024
40
citations
#18

A Diffusion-Based Framework for Multi-Class Anomaly Detection

Haoyang He, Jiangning Zhang, Hongxu Chen et al.

AAAI 2024arXiv:2312.06607
diffusion modelsanomaly detectionmulti-class settingsemantic-guided reconstruction+4
40
citations
#19

Watermark Anything With Localized Messages

Tom Sander, Pierre Fernandez, Alain Oliviero Durmus et al.

ICLR 2025arXiv:2411.07231
localized image watermarkingwatermark segmentationmultiple watermark embeddingimperceptibility constraints+4
38
citations
#20

5%>100%: Breaking Performance Shackles of Full Fine-Tuning on Visual Recognition Tasks

Dongshuo Yin, Leiyi Hu, Bin Li et al.

CVPR 2025
38
citations
#21

Unbiased Faster R-CNN for Single-source Domain Generalized Object Detection

Yajing Liu, Shijun Zhou, Xiyao Liu et al.

CVPR 2024
37
citations
#22

ElasticDiffusion: Training-free Arbitrary Size Image Generation through Global-Local Content Separation

Moayed Haji Ali, Guha Balakrishnan, Vicente Ordonez

CVPR 2024
37
citations
#23

DVLO: Deep Visual-LiDAR Odometry with Local-to-Global Feature Fusion and Bi-Directional Structure Alignment

Jiuming Liu, Dong Zhuo, Zhiheng Feng et al.

ECCV 2024arXiv:2403.18274
visual-lidar fusionodometry estimationstructure alignmentlocal-to-global fusion+3
36
citations
#24

UnO: Unsupervised Occupancy Fields for Perception and Forecasting

Ben Agro, Quinlan Sykora, Sergio Casas et al.

CVPR 2024
35
citations
#25

Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models

Yoad Tewel, Rinon Gal, Dvir Samuel et al.

ICLR 2025arXiv:2411.07232
attention mechanismdiffusion modelssemantic image editingobject insertion+3
34
citations
#26

LEGION: Learning to Ground and Explain for Synthetic Image Detection

Hengrui Kang, Siwei Wen, Zichen Wen et al.

ICCV 2025
32
citations
#27

CAT: Exploiting Inter-Class Dynamics for Domain Adaptive Object Detection

Mikhail Kennerley, Jian-Gang Wang, Bharadwaj Veeravalli et al.

CVPR 2024
31
citations
#28

Localization Is All You Evaluate: Data Leakage in Online Mapping Datasets and How to Fix It

Adam Lilja, Junsheng Fu, Erik Stenborg et al.

CVPR 2024
30
citations
#29

LEOD: Label-Efficient Object Detection for Event Cameras

Ziyi Wu, Mathias Gehrig, Qing Lyu et al.

CVPR 2024
30
citations
#30

Exploring Intrinsic Normal Prototypes within a Single Image for Universal Anomaly Detection

Wei Luo, Yunkang Cao, Haiming Yao et al.

CVPR 2025
28
citations
#31

RUN: Reversible Unfolding Network for Concealed Object Segmentation

Chunming He, Rihan Zhang, Fengyang Xiao et al.

ICML 2025
28
citations
#32

ECLIPSE: Efficient Continual Learning in Panoptic Segmentation with Visual Prompt Tuning

Beomyoung Kim, Joonsang Yu, Sung Ju Hwang

CVPR 2024
27
citations
#33

Text-Based Occluded Person Re-identification via Multi-Granularity Contrastive Consistency Learning

Xinyi Wu, Wentao Ma, Dan Guo et al.

AAAI 2024
26
citations
#34

The Devil is in the Fine-Grained Details: Evaluating Open-Vocabulary Object Detectors for Fine-Grained Understanding

Lorenzo Bianchi, Fabio Carrara, Nicola Messina et al.

CVPR 2024
26
citations
#35

FRED: Towards a Full Rotation-Equivariance in Aerial Image Object Detection

Chanho Lee, Jinsu Son, Hyounguk Shon et al.

AAAI 2024arXiv:2401.06159
rotation-equivarianceoriented object detectiondeformable convolutionaerial image analysis+4
26
citations
#36

Attention Disturbance and Dual-Path Constraint Network for Occluded Person Re-identification

Jiaer Xia, Lei Tan, Pingyang Dai et al.

AAAI 2024arXiv:2303.10976
occluded person re-identificationattention mechanismtransformer architecturegeneralization enhancement+3
24
citations
#37

MonoDiff: Monocular 3D Object Detection and Pose Estimation with Diffusion Models

Yasiru Ranasinghe, Deepti Hegde, Vishal M. Patel

CVPR 2024
24
citations
#38

OpenSight: A Simple Open-Vocabulary Framework for LiDAR-Based Object Detection

Hu Zhang, xu jianhua, Tao Tang et al.

ECCV 2024
24
citations
#39

Improving Single Domain-Generalized Object Detection: A Focus on Diversification and Alignment

Muhammad Sohail Danish, Muhammad Haris Khan, Muhammad Akhtar Munir et al.

CVPR 2024
24
citations
#40

Supervised Anomaly Detection for Complex Industrial Images

Aimira Baitieva, David Hurych, Victor Besnier et al.

CVPR 2024
24
citations
#41

Benchmarking Object Detectors with COCO: A New Path Forward

Shweta Singh, Aayan Yadav, Jitesh Jain et al.

ECCV 2024
23
citations
#42

Simple Image-Level Classification Improves Open-Vocabulary Object Detection

Ruohuan Fang, Guansong Pang, Xiao Bai

AAAI 2024arXiv:2312.10439
open-vocabulary object detectionvision-language modelscontextual scene understandingmulti-label recognition+3
22
citations
#43

Weakly-Supervised Temporal Action Localization by Inferring Salient Snippet-Feature

Wu Yun, Mengshi Qi, Chuanming Wang et al.

AAAI 2024arXiv:2303.12332
weakly-supervised temporal action localizationsalient snippet-feature inferencepseudo label generationtemporal structure exploitation+3
21
citations
#44

Sketch and Refine: Towards Fast and Accurate Lane Detection

Chao Chen, Jie Liu, Chang Zhou et al.

AAAI 2024arXiv:2401.14729
lane detectionproposal-based methodskeypoint-based methodslane segment association+2
20
citations
#45

360Loc: A Dataset and Benchmark for Omnidirectional Visual Localization with Cross-device Queries

Huajian Huang, Changkun Liu, Yipeng Zhu et al.

CVPR 2024
20
citations
#46

Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation

Jiaqi Huang, Zunnan Xu, Ting Liu et al.

AAAI 2025
20
citations
#47

Decoupled Pseudo-labeling for Semi-Supervised Monocular 3D Object Detection

Jiacheng Zhang, Jiaming Li, Xiangru Lin et al.

CVPR 2024
19
citations
#48

Continuous Memory Representation for Anomaly Detection

Joo Chan Lee, Taejune Kim, Eunbyung Park et al.

ECCV 2024arXiv:2402.18293
anomaly detectionunsupervised learningmemory-based methodscontinuous representation+3
19
citations
#49

Object Detection using Event Camera: A MoE Heat Conduction based Detector and A New Benchmark Dataset

Xiao Wang, Yu Jin, Wentao Wu et al.

CVPR 2025
18
citations
#50

Dense Projection for Anomaly Detection

Dazhi Fu, Zhao Zhang, Jicong Fan

AAAI 2024
18
citations
#51

OmniGuard: Hybrid Manipulation Localization via Augmented Versatile Deep Image Watermarking

Xuanyu Zhang, Zecheng Tang, Zhipei Xu et al.

CVPR 2025arXiv:2412.01615
digital image watermarkingtamper localizationcopyright protectiongenerative ai editing+4
18
citations
#52

Visible and Clear: Finding Tiny Objects in Difference Map

Bing Cao, Haiyu Yao, Pengfei Zhu et al.

ECCV 2024
18
citations
#53

Zero-Shot Aerial Object Detection with Visual Description Regularization

Chenyu Lin, Zhengqing Zang, Chenwei Tang et al.

AAAI 2024arXiv:2402.18233
zero-shot detectionaerial object detectionvisual description regularizationsemantic-visual correlation+4
18
citations
#54

Crowd-SAM:SAM as a smart annotator for object detection in crowded scenes

Zhi Cai, Yingjie Gao, Yaoyan Zheng et al.

ECCV 2024
18
citations
#55

EffoVPR: Effective Foundation Model Utilization for Visual Place Recognition

Issar Tzachor, Boaz Lerner, Matan Levy et al.

ICLR 2025
18
citations
#56

PAIR Diffusion: A Comprehensive Multimodal Object-Level Image Editor

Vidit Goel, Elia Peruzzo, Yifan Jiang et al.

CVPR 2024
17
citations
#57

Enhancing 3D Object Detection with 2D Detection-Guided Query Anchors

Haoxuanye Ji, Pengpeng Liang, Erkang Cheng

CVPR 2024
17
citations
#58

PARE-Net: Position-Aware Rotation-Equivariant Networks for Robust Point Cloud Registration

Runzhao Yao, Shaoyi Du, Wenting Cui et al.

ECCV 2024arXiv:2407.10142
point cloud registrationrotation-equivariant networksrotation-invariant featuresposition-aware convolution+2
16
citations
#59

ProxyDet: Synthesizing Proxy Novel Classes via Classwise Mixup for Open Vocabulary Object Detection

Joonhyun Jeong, Geondo Park, Jayeon Yoo et al.

AAAI 2024arXiv:2312.07266
open vocabulary object detectionproxy novel classesclip embedding spaceclasswise mixup+4
16
citations
#60

Weakly Supervised Open-Vocabulary Object Detection

Jianghang Lin, Yunhang Shen, Bingquan Wang et al.

AAAI 2024arXiv:2312.12437
weakly supervised object detectionopen-vocabulary object detectionvision-language alignmentdataset bias adaptation+3
16
citations
#61

Motion Prior Knowledge Learning with Homogeneous Language Descriptions for Moving Infrared Small Target Detection

Shengjia Chen, Luping Ji, Weiwei Duan et al.

AAAI 2025
16
citations
#62

Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation

Nicolas Dufour, Vicky Kalogeiton, David Picard et al.

CVPR 2025arXiv:2412.06781
visual geolocationgenerative geolocationdiffusion modelsriemannian flow matching+3
16
citations
#63

Commonsense Prototype for Outdoor Unsupervised 3D Object Detection

Hai Wu, Shijia Zhao, Xun Huang et al.

CVPR 2024
16
citations
#64

ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting

Chen Duan, Pei Fu, Shan Guo et al.

CVPR 2024
16
citations
#65

Semi-supervised Open-World Object Detection

Sahal Shaji Mullappilly, Abhishek Singh Gehlot, Rao Muhammad Anwer et al.

AAAI 2024arXiv:2402.16013
open-world object detectionsemi-supervised learningobject query representationsfeature-alignment scheme+4
15
citations
#66

FD2-Net: Frequency-Driven Feature Decomposition Network for Infrared-Visible Object Detection

Ke Li, Di Wang, Zhangyuan Hu et al.

AAAI 2025
15
citations
#67

Kandinsky Conformal Prediction: Efficient Calibration of Image Segmentation Algorithms

Joren Brunekreef, Eric Marcus, Ray Sheombarsing et al.

CVPR 2024
15
citations
#68

What How and When Should Object Detectors Update in Continually Changing Test Domains?

Jayeon Yoo, Dongkwan Lee, Inseop Chung et al.

CVPR 2024
15
citations
#69

ILIAS: Instance-Level Image retrieval At Scale

Giorgos Kordopatis-Zilos, Vladan Stojnić, Anna Manko et al.

CVPR 2025
15
citations
#70

PointOBB-v2: Towards Simpler, Faster, and Stronger Single Point Supervised Oriented Object Detection

Botao Ren, Xue Yang, Yi Yu et al.

ICLR 2025
15
citations
#71

Video Anomaly Detection with Motion and Appearance Guided Patch Diffusion Model

Hang Zhou, Jiale Cai, Yuteng Ye et al.

AAAI 2025
14
citations
#72

MANTA: A Large-Scale Multi-View and Visual-Text Anomaly Detection Dataset for Tiny Objects

Lei Fan, Dongdong Fan, Zhiguang Hu et al.

CVPR 2025
14
citations
#73

JamMa: Ultra-lightweight Local Feature Matching with Joint Mamba

Xiaoyong Lu, Songlin Du

CVPR 2025arXiv:2503.03437
local feature matchingmamba architecturelinear complexityscan-merge strategy+3
14
citations
#74

CLIFF: Continual Latent Diffusion for Open-Vocabulary Object Detection

Wuyang Li, Xinyu Liu, Jiayi Ma et al.

ECCV 2024
open-vocabulary object detectiondiffusion modelslatent space alignmentdistribution transfer+4
14
citations
#75

ISP-Teacher: Image Signal Process with Disentanglement Regularization for Unsupervised Domain Adaptive Dark Object Detection

Yin Zhang, Yongqiang Zhang, Zian Zhang et al.

AAAI 2024
13
citations
#76

Just a Hint: Point-Supervised Camouflaged Object Detection

Huafeng Chen, Dian SHAO, Guangqian Guo et al.

ECCV 2024
13
citations
#77

SmartEraser: Remove Anything from Images using Masked-Region Guidance

Longtao Jiang, Zhendong Wang, Jianmin Bao et al.

CVPR 2025
12
citations
#78

Focus on Local: Finding Reliable Discriminative Regions for Visual Place Recognition

Changwei Wang, Shunpeng Chen, Yukun Song et al.

AAAI 2025
12
citations
#79

Adapting Fine-Grained Cross-View Localization to Areas without Fine Ground Truth

Zimin Xia, Yujiao Shi, HONGDONG LI et al.

ECCV 2024arXiv:2406.00474
cross-view localizationweakly supervised learningknowledge self-distillationpseudo ground truth+3
12
citations
#80

Weakly Supervised Monocular 3D Detection with a Single-View Image

Xueying Jiang, Sheng Jin, Lewei Lu et al.

CVPR 2024
12
citations
#81

Rethinking Features-Fused-Pyramid-Neck for Object Detection

Hulin Li

ECCV 2024
11
citations
#82

BLADE: Box-Level Supervised Amodal Segmentation through Directed Expansion

Zhaochen Liu, Zhixuan Li, Tingting Jiang

AAAI 2024arXiv:2401.01642
amodal segmentationbox-level supervisiondirected expansionoccluded objects+3
11
citations
#83

LBM: Latent Bridge Matching for Fast Image-to-Image Translation

Clément Chadebec, Onur Tasar, Sanjeev Sreetharan et al.

ICCV 2025
11
citations
#84

MM-CamObj: A Comprehensive Multimodal Dataset for Camouflaged Object Scenarios

Jiacheng Ruan, Wenzhen Yuan, Zehao Lin et al.

AAAI 2025
11
citations
#85

Point2RBox-v2: Rethinking Point-supervised Oriented Object Detection with Spatial Layout Among Instances

Yi Yu, Botao Ren, Peiyuan Zhang et al.

CVPR 2025arXiv:2502.04268
oriented object detectionweakly-supervised detectionpoint annotationsgaussian overlap loss+4
10
citations
#86

Disentangled Pre-training for Human-Object Interaction Detection

Zhuolong Li, Xingao Li, Changxing Ding et al.

CVPR 2024
10
citations
#87

YolOOD: Utilizing Object Detection Concepts for Multi-Label Out-of-Distribution Detection

Alon Zolfi, Guy AmiT, Amit Baras et al.

CVPR 2024
10
citations
#88

Geometry-Guided Domain Generalization for Monocular 3D Object Detection

Fan Yang, Hui Chen, Yuwei He et al.

AAAI 2024
10
citations
#89

Hybrid Global-Local Representation with Augmented Spatial Guidance for Zero-Shot Referring Image Segmentation

Ting Liu, Siyuan Li

CVPR 2025arXiv:2504.00356
zero-shot referring image segmentationhybrid global-local representationspatial guidance augmentationmask region representation+4
10
citations
#90

GLASS: Guided Latent Slot Diffusion for Object-Centric Learning

Krishnakant Singh, Simone Schaub-Meyer, Stefan Roth

CVPR 2025arXiv:2407.17929
object-centric learningslot attention modelslatent slot diffusionobject discovery+3
9
citations
#91

Region-Based Optimization in Continual Learning for Audio Deepfake Detection

Yujie Chen, Jiangyan Yi, Cunhang Fan et al.

AAAI 2025
9
citations
#92

Symbol as Points: Panoptic Symbol Spotting via Point-based Representation

Wenlong Liu, Tianyu Yang, Yuhan Wang et al.

ICLR 2024
9
citations
#93

GIM: A Million-scale Benchmark for Generative Image Manipulation Detection and Localization

Yirui Chen, Xudong Huang, Quan Zhang et al.

AAAI 2025
9
citations
#94

Active Object Detection with Knowledge Aggregation and Distillation from Large Models

Dejie Yang, Yang Liu

CVPR 2024
9
citations
#95

Learning to Make Keypoints Sub-Pixel Accurate

Shinjeong Kim, Marc Pollefeys, Daniel Barath

ECCV 2024
9
citations
#96

SMILe: Leveraging Submodular Mutual Information For Robust Few-Shot Object Detection

Anay Majee, Ryan X Sharp, Rishabh Iyer

ECCV 2024
9
citations
#97

Perspective-Invariant 3D Object Detection

Alan Liang, Lingdong Kong, Dongyue Lu et al.

ICCV 2025arXiv:2507.17665
3d object detectionlidar-based perceptioncross-platform adaptationperspective-invariant detection+4
9
citations
#98

MVREC: A General Few-shot Defect Classification Model Using Multi-View Region-Context

Shuai Lyu, Rongchen Zhang, Zeqi Ma et al.

AAAI 2025
8
citations
#99

RAD: Region-Aware Diffusion Models for Image Inpainting

Sora Kim, Sungho Suh, Minsik Lee

CVPR 2025
8
citations
#100

Finer-CAM: Spotting the Difference Reveals Finer Details for Visual Explanation

Ziheng Zhang, Jianyang Gu, Arpita Chowdhury et al.

CVPR 2025
8
citations