🧬Vision Recognition

Panoptic Segmentation

Unified semantic and instance segmentation

100 papers4,297 total citations
Compare with other topics
Feb '24 Jan '26931 papers
Also includes: panoptic segmentation, scene understanding

Top Papers

#1

EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything

Yunyang Xiong, Balakrishnan Varadarajan, Lemeng Wu et al.

CVPR 2024
241
citations
#2

SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery

Xin Guo, Jiangwei Lao, Bo Dang et al.

CVPR 2024
236
citations
#3

Uni3D: Exploring Unified 3D Representation at Scale

Junsheng Zhou, Jinsheng Wang, Baorui Ma et al.

ICLR 2024
165
citations
#4

SGS-SLAM: Semantic Gaussian Splatting For Neural Dense SLAM

Mingrui Li, Shuhong Liu, Heng Zhou et al.

ECCV 2024arXiv:2402.03246
gaussian splattingvisual slamsemantic segmentationneural implicit slam+4
131
citations
#5

GSVA: Generalized Segmentation via Multimodal Large Language Models

Zhuofan Xia, Dongchen Han, Yizeng Han et al.

CVPR 2024
127
citations
#6

SCTNet: Single Branch CNN with Transformer Semantic Information for Real-Time Segmentation

Authors: Zhengze Xu, Dongyue Wu, Changqian Yu et al.

AAAI 2024arXiv:2312.17071
real-time segmentationsemantic segmentationsingle branch cnntransformer semantic information+3
126
citations
#7

The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World

Weiyun Wang, Min Shi, Qingyun Li et al.

ICLR 2024
118
citations
#8

SurgicalSAM: Efficient Class Promptable Surgical Instrument Segmentation

Wenxi Yue, Jing Zhang, Kun Hu et al.

AAAI 2024arXiv:2308.08746
surgical instrument segmentationclass prompt encodercontrastive prototype learningfoundation model adaptation+4
110
citations
#9

Stronger Fewer & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation

ZHIXIANG WEI, Lin Chen, Xiaoxiao Ma et al.

CVPR 2024
100
citations
#10

SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation

Bin Xie, Jiale Cao, Jin Xie et al.

CVPR 2024
90
citations
#11

Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation

Sihan liu, Yiwei Ma, Xiaoqing Zhang et al.

CVPR 2024
89
citations
#12

PSALM: Pixelwise Segmentation with Large Multi-modal Model

Zheng Zhang, YeYao Ma, Enming Zhang et al.

ECCV 2024arXiv:2403.14598
large multimodal modelsimage segmentationmask decoderreferring expression segmentation+4
82
citations
#13

GROUNDHOG: Grounding Large Language Models to Holistic Segmentation

Yichi Zhang, Ziqiao Ma, Xiaofeng Gao et al.

CVPR 2024
75
citations
#14

ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference

Mengcheng Lan, Chaofeng Chen, Yiping Ke et al.

ECCV 2024
68
citations
#15

OmniSeg3D: Omniversal 3D Segmentation via Hierarchical Contrastive Learning

Haiyang Ying, Yixuan Yin, Jinzhi Zhang et al.

CVPR 2024
66
citations
#16

Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation

Shuting He, Henghui Ding

CVPR 2024
64
citations
#17

pix2gestalt: Amodal Segmentation by Synthesizing Wholes

Ege Ozguroglu, Ruoshi Liu, Dídac Surís et al.

CVPR 2024
61
citations
#18

LaneSegNet: Map Learning with Lane Segment Perception for Autonomous Driving

Tianyu Li, Peijin Jia, Bangjun Wang et al.

ICLR 2024
60
citations
#19

Neural Parametric Gaussians for Monocular Non-Rigid Object Reconstruction

Devikalyan Das, Christopher Wewer, Raza Yunus et al.

CVPR 2024
57
citations
#20

MuSc: Zero-Shot Industrial Anomaly Classification and Segmentation with Mutual Scoring of the Unlabeled Images

Xurui Li, Ziming Huang, Feng Xue et al.

ICLR 2024
55
citations
#21

MaskClustering: View Consensus based Mask Graph Clustering for Open-Vocabulary 3D Instance Segmentation

Mi Yan, Jiazhao Zhang, Yan Zhu et al.

CVPR 2024
48
citations
#22

Collaborative Semantic Occupancy Prediction with Hybrid Feature Fusion in Connected Automated Vehicles

Rui Song, Chenwei Liang, Hu Cao et al.

CVPR 2024
45
citations
#23

FlashSplat: 2D to 3D Gaussian Splatting Segmentation Solved Optimally

Qiuhong Shen, Xingyi Yang, Xinchao Wang

ECCV 2024
45
citations
#24

MET3R: Measuring Multi-View Consistency in Generated Images

Mohammad Asim, Christopher Wewer, Thomas Wimmer et al.

CVPR 2025arXiv:2501.06336
multi-view consistencyimage generation3d reconstructionnovel view synthesis+3
43
citations
#25

UniPortrait: A Unified Framework for Identity-Preserving Single- and Multi-Human Image Personalization

Junjie He, Yifeng Geng, Liefeng Bo

ICCV 2025
43
citations
#26

UnScene3D: Unsupervised 3D Instance Segmentation for Indoor Scenes

David Rozenberszki, Or Litany, Angela Dai

CVPR 2024
40
citations
#27

VISTA3D: A Unified Segmentation Foundation Model For 3D Medical Imaging

Yufan He, Pengfei Guo, Yucheng Tang et al.

CVPR 2025
38
citations
#28

Multi-view Aggregation Network for Dichotomous Image Segmentation

Qian Yu, Xiaoqi Zhao, Youwei Pang et al.

CVPR 2024
38
citations
#29

Better Call SAL: Towards Learning to Segment Anything in Lidar

Aljoša Ošep, Tim Meinhardt, Francesco Ferroni et al.

ECCV 2024
38
citations
#30

GaussianOcc: Fully Self-supervised and Efficient 3D Occupancy Estimation with Gaussian Splatting

Wanshui Gan, Fang Liu, Hongbin Xu et al.

ICCV 2025
37
citations
#31

Unbiased Faster R-CNN for Single-source Domain Generalized Object Detection

Yajing Liu, Shijun Zhou, Xiyao Liu et al.

CVPR 2024
37
citations
#32

PEM: Prototype-based Efficient MaskFormer for Image Segmentation

Niccolò Cavagnero, Gabriele Rosi, Claudia Cuttano et al.

CVPR 2024
37
citations
#33

SegPoint: Segment Any Point Cloud via Large Language Model

Shuting He, Henghui Ding, Xudong Jiang et al.

ECCV 2024arXiv:2407.13761
3d point cloud segmentationinstruction segmentationreferring segmentationsemantic segmentation+3
37
citations
#34

TinySAM: Pushing the Envelope for Efficient Segment Anything Model

Han Shu, Wenshuo Li, Yehui Tang et al.

AAAI 2025
37
citations
#35

VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation

XuDong Wang, Ishan Misra, Ziyun Zeng et al.

CVPR 2024
36
citations
#36

Distilling Semantic Priors from SAM to Efficient Image Restoration Models

Quan Zhang, Xiaoyu Liu, Wei Li et al.

CVPR 2024
36
citations
#37

RobustSAM: Segment Anything Robustly on Degraded Images

Wei-Ting Chen, Yu Jiet Vong, Sy-Yen Kuo et al.

CVPR 2024
35
citations
#38

SD-MVS: Segmentation-Driven Deformation Multi-View Stereo with Spherical Refinement and EM Optimization

Zhenlong Yuan, Jiakai Cao, Zhaoxin Li et al.

AAAI 2024arXiv:2401.06385
multi-view stereo3d reconstructiontextureless areassegment anything model+4
35
citations
#39

UnO: Unsupervised Occupancy Fields for Perception and Forecasting

Ben Agro, Quinlan Sykora, Sergio Casas et al.

CVPR 2024
35
citations
#40

Unraveling Instance Associations: A Closer Look for Audio-Visual Segmentation

Yuanhong Chen, Yuyuan Liu, Hu Wang et al.

CVPR 2024
34
citations
#41

Collaborating Foundation Models for Domain Generalized Semantic Segmentation

Yasser Benigmim, Subhankar Roy, Slim Essid et al.

CVPR 2024
32
citations
#42

SAM-guided Graph Cut for 3D Instance Segmentation

Haoyu Guo, He Zhu, Sida Peng et al.

ECCV 2024
32
citations
#43

Cross-view image geo-localization with Panorama-BEV Co-Retrieval Network

ye junyan, Zhutao Lv, Li Weijia et al.

ECCV 2024
31
citations
#44

Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion

Bohan Li, Jiajun Deng, Wenyao Zhang et al.

ECCV 2024arXiv:2407.02077
semantic scene completiontemporal context learningcamera-based 3d reconstructioncross-frame affinity measurement+4
31
citations
#45

Towards the Uncharted: Density-Descending Feature Perturbation for Semi-supervised Semantic Segmentation

Xiaoyang Wang, Huihui Bai, Limin Yu et al.

CVPR 2024
30
citations
#46

Multi-Space Alignments Towards Universal LiDAR Segmentation

Youquan Liu, Lingdong Kong, Xiaoyang Wu et al.

CVPR 2024
30
citations
#47

Relevant Intrinsic Feature Enhancement Network for Few-Shot Semantic Segmentation

Xiaoyi Bao, Jie Qin, Siyang Sun et al.

AAAI 2024arXiv:2312.06474
few-shot semantic segmentationintrinsic feature enhancementmulti-level prototype generationsemantic ambiguity+4
30
citations
#48

Style Blind Domain Generalized Semantic Segmentation via Covariance Alignment and Semantic Consistence Contrastive Learning

Woo-Jin Ahn, Geun-Yeong Yang, Hyunduck Choi et al.

CVPR 2024
30
citations
#49

EAGLE: Eigen Aggregation Learning for Object-Centric Unsupervised Semantic Segmentation

Chanyoung Kim, Woojung Han, Dayun Ju et al.

CVPR 2024
30
citations
#50

Unifying Multi-Modal Uncertainty Modeling and Semantic Alignment for Text-to-Image Person Re-identification

Zhiwei Zhao, Bin Liu, Yan Lu et al.

AAAI 2024
29
citations
#51

Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos

Weifeng Lin, Xinyu Wei, Ruichuan An et al.

NeurIPS 2025arXiv:2506.05302
region-level visual understandingobject segmentationlarge language modelssemantic perceiver+4
29
citations
#52

UniGarmentManip: A Unified Framework for Category-Level Garment Manipulation via Dense Visual Correspondence

Ruihai Wu, Haoran Lu, Yiyan Wang et al.

CVPR 2024
29
citations
#53

Frozen CLIP: A Strong Backbone for Weakly Supervised Semantic Segmentation

Bingfeng Zhang, Siyue Yu, Yunchao Wei et al.

CVPR 2024
29
citations
#54

Hunting Attributes: Context Prototype-Aware Learning for Weakly Supervised Semantic Segmentation

feilong tang, Zhongxing Xu, Zhaojun QU et al.

CVPR 2024
29
citations
#55

Open-Vocabulary Semantic Segmentation with Image Embedding Balancing

Xiangheng Shan, Dongyue Wu, Guilin Zhu et al.

CVPR 2024
28
citations
#56

RUN: Reversible Unfolding Network for Concealed Object Segmentation

Chunming He, Rihan Zhang, Fengyang Xiao et al.

ICML 2025
28
citations
#57

ECLIPSE: Efficient Continual Learning in Panoptic Segmentation with Visual Prompt Tuning

Beomyoung Kim, Joonsang Yu, Sung Ju Hwang

CVPR 2024
27
citations
#58

No Time to Train: Empowering Non-Parametric Networks for Few-shot 3D Scene Segmentation

Xiangyang Zhu, Renrui Zhang, Bowei He et al.

CVPR 2024
27
citations
#59

Text-Based Occluded Person Re-identification via Multi-Granularity Contrastive Consistency Learning

Xinyi Wu, Wentao Ma, Dan Guo et al.

AAAI 2024
26
citations
#60

UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios

Baichuan Zhou, Haote Yang, Dairong Chen et al.

AAAI 2025
26
citations
#61

Skeleton-in-Context: Unified Skeleton Sequence Modeling with In-Context Learning

Xinshun Wang, Zhongbin Fang, Xia Li et al.

CVPR 2024
26
citations
#62

Curriculum Point Prompting for Weakly-Supervised Referring Image Segmentation

Qiyuan Dai, Sibei Yang

CVPR 2024
25
citations
#63

Image-to-Image Matching via Foundation Models: A New Perspective for Open-Vocabulary Semantic Segmentation

Yuan Wang, Rui Sun, Naisong Luo et al.

CVPR 2024
25
citations
#64

SG-BEV: Satellite-Guided BEV Fusion for Cross-View Semantic Segmentation

Junyan Ye, Qiyan Luo, Jinhua Yu et al.

CVPR 2024
25
citations
#65

Mask2Map: Vectorized HD Map Construction Using Bird's Eye View Segmentation Masks

Sehwan Choi, Jun Won Choi, JUNGHO KIM et al.

ECCV 2024
25
citations
#66

360+x: A Panoptic Multi-modal Scene Understanding Dataset

Hao Chen, Yuqi Hou, Chenyuan Qu et al.

CVPR 2024
24
citations
#67

Tyche: Stochastic In-Context Learning for Medical Image Segmentation

Marianne Rakic, Hallee Wong, Jose Javier Gonzalez Ortiz et al.

CVPR 2024
24
citations
#68

Interactive Medical Image Segmentation: A Benchmark Dataset and Baseline

Junlong Cheng, Bin Fu, Jin Ye et al.

CVPR 2025
24
citations
#69

Semantic-aware SAM for Point-Prompted Instance Segmentation

Zhaoyang Wei, Pengfei Chen, Xuehui Yu et al.

CVPR 2024
22
citations
#70

Reliability in Semantic Segmentation: Can We Use Synthetic Data?

Thibaut Loiseau, Tuan Hung Vu, Mickael Chen et al.

ECCV 2024
22
citations
#71

CPath-Omni: A Unified Multimodal Foundation Model for Patch and Whole Slide Image Analysis in Computational Pathology

Yuxuan Sun, Yixuan Si, Chenglu Zhu et al.

CVPR 2025arXiv:2412.12077
computational pathologymultimodal foundation modelwhole slide image analysisvisual question answering+4
22
citations
#72

SegMAN: Omni-scale Context Modeling with State Space Models and Local Attention for Semantic Segmentation

Yunxiang Fu, Meng Lou, Yizhou Yu

CVPR 2025arXiv:2412.11890
semantic segmentationstate space modelsglobal context modelinglocal attention+3
22
citations
#73

Rethinking Few-shot 3D Point Cloud Semantic Segmentation

Zhaochong An, Guolei Sun, Yun Liu et al.

CVPR 2024
21
citations
#74

GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmentation

Lang Lin, Xueyang Yu, Ziqi Pang et al.

CVPR 2025
21
citations
#75

Clustering Propagation for Universal Medical Image Segmentation

Yuhang Ding, Liulei Li, Wenguan Wang et al.

CVPR 2024
21
citations
#76

Region-Adaptive Transform with Segmentation Prior for Image Compression

Yuxi Liu, Wenhan Yang, Huihui Bai et al.

ECCV 2024arXiv:2403.00628
learned image compressionregion-adaptive transformsegmentation prioradaptive convolutions+3
21
citations
#77

Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach

Mir Rayat Imtiaz Hossain, Mennatullah Siam, Leonid Sigal et al.

CVPR 2024
21
citations
#78

PrPSeg: Universal Proposition Learning for Panoramic Renal Pathology Segmentation

Ruining Deng, Quan Liu, Can Cui et al.

CVPR 2024
21
citations
#79

An Incremental Unified Framework for Small Defect Inspection

Jiaqi Tang, Hao Lu, Xiaogang Xu et al.

ECCV 2024arXiv:2312.08917
defect inspectionincremental learningobject-aware self-attentionsemantic compression loss+4
21
citations
#80

ZePT: Zero-Shot Pan-Tumor Segmentation via Query-Disentangling and Self-Prompting

Yankai Jiang, Zhongzhen Huang, Rongzhao Zhang et al.

CVPR 2024
20
citations
#81

OmniSAM: Omnidirectional Segment Anything Model for UDA in Panoramic Semantic Segmentation

Ding Zhong, Xu Zheng, Chenfei Liao et al.

ICCV 2025arXiv:2503.07098
panoramic semantic segmentationsegment anything modeldomain adaptationmemory mechanism+4
20
citations
#82

360Loc: A Dataset and Benchmark for Omnidirectional Visual Localization with Cross-device Queries

Huajian Huang, Changkun Liu, Yipeng Zhu et al.

CVPR 2024
20
citations
#83

Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation

Jiaqi Huang, Zunnan Xu, Ting Liu et al.

AAAI 2025
20
citations
#84

Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation

Yunheng Li, Zhong-Yu Li, Quan-Sheng Zeng et al.

ICML 2024
vision-language modelszero-shot semantic segmentationmulti-level visual featuresembedding alignment+2
20
citations
#85

Trusted Unified Feature-Neighborhood Dynamics for Multi-View Classification

Haojian Huang, Chuanyu Qin, Zhe Liu et al.

AAAI 2025
20
citations
#86

Image-Text Co-Decomposition for Text-Supervised Semantic Segmentation

Ji-Jia Wu, Andy Chia-Hao Chang, Chieh-Yu Chuang et al.

CVPR 2024
19
citations
#87

The Devil is in Temporal Token: High Quality Video Reasoning Segmentation

Sitong Gong, Yunzhi Zhuge, Lu Zhang et al.

CVPR 2025
19
citations
#88

ODIN: A Single Model for 2D and 3D Segmentation

Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios et al.

CVPR 2024
19
citations
#89

Open-Set Domain Adaptation for Semantic Segmentation

Seun-An Choe, Ah-Hyung Shin, Keon Hee Park et al.

CVPR 2024
18
citations
#90

FreePoint: Unsupervised Point Cloud Instance Segmentation

Zhikai Zhang, Jian Ding, Li Jiang et al.

CVPR 2024
18
citations
#91

Crowd-SAM:SAM as a smart annotator for object detection in crowded scenes

Zhi Cai, Yingjie Gao, Yaoyan Zheng et al.

ECCV 2024
18
citations
#92

MSP-MVS: Multi-Granularity Segmentation Prior Guided Multi-View Stereo

Zhenlong Yuan, Cong Liu, Fei Shen et al.

AAAI 2025
18
citations
#93

InsMapper: Exploring Inner-instance Information for Vectorized HD Mapping

Zhenhua Xu, Kwan-Yee K. Wong, Hengshuang ZHAO

ECCV 2024arXiv:2308.08543
vectorized hd mappingattention mechanismautonomous drivingpoint set prediction+4
18
citations
#94

RMem: Restricted Memory Banks Improve Video Object Segmentation

Junbao Zhou, Ziqi Pang, Yu-Xiong Wang

CVPR 2024
18
citations
#95

Mamba as a Bridge: Where Vision Foundation Models Meet Vision Language Models for Domain-Generalized Semantic Segmentation

Xin Zhang, Robby T. Tan

CVPR 2025
17
citations
#96

Weakly Supervised Semantic Segmentation for Driving Scenes

Dongseob Kim, Seungho Lee, Junsuk Choe et al.

AAAI 2024arXiv:2312.13646
weakly supervised semantic segmentationdriving scene datasetscontrastive language-image pre-trainingsmall object detection+4
17
citations
#97

SuperNormal: Neural Surface Reconstruction via Multi-View Normal Integration

Xu Cao, Takafumi Taketomi

CVPR 2024
17
citations
#98

PUMA: Empowering Unified MLLM with Multi-granular Visual Generation

Rongyao Fang, Chengqi Duan, Kun Wang et al.

ICCV 2025
17
citations
#99

SFPNet: Sparse Focal Point Network for Semantic Segmentation on General LiDAR Point Clouds

Yanbo Wang, Wentao Zhao, Cao Chuan et al.

ECCV 2024
17
citations
#100

PartSTAD: 2D-to-3D Part Segmentation Task Adaptation

Hyunjin Kim, Minhyuk Sung

ECCV 2024
16
citations