
Self-Supervised Learning

Learning representations without labels

100 papers
1,961 total citations
Feb '24 to Jan '26
449 papers
Also includes: self-supervised learning, ssl, pretext tasks, unsupervised pre-training

Top Papers

#1

SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation

Chongyu Fan, Jiancheng Liu, Yihua Zhang et al.

ICLR 2024
263
citations
#2

Revisiting Feature Prediction for Learning Visual Representations from Video

Quentin Garrido, Yann LeCun, Michael Rabbat et al.

ICLR 2025
178
citations
#3

SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference

Feng Wang, Jieru Mei, Alan Yuille

ECCV 2024
120
citations
#4

Deconstructing Denoising Diffusion Models for Self-Supervised Learning

Xinlei Chen, Zhuang Liu, Saining Xie et al.

ICLR 2025
91
citations
#5

Morphological Prototyping for Unsupervised Slide Representation Learning in Computational Pathology

Andrew Song, Richard J. Chen, Tong Ding et al.

CVPR 2024
74
citations
#6

GAMC: An Unsupervised Method for Fake News Detection Using Graph Autoencoder with Masking

Shu Yin, Peican Zhu, Lianwei Wu et al.

AAAI 2024
arXiv:2312.05739
fake news detection, graph autoencoder, unsupervised learning, self-supervised learning, +4 more
53
citations
#7

SEPT: Towards Efficient Scene Representation Learning for Motion Prediction

Zhiqian Lan, Yuxuan Jiang, Yao Mu et al.

ICLR 2024
45
citations
#8

Continual Self-supervised Learning: Towards Universal Multi-modal Medical Data Representation Learning

Yiwen Ye, Yutong Xie, Jianpeng Zhang et al.

CVPR 2024
42
citations
#9

Scaling Language-Free Visual Representation Learning

David Fan, Shengbang Tong, Jiachen Zhu et al.

ICCV 2025
arXiv:2504.01017
visual self-supervised learning, contrastive language-image pretraining, multimodal representation learning, vision encoders, +2 more
39
citations
#10

Sonata: Self-Supervised Learning of Reliable Point Representations

Xiaoyang Wu, Daniel DeTone, Duncan Frost et al.

CVPR 2025
39
citations
#11

Better Call SAL: Towards Learning to Segment Anything in Lidar

Aljoša Ošep, Tim Meinhardt, Francesco Ferroni et al.

ECCV 2024
38
citations
#12

Multi-resolution HuBERT: Multi-resolution Speech Self-Supervised Learning with Masked Unit Prediction

Jiatong Shi, Hirofumi Inaguma, Xutai Ma et al.

ICLR 2024
36
citations
#13

Self-Supervised Facial Representation Learning with Facial Region Awareness

Zheng Gao, Ioannis Patras

CVPR 2024
29
citations
#14

VideoWorld: Exploring Knowledge Learning from Unlabeled Videos

Zhongwei Ren, Yunchao Wei, Xun Guo et al.

CVPR 2025
28
citations
#15

No Time to Train: Empowering Non-Parametric Networks for Few-shot 3D Scene Segmentation

Xiangyang Zhu, Renrui Zhang, Bowei He et al.

CVPR 2024
27
citations
#16

GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding

Chengyao Wang, Li Jiang, Xiaoyang Wu et al.

CVPR 2024
25
citations
#17

SCD-Net: Spatiotemporal Clues Disentanglement Network for Self-Supervised Skeleton-Based Action Recognition

Cong Wu, Xiao-Jun Wu, Josef Kittler et al.

AAAI 2024
arXiv:2309.05834
skeleton-based action recognition, contrastive learning, spatiotemporal disentanglement, masked image modeling, +4 more
24
citations
#18

Decoupled Spatio-Temporal Consistency Learning for Self-Supervised Tracking

Yaozong Zheng, Bineng Zhong, Qihua Liang et al.

AAAI 2025
24
citations
#19

FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning

Chenhao Li, Elijah Stanger-Jones, Steve Heim et al.

ICLR 2024
23
citations
#20

On the Provable Advantage of Unsupervised Pretraining

Jiawei Ge, Shange Tang, Jianqing Fan et al.

ICLR 2024
22
citations
#21

TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data

Siyi Du, Shaoming Zheng, Yinsong Wang et al.

ECCV 2024
21
citations
#22

Weakly-Supervised Temporal Action Localization by Inferring Salient Snippet-Feature

Wu Yun, Mengshi Qi, Chuanming Wang et al.

AAAI 2024
arXiv:2303.12332
weakly-supervised temporal action localization, salient snippet-feature inference, pseudo label generation, temporal structure exploitation, +3 more
21
citations
#23

SelEx: Self-Expertise in Fine-Grained Generalized Category Discovery

Sarah Rastegar, Mohammadreza Salehi, Yuki M Asano et al.

ECCV 2024
arXiv:2408.14371
generalized category discovery, fine-grained categorization, self-expertise learning, hierarchical pseudo-labeling, +2 more
20
citations
#24

Image Clustering via the Principle of Rate Reduction in the Age of Pretrained Models

Tianzhe Chu, Shengbang Tong, Tianjiao Ding et al.

ICLR 2024
19
citations
#25

A Label-free Heterophily-guided Approach for Unsupervised Graph Fraud Detection

Junjun Pan, Yixin Liu, Xin Zheng et al.

AAAI 2025
18
citations
#26

Weakly Supervised Semantic Segmentation for Driving Scenes

Dongseob Kim, Seungho Lee, Junsuk Choe et al.

AAAI 2024
arXiv:2312.13646
weakly supervised semantic segmentation, driving scene datasets, contrastive language-image pre-training, small object detection, +4 more
17
citations
#27

Label-Agnostic Forgetting: A Supervision-Free Unlearning in Deep Models

Shaofei Shen, Chenhao Zhang, Yawen Zhao et al.

ICLR 2024
17
citations
#28

USP: Unified Self-Supervised Pretraining for Image Generation and Understanding

Xiangxiang Chu, Renda Li, Yong Wang

ICCV 2025
16
citations
#29

Grounded Object-Centric Learning

Avinash Kori, Francesco Locatello, Fabio De Sousa Ribeiro et al.

ICLR 2024
16
citations
#30

R-MAE: Regions Meet Masked Autoencoders

Duy-Kien Nguyen, Yanghao Li, Vaibhav Aggarwal et al.

ICLR 2024
16
citations
#31

Efficient Image Pre-Training with Siamese Cropped Masked Autoencoders

Alexandre Eymaël, Renaud Vandeghen, Anthony Cioppa et al.

ECCV 2024
16
citations
#32

Learning Representations of Satellite Images From Metadata Supervision

Jules Bourcier, Gohar Dashyan, Karteek Alahari et al.

ECCV 2024
13
citations
#33

RI-MAE: Rotation-Invariant Masked AutoEncoders for Self-Supervised Point Cloud Representation Learning

Kunming Su, Qiuxia Wu, Panpan Cai et al.

AAAI 2025
13
citations
#34

Scribble Hides Class: Promoting Scribble-Based Weakly-Supervised Semantic Segmentation

Xinliang Zhang, Lei Zhu, Hangzhou He et al.

AAAI 2024
arXiv:2402.17555
weakly-supervised semantic segmentation, scribble annotation, pseudo-label generation, localization rectification module, +3 more
13
citations
#35

Robust Self-Paced Hashing for Cross-Modal Retrieval with Noisy Labels

Ruitao Pu, Yuan Sun, Yang Qin et al.

AAAI 2025
13
citations
#36

Jasmine: Harnessing Diffusion Prior for Self-supervised Depth Estimation

Jiyuan Wang, Chunyu Lin, Cheng Guan et al.

NeurIPS 2025
12
citations
#37

MICDrop: Masking Image and Depth Features via Complementary Dropout for Domain-Adaptive Semantic Segmentation

Linyan Yang, Lukas Hoyer, Mark Weber et al.

ECCV 2024
arXiv:2408.16478
unsupervised domain adaptation, semantic segmentation, domain gap bridging, geometric information integration, +3 more
12
citations
#38

Pseudo-RIS: Distinctive Pseudo-supervision Generation for Referring Image Segmentation

Seonghoon Yu, Paul Hongsuck Seo, Jeany Son

ECCV 2024
12
citations
#39

An OpenMind for 3D Medical Vision Self-supervised Learning

Tassilo Wald, Constantin Ulrich, Jonathan Suprijadi et al.

ICCV 2025
12
citations
#40

Adaptive Self-training Framework for Fine-grained Scene Graph Generation

Kibum Kim, Kanghoon Yoon, Yeonjun In et al.

ICLR 2024
12
citations
#41

Self-Supervised Disentangled Representation Learning for Robust Target Speech Extraction

Zhaoxi Mu, Xinyu Yang, Sining Sun et al.

AAAI 2024
arXiv:2312.10305
disentangled representation learning, target speech extraction, speaker identity disentanglement, adaptive modulation transformer, +4 more
12
citations
#42

Unsupervised Foundation Model-Agnostic Slide-Level Representation Learning

Tim Lenz, Peter Neidlinger, Marta Ligero et al.

CVPR 2025
12
citations
#43

CuVLER: Enhanced Unsupervised Object Discoveries through Exhaustive Self-Supervised Transformers

Shahaf Arica, Or Rubin, Sapir Gershov et al.

CVPR 2024
12
citations
#44

Unsupervised Gaze Representation Learning from Multi-view Face Images

Yiwei Bao, Feng Lu

CVPR 2024
12
citations
#45

LUDVIG: Learning-Free Uplifting of 2D Visual Features to Gaussian Splatting Scenes

Juliette Marrie, Romain Menegaux, Michael Arbel et al.

ICCV 2025
12
citations
#46

Self-Supervised Any-Point Tracking by Contrastive Random Walks

Ayush Shrivastava, Andrew Owens

ECCV 2024
arXiv:2409.16288
self-supervised learning, point tracking, contrastive random walks, global matching transformer, +4 more
11
citations
#47

CamoTeacher: Dual-Rotation Consistency Learning for Semi-Supervised Camouflaged Object Detection

Xunfa Lai, Zhiyu Yang, Jie Hu et al.

ECCV 2024
11
citations
#48

Dynamic Sub-graph Distillation for Robust Semi-supervised Continual Learning

Yan Fan, Yu Wang, Pengfei Zhu et al.

AAAI 2024
arXiv:2312.16409
semi-supervised continual learning, knowledge distillation, dynamic graph construction, catastrophic forgetting, +2 more
11
citations
#49

Non-parametric Representation Learning with Kernels

Hebaixu Wang, Meiqi Gong, Xiaoguang Mei et al.

AAAI 2024
arXiv:2309.02028
kernel methods, representation learning, self-supervised learning, contrastive learning, +4 more
11
citations
#50

FSFM: A Generalizable Face Security Foundation Model via Self-Supervised Facial Representation Learning

Gaojian Wang, Feng Lin, Tong Wu et al.

CVPR 2025
11
citations
#51

Visual Generation Without Guidance

Huayu Chen, Kai Jiang, Kaiwen Zheng et al.

ICML 2025
10
citations
#52

S4-Driver: Scalable Self-Supervised Driving Multimodal Large Language Model with Spatio-Temporal Visual Representation

Yichen Xie, Runsheng Xu, Tong He et al.

CVPR 2025
10
citations
#53

When the Future Becomes the Past: Taming Temporal Correspondence for Self-supervised Video Representation Learning

Yang Liu, Qianqian Xu, Peisong Wen et al.

CVPR 2025
10
citations
#54

Shape2Scene: 3D Scene Representation Learning Through Pre-training on Shape Data

Tuo Feng, Wenguan Wang, Ruijie Quan et al.

ECCV 2024
10
citations
#55

Ponymation: Learning Articulated 3D Animal Motions from Unlabeled Online Videos

Keqiang Sun, Dori Litvak, Yunzhi Zhang et al.

ECCV 2024
10
citations
#56

Learning to Compose: Improving Object Centric Learning by Injecting Compositionality

Whie Jung, Jaehoon Yoo, Sungjin Ahn et al.

ICLR 2024
10
citations
#57

SpatialDreamer: Self-supervised Stereo Video Synthesis from Monocular Input

Zhen Lv, Yangqi Long, Congzhentao Huang et al.

CVPR 2025
10
citations
#58

LDReg: Local Dimensionality Regularized Self-Supervised Learning

Hanxun Huang, Ricardo Campello, Sarah Erfani et al.

ICLR 2024
9
citations
#59

DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning

Xinghao Wang, Junliang He, Pengyu Wang et al.

AAAI 2024
arXiv:2401.13621
sentence representation learning, contrastive learning methods, semantic textual similarity, denoising objective, +3 more
9
citations
#60

From Flat to Hierarchical: Extracting Sparse Representations with Matching Pursuit

Valérie Costa, Thomas Fel, Ekdeep S Lubana et al.

NeurIPS 2025
9
citations
#61

3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance

Xiaoxu Xu, Yitian Yuan, Jinlong Li et al.

ECCV 2024
9
citations
#62

Objective drives the consistency of representational similarity across datasets

Laure Ciernik, Lorenz Linhardt, Marco Morik et al.

ICML 2025
9
citations
#63

Unsupervised Group Re-identification via Adaptive Clustering-Driven Progressive Learning

Hongxu Chen, Quan Zhang, Jian-Huang Lai et al.

AAAI 2024
9
citations
#64

Knowledge Guided Semi-supervised Learning for Quality Assessment of User Generated Videos

Shankhanil Mitra, Rajiv Soundararajan

AAAI 2024
arXiv:2312.15425
user generated content, video quality assessment, self-supervised learning, semi-supervised learning, +3 more
9
citations
#65

Circumventing Shortcuts in Audio-visual Deepfake Detection Datasets with Unsupervised Learning

Stefan Smeu, Dragos-Alexandru Boldisor, Dan Oneata et al.

CVPR 2025
9
citations
#66

Learning with a Mole: Transferable latent spatial representations for navigation without reconstruction

Guillaume Bono, Leonid Antsfeld, Assem Sadek et al.

ICLR 2024
8
citations
#67

Self-supervised co-salient object detection via feature correspondences at multiple scales

Souradeep Chakraborty, Dimitris Samaras

ECCV 2024
8
citations
#68

Self-Supervised Representation Learning for Adversarial Attack Detection

Yi Li, Plamen Angelov, Neeraj Suri

ECCV 2024
8
citations
#69

UNR-Explainer: Counterfactual Explanations for Unsupervised Node Representation Learning Models

Hyunju Kang, Geonhee Han, Hogun Park

ICLR 2024
7
citations
#70

SSL-STMFormer: Self-Supervised Learning Spatio-Temporal Entanglement Transformer for Traffic Flow Prediction

Zetao Li, Zheng Hu, Peng Han et al.

AAAI 2025
7
citations
#71

STPro: Spatial and Temporal Progressive Learning for Weakly Supervised Spatio-Temporal Grounding

Aaryan Garg, Akash Kumar, Yogesh S. Rawat

CVPR 2025
7
citations
#72

DRL: Decomposed Representation Learning for Tabular Anomaly Detection

Hangting Ye, He Zhao, Wei Fan et al.

ICLR 2025
6
citations
#73

LDP: Generalizing to Multilingual Visual Information Extraction by Language Decoupled Pretraining

Huawen Shen, Gengluo Li, Jinwen Zhong et al.

AAAI 2025
6
citations
#74

Unsupervised Extractive Summarization with Learnable Length Control Strategies

Renlong Jie, Xiaojun Meng, Xin Jiang et al.

AAAI 2024
arXiv:2312.06901
unsupervised extractive summarization, learnable length control, siamese network, bidirectional prediction objective, +4 more
6
citations
#75

CNC-Net: Self-Supervised Learning for CNC Machining Operations

Mohsen Yavartanoo, Sangmin Hong, Reyhaneh Neshatavar et al.

CVPR 2024
6
citations
#76

SVIP: Semantically Contextualized Visual Patches for Zero-Shot Learning

Zhi Chen, Zecheng Zhao, Jingcai Guo et al.

ICCV 2025
6
citations
#77

Learning Graph Invariance by Harnessing Spuriosity

Tianjun Yao, Yongqiang Chen, Kai Hu et al.

ICLR 2025
graph invariant learning, out-of-distribution generalization, graph representation learning, invariant risk minimization, +1 more
5
citations
#78

Unsupervised Object Interaction Learning with Counterfactual Dynamics Models

Jongwook Choi, Sungtae Lee, Xinyu Wang et al.

AAAI 2024
5
citations
#79

AutoCLIP: Auto-tuning Zero-Shot Classifiers for Vision-Language Models

Jan Metzen, Piyapat Saranrittichai, Chaithanya Kumar Mummadi

ICLR 2025
5
citations
#80

Correspondence-Free SE(3) Point Cloud Registration in RKHS via Unsupervised Equivariant Learning

Ray Zhang, Zheming Zhou, Min Sun et al.

ECCV 2024
arXiv:2407.20223
point cloud registration, se(3)-equivariant features, reproducing kernel hilbert space, correspondence-free registration, +3 more
5
citations
#81

Revisit Event Generation Model: Self-Supervised Learning of Event-to-Video Reconstruction with Implicit Neural Representations

Zipeng Wang, Yunfan Lu, Lin Wang

ECCV 2024
5
citations
#82

Learning from the Web: Language Drives Weakly-Supervised Incremental Learning for Semantic Segmentation

Chang Liu, Giulia Rizzoli, Pietro Zanuttigh et al.

ECCV 2024
arXiv:2407.13363
semantic segmentation, weakly-supervised learning, incremental learning, web image mining, +3 more
5
citations
#83

Interpretable Image Classification via Non-parametric Part Prototype Learning

Zhijie Zhu, Lei Fan, Maurice Pagnucco et al.

CVPR 2025
5
citations
#84

Self-supervised Debiasing Using Low Rank Regularization

Geon Yeong Park, Chanyong Jung, Sangmin Lee et al.

CVPR 2024
5
citations
#85

Pose-Aware Self-Supervised Learning with Viewpoint Trajectory Regularization

Jiayun Wang, Yubei Chen, Stella Yu

ECCV 2024
arXiv:2403.14973
self-supervised learning, viewpoint trajectory regularization, pose estimation, visual representation learning, +3 more
4
citations
#86

Epsilon: Exploring Comprehensive Visual-Semantic Projection for Multi-Label Zero-Shot Learning

Ziming Liu, Jingcai Guo, Song Guo et al.

AAAI 2025
4
citations
#87

Revisiting Supervision for Continual Representation Learning

Daniel Marczak, Sebastian Cygert, Tomasz Trzcinski et al.

ECCV 2024
arXiv:2311.13321
continual representation learning, self-supervised learning, multi-layer perceptron projector, feature transferability, +2 more
4
citations
#88

Random Forest Autoencoders for Guided Representation Learning

Adrien Aumon, Shuang Ni, Myriam Lizotte et al.

NeurIPS 2025
4
citations
#89

Collapse-Proof Non-Contrastive Self-Supervised Learning

Emanuele Sansone, Tim Lebailly, Tinne Tuytelaars

ICML 2025
4
citations
#90

Self-supervised contrastive learning performs non-linear system identification

Rodrigo Gonzalez Laiz, Tobias Schmidt, Steffen Schneider

ICLR 2025
4
citations
#91

Self-Training Room Layout via Geometry-aware Ray-casting

Bolivar Solarte, Chin-Hsuan Wu, Jin-Cheng Jhang et al.

ECCV 2024
4
citations
#92

Atom-Level Optical Chemical Structure Recognition with Limited Supervision

Martijn Oldenhof, Edward De Brouwer, Adam Arany et al.

CVPR 2024
4
citations
#93

Representations Shape Weak-to-Strong Generalization: Theoretical Insights and Empirical Predictions

Yihao Xue, Jiping Li, Baharan Mirzasoleiman

ICML 2025
4
citations
#94

Beyond [cls]: Exploring the True Potential of Masked Image Modeling Representations

Marcin Przewięźlikowski, Randall Balestriero, Wojciech Jasiński et al.

ICCV 2025
arXiv:2412.03215
masked image modeling, self-supervised learning, visual representations, attention mechanism, +3 more
4
citations
#95

Channel Consistency Prior and Self-Reconstruction Strategy Based Unsupervised Image Deraining

Guanglu Dong, Tianheng Zheng, Yuanzhouhan Cao et al.

CVPR 2025
4
citations
#96

Learning to Detect Objects from Multi-Agent LiDAR Scans without Manual Labels

Qiming Xia, Wenkai Lin, Haoen Xiang et al.

CVPR 2025
arXiv:2503.08421
unsupervised 3d object detection, multi-agent lidar, collaborative perception, pseudo-label generation, +2 more
4
citations
#97

Generalized Debiased Semi-Supervised Hashing for Large-Scale Image Retrieval

Xingbo Liu, Xuening Zhang, Xiushan Nie et al.

AAAI 2025
3
citations
#98

Exploring a Principled Framework for Deep Subspace Clustering

Xianghan Meng, Zhiyuan Huang, Wei He et al.

ICLR 2025
arXiv:2503.17288
subspace clustering, union of subspaces, self-expressive coefficients, feature space collapse, +3 more
3
citations
#99

Perceptual Group Tokenizer: Building Perception with Iterative Grouping

Zhiwei Deng, Ting Chen, Yang Li

ICLR 2024
3
citations
#100

Efficient Self-Supervised Video Hashing with Selective State Spaces

Jinpeng Wang, Niu Lian, Jun Li et al.

AAAI 2025
3
citations