"visual representation learning" Papers
17 papers found
MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization
Siyuan Li, Luyuan Zhang, Zedong Wang et al.
MiCo: Multi-image Contrast for Reinforcement Visual Reasoning
Xi Chen, Mingkang Zhu, Shaoteng Liu et al.
Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation
Jiaming Zhou, Teli Ma, Kun-Yu Lin et al.
Nested Diffusion Models Using Hierarchical Latent Priors
Xiao Zhang, Ruoxi Jiang, Rebecca Willett et al.
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
Sihyun Yu, Sangkyung Kwak, Huiwon Jang et al.
Autoencoding Conditional Neural Processes for Representation Learning
Victor Prokhorov, Ivan Titov, Siddharth N
Denoising Autoregressive Representation Learning
Yazhe Li, Jorg Bornschein, Ting Chen
Efficient Unsupervised Visual Representation Learning with Explicit Cluster Balancing
Ioannis Maniadis Metaxas, Georgios Tzimiropoulos, Ioannis Patras
Just Cluster It: An Approach for Exploration in High-Dimensions using Clustering and Pre-Trained Representations
Stefan Sylvius Wagner Martinez, Stefan Harmeling
Learning from Memory: Non-Parametric Memory Augmented Self-Supervised Learning of Visual Features
Thalles Silva, Helio Pedrini, Adín Ramírez Rivera
Modeling Caption Diversity in Contrastive Vision-Language Pretraining
Samuel Lavoie, Polina Kirichenko, Mark Ibrahim et al.
Multi-Label Cluster Discrimination for Visual Representation Learning
Xiang An, Kaicheng Yang, Xiangzi Dai et al.
Pose-Aware Self-Supervised Learning with Viewpoint Trajectory Regularization
Jiayun Wang, Yubei Chen, Stella Yu
Rejuvenating image-GPT as Strong Visual Representation Learners
Sucheng Ren, Zeyu Wang, Hongru Zhu et al.
Towards Efficient and Effective Text-to-Video Retrieval with Coarse-to-Fine Visual Representation Learning
Kaibin Tian, Yanhua Cheng, Yi Liu et al.
Towards Latent Masked Image Modeling for Self-Supervised Visual Representation Learning
Yibing Wei, Abhinav Gupta, Pedro Morgado
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Lianghui Zhu, Bencheng Liao, Qian Zhang et al.