Leonid Karlinsky
Papers: 30
Total Citations: 249
Papers (30)
Listen, Think, and Understand
ICLR 2024 · arXiv
221 citations
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts
ICLR 2025 · arXiv
19 citations
Enhancing Few-Shot Vision-Language Classification with Large Multimodal Model Features
ICCV 2025
3 citations
CAV-MAE Sync: Improving Contrastive Audio-Visual Mask Autoencoders via Fine-Grained Alignment
CVPR 2025 · arXiv
2 citations
Sample- and Parameter-Efficient Auto-Regressive Image Models
CVPR 2025 · arXiv
2 citations
Teaching VLMs to Localize Specific Objects from In-context Examples
ICCV 2025 · arXiv
2 citations
BATCLIP: Bimodal Online Test-Time Adaptation for CLIP
ICCV 2025 · arXiv
0 citations
LiveXiv - A Multi-Modal live benchmark based on Arxiv papers content
ICLR 2025 · arXiv
0 citations
Fine-Grained Angular Contrastive Learning With Coarse Labels
CVPR 2021 · arXiv
0 citations
Task2Sim: Towards Effective Pre-Training and Transfer From Synthetic Data
CVPR 2022 · arXiv
0 citations
Unsupervised Domain Generalization by Learning a Bridge Across Domains
CVPR 2022 · arXiv
0 citations
CODA-Prompt: COntinual Decomposed Attention-Based Prompting for Rehearsal-Free Continual Learning
CVPR 2023
0 citations
ConStruct-VL: Data-Free Continual Structured VL Concepts Learning
CVPR 2023
0 citations
Teaching Structured Vision & Language Concepts to Vision & Language Models
CVPR 2023
0 citations
A Broad Study on the Transferability of Visual Representations With Contrastive Learning
ICCV 2021 · arXiv
0 citations
Detector-Free Weakly Supervised Grounding by Separation
ICCV 2021 · arXiv
0 citations
Going Beyond Nouns With Vision & Language Models Using Synthetic Data
ICCV 2023 · arXiv
0 citations
MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge
ICCV 2023 · arXiv
0 citations
AR-Net: Adaptive Frame Resolution for Efficient Action Recognition
ECCV 2020
0 citations
OnlineAugment: Online Data Augmentation with Less Domain Knowledge
ECCV 2020
0 citations
TAFSSL: Task-Adaptive Feature Sub-Space Learning for few-shot classification
ECCV 2020
0 citations
A Broader Study of Cross-Domain Few-Shot Learning
ECCV 2020
0 citations
Self-Supervised Classification Network
ECCV 2022
0 citations
Dynamic Distillation Network for Cross-Domain Few-Shot Recognition with Unlabeled Data
NeurIPS 2021 · arXiv
0 citations
Bringing Image Scene Structure to Video via Frame-Clip Consistency of Object Tokens
NeurIPS 2022 · arXiv
0 citations
FETA: Towards Specializing Foundational Models for Expert Task Applications
NeurIPS 2022 · arXiv
0 citations
How Transferable are Video Representations Based on Synthetic Data?
NeurIPS 2022
0 citations
LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections
NeurIPS 2023 · arXiv
0 citations
Learning Human Action Recognition Representations Without Real Humans
NeurIPS 2023 · arXiv
0 citations
Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models
NeurIPS 2023 · arXiv
0 citations