"multi-modal pre-training" Papers
2 papers found
MESED: A Multi-Modal Entity Set Expansion Dataset with Fine-Grained Semantic Classes and Hard Negative Entities
Li Yangning, Tingwei Lu, Hai-Tao Zheng et al.
AAAI 2024 | paper | arXiv:2307.14878
PiTe: Pixel-Temporal Alignment for Large Video-Language Model
Yang Liu, Pengxiang Ding, Siteng Huang et al.
ECCV 2024 | poster | arXiv:2409.07239
9 citations