2025 "multimodal pretraining" Papers
6 papers found
Active Data Curation Effectively Distills Large-Scale Multimodal Models
Vishaal Udandarao, Nikhil Parthasarathy, Muhammad Ferjad Naeem et al.
CVPR 2025, poster, arXiv:2411.18674
14 citations
Do LVLMs Truly Understand Video Anomalies? Revealing Hallucination via Co-Occurrence Patterns
Menghao Zhang, Huazheng Wang, Pengfei Ren et al.
NEURIPS 2025, poster
EgoM2P: Egocentric Multimodal Multitask Pretraining
Gen Li, Yutong Chen, Yiqian Wu et al.
ICCV 2025, poster, arXiv:2506.07886
4 citations
G2PDiffusion: Cross-species Genotype-to-Phenotype Prediction via Evolutionary Diffusion
Mengdi Liu, Zhangyang Gao, Hong Chang et al.
ICCV 2025, poster, arXiv:2502.04684
1 citation
LISAt: Language-Instructed Segmentation Assistant for Satellite Imagery
Jerome Quenum, Wen-Han Hsieh, Tsung-Han (Patrick) Wu et al.
NEURIPS 2025, poster, arXiv:2505.02829
4 citations
VTDexManip: A Dataset and Benchmark for Visual-tactile Pretraining and Dexterous Manipulation with Reinforcement Learning
Qingtao Liu, Yu Cui, Zhengnan Sun et al.
ICLR 2025, poster
11 citations