"multi-modal tasks" Papers
4 papers found
Balanced Token Pruning: Accelerating Vision Language Models Beyond Local Optimization
Kaiyuan Li, Xiaoyue Chen, Chen Gao et al.
NeurIPS 2025 poster, arXiv:2505.22038
4 citations
Dataset Growth
Ziheng Qin, Zhaopan Xu, Yukun Zhou et al.
ECCV 2024 poster, arXiv:2405.18347
4 citations
PiTe: Pixel-Temporal Alignment for Large Video-Language Model
Yang Liu, Pengxiang Ding, Siteng Huang et al.
ECCV 2024 poster, arXiv:2409.07239
9 citations
p-Laplacian Adaptation for Generative Pre-trained Vision-Language Models
Haoyuan Wu, Xinyun Zhang, Peng Xu et al.
AAAI 2024 paper, arXiv:2312.10613