"video-text understanding" Papers
4 papers found
DanmakuTPPBench: A Multi-modal Benchmark for Temporal Point Process Modeling and Understanding
Yue Jiang, Jichu Li, Yang Liu et al.
NeurIPS 2025oralarXiv:2505.18411
3
citations
Fact-R1: Towards Explainable Video Misinformation Detection with Deep Reasoning
Fanrui Zhang, Dian Li, Qiang Zhang et al.
NeurIPS 2025posterarXiv:2505.16836
4
citations
Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion
Ishaan Rawal, Alexander Matyasko, Shantanu Jaiswal et al.
ICML 2024poster
DoRA: Weight-Decomposed Low-Rank Adaptation
Shih-Yang Liu, Chien-Yi Wang, Hongxu Yin et al.
ICML 2024poster