Junshu Sun
4
Papers
9
Total Citations
Papers (4)
Learning Fine-Grained Representations through Textual Token Disentanglement in Composed Video Retrieval
ICLR 2025
7
citations
Video Language Model Pretraining with Spatio-temporal Masking
CVPR 2025
1
citation
VL-SAE: Interpreting and Enhancing Vision-Language Alignment with a Unified Concept Set
NeurIPS 2025 (arXiv)
1
citation
Relieving the Over-Aggregating Effect in Graph Transformers
NeurIPS 2025
0
citations