Jilan Xu

Papers: 8

Total Citations: 1,018

Papers (8)

MVBench: A Comprehensive Multi-modal Video Understanding Benchmark

EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World

CG-Bench: Clue-grounded Question Answering Benchmark for Long Video Understanding

Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning

EgoExoBench: A Benchmark for First- and Third-person View Video Understanding in MLLMs

AOR: Anatomical Ontology-Guided Reasoning for Medical Large Multimodal Model in Chest X-Ray Interpretation

Learning Streaming Video Representation via Multitask Training

Retrieval-Augmented Egocentric Video Captioning