Xiaohan Zhang

Papers: 6

Total Citations: 1,646

h-index: 2

Papers (6)

CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer

LVBench: An Extreme Long Video Understanding Benchmark

KoLA: Carefully Benchmarking World Knowledge of Large Language Models

Token-Level Contrastive Learning with Modality-Aware Prompting for Multimodal Intent Recognition

Toy-GS: Assembling Local Gaussians for Precisely Rendering Large-Scale Free Camera Trajectories

OpenEQA: Embodied Question Answering in the Era of Foundation Models