Lu Hou

13 Papers

534 Total Citations

Papers (13)

TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding

VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models

EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions

ILLUME: Illuminating Your LLMs to See, Draw, and Self-Enhance

FlatQuant: Flatness Matters for LLM Quantization

HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models

OAC: Output-adaptive Calibration for Accurate Post-training Quantization

MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric

FETV: A Benchmark for Fine-Grained Evaluation of Open-Domain Text-to-Video Generation

Normalization Helps Training of Quantized LSTM

DynaBERT: Dynamic BERT with Adaptive Width and Depth

Towards Efficient Post-training Quantization of Pre-trained Language Models

Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark