Jie Tang
16
Papers
1,836
Total Citations
Papers (16)
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
ICLR 2025
1,318
citations
LVBench: An Extreme Long Video Understanding Benchmark
ICCV 2025
208
citations
KoLA: Carefully Benchmarking World Knowledge of Large Language Models
ICLR 2024
85
citations
VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents
ICLR 2025
67
citations
Bilateral Propagation Network for Depth Completion
CVPR 2024
51
citations
Scaling Speech-Text Pre-training with Synthetic Interleaved Data
ICLR 2025
39
citations
CATANet: Efficient Content-Aware Token Aggregation for Lightweight Image Super-Resolution
CVPR 2025
23
citations
Sketch and Refine: Towards Fast and Accurate Lane Detection
AAAI 2024
20
citations
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models
ICLR 2025
12
citations
TriSampler: A Better Negative Sampling Principle for Dense Retrieval
AAAI 2024
12
citations
Small Language Model Makes an Effective Long Text Extractor
AAAI 2025
1
citation
Towards Efficient Exact Optimization of Language Model Alignment
ICML 2024
0
citations
VPO: Aligning Text-to-Video Generation Models with Prompt Optimization
ICCV 2025
0
citations
MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models
CVPR 2025
0
citations
CogAgent: A Visual Language Model for GUI Agents
CVPR 2024
0
citations
AutoLUT: LUT-Based Image Super-Resolution with Automatic Sampling and Adaptive Residual Learning
CVPR 2025
0
citations