Xu Li
Papers: 17
Total Citations: 391
Papers (17)
OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation
ICLR 2025, arXiv, 200 citations

MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix
NeurIPS 2025, arXiv, 52 citations

HiFi-123: Towards High-fidelity One Image to 3D Content Generation
ECCV 2024, arXiv, 34 citations

Representation Entanglement for Generation: Training Diffusion Transformers Is Much Easier Than You Think
NeurIPS 2025, arXiv, 27 citations

Is Your Multimodal Language Model Oversensitive to Safe Queries?
ICLR 2025, arXiv, 20 citations

NFIG: Multi-Scale Autoregressive Image Generation via Frequency Ordering
NeurIPS 2025, arXiv, 14 citations

Adversarial Locomotion and Motion Imitation for Humanoid Policy Learning
NeurIPS 2025, arXiv, 13 citations

NoT: Federated Unlearning via Weight Negation
CVPR 2025, arXiv, 11 citations

Enhancing Cognition and Explainability of Multimodal Foundation Models with Self-Synthesized Data
ICLR 2025, arXiv, 6 citations

Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models
ICLR 2025, arXiv, 5 citations

Who You Are Matters: Bridging Interests and Social Roles via LLM-Enhanced Logic Recommendation
NeurIPS 2025, 3 citations

GMValuator: Similarity-based Data Valuation for Generative Models
ICLR 2025, arXiv, 2 citations

See through the Dark: Learning Illumination-affined Representations for Nighttime Occupancy Prediction
NeurIPS 2025, arXiv, 2 citations

Cue3D: Quantifying the Role of Image Cues in Single-Image 3D Generation
NeurIPS 2025, arXiv, 1 citation

VimoRAG: Video-based Retrieval-augmented 3D Motion Generation for Motion Language Models
NeurIPS 2025, arXiv, 1 citation

MTRec: Learning to Align with User Preferences via Mental Reward Models
NeurIPS 2025, arXiv, 0 citations

When One Moment Isn't Enough: Multi-Moment Retrieval with Cross-Moment Interactions
NeurIPS 2025, arXiv, 0 citations