Xu Li

17
Papers
391
Total Citations

Papers (17)

OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation

ICLR 2025 · arXiv
200
citations

MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix

NeurIPS 2025 · arXiv
52
citations

HiFi-123: Towards High-fidelity One Image to 3D Content Generation

ECCV 2024 · arXiv
34
citations

Representation Entanglement for Generation: Training Diffusion Transformers Is Much Easier Than You Think

NeurIPS 2025 · arXiv
27
citations

Is Your Multimodal Language Model Oversensitive to Safe Queries?

ICLR 2025 · arXiv
20
citations

NFIG: Multi-Scale Autoregressive Image Generation via Frequency Ordering

NeurIPS 2025 · arXiv
14
citations

Adversarial Locomotion and Motion Imitation for Humanoid Policy Learning

NeurIPS 2025 · arXiv
13
citations

NoT: Federated Unlearning via Weight Negation

CVPR 2025 · arXiv
11
citations

Enhancing Cognition and Explainability of Multimodal Foundation Models with Self-Synthesized Data

ICLR 2025 · arXiv
6
citations

Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models

ICLR 2025 · arXiv
5
citations

Who You Are Matters: Bridging Interests and Social Roles via LLM-Enhanced Logic Recommendation

NeurIPS 2025
3
citations

GMValuator: Similarity-based Data Valuation for Generative Models

ICLR 2025 · arXiv
2
citations

See through the Dark: Learning Illumination-affined Representations for Nighttime Occupancy Prediction

NeurIPS 2025 · arXiv
2
citations

Cue3D: Quantifying the Role of Image Cues in Single-Image 3D Generation

NeurIPS 2025 · arXiv
1
citation

VimoRAG: Video-based Retrieval-augmented 3D Motion Generation for Motion Language Models

NeurIPS 2025 · arXiv
1
citation

MTRec: Learning to Align with User Preferences via Mental Reward Models

NeurIPS 2025 · arXiv
0
citations

When One Moment Isn't Enough: Multi-Moment Retrieval with Cross-Moment Interactions

NeurIPS 2025 · arXiv
0
citations