32
Papers
2,298
Total Citations
2
h-index
1
Affiliation
Affiliations
Xidian University
Papers (32)
MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models
NeurIPS 2025
1,227
citations
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
CVPR 2025
858
citations
Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM
ICML 2025
103
citations
Constrained Bayesian Optimization under Partial Observations: Balanced Improvements and Provable Convergence
AAAI 2024 (arXiv)
19
citations
Weakly Supervised Open-Vocabulary Object Detection
AAAI 2024 (arXiv)
16
citations
FD2-Net: Frequency-Driven Feature Decomposition Network for Infrared-Visible Object Detection
AAAI 2025
15
citations
Reinforcement Learning Friendly Vision-Language Model for Minecraft
ECCV 2024
14
citations
SPD-DDPM: Denoising Diffusion Probabilistic Models in the Symmetric Positive Definite Space
AAAI 2024 (arXiv)
13
citations
Feature Denoising Diffusion Model for Blind Image Quality Assessment
AAAI 2025
8
citations
Task-Adaptive Saliency Guidance for Exemplar-free Class Incremental Learning
CVPR 2024
8
citations
Destroy and Repair Using Hyper-Graphs for Routing
AAAI 2025
7
citations
VA-AR: Learning Velocity-Aware Action Representations with Mixture of Window Attention
AAAI 2025
4
citations
FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression
CVPR 2025
4
citations
Few-Shot Image Quality Assessment via Adaptation of Vision-Language Models
ICCV 2025 (arXiv)
2
citations
A General and Efficient Training for Transformer via Token Expansion
CVPR 2024
0
citations
Aligning and Prompting Everything All at Once for Universal Visual Perception
CVPR 2024
0
citations
Integrating Global Context Contrast and Local Sensitivity for Blind Image Quality Assessment
ICML 2024
0
citations
MoFlow: One-Step Flow Matching for Human Trajectory Forecasting via Implicit Maximum Likelihood Estimation based Distillation
CVPR 2025
0
citations
Adaptive Feature Selection for No-Reference Image Quality Assessment by Mitigating Semantic Noise Sensitivity
ICML 2024
0
citations
Distilling Spatially-Heterogeneous Distortion Perception for Blind Image Quality Assessment
CVPR 2025
0
citations
Global Motion Corresponder for 3D Point-Based Scene Interpolation under Large Motion
ICCV 2025
0
citations
Radiance Fields in XR: A Survey on How Radiance Fields are Envisioned and Addressed for XR Research
ISMAR 2025
0
citations
VersaGen: Unleashing Versatile Visual Control for Text-to-Image Synthesis
AAAI 2025
0
citations
Know Where You Are From: Event-Based Segmentation via Spatio-Temporal Propagation
AAAI 2025
0
citations
ESEG: Event-Based Segmentation Boosted by Explicit Edge-Semantic Guidance
AAAI 2025
0
citations
Probability-Density-aware Semi-supervised Learning
AAAI 2025
0
citations
Improving Factuality in Large Language Models via Decoding-Time Hallucinatory and Truthful Comparators
AAAI 2025
0
citations
Bridging Sequence-Structure Alignment in RNA Foundation Models
AAAI 2025
0
citations
Semi-supervised Blind Image Quality Assessment through Knowledge Distillation and Incremental Learning
AAAI 2024
0
citations
Unleashing Channel Potential: Space-Frequency Selection Convolution for SAR Object Detection
CVPR 2024
0
citations
Solving the Catastrophic Forgetting Problem in Generalized Category Discovery
CVPR 2024
0
citations
PAPR in Motion: Seamless Point-level 3D Scene Interpolation
CVPR 2024
0
citations