32
Papers
2,298
Total Citations
2
h-index
1
Affiliations

Affiliations

Xidian University

Papers (32)

MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models

NeurIPS 2025
1,227
citations

Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

CVPR 2025
858
citations

Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

ICML 2025
103
citations

Constrained Bayesian Optimization under Partial Observations: Balanced Improvements and Provable Convergence

AAAI 2024 (arXiv)
19
citations

Weakly Supervised Open-Vocabulary Object Detection

AAAI 2024 (arXiv)
16
citations

FD2-Net: Frequency-Driven Feature Decomposition Network for Infrared-Visible Object Detection

AAAI 2025
15
citations

Reinforcement Learning Friendly Vision-Language Model for Minecraft

ECCV 2024
14
citations

SPD-DDPM: Denoising Diffusion Probabilistic Models in the Symmetric Positive Definite Space

AAAI 2024 (arXiv)
13
citations

Feature Denoising Diffusion Model for Blind Image Quality Assessment

AAAI 2025
8
citations

Task-Adaptive Saliency Guidance for Exemplar-free Class Incremental Learning

CVPR 2024
8
citations

Destroy and Repair Using Hyper-Graphs for Routing

AAAI 2025
7
citations

VA-AR: Learning Velocity-Aware Action Representations with Mixture of Window Attention

AAAI 2025
4
citations

FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression

CVPR 2025
4
citations

Few-Shot Image Quality Assessment via Adaptation of Vision-Language Models

ICCV 2025 (arXiv)
2
citations

A General and Efficient Training for Transformer via Token Expansion

CVPR 2024
0
citations

Aligning and Prompting Everything All at Once for Universal Visual Perception

CVPR 2024
0
citations

Integrating Global Context Contrast and Local Sensitivity for Blind Image Quality Assessment

ICML 2024
0
citations

MoFlow: One-Step Flow Matching for Human Trajectory Forecasting via Implicit Maximum Likelihood Estimation based Distillation

CVPR 2025
0
citations

Adaptive Feature Selection for No-Reference Image Quality Assessment by Mitigating Semantic Noise Sensitivity

ICML 2024
0
citations

Distilling Spatially-Heterogeneous Distortion Perception for Blind Image Quality Assessment

CVPR 2025
0
citations

Global Motion Corresponder for 3D Point-Based Scene Interpolation under Large Motion

ICCV 2025
0
citations

Radiance Fields in XR: A Survey on How Radiance Fields are Envisioned and Addressed for XR Research

ISMAR 2025
0
citations

VersaGen: Unleashing Versatile Visual Control for Text-to-Image Synthesis

AAAI 2025
0
citations

Know Where You Are From: Event-Based Segmentation via Spatio-Temporal Propagation

AAAI 2025
0
citations

ESEG: Event-Based Segmentation Boosted by Explicit Edge-Semantic Guidance

AAAI 2025
0
citations

Probability-Density-aware Semi-supervised Learning

AAAI 2025
0
citations

Improving Factuality in Large Language Models via Decoding-Time Hallucinatory and Truthful Comparators

AAAI 2025
0
citations

Bridging Sequence-Structure Alignment in RNA Foundation Models

AAAI 2025
0
citations

Semi-supervised Blind Image Quality Assessment through Knowledge Distillation and Incremental Learning

AAAI 2024
0
citations

Unleashing Channel Potential: Space-Frequency Selection Convolution for SAR Object Detection

CVPR 2024
0
citations

Solving the Catastrophic Forgetting Problem in Generalized Category Discovery

CVPR 2024
0
citations

PAPR in Motion: Seamless Point-level 3D Scene Interpolation

CVPR 2024
0
citations