75
Papers
2,298
Total Citations
8
h-index
1
Affiliations

Affiliations

Xidian University

Papers (75)

MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models

NeurIPS 2025
1,227
citations

Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

CVPR 2025
858
citations

Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

ICML 2025
103
citations

Constrained Bayesian Optimization under Partial Observations: Balanced Improvements and Provable Convergence

AAAI 2024 (arXiv)
19
citations

Weakly Supervised Open-Vocabulary Object Detection

AAAI 2024 (arXiv)
16
citations

FD2-Net: Frequency-Driven Feature Decomposition Network for Infrared-Visible Object Detection

AAAI 2025
15
citations

Reinforcement Learning Friendly Vision-Language Model for Minecraft

ECCV 2024
14
citations

SPD-DDPM: Denoising Diffusion Probabilistic Models in the Symmetric Positive Definite Space

AAAI 2024 (arXiv)
13
citations

Task-Adaptive Saliency Guidance for Exemplar-free Class Incremental Learning

CVPR 2024
8
citations

Feature Denoising Diffusion Model for Blind Image Quality Assessment

AAAI 2025
8
citations

Destroy and Repair Using Hyper-Graphs for Routing

AAAI 2025
7
citations

FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression

CVPR 2025
4
citations

VA-AR: Learning Velocity-Aware Action Representations with Mixture of Window Attention

AAAI 2025
4
citations

Few-Shot Image Quality Assessment via Adaptation of Vision-Language Models

ICCV 2025 (arXiv)
2
citations

A General and Efficient Training for Transformer via Token Expansion

CVPR 2024
0
citations

Aligning and Prompting Everything All at Once for Universal Visual Perception

CVPR 2024
0
citations

Integrating Global Context Contrast and Local Sensitivity for Blind Image Quality Assessment

ICML 2024
0
citations

Adaptive Feature Selection for No-Reference Image Quality Assessment by Mitigating Semantic Noise Sensitivity

ICML 2024
0
citations

Iterative Instance Segmentation

CVPR 2016
0
citations

Generalising Fine-Grained Sketch-Based Image Retrieval

CVPR 2019
0
citations

Non-Adversarial Image Synthesis With Generative Latent Nearest Neighbors

CVPR 2019
0
citations

Filter Grafting for Deep Neural Networks

CVPR 2020 (arXiv)
0
citations

Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text

CVPR 2020 (arXiv)
0
citations

Pose Recognition With Cascade Transformers

CVPR 2021 (arXiv)
0
citations

DeRF: Decomposed Radiance Fields

CVPR 2021 (arXiv)
0
citations

Removing the Background by Adding the Background: Towards Background Robust Self-Supervised Video Representation Learning

CVPR 2021 (arXiv)
0
citations

Training-Free Transformer Architecture Search

CVPR 2022 (arXiv)
0
citations

SCADE: NeRFs from Space Carving With Ambiguity-Aware Depth Estimates

CVPR 2023 (arXiv)
0
citations

Photo Pre-Training, but for Sketch

CVPR 2023
0
citations

Black-Box Sparse Adversarial Attack via Multi-Objective Optimisation

CVPR 2023
0
citations

CLIP Is Also an Efficient Segmenter: A Text-Driven Approach for Weakly Supervised Semantic Segmentation

CVPR 2023 (arXiv)
0
citations

SketchXAI: A First Look at Explainability for Human Sketches

CVPR 2023 (arXiv)
0
citations

Diverse Image Synthesis From Semantic Layouts via Conditional IMLE

ICCV 2019
0
citations

Architecture Disentanglement for Deep Neural Networks

ICCV 2021 (arXiv)
0
citations

Variational Attention: Propagating Domain-Specific Knowledge for Multi-Domain Learning in Crowd Counting

ICCV 2021 (arXiv)
0
citations

DiffFacto: Controllable Part-Based 3D Point Cloud Generation with Cross Diffusion

ICCV 2023 (arXiv)
0
citations

Masked Autoencoders are Efficient Class Incremental Learners

ICCV 2023 (arXiv)
0
citations

MonoNeRD: NeRF-like Representations for Monocular 3D Object Detection

ICCV 2023 (arXiv)
0
citations

Inclusive GAN: Improving Data and Minority Coverage in Generative Models

ECCV 2020
0
citations

Efficient Decoder-Free Object Detection with Transformers

ECCV 2022
0
citations

Fine-Grained Data Distribution Alignment for Post-Training Quantization

ECCV 2022
0
citations

Dynamic Dual Trainable Bounds for Ultra-Low Precision Super-Resolution Networks

ECCV 2022
0
citations

ARM: Any-Time Super-Resolution Method

ECCV 2022
0
citations

DisCo: Remedying Self-Supervised Learning on Lightweight Models with Distilled Contrastive Learning

ECCV 2022
0
citations

Long-Tailed Class Incremental Learning

ECCV 2022
0
citations

Bridging Sequence-Structure Alignment in RNA Foundation Models

AAAI 2025
0
citations

Distilling Spatially-Heterogeneous Distortion Perception for Blind Image Quality Assessment

CVPR 2025
0
citations

Global Motion Corresponder for 3D Point-Based Scene Interpolation under Large Motion

ICCV 2025
0
citations

Radiance Fields in XR: A Survey on How Radiance Fields are Envisioned and Addressed for XR Research

ISMAR 2025
0
citations

VersaGen: Unleashing Versatile Visual Control for Text-to-Image Synthesis

AAAI 2025
0
citations

Know Where You Are From: Event-Based Segmentation via Spatio-Temporal Propagation

AAAI 2025
0
citations

ESEG: Event-Based Segmentation Boosted by Explicit Edge-Semantic Guidance

AAAI 2025
0
citations

Probability-Density-aware Semi-supervised Learning

AAAI 2025
0
citations

Improving Factuality in Large Language Models via Decoding-Time Hallucinatory and Truthful Comparators

AAAI 2025
0
citations

MoFlow: One-Step Flow Matching for Human Trajectory Forecasting via Implicit Maximum Likelihood Estimation based Distillation

CVPR 2025
0
citations

Semi-supervised Blind Image Quality Assessment through Knowledge Distillation and Incremental Learning

AAAI 2024
0
citations

Unleashing Channel Potential: Space-Frequency Selection Convolution for SAR Object Detection

CVPR 2024
0
citations

Solving the Catastrophic Forgetting Problem in Generalized Category Discovery

CVPR 2024
0
citations

PAPR in Motion: Seamless Point-level 3D Scene Interpolation

CVPR 2024
0
citations

Approximate Feature Collisions in Neural Nets

NeurIPS 2019
0
citations

Pruning Filter in Filter

NeurIPS 2020
0
citations

Variational Model Inversion Attacks

NeurIPS 2021
0
citations

CHIMLE: Conditional Hierarchical IMLE for Multimodal Conditional Image Synthesis

NeurIPS 2022
0
citations

Learning Best Combination for Efficient N:M Sparsity

NeurIPS 2022
0
citations

Micro and Macro Level Graph Modeling for Graph Variational Auto-Encoders

NeurIPS 2022
0
citations

PyramidCLIP: Hierarchical Feature Alignment for Vision-language Model Pretraining

NeurIPS 2022
0
citations

Multi-modal Queried Object Detection in the Wild

NeurIPS 2023
0
citations

NeRF Revisited: Fixing Quadrature Instability in Volume Rendering

NeurIPS 2023
0
citations

“Why Not Looking backward?” A Robust Two-Step Method to Automatically Terminate Bayesian Optimization

NeurIPS 2023
0
citations

CAPro: Webly Supervised Learning with Cross-modality Aligned Prototypes

NeurIPS 2023
0
citations

Learning from Visual Observation via Offline Pretrained State-to-Go Transformer

NeurIPS 2023
0
citations

PAPR: Proximity Attention Point Rendering

NeurIPS 2023
0
citations

CamoPatch: An Evolutionary Strategy for Generating Camoflauged Adversarial Patches

NeurIPS 2023
0
citations

Fast k-Nearest Neighbour Search via Dynamic Continuous Indexing

ICML 2016
0
citations

Fast k-Nearest Neighbour Search via Prioritized DCI

ICML 2017
0
citations