75
Papers
2,298
Total Citations
2
h-index
1
Affiliations
Affiliations
Xidian University
Papers (75)
MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models
NeurIPS 2025
1,227
citations
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
CVPR 2025
858
citations
Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM
ICML 2025
103
citations
Constrained Bayesian Optimization under Partial Observations: Balanced Improvements and Provable Convergence
AAAI 2024 (arXiv)
19
citations
Weakly Supervised Open-Vocabulary Object Detection
AAAI 2024 (arXiv)
16
citations
FD2-Net: Frequency-Driven Feature Decomposition Network for Infrared-Visible Object Detection
AAAI 2025
15
citations
Reinforcement Learning Friendly Vision-Language Model for Minecraft
ECCV 2024
14
citations
SPD-DDPM: Denoising Diffusion Probabilistic Models in the Symmetric Positive Definite Space
AAAI 2024 (arXiv)
13
citations
Task-Adaptive Saliency Guidance for Exemplar-free Class Incremental Learning
CVPR 2024
8
citations
Feature Denoising Diffusion Model for Blind Image Quality Assessment
AAAI 2025
8
citations
Destroy and Repair Using Hyper-Graphs for Routing
AAAI 2025
7
citations
FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression
CVPR 2025
4
citations
VA-AR: Learning Velocity-Aware Action Representations with Mixture of Window Attention
AAAI 2025
4
citations
Few-Shot Image Quality Assessment via Adaptation of Vision-Language Models
ICCV 2025 (arXiv)
2
citations
A General and Efficient Training for Transformer via Token Expansion
CVPR 2024
0
citations
Aligning and Prompting Everything All at Once for Universal Visual Perception
CVPR 2024
0
citations
Integrating Global Context Contrast and Local Sensitivity for Blind Image Quality Assessment
ICML 2024
0
citations
Adaptive Feature Selection for No-Reference Image Quality Assessment by Mitigating Semantic Noise Sensitivity
ICML 2024
0
citations
Iterative Instance Segmentation
CVPR 2016
0
citations
Generalising Fine-Grained Sketch-Based Image Retrieval
CVPR 2019
0
citations
Non-Adversarial Image Synthesis With Generative Latent Nearest Neighbors
CVPR 2019
0
citations
Filter Grafting for Deep Neural Networks
CVPR 2020 (arXiv)
0
citations
Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text
CVPR 2020 (arXiv)
0
citations
Pose Recognition With Cascade Transformers
CVPR 2021 (arXiv)
0
citations
DeRF: Decomposed Radiance Fields
CVPR 2021 (arXiv)
0
citations
Removing the Background by Adding the Background: Towards Background Robust Self-Supervised Video Representation Learning
CVPR 2021 (arXiv)
0
citations
Training-Free Transformer Architecture Search
CVPR 2022 (arXiv)
0
citations
SCADE: NeRFs from Space Carving With Ambiguity-Aware Depth Estimates
CVPR 2023 (arXiv)
0
citations
Photo Pre-Training, but for Sketch
CVPR 2023
0
citations
Black-Box Sparse Adversarial Attack via Multi-Objective Optimisation
CVPR 2023
0
citations
CLIP Is Also an Efficient Segmenter: A Text-Driven Approach for Weakly Supervised Semantic Segmentation
CVPR 2023 (arXiv)
0
citations
SketchXAI: A First Look at Explainability for Human Sketches
CVPR 2023 (arXiv)
0
citations
Diverse Image Synthesis From Semantic Layouts via Conditional IMLE
ICCV 2019
0
citations
Architecture Disentanglement for Deep Neural Networks
ICCV 2021 (arXiv)
0
citations
Variational Attention: Propagating Domain-Specific Knowledge for Multi-Domain Learning in Crowd Counting
ICCV 2021 (arXiv)
0
citations
DiffFacto: Controllable Part-Based 3D Point Cloud Generation with Cross Diffusion
ICCV 2023 (arXiv)
0
citations
Masked Autoencoders are Efficient Class Incremental Learners
ICCV 2023 (arXiv)
0
citations
MonoNeRD: NeRF-like Representations for Monocular 3D Object Detection
ICCV 2023 (arXiv)
0
citations
Inclusive GAN: Improving Data and Minority Coverage in Generative Models
ECCV 2020
0
citations
Efficient Decoder-Free Object Detection with Transformers
ECCV 2022
0
citations
Fine-Grained Data Distribution Alignment for Post-Training Quantization
ECCV 2022
0
citations
Dynamic Dual Trainable Bounds for Ultra-Low Precision Super-Resolution Networks
ECCV 2022
0
citations
ARM: Any-Time Super-Resolution Method
ECCV 2022
0
citations
DisCo: Remedying Self-Supervised Learning on Lightweight Models with Distilled Contrastive Learning
ECCV 2022
0
citations
Long-Tailed Class Incremental Learning
ECCV 2022
0
citations
Bridging Sequence-Structure Alignment in RNA Foundation Models
AAAI 2025
0
citations
Distilling Spatially-Heterogeneous Distortion Perception for Blind Image Quality Assessment
CVPR 2025
0
citations
Global Motion Corresponder for 3D Point-Based Scene Interpolation under Large Motion
ICCV 2025
0
citations
Radiance Fields in XR: A Survey on How Radiance Fields are Envisioned and Addressed for XR Research
ISMAR 2025
0
citations
VersaGen: Unleashing Versatile Visual Control for Text-to-Image Synthesis
AAAI 2025
0
citations
Know Where You Are From: Event-Based Segmentation via Spatio-Temporal Propagation
AAAI 2025
0
citations
ESEG: Event-Based Segmentation Boosted by Explicit Edge-Semantic Guidance
AAAI 2025
0
citations
Probability-Density-aware Semi-supervised Learning
AAAI 2025
0
citations
Improving Factuality in Large Language Models via Decoding-Time Hallucinatory and Truthful Comparators
AAAI 2025
0
citations
MoFlow: One-Step Flow Matching for Human Trajectory Forecasting via Implicit Maximum Likelihood Estimation based Distillation
CVPR 2025
0
citations
Semi-supervised Blind Image Quality Assessment through Knowledge Distillation and Incremental Learning
AAAI 2024
0
citations
Unleashing Channel Potential: Space-Frequency Selection Convolution for SAR Object Detection
CVPR 2024
0
citations
Solving the Catastrophic Forgetting Problem in Generalized Category Discovery
CVPR 2024
0
citations
PAPR in Motion: Seamless Point-level 3D Scene Interpolation
CVPR 2024
0
citations
Approximate Feature Collisions in Neural Nets
NeurIPS 2019
0
citations
Pruning Filter in Filter
NeurIPS 2020
0
citations
Variational Model Inversion Attacks
NeurIPS 2021
0
citations
CHIMLE: Conditional Hierarchical IMLE for Multimodal Conditional Image Synthesis
NeurIPS 2022
0
citations
Learning Best Combination for Efficient N:M Sparsity
NeurIPS 2022
0
citations
Micro and Macro Level Graph Modeling for Graph Variational Auto-Encoders
NeurIPS 2022
0
citations
PyramidCLIP: Hierarchical Feature Alignment for Vision-language Model Pretraining
NeurIPS 2022
0
citations
Multi-modal Queried Object Detection in the Wild
NeurIPS 2023
0
citations
NeRF Revisited: Fixing Quadrature Instability in Volume Rendering
NeurIPS 2023
0
citations
“Why Not Looking backward?” A Robust Two-Step Method to Automatically Terminate Bayesian Optimization
NeurIPS 2023
0
citations
CAPro: Webly Supervised Learning with Cross-modality Aligned Prototypes
NeurIPS 2023
0
citations
Learning from Visual Observation via Offline Pretrained State-to-Go Transformer
NeurIPS 2023
0
citations
PAPR: Proximity Attention Point Rendering
NeurIPS 2023
0
citations
CamoPatch: An Evolutionary Strategy for Generating Camoflauged Adversarial Patches
NeurIPS 2023
0
citations
Fast k-Nearest Neighbour Search via Dynamic Continuous Indexing
ICML 2016
0
citations
Fast k-Nearest Neighbour Search via Prioritized DCI
ICML 2017
0
citations