Wei Li
43
Papers
1,033
Total Citations
Papers (43)
SALMONN: Towards Generic Hearing Abilities for Large Language Models
ICLR 2024
447
citations
AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents
ICLR 2025
180
citations
OMG-Seg: Is One Model Good Enough For All Segmentation?
CVPR 2024
106
citations
IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection
CVPR 2024
81
citations
Distilling Semantic Priors from SAM to Efficient Image Restoration Models
CVPR 2024
36
citations
LION-FS: Fast & Slow Video-Language Thinker as Online Video Assistant
CVPR 2025
33
citations
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
ICCV 2025
33
citations
OphCLIP: Hierarchical Retrieval-Augmented Learning for Ophthalmic Surgical Video-Language Pretraining
ICCV 2025
23
citations
F-LMM: Grounding Frozen Large Multimodal Models
CVPR 2025
21
citations
Delta Decompression for MoE-based LLMs Compression
ICML 2025
18
citations
GIM: A Million-scale Benchmark for Generative Image Manipulation Detection and Localization
AAAI 2025
9
citations
Subspace Prototype Guidance for Mitigating Class Imbalance in Point Cloud Semantic Segmentation
ECCV 2024
9
citations
CoPEFT: Fast Adaptation Framework for Multi-Agent Collaborative Perception with Parameter-Efficient Fine-Tuning
AAAI 2025
4
citations
PSBD: Prediction Shift Uncertainty Unlocks Backdoor Detection
CVPR 2025
4
citations
LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale
CVPR 2025
4
citations
Dynamic Contrastive Knowledge Distillation for Efficient Image Restoration
AAAI 2025
4
citations
AugKD: Ingenious Augmentations Empower Knowledge Distillation for Image Super-Resolution
ICLR 2025
3
citations
Leveraging SD Map to Augment HD Map-based Trajectory Prediction
CVPR 2025
3
citations
MOOSE-Chem2: Exploring LLM Limits in Fine-Grained Scientific Hypothesis Discovery via Hierarchical Search
NeurIPS 2025
3
citations
DoF-Gaussian: Controllable Depth-of-Field for 3D Gaussian Splatting
CVPR 2025
3
citations
Can a Large Language Model be a Gaslighter?
ICLR 2025
2
citations
Describe, Don’t Dictate: Semantic Image Editing with Natural Language Intent
ICCV 2025
2
citations
Uni-LoRA: One Vector is All You Need
NeurIPS 2025
2
citations
ISPDiffuser: Learning RAW-to-sRGB Mappings with Texture-Aware Diffusion Models and Histogram-Guided Color Consistency
AAAI 2025
1
citation
Efficient Spiking Point Mamba for Point Cloud Analysis
ICCV 2025
1
citation
CGS-Mask: Making Time Series Predictions Intuitive for All
AAAI 2024
1
citation
SRefiner: Soft-Braid Attention for Multi-Agent Trajectory Refinement
ICCV 2025
0
citations
Breaking Information Isolation: Accelerating MRI via Inter-sequence Mapping and Progressive Masking
AAAI 2025
0
citations
HOMO-Feature: Cross-Arbitrary-Modal Image Matching with Homomorphism of Organized Major Orientation
ICCV 2025
0
citations
GoHD: Gaze-oriented and Highly Disentangled Portrait Animation with Rhythmic Poses and Realistic Expressions
AAAI 2025
0
citations
AdaCo: Overcoming Visual Foundation Model Noise in 3D Semantic Segmentation via Adaptive Label Correction
AAAI 2025
0
citations
AIRA: Activation-Informed Low-Rank Adaptation for Large Models
ICCV 2025
0
citations
DI-V2X: Learning Domain-Invariant Representation for Vehicle-Infrastructure Collaborative 3D Object Detection
AAAI 2024
0
citations
Multi-Modal Disordered Representation Learning Network for Description-Based Person Search
AAAI 2024
0
citations
AutoOS: Make Your OS More Powerful by Exploiting Large Language Models
ICML 2024
0
citations
Efficient Fine-Tuning of Large Models via Nested Low-Rank Adaptation
ICCV 2025
0
citations
Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency
ICCV 2025
0
citations
WildAvatar: Learning In-the-wild 3D Avatars from the Web
CVPR 2025
0
citations
Weakly Supervised Semantic Segmentation via Progressive Confidence Region Expansion
CVPR 2025
0
citations
LMO: Linear Mamba Operator for MRI Reconstruction
CVPR 2025
0
citations
video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models
ICML 2024
0
citations
Improving Context Understanding in Multimodal Large Language Models via Multimodal Composition Learning
ICML 2024
0
citations
ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area
AAAI 2025
0
citations