Wei Li

88
Papers
1,033
Total Citations

Papers (88)

SALMONN: Towards Generic Hearing Abilities for Large Language Models

ICLR 2024
447
citations

AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents

ICLR 2025
180
citations

OMG-Seg: Is One Model Good Enough For All Segmentation?

CVPR 2024
106
citations

IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection

CVPR 2024
81
citations

Distilling Semantic Priors from SAM to Efficient Image Restoration Models

CVPR 2024
36
citations

LION-FS: Fast & Slow Video-Language Thinker as Online Video Assistant

CVPR 2025
33
citations

Harmonizing Visual Representations for Unified Multimodal Understanding and Generation

ICCV 2025
33
citations

OphCLIP: Hierarchical Retrieval-Augmented Learning for Ophthalmic Surgical Video-Language Pretraining

ICCV 2025
23
citations

F-LMM: Grounding Frozen Large Multimodal Models

CVPR 2025
21
citations

Delta Decompression for MoE-based LLMs Compression

ICML 2025
18
citations

Subspace Prototype Guidance for Mitigating Class Imbalance in Point Cloud Semantic Segmentation

ECCV 2024
9
citations

GIM: A Million-scale Benchmark for Generative Image Manipulation Detection and Localization

AAAI 2025
9
citations

Dynamic Contrastive Knowledge Distillation for Efficient Image Restoration

AAAI 2025
4
citations

CoPEFT: Fast Adaptation Framework for Multi-Agent Collaborative Perception with Parameter-Efficient Fine-Tuning

AAAI 2025
4
citations

PSBD: Prediction Shift Uncertainty Unlocks Backdoor Detection

CVPR 2025
4
citations

LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale

CVPR 2025
4
citations

Leveraging SD Map to Augment HD Map-based Trajectory Prediction

CVPR 2025
3
citations

DoF-Gaussian: Controllable Depth-of-Field for 3D Gaussian Splatting

CVPR 2025
3
citations

MOOSE-Chem2: Exploring LLM Limits in Fine-Grained Scientific Hypothesis Discovery via Hierarchical Search

NeurIPS 2025
3
citations

AugKD: Ingenious Augmentations Empower Knowledge Distillation for Image Super-Resolution

ICLR 2025
3
citations

Can a Large Language Model be a Gaslighter?

ICLR 2025
2
citations

Uni-LoRA: One Vector is All You Need

NeurIPS 2025
2
citations

Describe, Don’t Dictate: Semantic Image Editing with Natural Language Intent

ICCV 2025
2
citations

ISPDiffuser: Learning RAW-to-sRGB Mappings with Texture-Aware Diffusion Models and Histogram-Guided Color Consistency

AAAI 2025
1
citation

Efficient Spiking Point Mamba for Point Cloud Analysis

ICCV 2025
1
citation

CGS-Mask: Making Time Series Predictions Intuitive for All

AAAI 2024
1
citation

Transferable Semantic Augmentation for Domain Adaptation

CVPR 2021
0
citations

Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks

CVPR 2021
0
citations

MeMOT: Multi-Object Tracking With Memory

CVPR 2022
0
citations

PPDL: Predicate Probability Distribution Based Loss for Unbiased Scene Graph Generation

CVPR 2022
0
citations

Large-Scale Video Panoptic Segmentation in the Wild: A Benchmark

CVPR 2022
0
citations

UniVIP: A Unified Framework for Self-Supervised Visual Pre-Training

CVPR 2022
0
citations

Stochastic Backpropagation: A Memory Efficient Strategy for Training Video Models

CVPR 2022
0
citations

Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels

CVPR 2022
0
citations

RefSR-NeRF: Towards High Fidelity and Super Resolution View Synthesis

CVPR 2023
0
citations

Balancing Logit Variation for Long-Tailed Semantic Segmentation

CVPR 2023
0
citations

Siamese DETR

CVPR 2023
0
citations

Correlational Image Modeling for Self-Supervised Visual Pre-Training

CVPR 2023
0
citations

Zero-Shot Text-to-Parameter Translation for Game Character Auto-Creation

CVPR 2023
0
citations

Attribute Recognition by Joint Recurrent Learning of Context and Correlation

ICCV 2017
0
citations

Semantic Concentration for Domain Adaptation

ICCV 2021
0
citations

Adaptive Surface Normal Constraint for Depth Estimation

ICCV 2021
0
citations

A Simple Feature Augmentation for Domain Generalization

ICCV 2021
0
citations

Improving Generalization in Visual Reinforcement Learning via Conflict-aware Gradient Agreement Augmentation

ICCV 2023
0
citations

DVI: Depth Guided Video Inpainting for Autonomous Driving

ECCV 2020
0
citations

Improving the Transferability of Adversarial Examples with Resized-Diverse-Inputs, Diversity-Ensemble and Region Fitting

ECCV 2020
0
citations

Towards Efficient Coarse-to-Fine Networks for Action and Gesture Recognition

ECCV 2020
0
citations

Weight Excitation: Built-in Attention Mechanisms in Convolutional Neural Networks

ECCV 2020
0
citations

Open-Vocabulary DETR with Conditional Matching

ECCV 2022
0
citations

Towards Accurate Binary Neural Networks via Modeling Contextual Dependencies

ECCV 2022
0
citations

FindIt: Generalized Localization with Natural Language Queries

ECCV 2022
0
citations

Generalizing GANs: A Turing Perspective

NeurIPS 2017
0
citations

Cap2Det: Learning to Amplify Weak Caption Supervision for Object Detection

ICCV 2019
0
citations

LMO: Linear Mamba Operator for MRI Reconstruction

CVPR 2025
0
citations

Weakly Supervised Semantic Segmentation via Progressive Confidence Region Expansion

CVPR 2025
0
citations

WildAvatar: Learning In-the-wild 3D Avatars from the Web

CVPR 2025
0
citations

Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency

ICCV 2025
0
citations

Efficient Fine-Tuning of Large Models via Nested Low-Rank Adaptation

ICCV 2025
0
citations

AIRA: Activation-Informed Low-Rank Adaptation for Large Models

ICCV 2025
0
citations

HOMO-Feature: Cross-Arbitrary-Modal Image Matching with Homomorphism of Organized Major Orientation

ICCV 2025
0
citations

SRefiner: Soft-Braid Attention for Multi-Agent Trajectory Refinement

ICCV 2025
0
citations

ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area

AAAI 2025
0
citations

Breaking Information Isolation: Accelerating MRI via Inter-sequence Mapping and Progressive Masking

AAAI 2025
0
citations

GoHD: Gaze-oriented and Highly Disentangled Portrait Animation with Rhythmic Poses and Realistic Expressions

AAAI 2025
0
citations

AdaCo: Overcoming Visual Foundation Model Noise in 3D Semantic Segmentation via Adaptive Label Correction

AAAI 2025
0
citations

DI-V2X: Learning Domain-Invariant Representation for Vehicle-Infrastructure Collaborative 3D Object Detection

AAAI 2024
0
citations

Multi-Modal Disordered Representation Learning Network for Description-Based Person Search

AAAI 2024
0
citations

video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models

ICML 2024
0
citations

Improving Context Understanding in Multimodal Large Language Models via Multimodal Composition Learning

ICML 2024
0
citations

AutoOS: Make Your OS More Powerful by Exploiting Large Language Models

ICML 2024
0
citations

Action Unit Detection With Region Adaptation, Multi-Labeling Learning and Optimal Temporal Fusing

CVPR 2017
0
citations

Appearance-and-Relation Networks for Video Classification

CVPR 2018
0
citations

Transferable Joint Attribute-Identity Deep Learning for Unsupervised Person Re-Identification

CVPR 2018
0
citations

Harmonious Attention Network for Person Re-Identification

CVPR 2018
0
citations

Channel Attention Based Iterative Residual Learning for Depth Map Super-Resolution

CVPR 2020
0
citations

WaveletStereo: Learning Wavelet Coefficients of Disparity Map in Stereo Matching

CVPR 2020
0
citations

Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation

CVPR 2021
0
citations

Dynamic Domain Adaptation for Efficient Inference

CVPR 2021
0
citations

Improved Expressivity Through Dendritic Neural Networks

NeurIPS 2018
0
citations

MST: Masked Self-Supervised Transformer for Visual Representation

NeurIPS 2021
0
citations

DeepInteraction: 3D Object Detection via Modality Interaction

NeurIPS 2022
0
citations

Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks

NeurIPS 2022
0
citations

Learning from Future: A Novel Self-Training Framework for Semantic Segmentation

NeurIPS 2022
0
citations

Delving into Out-of-Distribution Detection with Vision-Language Representations

NeurIPS 2022
0
citations

TransHP: Image Classification with Hierarchical Prompting

NeurIPS 2023
0
citations

“Why Not Looking backward?” A Robust Two-Step Method to Automatically Terminate Bayesian Optimization

NeurIPS 2023
0
citations

Cross-Domain Policy Adaptation via Value-Guided Data Filtering

NeurIPS 2023
0
citations

GenImage: A Million-Scale Benchmark for Detecting AI-Generated Image

NeurIPS 2023
0
citations