Wei Liu

Papers: 148
Total Citations: 698
Affiliations: 1

Affiliations

The Hong Kong University of Science and Technology

Papers (148)

LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment

ICLR 2024
343
citations

Discrete Hyper-Graph Matching

CVPR 2015
77
citations

BadCLIP: Trigger-Aware Prompt Learning for Backdoor Attacks on CLIP

CVPR 2024
68
citations

Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving

NeurIPS 2025
46
citations

MathAttack: Attacking Large Language Models towards Math Solving Ability

AAAI 2024 (arXiv)
37
citations

IDOL: Instant Photorealistic 3D Human Creation from a Single Image

CVPR 2025 (arXiv)
36
citations

STIV: Scalable Text and Image Conditioned Video Generation

ICCV 2025
20
citations

MotionCraft: Crafting Whole-Body Motion with Plug-and-Play Multimodal Controls

AAAI 2025
19
citations

Local Conditional Controlling for Text-to-Image Diffusion Models

AAAI 2025
13
citations

GIM: A Million-scale Benchmark for Generative Image Manipulation Detection and Localization

AAAI 2025
9
citations

Auto-Regressive Diffusion for Generating 3D Human-Object Interactions

AAAI 2025
6
citations

Fix-CLIP: Dual-Branch Hierarchical Contrastive Learning via Synthetic Captions for Better Understanding of Long Text

ICCV 2025
5
citations

Adversarial Cooperative Rationalization: The Risk of Spurious Correlations in Even Clean Datasets

ICML 2025
5
citations

EBMDock: Neural Probabilistic Protein-Protein Docking via a Differentiable Energy Model

ICLR 2024
5
citations

Enhancing Human Experience in Human-Agent Collaboration: A Human-Centered Modeling Approach Based on Positive Human Gain

ICLR 2024
4
citations

ETVA: Evaluation of Text-to-Video Alignment via Fine-grained Question Generation and Answering

ICCV 2025
3
citations

Stability and Generalization of Zeroth-Order Decentralized Stochastic Gradient Descent with Changing Topology

AAAI 2025
1
citations

Quantifying Distributional Invariance in Causal Subgraph for IRM-Free Graph Generalization

NeurIPS 2025
1
citations

Supervised Discrete Hashing

CVPR 2015
0
citations

Saliency Propagation From Simple to Difficult

CVPR 2015
0
citations

Towards 3D Object Detection With Bimodal Deep Boltzmann Machines Over RGBD Imagery

CVPR 2015
0
citations

Understanding Image Structure via Hierarchical Shape Parsing

CVPR 2015
0
citations

Tensor Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Tensors via Convex Optimization

CVPR 2016
0
citations

Real-Time Neural Style Transfer for Videos

CVPR 2017
0
citations

Deep Self-Taught Learning for Weakly Supervised Object Localization

CVPR 2017 (arXiv)
0
citations

Diverse Image Annotation

CVPR 2017
0
citations

SCA-CNN: Spatial and Channel-Wise Attention in Convolutional Networks for Image Captioning

CVPR 2017
0
citations

Frustum PointNets for 3D Object Detection From RGB-D Data

CVPR 2018 (arXiv)
0
citations

Zero-Shot Visual Recognition Using Semantics-Preserving Adversarial Embedding Networks

CVPR 2018 (arXiv)
0
citations

Learning to Generate Time-Lapse Videos Using Multi-Stage Dynamic Generative Adversarial Networks

CVPR 2018 (arXiv)
0
citations

Gated Fusion Network for Single Image Dehazing

CVPR 2018 (arXiv)
0
citations

Left-Right Comparative Recurrent Model for Stereo Matching

CVPR 2018 (arXiv)
0
citations

Dual Skipping Networks

CVPR 2018 (arXiv)
0
citations

Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval

CVPR 2018 (arXiv)
0
citations

CosFace: Large Margin Cosine Loss for Deep Face Recognition

CVPR 2018 (arXiv)
0
citations

CNN in MRF: Video Object Segmentation via Inference in a CNN-Based Higher-Order Spatio-Temporal MRF

CVPR 2018 (arXiv)
0
citations

Bidirectional Attentive Fusion With Context Gating for Dense Video Captioning

CVPR 2018 (arXiv)
0
citations

Reconstruction Network for Video Captioning

CVPR 2018 (arXiv)
0
citations

Tagging Like Humans: Diverse and Distinct Image Annotation

CVPR 2018 (arXiv)
0
citations

Regularizing RNNs for Caption Generation by Reconstructing the Past With the Present

CVPR 2018 (arXiv)
0
citations

MHP-VOS: Multiple Hypotheses Propagation for Video Object Segmentation

CVPR 2019
0
citations

MVF-Net: Multi-View 3D Face Morphable Model Regression

CVPR 2019
0
citations

Spatio-Temporal Video Re-Localization by Warp LSTM

CVPR 2019
0
citations

Unsupervised Deep Tracking

CVPR 2019
0
citations

DistillHash: Unsupervised Deep Hashing by Distilling Data Pairs

CVPR 2019
0
citations

NDDR-CNN: Layerwise Feature Fusing in Multi-Task CNNs by Neural Discriminative Dimensionality Reduction

CVPR 2019
0
citations

Joint Representation and Estimator Learning for Facial Action Unit Intensity Estimation

CVPR 2019
0
citations

Face Anti-Spoofing: Model Matters, so Does Data

CVPR 2019
0
citations

Decorrelated Adversarial Learning for Age-Invariant Face Recognition

CVPR 2019
0
citations

Multi-Granularity Generator for Temporal Action Proposal

CVPR 2019
0
citations

Compressing Convolutional Neural Networks via Factorized Convolutional Filters

CVPR 2019
0
citations

Self-Supervised Spatio-Temporal Representation Learning for Videos by Predicting Motion and Appearance Statistics

CVPR 2019
0
citations

Residual Regression With Semantic Prior for Crowd Counting

CVPR 2019
0
citations

Deep Spectral Clustering Using Dual Autoencoder Network

CVPR 2019
0
citations

Unsupervised Image Captioning

CVPR 2019
0
citations

Exact Adversarial Attack to Image Captioning via Structured Output Learning With Latent Variables

CVPR 2019
0
citations

Learning Joint Gait Representation via Quintuplet Loss Minimization

CVPR 2019
0
citations

High-Level Semantic Feature Detection: A New Perspective for Pedestrian Detection

CVPR 2019
0
citations

Learning to Compose Dynamic Tree Structures for Visual Contexts

CVPR 2019
0
citations

Efficient Decision-Based Black-Box Adversarial Attacks on Face Recognition

CVPR 2019
0
citations

Image Deformation Meta-Networks for One-Shot Learning

CVPR 2019
0
citations

A Sufficient Condition for Convergences of Adam and RMSProp

CVPR 2019
0
citations

Spatio-Temporal Dynamics and Semantic Attribute Enriched Visual Encoding for Video Captioning

CVPR 2019
0
citations

Central Similarity Quantization for Efficient Image and Video Retrieval

CVPR 2020 (arXiv)
0
citations

Deblurring by Realistic Blurring

CVPR 2020 (arXiv)
0
citations

Towards Photo-Realistic Virtual Try-On by Adaptively Generating-Preserving Image Content

CVPR 2020
0
citations

MTL-NAS: Task-Agnostic Neural Architecture Search Towards General-Purpose Multi-Task Learning

CVPR 2020
0
citations

Human-Like Controllable Image Captioning With Verb-Specific Semantic Roles

CVPR 2021 (arXiv)
0
citations

VideoMoCo: Contrastive Video Representation Learning With Temporally Adversarial Examples

CVPR 2021 (arXiv)
0
citations

Disentangled Cycle Consistency for Highly-Realistic Virtual Try-On

CVPR 2021 (arXiv)
0
citations

ArtFlow: Unbiased Image Style Transfer via Reversible Neural Flows

CVPR 2021 (arXiv)
0
citations

Parser-Free Virtual Try-On via Distilling Appearance Flows

CVPR 2021 (arXiv)
0
citations

DeFLOCNet: Deep Image Editing via Flexible Low-Level Controls

CVPR 2021 (arXiv)
0
citations

Generalizing Face Forgery Detection With High-Frequency Features

CVPR 2021 (arXiv)
0
citations

Coherent Point Drift Revisited for Non-Rigid Shape Matching and Registration

CVPR 2022
0
citations

SWEM: Towards Real-Time Video Object Segmentation With Sequential Weighted Expectation-Maximization

CVPR 2022
0
citations

Improving Visual Grounding With Visual-Linguistic Verification and Iterative Reasoning

CVPR 2022 (arXiv)
0
citations

XMP-Font: Self-Supervised Cross-Modality Pre-Training for Few-Shot Font Generation

CVPR 2022
0
citations

Seeing What You Miss: Vision-Language Pre-Training With Semantic Completion Learning

CVPR 2023 (arXiv)
0
citations

Top Rank Supervised Binary Coding for Visual Search

ICCV 2015
0
citations

Learning Binary Codes for Maximum Inner Product Search

ICCV 2015
0
citations

Detecting Faces Using Inside Cascaded Contextual CNN

ICCV 2017
0
citations

Semi-Global Weighted Least Squares in Image Filtering

ICCV 2017 (arXiv)
0
citations

Occlusion Robust Face Recognition Based on Mask Learning With Pairwise Differential Siamese Network

ICCV 2019
0
citations

Controllable Video Captioning With POS Sequence Guidance Based on Gated Fusion Network

ICCV 2019
0
citations

Cascaded Context Pyramid for Full-Resolution 3D Semantic Scene Completion

ICCV 2019
0
citations

Learning a Mixture of Granularity-Specific Experts for Fine-Grained Categorization

ICCV 2019
0
citations

Chinese Street View Text: Large-Scale Chinese Text Reading With Partially Supervised Learning

ICCV 2019
0
citations

Leveraging Long-Range Temporal Relationships Between Proposals for Video Object Detection

ICCV 2019
0
citations

Benchmarking Ultra-High-Definition Image Super-Resolution

ICCV 2021
0
citations

SynFace: Face Recognition With Synthetic Data

ICCV 2021 (arXiv)
0
citations

Pyramid Architecture Search for Real-Time Image Deblurring

ICCV 2021
0
citations

Adversarial Attack on Deep Cross-Modal Hamming Retrieval

ICCV 2021
0
citations

Heterogeneous Diversity Driven Active Learning for Multi-Object Tracking

ICCV 2023
0
citations

Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos

ECCV 2020
0
citations

Face Super-Resolution Guided by 3D Facial Priors

ECCV 2020
0
citations

PointPWC-Net: Cost Volume on Point Clouds for (Self-)Supervised Scene Flow Estimation

ECCV 2020
0
citations

Context-Gated Convolution

ECCV 2020
0
citations

Masked Autoencoders for Point Cloud Self-Supervised Learning

ECCV 2022
0
citations

Hardly Perceptible Trojan Attack against Neural Networks with Bit Flips

ECCV 2022
0
citations

Triangle Attack: A Query-Efficient Decision-Based Adversarial Attack

ECCV 2022
0
citations

Towards Efficient Adversarial Training on Vision Transformers

ECCV 2022
0
citations

Improving Vision Transformers by Revisiting High-Frequency Components

ECCV 2022
0
citations

Mixture-Rank Matrix Approximation for Collaborative Filtering

NeurIPS 2017
0
citations

Geometric Descent Method for Convex Composite Minimization

NeurIPS 2017 (arXiv)
0
citations

RFNet: Unsupervised Network for Mutually Reinforcing Multi-Modal Image Registration and Fusion

CVPR 2022
0
citations

Enhancing Testing-Time Robustness for Trusted Multi-View Classification in the Wild

CVPR 2025
0
citations

WonderPlay: Dynamic 3D Scene Generation from a Single Image and Actions

ICCV 2025
0
citations

GraspCoT: Integrating Physical Property Reasoning for 6-DoF Grasping under Flexible Language Instructions

ICCV 2025
0
citations

HarmonySeg: Tubular Structure Segmentation with Deep-Shallow Feature Fusion and Growth-Suppression Balanced Loss

ICCV 2025
0
citations

ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area

AAAI 2025
0
citations

Towards More Discriminative Feature Learning in SNNs with Temporal-Self-Erasing Supervision

AAAI 2025
0
citations

Infinite-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation

AAAI 2025
0
citations

Follow-Your-Click: Open-domain Regional Image Animation via Motion Prompts

AAAI 2025
0
citations

Just a Few Glances: Open-Set Visual Perception with Image Prompt Paradigm

AAAI 2025
0
citations

Modeling All Response Surfaces in One for Conditional Search Spaces

AAAI 2025
0
citations

Enhancing Multi-View Classification Reliability with Adaptive Rejection

AAAI 2025
0
citations

Decoupling Representation and Knowledge for Few-Shot Intent Classification and Slot Filling

AAAI 2024
0
citations

DreamIdentity: Enhanced Editability for Efficient Face-Identity Preserved Image Generation

AAAI 2024
0
citations

SeqGPT: An Out-of-the-Box Large Language Model for Open Domain Sequence Understanding

AAAI 2024 (arXiv)
0
citations

Modality-Agnostic Structural Image Representation Learning for Deformable Multi-Modality Medical Image Registration

CVPR 2024
0
citations

UniMC: Taming Diffusion Transformer for Unified Keypoint-Guided Multi-Class Image Generation

ICML 2025
0
citations

Going Deeper With Convolutions

CVPR 2015
0
citations

Deep Non-Blind Deconvolution via Generalized Low-Rank Approximation

NeurIPS 2018
0
citations

Distilled Wasserstein Learning for Word Embedding and Topic Modeling

NeurIPS 2018
0
citations

Generalizing Graph Matching beyond Quadratic Assignment Model

NeurIPS 2018
0
citations

Parsimonious Quantile Regression of Financial Asset Tail Dynamics via Sequential Learning

NeurIPS 2018
0
citations

Nonlocal Neural Networks, Nonlocal Diffusion and Nonlocal Modeling

NeurIPS 2018
0
citations

Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos

NeurIPS 2019
0
citations

Category Anchor-Guided Unsupervised Domain Adaptation for Semantic Segmentation

NeurIPS 2019
0
citations

Cross-Modal Learning with Adversarial Samples

NeurIPS 2019
0
citations

Towards Playing Full MOBA Games with Deep Reinforcement Learning

NeurIPS 2020
0
citations

Optimal Epoch Stochastic Gradient Descent Ascent Methods for Min-Max Optimization

NeurIPS 2020
0
citations

Adversarial Learning for Robust Deep Clustering

NeurIPS 2020
0
citations

Fewer is More: A Deep Graph Metric Learning Perspective Using Fewer Proxies

NeurIPS 2020
0
citations

Generalized and Discriminative Few-Shot Object Detection via SVD-Dictionary Enhancement

NeurIPS 2021
0
citations

Neural Routing by Memory

NeurIPS 2021
0
citations

FR: Folded Rationalization with a Unified Encoder

NeurIPS 2022
0
citations

Egocentric Video-Language Pretraining

NeurIPS 2022
0
citations

D-Separation for Causal Self-Explanation

NeurIPS 2023
0
citations

Punctuation-level Attack: Single-shot and Single Punctuation Can Fool Text Models

NeurIPS 2023
0
citations

Exploiting Contextual Objects and Relations for 3D Visual Grounding

NeurIPS 2023
0
citations

Evaluating Post-hoc Explanations for Graph Neural Networks via Robustness Analysis

NeurIPS 2023
0
citations

GSOS: Gauss-Seidel Operator Splitting Algorithm for Multi-Term Nonsmooth Convex Composite Optimization

ICML 2017
0
citations

Scaling Up Sparse Support Vector Machines by Simultaneous Feature and Sample Reduction

ICML 2017
0
citations

End-to-end Active Object Tracking via Reinforcement Learning

ICML 2018
0
citations

An Algorithmic Framework of Variable Metric Over-Relaxed Hybrid Proximal Extra-Gradient Method

ICML 2018
0
citations

Safe Element Screening for Submodular Function Minimization

ICML 2018
0
citations