Wei Liu
148
Papers
698
Total Citations
1
Affiliations
The Hong Kong University of Science and Technology
Papers (148)
LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
ICLR 2024
343
citations
Discrete Hyper-Graph Matching
CVPR 2015
77
citations
BadCLIP: Trigger-Aware Prompt Learning for Backdoor Attacks on CLIP
CVPR 2024
68
citations
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving
NeurIPS 2025
46
citations
MathAttack: Attacking Large Language Models towards Math Solving Ability
AAAI 2024 (arXiv)
37
citations
IDOL: Instant Photorealistic 3D Human Creation from a Single Image
CVPR 2025 (arXiv)
36
citations
STIV: Scalable Text and Image Conditioned Video Generation
ICCV 2025
20
citations
MotionCraft: Crafting Whole-Body Motion with Plug-and-Play Multimodal Controls
AAAI 2025
19
citations
Local Conditional Controlling for Text-to-Image Diffusion Models
AAAI 2025
13
citations
GIM: A Million-scale Benchmark for Generative Image Manipulation Detection and Localization
AAAI 2025
9
citations
Auto-Regressive Diffusion for Generating 3D Human-Object Interactions
AAAI 2025
6
citations
Fix-CLIP: Dual-Branch Hierarchical Contrastive Learning via Synthetic Captions for Better Understanding of Long Text
ICCV 2025
5
citations
Adversarial Cooperative Rationalization: The Risk of Spurious Correlations in Even Clean Datasets
ICML 2025
5
citations
EBMDock: Neural Probabilistic Protein-Protein Docking via a Differentiable Energy Model
ICLR 2024
5
citations
Enhancing Human Experience in Human-Agent Collaboration: A Human-Centered Modeling Approach Based on Positive Human Gain
ICLR 2024
4
citations
ETVA: Evaluation of Text-to-Video Alignment via Fine-grained Question Generation and Answering
ICCV 2025
3
citations
Stability and Generalization of Zeroth-Order Decentralized Stochastic Gradient Descent with Changing Topology
AAAI 2025
1
citations
Quantifying Distributional Invariance in Causal Subgraph for IRM-Free Graph Generalization
NeurIPS 2025
1
citations
Supervised Discrete Hashing
CVPR 2015
0
citations
Saliency Propagation From Simple to Difficult
CVPR 2015
0
citations
Towards 3D Object Detection With Bimodal Deep Boltzmann Machines Over RGBD Imagery
CVPR 2015
0
citations
Understanding Image Structure via Hierarchical Shape Parsing
CVPR 2015
0
citations
Tensor Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Tensors via Convex Optimization
CVPR 2016
0
citations
Real-Time Neural Style Transfer for Videos
CVPR 2017
0
citations
Deep Self-Taught Learning for Weakly Supervised Object Localization
CVPR 2017 (arXiv)
0
citations
Diverse Image Annotation
CVPR 2017
0
citations
SCA-CNN: Spatial and Channel-Wise Attention in Convolutional Networks for Image Captioning
CVPR 2017
0
citations
Frustum PointNets for 3D Object Detection From RGB-D Data
CVPR 2018 (arXiv)
0
citations
Zero-Shot Visual Recognition Using Semantics-Preserving Adversarial Embedding Networks
CVPR 2018 (arXiv)
0
citations
Learning to Generate Time-Lapse Videos Using Multi-Stage Dynamic Generative Adversarial Networks
CVPR 2018 (arXiv)
0
citations
Gated Fusion Network for Single Image Dehazing
CVPR 2018 (arXiv)
0
citations
Left-Right Comparative Recurrent Model for Stereo Matching
CVPR 2018 (arXiv)
0
citations
Dual Skipping Networks
CVPR 2018 (arXiv)
0
citations
Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval
CVPR 2018 (arXiv)
0
citations
CosFace: Large Margin Cosine Loss for Deep Face Recognition
CVPR 2018 (arXiv)
0
citations
CNN in MRF: Video Object Segmentation via Inference in a CNN-Based Higher-Order Spatio-Temporal MRF
CVPR 2018 (arXiv)
0
citations
Bidirectional Attentive Fusion With Context Gating for Dense Video Captioning
CVPR 2018 (arXiv)
0
citations
Reconstruction Network for Video Captioning
CVPR 2018 (arXiv)
0
citations
Tagging Like Humans: Diverse and Distinct Image Annotation
CVPR 2018 (arXiv)
0
citations
Regularizing RNNs for Caption Generation by Reconstructing the Past With the Present
CVPR 2018 (arXiv)
0
citations
MHP-VOS: Multiple Hypotheses Propagation for Video Object Segmentation
CVPR 2019
0
citations
MVF-Net: Multi-View 3D Face Morphable Model Regression
CVPR 2019
0
citations
Spatio-Temporal Video Re-Localization by Warp LSTM
CVPR 2019
0
citations
Unsupervised Deep Tracking
CVPR 2019
0
citations
DistillHash: Unsupervised Deep Hashing by Distilling Data Pairs
CVPR 2019
0
citations
NDDR-CNN: Layerwise Feature Fusing in Multi-Task CNNs by Neural Discriminative Dimensionality Reduction
CVPR 2019
0
citations
Joint Representation and Estimator Learning for Facial Action Unit Intensity Estimation
CVPR 2019
0
citations
Face Anti-Spoofing: Model Matters, so Does Data
CVPR 2019
0
citations
Decorrelated Adversarial Learning for Age-Invariant Face Recognition
CVPR 2019
0
citations
Multi-Granularity Generator for Temporal Action Proposal
CVPR 2019
0
citations
Compressing Convolutional Neural Networks via Factorized Convolutional Filters
CVPR 2019
0
citations
Self-Supervised Spatio-Temporal Representation Learning for Videos by Predicting Motion and Appearance Statistics
CVPR 2019
0
citations
Residual Regression With Semantic Prior for Crowd Counting
CVPR 2019
0
citations
Deep Spectral Clustering Using Dual Autoencoder Network
CVPR 2019
0
citations
Unsupervised Image Captioning
CVPR 2019
0
citations
Exact Adversarial Attack to Image Captioning via Structured Output Learning With Latent Variables
CVPR 2019
0
citations
Learning Joint Gait Representation via Quintuplet Loss Minimization
CVPR 2019
0
citations
High-Level Semantic Feature Detection: A New Perspective for Pedestrian Detection
CVPR 2019
0
citations
Learning to Compose Dynamic Tree Structures for Visual Contexts
CVPR 2019
0
citations
Efficient Decision-Based Black-Box Adversarial Attacks on Face Recognition
CVPR 2019
0
citations
Image Deformation Meta-Networks for One-Shot Learning
CVPR 2019
0
citations
A Sufficient Condition for Convergences of Adam and RMSProp
CVPR 2019
0
citations
Spatio-Temporal Dynamics and Semantic Attribute Enriched Visual Encoding for Video Captioning
CVPR 2019
0
citations
Central Similarity Quantization for Efficient Image and Video Retrieval
CVPR 2020 (arXiv)
0
citations
Deblurring by Realistic Blurring
CVPR 2020 (arXiv)
0
citations
Towards Photo-Realistic Virtual Try-On by Adaptively Generating-Preserving Image Content
CVPR 2020
0
citations
MTL-NAS: Task-Agnostic Neural Architecture Search Towards General-Purpose Multi-Task Learning
CVPR 2020
0
citations
Human-Like Controllable Image Captioning With Verb-Specific Semantic Roles
CVPR 2021 (arXiv)
0
citations
VideoMoCo: Contrastive Video Representation Learning With Temporally Adversarial Examples
CVPR 2021 (arXiv)
0
citations
Disentangled Cycle Consistency for Highly-Realistic Virtual Try-On
CVPR 2021 (arXiv)
0
citations
ArtFlow: Unbiased Image Style Transfer via Reversible Neural Flows
CVPR 2021 (arXiv)
0
citations
Parser-Free Virtual Try-On via Distilling Appearance Flows
CVPR 2021 (arXiv)
0
citations
DeFLOCNet: Deep Image Editing via Flexible Low-Level Controls
CVPR 2021 (arXiv)
0
citations
Generalizing Face Forgery Detection With High-Frequency Features
CVPR 2021 (arXiv)
0
citations
Coherent Point Drift Revisited for Non-Rigid Shape Matching and Registration
CVPR 2022
0
citations
SWEM: Towards Real-Time Video Object Segmentation With Sequential Weighted Expectation-Maximization
CVPR 2022
0
citations
Improving Visual Grounding With Visual-Linguistic Verification and Iterative Reasoning
CVPR 2022 (arXiv)
0
citations
XMP-Font: Self-Supervised Cross-Modality Pre-Training for Few-Shot Font Generation
CVPR 2022
0
citations
Seeing What You Miss: Vision-Language Pre-Training With Semantic Completion Learning
CVPR 2023 (arXiv)
0
citations
Top Rank Supervised Binary Coding for Visual Search
ICCV 2015
0
citations
Learning Binary Codes for Maximum Inner Product Search
ICCV 2015
0
citations
Detecting Faces Using Inside Cascaded Contextual CNN
ICCV 2017
0
citations
Semi-Global Weighted Least Squares in Image Filtering
ICCV 2017 (arXiv)
0
citations
Occlusion Robust Face Recognition Based on Mask Learning With Pairwise Differential Siamese Network
ICCV 2019
0
citations
Controllable Video Captioning With POS Sequence Guidance Based on Gated Fusion Network
ICCV 2019
0
citations
Cascaded Context Pyramid for Full-Resolution 3D Semantic Scene Completion
ICCV 2019
0
citations
Learning a Mixture of Granularity-Specific Experts for Fine-Grained Categorization
ICCV 2019
0
citations
Chinese Street View Text: Large-Scale Chinese Text Reading With Partially Supervised Learning
ICCV 2019
0
citations
Leveraging Long-Range Temporal Relationships Between Proposals for Video Object Detection
ICCV 2019
0
citations
Benchmarking Ultra-High-Definition Image Super-Resolution
ICCV 2021
0
citations
SynFace: Face Recognition With Synthetic Data
ICCV 2021 (arXiv)
0
citations
Pyramid Architecture Search for Real-Time Image Deblurring
ICCV 2021
0
citations
Adversarial Attack on Deep Cross-Modal Hamming Retrieval
ICCV 2021
0
citations
Heterogeneous Diversity Driven Active Learning for Multi-Object Tracking
ICCV 2023
0
citations
Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos
ECCV 2020
0
citations
Face Super-Resolution Guided by 3D Facial Priors
ECCV 2020
0
citations
PointPWC-Net: Cost Volume on Point Clouds for (Self-)Supervised Scene Flow Estimation
ECCV 2020
0
citations
Context-Gated Convolution
ECCV 2020
0
citations
Masked Autoencoders for Point Cloud Self-Supervised Learning
ECCV 2022
0
citations
Hardly Perceptible Trojan Attack against Neural Networks with Bit Flips
ECCV 2022
0
citations
Triangle Attack: A Query-Efficient Decision-Based Adversarial Attack
ECCV 2022
0
citations
Towards Efficient Adversarial Training on Vision Transformers
ECCV 2022
0
citations
Improving Vision Transformers by Revisiting High-Frequency Components
ECCV 2022
0
citations
Mixture-Rank Matrix Approximation for Collaborative Filtering
NeurIPS 2017
0
citations
Geometric Descent Method for Convex Composite Minimization
NeurIPS 2017 (arXiv)
0
citations
RFNet: Unsupervised Network for Mutually Reinforcing Multi-Modal Image Registration and Fusion
CVPR 2022
0
citations
Enhancing Testing-Time Robustness for Trusted Multi-View Classification in the Wild
CVPR 2025
0
citations
WonderPlay: Dynamic 3D Scene Generation from a Single Image and Actions
ICCV 2025
0
citations
GraspCoT: Integrating Physical Property Reasoning for 6-DoF Grasping under Flexible Language Instructions
ICCV 2025
0
citations
HarmonySeg: Tubular Structure Segmentation with Deep-Shallow Feature Fusion and Growth-Suppression Balanced Loss
ICCV 2025
0
citations
ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area
AAAI 2025
0
citations
Towards More Discriminative Feature Learning in SNNs with Temporal-Self-Erasing Supervision
AAAI 2025
0
citations
Infinite-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation
AAAI 2025
0
citations
Follow-Your-Click: Open-domain Regional Image Animation via Motion Prompts
AAAI 2025
0
citations
Just a Few Glances: Open-Set Visual Perception with Image Prompt Paradigm
AAAI 2025
0
citations
Modeling All Response Surfaces in One for Conditional Search Spaces
AAAI 2025
0
citations
Enhancing Multi-View Classification Reliability with Adaptive Rejection
AAAI 2025
0
citations
Decoupling Representation and Knowledge for Few-Shot Intent Classification and Slot Filling
AAAI 2024
0
citations
DreamIdentity: Enhanced Editability for Efficient Face-Identity Preserved Image Generation
AAAI 2024
0
citations
SeqGPT: An Out-of-the-Box Large Language Model for Open Domain Sequence Understanding
AAAI 2024 (arXiv)
0
citations
Modality-Agnostic Structural Image Representation Learning for Deformable Multi-Modality Medical Image Registration
CVPR 2024
0
citations
UniMC: Taming Diffusion Transformer for Unified Keypoint-Guided Multi-Class Image Generation
ICML 2025
0
citations
Going Deeper With Convolutions
CVPR 2015
0
citations
Deep Non-Blind Deconvolution via Generalized Low-Rank Approximation
NeurIPS 2018
0
citations
Distilled Wasserstein Learning for Word Embedding and Topic Modeling
NeurIPS 2018
0
citations
Generalizing Graph Matching beyond Quadratic Assignment Model
NeurIPS 2018
0
citations
Parsimonious Quantile Regression of Financial Asset Tail Dynamics via Sequential Learning
NeurIPS 2018
0
citations
Nonlocal Neural Networks, Nonlocal Diffusion and Nonlocal Modeling
NeurIPS 2018
0
citations
Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos
NeurIPS 2019
0
citations
Category Anchor-Guided Unsupervised Domain Adaptation for Semantic Segmentation
NeurIPS 2019
0
citations
Cross-Modal Learning with Adversarial Samples
NeurIPS 2019
0
citations
Towards Playing Full MOBA Games with Deep Reinforcement Learning
NeurIPS 2020
0
citations
Optimal Epoch Stochastic Gradient Descent Ascent Methods for Min-Max Optimization
NeurIPS 2020
0
citations
Adversarial Learning for Robust Deep Clustering
NeurIPS 2020
0
citations
Fewer is More: A Deep Graph Metric Learning Perspective Using Fewer Proxies
NeurIPS 2020
0
citations
Generalized and Discriminative Few-Shot Object Detection via SVD-Dictionary Enhancement
NeurIPS 2021
0
citations
Neural Routing by Memory
NeurIPS 2021
0
citations
FR: Folded Rationalization with a Unified Encoder
NeurIPS 2022
0
citations
Egocentric Video-Language Pretraining
NeurIPS 2022
0
citations
D-Separation for Causal Self-Explanation
NeurIPS 2023
0
citations
Punctuation-level Attack: Single-shot and Single Punctuation Can Fool Text Models
NeurIPS 2023
0
citations
Exploiting Contextual Objects and Relations for 3D Visual Grounding
NeurIPS 2023
0
citations
Evaluating Post-hoc Explanations for Graph Neural Networks via Robustness Analysis
NeurIPS 2023
0
citations
GSOS: Gauss-Seidel Operator Splitting Algorithm for Multi-Term Nonsmooth Convex Composite Optimization
ICML 2017
0
citations
Scaling Up Sparse Support Vector Machines by Simultaneous Feature and Sample Reduction
ICML 2017
0
citations
End-to-end Active Object Tracking via Reinforcement Learning
ICML 2018
0
citations
An Algorithmic Framework of Variable Metric Over-Relaxed Hybrid Proximal Extra-Gradient Method
ICML 2018
0
citations
Safe Element Screening for Submodular Function Minimization
ICML 2018
0
citations