Wei Liu

33 Papers

621 Total Citations

1 Affiliation

Affiliations

The Hong Kong University of Science and Technology

Papers (33)

LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment

BadCLIP: Trigger-Aware Prompt Learning for Backdoor Attacks on CLIP

Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving

MathAttack: Attacking Large Language Models towards Math Solving Ability

IDOL: Instant Photorealistic 3D Human Creation from a Single Image

STIV: Scalable Text and Image Conditioned Video Generation

MotionCraft: Crafting Whole-Body Motion with Plug-and-Play Multimodal Controls

Local Conditional Controlling for Text-to-Image Diffusion Models

GIM: A Million-scale Benchmark for Generative Image Manipulation Detection and Localization

Auto-Regressive Diffusion for Generating 3D Human-Object Interactions

EBMDock: Neural Probabilistic Protein-Protein Docking via a Differentiable Energy Model

Adversarial Cooperative Rationalization: The Risk of Spurious Correlations in Even Clean Datasets

Fix-CLIP: Dual-Branch Hierarchical Contrastive Learning via Synthetic Captions for Better Understanding of Long Text

Enhancing Human Experience in Human-Agent Collaboration: A Human-Centered Modeling Approach Based on Positive Human Gain

ETVA: Evaluation of Text-to-Video Alignment via Fine-grained Question Generation and Answering

Quantifying Distributional Invariance in Causal Subgraph for IRM-Free Graph Generalization

Stability and Generalization of Zeroth-Order Decentralized Stochastic Gradient Descent with Changing Topology

UniMC: Taming Diffusion Transformer for Unified Keypoint-Guided Multi-Class Image Generation

Enhancing Testing-Time Robustness for Trusted Multi-View Classification in the Wild

WonderPlay: Dynamic 3D Scene Generation from a Single Image and Actions

GraspCoT: Integrating Physical Property Reasoning for 6-DoF Grasping under Flexible Language Instructions

HarmonySeg: Tubular Structure Segmentation with Deep-Shallow Feature Fusion and Growth-Suppression Balanced Loss

ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area

Towards More Discriminative Feature Learning in SNNs with Temporal-Self-Erasing Supervision

Infinite-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation

Follow-Your-Click: Open-domain Regional Image Animation via Motion Prompts

Just a Few Glances: Open-Set Visual Perception with Image Prompt Paradigm

Modeling All Response Surfaces in One for Conditional Search Spaces

Enhancing Multi-View Classification Reliability with Adaptive Rejection

Decoupling Representation and Knowledge for Few-Shot Intent Classification and Slot Filling

DreamIdentity: Enhanced Editability for Efficient Face-Identity Preserved Image Generation

SeqGPT: An Out-of-the-Box Large Language Model for Open Domain Sequence Understanding

Modality-Agnostic Structural Image Representation Learning for Deformable Multi-Modality Medical Image Registration