Hehe Fan

Papers: 22

Total Citations: 81

Papers (22)

Prototypical Calibrating Ambiguous Samples for Micro-Action Recognition

BVINet: Unlocking Blind Video Inpainting with Zero Annotations

EnergyMoGen: Compositional Human Motion Generation with Energy-Based Diffusion Model in Latent Space

Clustering for Protein Representation Learning

Hand-Centric Motion Refinement for 3D Hand-Object Interaction via Hierarchical Spatial-Temporal Modeling

ZeroMamba: Exploring Visual State Space Model for Zero-Shot Learning

Adapting Text-to-Image Generation with Feature Difference Instruction for Generic Image Restoration

Self-Supervised Global-Local Structure Modeling for Point Cloud Domain Adaptation With Reliable Voted Pseudo Labels

PointListNet: Deep Learning on 3D Point Lists

Complex Event Detection by Identifying Reliable Shots From Untrimmed Videos

Attract or Distract: Exploit the Margin of Open Set

STPrivacy: Spatio-Temporal Privacy-Preserving Action Recognition

Masked Spatio-Temporal Structure Prediction for Self-supervised Learning on Point Cloud Videos

Point Contrastive Prediction with Semantic Clustering for Self-Supervised Learning on Point Cloud Videos

Zero-1-to-A: Zero-Shot One Image to Animatable Head Avatars Using Video Diffusion

Point Cloud Domain Adaptation via Masked Local 3D Structure Prediction

InfiniDreamer: Arbitrarily Long Human Motion Generation via Segment Score Distillation

MMAD: Multi-label Micro-Action Detection in Videos

DocMSU: A Comprehensive Benchmark for Document-Level Multimodal Sarcasm Understanding

Uncovering What, Why and How: A Comprehensive Benchmark for Causation Understanding of Video Anomaly

Improving Context Understanding in Multimodal Large Language Models via Multimodal Composition Learning

Point 4D Transformer Networks for Spatio-Temporal Modeling in Point Cloud Videos