Hao Yang

Papers: 40

Total Citations: 153

Papers (40)

Goku: Flow Based Video Generative Foundation Models

Language-driven All-in-one Adverse Weather Removal

Translate Meanings, Not Just Words: IdiomKB’s Role in Optimizing Idiomatic Translation with Language Models

UniCombine: Unified Multi-Conditional Combination with Diffusion Transformer

PointDGMamba: Domain Generalization of Point Cloud Classification via Generalized State Space Model

Enhancing Numerical Prediction of MLLMs with Soft Labeling

THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models

Exploit Bounding Box Annotations for Multi-Label Object Recognition

Efficient 3D Room Shape Recovery From a Single Panorama

MIML-FCN+: Multi-Instance Multi-Label Learning via Fully Convolutional Networks With Privileged Information

Mask-Guided Portrait Editing With Conditional GANs

Face Parsing With RoI Tanh-Warping

Face X-Ray for More General Face Forgery Detection

Advancing High Fidelity Identity Swapping for Forgery Detection

Unsupervised Pre-Training for Person Re-Identification

Style-Based Point Generator With Adversarial Rendering for Point Cloud Completion

Omni-DETR: Omni-Supervised Object Detection With Transformers

Large-Scale Pre-Training for Person Re-Identification With Noisy Labels

General Facial Representation Learning in a Visual-Linguistic Manner

ContraNeRF: Generalizable Neural Radiance Fields for Synthetic-to-Real Novel View Synthesis via Contrastive Learning

MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining

A Meta-Learning Approach to Predicting Performance and Data Requirements

Guided Recommendation for Model Fine-Tuning

Detecting 11K Classes: Large Scale Object Detection Without Fine-Grained Bounding Boxes

Adversarial Example Detection Using Latent Neighborhood Graph

ADNet: Leveraging Error-Bias Towards Normal Direction in Face Alignment

InterFormer: Real-time Interactive Image Segmentation

Local and Global Logit Adjustments for Long-Tailed Learning

Rethinking Few-Shot Object Detection on a Multi-Domain Benchmark

Real-Time Neural Character Rendering with Pose-Guided Multiplane Images

Scaling up Image Segmentation across Data and Tasks

Met2Net: A Decoupled Two-Stage Spatio-Temporal Forecasting Model for Complex Meteorological Systems

ZeroStereo: Zero-shot Stereo Matching from Single Images

Test-Time Adaptation on Noisy Data via Model-Pruning-Based Filtering and Flatness-Aware Entropy Minimization

AvatarVerse: High-Quality & Stable 3D Avatar Creation from Text and Pose

LDP: Language-driven Dual-Pixel Image Defocus Deblurring Network

Delayed Propagation Transformer: A Universal Computation Engine towards Practical Control in Cyber-Physical Systems

Your representations are in the network: composable and parallel adaptation for large scale models

PRED: Pre-training via Semantic Rendering on LiDAR Point Clouds

From Trainable Negative Depth to Edge Heterophily in Graphs