Hao Tang
63 Papers
207 Total Citations
Papers (63)
Delving into Multimodal Prompting for Fine-Grained Visual Classification
AAAI 2024 (arXiv)
55
citations
Stable-Hair: Real-World Hair Transfer via Diffusion Model
AAAI 2025
33
citations
G2P-DDM: Generating Sign Pose Sequence from Gloss Sequence with Discrete Diffusion Model
AAAI 2024 (arXiv)
31
citations
HandDiff: 3D Hand Pose Estimation with Diffusion on Image-Point Cloud
CVPR 2024
17
citations
Token Transformation Matters: Towards Faithful Post-hoc Explanation for Vision Transformer
CVPR 2024
16
citations
MambaIC: State Space Models for High-Performance Learned Image Compression
CVPR 2025
14
citations
Distilling ODE Solvers of Diffusion Models into Smaller Steps
CVPR 2024
10
citations
DiffFNO: Diffusion Fourier Neural Operator
CVPR 2025
8
citations
RobustMerge: Parameter-Efficient Model Merging for MLLMs with Direction Robustness
NeurIPS 2025
7
citations
Learning with Unreliability: Fast Few-shot Voxel Radiance Fields with Relative Geometric Consistency
CVPR 2024
5
citations
Towards Robust 3D Pose Transfer with Adversarial Learning
CVPR 2024
5
citations
A Training-free Synthetic Data Selection Method for Semantic Segmentation
AAAI 2025
4
citations
DynImg: Key Frames with Visual Prompts are Good Representation for Multi-Modal Video Understanding
ICCV 2025
1
citations
Boosting Adversarial Transferability with Spatial Adversarial Alignment
NeurIPS 2025
1
citations
Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation
CVPR 2020 (arXiv)
0
citations
DF-GAN: A Simple and Effective Baseline for Text-to-Image Synthesis
CVPR 2022
0
citations
Learning To Restore 3D Face From In-the-Wild Degraded Images
CVPR 2022
0
citations
MHFormer: Multi-Hypothesis Transformer for 3D Human Pose Estimation
CVPR 2022 (arXiv)
0
citations
Topology-Preserving Shape Reconstruction and Registration via Neural Diffeomorphic Flow
CVPR 2022 (arXiv)
0
citations
Physically-Guided Disentangled Implicit Rendering for 3D Face Modeling
CVPR 2022
0
citations
Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model
CVPR 2022 (arXiv)
0
citations
Graph Transformer GANs for Graph-Constrained House Generation
CVPR 2023 (arXiv)
0
citations
SMAE: Few-Shot Learning for HDR Deghosting With Saturation-Aware Masked Autoencoders
CVPR 2023 (arXiv)
0
citations
Unsupervised Deep Probabilistic Approach for Partial Point Cloud Registration
CVPR 2023 (arXiv)
0
citations
GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis
CVPR 2023 (arXiv)
0
citations
DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network
CVPR 2023 (arXiv)
0
citations
Pruning Parameterization With Bi-Level Optimization for Efficient Semantic Segmentation on the Edge
CVPR 2023
0
citations
Master: Meta Style Transformer for Controllable Zero-Shot and Few-Shot Artistic Style Transfer
CVPR 2023 (arXiv)
0
citations
Intrinsic-Extrinsic Preserved GANs for Unsupervised 3D Pose Transfer
ICCV 2021 (arXiv)
0
citations
Transformer-Based Attention Networks for Continuous Pixel-Wise Prediction
ICCV 2021 (arXiv)
0
citations
Recurrent Mask Refinement for Few-Shot Medical Image Segmentation
ICCV 2021 (arXiv)
0
citations
Learning Concordant Attention via Target-aware Alignment for Visible-Infrared Person Re-identification
ICCV 2023
0
citations
UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation
ICCV 2023
0
citations
XingGAN for Person Image Generation
ECCV 2020
0
citations
PPT: Token-Pruned Pose Transformer for Monocular and Multi-View Human Pose Estimation
ECCV 2022
0
citations
Toward Adaptive Large Language Models Structured Pruning via Hybrid-grained Weight Importance Assessment
AAAI 2025
0
citations
PartRM: Modeling Part-Level Dynamics with Large Cross-State Reconstruction Model
CVPR 2025
0
citations
HOIGPT: Learning Long-Sequence Hand-Object Interaction with Language Models
CVPR 2025
0
citations
Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
CVPR 2025
0
citations
Similarity Memory Prior is All You Need for Medical Image Segmentation
ICCV 2025
0
citations
MaskSAM: Auto-prompt SAM with Mask Classification for Volumetric Medical Image Segmentation
ICCV 2025
0
citations
ARNet: Self-Supervised FG-SBIR with Unified Sample Feature Alignment and Multi-Scale Token Recycling
AAAI 2025
0
citations
Multi-scale Activation, Refinement, and Aggregation: Exploring Diverse Cues for Fine-Grained Bird Recognition
AAAI 2025
0
citations
SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation
CVPR 2024
0
citations
ICON: Incremental CONfidence for Joint Pose and Radiance Field Optimization
CVPR 2024
0
citations
Versatile Navigation Under Partial Observability via Value-guided Diffusion Policy
CVPR 2024
0
citations
On the Faithfulness of Vision Transformer Explanations
CVPR 2024
0
citations
Structured Attention Guided Convolutional Neural Fields for Monocular Depth Estimation
CVPR 2018 (arXiv)
0
citations
Multi-Channel Attention Selection GAN With Cascaded Semantic Guidance for Cross-View Image Translation
CVPR 2019
0
citations
3D-Aware Semantic-Guided Generative Model for Human Synthesis
ECCV 2022
0
citations
Towards Interpretable Video Super-Resolution via Alternating Optimization
ECCV 2022
0
citations
Compiler-Aware Neural Architecture Search for On-Mobile Real-Time Super-Resolution
ECCV 2022
0
citations
Mining Relations among Cross-Frame Affinities for Video Semantic Segmentation
ECCV 2022
0
citations
EgoTracks: A Long-term Egocentric Visual Object Tracking Dataset
NeurIPS 2023
0
citations
Belief Propagation Neural Networks
NeurIPS 2020
0
citations
Refactoring Policy for Compositional Generalizability using Self-Supervised Object Proposals
NeurIPS 2020
0
citations
Towards Scale-Invariant Graph-related Problem Solving by Iterative Homogeneous GNNs
NeurIPS 2020
0
citations
HotBEV: Hardware-oriented Transformer-based Multi-View 3D Detector for BEV Perception
NeurIPS 2023
0
citations
PackQViT: Faster Sub-8-bit Vision Transformers via Full and Packed Quantization on the Mobile
NeurIPS 2023
0
citations
SPViT: Enabling Faster Vision Transformers via Latency-Aware Soft Token Pruning
ECCV 2022
0
citations
LART: Neural Correspondence Learning with Latent Regularization Transformer for 3D Motion Transfer
NeurIPS 2023
0
citations
Does Graph Distillation See Like Vision Dataset Counterpart?
NeurIPS 2023
0
citations
Object Reprojection Error (ORE): Camera pose benchmarks from lightweight tracking annotations
NeurIPS 2023
0
citations