Hao Tang
26
Papers
207
Total Citations
Papers (26)
Delving into Multimodal Prompting for Fine-Grained Visual Classification
AAAI 2024 (arXiv)
55
citations
Stable-Hair: Real-World Hair Transfer via Diffusion Model
AAAI 2025
33
citations
G2P-DDM: Generating Sign Pose Sequence from Gloss Sequence with Discrete Diffusion Model
AAAI 2024 (arXiv)
31
citations
HandDiff: 3D Hand Pose Estimation with Diffusion on Image-Point Cloud
CVPR 2024
17
citations
Token Transformation Matters: Towards Faithful Post-hoc Explanation for Vision Transformer
CVPR 2024
16
citations
MambaIC: State Space Models for High-Performance Learned Image Compression
CVPR 2025
14
citations
Distilling ODE Solvers of Diffusion Models into Smaller Steps
CVPR 2024
10
citations
DiffFNO: Diffusion Fourier Neural Operator
CVPR 2025
8
citations
RobustMerge: Parameter-Efficient Model Merging for MLLMs with Direction Robustness
NeurIPS 2025
7
citations
Towards Robust 3D Pose Transfer with Adversarial Learning
CVPR 2024
5
citations
Learning with Unreliability: Fast Few-shot Voxel Radiance Fields with Relative Geometric Consistency
CVPR 2024
5
citations
A Training-free Synthetic Data Selection Method for Semantic Segmentation
AAAI 2025
4
citations
Boosting Adversarial Transferability with Spatial Adversarial Alignment
NeurIPS 2025
1
citation
DynImg: Key Frames with Visual Prompts are Good Representation for Multi-Modal Video Understanding
ICCV 2025
1
citation
SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation
CVPR 2024
0
citations
ICON: Incremental CONfidence for Joint Pose and Radiance Field Optimization
CVPR 2024
0
citations
Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
CVPR 2025
0
citations
Versatile Navigation Under Partial Observability via Value-guided Diffusion Policy
CVPR 2024
0
citations
HOIGPT: Learning Long-Sequence Hand-Object Interaction with Language Models
CVPR 2025
0
citations
On the Faithfulness of Vision Transformer Explanations
CVPR 2024
0
citations
PartRM: Modeling Part-Level Dynamics with Large Cross-State Reconstruction Model
CVPR 2025
0
citations
ARNet: Self-Supervised FG-SBIR with Unified Sample Feature Alignment and Multi-Scale Token Recycling
AAAI 2025
0
citations
MaskSAM: Auto-prompt SAM with Mask Classification for Volumetric Medical Image Segmentation
ICCV 2025
0
citations
Multi-scale Activation, Refinement, and Aggregation: Exploring Diverse Cues for Fine-Grained Bird Recognition
AAAI 2025
0
citations
Toward Adaptive Large Language Models Structured Pruning via Hybrid-grained Weight Importance Assessment
AAAI 2025
0
citations
Similarity Memory Prior is All You Need for Medical Image Segmentation
ICCV 2025
0
citations