Hao Tang

26
Papers
207
Total Citations

Papers (26)

Delving into Multimodal Prompting for Fine-Grained Visual Classification

AAAI 2024, arXiv
55
citations

Stable-Hair: Real-World Hair Transfer via Diffusion Model

AAAI 2025
33
citations

G2P-DDM: Generating Sign Pose Sequence from Gloss Sequence with Discrete Diffusion Model

AAAI 2024, arXiv
31
citations

HandDiff: 3D Hand Pose Estimation with Diffusion on Image-Point Cloud

CVPR 2024
17
citations

Token Transformation Matters: Towards Faithful Post-hoc Explanation for Vision Transformer

CVPR 2024
16
citations

MambaIC: State Space Models for High-Performance Learned Image Compression

CVPR 2025
14
citations

Distilling ODE Solvers of Diffusion Models into Smaller Steps

CVPR 2024
10
citations

DiffFNO: Diffusion Fourier Neural Operator

CVPR 2025
8
citations

RobustMerge: Parameter-Efficient Model Merging for MLLMs with Direction Robustness

NeurIPS 2025
7
citations

Towards Robust 3D Pose Transfer with Adversarial Learning

CVPR 2024
5
citations

Learning with Unreliability: Fast Few-shot Voxel Radiance Fields with Relative Geometric Consistency

CVPR 2024
5
citations

A Training-free Synthetic Data Selection Method for Semantic Segmentation

AAAI 2025
4
citations

Boosting Adversarial Transferability with Spatial Adversarial Alignment

NeurIPS 2025
1
citation

DynImg: Key Frames with Visual Prompts are Good Representation for Multi-Modal Video Understanding

ICCV 2025
1
citation

SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation

CVPR 2024
0
citations

ICON: Incremental CONfidence for Joint Pose and Radiance Field Optimization

CVPR 2024
0
citations

Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass

CVPR 2025
0
citations

Versatile Navigation Under Partial Observability via Value-guided Diffusion Policy

CVPR 2024
0
citations

HOIGPT: Learning Long-Sequence Hand-Object Interaction with Language Models

CVPR 2025
0
citations

On the Faithfulness of Vision Transformer Explanations

CVPR 2024
0
citations

PartRM: Modeling Part-Level Dynamics with Large Cross-State Reconstruction Model

CVPR 2025
0
citations

ARNet: Self-Supervised FG-SBIR with Unified Sample Feature Alignment and Multi-Scale Token Recycling

AAAI 2025
0
citations

MaskSAM: Auto-prompt SAM with Mask Classification for Volumetric Medical Image Segmentation

ICCV 2025
0
citations

Multi-scale Activation, Refinement, and Aggregation: Exploring Diverse Cues for Fine-Grained Bird Recognition

AAAI 2025
0
citations

Toward Adaptive Large Language Models Structured Pruning via Hybrid-grained Weight Importance Assessment

AAAI 2025
0
citations

Similarity Memory Prior is All You Need for Medical Image Segmentation

ICCV 2025
0
citations