Heng Tao Shen
42
Papers
87
Total Citations
Papers (42)
DePT: Decoupled Prompt Tuning
CVPR 2024
60
citations
ProS: Prompting-to-simulate Generalized knowledge for Universal Cross-Domain Retrieval
CVPR 2024
19
citations
ACT-Diffusion: Efficient Adversarial Consistency Training for One-step Diffusion Models
CVPR 2024
4
citations
TAU-106K: A New Dataset for Comprehensive Understanding of Traffic Accident
ICLR 2025
3
citations
PHGC: Procedural Heterogeneous Graph Completion for Natural Language Task Verification in Egocentric Videos
CVPR 2025
1
citations
CDTR: Semantic Alignment for Video Moment Retrieval Using Concept Decomposition Transformer
AAAI 2025
0
citations
T-SciQ: Teaching Multimodal Chain-of-Thought Reasoning via Large Language Model Signals for Science Question Answering
AAAI 2024
0
citations
Weakly-Supervised Mirror Detection via Scribble Annotations
AAAI 2024
0
citations
Adaptive Uncertainty-Based Learning for Text-Based Person Retrieval
AAAI 2024
0
citations
ScanERU: Interactive 3D Visual Grounding Based on Embodied Reference Understanding
AAAI 2024 (arXiv)
0
citations
Embracing Unimodal Aleatoric Uncertainty for Robust Multimodal Fusion
CVPR 2024
0
citations
Ensemble Diversity Facilitates Adversarial Transferability
CVPR 2024
0
citations
Supervised Discrete Hashing
CVPR 2015
0
citations
Optimal Graph Learning With Partial Tags and Multiple Features for Image and Video Annotation
CVPR 2015
0
citations
What's Wrong With That Object? Identifying Images of Unusual Objects by Modelling the Detection Score Distribution
CVPR 2016
0
citations
Multi-Attention Network for One Shot Learning
CVPR 2017
0
citations
Matrix Tri-Factorization With Manifold Regularizations for Zero-Shot Learning
CVPR 2017
0
citations
Sequence-To-Sequence Domain Adaptation Network for Robust Text Image Recognition
CVPR 2019
0
citations
Exact Adversarial Attack to Image Captioning via Structured Output Learning With Latent Variables
CVPR 2019
0
citations
Searching for Actions on the Hyperbole
CVPR 2020
0
citations
What Machines See Is Not What They Get: Fooling Scene Text Recognition Models With Adversarial Text Images
CVPR 2020
0
citations
Universal Weighting Metric Learning for Cross-Modal Matching
CVPR 2020 (arXiv)
0
citations
Multi-Stage Aggregated Transformer Network for Temporal Language Localization in Videos
CVPR 2021
0
citations
Fine-Grained Predicates Learning for Scene Graph Generation
CVPR 2022 (arXiv)
0
citations
Semi-Supervised Video Paragraph Grounding With Contrastive Encoder
CVPR 2022
0
citations
Meta Distribution Alignment for Generalizable Person Re-Identification
CVPR 2022
0
citations
Unified Multivariate Gaussian Mixture for Efficient Neural Image Compression
CVPR 2022 (arXiv)
0
citations
Multilateral Semantic Relations Modeling for Image Text Retrieval
CVPR 2023
0
citations
Multivariate, Multi-Frequency and Multimodal: Rethinking Graph Neural Networks for Emotion Recognition in Conversation
CVPR 2023
0
citations
Learning Semantic-Aware Knowledge Guidance for Low-Light Image Enhancement
CVPR 2023 (arXiv)
0
citations
Learning Binary Codes for Maximum Inner Product Search
ICCV 2015
0
citations
Leveraging Weak Semantic Relevance for Complex Video Event Classification
ICCV 2017
0
citations
Webly Supervised Fine-Grained Recognition: Benchmark Datasets and an Approach
ICCV 2021 (arXiv)
0
citations
From General to Specific: Informative Scene Graph Generation via Balance Adjustment
ICCV 2021 (arXiv)
0
citations
Part-Aware Transformer for Generalizable Person Re-identification
ICCV 2023 (arXiv)
0
citations
ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction
ICCV 2023
0
citations
Skip Tuning: Pre-trained Vision-Language Models are Effective and Efficient Adapters Themselves
CVPR 2025
0
citations
Patch-wise Attack for Fooling Deep Neural Network
ECCV 2020
0
citations
Towards Explicit Geometry-Reflectance Collaboration for Generalized LiDAR Segmentation in Adverse Weather
CVPR 2025
0
citations
CoSMIC: Continual Self-supervised Learning for Multi-Domain Medical Imaging via Conditional Mutual Information Maximization
ICCV 2025
0
citations
Implicit Counterfactual Learning for Audio-Visual Segmentation
ICCV 2025
0
citations
Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy
ICCV 2025
0
citations