Sheng Jin
27
Papers
358
Total Citations
Papers (27)
Differentiable Hierarchical Graph Grouping for Multi-Person Pose Estimation
ECCV 2020
138
citations
CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction
ICLR 2024
104
citations
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
ICCV 2025
33
citations
CLIM: Contrastive Language-Image Mosaic for Region Representation
AAAI 2024 (arXiv)
24
citations
F-LMM: Grounding Frozen Large Multimodal Models
CVPR 2025
21
citations
AutoMMLab: Automatically Generating Deployable Models from Language Instructions for Computer Vision Tasks
AAAI 2025
14
citations
Weakly Supervised Monocular 3D Detection with a Single-View Image
CVPR 2024
12
citations
Ultra-High Resolution Segmentation via Boundary-Enhanced Patch-Merging Transformer
AAAI 2025
6
citations
NADER: Neural Architecture Design via Multi-Agent Collaboration
CVPR 2025
3
citations
UniFS: Universal Few-shot Instance Perception with Point Representations
ECCV 2024
3
citations
Whole-Body Human Pose Estimation in the Wild
ECCV 2020
0
citations
PoseTrans: A Simple yet Effective Pose Transformation Augmentation for Human Pose Estimation
ECCV 2022
0
citations
3D Interacting Hand Pose Estimation by Hand De-Occlusion and Removal
ECCV 2022
0
citations
Pose for Everything: Towards Category-Agnostic Pose Estimation
ECCV 2022
0
citations
Not All Tokens Are Equal: Human-Centric Visual Analysis via Token Clustering Transformer
CVPR 2022 (arXiv)
0
citations
Unsupervised Continual Domain Shift Learning with Multi-Prototype Modeling
CVPR 2025
0
citations
Multi-Person Articulated Tracking With Spatial and Temporal Embeddings
CVPR 2019
0
citations
When Human Pose Estimation Meets Robustness: Adversarial Algorithms and Benchmarks
CVPR 2021 (arXiv)
0
citations
ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search
CVPR 2021 (arXiv)
0
citations
Aligning Bag of Regions for Open-Vocabulary Object Detection
CVPR 2023 (arXiv)
0
citations
TRB: A Novel Triplet Representation for Understanding 2D Human Body
ICCV 2019
0
citations
Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images
ICCV 2021 (arXiv)
0
citations
Domain Generalization via Balancing Training Difficulty and Model Capability
ICCV 2023 (arXiv)
0
citations
Uncertainty-aware Unsupervised Multi-Object Tracking
ICCV 2023 (arXiv)
0
citations
Connectionist Temporal Classification with Maximum Entropy Regularization
NeurIPS 2018
0
citations
When Counterpoint Meets Chinese Folk Melodies
NeurIPS 2020
0
citations
Category-Extensible Out-of-Distribution Detection via Hierarchical Context Descriptions
NeurIPS 2023
0
citations