Yang Jin
Papers: 8
Total Citations: 6
Papers (8)
TransGOP: Transformer-Based Gaze Object Prediction
AAAI 2024 (arXiv), 6 citations
Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization
ICML 2024, 0 citations
Beyond Short-Term Snippet: Video Relation Detection With Spatio-Temporal Global Context
CVPR 2020, 0 citations
Complex Video Action Reasoning via Learnable Markov Logic Network
CVPR 2022, 0 citations
Learning Instance-Level Representation for Large-Scale Multi-Modal Pretraining in E-Commerce
CVPR 2023, 0 citations
Video Action Segmentation via Contextually Refined Temporal Keypoints
ICCV 2023, 0 citations
Granularity-Adaptive Spatial Evidence Tokenization for Video Question Answering
AAAI 2025, 0 citations
Embracing Consistency: A One-Stage Approach for Spatio-Temporal Video Grounding
NeurIPS 2022, 0 citations