Xiaoming Wei
Papers: 15
Total Citations: 53
Papers (15)
Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation
NeurIPS 2025 (arXiv)
30
citations
ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting
CVPR 2024
16
citations
ARIG: Autoregressive Interactive Head Generation for Real-time Conversations
ICCV 2025 (arXiv)
7
citations
BEM: Balanced and Entropy-based Mix for Long-Tailed Semi-Supervised Learning
CVPR 2024
0
citations
Animating General Image with Large Visual Motion Model
CVPR 2024
0
citations
Rethinking BiSeNet for Real-Time Semantic Segmentation
CVPR 2021 (arXiv)
0
citations
Embedded Discriminative Attention Mechanism for Weakly Supervised Semantic Segmentation
CVPR 2021
0
citations
Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation
CVPR 2022
0
citations
Bridging Search Region Interaction With Template for RGB-T Tracking
CVPR 2023
0
citations
Elastic Aggregation for Federated Optimization
CVPR 2023
0
citations
Masked Auto-Encoders Meet Generative Adversarial Networks and Beyond
CVPR 2023
0
citations
LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding
CVPR 2025
0
citations
Adaptive Spatial-BCE Loss for Weakly Supervised Semantic Segmentation
ECCV 2022
0
citations
Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation
AAAI 2025
0
citations
Real3D: The Curious Case of Neural Scene Degeneration
AAAI 2024
0
citations