Xiangyu Yue
30 Papers
322 Total Citations
Papers (30)
Video-R1: Reinforcing Video Reasoning in MLLMs
NeurIPS 2025 (arXiv)
236
citations
DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation
CVPR 2025
44
citations
Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities
CVPR 2024
11
citations
RAP: Retrieval-Augmented Personalization for Multimodal Large Language Models
CVPR 2025
8
citations
SemGeoMo: Dynamic Contextual Human Motion Generation with Semantic and Geometric Guidance
CVPR 2025
7
citations
SynFER: Towards Boosting Facial Expression Recognition with Synthetic Data
ICCV 2025
6
citations
Training Matting Models Without Alpha Labels
AAAI 2025
4
citations
Breaking the Encoder Barrier for Seamless Video-Language Understanding
ICCV 2025
3
citations
CMT: A Cascade MAR with Topology Predictor for Multimodal Conditional CAD Generation
ICCV 2025
2
citations
HypDAE: Hyperbolic Diffusion Autoencoders for Hierarchical Few-shot Image Generation
ICCV 2025
1
citations
From Easy to Hard: Progressive Active Learning Framework for Infrared Small Target Detection with Single Point Supervision
ICCV 2025
0
citations
Scaling Omni-modal Pretraining with Multimodal Context: Advancing Universal Representation Learning Across Modalities
ICCV 2025
0
citations
Unleashing Vecset Diffusion Model for Fast Shape Generation
ICCV 2025
0
citations
Chimera: Improving Generalist Model with Domain-Specific Experts
ICCV 2025
0
citations
OneLLM: One Framework to Align All Modalities with Language
CVPR 2024
0
citations
UniSTD: Towards Unified Spatio-Temporal Learning across Diverse Disciplines
CVPR 2025
0
citations
Shift: A Zero FLOP, Zero Parameter Alternative to Spatial Convolutions
CVPR 2018 (arXiv)
0
citations
PolarNet: An Improved Grid Representation for Online LiDAR Point Clouds Semantic Segmentation
CVPR 2020 (arXiv)
0
citations
Prototypical Cross-Domain Self-Supervised Learning for Few-Shot Unsupervised Domain Adaptation
CVPR 2021 (arXiv)
0
citations
Domain Randomization and Pyramid Consistency: Simulation-to-Real Generalization Without Accessing Target Domain Data
ICCV 2019
0
citations
Unsupervised Point Cloud Pre-Training via Occlusion Completion
ICCV 2021 (arXiv)
0
citations
Space Engage: Collaborative Space Supervision for Contrastive-Based Semi-Supervised Semantic Segmentation
ICCV 2023 (arXiv)
0
citations
Preventing Zero-Shot Transfer Degradation in Continual Learning of Vision-Language Models
ICCV 2023 (arXiv)
0
citations
Beating Backdoor Attack at Its Own Game
ICCV 2023 (arXiv)
0
citations
RankSeg: Adaptive Pixel Classification with Image Category Ranking for Segmentation
ECCV 2022
0
citations
Image2Point: 3D Point-Cloud Understanding with 2D Image Pretrained Models
ECCV 2022
0
citations
UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
CVPR 2024
0
citations
FairGen: Enhancing Fairness in Text-to-Image Diffusion Models via Self-Discovering Latent Directions
ICCV 2025
0
citations
Learning Beyond Still Frames: Scaling Vision-Language Models with Video
ICCV 2025
0
citations
Multi-source Domain Adaptation for Semantic Segmentation
NeurIPS 2019
0
citations