Xiangyu Yue

30
Papers
322
Total Citations

Papers (30)

Video-R1: Reinforcing Video Reasoning in MLLMs

NeurIPS 2025, arXiv
236
citations

DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation

CVPR 2025
44
citations

Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities

CVPR 2024
11
citations

RAP: Retrieval-Augmented Personalization for Multimodal Large Language Models

CVPR 2025
8
citations

SemGeoMo: Dynamic Contextual Human Motion Generation with Semantic and Geometric Guidance

CVPR 2025
7
citations

SynFER: Towards Boosting Facial Expression Recognition with Synthetic Data

ICCV 2025
6
citations

Training Matting Models Without Alpha Labels

AAAI 2025
4
citations

Breaking the Encoder Barrier for Seamless Video-Language Understanding

ICCV 2025
3
citations

CMT: A Cascade MAR with Topology Predictor for Multimodal Conditional CAD Generation

ICCV 2025
2
citations

HypDAE: Hyperbolic Diffusion Autoencoders for Hierarchical Few-shot Image Generation

ICCV 2025
1
citation

From Easy to Hard: Progressive Active Learning Framework for Infrared Small Target Detection with Single Point Supervision

ICCV 2025
0
citations

Scaling Omni-modal Pretraining with Multimodal Context: Advancing Universal Representation Learning Across Modalities

ICCV 2025
0
citations

Unleashing Vecset Diffusion Model for Fast Shape Generation

ICCV 2025
0
citations

Chimera: Improving Generalist Model with Domain-Specific Experts

ICCV 2025
0
citations

OneLLM: One Framework to Align All Modalities with Language

CVPR 2024
0
citations

UniSTD: Towards Unified Spatio-Temporal Learning across Diverse Disciplines

CVPR 2025
0
citations

Shift: A Zero FLOP, Zero Parameter Alternative to Spatial Convolutions

CVPR 2018, arXiv
0
citations

PolarNet: An Improved Grid Representation for Online LiDAR Point Clouds Semantic Segmentation

CVPR 2020, arXiv
0
citations

Prototypical Cross-Domain Self-Supervised Learning for Few-Shot Unsupervised Domain Adaptation

CVPR 2021, arXiv
0
citations

Domain Randomization and Pyramid Consistency: Simulation-to-Real Generalization Without Accessing Target Domain Data

ICCV 2019
0
citations

Unsupervised Point Cloud Pre-Training via Occlusion Completion

ICCV 2021, arXiv
0
citations

Space Engage: Collaborative Space Supervision for Contrastive-Based Semi-Supervised Semantic Segmentation

ICCV 2023, arXiv
0
citations

Preventing Zero-Shot Transfer Degradation in Continual Learning of Vision-Language Models

ICCV 2023, arXiv
0
citations

Beating Backdoor Attack at Its Own Game

ICCV 2023, arXiv
0
citations

RankSeg: Adaptive Pixel Classification with Image Category Ranking for Segmentation

ECCV 2022
0
citations

Image2Point: 3D Point-Cloud Understanding with 2D Image Pretrained Models

ECCV 2022
0
citations

UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio Video Point Cloud Time-Series and Image Recognition

CVPR 2024
0
citations

FairGen: Enhancing Fairness in Text-to-Image Diffusion Models via Self-Discovering Latent Directions

ICCV 2025
0
citations

Learning Beyond Still Frames: Scaling Vision-Language Models with Video

ICCV 2025
0
citations

Multi-source Domain Adaptation for Semantic Segmentation

NeurIPS 2019
0
citations