Yueting Zhuang
15 Papers
135 Total Citations
Papers (15)
HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation via Heterogeneous Knowledge Adaptation
ICML 2025
63 citations
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM
CVPR 2025
40 citations
Let LRMs Break Free from Overthinking via Self-Braking Tuning
NeurIPS 2025
13 citations
Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program
ICCV 2025
10 citations
Janus-Pro-R1: Advancing Collaborative Visual Comprehension and Generation via Reinforcement Learning
NeurIPS 2025
6 citations
Image Regeneration: Evaluating Text-to-Image Model via Generating Identical Image with Multimodal Large Language Models
AAAI 2025
3 citations
HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data
CVPR 2024
0 citations
Auto-Encoding Morph-Tokens for Multimodal LLM
ICML 2024
0 citations
AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea
CVPR 2025
0 citations
Momentor: Advancing Video Large Language Model with Fine-Grained Temporal Reasoning
ICML 2024
0 citations
STEP: Enhancing Video-LLMs’ Compositional Reasoning by Spatio-Temporal Graph-guided Self-Training
CVPR 2025
0 citations
Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining
ICCV 2025
0 citations
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
ICCV 2025
0 citations
Mastering Collaborative Multi-modal Data Selection: A Focus on Informativeness, Uniqueness, and Representativeness
ICCV 2025
0 citations
Data Shunt: Collaboration of Small and Large Models for Lower Costs and Better Performance
AAAI 2024
0 citations