Yueting Zhuang

15
Papers
135
Total Citations

Papers (15)

HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation via Heterogeneous Knowledge Adaptation

ICML 2025
63
citations

VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

CVPR 2025
40
citations

Let LRMs Break Free from Overthinking via Self-Braking Tuning

NeurIPS 2025
13
citations

Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program

ICCV 2025
10
citations

Janus-Pro-R1: Advancing Collaborative Visual Comprehension and Generation via Reinforcement Learning

NeurIPS 2025
6
citations

Image Regeneration: Evaluating Text-to-Image Model via Generating Identical Image with Multimodal Large Language Models

AAAI 2025
3
citations

HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data

CVPR 2024
0
citations

Auto-Encoding Morph-Tokens for Multimodal LLM

ICML 2024
0
citations

AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea

CVPR 2025
0
citations

Momentor: Advancing Video Large Language Model with Fine-Grained Temporal Reasoning

ICML 2024
0
citations

STEP: Enhancing Video-LLMs’ Compositional Reasoning by Spatio-Temporal Graph-guided Self-Training

CVPR 2025
0
citations

Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining

ICCV 2025
0
citations

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

ICCV 2025
0
citations

Mastering Collaborative Multi-modal Data Selection: A Focus on Informativeness, Uniqueness, and Representativeness

ICCV 2025
0
citations

Data Shunt: Collaboration of Small and Large Models for Lower Costs and Better Performance

AAAI 2024
0
citations