Yan Song
Papers: 6
Total Citations: 73
Papers (6)
ReMA: Learning to Meta-Think for LLMs with Multi-agent Reinforcement Learning
NeurIPS 2025, arXiv
36
citations
Efficient Reinforcement Learning with Large Language Model Priors
ICLR 2025
20
citations
ThinkBench: Dynamic Out-of-Distribution Evaluation for Robust LLM Reasoning
NeurIPS 2025, arXiv
14
citations
Reinforcement Learning from Imperfect Corrective Actions and Proxy Rewards
ICLR 2025
3
citations
Agreement aware and dissimilarity oriented GLOM
ICCV 2025
0
citations
Bootstrapping Large Language Models for Radiology Report Generation
AAAI 2024
0
citations