Yi Hu
4
Papers
29
Total Citations
Papers (4)
PHYBench: Holistic Evaluation of Physical Perception and Reasoning in Large Language Models
NeurIPS 2025, arXiv
26
citations
Beyond Single-Task: Robust Multi-Task Length Generalization for LLMs
NeurIPS 2025
3
citations
T-SciQ: Teaching Multimodal Chain-of-Thought Reasoning via Large Language Model Signals for Science Question Answering
AAAI 2024
0
citations
Case-Based or Rule-Based: How Do Transformers Do the Math?
ICML 2024
0
citations