Most Cited COLM "scaling behavior analysis" Papers
Off-Policy Corrected Reward Modeling for Reinforcement Learning from Human Feedback
Johannes Ackermann, Takashi Ishida, Masashi Sugiyama
Efficient Construction of Model Family through Progressive Training Using Model Expansion
Kazuki Yano, Sho Takase, Sosuke Kobayashi et al.
Inside-Out: Hidden Factual Knowledge in LLMs
Zorik Gekhman, Eyal Ben-David, Hadas Orgad et al.
News is More than a Collection of Facts: Moral Frame Preserving News Summarization
Enrico Liscio, Michela Lorandi, Pradeep K. Murukannaiah
LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K
Tao Yuan, Xuefei Ning, Dong Zhou et al.
Enhancing LLM Reasoning with Iterative DPO: A Comprehensive Empirical Investigation
Songjun Tu, Jiahao Lin, Xiangyu Tian et al.
Agents Are All You Need for LLM Unlearning
Debdeep Sanyal, Murari Mandal
One ruler to measure them all: Benchmarking multilingual long-context language models
Yekyung Kim, Jenna Russell, Marzena Karpinska et al.
Both Direct and Indirect Evidence Contribute to Dative Alternation Preferences in Language Models
Qing Yao, Kanishka Misra, Leonie Weissweiler et al.
TRELLIS: Learning to Compress Key-Value Memory in Attention Models
Mahdi Karami, Ali Behrouz, Praneeth Kacham et al.
Beyond the Reported Cutoff: Where Large Language Models Fall Short on Financial Knowledge
Agam Shah, Liqin Ye, Sebastian Jaskowski et al.
LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation
Juzheng Zhang, Jiacheng You, Ashwinee Panda et al.
CASCADE Your Datasets for Cross-Mode Knowledge Retrieval of Language Models
Runlong Zhou, Yi Zhang
Extragradient Preference Optimization (EGPO): Beyond Last-Iterate Convergence for Nash Learning from Human Feedback
Runlong Zhou, Maryam Fazel, Simon Shaolei Du
ReFeed: Multi-dimensional Summarization Refinement with Reflective Reasoning on Feedback
Taewon Yun, Jihwan Oh, Hyangsuk Min et al.
Modifying Large Language Model Post-Training for Diverse Creative Writing
John Joon Young Chung, Vishakh Padmakumar, Melissa Roemmele et al.
FineMedLM-o1: Enhancing Medical Knowledge Reasoning Ability of LLM from Supervised Fine-Tuning to Test-Time Training
Hongzhou Yu, Tianhao Cheng, Yingwen Wang et al.
Can Test-Time Scaling Improve World Foundation Model?
Wenyan Cong, Hanqing Zhu, Peihao Wang et al.