Most Cited COLM "4d parallelism" Papers
EuroBERT: Scaling Multilingual Encoders for European Languages
Nicolas Boizard, Hippolyte Gisserot-Boukhlef, Duarte Miguel Alves et al.
Single-Pass Document Scanning for Question Answering
Weili Cao, Jianyou Wang, Youze Zheng et al.
CUPID: Evaluating Personalized and Contextualized Alignment of LLMs from Interactions
Tae Soo Kim, Yoonjoo Lee, Yoonah Park et al.
Elucidating the Design Space of Decay in Linear Attention
Zhen Qin, Xuyang Shen, Yiran Zhong
Beyond the Reported Cutoff: Where Large Language Models Fall Short on Financial Knowledge
Agam Shah, Liqin Ye, Sebastian Jaskowski et al.
Refusal Tokens: A Simple Way to Calibrate Refusals in Large Language Models
Neel Jain, Aditya Shrivastava, Chenyang Zhu et al.
DFRot: Achieving Outlier-Free and Massive Activation-Free for Rotated LLMs with Refined Rotation
Jingyang Xiang, Sai Qian Zhang
Short-PHD: Detecting Short LLM-generated Text with Topological Data Analysis After Off-topic Content Insertion
Dongjun Wei, Minjia Mao, Xiao Fang et al.
Don’t lie to your friends: Learning what you know from collaborative self-play
Jacob Eisenstein, Reza Aghajani, Adam Fisch et al.
Efficient Construction of Model Family through Progressive Training Using Model Expansion
Kazuki Yano, Sho Takase, Sosuke Kobayashi et al.
ProsodyLM: Uncovering the Emerging Prosody Processing Capabilities in Speech Language Models
Kaizhi Qian, Xulin Fan, Junrui Ni et al.
Analyzing Multilingualism in Large Language Models with Sparse Autoencoders
Ikhyun Cho, Julia Hockenmaier
Assessing Judging Bias in Large Reasoning Models: An Empirical Study
Qian Wang, Zhanzhi Lou, Zhenheng Tang et al.
Towards Compute-Optimal Many-Shot In-Context Learning
Shahriar Golchin, Yanfei Chen, Rujun Han et al.
CodeXEmbed: A Generalist Embedding Model Family for Multilingual and Multi-task Code Retrieval
Ye Liu, Rui Meng, Shafiq Joty et al.
Evaluating LLMs on Chinese Idiom Translation
Cai Yang, Yao Dou, David Heineman et al.
EvidenceBench: A Benchmark for Extracting Evidence from Biomedical Papers
Jianyou Wang, Weili Cao, Kaicheng Wang et al.
Overfill: Two-Stage Models for Efficient Language Model Decoding
Woojeong Kim, Junxiong Wang, Jing Nathan Yan et al.