Most Cited COLM "distributed llm serving" Papers
The Dual-Route Model of Induction
Sheridan Feucht, Eric Todd, Byron C Wallace et al.
SpectR: Dynamically Composing LM Experts with Spectral Routing
William Fleshman, Benjamin Van Durme
News is More than a Collection of Facts: Moral Frame Preserving News Summarization
Enrico Liscio, Michela Lorandi, Pradeep K. Murukannaiah
BEARCUBS: A benchmark for computer-using web agents
Yixiao Song, Katherine Thai, Chau Minh Pham et al.
Supposedly Equivalent Facts That Aren’t? Entity Frequency in Pre-training Induces Asymmetry in LLMs
Yuan He, Bailan He, Zifeng Ding et al.
Plancraft: an evaluation dataset for planning with LLM agents
Gautier Dagan, Frank Keller, Alex Lascarides
Base Models Beat Aligned Models at Randomness and Creativity
Peter West, Christopher Potts
Discovering Knowledge Deficiencies of Language Models on Massive Knowledge Base
Linxin Song, Xuwei Ding, Jieyu Zhang et al.
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild
Weihao Zeng, Yuzhen Huang, Qian Liu et al.
Can Test-Time Scaling Improve World Foundation Model?
Wenyan Cong, Hanqing Zhu, Peihao Wang et al.
VisOnlyQA: Large Vision Language Models Still Struggle with Visual Perception of Geometric Information
Ryo Kamoi, Yusen Zhang, Sarkar Snigdha Sarathi Das et al.
DeepRetrieval: Hacking Real Search Engines and Retrievers with Large Language Models via Reinforcement Learning
Pengcheng Jiang, Jiacheng Lin, Lang Cao et al.
FineMedLM-o1: Enhancing Medical Knowledge Reasoning Ability of LLM from Supervised Fine-Tuning to Test-Time Training
Hongzhou Yu, Tianhao Cheng, Yingwen Wang et al.
PredGen: Accelerated Inference of Large Language Models through Input-Time Speculation for Real-Time Speech Interaction
Shufan Li, Aditya Grover
LLM-based Multi-Agents System Attack via Continuous Optimization with Discrete Efficient Search
Weichen Yu, Kai Hu, Tianyu Pang et al.
SEAL: Steerable Reasoning Calibration of Large Language Models for Free
Runjin Chen, Zhenyu Zhang, Junyuan Hong et al.
ReFeed: Multi-dimensional Summarization Refinement with Reflective Reasoning on Feedback
Taewon Yun, Jihwan Oh, Hyangsuk Min et al.
Training Plug-and-Play Knowledge Modules with Deep Context Distillation
Lucas Caccia, Alan Ansell, Edoardo Ponti et al.