Most Cited COLM "incentive compatibility" Papers

418 papers found • Page 2 of 3

#201

Efficient Construction of Model Family through Progressive Training Using Model Expansion

Kazuki Yano, Sho Takase, Sosuke Kobayashi et al.

COLM 2025paperarXiv:2504.00623
5
citations
#202

Gating is Weighting: Understanding Gated Linear Attention through In-context Learning

Yingcong Li, Davoud Ataee Tarzanagh, Ankit Singh Rawat et al.

COLM 2025paper
5
citations
#203

LoX: Low-Rank Extrapolation Robustifies LLM Safety Against Fine-tuning

Gabriel Jacob Perin, Runjin Chen, Xuxi Chen et al.

COLM 2025paperarXiv:2506.15606
5
citations
#204

Sherkala-Chat: Building a State-of-the-Art LLM for Kazakh in a Moderately Resourced Setting

Fajri Koto, Rituraj Joshi, Nurdaulet Mukhituly et al.

COLM 2025paper
5
citations
#205

VideoSAVi: Self-Aligned Video Language Models without Human Supervision

Yogesh Kulkarni, Pooyan Fazli

COLM 2025paperarXiv:2412.00624
5
citations
#206

Multilingual Contextualization of Large Language Models for Document-Level Machine Translation

Miguel Moura Ramos, Patrick Fernandes, Sweta Agrawal et al.

COLM 2025paperarXiv:2504.12140
5
citations
#207

FormaRL: Enhancing Autoformalization with no Labeled Data

Yanxing Huang, Xinling Jin, Sijie Liang et al.

COLM 2025paperarXiv:2508.18914
5
citations
#208

Positional Biases Shift as Inputs Approach Context Window Limits

Blerta Veseli, Julian Chibane, Mariya Toneva et al.

COLM 2025paperarXiv:2508.07479
5
citations
#209

Both Direct and Indirect Evidence Contribute to Dative Alternation Preferences in Language Models

Qing Yao, Kanishka Misra, Leonie Weissweiler et al.

COLM 2025paperarXiv:2503.20850
4
citations
#210

Rethinking Multilingual Continual Pretraining: Data Mixing for Adapting LLMs Across Languages and Resources

Zihao Li, Shaoxiong Ji, Hengyu Luo et al.

COLM 2025paperarXiv:2504.04152
4
citations
#211

True Multimodal In-Context Learning Needs Attention to the Visual Context

Shuo Chen, Jianzhe Liu, Zhen Han et al.

COLM 2025paperarXiv:2507.15807
4
citations
#212

How Multimodal LLMs Solve Image Tasks: A Lens on Visual Grounding, Task Reasoning, and Answer Decoding

Zhuoran Yu, Yong Jae Lee

COLM 2025paperarXiv:2508.20279
4
citations
#213

EvalAgents: Discovering Implicit Evaluation Criteria from the Web

Manya Wadhwa, Zayne Rea Sprague, Chaitanya Malaviya et al.

COLM 2025paperarXiv:2504.15219
4
citations
#214

Overcoming Vocabulary Constraints with Pixel-level Fallback

Jonas F. Lotz, Hendra Setiawan, Stephan Peitz et al.

COLM 2025paperarXiv:2504.02122
4
citations
#215

QUDsim: Quantifying Discourse Similarities in LLM-Generated Text

Ramya Namuduri, Yating Wu, Anshun Asher Zheng et al.

COLM 2025paperarXiv:2504.09373
4
citations
#216

Quantifying Fairness in LLMs Beyond Tokens: A Semantic and Statistical Perspective

Weijie Xu, Yiwen Wang, Chi Xue et al.

COLM 2025paperarXiv:2506.19028
4
citations
#217

Plato: Plan to Efficient Decode for Large Language Model Inference

Shuowei Jin, Xueshen Liu, Yongji Wu et al.

COLM 2025paperarXiv:2402.12280
4
citations
#218

You Cannot Feed Two Birds with One Score: the Accuracy-Naturalness Tradeoff in Translation

Gergely Flamich, David Vilar, Jan-Thorsten Peter et al.

COLM 2025paperarXiv:2503.24013
4
citations
#219

TRELLIS: Learning to Compress Key-Value Memory in Attention Models

Mahdi Karami, Ali Behrouz, Praneeth Kacham et al.

COLM 2025paperarXiv:2512.23852
4
citations
#220

Always Tell Me The Odds: Fine-grained Conditional Probability Estimation

Liaoyaqi Wang, Zhengping Jiang, Anqi Liu et al.

COLM 2025paperarXiv:2505.01595
4
citations
#221

Control the Temperature: Selective Sampling for Diverse and High-Quality LLM Outputs

Sergey Troshin, Wafaa Mohammed, Yan Meng et al.

COLM 2025paperarXiv:2510.01218
4
citations
#222

PredGen: Accelerated Inference of Large Language Models through Input-Time Speculation for Real-Time Speech Interaction

Shufan Li, Aditya Grover

COLM 2025paperarXiv:2506.15556
4
citations
#223

Model-Agnostic Policy Explanations with Large Language Models

Zhang Xi-Jia, Yue Guo, Shufei Chen et al.

COLM 2025paperarXiv:2504.05625
4
citations
#224

Insights from the Inverse: Reconstructing LLM Training Goals Through Inverse Reinforcement Learning

Jared Joselowitz, Ritam Majumdar, Arjun Jagota et al.

COLM 2025paperarXiv:2410.12491
4
citations
#225

From Next-Token to Mathematics: The Learning Dynamics of Mathematical Reasoning in Language Models

Shubhra Mishra, Gabriel Poesia, Noah Goodman

COLM 2025paperarXiv:2407.00900
4
citations
#226

Can Performant LLMs Be Ethical? Quantifying the Impact of Web Crawling Opt-Outs

Dongyang Fan, Vinko Sabolčec, Matin Ansaripour et al.

COLM 2025paper
4
citations
#227

Data-Centric Human Preference with Rationales for Direct Preference Alignment

Hoang Anh Just, Ming Jin, Anit Kumar Sahu et al.

COLM 2025paperarXiv:2407.14477
4
citations
#228

SentenceKV: Efficient LLM Inference via Sentence-Level Semantic KV Caching

Yuxuan Zhu, Ali Falahati, David H. Yang et al.

COLM 2025paperarXiv:2504.00970
4
citations
#229

AutoScale: Scale-Aware Data Mixing for Pre-Training LLMs

Feiyang Kang, Yifan Sun, Bingbing Wen et al.

COLM 2025paperarXiv:2407.20177
3
citations
#230

Register Always Matters: Analysis of LLM Pretraining Data Through the Lens of Language Variation

Amanda Myntti, Erik Henriksson, Veronika Laippala et al.

COLM 2025paperarXiv:2504.01542
3
citations
#231

Scalable Zeroth-Order Fine-Tuning for Extremely Large Language Models with Limited GPU Memory

Liangyu Wang, Jie Ren, Hang Xu et al.

COLM 2025paperarXiv:2503.12668
3
citations
#232

Planted in Pretraining, Swayed by Finetuning: A Case Study on the Origins of Cognitive Biases in LLMs

Itay Itzhak, Yonatan Belinkov, Gabriel Stanovsky

COLM 2025paperarXiv:2507.07186
3
citations
#233

ParaPO: Aligning Language Models to Reduce Verbatim Reproduction of Pre-training Data

Tong Chen, Faeze Brahman, Jiacheng Liu et al.

COLM 2025paperarXiv:2504.14452
3
citations
#234

ReFeed: Multi-dimensional Summarization Refinement with Reflective Reasoning on Feedback

Taewon Yun, Jihwan Oh, Hyangsuk Min et al.

COLM 2025paperarXiv:2503.21332
3
citations
#235

AdaptMI: Adaptive Skill-based In-context Math Instructions for Small Language Models

Yinghui He, Abhishek Panigrahi, Yong Lin et al.

COLM 2025paperarXiv:2505.00147
3
citations
#236

Short-PHD: Detecting Short LLM-generated Text with Topological Data Analysis After Off-topic Content Insertion

Dongjun Wei, Minjia Mao, Xiao Fang et al.

COLM 2025paperarXiv:2504.02873
3
citations
#237

Resona: Improving Context Copying in Linear Recurrence Models with Retrieval

Xinyu Wang, Linrui Ma, Jerry Huang et al.

COLM 2025paperarXiv:2503.22913
3
citations
#238

On the Effectiveness and Generalization of Race Representations for Debiasing High-Stakes Decisions

Dang Nguyen, Chenhao Tan

COLM 2025paperarXiv:2504.06303
3
citations
#239

Language models align with brain regions that represent concepts across modalities

Maria Ryskina, Greta Tuckute, Alexander Fung et al.

COLM 2025paperarXiv:2508.11536
3
citations
#240

In-context Ranking Preference Optimization

Junda Wu, Rohan Surana, Zhouhang Xie et al.

COLM 2025paperarXiv:2504.15477
3
citations
#241

Post-training for Efficient Communication via Convention Formation

Yilun Hua, Evan Wang, Yoav Artzi

COLM 2025paperarXiv:2508.06482
3
citations
#242

DEL: Context-Aware Dynamic Exit Layer for Efficient Self-Speculative Decoding

Hossein Entezari Zarch, Lei Gao, Chaoyi Jiang et al.

COLM 2025paperarXiv:2504.05598
3
citations
#243

Discovering Knowledge Deficiencies of Language Models on Massive Knowledge Base

Linxin Song, Xuwei Ding, Jieyu Zhang et al.

COLM 2025paperarXiv:2503.23361
3
citations
#244

Adaptive Computation Pruning for the Forgetting Transformer

Zhixuan Lin, Johan Obando-Ceron, Xu Owen He et al.

COLM 2025paperarXiv:2504.06949
3
citations
#245

Improving Fisher Information Estimation and Efficiency for LoRA-based LLM Unlearning

Yejin Kim, Eunwon Kim, Buru Chang et al.

COLM 2025paperarXiv:2508.21300
3
citations
#246

CUPID: Evaluating Personalized and Contextualized Alignment of LLMs from Interactions

Tae Soo Kim, Yoonjoo Lee, Yoonah Park et al.

COLM 2025paperarXiv:2508.01674
3
citations
#247

Vision-Language Models Are Not Pragmatically Competent in Referring Expression Generation

Ziqiao Ma, Jing Ding, Xuejun Zhang et al.

COLM 2025paperarXiv:2504.16060
3
citations
#248

IterKey: Iterative Keyword Generation with LLMs for Enhanced Retrieval Augmented Generation

Kazuki Hayashi, Hidetaka Kamigaito, Shinya Kouda et al.

COLM 2025paperarXiv:2505.08450
3
citations
#249

Probing then Editing Response Personality of Large Language Models

Tianjie Ju, Zhenyu Shao, Bowen Wang et al.

COLM 2025paperarXiv:2504.10227
3
citations
#250

Evaluating Large Language Models as Expert Annotators

Yu-Min Tseng, Wei-Lin Chen, Chung-Chi Chen et al.

COLM 2025paperarXiv:2508.07827
3
citations
#251

MuSeD: A Multimodal Spanish Dataset for Sexism Detection in Social Media Videos

Laura De Grazia, Pol Pastells, Mauro Vázquez Chas et al.

COLM 2025paperarXiv:2504.11169
3
citations
#252

Energy-Based Reward Models for Robust Language Model Alignment

Anamika Lochab, Ruqi Zhang

COLM 2025paperarXiv:2504.13134
3
citations
#253

Text Speaks Louder than Vision: ASCII Art Reveals Textual Biases in Vision-Language Models

Zhaochen Wang, Bryan Hooi, Yiwei Wang et al.

COLM 2025paperarXiv:2504.01589
3
citations
#254

Verifying the Verifiers: Unveiling Pitfalls and Potentials in Fact Verifiers

Wooseok Seo, Seungju Han, Jaehun Jung et al.

COLM 2025paperarXiv:2506.13342
3
citations
#255

Overflow Prevention Enhances Long-Context Recurrent LLMs

Assaf Ben-Kish, Itamar Zimerman, Muhammad Jehanzeb Mirza et al.

COLM 2025paperarXiv:2505.07793
3
citations
#256

Pretraining on the Test Set Is No Longer All You Need: A Debate-Driven Approach to QA Benchmarks

Linbo Cao, Jinman Zhao

COLM 2025paperarXiv:2507.17747
3
citations
#257

VisualTrap: A Stealthy Backdoor Attack on GUI Agents via Visual Grounding Manipulation

Ziang Ye, Yang Zhang, Wentao Shi et al.

COLM 2025paperarXiv:2507.06899
3
citations
#258

Beyond the Reported Cutoff: Where Large Language Models Fall Short on Financial Knowledge

Agam Shah, Liqin Ye, Sebastian Jaskowski et al.

COLM 2025paperarXiv:2504.00042
3
citations
#259

MAC: A Live Benchmark for Multimodal Large Language Models in Scientific Understanding

Mohan Jiang, Jin Gao, Jiahao Zhan et al.

COLM 2025paperarXiv:2508.15802
3
citations
#260

Stuffed Mamba: Oversized States Lead to the Inability to Forget

Yingfa Chen, Xinrong Zhang, Shengding Hu et al.

COLM 2025paper
3
citations
#261

ADAPT: Actively Discovering and Adapting to Preferences for any Task

Maithili Patel, Xavier Puig, Ruta Desai et al.

COLM 2025paperarXiv:2504.04040
2
citations
#262

CLIPPER: Compression enables long-context synthetic data generation

Chau Minh Pham, Yapei Chang, Mohit Iyyer

COLM 2025paperarXiv:2502.14854
2
citations
#263

SQuat: Subspace-orthogonal KV Cache Quantization

Hao Wang, Ligong Han, Kai Xu et al.

COLM 2025paperarXiv:2503.24358
2
citations
#264

Context-Adaptive Multi-Prompt Embedding with Large Language Models for Vision-Language Alignment

Dahun Kim, Anelia Angelova

COLM 2025paperarXiv:2508.02762
2
citations
#265

RankAlign: A Ranking View of the Generator-Validator Gap in Large Language Models

Juan Diego Rodriguez, Wenxuan Ding, Katrin Erk et al.

COLM 2025paper
2
citations
#266

Visual Representations inside the Language Model

Benlin Liu, Amita Kamath, Madeleine Grunde-McLaughlin et al.

COLM 2025paper
2
citations
#267

Noiser: Bounded Input Perturbations for Attributing Large Language Models

Mohammad Reza Ghasemi Madani, Aryo Pradipta Gema, Yu Zhao et al.

COLM 2025paperarXiv:2504.02911
2
citations
#268

Imagine All The Relevance: Scenario-Profiled Indexing with Knowledge Expansion for Dense Retrieval

Sangam Lee, Ryang Heo, SeongKu Kang et al.

COLM 2025paperarXiv:2503.23033
2
citations
#269

In-Context Occam’s Razor: How Transformers Prefer Simpler Hypotheses on the Fly

Puneesh Deora, Bhavya Vasudeva, Tina Behnia et al.

COLM 2025paper
2
citations
#270

LLM Unlearning Without an Expert Curated Dataset

Xiaoyuan Zhu, Muru Zhang, Ollie Liu et al.

COLM 2025paperarXiv:2508.06595
2
citations
#271

Probing Syntax in Large Language Models: Successes and Remaining Challenges

Pablo J. Diego Simon, Emmanuel Chemla, Jean-Remi King et al.

COLM 2025paperarXiv:2508.03211
2
citations
#272

Agree to Disagree? A Meta-Evaluation of LLM Misgendering

Arjun Subramonian, Vagrant Gautam, Preethi Seshadri et al.

COLM 2025paperarXiv:2504.17075
2
citations
#273

Sharpe Ratio-Guided Active Learning for Preference Optimization in RLHF

Syrine Belakaria, Joshua Kazdan, Charles Marx et al.

COLM 2025paperarXiv:2503.22137
2
citations
#274

Humans overrely on overconfident language models, across languages

Neil Rathi, Dan Jurafsky, Kaitlyn Zhou

COLM 2025paperarXiv:2507.06306
2
citations
#275

SpectR: Dynamically Composing LM Experts with Spectral Routing

William Fleshman, Benjamin Van Durme

COLM 2025paperarXiv:2504.03454
2
citations
#276

When Does Metadata Conditioning (NOT) Work for Language Model Pre-Training? A Study with Context-Free Grammars

Rei Higuchi, Ryotaro Kawata, Naoki Nishikawa et al.

COLM 2025paperarXiv:2504.17562
2
citations
#277

Guided Reasoning in LLM-Driven Penetration Testing Using Structured Attack Trees

Katsuaki Nakano, Reza Fayyazi, Shanchieh Yang et al.

COLM 2025paperarXiv:2509.07939
2
citations
#278

Single-Pass Document Scanning for Question Answering

Weili Cao, Jianyou Wang, Youze Zheng et al.

COLM 2025paperarXiv:2504.03101
2
citations
#279

MixAssist: An Audio-Language Dataset for Co-Creative AI Assistance in Music Mixing

Michael Paul Clemens, Ana Marasovic

COLM 2025paperarXiv:2507.06329
2
citations
#280

A Taxonomy of Transcendence

Natalie Abreu, Edwin Zhang, Eran Malach et al.

COLM 2025paperarXiv:2508.17669
2
citations
#281

Language Agents Mirror Human Causal Reasoning Biases. How Can We Help Them Think Like Scientists?

Anthony GX-Chen, Dongyan Lin, Mandana Samiei et al.

COLM 2025paper
2
citations
#282

The Surprising Effectiveness of Membership Inference with Simple N-Gram Coverage

Skyler Hallinan, Jaehun Jung, Melanie Sclar et al.

COLM 2025paperarXiv:2508.09603
2
citations
#283

Correctness-Guaranteed Code Generation via Constrained Decoding

Lingxiao Li, Salar Rahili, Yiwei Zhao

COLM 2025paperarXiv:2508.15866
2
citations
#284

From Queries to Criteria: Understanding How Astronomers Evaluate LLMs

Alina Hyk, Kiera McCormick, Mian Zhong et al.

COLM 2025paperarXiv:2507.15715
2
citations
#285

Approximating Language Model Training Data from Weights

John Xavier Morris, Junjie Oscar Yin, Woojeong Kim et al.

COLM 2025paperarXiv:2506.15553
2
citations
#286

Style over Substance: Distilled Language Models Reason Via Stylistic Replication

Philip Lippmann, Jie Yang

COLM 2025paperarXiv:2504.01738
2
citations
#287

AIR: A Systematic Analysis of Annotations, Instructions, and Response Pairs in Preference Dataset

Bingxiang He, Wenbin Zhang, Jiaxi Song et al.

COLM 2025paperarXiv:2504.03612
2
citations
#288

The Zero Body Problem: Probing LLM Use of Sensory Language

Rebecca M. M. Hicke, Sil Hamilton, David Mimno

COLM 2025paperarXiv:2504.06393
2
citations
#289

MS-SSM: A Multi-Scale State Space Model for Efficient Sequence Modeling

Mahdi Karami, Ali Behrouz, Peilin Zhong et al.

COLM 2025paperarXiv:2512.23824
2
citations
#290

Evaluating and Designing Sparse Autoencoders by Approximating Quasi-Orthogonality

Sewoong Lee, Adam Davies, Marc E. Canby et al.

COLM 2025paperarXiv:2503.24277
2
citations
#291

RARe: Retrieval Augmented Retrieval with In-Context Examples

Atula Tejaswi, Yoonsang Lee, Sujay Sanghavi et al.

COLM 2025paperarXiv:2410.20088
2
citations
#292

Impact-driven Context Filtering For Cross-file Code Completion

Yanzhou Li, Shangqing Liu, Kangjie Chen et al.

COLM 2025paperarXiv:2508.05970
2
citations
#293

Deep Binding of Language Model Virtual Personas: a Study on Approximating Political Partisan Misperceptions

Minwoo Kang, Suhong Moon, Seung Hyeong Lee et al.

COLM 2025paperarXiv:2504.11673
2
citations
#294

Navigating the Rabbit Hole: Emergent Biases in LLM-Generated Attack Narratives Targeting Mental Health Groups

Rijul Magu, Arka Dutta, Sean Kim et al.

COLM 2025paperarXiv:2504.06160
1
citation
#295

UTF-8 Plumbing: Byte-level Tokenizers Unavoidably Enable LLMs to Generate Ill-formed UTF-8

Preston Firestone, Shubham Ugare, Gagandeep Singh et al.

COLM 2025paperarXiv:2511.05578
1
citation
#296

Exploring Sparse Adapters for Scalable Merging of Parameter Efficient Experts

Samin Yeasar Arnob, Zhan Su, Minseon Kim et al.

COLM 2025paperarXiv:2507.07140
1
citation
#297

Exploring Large Language Model Agents for Piloting Social Experiments

Jinghua Piao, Yuwei Yan, Nian Li et al.

COLM 2025paperarXiv:2508.08678
1
citation
#298

Do Large Language Models Have a Planning Theory of Mind? Evidence from MindGames: a Multi-Step Persuasion Task

Jared Moore, Ned Cooper, Rasmus Overmark et al.

COLM 2025paperarXiv:2507.16196
1
citation
#299

Meta-Learning for Speeding Up Large Model Inference in Decentralized Environments

Yipeng Du, Zihao Wang, Ahmad Farhan et al.

COLM 2025paperarXiv:2410.21340
1
citation
#300

Fleurs-SLU: A Massively Multilingual Benchmark for Spoken Language Understanding

Fabian David Schmidt, Ivan Vulić, Goran Glavaš et al.

COLM 2025paperarXiv:2501.06117
1
citation
#301

DualEdit: Dual Editing for Knowledge Updating in Vision-Language Models

Zhiyi Shi, Binjie Wang, Chongjie Si et al.

COLM 2025paper
1
citation
#302

Can a Crow Hatch a Falcon? Lineage Matters in Predicting Large Language Model Performance

Takuya Tamura, Taro Yano, Masafumi Enomoto et al.

COLM 2025paperarXiv:2504.19811
1
citation
#303

MapIQ: Evaluating Multimodal Large Language Models for Map Question Answering

Varun Srivastava, Fan Lei, Srija Mukhopadhyay et al.

COLM 2025paperarXiv:2507.11625
1
citation
#304

Hyperparameter Loss Surfaces Are Simple Near their Optima

Nicholas Lourie, He He, Kyunghyun Cho

COLM 2025paper
1
citation
#305

Elucidating the Design Space of Decay in Linear Attention

Zhen Qin, Xuyang Shen, Yiran Zhong

COLM 2025paperarXiv:2509.05282
1
citation
#306

OpinioRAG: Towards Generating User-Centric Opinion Highlights from Large-scale Online Reviews

Mir Tafseer Nayeem, Davood Rafiei

COLM 2025paperarXiv:2509.00285
1
citation
#307

Can Large Language Models Integrate Spatial Data? Empirical Insights into Reasoning Strengths and Computational Weaknesses

Bin Han, Robert Wolfe, Anat Caspi et al.

COLM 2025paperarXiv:2508.05009
1
citation
#308

Customize Multi-modal RAI Guardrails with Precedent-based predictions

Cheng-Fu Yang, Thanh Tran, Christos Christodoulopoulos et al.

COLM 2025paperarXiv:2507.20503
1
citation
#309

Multi-Agent Retrieval-Augmented Framework for Evidence-Based Counterspeech Against Health Misinformation

Anirban Saha Anik, Xiaoying Song, Elliott Wang et al.

COLM 2025paperarXiv:2507.07307
1
citation
#310

Task-Circuit Quantization: Leveraging Knowledge Localization and Interpretability for Compression

Hanqi Xiao, Yi-Lin Sung, Elias Stengel-Eskin et al.

COLM 2025paperarXiv:2504.07389
1
citation
#311

RRO: LLM Agent Optimization Through Rising Reward Trajectories

Zilong Wang, Jingfeng Yang, Sreyashi Nag et al.

COLM 2025paperarXiv:2505.20737
1
citation
#312

Detecting and Pruning Prominent but Detrimental Neurons in Large Language Models

Ameen Ali Ali, Shahar Katz, Lior Wolf et al.

COLM 2025paperarXiv:2507.09185
1
citation
#313

BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity

Chenyang Song, Weilin Zhao, Xu Han et al.

COLM 2025paperarXiv:2507.08771
1
citation
#314

URANIA: Differentially Private Insights into AI Use

Daogao Liu, Edith Cohen, Badih Ghazi et al.

COLM 2025paperarXiv:2506.04681
1
citation
#315

Resource-efficient Inference with Foundation Model Programs

Lunyiu Nie, Zhimin Ding, Kevin Yu et al.

COLM 2025paperarXiv:2504.07247
1
citation
#316

The World According to LLMs: How Geographic Origin Influences LLMs' Entity Deduction Capabilities

Harsh Nishant Lalai, Raj Sanjay Shah, Jiaxin Pei et al.

COLM 2025paperarXiv:2508.05525
1
citation
#317

Investigating Intersectional Bias in Large Language Models using Confidence Disparities in Coreference Resolution

Falaah Arif Khan, Nivedha Sivakumar, Yinong Oliver Wang et al.

COLM 2025paperarXiv:2508.07111
1
citation
#318

CONCAP: Seeing Beyond English with Concepts Retrieval-Augmented Captioning

George Ibrahim, Rita Ramos, Yova Kementchedjhieva

COLM 2025paperarXiv:2507.20411
1
citation
#319

Privately Learning from Graphs with Applications in Fine-tuning Large Language Models

Haoteng Yin, Rongzhe Wei, Eli Chien et al.

COLM 2025paperarXiv:2410.08299
1
citation
#320

Teach Old SAEs New Domain Tricks with Boosting

Nikita Koriagin, Yaroslav Aksenov, Daniil Laptev et al.

COLM 2025paperarXiv:2507.12990
1
citation
#321

Ensemble Debiasing Across Class and Sample Levels for Fairer Prompting Accuracy

Ruixi Lin, Ziqiao Wang, Yang You

COLM 2025paperarXiv:2503.05157
1
citation
#322

CASCADE Your Datasets for Cross-Mode Knowledge Retrieval of Language Models

Runlong Zhou, Yi Zhang

COLM 2025paperarXiv:2504.01450
1
citation
#323

Implicit In-Context Learning: Evidence from Artificial Language Experiments

Xiaomeng Ma, Qihui Xu

COLM 2025paperarXiv:2503.24190
1
citation
#324

Limitations of refinement methods for weak to strong generalization

Seamus Somerstep, Yaacov Ritov, Mikhail Yurochkin et al.

COLM 2025paper
1
citation
#325

SEAM: Semantically Equivalent Across Modalities Benchmark for Vision-Language Models

Zhenwei Tang, Difan Jiao, Blair Yang et al.

COLM 2025paperarXiv:2508.18179
1
citation
#326

Can Language Models Falsify? Evaluating Algorithmic Reasoning with Counterexample Creation

Shiven Sinha, Shashwat Goel, Ponnurangam Kumaraguru et al.

COLM 2025paperarXiv:2502.19414
1
citation
#327

News is More than a Collection of Facts: Moral Frame Preserving News Summarization

Enrico Liscio, Michela Lorandi, Pradeep K. Murukannaiah

COLM 2025paperarXiv:2504.00657
1
citation
#328

Learning Effective Language Representations for Sequential Recommendation via Joint Embedding Predictive Architecture

Nguyen Anh Minh, Dung D. Le

COLM 2025paperarXiv:2504.10512
1
citation
#329

Layerwise Importance Analysis of Feed-Forward Networks in Transformer-based Language Models

Wataru Ikeda, Kazuki Yano, Ryosuke Takahashi et al.

COLM 2025paperarXiv:2508.17734
1
citation
#330

Mitigating Modal Imbalance in Multimodal Reasoning

Chen Henry Wu, Neil Kale, Aditi Raghunathan

COLM 2025paperarXiv:2510.02608
1
citation
#331

GenerationPrograms: Fine-grained Attribution with Executable Programs

David Wan, Eran Hirsch, Elias Stengel-Eskin et al.

COLM 2025paperarXiv:2506.14580
1
citation
#332

X-EcoMLA: Upcycling Pre-Trained Attention into MLA for Efficient and Extreme KV Compression

Guihong Li, Mehdi Rezagholizadeh, Mingyu Yang et al.

COLM 2025paperarXiv:2503.11132
1
citation
#333

Self-Rewarding PPO: Aligning Large Language Models with Demonstrations Only

Qingru Zhang, Liang Qiu, Ilgee Hong et al.

COLM 2025paperarXiv:2510.21090
1
citation
#334

ALOPE: Adaptive Layer Optimization for Translation Quality Estimation using Large Language Models

Archchana Sindhujan, Shenbin Qian, Chan Chi Chun Matthew et al.

COLM 2025paperarXiv:2508.07484
1
citation
#335

The Negation Bias in Large Language Models: Investigating bias reflected in linguistic markers

Yishan Wang, Pia Sommerauer, Jelke Bloem

COLM 2025paper
1
citation
#336

BiXSE: Improving Dense Retrieval via Probabilistic Graded Relevance Distillation

Christos Tsirigotis, Vaibhav Adlakha, Joao Monteiro et al.

COLM 2025paperarXiv:2508.06781
1
citation
#337

Phased Training for LLM-powered Text Retrieval Models Beyond Data Scaling

Xin Zhang, Yanzhao Zhang, Wen Xie et al.

COLM 2025paper
#338

LLM-based Multi-Agents System Attack via Continuous Optimization with Discrete Efficient Search

Weichen Yu, Kai Hu, Tianyu Pang et al.

COLM 2025paper
#339

LM Agents May Fail to Act on Their Own Risk Knowledge

Yuzhi Tang, Tianxiao Li, Elizabeth Li et al.

COLM 2025paperarXiv:2508.13465
#340

CodeXEmbed: A Generalist Embedding Model Family for Multilingual and Multi-task Code Retrieval

Ye Liu, Rui Meng, Shafiq Joty et al.

COLM 2025paper
#341

Improving Table Understanding with LLMs and Entity-Oriented Search

Thi-Nhung Nguyen, Hoang Ngo, Dinh Phung et al.

COLM 2025paper
#342

Bootstrapping Visual Assistant Modeling with Situated Interaction Simulation

Yichi Zhang, Run Peng, Yinpei Dai et al.

COLM 2025paper
#343

ProsodyLM: Uncovering the Emerging Prosody Processing Capabilities in Speech Language Models

Kaizhi Qian, Xulin Fan, Junrui Ni et al.

COLM 2025paperarXiv:2507.20091
#344

Building Instruction-Tuning Datasets from Human-Written Instructions with Open-Weight Large Language Models

Youmi Ma, Sakae Mizuki, Kazuki Fujii et al.

COLM 2025paperarXiv:2503.23714
#345

Hawkeye: Model Collaboration for Efficient Reasoning

Jianshu She, Zhuohao Li, Zhemin Huang et al.

COLM 2025paper
#346

The Devil is in the EOS: Sequence Training for Detailed Image Captioning

Abdelrahman Mohamed, Yova Kementchedjhieva

COLM 2025paperarXiv:2507.20077
#347

Reasoning Models Know When They’re Right: Probing Hidden States for Self-Verification

Anqi Zhang, Yulin Chen, Jane Pan et al.

COLM 2025paper
#348

Hell or High Water: Evaluating Agentic Recovery from External Failures

Andrew Wang, Sophia Hager, Adi Asija et al.

COLM 2025paperarXiv:2508.11027
#349

Breakpoint: Stress-testing systems-level reasoning in LLM agents

Kaivalya Hariharan, Uzay Girit, Zifan Wang et al.

COLM 2025paper
#350

$\mu$KE: Matryoshka Unstructured Knowledge Editing of Large Language Models

Zian Su, Ziyang Huang, Kaiyuan Zhang et al.

COLM 2025paper
#351

MeMAD: Structured Memory of Debates for Enhanced Multi-Agent Reasoning

Shuai Ling, Lizi Liao, Dongmei Jiang et al.

COLM 2025paper
#352

On Mechanistic Circuits for Extractive Question-Answering

Samyadeep Basu, Vlad I Morariu, Ryan A. Rossi et al.

COLM 2025paperarXiv:2502.08059
#353

Can LLM "Self-report"?: Evaluating the Validity of Self-report Scales in Measuring Personality Design in LLM-based Chatbots

Huiqi Zou, Pengda Wang, Zihan Yan et al.

COLM 2025paper
#354

Pretrained Hybrids with MAD Skills

Nicholas Roberts, Samuel Guo, Zhiqi Gao et al.

COLM 2025paper
#355

SlimMoE: Structured Compression of Large MoE Models via Expert Slimming and Distillation

Zichong Li, Chen Liang, Zixuan Zhang et al.

COLM 2025paperarXiv:2506.18349
#356

Transformers are Efficient Compilers, Provably

Xiyu Zhai, Runlong Zhou, Liao Zhang et al.

COLM 2025paperarXiv:2410.14706
#357

EvidenceBench: A Benchmark for Extracting Evidence from Biomedical Papers

Jianyou Wang, Weili Cao, Kaicheng Wang et al.

COLM 2025paperarXiv:2504.18736
#358

Effective Length Extrapolation via Dimension-Wise Positional Embeddings Manipulation

Yi Lu, Wanxu Zhao, Xin Zhou et al.

COLM 2025paperarXiv:2504.18857
#359

Reverse-engineering NLI: A study of the meta-inferential properties of Natural Language Inference

Rasmus Blanck, Bill Noble, Stergios Chatzikyriakidis

COLM 2025paperarXiv:2601.05170
#360

Improving LLMs’ Generalized Reasoning Abilities by Graph Problems

Qifan Zhang, Nuo Chen, Zehua Li et al.

COLM 2025paperarXiv:2507.17168
#361

Society of Mind Meets Real-Time Strategy: A Hierarchical Multi-Agent Framework for Strategic Reasoning

Daechul Ahn, San Kim, Jonghyun Choi

COLM 2025paperarXiv:2508.06042
#362

Towards Compute-Optimal Many-Shot In-Context Learning

Shahriar Golchin, Yanfei Chen, Rujun Han et al.

COLM 2025paperarXiv:2507.16217
#363

Evaluating LLMs on Chinese Idiom Translation

Cai Yang, Yao Dou, David Heineman et al.

COLM 2025paperarXiv:2508.10421
#364

Scaling Web Agent Training through Automatic Data Generation and Fine-grained Evaluation

Lajanugen Logeswaran, Jaekyeom Kim, Sungryull Sohn et al.

COLM 2025paper
#365

Benchmarking Retrieval-Augmented Generation for Chemistry

Xianrui Zhong, Bowen Jin, Siru Ouyang et al.

COLM 2025paper
#366

Multilingual and Multi-Accent Jailbreaking of Audio LLMs

Jaechul Roh, Virat Shejwalkar, Amir Houmansadr

COLM 2025paper
#367

UNVEILING: What Makes Linguistics Olympiad Puzzles Tricky for LLMs?

Mukund Choudhary, KV Aditya Srivatsa, Gaurja Aeron et al.

COLM 2025paper
#368

Exposing and Patching the Flaws of Large Language Models in Social Character Simulation

Yue Huang, Zhengqing Yuan, Yujun Zhou et al.

COLM 2025paper
#369

Hidden in plain sight: VLMs overlook their visual representations

Stephanie Fu, Tyler Bonnen, Devin Guillory et al.

COLM 2025paperarXiv:2506.08008
#370

CALLME: Call Graph Augmentation with Large Language Models for Javascript

Michael Wang, Kexin Pei, Armando Solar-Lezama

COLM 2025paper
#371

JailDAM: Jailbreak Detection with Adaptive Memory for Vision-Language Model

Yi Nian, Shenzhe Zhu, Yuehan Qin et al.

COLM 2025paperarXiv:2504.03770
#372

IMPersona: Evaluating Individual Level LLM Impersonation

Quan Shi, Carlos E Jimenez, Stephen Dong et al.

COLM 2025paper
#373

Learning by Teaching: Engaging Students as Instructors of Large Language Models in Computer Science Education

Xinming Yang, Haasil Pujara, Jun Li

COLM 2025paperarXiv:2508.05979
#374

CoLa: Learning to Interactively Collaborate with Large Language Models

Abhishek Sharma, Dan Goldwasser

COLM 2025paperarXiv:2504.02965
#375

Reinforcement Learning Enhanced Full-Duplex Spoken Dialogue Language Models for Conversational Interactions

Chen Chen, Ke Hu, Chao-Han Huck Yang et al.

COLM 2025paper
#376

REM: Evaluating LLM Embodied Spatial Reasoning through Multi-Frame Trajectories

Jacob Thompson, Emiliano Garcia-Lopez, Yonatan Bisk

COLM 2025paperarXiv:2512.00736
#377

Partial Perspectives: How LLMs Handle Logically Inconsistent Knowledge in Reasoning Tasks

Zichao Li, Ines Arous, Jackie CK Cheung

COLM 2025paper
#378

Synthetic Data Generation and Multi-Step Reinforcement Learning for Reasoning and Tool Use

Anna Goldie, Azalia Mirhoseini, Hao Zhou et al.

COLM 2025paper
#379

Overfill: Two-Stage Models for Efficient Language Model Decoding

Woojeong Kim, Junxiong Wang, Jing Nathan Yan et al.

COLM 2025paperarXiv:2508.08446
#380

Rethinking Associative Memory Mechanism in Induction Head

Shuo Wang, Issei Sato

COLM 2025paper
#381

Rhapsody: A Dataset for Highlight Detection in Podcasts

Younghan Park, Anuj Diwan, David Harwath et al.

COLM 2025paperarXiv:2505.19429
#382

Analyzing Multilingualism in Large Language Models with Sparse Autoencoders

Ikhyun Cho, Julia Hockenmaier

COLM 2025paper
#383

Impact of LLM Alignment on Impression Formation in Social Interactions

Ala N. Tak, Anahita Bolourani, Daniel B. Shank et al.

COLM 2025paper
#384

Knowledge Graph Retrieval-Augmented Generation via GNN-Guided Prompting

Haochen Liu, Song Wang, Jundong Li

COLM 2025paper
#385

Stop-Think-AutoRegress: Language Modeling with Latent Diffusion Planning

Justin Lovelace, Christian K Belardi, Sofian Zalouk et al.

COLM 2025paper
#386

MSRS: Evaluating Multi-Source Retrieval-Augmented Generation

Rohan Phanse, Ej Zhou, Kejian Shi et al.

COLM 2025paperarXiv:2508.20867
#387

Traceable and Explainable Multimodal Large Language Models: An Information-Theoretic View

Zihan Huang, Junda Wu, Rohan Surana et al.

COLM 2025paper
#388

LawFlow: Collecting and Simulating Lawyers’ Thought Processes on Business Formation Case Studies

Debarati Das, Khanh Chi Le, Ritik Sachin Parkar et al.

COLM 2025paper
#389

HyperINF: Unleashing the HyperPower of Schulz's Method for Data Influence Estimation

Xinyu Zhou, Simin Fan, Martin Jaggi

COLM 2025paper
#390

NoWag: A Unified Framework for Shape Preserving Compression of Large Language Models

Lawrence Ray Liu, Inesh Chakrabarti, Yixiao Li et al.

COLM 2025paper
#391

Truth-value judgment in language models: ‘truth directions’ are context sensitive

Stefan F. Schouten, Peter Bloem, Ilia Markov et al.

COLM 2025paper
#392

E$^2$-RAG: Towards Editable Efficient RAG by Editing Compressed KV Caches

Tongxu Luo, Wenyu Du, HanWen Hao et al.

COLM 2025paper
#393

Yourbench: Dynamic Evaluation Set Generation with LLMs

Sumuk Shashidhar, Clémentine Fourrier, Alina Lozovskaya et al.

COLM 2025paper
#394

Understanding and Improving Noisy Embedding Techniques in Instruction Finetuning

Abhay Yadav

COLM 2025paper
#395

REFA: Reference Free Alignment with Fine-Grained Length Control

Taneesh Gupta, Rahul Madhavan, Xuchao Zhang et al.

COLM 2025paper
#396

2 OLMo 2 Furious (COLM’s Version)

Evan Pete Walsh, Luca Soldaini, Dirk Groeneveld et al.

COLM 2025paper
#397

VaPR - Vision-language Preference alignment for Reasoning

Rohan Wadhawan, Fabrice Y Harel-Canada, Zi-Yi Dou et al.

COLM 2025paperarXiv:2510.01700
#398

Estimating Optimal Context Length for Hybrid Retrieval-augmented Multi-document Summarization

Adithya Pratapa, Teruko Mitamura

COLM 2025paperarXiv:2504.12972
#399

SmolLM2: When Smol Goes Big — Data-Centric Training of a Fully Open Small Language Model

Loubna Ben Allal, Anton Lozhkov, Elie Bakouch et al.

COLM 2025paper
#400

G1yphD3c0de: Towards Safer Language Models on Visually Perturbed Texts

Yejin Choi, Yejin Yeo, Yejin Son et al.

COLM 2025paper