Most Cited COLM "directional relation alignment" Papers
418 papers found • Page 2 of 3
FineMedLM-o1: Enhancing Medical Knowledge Reasoning Ability of LLM from Supervised Fine-Tuning to Test-Time Training
Hongzhou Yu, Tianhao Cheng, Yingwen Wang et al.
Can Test-Time Scaling Improve World Foundation Model?
Wenyan Cong, Hanqing Zhu, Peihao Wang et al.
VisOnlyQA: Large Vision Language Models Still Struggle with Visual Perception of Geometric Information
Ryo Kamoi, Yusen Zhang, Sarkar Snigdha Sarathi Das et al.
Hypothesis-Driven Theory-of-Mind Reasoning for Large Language Models
Hyunwoo Kim, Melanie Sclar, Tan Zhi-Xuan et al.
DFRot: Achieving Outlier-Free and Massive Activation-Free for Rotated LLMs with Refined Rotation
Jingyang Xiang, Sai Qian Zhang
The Dual-Route Model of Induction
Sheridan Feucht, Eric Todd, Byron C Wallace et al.
Language Models Fail to Introspect About Their Knowledge of Language
Siyuan Song, Jennifer Hu, Kyle Mahowald
SQuat: Subspace-orthogonal KV Cache Quantization
Hao Wang, Ligong Han, Kai Xu et al.
Hidden in plain sight: VLMs overlook their visual representations
Stephanie Fu, Tyler Bonnen, Devin Guillory et al.
Language Model Uncertainty Quantification with Attention Chain
Yinghao Li, Rushi Qiang, Lama Moukheiber et al.
SmolVLM: Redefining small and efficient multimodal models
Andrés Marafioti, Orr Zohar, Miquel Farré et al.
Efficient Self-Improvement in Multimodal Large Language Models: A Model-Level Judge-Free Approach
Shijian Deng, Wentian Zhao, Yu-Jhe Li et al.
Overflow Prevention Enhances Long-Context Recurrent LLMs
Assaf Ben-Kish, Itamar Zimerman, Muhammad Jehanzeb Mirza et al.
KVSink: Understanding and Enhancing the Preservation of Attention Sinks in KV Cache Quantization for LLMs
Zunhai Su, Kehong Yuan
PredGen: Accelerated Inference of Large Language Models through Input-Time Speculation for Real-Time Speech Interaction
Shufan Li, Aditya Grover
Discovering Knowledge Deficiencies of Language Models on Massive Knowledge Base
Linxin Song, Xuwei Ding, Jieyu Zhang et al.
Assessing Judging Bias in Large Reasoning Models: An Empirical Study
Qian Wang, Zhanzhi Lou, Zhenheng Tang et al.
Can a Crow Hatch a Falcon? Lineage Matters in Predicting Large Language Model Performance
Takuya Tamura, Taro Yano, Masafumi Enomoto et al.
E$^2$-RAG: Towards Editable Efficient RAG by Editing Compressed KV Caches
Tongxu Luo, Wenyu Du, HanWen Hao et al.
Fleurs-SLU: A Massively Multilingual Benchmark for Spoken Language Understanding
Fabian David Schmidt, Ivan Vulić, Goran Glavaš et al.
NoWag: A Unified Framework for Shape Preserving Compression of Large Language Models
Lawrence Ray Liu, Inesh Chakrabarti, Yixiao Li et al.
Evaluating Large Language Models as Expert Annotators
Yu-Min Tseng, Wei-Lin Chen, Chung-Chi Chen et al.
Yourbench: Dynamic Evaluation Set Generation with LLMs
Sumuk Shashidhar, Clémentine Fourrier, Alina Lozovskaya et al.
LawFlow: Collecting and Simulating Lawyers’ Thought Processes on Business Formation Case Studies
Debarati Das, Khanh Chi Le, Ritik Sachin Parkar et al.
Traceable and Explainable Multimodal Large Language Models: An Information-Theoretic View
Zihan Huang, Junda Wu, Rohan Surana et al.
Understanding and Improving Noisy Embedding Techniques in Instruction Finetuning
Abhay Yadav
REFA: Reference Free Alignment with Fine-Grained Length Control
Taneesh Gupta, Rahul Madhavan, Xuchao Zhang et al.
Hyperparameter Loss Surfaces Are Simple Near their Optima
Nicholas Lourie, He He, Kyunghyun Cho
From Next-Token to Mathematics: The Learning Dynamics of Mathematical Reasoning in Language Models
Shubhra Mishra, Gabriel Poesia, Noah Goodman
The Surprising Effectiveness of Membership Inference with Simple N-Gram Coverage
Skyler Hallinan, Jaehun Jung, Melanie Sclar et al.
Synthetic Data Generation and Multi-Step Reinforcement Learning for Reasoning and Tool Use
Anna Goldie, Azalia Mirhoseini, Hao Zhou et al.
MSRS: Evaluating Multi-Source Retrieval-Augmented Generation
Rohan Phanse, Ej Zhou, Kejian Shi et al.
Epistemic Alignment: A Mediating Framework for User-LLM Knowledge Delivery
Nicholas Clark, Hua Shen, Bill Howe et al.
PrefPalette: Personalized Preference Modeling with Latent Attributes
Shuyue Stella Li, Melanie Sclar, Hunter Lang et al.
X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents
Salman Rahman, Liwei Jiang, James Shiffer et al.
Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale
Bowen Jiang, Zhuoqun Hao, Young Min Cho et al.
Language models align with brain regions that represent concepts across modalities
Maria Ryskina, Greta Tuckute, Alexander Fung et al.
SUV: Scalable Large Language Model Copyright Compliance with Regularized Selective Unlearning
Tianyang Xu, Xiaoze Liu, Feijie Wu et al.
Can LLMs Handle WebShell Detection? Overcoming Detection Challenges with Behavioral Function-Aware Framework
Feijiang Han, Jiaming Zhang, Chuyi Deng et al.
LLM Unlearning Without an Expert Curated Dataset
Xiaoyuan Zhu, Muru Zhang, Ollie Liu et al.
Steering Large Language Model Activations in Sparse Spaces
Reza Bayat, Ali Rahimi-Kalahroudi, Mohammad Pezeshki et al.
Adaptive Computation Pruning for the Forgetting Transformer
Zhixuan Lin, Johan Obando-Ceron, Xu Owen He et al.
Pairwise or Pointwise? Evaluating Feedback Protocols for Bias in LLM-Based Evaluation
Tuhina Tripathi, Manya Wadhwa, Greg Durrett et al.
Estimating Optimal Context Length for Hybrid Retrieval-augmented Multi-document Summarization
Adithya Pratapa, Teruko Mitamura
Navigating the Rabbit Hole: Emergent Biases in LLM-Generated Attack Narratives Targeting Mental Health Groups
Rijul Magu, Arka Dutta, Sean Kim et al.
M²IV: Towards Efficient and Fine-grained Multimodal In-Context Learning via Representation Engineering
Yanshu Li, Yi Cao, Hongyang He et al.
BiXSE: Improving Dense Retrieval via Probabilistic Graded Relevance Distillation
Christos Tsirigotis, Vaibhav Adlakha, Joao Monteiro et al.
Stop-Think-AutoRegress: Language Modeling with Latent Diffusion Planning
Justin Lovelace, Christian K Belardi, Sofian Zalouk et al.
In-Context Occam’s Razor: How Transformers Prefer Simpler Hypotheses on the Fly
Puneesh Deora, Bhavya Vasudeva, Tina Behnia et al.
Reasoning Models Know When They’re Right: Probing Hidden States for Self-Verification
Anqi Zhang, Yulin Chen, Jane Pan et al.
The Negation Bias in Large Language Models: Investigating bias reflected in linguistic markers
Yishan Wang, Pia Sommerauer, Jelke Bloem
Language Agents Mirror Human Causal Reasoning Biases. How Can We Help Them Think Like Scientists?
Anthony GX-Chen, Dongyan Lin, Mandana Samiei et al.
Finding Flawed Fictions: Evaluating Complex Reasoning in Language Models via Plot Hole Detection
Kabir Ahuja, Melanie Sclar, Yulia Tsvetkov
Hell or High Water: Evaluating Agentic Recovery from External Failures
Andrew Wang, Sophia Hager, Adi Asija et al.
A Taxonomy of Transcendence
Natalie Abreu, Edwin Zhang, Eran Malach et al.
Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers
Shalev Lifshitz, Sheila A. McIlraith, Yilun Du
Retrieval-Augmented Generation with Conflicting Evidence
Han Wang, Archiki Prasad, Elias Stengel-Eskin et al.
Recycling the Web: A Method to Enhance Pre-training Data Quality and Quantity for Language Models
Thao Nguyen, Yang Li, Olga Golovneva et al.
Impact of LLM Alignment on Impression Formation in Social Interactions
Ala N. Tak, Anahita Bolourani, Daniel B. Shank et al.
MixAssist: An Audio-Language Dataset for Co-Creative AI Assistance in Music Mixing
Michael Paul Clemens, Ana Marasovic
Breakpoint: Stress-testing systems-level reasoning in LLM agents
Kaivalya Hariharan, Uzay Girit, Zifan Wang et al.
Rhapsody: A Dataset for Highlight Detection in Podcasts
Younghan Park, Anuj Diwan, David Harwath et al.
M-Prometheus: A Suite of Open Multilingual LLM Judges
José Pombal, Dongkeun Yoon, Patrick Fernandes et al.
Task Vectors in In-Context Learning: Emergence, Formation, and Benefits
Liu Yang, Ziqian Lin, Kangwook Lee et al.
Can Performant LLMs Be Ethical? Quantifying the Impact of Web Crawling Opt-Outs
Dongyang Fan, Vinko Sabolčec, Matin Ansaripour et al.
Rethinking Associative Memory Mechanism in Induction Head
Shuo Wang, Issei Sato
Stuffed Mamba: Oversized States Lead to the Inability to Forget
Yingfa Chen, Xinrong Zhang, Shengding Hu et al.
Fluid Language Model Benchmarking
Valentin Hofmann, David Heineman, Ian Magnusson et al.
Data-Centric Human Preference with Rationales for Direct Preference Alignment
Hoang Anh Just, Ming Jin, Anit Kumar Sahu et al.
DynaSaur: Large Language Agents Beyond Predefined Actions
Dang Nguyen, Viet Dac Lai, Seunghyun Yoon et al.
PolyGuard: A Multilingual Safety Moderation Tool for 17 Languages
Priyanshu Kumar, Devansh Jain, Akhila Yerukola et al.
Advancing Language Multi-Agent Learning with Credit Re-Assignment for Interactive Environment Generalization
Zhitao He, Zijun Liu, Peng Li et al.
Partial Perspectives: How LLMs Handle Logically Inconsistent Knowledge in Reasoning Tasks
Zichao Li, Ines Arous, Jackie CK Cheung
EvalAgents: Discovering Implicit Evaluation Criteria from the Web
Manya Wadhwa, Zayne Rea Sprague, Chaitanya Malaviya et al.
Cats Confuse Reasoning LLM: Query Agnostic Adversarial Triggers for Reasoning Models
Meghana Arakkal Rajeev, Rajkumar Ramamurthy, Prapti Trivedi et al.
ALFA: Aligning LLMs to Ask Good Questions: A Case Study in Clinical Reasoning
Shuyue Stella Li, Jimin Mun, Faeze Brahman et al.
Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models
Ruikang Liu, Yuxuan Sun, Manyi Zhang et al.
CRUST-Bench: A Comprehensive Benchmark for C-to-safe-Rust Transpilation
Anirudh Khatry, Robert Zhang, Jia Pan et al.
On Mechanistic Circuits for Extractive Question-Answering
Samyadeep Basu, Vlad I Morariu, Ryan A. Rossi et al.
Boosting LLM Reasoning via Spontaneous Self-Correction
Xutong Zhao, Tengyu Xu, Xuewei Wang et al.
REM: Evaluating LLM Embodied Spatial Reasoning through Multi-Frame Trajectories
Jacob Thompson, Emiliano Garcia-Lopez, Yonatan Bisk
GenerationPrograms: Fine-grained Attribution with Executable Programs
David Wan, Eran Hirsch, Elias Stengel-Eskin et al.
Multi-Agent Retrieval-Augmented Framework for Evidence-Based Counterspeech Against Health Misinformation
Anirban Saha Anik, Xiaoying Song, Elliott Wang et al.
Task-Circuit Quantization: Leveraging Knowledge Localization and Interpretability for Compression
Hanqi Xiao, Yi-Lin Sung, Elias Stengel-Eskin et al.
Not All Data Are Unlearned Equally
Aravind Krishnan, Siva Reddy, Marius Mosbach
Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models
Lynn Chua, Badih Ghazi, Yangsibo Huang et al.
FineWeb2: One Pipeline to Scale Them All — Adapting Pre-Training Data Processing to Every Language
Guilherme Penedo, Hynek Kydlíček, Vinko Sabolčec et al.
Overcoming Vocabulary Constraints with Pixel-level Fallback
Jonas F. Lotz, Hendra Setiawan, Stephan Peitz et al.
Breaking the Data Barrier -- Building GUI Agents Through Task Generalization
Junlei Zhang, Zichen Ding, Chang Ma et al.
Spike No More: Stabilizing the Pre-training of Large Language Models
Sho Takase, Shun Kiyono, Sosuke Kobayashi et al.
CultureCLIP: Empowering CLIP with Cultural Awareness through Synthetic Images and Contextualized Captions
Yuchen Huang, Zhiyuan Fan, Zhitao He et al.
VisualTrap: A Stealthy Backdoor Attack on GUI Agents via Visual Grounding Manipulation
Ziang Ye, Yang Zhang, Wentao Shi et al.
Learning Effective Language Representations for Sequential Recommendation via Joint Embedding Predictive Architecture
Nguyen Anh Minh, Dung D. Le
Reinforcement Learning Enhanced Full-Duplex Spoken Dialogue Language Models for Conversational Interactions
Chen Chen, Ke Hu, Chao-Han Huck Yang et al.
When Does Metadata Conditioning (NOT) Work for Language Model Pre-Training? A Study with Context-Free Grammars
Rei Higuchi, Ryotaro Kawata, Naoki Nishikawa et al.
Backdoor Attacks on Dense Retrieval via Public and Unintentional Triggers
Quanyu Long, Yue Deng, Leilei Gan et al.
Layers at Similar Depths Generate Similar Activations Across LLM Architectures
Christopher Wolfram, Aaron Schein
Jigsaw Puzzles: Splitting Harmful Questions to Jailbreak Large Language Models in Multi-turn Interactions
Hao Yang, Lizhen Qu, Ehsan Shareghi et al.
Rerouting LLM Routers
Avital Shafran, Roei Schuster, Tom Ristenpart et al.
Collab-RAG: Boosting Retrieval-Augmented Generation for Complex Question Answering via White-Box and Black-Box LLM Collaboration
Ran Xu, Wenqi Shi, Yuchen Zhuang et al.
CoLa: Learning to Interactively Collaborate with Large Language Models
Abhishek Sharma, Dan Goldwasser
Understanding R1-Zero-Like Training: A Critical Perspective
Zichen Liu, Changyu Chen, Wenjun Li et al.
SlimMoE: Structured Compression of Large MoE Models via Expert Slimming and Distillation
Zichong Li, Chen Liang, Zixuan Zhang et al.
SEAM: Semantically Equivalent Across Modalities Benchmark for Vision-Language Models
Zhenwei Tang, Difan Jiao, Blair Yang et al.
VideoSAVi: Self-Aligned Video Language Models without Human Supervision
Yogesh Kulkarni, Pooyan Fazli
Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents
Saaket Agashe, Kyle Wong, Vincent Tu et al.
How Post-Training Reshapes LLMs: A Mechanistic View on Knowledge, Truthfulness, Refusal, and Confidence
Hongzhe Du, Weikai Li, Min Cai et al.
Implicit In-Context Learning: Evidence from Artificial Language Experiments
Xiaomeng Ma, Qihui Xu
To Backtrack or Not to Backtrack: When Sequential Search Limits Model Reasoning
Tian Qin, David Alvarez-Melis, Samy Jelassi et al.
Sharpe Ratio-Guided Active Learning for Preference Optimization in RLHF
Syrine Belakaria, Joshua Kazdan, Charles Marx et al.
SentenceKV: Efficient LLM Inference via Sentence-Level Semantic KV Caching
Yuxuan Zhu, Ali Falahati, David H. Yang et al.
Defending LLM Watermarking Against Spoofing Attacks with Contrastive Representation Learning
Li An, Yujian Liu, Yepeng Liu et al.
Do Language Models Agree with Human Perceptions of Suspense in Stories?
Glenn Matlin, Devin Zhang, Rodrigo Barroso Loza et al.
Learning by Teaching: Engaging Students as Instructors of Large Language Models in Computer Science Education
Xinming Yang, Haasil Pujara, Jun Li
CALLME: Call Graph Augmentation with Large Language Models for Javascript
Michael Wang, Kexin Pei, Armando Solar-Lezama
CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing
Wenhao Zheng, Yixiao Chen, Weitong Zhang et al.
Approximating Language Model Training Data from Weights
John Xavier Morris, Junjie Oscar Yin, Woojeong Kim et al.
Brains vs. Bytes: Evaluating LLM Proficiency in Olympiad Mathematics
Hamed Mahdavi, Alireza Hashemi, Majid Daliri et al.
Hardware-Efficient Attention for Fast Decoding
Ted Zadouri, Hubert Strauss, Tri Dao
Evaluating and Designing Sparse Autoencoders by Approximating Quasi-Orthogonality
Sewoong Lee, Adam Davies, Marc E. Canby et al.
In-context Ranking Preference Optimization
Junda Wu, Rohan Surana, Zhouhang Xie et al.
Arctic-Embed 2.0: Multilingual Retrieval Without Compromise
Puxuan Yu, Luke Merrick, Gaurav Nuti et al.
Multilingual Contextualization of Large Language Models for Document-Level Machine Translation
Miguel Moura Ramos, Patrick Fernandes, Sweta Agrawal et al.
SlowFast-LLaVA-1.5: A Family of Token-Efficient Video Large Language Models for Long-Form Video Understanding
Mingze Xu, Mingfei Gao, Shiyu Li et al.
DEL: Context-Aware Dynamic Exit Layer for Efficient Self-Speculative Decoding
Hossein Entezari Zarch, Lei Gao, Chaoyi Jiang et al.
Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal LLMs on Academic Resources
Weizhi Wang, Yu Tian, Linjie Yang et al.
CRABS: A syntactic-semantic pincer strategy for bounding LLM interpretation of Python notebooks
Meng Li, Timothy M. McPhillips, Dingmin Wang et al.
2 OLMo 2 Furious (COLM’s Version)
Evan Pete Walsh, Luca Soldaini, Dirk Groeneveld et al.
Algorithm Discovery With LLMs: Evolutionary Search Meets Reinforcement Learning
Anja Šurina, Amin Mansouri, Lars C.P.M. Quaedvlieg et al.
MS-SSM: A Multi-Scale State Space Model for Efficient Sequence Modeling
Mahdi Karami, Ali Behrouz, Peilin Zhong et al.
IterKey: Iterative Keyword Generation with LLMs for Enhanced Retrieval Augmented Generation
Kazuki Hayashi, Hidetaka Kamigaito, Shinya Kouda et al.
Evaluating the Diversity and Quality of LLM Generated Content
Alexander Shypula, Shuo Li, Botong Zhang et al.
QUDsim: Quantifying Discourse Similarities in LLM-Generated Text
Ramya Namuduri, Yating Wu, Anshun Asher Zheng et al.
A Critical Look At Tokenwise Reward-Guided Text Generation
Ahmad Rashid, Ruotian Wu, Julia Grosse et al.
Vision-Language Models Are Not Pragmatically Competent in Referring Expression Generation
Ziqiao Ma, Jing Ding, Xuejun Zhang et al.
URANIA: Differentially Private Insights into AI Use
Daogao Liu, Edith Cohen, Badih Ghazi et al.
RepoST: Scalable Repository-Level Coding Environment Construction with Sandbox Testing
Yiqing Xie, Alex Xie, Divyanshu Sheth et al.
Weak-for-Strong: Training Weak Meta-Agent to Harness Strong Executors
Fan Nie, Lan Feng, Haotian Ye et al.
Guided Reasoning in LLM-Driven Penetration Testing Using Structured Attack Trees
Katsuaki Nakano, Reza Fayyazi, Shanchieh Yang et al.
AdaptMI: Adaptive Skill-based In-context Math Instructions for Small Language Models
Yinghui He, Abhishek Panigrahi, Yong Lin et al.
Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?
Chenrui Fan, Ming Li, Lichao Sun et al.
OpinioRAG: Towards Generating User-Centric Opinion Highlights from Large-scale Online Reviews
Mir Tafseer Nayeem, Davood Rafiei
Learning to Generate Unit Tests for Automated Debugging
Archiki Prasad, Elias Stengel-Eskin, Justin Chen et al.
LLM Unlearning Reveals a Stronger-Than-Expected Coreset Effect in Current Benchmarks
Soumyadeep Pal, Changsheng Wang, James Diffenderfer et al.
Prompt-Reverse Inconsistency: LLM Self-Inconsistency Beyond Generative Randomness and Prompt Paraphrasing
Jihyun Janice Ahn, Wenpeng Yin
Positional Biases Shift as Inputs Approach Context Window Limits
Blerta Veseli, Julian Chibane, Mariya Toneva et al.
ADAPT: Actively Discovering and Adapting to Preferences for any Task
Maithili Patel, Xavier Puig, Ruta Desai et al.
Streaming DiLoCo with overlapping communication
Arthur Douillard, Yani Donchev, J Keith Rush et al.
MuSeD: A Multimodal Spanish Dataset for Sexism Detection in Social Media Videos
Laura De Grazia, Pol Pastells, Mauro Vázquez Chas et al.
EvidenceBench: A Benchmark for Extracting Evidence from Biomedical Papers
Jianyou Wang, Weili Cao, Kaicheng Wang et al.
An Illusion of Progress? Assessing the Current State of Web Agents
Tianci Xue, Weijian Qi, Tianneng Shi et al.
Boundless Byte Pair Encoding: Breaking the Pre-tokenization Barrier
Craig W Schmidt, Varshini Reddy, Chris Tanner et al.
Multi-Agent Systems Execute Arbitrary Malicious Code
Harold Triedman, Rishi Dev Jha, Vitaly Shmatikov
Privately Learning from Graphs with Applications in Fine-tuning Large Language Models
Haoteng Yin, Rongzhe Wei, Eli Chien et al.
Evaluating LLMs on Chinese Idiom Translation
Cai Yang, Yao Dou, David Heineman et al.
Rethinking Safety in LLM Fine-tuning: An Optimization Perspective
Minseon Kim, Jin Myung Kwak, Lama Alssum et al.
Do LLMs Understand Your Translations? Evaluating Paragraph-level MT with Question Answering
Patrick Fernandes, Sweta Agrawal, Emmanouil Zaranis et al.
Law of Vision Representation in MLLMs
Shijia Yang, Bohan Zhai, Quanzeng You et al.
Exploring Sparse Adapters for Scalable Merging of Parameter Efficient Experts
Samin Yeasar Arnob, Zhan Su, Minseon Kim et al.
Towards Compute-Optimal Many-Shot In-Context Learning
Shahriar Golchin, Yanfei Chen, Rujun Han et al.
LeakAgent: RL-based Red-teaming Agent for LLM Privacy Leakage
Yuzhou Nie, Zhun Wang, Ye Yu et al.
ReasonIR: Training Retrievers for Reasoning Tasks
Rulin Shao, Rui Qiao, Varsha Kishore et al.
Déjà Vu: Multilingual LLM Evaluation through the Lens of Machine Translation Evaluation
Julia Kreutzer, Eleftheria Briakou, Sweta Agrawal et al.
Self-Rewarding PPO: Aligning Large Language Models with Demonstrations Only
Qingru Zhang, Liang Qiu, Ilgee Hong et al.
Self-Steering Language Models
Gabriel Grand, Joshua B. Tenenbaum, Vikash Mansinghka et al.
The Delta Learning Hypothesis: Preference Tuning on Weak Data can Yield Strong Gains
Scott Geng, Hamish Ivison, Chun-Liang Li et al.
ParaPO: Aligning Language Models to Reduce Verbatim Reproduction of Pre-training Data
Tong Chen, Faeze Brahman, Jiacheng Liu et al.
Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate
Yubo Wang, Xiang Yue, Wenhu Chen
Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving
Yong Lin, Shange Tang, Bohan Lyu et al.
Fast Controlled Generation from Language Models with Adaptive Weighted Rejection Sampling
Ben Lipkin, Benjamin LeBrun, Jacob Hoover Vigly et al.
On the Effectiveness and Generalization of Race Representations for Debiasing High-Stakes Decisions
Dang Nguyen, Chenhao Tan
Quantifying Fairness in LLMs Beyond Tokens: A Semantic and Statistical Perspective
Weijie Xu, Yiwen Wang, Chi Xue et al.
Improving Fisher Information Estimation and Efficiency for LoRA-based LLM Unlearning
Yejin Kim, Eunwon Kim, Buru Chang et al.
Multi-Token Attention
Olga Golovneva, Tianlu Wang, Jason E Weston et al.
From Queries to Criteria: Understanding How Astronomers Evaluate LLMs
Alina Hyk, Kiera McCormick, Mian Zhong et al.
Analyzing Multilingualism in Large Language Models with Sparse Autoencoders
Ikhyun Cho, Julia Hockenmaier
CrossWordBench: Evaluating the Reasoning Capabilities of LLMs and LVLMs with Controllable Puzzle Generation
Jixuan Leng, Chengsong Huang, Langlin Huang et al.
Unifying Autoregressive and Diffusion-Based Sequence Generation
Nima Fathi, Torsten Scholak, Pierre-Andre Noel
Control the Temperature: Selective Sampling for Diverse and High-Quality LLM Outputs
Sergey Troshin, Wafaa Mohammed, Yan Meng et al.
UTF-8 Plumbing: Byte-level Tokenizers Unavoidably Enable LLMs to Generate Ill-formed UTF-8
Preston Firestone, Shubham Ugare, Gagandeep Singh et al.
RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale
Daniel Goldstein, Eric Alcaide, Janna Lu et al.
Learning Adaptive Parallel Reasoning with Language Models
Jiayi Pan, Xiuyu Li, Long Lian et al.
Why do LLMs attend to the first token?
Federico Barbero, Alvaro Arroyo, Xiangming Gu et al.
Overfill: Two-Stage Models for Efficient Language Model Decoding
Woojeong Kim, Junxiong Wang, Jing Nathan Yan et al.
CLIPPER: Compression enables long-context synthetic data generation
Chau Minh Pham, Yapei Chang, Mohit Iyyer
Society of Mind Meets Real-Time Strategy: A Hierarchical Multi-Agent Framework for Strategic Reasoning
Daechul Ahn, San Kim, Jonghyun Choi
A Sober Look at Progress in Language Model Reasoning: Pitfalls and Paths to Reproducibility
Andreas Hochlehnert, Hardik Bhatnagar, Vishaal Udandarao et al.
SPIN-Bench: How Well Do LLMs Plan Strategically and Reason Socially?
Jianzhu Yao, Kevin Wang, Ryan Hsieh et al.
Teach Old SAEs New Domain Tricks with Boosting
Nikita Koriagin, Yaroslav Aksenov, Daniil Laptev et al.
Improving LLMs' Generalized Reasoning Abilities by Graph Problems
Qifan Zhang, Nuo Chen, Zehua Li et al.
Zero-shot Benchmarking: A Framework for Flexible and Scalable Automatic Evaluation of Language Models
José Pombal, Nuno M Guerreiro, Ricardo Rei et al.
Register Always Matters: Analysis of LLM Pretraining Data Through the Lens of Language Variation
Amanda Myntti, Erik Henriksson, Veronika Laippala et al.
Scoring Verifiers: Evaluating Synthetic Verification for Code and Reasoning
Aleksander Ficek, Somshubra Majumdar, Vahid Noroozi et al.
Cascade Reward Sampling for Efficient Decoding-Time Alignment
Bolian Li, Yifan Wang, Anamika Lochab et al.
Reverse-engineering NLI: A study of the meta-inferential properties of Natural Language Inference
Rasmus Blanck, Bill Noble, Stergios Chatzikyriakidis
Adversarial Training of Reward Models
Alexander Bukharin, Haifeng Qian, Shengyang Sun et al.
Planted in Pretraining, Swayed by Finetuning: A Case Study on the Origins of Cognitive Biases in LLMs
Itay Itzhak, Yonatan Belinkov, Gabriel Stanovsky
ALOPE: Adaptive Layer Optimization for Translation Quality Estimation using Large Language Models
Archchana Sindhujan, Shenbin Qian, Chan Chi Chun Matthew et al.
The Blessing and Curse of Dimensionality in Safety Alignment
Rachel S.Y. Teo, Laziz Abdullaev, Tan Minh Nguyen
AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation
Itay Nakash, Nitay Calderon, Eyal Ben-David et al.