Most Cited COLM "constraint-aware learning" Papers
Hidden in plain sight: VLMs overlook their visual representations
Stephanie Fu, tyler bonnen, Devin Guillory et al.
On Mechanistic Circuits for Extractive Question-Answering
Samyadeep Basu, Vlad I Morariu, Ryan A. Rossi et al.
Partial Perspectives: How LLMs Handle Logically Inconsistent Knowledge in Reasoning Tasks
Zichao Li, Ines Arous, Jackie CK Cheung
Teaching Models to Understand (but not Generate) High-risk Data
Ryan Yixiang Wang, Matthew Finlayson, Luca Soldaini et al.
LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K
Tao Yuan, Xuefei Ning, Dong Zhou et al.
EuroBERT: Scaling Multilingual Encoders for European Languages
Nicolas Boizard, Hippolyte Gisserot-Boukhlef, Duarte Miguel Alves et al.
REM: Evaluating LLM Embodied Spatial Reasoning through Multi-Frame Trajectories
Jacob Thompson, Emiliano Garcia-Lopez, Yonatan Bisk
HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Interactive AI Agents
Xuhui Zhou, Hyunwoo Kim, Faeze Brahman et al.
Rethinking Associative Memory Mechanism in Induction Head
Shuo Wang, Issei Sato
Can LLM "Self-report"?: Evaluating the Validity of Self-report Scales in Measuring Personality Design in LLM-based Chatbots
Huiqi Zou, Pengda Wang, Zihan Yan et al.
GenerationPrograms: Fine-grained Attribution with Executable Programs
David Wan, Eran Hirsch, Elias Stengel-Eskin et al.
UNVEILING: What Makes Linguistics Olympiad Puzzles Tricky for LLMs?
Mukund Choudhary, KV Aditya Srivatsa, Gaurja Aeron et al.
Multilingual and Multi-Accent Jailbreaking of Audio LLMs
Jaechul Roh, Virat Shejwalkar, Amir Houmansadr
Extragradient Preference Optimization (EGPO): Beyond Last-Iterate Convergence for Nash Learning from Human Feedback
Runlong Zhou, Maryam Fazel, Simon Shaolei Du
Language Models Fail to Introspect About Their Knowledge of Language
Siyuan Song, Jennifer Hu, Kyle Mahowald
QAPyramid: Fine-grained Evaluation of Content Selection for Text Summarization
Shiyue Zhang, David Wan, Arie Cattan et al.
Rhapsody: A Dataset for Highlight Detection in Podcasts
Younghan Park, Anuj Diwan, David Harwath et al.
Overflow Prevention Enhances Long-Context Recurrent LLMs
Assaf Ben-Kish, Itamar Zimerman, Muhammad Jehanzeb Mirza et al.