"language models" Papers
42 papers found
Conference
Advancing Mathematical Reasoning in Language Models: The Impact of Problem-Solving Data, Data Synthesis Methods, and Training Stages
Zui Chen, Tianqiao Liu, Tongqing et al.
A Taxonomy of Transcendence
Natalie Abreu, Edwin Zhang, Eran Malach et al.
Better Estimation of the Kullback--Leibler Divergence Between Language Models
Afra Amini, Tim Vieira, Ryan Cotterell
Dense SAE Latents Are Features, Not Bugs
Xiaoqing Sun, Alessandro Stolfo, Joshua Engels et al.
Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining
Rosie Zhao, Alexandru Meterez, Sham M. Kakade et al.
Emergence of Linear Truth Encodings in Language Models
Shauli Ravfogel, Gilad Yehudai, Tal Linzen et al.
Fluid Language Model Benchmarking
Valentin Hofmann, David Heineman, Ian Magnusson et al.
Fragment and Geometry Aware Tokenization of Molecules for Structure-Based Drug Design Using Language Models
Cong Fu, Xiner Li, Blake Olson et al.
Generalizing Verifiable Instruction Following
Valentina Pyatkin, Saumya Malik, Victoria Graf et al.
GenSE: Generative Speech Enhancement via Language Models using Hierarchical Modeling
Jixun Yao, Hexin Liu, CHEN CHEN et al.
HOIGPT: Learning Long-Sequence Hand-Object Interaction with Language Models
Mingzhen Huang, Fu-Jen Chu, Bugra Tekin et al.
Language Representations Can be What Recommenders Need: Findings and Potentials
Leheng Sheng, An Zhang, Yi Zhang et al.
Learning Effective Language Representations for Sequential Recommendation via Joint Embedding Predictive Architecture
Nguyen Anh Minh, Dung D. Le
Mechanistic Permutability: Match Features Across Layers
Nikita Balagansky, Ian Maksimov, Daniil Gavrilov
Multi-modal Learning: A Look Back and the Road Ahead
Divyam Madaan, Sumit Chopra, Kyunghyun Cho
MUSE: Machine Unlearning Six-Way Evaluation for Language Models
Weijia Shi, Jaechan Lee, Yangsibo Huang et al.
Number Cookbook: Number Understanding of Language Models and How to Improve It
Haotong Yang, Yi Hu, Shijia Kang et al.
Privately Learning from Graphs with Applications in Fine-tuning Large Language Models
Haoteng Yin, Rongzhe Wei, Eli Chien et al.
Revisiting Random Walks for Learning on Graphs
Jinwoo Kim, Olga Zaghen, Ayhan Suleymanzade et al.
Self-Steering Language Models
Gabriel Grand, Joshua B. Tenenbaum, Vikash Mansinghka et al.
Spurious Forgetting in Continual Learning of Language Models
Junhao Zheng, Xidi Cai, Shengjie Qiu et al.
Style over Substance: Distilled Language Models Reason Via Stylistic Replication
Philip Lippmann, Jie Yang
TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models
Makoto Shing, Kou Misaki, Han Bao et al.
The Best of Both Worlds: Integrating Language Models and Diffusion Models for Video Generation
Aoxiong Yin, Kai Shen, Yichong Leng et al.
The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning
Xinyu Zhu, Mengzhou Xia, Zhepei Wei et al.
TopoNets: High performing vision and language models with brain-like topography
Mayukh Deb, Mainak Deb, Apurva Murty
Transformers without Normalization
Jiachen Zhu, Xinlei Chen, Kaiming He et al.
Unifying Autoregressive and Diffusion-Based Sequence Generation
Nima Fathi, Torsten Scholak, Pierre-Andre Noel
Values in the Wild: Discovering and Mapping Values in Real-World Language Model Interactions
Saffron Huang, Esin DURMUS, Kunal Handa et al.
Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It
Yulu Qin, Dheeraj Varghese, Adam Dahlgren Lindström et al.
Applying language models to algebraic topology: generating simplicial cycles using multi-labeling in Wu's formula
Kirill Brilliantov, Fedor Pavutnitskiy, Dmitrii A. Pasechniuk et al.
Converting Transformers to Polynomial Form for Secure Inference Over Homomorphic Encryption
Itamar Zimerman, Moran Baruch, Nir Drucker et al.
Emergent Representations of Program Semantics in Language Models Trained on Programs
Charles Jin, Martin Rinard
Instruction Tuning for Secure Code Generation
Jingxuan He, Mark Vero, Gabriela Krasnopolska et al.
Language Models as Semantic Indexers
Bowen Jin, Hansi Zeng, Guoyin Wang et al.
Model-Based Minimum Bayes Risk Decoding for Text Generation
Yuu Jinnai, Tetsuro Morimura, Ukyo Honda et al.
Multi-Prompts Learning with Cross-Modal Alignment for Attribute-Based Person Re-identification
Yajing Zhai, Yawen Zeng, Zhiyong Huang et al.
OSSCAR: One-Shot Structured Pruning in Vision and Language Models with Combinatorial Optimization
Xiang Meng, Shibal Ibrahim, Kayhan Behdin et al.
Position: Do pretrained Transformers Learn In-Context by Gradient Descent?
Lingfeng Shen, Aayush Mishra, Daniel Khashabi
Revisiting Character-level Adversarial Attacks for Language Models
Elias Abad Rocamora, Yongtao Wu, Fanghui Liu et al.
Simple linear attention language models balance the recall-throughput tradeoff
Simran Arora, Sabri Eyuboglu, Michael Zhang et al.
StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization
Shida Wang, Qianxiao Li