🧬 Language Models

Language Modeling

Core language modeling techniques

568 papers (showing top 100) · 10,201 total citations
Mar '24 to Feb '26: 481 papers
Also includes: language modeling, causal language modeling, masked language modeling

Top Papers

#1

InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks

Zhe Chen, Jiannan Wu, Wenhai Wang et al.

CVPR 2024 · arXiv:2312.14238
2,210
citations
#2

LISA: Reasoning Segmentation via Large Language Model

Xin Lai, Zhuotao Tian, Yukang Chen et al.

CVPR 2024 · arXiv:2308.00692
721
citations
#3

VILA: On Pre-training for Visual Language Models

Ji Lin, Danny Yin, Wei Ping et al.

CVPR 2024 · arXiv:2312.07533
685
citations
#4

mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration

Qinghao Ye, Haiyang Xu, Jiabo Ye et al.

CVPR 2024 · arXiv:2311.04257
601
citations
#5

Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces

Jihan Yang, Shusheng Yang, Anjali W. Gupta et al.

CVPR 2025
342
citations
#6

LLaVA-CoT: Let Vision Language Models Reason Step-by-Step

Guowei Xu, Peng Jin, Ziang Wu et al.

ICCV 2025 · arXiv:2411.10440
338
citations
#7

Mixture-of-Agents Enhances Large Language Model Capabilities

Junlin Wang, Jue Wang, Ben Athiwaratkun et al.

ICLR 2025
274
citations
#8

Large Language Models as Tool Makers

Tianle Cai, Xuezhi Wang, Tengyu Ma et al.

ICLR 2024 · arXiv:2305.17126
262
citations
#9

On Scaling Up a Multilingual Vision and Language Model

Xi Chen, Josip Djolonga, Piotr Padlewski et al.

CVPR 2024 · arXiv:2305.18565
254
citations
#10

Language Models Represent Space and Time

Wes Gurnee, Max Tegmark

ICLR 2024 · arXiv:2310.02207
232
citations
#11

R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization

Jingyi Zhang, Jiaxing Huang, Huanjin Yao et al.

ICCV 2025
206
citations
#12

Training Language Models to Reason Efficiently

Daman Arora, Andrea Zanette

NeurIPS 2025
155
citations
#13

Physics of Language Models: Part 3.2, Knowledge Manipulation

Zeyuan Allen-Zhu, Yuanzhi Li

ICLR 2025
142
citations
#14

SmartEdit: Exploring Complex Instruction-based Image Editing with Multimodal Large Language Models

Yuzhou Huang, Liangbin Xie, Xintao Wang et al.

CVPR 2024 · arXiv:2312.06739
139
citations
#15

Large Language Models as Analogical Reasoners

Michihiro Yasunaga, Xinyun Chen, Yujia Li et al.

ICLR 2024 · arXiv:2310.01714
131
citations
#16

GSVA: Generalized Segmentation via Multimodal Large Language Models

Zhuofan Xia, Dongchen Han, Yizeng Han et al.

CVPR 2024 · arXiv:2312.10103
127
citations
#17

Layer by Layer: Uncovering Hidden Representations in Language Models

Oscar Skean, Md Rifat Arefin, Dan Zhao et al.

ICML 2025 · arXiv:2502.02013
118
citations
#18

Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference

Han Zhao, Min Zhang, Wei Zhao et al.

AAAI 2025 · arXiv:2403.14520
106
citations
#19

Understanding Catastrophic Forgetting in Language Models via Implicit Inference

Suhas Kotha, Jacob Springer, Aditi Raghunathan

ICLR 2024 · arXiv:2309.10105
103
citations
#20

LayoutLLM: Layout Instruction Tuning with Large Language Models for Document Understanding

Chuwei Luo, Yufan Shen, Zhaoqing Zhu et al.

CVPR 2024 · arXiv:2404.05225
98
citations
#21

Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process

Tian Ye, Zicheng Xu, Yuanzhi Li et al.

ICLR 2025 · arXiv:2407.20311
98
citations
#22

Training Socially Aligned Language Models on Simulated Social Interactions

Ruibo Liu, Ruixin Yang, Chenyan Jia et al.

ICLR 2024 · arXiv:2305.16960
88
citations
#23

Remarkable Robustness of LLMs: Stages of Inference?

Vedang Lad, Jin Hwa Lee, Wes Gurnee et al.

NeurIPS 2025 · arXiv:2406.19384
layer deletion, adjacent layer swapping, structural interventions, inference stages (+4 more)
87
citations
#24

Eliciting Human Preferences with Language Models

Belinda Li, Alex Tamkin, Noah Goodman et al.

ICLR 2025 · arXiv:2310.11589
76
citations
#25

Generalization v.s. Memorization: Tracing Language Models’ Capabilities Back to Pretraining Data

Xinyi Wang, Antonis Antoniades, Yanai Elazar et al.

ICLR 2025 · arXiv:2407.14985
76
citations
#26

GROUNDHOG: Grounding Large Language Models to Holistic Segmentation

Yichi Zhang, Ziqiao Ma, Xiaofeng Gao et al.

CVPR 2024 · arXiv:2402.16846
75
citations
#27

Graph Neural Prompting with Large Language Models

Yijun Tian, Huan Song, Zichen Wang et al.

AAAI 2024 · arXiv:2309.15427
graph neural networks, knowledge graphs, large language models, retrieval-augmented generation (+4 more)
74
citations
#28

Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Large Language Models

Wenbin Wang, Liang Ding, Minyan Zeng et al.

AAAI 2025 · arXiv:2408.15556
73
citations
#29

LLaFS: When Large Language Models Meet Few-Shot Segmentation

Lanyun Zhu, Tianrun Chen, Deyi Ji et al.

CVPR 2024 · arXiv:2311.16926
73
citations
#30

Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation

Shanshan Zhong, Zhongzhan Huang, Shanghua Gao et al.

CVPR 2024 · arXiv:2312.02439
65
citations
#31

Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model

Zhen Ye, Peiwen Sun, Jiahe Lei et al.

AAAI 2025 · arXiv:2408.17175
65
citations
#32

ELLA-V: Stable Neural Codec Language Modeling with Alignment-Guided Sequence Reordering

Yakun Song, Zhuo Chen, Xiaofei Wang et al.

AAAI 2025 · arXiv:2401.07333
64
citations
#33

GPT4Point: A Unified Framework for Point-Language Understanding and Generation

Zhangyang Qi, Ye Fang, Zeyi Sun et al.

CVPR 2024 · arXiv:2312.02980
62
citations
#34

Driving Everywhere with Large Language Model Policy Adaptation

Boyi Li, Yue Wang, Jiageng Mao et al.

CVPR 2024 · arXiv:2402.05932
59
citations
#35

Editing Language Model-based Knowledge Graph Embeddings

AAAI 2024 · arXiv:2305.14908
language model editing, hallucination mitigation, prompt-based editing, unsupervised training (+3 more)
57
citations
#36

Language Model Inversion

John X. Morris, Wenting Zhao, Justin Chiu et al.

ICLR 2024 · arXiv:2311.13647
57
citations
#37

SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation

Yi-Chia Chen, WeiHua Li, Cheng Sun et al.

ECCV 2024 · arXiv:2409.10542
57
citations
#38

Can Language Beat Numerical Regression? Language-Based Multimodal Trajectory Prediction

Inhwan Bae, Junoh Lee, Hae-Gon Jeon

CVPR 2024 · arXiv:2403.18447
54
citations
#39

Fit and Prune: Fast and Training-free Visual Token Pruning for Multi-modal Large Language Models

Weihao Ye, Qiong Wu, Wenhao Lin et al.

AAAI 2025 · arXiv:2409.10197
52
citations
#40

Jack of All Tasks Master of Many: Designing General-Purpose Coarse-to-Fine Vision-Language Model

Shraman Pramanick, Guangxing Han, Rui Hou et al.

CVPR 2024 · arXiv:2312.12423
50
citations
#41

Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems

Tian Ye, Zicheng Xu, Yuanzhi Li et al.

ICLR 2025
48
citations
#42

Calibrating Large Language Models with Sample Consistency

Qing Lyu, Kumar Shridhar, Chaitanya Malaviya et al.

AAAI 2025 · arXiv:2402.13904
48
citations
#43

Language Model Can Listen While Speaking

Ziyang Ma, Yakun Song, Chenpeng Du et al.

AAAI 2025 · arXiv:2408.02622
47
citations
#44

ALLaM: Large Language Models for Arabic and English

M Saiful Bari, Yazeed Alnumay, Norah Alzahrani et al.

ICLR 2025 · arXiv:2407.15390
47
citations
#45

Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning

Andy (DiJia) Su, Hanlin Zhu, Yingchen Xu et al.

ICML 2025 · arXiv:2502.03275
45
citations
#46

Gaining Wisdom from Setbacks: Aligning Large Language Models via Mistake Analysis

Kai Chen, Chunwei Wang, Kuo Yang et al.

ICLR 2024 · arXiv:2310.10477
44
citations
#47

A Vision Check-up for Language Models

Pratyusha Sharma, Tamar Rott Shaham, Manel Baradad et al.

CVPR 2024 · arXiv:2401.01862
40
citations
#48

The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities

Zhaofeng Wu, Xinyan Yu, Dani Yogatama et al.

ICLR 2025
39
citations
#49

Scaling Language-Free Visual Representation Learning

David Fan, Shengbang Tong, Jiachen Zhu et al.

ICCV 2025 · arXiv:2504.01017
visual self-supervised learning, contrastive language-image pretraining, multimodal representation learning, vision encoders (+2 more)
39
citations
#50

Multi-Objective Evolution of Heuristic Using Large Language Model

Shunyu Yao, Fei Liu, Xi Lin et al.

AAAI 2025 · arXiv:2409.16867
34
citations
#51

Which Attention Heads Matter for In-Context Learning?

Kayo Yin, Jacob Steinhardt

ICML 2025 · arXiv:2502.14010
34
citations
#52

Selective Forgetting: Advancing Machine Unlearning Techniques and Evaluation in Language Models

Lingzhi Wang, Xingshan Zeng, Jinsong Guo et al.

AAAI 2025 · arXiv:2402.05813
33
citations
#53

LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models

Shenghao Fu, Qize Yang, Qijie Mo et al.

CVPR 2025 · arXiv:2501.18954
33
citations
#54

Modeling Future Conversation Turns to Teach LLMs to Ask Clarifying Questions

Michael Zhang, W. Bradley Knox, Eunsol Choi

ICLR 2025 · arXiv:2410.13788
clarifying question generation, ambiguous user requests, preference labeling methods, future conversation modeling (+4 more)
32
citations
#55

Emergence of a High-Dimensional Abstraction Phase in Language Transformers

Emily Cheng, Diego Doimo, Corentin Kervadec et al.

ICLR 2025 · arXiv:2405.15471
32
citations
#56

From Tokens to Words: On the Inner Lexicon of LLMs

Guy Kaplan, Matanel Oren, Yuval Reif et al.

ICLR 2025
30
citations
#57

CoMT: A Novel Benchmark for Chain of Multi-modal Thought on Large Vision-Language Models

Zihui Cheng, Qiguang Chen, Jin Zhang et al.

AAAI 2025 · arXiv:2412.12932
30
citations
#58

Evolutionary Large Language Model for Automated Feature Transformation

Nanxu Gong, Chandan K Reddy, Wangyang Ying et al.

AAAI 2025 · arXiv:2405.16203
30
citations
#59

Unveiling the Impact of Coding Data Instruction Fine-Tuning on Large Language Models Reasoning

Xinlu Zhang, Zhiyu Zoey Chen, Xi Ye et al.

AAAI 2025 · arXiv:2405.20535
30
citations
#60

What Makes Large Language Models Reason in (Multi-Turn) Code Generation?

Kunhao Zheng, Juliette Decugis, Jonas Gehring et al.

ICLR 2025 · arXiv:2410.08105
30
citations
#61

Towards Language-Driven Video Inpainting via Multimodal Large Language Models

Jianzong Wu, Xiangtai Li, Chenyang Si et al.

CVPR 2024 · arXiv:2401.10226
30
citations
#62

Universal Segmentation at Arbitrary Granularity with Language Instruction

Yong Liu, Cairong Zhang, Yitong Wang et al.

CVPR 2024 · arXiv:2312.01623
30
citations
#63

PolyVoice: Language Models for Speech to Speech Translation

Qianqian Dong, Zhiying Huang, Qiao Tian et al.

ICLR 2024 · arXiv:2306.02982
29
citations
#64

Understanding In-Context Learning from Repetitions

Jianhao (Elliott) Yan, Jin Xu, Chiyu Song et al.

ICLR 2024 · arXiv:2310.00297
29
citations
#65

Longhorn: State Space Models are Amortized Online Learners

Bo Liu, Rui Wang, Lemeng Wu et al.

ICLR 2025 · arXiv:2407.14207
29
citations
#66

Unified Language-driven Zero-shot Domain Adaptation

Senqiao Yang, Zhuotao Tian, Li Jiang et al.

CVPR 2024 · arXiv:2404.07155
29
citations
#67

A Comprehensive Overhaul of Multimodal Assistant with Small Language Models

Minjie Zhu, Yichen Zhu, Ning Liu et al.

AAAI 2025 · arXiv:2403.06199
27
citations
#68

Distilling Multi-modal Large Language Models for Autonomous Driving

Deepti Hegde, Rajeev Yasarla, Hong Cai et al.

CVPR 2025 · arXiv:2501.09757
autonomous driving, motion planning, multi-modal LLMs, vision-based planner (+4 more)
27
citations
#69

GenSE: Generative Speech Enhancement via Language Models using Hierarchical Modeling

Jixun Yao, Hexin Liu, Chen Chen et al.

ICLR 2025 · arXiv:2502.02942
27
citations
#70

Open Models, Closed Minds? On Agents Capabilities in Mimicking Human Personalities through Open Large Language Models

Lucio La Cava, Andrea Tagarelli

AAAI 2025 · arXiv:2401.07115
25
citations
#71

RouteLLM: Learning to Route LLMs from Preference Data

Isaac Ong, Amjad Almahairi, Vincent Wu et al.

ICLR 2025
24
citations
#72

Min-K%++: Improved Baseline for Pre-Training Data Detection from Large Language Models

Jingyang Zhang, Jingwei Sun, Eric Yeats et al.

ICLR 2025
pre-training data detection, large language models, maximum likelihood training, conditional categorical distribution (+4 more)
24
citations
#73

Context-Aware Meta-Learning

Christopher Fifty, Dennis Duan, Ronald Junkins et al.

ICLR 2024 · arXiv:2310.10971
24
citations
#74

Language Representations Can be What Recommenders Need: Findings and Potentials

Leheng Sheng, An Zhang, Yi Zhang et al.

ICLR 2025 · arXiv:2407.05441
23
citations
#75

CAD-Llama: Leveraging Large Language Models for Computer-Aided Design Parametric 3D Model Generation

Jiahao Li, Weijian Ma, Xueyang Li et al.

CVPR 2025
23
citations
#76

LLaVA-KD: A Framework of Distilling Multimodal Large Language Models

Yuxuan Cai, Jiangning Zhang, Haoyang He et al.

ICCV 2025 · arXiv:2410.16236
knowledge distillation, multimodal large language models, vision-language understanding, model compression (+4 more)
22
citations
#77

Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon

USVSN Sai Prashanth, Alvin Deng, Kyle O'Brien et al.

ICLR 2025 · arXiv:2406.17746
22
citations
#78

Monitoring Latent World States in Language Models with Propositional Probes

Jiahai Feng, Stuart Russell, Jacob Steinhardt

ICLR 2025 · arXiv:2406.19501
21
citations
#79

SyllableLM: Learning Coarse Semantic Units for Speech Language Models

Alan Baade, Puyuan Peng, David Harwath

ICLR 2025 · arXiv:2410.04029
speech language models, semantic tokenization, syllable segmentation, representation distillation (+3 more)
21
citations
#80

Heuristic-Induced Multimodal Risk Distribution Jailbreak Attack for Multimodal Large Language Models

Ma Teng, Xiaojun Jia, Ranjie Duan et al.

ICCV 2025 · arXiv:2412.05934
jailbreak attacks, multimodal large language models, adversarial capabilities, multimodal risk distribution (+3 more)
21
citations
#81

BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices

Xudong Lu, Yinghao Chen, Chencheng Chen et al.

CVPR 2025
20
citations
#82

Enhancing Chain of Thought Prompting in Large Language Models via Reasoning Patterns

Yufeng Zhang, Xuepeng Wang, Lingxiang Wu et al.

AAAI 2025 · arXiv:2404.14812
20
citations
#83

Customizing Language Model Responses with Contrastive In-Context Learning

Xiang Gao, Kamalika Das

AAAI 2024 · arXiv:2401.17390
contrastive learning, language model alignment, in-context learning, intent customization (+4 more)
19
citations
#84

How Do Large Language Models Understand Graph Patterns? A Benchmark for Graph Pattern Comprehension

Xinnan Dai, Haohao Qu, Yifei Shen et al.

ICLR 2025 · arXiv:2410.05298
graph pattern comprehension, large language models, graph pattern mining, computational chemistry (+4 more)
19
citations
#85

Pruning Large Language Models with Semi-Structural Adaptive Sparse Training

Weiyu Huang, Yuezhou Hu, Guohao Jian et al.

AAAI 2025 · arXiv:2407.20584
19
citations
#86

Non-myopic Generation of Language Models for Reasoning and Planning

Chang Ma, Haiteng Zhao, Junlei Zhang et al.

ICLR 2025
18
citations
#87

What Kind of Visual Tokens Do We Need? Training-Free Visual Token Pruning for Multi-Modal Large Language Models from the Perspective of Graph

Yutao Jiang, Qiong Wu, Wenhao Lin et al.

AAAI 2025 · arXiv:2501.02268
18
citations
#88

Cloud-Device Collaborative Learning for Multimodal Large Language Models

Guanqun Wang, Jiaming Liu, Chenxuan Li et al.

CVPR 2024 · arXiv:2312.16279
18
citations
#89

Idiosyncrasies in Large Language Models

Mingjie Sun, Yida Yin, Zhiqiu (Oscar) Xu et al.

ICML 2025 · arXiv:2502.12150
17
citations
#90

Design Principle Transfer in Neural Architecture Search via Large Language Models

Xun Zhou, Xingyu Wu, Liang Feng et al.

AAAI 2025 · arXiv:2408.11330
17
citations
#91

Towards Universality: Studying Mechanistic Similarity Across Language Model Architectures

Junxuan Wang, Xuyang Ge, Wentao Shu et al.

ICLR 2025
17
citations
#92

Controlling Large Language Models Through Concept Activation Vectors

Hanyu Zhang, Xiting Wang, Chengao Li et al.

AAAI 2025 · arXiv:2501.05764
16
citations
#93

Breaking Physical and Linguistic Borders: Multilingual Federated Prompt Tuning for Low-Resource Languages

Wanru Zhao, Yihong Chen, Royson Lee et al.

ICLR 2024 · arXiv:2507.03003
16
citations
#94

LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model

Dongkai Wang, Shiyu Xuan, Shiliang Zhang

CVPR 2024 · arXiv:2406.04659
16
citations
#95

Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective

Ming Zhong, Chenxin An, Weizhu Chen et al.

ICLR 2024 · arXiv:2310.11451
16
citations
#96

Scalable Influence and Fact Tracing for Large Language Model Pretraining

Tyler Chang, Dheeraj Rajagopal, Tolga Bolukbasi et al.

ICLR 2025
16
citations
#97

Training Neural Networks as Recognizers of Formal Languages

Alexandra Butoi, Ghazal Khalighinejad, Anej Svete et al.

ICLR 2025
16
citations
#98

Logically Consistent Language Models via Neuro-Symbolic Integration

Diego Calanzone, Stefano Teso, Antonio Vergari

ICLR 2025
15
citations
#99

VLM4D: Towards Spatiotemporal Awareness in Vision Language Models

Shijie Zhou, Alexander Vilesov, Xuehai He et al.

ICCV 2025 · arXiv:2508.02095
15
citations
#100

FOLDER: Accelerating Multi-Modal Large Language Models with Enhanced Performance

Haicheng Wang, Zhemeng Yu, Gabriele Spadaro et al.

ICCV 2025
15
citations