"large language models" Papers

986 papers found • Page 15 of 20

Weak to Strong Generalization for Large Language Models with Multi-capabilities

Yucheng Zhou, Jianbing Shen, Yu Cheng

ICLR 2025 • 70 citations

Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation

Hyungjoo Chae, Namyoung Kim, Kai Ong et al.

ICLR 2025 • arXiv:2410.13232 • 65 citations

What Happens During the Loss Plateau? Understanding Abrupt Learning in Transformers

Pulkit Gopalani, Wei Hu

NEURIPS 2025 • arXiv:2506.13688 • 2 citations

What Makes Large Language Models Reason in (Multi-Turn) Code Generation?

Kunhao Zheng, Juliette Decugis, Jonas Gehring et al.

ICLR 2025 • arXiv:2410.08105 • 34 citations

What's Producible May Not Be Reachable: Measuring the Steerability of Generative Models

Keyon Vafa, Sarah Bentley, Jon Kleinberg et al.

NEURIPS 2025 • arXiv:2503.17482 • 2 citations

When Can Model-Free Reinforcement Learning be Enough for Thinking?

Josiah Hanna, Nicholas Corrado

NEURIPS 2025 • arXiv:2506.17124 • 1 citation

Where, What, Why: Towards Explainable Driver Attention Prediction

Yuchen Zhou, Jiayu Tang, Xiaoyan Xiao et al.

ICCV 2025 • highlight • arXiv:2506.23088 • 6 citations

Why Does the Effective Context Length of LLMs Fall Short?

Chenxin An, Jun Zhang, Ming Zhong et al.

ICLR 2025 • arXiv:2410.18745 • 42 citations

Why Knowledge Distillation Works in Generative Models: A Minimal Working Explanation

Sungmin Cha, Kyunghyun Cho

NEURIPS 2025 • arXiv:2505.13111 • 4 citations

Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search

Yuichi Inoue, Kou Misaki, Yuki Imajuku et al.

NEURIPS 2025 • spotlight • arXiv:2503.04412 • 24 citations

WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild

Bill Yuchen Lin, Yuntian Deng, Khyathi Chandu et al.

ICLR 2025 • arXiv:2406.04770 • 151 citations

WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct

Haipeng Luo, Qingfeng Sun, Can Xu et al.

ICLR 2025 • arXiv:2308.09583 • 655 citations

WorldAPIs: The World Is Worth How Many APIs? A Thought Experiment

Jiefu Ou, Arda Uzunoğlu, Benjamin Van Durme et al.

AAAI 2025 • paper • arXiv:2407.07778 • 3 citations

WritingBench: A Comprehensive Benchmark for Generative Writing

Yuning Wu, Jiahao Mei, Ming Yan et al.

NEURIPS 2025 • arXiv:2503.05244 • 46 citations

xFinder: Large Language Models as Automated Evaluators for Reliable Evaluation

Qingchen Yu, Zifan Zheng, Shichao Song et al.

ICLR 2025 • arXiv:2405.11874 • 15 citations

Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition with LLMs by Learning Language-Agnostic Speech Representations

Jeong Hun Yeo, Minsu Kim, Chae Won Kim et al.

ICCV 2025 • arXiv:2503.06273 • 5 citations

Zero-shot Benchmarking: A Framework for Flexible and Scalable Automatic Evaluation of Language Models

José Pombal, Nuno M Guerreiro, Ricardo Rei et al.

COLM 2025 • paper • arXiv:2504.01001 • 8 citations

Zero-shot Model-based Reinforcement Learning using Large Language Models

Abdelhakim Benechehab, Youssef Attia El Hili, Ambroise Odonnat et al.

ICLR 2025 • arXiv:2410.11711 • 5 citations

$S^2$IP-LLM: Semantic Space Informed Prompt Learning with LLM for Time Series Forecasting

Zijie Pan, Yushan Jiang, Sahil Garg et al.

ICML 2024 • oral • arXiv:2403.05798 • 17 citations

Accelerated Speculative Sampling Based on Tree Monte Carlo

Zhengmian Hu, Heng Huang

ICML 2024

Accurate LoRA-Finetuning Quantization of LLMs via Information Retention

Haotong Qin, Xudong Ma, Xingyu Zheng et al.

ICML 2024 • arXiv:2402.05445 • 74 citations

A Closer Look at the Limitations of Instruction Tuning

Sreyan Ghosh, Chandra Kiran Evuru, Sonal Kumar et al.

ICML 2024 • arXiv:2402.05119 • 83 citations

A Comprehensive Analysis of the Effectiveness of Large Language Models as Automatic Dialogue Evaluators

Chen Zhang, L. F. D’Haro, Yiming Chen et al.

AAAI 2024 • paper • arXiv:2312.15407 • 49 citations

Active Preference Learning for Large Language Models

William Muldrew, Peter Hayes, Mingtian Zhang et al.

ICML 2024 • arXiv:2402.08114 • 46 citations

Adaptive Text Watermark for Large Language Models

Yepeng Liu, Yuheng Bu

ICML 2024 • arXiv:2401.13927 • 55 citations

Advancing Spatial Reasoning in Large Language Models: An In-Depth Evaluation and Enhancement Using the StepGame Benchmark

Fangjun Li, David C. Hogg, Anthony G. Cohn

AAAI 2024 • paper • arXiv:2401.03991 • 53 citations

Agent Instructs Large Language Models to be General Zero-Shot Reasoners

Nicholas Crispino, Kyle Montgomery, Fankun Zeng et al.

ICML 2024 • arXiv:2310.03710 • 40 citations

Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models

Bilgehan Sel, Ahmad Al-Tawaha, Vanshaj Khattar et al.

ICML 2024 • arXiv:2308.10379 • 99 citations

AlphaZero-Like Tree-Search can Guide Large Language Model Decoding and Training

Ziyu Wan, Xidong Feng, Muning Wen et al.

ICML 2024 • arXiv:2309.17179 • 304 citations

A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules?

Agustinus Kristiadi, Felix Strieth-Kalthoff, Marta Skreta et al.

ICML 2024 • arXiv:2402.05015 • 44 citations

Assessing Large Language Models on Climate Information

Jannis Bulian, Mike Schäfer, Afra Amini et al.

ICML 2024 • arXiv:2310.02932 • 34 citations

Asynchronous Large Language Model Enhanced Planner for Autonomous Driving

Yuan Chen, Zi-han Ding, Ziqin Wang et al.

ECCV 2024 • arXiv:2406.14556 • 35 citations

A Tale of Tails: Model Collapse as a Change of Scaling Laws

Elvis Dohmatob, Yunzhen Feng, Pu Yang et al.

ICML 2024 • arXiv:2402.07043 • 110 citations

Autoformalizing Euclidean Geometry

Logan Murphy, Kaiyu Yang, Jialiang Sun et al.

ICML 2024 • arXiv:2405.17216 • 14 citations

AutoOS: Make Your OS More Powerful by Exploiting Large Language Models

Huilai Chen, Yuanbo Wen, Limin Cheng et al.

ICML 2024

BAT: Learning to Reason about Spatial Sounds with Large Language Models

Zhisheng Zheng, Puyuan Peng, Ziyang Ma et al.

ICML 2024 • arXiv:2402.01591 • 40 citations

Benchmarking Large Language Models in Retrieval-Augmented Generation

Jiawei Chen, Hongyu Lin, Xianpei Han et al.

AAAI 2024 • paper • arXiv:2309.01431 • 475 citations

BetterV: Controlled Verilog Generation with Discriminative Guidance

Zehua Pei, Huiling Zhen, Mingxuan Yuan et al.

ICML 2024 • arXiv:2402.03375 • 141 citations

BiE: Bi-Exponent Block Floating-Point for Large Language Models Quantization

Lancheng Zou, Wenqian Zhao, Shuo Yin et al.

ICML 2024

BiLLM: Pushing the Limit of Post-Training Quantization for LLMs

Wei Huang, Yangdong Liu, Haotong Qin et al.

ICML 2024 • arXiv:2402.04291 • 142 citations

Bootstrapping Variational Information Pursuit with Large Language and Vision Models for Interpretable Image Classification

Aditya Chattopadhyay, Kwan Ho Ryan Chan, Rene Vidal

ICLR 2024

Can AI Assistants Know What They Don't Know?

Qinyuan Cheng, Tianxiang Sun, Xiangyang Liu et al.

ICML 2024 • arXiv:2401.13275 • 43 citations

Case-Based or Rule-Based: How Do Transformers Do the Math?

Yi Hu, Xiaojuan Tang, Haotong Yang et al.

ICML 2024 • arXiv:2402.17709 • 32 citations

Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension

Fan Yin, Jayanth Srinivasa, Kai-Wei Chang

ICML 2024 • arXiv:2402.18048 • 40 citations

CHEMREASONER: Heuristic Search over a Large Language Model’s Knowledge Space using Quantum-Chemical Feedback

Henry W. Sprueill, Carl Edwards, Khushbu Agarwal et al.

ICML 2024 • arXiv:2402.10980 • 19 citations

Coactive Learning for Large Language Models using Implicit User Feedback

Aaron D. Tucker, Kianté Brantley, Adam Cahall et al.

ICML 2024

Code-Style In-Context Learning for Knowledge-Based Question Answering

Zhijie Nie, Richong Zhang, Zhongyuan Wang et al.

AAAI 2024 • paper • arXiv:2309.04695 • 19 citations

CogBench: a large language model walks into a psychology lab

Julian Coda-Forno, Marcel Binz, Jane Wang et al.

ICML 2024 • oral • arXiv:2402.18225 • 57 citations

CoMo: Controllable Motion Generation through Language Guided Pose Code Editing

Yiming Huang, Weilin Wan, Yue Yang et al.

ECCV 2024 • arXiv:2403.13900 • 50 citations

Compressing Large Language Models by Joint Sparsification and Quantization

Jinyang Guo, Jianyu Wu, Zining Wang et al.

ICML 2024