"large language models" Papers

634 papers found • Page 2 of 13

C3PO: Optimized Large Language Model Cascades with Probabilistic Cost Constraints for Reasoning

Antonios Valkanas, Soumyasundar Pal, Pavel Rumiantsev et al.

NeurIPS 2025 • poster • arXiv:2511.07396

Calibrating Translation Decoding with Quality Estimation on LLMs

Di Wu, Yibin Lei, Christof Monz

NeurIPS 2025 • poster • arXiv:2504.19044

CAMEx: Curvature-aware Merging of Experts

Dung Viet Nguyen, Minh Nguyen, Luc Nguyen et al.

ICLR 2025 • poster • arXiv:2502.18821
6 citations

Can Large Language Models Help Multimodal Language Analysis? MMLA: A Comprehensive Benchmark

Hanlei Zhang, Zhuohang Li, Hua Xu et al.

NeurIPS 2025 • poster • arXiv:2504.16427
2 citations

Can LLMs Outshine Conventional Recommenders? A Comparative Evaluation

Qijiong Liu, Jieming Zhu, Lu Fan et al.

NeurIPS 2025 • poster • arXiv:2503.05493
4 citations

Can LLMs Reason Over Non-Text Modalities in a Training-Free Manner? A Case Study with In-Context Representation Learning

Tianle Zhang, Wanlong Fang, Jonathan Woo et al.

NeurIPS 2025 • poster • arXiv:2509.17552
1 citation

Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?

Egor Zverev, Sahar Abdelnabi, Soroush Tabesh et al.

ICLR 2025 • poster • arXiv:2403.06833
45 citations

Can LLMs Solve Longer Math Word Problems Better?

Xin Xu, Tong Xiao, Zitong Chao et al.

ICLR 2025 • poster • arXiv:2405.14804
25 citations

Can LLMs Understand Time Series Anomalies?

Zihao Zhou, Rose Yu

ICLR 2025 • poster • arXiv:2410.05440
32 citations

Can We Infer Confidential Properties of Training Data from LLMs?

Pengrun Huang, Chhavi Yadav, Kamalika Chaudhuri et al.

NeurIPS 2025 • spotlight • arXiv:2506.10364
3 citations

Capturing Polysemanticity with PRISM: A Multi-Concept Feature Description Framework

Laura Kopf, Nils Feldhus, Kirill Bykov et al.

NeurIPS 2025 • poster • arXiv:2506.15538
4 citations

Catastrophic Failure of LLM Unlearning via Quantization

Zhiwei Zhang, Fali Wang, Xiaomin Li et al.

ICLR 2025 • poster • arXiv:2410.16454
43 citations

Causally Motivated Sycophancy Mitigation for Large Language Models

Haoxi Li, Xueyang Tang, Jie Zhang et al.

ICLR 2025 • poster
8 citations

Causal Reasoning and Large Language Models: Opening a New Frontier for Causality

Chenhao Tan, Robert Ness, Amit Sharma et al.

ICLR 2025 • poster • arXiv:2305.00050
390 citations

CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models

Song Wang, Peng Wang, Tong Zhou et al.

ICLR 2025 • poster • arXiv:2407.02408
13 citations

CellVerse: Do Large Language Models Really Understand Cell Biology?

Fan Zhang, Tianyu Liu, Zhihong Zhu et al.

NeurIPS 2025 • poster • arXiv:2505.07865
4 citations

Certifying Counterfactual Bias in LLMs

Isha Chaudhary, Qian Hu, Manoj Kumar et al.

ICLR 2025 • poster • arXiv:2405.18780
4 citations

CHASE-SQL: Multi-Path Reasoning and Preference Optimized Candidate Selection in Text-to-SQL

Mohammadreza Pourreza, Hailong Li, Ruoxi Sun et al.

ICLR 2025 • poster • arXiv:2410.01943
116 citations

ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities

Peng Xu, Wei Ping, Xianchao Wu et al.

ICLR 2025 • poster • arXiv:2407.14482

ChemAgent: Self-updating Memories in Large Language Models Improves Chemical Reasoning

Xiangru Tang, Tianyu Hu, Muyang Ye et al.

ICLR 2025 • poster
4 citations

CIDD: Collaborative Intelligence for Structure-Based Drug Design Empowered by LLMs

Bowen Gao, Yanwen Huang, Yiqiao Liu et al.

NeurIPS 2025 • poster

Classical Planning with LLM-Generated Heuristics: Challenging the State of the Art with Python Code

Augusto B. Corrêa, André G. Pereira, Jendrik Seipp

NeurIPS 2025 • poster • arXiv:2503.18809
13 citations

CLAWS: Creativity detection for LLM-generated solutions using Attention Window of Sections

Keuntae Kim, Eunhye Jeong, Sehyeon Lee et al.

NeurIPS 2025 • poster

ClimaQA: An Automated Evaluation Framework for Climate Question Answering Models

Veeramakali Vignesh Manivannan, Yasaman Jafari, Srikar Eranky et al.

ICLR 2025 • poster • arXiv:2410.16701
3 citations

ClinBench: A Standardized Multi-Domain Framework for Evaluating Large Language Models in Clinical Information Extraction

Ismael Villanueva Miranda, Zifan Gu, Donghan Yang et al.

NeurIPS 2025 • poster

CofCA: A STEP-WISE Counterfactual Multi-hop QA benchmark

Jian Wu, Linyi Yang, Zhen Wang et al.

ICLR 2025 • poster • arXiv:2402.11924
14 citations

Competing Large Language Models in Multi-Agent Gaming Environments

Jen-Tse Huang, Eric John Li, Man Ho Lam et al.

ICLR 2025 • poster

Computation and Memory-Efficient Model Compression with Gradient Reweighting

Zhiwei Li, Yuesen Liao, Binrui Wu et al.

NeurIPS 2025 • poster

Concept-Guided Interpretability via Neural Chunking

Shuchen Wu, Stephan Alaniz, Shyamgopal Karthik et al.

NeurIPS 2025 • poster • arXiv:2505.11576

Concept Incongruence: An Exploration of Time and Death in Role Playing

Xiaoyan Bai, Ike Peng, Aditya Singh et al.

NeurIPS 2025 • oral • arXiv:2505.14905
1 citation

Conditional Representation Learning for Customized Tasks

Honglin Liu, Chao Sun, Peng Hu et al.

NeurIPS 2025 • spotlight • arXiv:2510.04564

Confidence Elicitation: A New Attack Vector for Large Language Models

Brian Formento, Chuan Sheng Foo, See-Kiong Ng

ICLR 2025 • poster • arXiv:2502.04643
2 citations

ConfTuner: Training Large Language Models to Express Their Confidence Verbally

Yibo Li, Miao Xiong, Jiaying Wu et al.

NeurIPS 2025 • poster • arXiv:2508.18847
10 citations

Context-Parametric Inversion: Why Instruction Finetuning May Not Actually Improve Context Reliance

Sachin Goyal, Christina Baek, Zico Kolter et al.

ICLR 2025 • poster
9 citations

ConTextTab: A Semantics-Aware Tabular In-Context Learner

Marco Spinaci, Marek Polewczyk, Maximilian Schambach et al.

NeurIPS 2025 • spotlight • arXiv:2506.10707
7 citations

Contextualizing biological perturbation experiments through language

Menghua (Rachel) Wu, Russell Littman, Jacob Levine et al.

ICLR 2025 • poster • arXiv:2502.21290
3 citations

CoP: Agentic Red-teaming for Large Language Models using Composition of Principles

Chen Xiong, Pin-Yu Chen, Tsung-Yi Ho

NeurIPS 2025 • poster • arXiv:2506.00781
3 citations

Cost-aware LLM-based Online Dataset Annotation

Eray Can Elumar, Cem Tekin, Osman Yagan

NeurIPS 2025 • spotlight • arXiv:2505.15101
1 citation

Creativity or Brute Force? Using Brainteasers as a Window into the Problem-Solving Abilities of Large Language Models

Sophia Han, Howard Dai, Stephen Xia et al.

NeurIPS 2025 • poster • arXiv:2505.10844
1 citation

CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery

Xiaoshuai Song, Muxi Diao, Guanting Dong et al.

ICLR 2025 • poster • arXiv:2406.08587
27 citations

CXPMRG-Bench: Pre-training and Benchmarking for X-ray Medical Report Generation on CheXpert Plus Dataset

Xiao Wang, Fuling Wang, Yuehang Li et al.

CVPR 2025 • poster • arXiv:2410.00379
16 citations

DanmakuTPPBench: A Multi-modal Benchmark for Temporal Point Process Modeling and Understanding

Yue Jiang, Jichu Li, Yang Liu et al.

NeurIPS 2025 • oral • arXiv:2505.18411
3 citations

DataGen: Unified Synthetic Dataset Generation via Large Language Models

Yue Huang, Siyuan Wu, Chujie Gao et al.

ICLR 2025 • poster • arXiv:2406.18966
21 citations

DataMan: Data Manager for Pre-training Large Language Models

Ru Peng, Kexin Yang, Yawen Zeng et al.

ICLR 2025 • poster • arXiv:2502.19363
8 citations

DataSIR: A Benchmark Dataset for Sensitive Information Recognition

Fan Mo, Bo Liu, Yuan Fan et al.

NeurIPS 2025 • poster

DenseGrounding: Improving Dense Language-Vision Semantics for Ego-centric 3D Visual Grounding

Henry Zheng, Hao Shi, Qihang Peng et al.

ICLR 2025 • poster • arXiv:2505.04965
8 citations

Detoxifying Large Language Models via Autoregressive Reward Guided Representation Editing

Yisong Xiao, Aishan Liu, Siyuan Liang et al.

NeurIPS 2025 • poster • arXiv:2510.01243
2 citations

DexHandDiff: Interaction-aware Diffusion Planning for Adaptive Dexterous Manipulation

Zhixuan Liang, Yao Mu, Yixiao Wang et al.

CVPR 2025 • poster • arXiv:2411.18562
12 citations

DEXTER: Diffusion-Guided EXplanations with TExtual Reasoning for Vision Models

Simone Carnemolla, Matteo Pennisi, Sarinda Samarasinghe et al.

NeurIPS 2025 • spotlight • arXiv:2510.14741

Differentially Private Federated Low Rank Adaptation Beyond Fixed-Matrix

Ming Wen, Jiaqi Zhu, Yuedong Xu et al.

NeurIPS 2025 • poster • arXiv:2507.09990