"large language models" Papers

986 papers found • Page 20 of 20

The Illusion of State in State-Space Models

William Merrill, Jackson Petty, Ashish Sabharwal

ICML 2024arXiv:2404.08819
128
citations

Thermometer: Towards Universal Calibration for Large Language Models

Maohao Shen, Subhro Das, Kristjan Greenewald et al.

ICML 2024arXiv:2403.08819
26
citations

tinyBenchmarks: evaluating LLMs with fewer examples

Felipe Maia Polo, Lucas Weber, Leshem Choshen et al.

ICML 2024arXiv:2402.14992
178
citations

tnGPS: Discovering Unknown Tensor Network Structure Search Algorithms via Large Language Models (LLMs)

Junhua Zeng, Chao Li, Zhun Sun et al.

ICML 2024arXiv:2402.02456
9
citations

To Cool or not to Cool? Temperature Network Meets Large Foundation Models via DRO

Zi-Hao Qiu, Siqi Guo, Mao Xu et al.

ICML 2024arXiv:2404.04575
10
citations

To Each (Textual Sequence) Its Own: Improving Memorized-Data Unlearning in Large Language Models

George-Octavian Bărbulescu, Peter Triantafillou

ICML 2024

Token-level Direct Preference Optimization

Yongcheng Zeng, Guoqing Liu, Weiyu Ma et al.

ICML 2024arXiv:2404.11999
120
citations

Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models

Mingjia Huo, Sai Ashish Somayajula, Youwei Liang et al.

ICML 2024arXiv:2402.18059
30
citations

Toward Adaptive Reasoning in Large Language Models with Thought Rollback

Sijia Chen, Baochun Li

ICML 2024arXiv:2412.19707
16
citations

Towards AutoAI: Optimizing a Machine Learning System with Black-box and Differentiable Components

Zhiliang Chen, Chuan-Sheng Foo, Bryan Kian Hsiang Low

ICML 2024

Towards Learning a Generalist Model for Embodied Navigation

Duo Zheng, Shijia Huang, Lin Zhao et al.

CVPR 2024highlightarXiv:2312.02010
118
citations

Trainable Transformer in Transformer

Abhishek Panigrahi, Sadhika Malladi, Mengzhou Xia et al.

ICML 2024arXiv:2307.01189
14
citations

Training-free Video Temporal Grounding using Large-scale Pre-trained Models

Minghang Zheng, Xinhao Cai, Qingchao Chen et al.

ECCV 2024arXiv:2408.16219
21
citations

Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning

Zhiheng Xi, Wenxiang Chen, Boyang Hong et al.

ICML 2024arXiv:2402.05808
58
citations

TravelPlanner: A Benchmark for Real-World Planning with Language Agents

Jian Xie, Kai Zhang, Jiangjie Chen et al.

ICML 2024spotlightarXiv:2402.01622
319
citations

Two-stage LLM Fine-tuning with Less Specialization and More Generalization

Yihan Wang, Si Si, Daliang Li et al.

ICLR 2024arXiv:2211.00635
43
citations

ULTRAFEEDBACK: Boosting Language Models with Scaled AI Feedback

Ganqu Cui, Lifan Yuan, Ning Ding et al.

ICML 2024arXiv:2310.01377
214
citations

UMIE: Unified Multimodal Information Extraction with Instruction Tuning

Lin Sun, Kai Zhang, Qingyuan Li et al.

AAAI 2024paperarXiv:2401.03082
29
citations

Understanding the Learning Dynamics of Alignment with Human Feedback

Shawn Im, Sharon Li

ICML 2024arXiv:2403.18742
18
citations

UniAudio: Towards Universal Audio Generation with Large Language Models

Dongchao Yang, Jinchuan Tian, Xu Tan et al.

ICML 2024

UniGen: A Unified Generative Framework for Retrieval and Question Answering with Large Language Models

Xiaoxi Li, Yujia Zhou, Zhicheng Dou

AAAI 2024paperarXiv:2312.11036
22
citations

Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration

Zhongzhi Yu, Zheng Wang, Yonggan Fu et al.

ICML 2024arXiv:2406.15765
47
citations

Unveiling the Potential of AI for Nanomaterial Morphology Prediction

Ivan Dubrovsky, Andrei Dmitrenko, Aleksey Dmitrenko et al.

ICML 2024arXiv:2406.02591
4
citations

Use Your INSTINCT: INSTruction optimization for LLMs usIng Neural bandits Coupled with Transformers

Xiaoqiang Lin, Zhaoxuan Wu, Zhongxiang Dai et al.

ICML 2024arXiv:2310.02905
26
citations

Vamos: Versatile Action Models for Video Understanding

Shijie Wang, Qi Zhao, Minh Quan et al.

ECCV 2024arXiv:2311.13627
36
citations

Variational Learning is Effective for Large Deep Networks

Yuesong Shen, Nico Daheim, Bai Cong et al.

ICML 2024spotlightarXiv:2402.17641
47
citations

Video Question Answering with Procedural Programs

Rohan Choudhury, Koichiro Niinuma, Kris Kitani et al.

ECCV 2024arXiv:2312.00937
37
citations

Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation

Yunhao Ge, Xiaohui Zeng, Jacob Huffman et al.

CVPR 2024arXiv:2404.19752
35
citations

Watermark Stealing in Large Language Models

Nikola Jovanović, Robin Staab, Martin Vechev

ICML 2024arXiv:2402.19361
75
citations

When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models

Haoran You, Yichao Fu, Zheng Wang et al.

ICML 2024arXiv:2406.07368
9
citations

Working Memory Capacity of ChatGPT: An Empirical Study

Dongyu Gong, Xingchen Wan, Dingmin Wang

AAAI 2024paperarXiv:2305.03731
27
citations

X-InstructBLIP: A Framework for Aligning Image, 3D, Audio, Video to LLMs and its Emergent Cross-modal Reasoning

Artemis Panagopoulou, Le Xue, Ning Yu et al.

ECCV 2024
6
citations

Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement

che liu, Zhongwei Wan, Cheng Ouyang et al.

ICML 2024arXiv:2403.06659
61
citations

Zero-shot Text-guided Infinite Image Synthesis with LLM guidance

Soyeong Kwon, TAEGYEONG LEE, Taehwan Kim

ECCV 2024arXiv:2407.12642
3
citations

Zhongjing: Enhancing the Chinese Medical Capabilities of Large Language Model through Expert Feedback and Real-World Multi-Turn Dialogue

Songhua Yang, Hanjie Zhao, Senbin Zhu et al.

AAAI 2024paperarXiv:2308.03549
210
citations

ZO-AdaMU Optimizer: Adapting Perturbation by the Momentum and Uncertainty in Zeroth-Order Optimization

Shuoran Jiang, Qingcai Chen, Yang Xiang et al.

AAAI 2024paperarXiv:2312.15184
21
citations