Most Cited ICML "headless attention model" Papers

5,975 papers found • Page 6 of 30

#1001

ITBench: Evaluating AI Agents across Diverse Real-World IT Automation Tasks

Saurabh Jha, Rohan Arora, Yuji Watanabe et al.

ICML 2025 • oral • arXiv:2502.05352
18
citations
#1002

Revealing Vision-Language Integration in the Brain with Multimodal Networks

Vighnesh Subramaniam, Colin Conwell, Christopher Wang et al.

ICML 2024 • arXiv:2406.14481
18
citations
#1003

Test-Time Learning for Large Language Models

Jinwu Hu, Zitian Zhang, Guohao Chen et al.

ICML 2025 • arXiv:2505.20633
18
citations
#1004

Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs

Jordan Dotzel, Yuzong Chen, Bahaa Kotb et al.

ICML 2024 • arXiv:2405.03103
18
citations
#1005

Plug-in Performative Optimization

Licong Lin, Tijana Zrnic

ICML 2024 • arXiv:2305.18728
18
citations
#1006

MENTOR: Mixture-of-Experts Network with Task-Oriented Perturbation for Visual Reinforcement Learning

Suning Huang, Zheyu Zhang, Tianhai Liang et al.

ICML 2025 • arXiv:2410.14972
18
citations
#1007

DataDecide: How to Predict Best Pretraining Data with Small Experiments

Ian Magnusson, Tai Nguyen, Ben Bogin et al.

ICML 2025 • arXiv:2504.11393
18
citations
#1008

DiffMS: Diffusion Generation of Molecules Conditioned on Mass Spectra

Montgomery Bohde, Mrunali Manjrekar, Runzhong Wang et al.

ICML 2025 • arXiv:2502.09571
18
citations
#1009

Understanding Forgetting in Continual Learning with Linear Regression

Meng Ding, Kaiyi Ji, Di Wang et al.

ICML 2024 • arXiv:2405.17583
18
citations
#1010

Structured Preconditioners in Adaptive Optimization: A Unified Analysis

Shuo Xie, Tianhao Wang, Sashank J. Reddi et al.

ICML 2025 • arXiv:2503.10537
18
citations
#1011

Whoever Started the Interference Should End It: Guiding Data-Free Model Merging via Task Vectors

Runxi Cheng, Feng Xiong, Yongxian Wei et al.

ICML 2025 • arXiv:2503.08099
18
citations
#1012

On the Expressive Power of Spectral Invariant Graph Neural Networks

Bohang Zhang, Lingxiao Zhao, Haggai Maron

ICML 2024 • arXiv:2406.04336
18
citations
#1013

Verification of Machine Unlearning is Fragile

Binchi Zhang, Zihan Chen, Cong Shen et al.

ICML 2024 • arXiv:2408.00929
18
citations
#1014

video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model

Guangzhi Sun, Yudong Yang, Jimin Zhuang et al.

ICML 2025 • arXiv:2502.11775
18
citations
#1015

Position: Categorical Deep Learning is an Algebraic Theory of All Architectures

Bruno Gavranović, Paul Lessard, Andrew Dudzik et al.

ICML 2024 • arXiv:2402.15332
18
citations
#1016

$f$-Divergence Based Classification: Beyond the Use of Cross-Entropy

Nicola Novello, Andrea Tonello

ICML 2024 • arXiv:2401.01268
18
citations
#1017

Wasserstein Flow Matching: Generative Modeling Over Families of Distributions

Doron Haviv, Aram-Alexandre Pooladian, Dana Pe'er et al.

ICML 2025 • arXiv:2411.00698
18
citations
#1018

The Privacy Power of Correlated Noise in Decentralized Learning

Youssef Allouah, Anastasiia Koloskova, Aymane Firdoussi et al.

ICML 2024 • arXiv:2405.01031
18
citations
#1019

On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control

Amrit Singh Bedi, Anjaly Parayil, Junyu Zhang et al.

ICML 2024 • arXiv:2106.08414
18
citations
#1020

Learnings from Scaling Visual Tokenizers for Reconstruction and Generation

Philippe Hansen-Estruch, David Yan, Ching-Yao Chuang et al.

ICML 2025 • arXiv:2501.09755
18
citations
#1021

SelMatch: Effectively Scaling Up Dataset Distillation via Selection-Based Initialization and Partial Updates by Trajectory Matching

Yongmin Lee, Hye Won Chung

ICML 2024 • arXiv:2406.18561
18
citations
#1022

UniCorn: A Unified Contrastive Learning Approach for Multi-view Molecular Representation Learning

Shikun Feng, Yuyan Ni, Li et al.

ICML 2024 • arXiv:2405.10343
18
citations
#1023

Optimizing Watermarks for Large Language Models

Bram Wouters

ICML 2024 • arXiv:2312.17295
18
citations
#1024

Graph Neural Network Explanations are Fragile

Jiate Li, Meng Pang, Yun Dong et al.

ICML 2024 • arXiv:2406.03193
18
citations
#1025

ViP: A Differentially Private Foundation Model for Computer Vision

Yaodong Yu, Maziar Sanjabi, Yi Ma et al.

ICML 2024 • arXiv:2306.08842
18
citations
#1026

Efficient World Models with Context-Aware Tokenization

Vincent Micheli, Eloi Alonso, François Fleuret

ICML 2024 • arXiv:2406.19320
18
citations
#1027

Inherent Trade-Offs between Diversity and Stability in Multi-Task Benchmarks

Guanhua Zhang, Moritz Hardt

ICML 2024 • oral • arXiv:2405.01719
18
citations
#1028

Scaling Laws for Pre-training Agents and World Models

Tim Pearce, Tabish Rashid, David Bignell et al.

ICML 2025 • arXiv:2411.04434
18
citations
#1029

HarmonyDream: Task Harmonization Inside World Models

Haoyu Ma, Jialong Wu, Ningya Feng et al.

ICML 2024 • arXiv:2310.00344
18
citations
#1030

LangTime: A Language-Guided Unified Model for Time Series Forecasting with Proximal Policy Optimization

Wenzhe Niu, Zongxia Xie, Yanru Sun et al.

ICML 2025 • oral • arXiv:2503.08271
18
citations
#1031

Robust Yet Efficient Conformal Prediction Sets

Soroush H. Zargarbashi, Mohammad Sadegh Akhondzadeh, Aleksandar Bojchevski

ICML 2024 • arXiv:2407.09165
18
citations
#1032

Better Safe than Sorry: Pre-training CLIP against Targeted Data Poisoning and Backdoor Attacks

Wenhan Yang, Jingdong Gao, Baharan Mirzasoleiman

ICML 2024 • arXiv:2310.05862
18
citations
#1033

Mechanistic Unlearning: Robust Knowledge Unlearning and Editing via Mechanistic Localization

Phillip Guo, Aaquib Syed, Abhay Sheshadri et al.

ICML 2025 • spotlight • arXiv:2410.12949
18
citations
#1034

Learning Coverage Paths in Unknown Environments with Deep Reinforcement Learning

Arvi Jonnarth, Jie Zhao, Michael Felsberg

ICML 2024 • arXiv:2306.16978
18
citations
#1035

MMedPO: Aligning Medical Vision-Language Models with Clinical-Aware Multimodal Preference Optimization

Kangyu Zhu, Peng Xia, Yun Li et al.

ICML 2025 • arXiv:2412.06141
18
citations
#1036

Neural Collapse for Cross-entropy Class-Imbalanced Learning with Unconstrained ReLU Features Model

Hien Dang, Tho Tran Huu, Tan Nguyen et al.

ICML 2024 • arXiv:2401.02058
18
citations
#1037

ActionPiece: Contextually Tokenizing Action Sequences for Generative Recommendation

Yupeng Hou, Jianmo Ni, Zhankui He et al.

ICML 2025 • spotlight • arXiv:2502.13581
18
citations
#1038

Leveraging Online Olympiad-Level Math Problems for LLMs Training and Contamination-Resistant Evaluation

Sadegh Mahdavi, Muchen Li, Kaiwen Liu et al.

ICML 2025 • arXiv:2501.14275
18
citations
#1039

A connection between Tempering and Entropic Mirror Descent

Nicolas Chopin, Francesca R Crucinio, Anna Korba

ICML 2024 • arXiv:2310.11914
18
citations
#1040

Scalable Wasserstein Gradient Flow for Generative Modeling through Unbalanced Optimal Transport

Jaemoo Choi, Jaewoong Choi, Myungjoo Kang

ICML 2024 • arXiv:2402.05443
18
citations
#1041

Understanding the Learning Dynamics of Alignment with Human Feedback

Shawn Im, Sharon Li

ICML 2024 • arXiv:2403.18742
18
citations
#1042

Understanding Heterophily for Graph Neural Networks

Junfu Wang, Yuanfang Guo, Liang Yang et al.

ICML 2024 • arXiv:2401.09125
18
citations
#1043

Weakly Convex Regularisers for Inverse Problems: Convergence of Critical Points and Primal-Dual Optimisation

Zakhar Shumaylov, Jeremy Budd, Subhadip Mukherjee et al.

ICML 2024 • arXiv:2402.01052
18
citations
#1044

Score as Action: Fine Tuning Diffusion Generative Models by Continuous-time Reinforcement Learning

Hanyang Zhao, Haoxian Chen, Ji Zhang et al.

ICML 2025 • arXiv:2502.01819
18
citations
#1045

Diffusive Gibbs Sampling

Wenlin Chen, Mingtian Zhang, Brooks Paige et al.

ICML 2024 • arXiv:2402.03008
17
citations
#1046

Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention

Dejia Xu, Yifan Jiang, Chen Huang et al.

ICML 2025 • oral • arXiv:2410.10774
17
citations
#1047

CaDA: Cross-Problem Routing Solver with Constraint-Aware Dual-Attention

Han Li, Fei Liu, Zhi Zheng et al.

ICML 2025 • arXiv:2412.00346
17
citations
#1048

Do Vision-Language Models Really Understand Visual Language?

Yifan Hou, Buse Giledereli, Yilei Tu et al.

ICML 2025 • arXiv:2410.00193
17
citations
#1049

From Vision to Audio and Beyond: A Unified Model for Audio-Visual Representation and Generation

Kun Su, Xiulong Liu, Eli Shlizerman

ICML 2024 • arXiv:2409.19132
17
citations
#1050

Plug-and-Play image restoration with Stochastic deNOising REgularization

Marien Renaud, Jean Prost, Arthur Leclaire et al.

ICML 2024 • arXiv:2402.01779
17
citations
#1051

CaRiNG: Learning Temporal Causal Representation under Non-Invertible Generation Process

Guangyi Chen, Yifan Shen, Zhenhao Chen et al.

ICML 2024 • oral • arXiv:2401.14535
17
citations
#1052

Can Transformers Learn Full Bayesian Inference in Context?

Arik Reuter, Tim G. J. Rudner, Vincent Fortuin et al.

ICML 2025 • arXiv:2501.16825
17
citations
#1053

Exploration-Driven Policy Optimization in RLHF: Theoretical Insights on Efficient Data Utilization

Yihan Du, Anna Winnicki, Gal Dalal et al.

ICML 2024 • arXiv:2402.10342
17
citations
#1054

Long-Form Speech Generation with Spoken Language Models

Se Jin Park, Julian Salazar, Aren Jansen et al.

ICML 2025 • oral • arXiv:2412.18603
17
citations
#1055

Single-Trajectory Distributionally Robust Reinforcement Learning

Zhipeng Liang, Xiaoteng Ma, Jose Blanchet et al.

ICML 2024 • arXiv:2301.11721
17
citations
#1056

Reasoning Limitations of Multimodal Large Language Models. A case study of Bongard Problems

Mikołaj Małkiński, Szymon Pawlonka, Jacek Mańdziuk

ICML 2025 • arXiv:2411.01173
17
citations
#1057

EpiCoder: Encompassing Diversity and Complexity in Code Generation

Yaoxiang Wang, Haoling Li, Xin Zhang et al.

ICML 2025 • arXiv:2501.04694
17
citations
#1058

Incentivized Learning in Principal-Agent Bandit Games

Antoine Scheid, Daniil Tiapkin, Etienne Boursier et al.

ICML 2024 • arXiv:2403.03811
17
citations
#1059

Feel-Good Thompson Sampling for Contextual Dueling Bandits

Xuheng Li, Heyang Zhao, Quanquan Gu

ICML 2024 • arXiv:2404.06013
17
citations
#1060

Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction

Vaishnavh Nagarajan, Chen Wu, Charles Ding et al.

ICML 2025 • oral • arXiv:2504.15266
17
citations
#1061

Mitigating Object Hallucination in Large Vision-Language Models via Image-Grounded Guidance

Linxi Zhao, Yihe Deng, Weitong Zhang et al.

ICML 2025 • spotlight • arXiv:2402.08680
17
citations
#1062

Uncertainty for Active Learning on Graphs

Dominik Fuchsgruber, Tom Wollschläger, Bertrand Charpentier et al.

ICML 2024 • arXiv:2405.01462
17
citations
#1063

Copilot Arena: A Platform for Code LLM Evaluation in the Wild

Wayne Chi, Valerie Chen, Anastasios Angelopoulos et al.

ICML 2025 • arXiv:2502.09328
17
citations
#1064

DPN: Decoupling Partition and Navigation for Neural Solvers of Min-max Vehicle Routing Problems

Zhi Zheng, Shunyu Yao, Zhenkun Wang et al.

ICML 2024 • arXiv:2405.17272
17
citations
#1065

Emergence in non-neural models: grokking modular arithmetic via average gradient outer product

Neil Mallinar, Daniel Beaglehole, Libin Zhu et al.

ICML 2025 • oral • arXiv:2407.20199
17
citations
#1066

ProtoGate: Prototype-based Neural Networks with Global-to-local Feature Selection for Tabular Biomedical Data

Xiangjian Jiang, Andrei Margeloiu, Nikola Simidjievski et al.

ICML 2024 • arXiv:2306.12330
17
citations
#1067

Triplet Interaction Improves Graph Transformers: Accurate Molecular Graph Learning with Triplet Graph Transformers

Md Shamim Hussain, Mohammed Zaki, Dharmashankar Subramanian

ICML 2024 • arXiv:2402.04538
17
citations
#1068

Fast-Slow Test-Time Adaptation for Online Vision-and-Language Navigation

Junyu Gao, Xuan Yao, Changsheng Xu

ICML 2024 • arXiv:2311.13209
17
citations
#1069

Spider: A Unified Framework for Context-dependent Concept Segmentation

Xiaoqi Zhao, Youwei Pang, Wei Ji et al.

ICML 2024 • arXiv:2405.01002
17
citations
#1070

Robust Optimization in Protein Fitness Landscapes Using Reinforcement Learning in Latent Space

Minji Lee, Luiz Felipe Vecchietti, Hyunkyu Jung et al.

ICML 2024 • spotlight • arXiv:2405.18986
17
citations
#1071

Sliced Wasserstein with Random-Path Projecting Directions

Khai Nguyen, Shujian Zhang, Tam Le et al.

ICML 2024 • arXiv:2401.15889
17
citations
#1072

WyckoffDiff -- A Generative Diffusion Model for Crystal Symmetry

Filip Ekström Kelvinius, Oskar Andersson, Abhijith Parackal et al.

ICML 2025 • arXiv:2502.06485
17
citations
#1073

(How) Do Language Models Track State?

Belinda Li, Carl Guo, Jacob Andreas

ICML 2025 • arXiv:2503.02854
17
citations
#1074

$S^2$IP-LLM: Semantic Space Informed Prompt Learning with LLM for Time Series Forecasting

Zijie Pan, Yushan Jiang, Sahil Garg et al.

ICML 2024 • oral • arXiv:2403.05798
17
citations
#1075

More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning

Kaiwen Wang, Owen Oertell, Alekh Agarwal et al.

ICML 2024 • arXiv:2402.07198
17
citations
#1076

Information Flow in Self-Supervised Learning

Zhiquan Tan, Jingqin Yang, Weiran Huang et al.

ICML 2024 • arXiv:2309.17281
17
citations
#1077

Restoring Calibration for Aligned Large Language Models: A Calibration-Aware Fine-Tuning Approach

Jiancong Xiao, Bojian Hou, Zhanliang Wang et al.

ICML 2025 • arXiv:2505.01997
17
citations
#1078

Binning as a Pretext Task: Improving Self-Supervised Learning in Tabular Domains

Kyungeun Lee, Ye Seul Sim, Hye-Seung Cho et al.

ICML 2024 • arXiv:2405.07414
17
citations
#1079

Mastering Robot Manipulation with Multimodal Prompts through Pretraining and Multi-task Fine-tuning

Jiachen Li, Qiaozi Gao, Michael Johnston et al.

ICML 2024 • arXiv:2310.09676
17
citations
#1080

Not Just Pretty Pictures: Toward Interventional Data Augmentation Using Text-to-Image Generators

Jianhao Yuan, Francesco Pinto, Adam Davies et al.

ICML 2024 • arXiv:2212.11237
17
citations
#1081

Synthetic Face Datasets Generation via Latent Space Exploration from Brownian Identity Diffusion

David Geissbühler, Hatef Otroshi Shahreza, Sébastien Marcel

ICML 2025 • arXiv:2405.00228
17
citations
#1082

Residual-Conditioned Optimal Transport: Towards Structure-Preserving Unpaired and Paired Image Restoration

Xiaole Tang, Hu Xin, Xiang Gu et al.

ICML 2024 • arXiv:2405.02843
17
citations
#1083

Physics-Informed Neural Network Policy Iteration: Algorithms, Convergence, and Verification

Yiming Meng, Ruikun Zhou, Amartya Mukherjee et al.

ICML 2024 • arXiv:2402.10119
17
citations
#1084

On the Universality of Volume-Preserving and Coupling-Based Normalizing Flows

Felix Draxler, Stefan Wahl, Christoph Schnörr et al.

ICML 2024 • arXiv:2402.06578
17
citations
#1085

Counterfactual Image Editing

Yushu Pan, Elias Bareinboim

ICML 2024 • arXiv:2403.09683
17
citations
#1086

Sample-Efficient Robust Multi-Agent Reinforcement Learning in the Face of Environmental Uncertainty

Laixi Shi, Eric Mazumdar, Yuejie Chi et al.

ICML 2024 • arXiv:2404.18909
17
citations
#1087

Idiosyncrasies in Large Language Models

Mingjie Sun, Yida Yin, Zhiqiu (Oscar) Xu et al.

ICML 2025 • arXiv:2502.12150
17
citations
#1088

Grokking Group Multiplication with Cosets

Dashiell Stander, Qinan Yu, Honglu Fan et al.

ICML 2024 • arXiv:2312.06581
17
citations
#1089

Parameter Efficient Quasi-Orthogonal Fine-Tuning via Givens Rotation

Xinyu Ma, Xu Chu, Zhibang Yang et al.

ICML 2024 • arXiv:2404.04316
17
citations
#1090

Neural Image Compression with Text-guided Encoding for both Pixel-level and Perceptual Fidelity

Hagyeong Lee, Minkyu Kim, Jun-Hyuk Kim et al.

ICML 2024 • arXiv:2403.02944
17
citations
#1091

FunBO: Discovering Acquisition Functions for Bayesian Optimization with FunSearch

Virginia Aglietti, Ira Ktena, Jessica Schrouff et al.

ICML 2025 • arXiv:2406.04824
17
citations
#1092

Discrete Latent Perspective Learning for Segmentation and Detection

Deyi Ji, Feng Zhao, Lanyun Zhu et al.

ICML 2024 • spotlight • arXiv:2406.10475
17
citations
#1093

Beyond Sole Strength: Customized Ensembles for Generalized Vision-Language Models

Zhihe Lu, Jiawang Bai, Xin Li et al.

ICML 2024 • arXiv:2311.17091
17
citations
#1094

Optimizing Temperature for Language Models with Multi-Sample Inference

Weihua Du, Yiming Yang, Sean Welleck

ICML 2025 • arXiv:2502.05234
17
citations
#1095

Diffusion Posterior Sampling is Computationally Intractable

Shivam Gupta, Ajil Jalal, Aditya Parulekar et al.

ICML 2024 • arXiv:2402.12727
17
citations
#1096

Multi-Turn Code Generation Through Single-Step Rewards

Arnav Kumar Jain, Gonzalo Gonzalez-Pumariega, Wayne Chen et al.

ICML 2025 • spotlight • arXiv:2502.20380
17
citations
#1097

Accelerated Diffusion Models via Speculative Sampling

Valentin De Bortoli, Alexandre Galashov, Arthur Gretton et al.

ICML 2025 • arXiv:2501.05370
17
citations
#1098

Patch-wise Structural Loss for Time Series Forecasting

Dilfira Kudrat, Zongxia Xie, Yanru Sun et al.

ICML 2025 • oral • arXiv:2503.00877
17
citations
#1099

Sparsing Law: Towards Large Language Models with Greater Activation Sparsity

Yuqi Luo, Chenyang Song, Xu Han et al.

ICML 2025 • arXiv:2411.02335
16
citations
#1100

Position: A Call to Action for a Human-Centered AutoML Paradigm

Marius Lindauer, Florian Karl, Anne Klier et al.

ICML 2024 • arXiv:2406.03348
16
citations
#1101

Batch and match: black-box variational inference with a score-based divergence

Diana Cai, Chirag Modi, Loucas Pillaud-Vivien et al.

ICML 2024 • spotlight • arXiv:2402.14758
16
citations
#1102

Ditto: Quantization-aware Secure Inference of Transformers upon MPC

Haoqi Wu, Wenjing Fang, Yancheng Zheng et al.

ICML 2024 • arXiv:2405.05525
16
citations
#1103

RelGNN: Composite Message Passing for Relational Deep Learning

Tianlang Chen, Charilaos Kanatsoulis, Jure Leskovec

ICML 2025 • arXiv:2502.06784
16
citations
#1104

Language Models Represent Beliefs of Self and Others

Wentao Zhu, Zhining Zhang, Yizhou Wang

ICML 2024 • arXiv:2402.18496
16
citations
#1105

Visual Autoregressive Modeling for Image Super-Resolution

Yunpeng Qu, Kun Yuan, Jinhua Hao et al.

ICML 2025 • arXiv:2501.18993
16
citations
#1106

One Prompt is not Enough: Automated Construction of a Mixture-of-Expert Prompts

Ruochen Wang, Sohyun An, Minhao Cheng et al.

ICML 2024 • arXiv:2407.00256
16
citations
#1107

Characteristic Guidance: Non-linear Correction for Diffusion Model at Large Guidance Scale

Candi Zheng, Yuan Lan

ICML 2024 • arXiv:2312.07586
16
citations
#1108

Neural Encoding and Decoding at Scale

Yizi Zhang, Yanchen Wang, Mehdi Azabou et al.

ICML 2025 • oral • arXiv:2504.08201
16
citations
#1109

A Unified Comparative Study with Generalized Conformity Scores for Multi-Output Conformal Regression

Victor Dheur, Matteo Fontana, Yorick Estievenart et al.

ICML 2025 • arXiv:2501.10533
16
citations
#1110

VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling

Siyuan Li, Zedong Wang, Zicheng Liu et al.

ICML 2024 • arXiv:2405.10812
16
citations
#1111

Balanced Resonate-and-Fire Neurons

Saya Higuchi, Sebastian Kairat, Sander Bohte et al.

ICML 2024 • arXiv:2402.14603
16
citations
#1112

CoLoRA: Continuous low-rank adaptation for reduced implicit neural modeling of parameterized partial differential equations

Jules Berman, Benjamin Peherstorfer

ICML 2024 • arXiv:2402.14646
16
citations
#1113

In-Context Fine-Tuning for Time-Series Foundation Models

Matthew Faw, Rajat Sen, Yichen Zhou et al.

ICML 2025 • arXiv:2410.24087
16
citations
#1114

Neural Diffusion Models

Grigory Bartosh, Dmitry Vetrov, Christian Andersson Naesseth

ICML 2024 • arXiv:2310.08337
16
citations
#1115

Erwin: A Tree-based Hierarchical Transformer for Large-scale Physical Systems

Maksim Zhdanov, Max Welling, Jan-Willem van de Meent

ICML 2025 • arXiv:2502.17019
16
citations
#1116

Premise-Augmented Reasoning Chains Improve Error Identification in Math reasoning with LLMs

Sagnik Mukherjee, Abhinav Chinta, Takyoung Kim et al.

ICML 2025 • arXiv:2502.02362
16
citations
#1117

Robust and Conjugate Gaussian Process Regression

Matias Altamirano, Francois-Xavier Briol, Jeremias Knoblauch

ICML 2024 • spotlight • arXiv:2311.00463
16
citations
#1118

Diffusion on Language Model Encodings for Protein Sequence Generation

Viacheslav Meshchaninov, Pavel Strashnov, Andrey Shevtsov et al.

ICML 2025 • arXiv:2403.03726
16
citations
#1119

In-Context Learning Agents Are Asymmetric Belief Updaters

Johannes A. Schubert, Akshay Kumar Jagadish, Marcel Binz et al.

ICML 2024 • arXiv:2402.03969
16
citations
#1120

Beyond Log-Concavity and Score Regularity: Improved Convergence Bounds for Score-Based Generative Models in W2-distance

Marta Gentiloni Silveri, Antonio Ocello

ICML 2025 • arXiv:2501.02298
16
citations
#1121

FrameQuant: Flexible Low-Bit Quantization for Transformers

Harshavardhan Adepu, Zhanpeng Zeng, Li Zhang et al.

ICML 2024 • arXiv:2403.06082
16
citations
#1122

Toward Adaptive Reasoning in Large Language Models with Thought Rollback

Sijia Chen, Baochun Li

ICML 2024 • arXiv:2412.19707
16
citations
#1123

Truly No-Regret Learning in Constrained MDPs

Adrian Müller, Pragnya Alatur, Volkan Cevher et al.

ICML 2024 • spotlight • arXiv:2402.15776
16
citations
#1124

Expressive Power of Graph Neural Networks for (Mixed-Integer) Quadratic Programs

Ziang Chen, Xiaohan Chen, Jialin Liu et al.

ICML 2025 • arXiv:2406.05938
16
citations
#1125

Understanding Unimodal Bias in Multimodal Deep Linear Networks

Yedi Zhang, Peter Latham, Andrew Saxe

ICML 2024 • arXiv:2312.00935
16
citations
#1126

Action-Minimization Meets Generative Modeling: Efficient Transition Path Sampling with the Onsager-Machlup Functional

Sanjeev Raja, Martin Šípka, Michael Psenka et al.

ICML 2025 • oral • arXiv:2504.18506
16
citations
#1127

Online Cascade Learning for Efficient Inference over Streams

Lunyiu Nie, Zhimin Ding, Erdong Hu et al.

ICML 2024 • arXiv:2402.04513
16
citations
#1128

Quantum Implicit Neural Representations

Jiaming Zhao, Wenbo Qiao, Peng Zhang et al.

ICML 2024 • arXiv:2406.03873
16
citations
#1129

The Brain's Bitter Lesson: Scaling Speech Decoding With Self-Supervised Learning

Dulhan Jayalath, Gilad Landau, Brendan Shillingford et al.

ICML 2025 • arXiv:2406.04328
16
citations
#1130

An Interpretable Evaluation of Entropy-based Novelty of Generative Models

Jingwei Zhang, Cheuk Ting Li, Farzan Farnia

ICML 2024 • arXiv:2402.17287
16
citations
#1131

End-to-End Neuro-Symbolic Reinforcement Learning with Textual Explanations

Lirui Luo, Guoxi Zhang, Hongming Xu et al.

ICML 2024 • spotlight • arXiv:2403.12451
16
citations
#1132

A Geometric Explanation of the Likelihood OOD Detection Paradox

Hamidreza Kamkari, Brendan Ross, Jesse Cresswell et al.

ICML 2024 • arXiv:2403.18910
16
citations
#1133

SurfPro: Functional Protein Design Based on Continuous Surface

Zhenqiao Song, Tinglin Huang, Lei Li et al.

ICML 2024 • arXiv:2405.06693
16
citations
#1134

Harnessing the Power of Neural Operators with Automatically Encoded Conservation Laws

Ning Liu, Yiming Fan, Xianyi Zeng et al.

ICML 2024 • spotlight • arXiv:2312.11176
16
citations
#1135

PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control

Ruijie Zheng, Ching-An Cheng, Hal Daumé et al.

ICML 2024 • oral • arXiv:2402.10450
16
citations
#1136

Experts Don't Cheat: Learning What You Don't Know By Predicting Pairs

Daniel D. Johnson, Daniel Tarlow, David Duvenaud et al.

ICML 2024 • arXiv:2402.08733
16
citations
#1137

Acquiring Diverse Skills using Curriculum Reinforcement Learning with Mixture of Experts

Onur Celik, Aleksandar Taranovic, Gerhard Neumann

ICML 2024 • arXiv:2403.06966
16
citations
#1138

Generative Modeling on Manifolds Through Mixture of Riemannian Diffusion Processes

Jaehyeong Jo, Sung Ju Hwang

ICML 2024 • arXiv:2310.07216
16
citations
#1139

$H$-Consistency Guarantees for Regression

Anqi Mao, Mehryar Mohri, Yutao Zhong

ICML 2024 • arXiv:2403.19480
16
citations
#1140

Data-Efficient Learning via Clustering-Based Sensitivity Sampling: Foundation Models and Beyond

Kyriakos Axiotis, Vincent Cohen-Addad, Monika Henzinger et al.

ICML 2024 • arXiv:2402.17327
16
citations
#1141

The Jailbreak Tax: How Useful are Your Jailbreak Outputs?

Kristina Nikolić, Luze Sun, Jie Zhang et al.

ICML 2025 • spotlight • arXiv:2504.10694
16
citations
#1142

Keypoint-based Progressive Chain-of-Thought Distillation for LLMs

Kaituo Feng, Changsheng Li, Xiaolu Zhang et al.

ICML 2024 • arXiv:2405.16064
16
citations
#1143

LoCoCo: Dropping In Convolutions for Long Context Compression

Ruisi Cai, Yuandong Tian, Zhangyang “Atlas” Wang et al.

ICML 2024 • arXiv:2406.05317
16
citations
#1144

Metadata Conditioning Accelerates Language Model Pre-training

Tianyu Gao, Alexander Wettig, Luxi He et al.

ICML 2025 • arXiv:2501.01956
16
citations
#1145

Does Graph Prompt Work? A Data Operation Perspective with Theoretical Analysis

Qunzhong Wang, Xiangguo Sun, Hong Cheng

ICML 2025 • arXiv:2410.01635
15
citations
#1146

Graph2Tac: Online Representation Learning of Formal Math Concepts

Lasse Blaauwbroek, Mirek Olšák, Jason Rute et al.

ICML 2024 • arXiv:2401.02949
15
citations
#1147

Graph Distillation with Eigenbasis Matching

Yang Liu, Deyu Bo, Chuan Shi

ICML 2024 • arXiv:2310.09202
15
citations
#1148

Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks

Rahul Ramesh, Ekdeep Singh Lubana, Mikail Khona et al.

ICML 2024 • arXiv:2311.12997
15
citations
#1149

Borda Regret Minimization for Generalized Linear Dueling Bandits

Yue Wu, Tao Jin, Qiwei Di et al.

ICML 2024 • arXiv:2303.08816
15
citations
#1150

Refined Coreset Selection: Towards Minimal Coreset Size under Model Performance Constraints

Xiaobo Xia, Jiale Liu, Shaokun Zhang et al.

ICML 2024 • spotlight • arXiv:2311.08675
15
citations
#1151

OWLS: Scaling Laws for Multilingual Speech Recognition and Translation Models

William Chen, Jinchuan Tian, Yifan Peng et al.

ICML 2025 • arXiv:2502.10373
15
citations
#1152

ALMTokenizer: A Low-bitrate and Semantic-rich Audio Codec Tokenizer for Audio Language Modeling

Dongchao Yang, Songxiang Liu, Haohan Guo et al.

ICML 2025 • arXiv:2504.10344
15
citations
#1153

Eliciting Language Model Behaviors with Investigator Agents

Xiang Li, Neil Chowdhury, Daniel Johnson et al.

ICML 2025 • oral • arXiv:2502.01236
15
citations
#1154

Doubly Robust Causal Effect Estimation under Networked Interference via Targeted Learning

Weilin Chen, Ruichu Cai, Zeqin Yang et al.

ICML 2024 • arXiv:2405.03342
15
citations
#1155

Graph Neural PDE Solvers with Conservation and Similarity-Equivariance

Masanobu Horie, Naoto Mitsume

ICML 2024 • arXiv:2405.16183
15
citations
#1156

How Interpretable Are Interpretable Graph Neural Networks?

Yongqiang Chen, Yatao Bian, Bo Han et al.

ICML 2024 • arXiv:2406.07955
15
citations
#1157

HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning

Shengchao Hu, Ziqing Fan, Li Shen et al.

ICML 2024 • arXiv:2405.18080
15
citations
#1158

LoRA-One: One-Step Full Gradient Could Suffice for Fine-Tuning Large Language Models, Provably and Efficiently

Yuanhe Zhang, Fanghui Liu, Yudong Chen

ICML 2025 • oral • arXiv:2502.01235
15
citations
#1159

RAUCA: A Novel Physical Adversarial Attack on Vehicle Detectors via Robust and Accurate Camouflage Generation

Jiawei Zhou, Linye Lyu, Daojing He et al.

ICML 2024 • arXiv:2402.15853
15
citations
#1160

Towards efficient deep spiking neural networks construction with spiking activity based pruning

Yaxin Li, Qi Xu, Jiangrong Shen et al.

ICML 2024 • arXiv:2406.01072
15
citations
#1161

AAAR-1.0: Assessing AI’s Potential to Assist Research

Renze Lou, Hanzi Xu, Sijia Wang et al.

ICML 2025 • arXiv:2410.22394
15
citations
#1162

Position: Don't Use the CLT in LLM Evals With Fewer Than a Few Hundred Datapoints

Sam Bowyer, Laurence Aitchison, Desi Ivanova

ICML 2025 • spotlight • arXiv:2503.01747
15
citations
#1163

Improving Generalization in Federated Learning with Highly Heterogeneous Data via Momentum-Based Stochastic Controlled Weight Averaging

Junkang Liu, Yuanyuan Liu, Fanhua Shang et al.

ICML 2025 • arXiv:2507.20016
15
citations
#1164

A New Linear Scaling Rule for Private Adaptive Hyperparameter Optimization

Ashwinee Panda, Xinyu Tang, Saeed Mahloujifar et al.

ICML 2024 • arXiv:2212.04486
15
citations
#1165

Retrieval Augmented Time Series Forecasting

Sungwon Han, Seungeon Lee, Meeyoung Cha et al.

ICML 2025 • oral • arXiv:2505.04163
15
citations
#1166

Refining Minimax Regret for Unsupervised Environment Design

Michael Beukman, Samuel Coward, Michael Matthews et al.

ICML 2024 • arXiv:2402.12284
15
citations
#1167

Peri-LN: Revisiting Normalization Layer in the Transformer Architecture

Jeonghoon Kim, Byeongchan Lee, Cheonbok Park et al.

ICML 2025 • spotlight • arXiv:2502.02732
15
citations
#1168

Do Large Code Models Understand Programming Concepts? Counterfactual Analysis for Code Predicates

Ashish Hooda, Mihai Christodorescu, Miltiadis Allamanis et al.

ICML 2024 • arXiv:2402.05980
15
citations
#1169

RAGGED: Towards Informed Design of Scalable and Stable RAG Systems

Jennifer Hsia, Afreen Shaikh, Zhiruo Wang et al.

ICML 2025 • arXiv:2403.09040
15
citations
#1170

Optimal transport-based conformal prediction

Gauthier Thurin, Kimia Nadjahi, Claire Boyer

ICML 2025 • arXiv:2501.18991
15
citations
#1171

An Empirical Study Into What Matters for Calibrating Vision-Language Models

Weijie Tu, Weijian Deng, Dylan Campbell et al.

ICML 2024 • arXiv:2402.07417
15
citations
#1172

Linear Explanations for Individual Neurons

Tuomas Oikarinen, Lily Weng

ICML 2024 • arXiv:2405.06855
15
citations
#1173

Galileo: Learning Global & Local Features of Many Remote Sensing Modalities

Gabriel Tseng, Anthony Fuller, Marlena Reil et al.

ICML 2025 • arXiv:2502.09356
15
citations
#1174

ResQ: Mixed-Precision Quantization of Large Language Models with Low-Rank Residuals

Utkarsh Saxena, Sayeh Sharify, Kaushik Roy et al.

ICML 2025 • spotlight • arXiv:2412.14363
15
citations
#1175

ESM All-Atom: Multi-Scale Protein Language Model for Unified Molecular Modeling

Kangjie Zheng, Siyu Long, Tianyu Lu et al.

ICML 2024 • arXiv:2403.12995
15
citations
#1176

MH-pFLID: Model Heterogeneous personalized Federated Learning via Injection and Distillation for Medical Data Analysis

Luyuan Xie, Manqing Lin, Tianyu Luan et al.

ICML 2024 • arXiv:2405.06822
15
citations
#1177

Q-Probe: A Lightweight Approach to Reward Maximization for Language Models

Kenneth Li, Samy Jelassi, Hugh Zhang et al.

ICML 2024 • arXiv:2402.14688
15
citations
#1178

Quality-Weighted Vendi Scores And Their Application To Diverse Experimental Design

Quan Nguyen, Adji Bousso Dieng

ICML 2024 • arXiv:2405.02449
15
citations
#1179

Language Models as Science Tutors

Alexis Chevalier, Jiayi Geng, Alexander Wettig et al.

ICML 2024 • arXiv:2402.11111
15
citations
#1180

ContPhy: Continuum Physical Concept Learning and Reasoning from Videos

Zhicheng Zheng, Xin Yan, Zhenfang Chen et al.

ICML 2024 • arXiv:2402.06119
15
citations
#1181

Weisfeiler-Leman at the margin: When more expressivity matters

Billy Franks, Christopher Morris, Ameya Velingker et al.

ICML 2024 • arXiv:2402.07568
15
citations
#1182

PILAF: Optimal Human Preference Sampling for Reward Modeling

Yunzhen Feng, Ariel Kwiatkowski, Kunhao Zheng et al.

ICML 2025 • arXiv:2502.04270
15
citations
#1183

LOCATE 3D: Real-World Object Localization via Self-Supervised Learning in 3D

Paul McVay, Sergio Arnaud, Ada Martin et al.

ICML 2025 • spotlight • arXiv:2504.14151
15
citations
#1184

Hyperbolic Active Learning for Semantic Segmentation under Domain Shift

Luca Franco, Paolo Mandica, Konstantinos Kallidromitis et al.

ICML 2024 • arXiv:2306.11180
15
citations
#1185

Interpretable Deep Clustering for Tabular Data

Jonathan Svirsky, Ofir Lindenbaum

ICML 2024 • arXiv:2306.04785
15
citations
#1186

Critical feature learning in deep neural networks

Kirsten Fischer, Javed Lindner, David Dahmen et al.

ICML 2024 • arXiv:2405.10761
15
citations
#1187

Stochastic Deep Restoration Priors for Imaging Inverse Problems

Yuyang Hu, Albert Peng, Weijie Gan et al.

ICML 2025 • arXiv:2410.02057
15
citations
#1188

EPIC: Efficient Position-Independent Caching for Serving Large Language Models

Junhao Hu, Wenrui Huang, Weidong Wang et al.

ICML 2025 • arXiv:2410.15332
15
citations
#1189

Simultaneous Multi-Robot Motion Planning with Projected Diffusion Models

Jinhao Liang, Jacob Christopher, Sven Koenig et al.

ICML 2025 • arXiv:2502.03607
15
citations
#1190

LongRoPE2: Near-Lossless LLM Context Window Scaling

Ning Shang, Li Lyna Zhang, Siyuan Wang et al.

ICML 2025 • arXiv:2502.20082
15
citations
#1191

Learning Linear Block Error Correction Codes

Yoni Choukroun, Lior Wolf

ICML 2024 • arXiv:2405.04050
15
citations
#1192

FrameBridge: Improving Image-to-Video Generation with Bridge Models

Yuji Wang, Zehua Chen, Chen Xiaoyu et al.

ICML 2025 • arXiv:2410.15371
15
citations
#1193

Maximum Entropy Reinforcement Learning with Diffusion Policy

Xiaoyi Dong, Jian Cheng, Xi Zhang

ICML 2025 • arXiv:2502.11612
15
citations
#1194

HyperTree Planning: Enhancing LLM Reasoning via Hierarchical Thinking

Runquan Gui, Zhihai Wang, Jie Wang et al.

ICML 2025 • arXiv:2505.02322
15
citations
#1195

EvGGS: A Collaborative Learning Framework for Event-based Generalizable Gaussian Splatting

Jiaxu Wang, Junhao He, Ziyi Zhang et al.

ICML 2024 • arXiv:2405.14959
15
citations
#1196

Thermalizer: Stable autoregressive neural emulation of spatiotemporal chaos

Chris Pedersen, Laure Zanna, Joan Bruna

ICML 2025 • oral • arXiv:2503.18731
15
citations
#1197

Benign Samples Matter! Fine-tuning On Outlier Benign Samples Severely Breaks Safety

Zihan Guan, Mengxuan Hu, Ronghang Zhu et al.

ICML 2025 • spotlight • arXiv:2505.06843
15
citations
#1198

On Path to Multimodal Generalist: General-Level and General-Bench

Hao Fei, Yuan Zhou, Juncheng Li et al.

ICML 2025 • oral • arXiv:2505.04620
15
citations
#1199

Prediction-powered Generalization of Causal Inferences

Ilker Demirel, Ahmed Alaa, Anthony Philippakis et al.

ICML 2024 • arXiv:2406.02873
15
citations
#1200

Probing Visual Language Priors in VLMs

Tiange Luo, Ang Cao, Gunhee Lee et al.

ICML 2025 • arXiv:2501.00569
15
citations