Most Cited ICML "hierarchical residual quantization" Papers

5,975 papers found • Page 3 of 30

#401

FlipAttack: Jailbreak LLMs via Flipping

Yue Liu, Xiaoxin He, Miao Xiong et al.

ICML 2025 • arXiv:2410.02832 • 47 citations
#402

FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching

Sucheng Ren, Qihang Yu, Ju He et al.

ICML 2025 • arXiv:2412.15205 • 47 citations
#403

Deep Networks Always Grok and Here is Why

Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk

ICML 2024 • arXiv:2402.15555 • 47 citations
#404

On the Embedding Collapse when Scaling up Recommendation Models

Xingzhuo Guo, Junwei Pan, Ximei Wang et al.

ICML 2024 • arXiv:2310.04400 • 47 citations
#405

An Analysis of Linear Time Series Forecasting Models

William Toner, Luke Darlow

ICML 2024 • arXiv:2403.14587 • 47 citations
#406

AnyEdit: Edit Any Knowledge Encoded in Language Models

Houcheng Jiang, Junfeng Fang, Ningyu Zhang et al.

ICML 2025 • arXiv:2502.05628 • 47 citations
#407

Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration

Zhongzhi Yu, Zheng Wang, Yonggan Fu et al.

ICML 2024 • arXiv:2406.15765 • 47 citations
#408

eCeLLM: Generalizing Large Language Models for E-commerce from Large-scale, High-quality Instruction Data

Peng, Xinyi Ling, Ziru Chen et al.

ICML 2024 • arXiv:2402.08831 • 46 citations
#409

Empirical Design in Reinforcement Learning

Andrew Patterson, Samuel F Neumann, Martha White et al.

ICML 2025 • arXiv:2304.01315 • 46 citations
#410

Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas

Shiqi Chen, Tongyao Zhu, Ruochen Zhou et al.

ICML 2025 • arXiv:2503.01773 • 46 citations
#411

Dual Operating Modes of In-Context Learning

Ziqian Lin, Kangwook Lee

ICML 2024 • arXiv:2402.18819 • 46 citations
#412

CasCast: Skillful High-resolution Precipitation Nowcasting via Cascaded Modelling

Junchao Gong, Lei Bai, Peng Ye et al.

ICML 2024 • arXiv:2402.04290 • 46 citations
#413

Active Preference Learning for Large Language Models

William Muldrew, Peter Hayes, Mingtian Zhang et al.

ICML 2024 • arXiv:2402.08114 • 46 citations
#414

Improving fine-grained understanding in image-text pre-training

Ioana Bica, Anastasija Ilic, Matthias Bauer et al.

ICML 2024 • arXiv:2401.09865 • 46 citations
#415

RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis

Yao Mu, Junting Chen, Qing-Long Zhang et al.

ICML 2024 • arXiv:2402.16117 • 46 citations
#416

Which Agent Causes Task Failures and When? On Automated Failure Attribution of LLM Multi-Agent Systems

Shaokun Zhang, Ming Yin, Jieyu Zhang et al.

ICML 2025 (spotlight) • arXiv:2505.00212 • 46 citations
#417

On Computational Limits of Modern Hopfield Models: A Fine-Grained Complexity Analysis

Jerry Yao-Chieh Hu, Thomas Lin, Zhao Song et al.

ICML 2024 • arXiv:2402.04520 • 46 citations
#418

On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents

Jen-Tse Huang, Jiaxu Zhou, Tailin Jin et al.

ICML 2025 • arXiv:2408.00989 • 45 citations
#419

Context is Key: A Benchmark for Forecasting with Essential Textual Information

Andrew Williams, Arjun Ashok, Étienne Marcotte et al.

ICML 2025 • arXiv:2410.18959 • 45 citations
#420

A Multimodal Automated Interpretability Agent

Tamar Rott Shaham, Sarah Schwettmann, Franklin Wang et al.

ICML 2024 • arXiv:2404.14394 • 45 citations
#421

A Resilient and Accessible Distribution-Preserving Watermark for Large Language Models

Yihan Wu, Zhengmian Hu, Junfeng Guo et al.

ICML 2024 • arXiv:2310.07710 • 45 citations
#422

FireFlow: Fast Inversion of Rectified Flow for Image Semantic Editing

Yingying Deng, Xiangyu He, Changwang Mei et al.

ICML 2025 • arXiv:2412.07517 • 45 citations
#423

UP-VLA: A Unified Understanding and Prediction Model for Embodied Agent

Jianke Zhang, Yanjiang Guo, Yucheng Hu et al.

ICML 2025 • arXiv:2501.18867 • 45 citations
#424

The Surprising Effectiveness of Test-Time Training for Few-Shot Learning

Ekin Akyürek, Mehul Damani, Adam Zweiger et al.

ICML 2025 • arXiv:2411.07279 • 45 citations
#425

One-Step Diffusion Policy: Fast Visuomotor Policies via Diffusion Distillation

Zhendong Wang, Max Li, Ajay Mandlekar et al.

ICML 2025 • arXiv:2410.21257 • 44 citations
#426

RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers

Min Zhao, Guande He, Yixiao Chen et al.

ICML 2025 (oral) • arXiv:2502.15894 • 44 citations
#427

Theoretical insights for diffusion guidance: A case study for Gaussian mixture models

Yuchen Wu, Minshuo Chen, Zihao Li et al.

ICML 2024 • arXiv:2403.01639 • 44 citations
#428

Thinking LLMs: General Instruction Following with Thought Generation

Tianhao Wu, Janice Lan, Weizhe Yuan et al.

ICML 2025 • arXiv:2410.10630 • 44 citations
#429

A Language Model’s Guide Through Latent Space

Dimitri von Rütte, Sotiris Anagnostidis, Gregor Bachmann et al.

ICML 2024 • arXiv:2402.14433 • 44 citations
#430

Parameterized Physics-informed Neural Networks for Parameterized PDEs

Woojin Cho, Minju Jo, Haksoo Lim et al.

ICML 2024 • arXiv:2408.09446 • 44 citations
#431

Feedback Efficient Online Fine-Tuning of Diffusion Models

Masatoshi Uehara, Yulai Zhao, Kevin Black et al.

ICML 2024 • arXiv:2402.16359 • 44 citations
#432

Online conformal prediction with decaying step sizes

Anastasios Angelopoulos, Rina Barber, Stephen Bates

ICML 2024 • arXiv:2402.01139 • 44 citations
#433

Equivariant Graph Neural Operator for Modeling 3D Dynamics

Minkai Xu, Jiaqi Han, Aaron Lou et al.

ICML 2024 (oral) • arXiv:2401.11037 • 44 citations
#434

A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules?

Agustinus Kristiadi, Felix Strieth-Kalthoff, Marta Skreta et al.

ICML 2024 • arXiv:2402.05015 • 44 citations
#435

AdaWorld: Learning Adaptable World Models with Latent Actions

Shenyuan Gao, Siyuan Zhou, Yilun Du et al.

ICML 2025 • arXiv:2503.18938 • 44 citations
#436

ShieldAgent: Shielding Agents via Verifiable Safety Policy Reasoning

Zhaorun Chen, Mintong Kang, Bo Li

ICML 2025 • arXiv:2503.22738 • 43 citations
#437

An Architecture Search Framework for Inference-Time Techniques

Jon Saad-Falcon, Adrian Lafuente, Shlok Natarajan et al.

ICML 2025 • arXiv:2409.15254 • 43 citations
#438

In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation

Shiqi Chen, Miao Xiong, Junteng Liu et al.

ICML 2024 • arXiv:2403.01548 • 43 citations
#439

Is Best-of-N the Best of Them? Coverage, Scaling, and Optimality in Inference-Time Alignment

Audrey Huang, Adam Block, Qinghua Liu et al.

ICML 2025 • arXiv:2503.21878 • 43 citations
#440

Can AI Assistants Know What They Don't Know?

Qinyuan Cheng, Tianxiang Sun, Xiangyang Liu et al.

ICML 2024 • arXiv:2401.13275 • 43 citations
#441

Orthogonal Subspace Decomposition for Generalizable AI-Generated Image Detection

Zhiyuan Yan, Jiangming Wang, Peng Jin et al.

ICML 2025 (oral) • arXiv:2411.15633 • 43 citations
#442

CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers

Dachuan Shi, Chaofan Tao, Anyi Rao et al.

ICML 2024 • arXiv:2305.17455 • 43 citations
#443

Uniform Memory Retrieval with Larger Capacity for Modern Hopfield Models

Dennis Wu, Jerry Yao-Chieh Hu, Teng-Yun Hsiao et al.

ICML 2024 • arXiv:2404.03827 • 43 citations
#444

AI Alignment with Changing and Influenceable Reward Functions

Micah Carroll, Davis Foote, Anand Siththaranjan et al.

ICML 2024 • arXiv:2405.17713 • 43 citations
#445

Conformal Prediction for Deep Classifier via Label Ranking

Jianguo Huang, HuaJun Xi, Linjun Zhang et al.

ICML 2024 • arXiv:2310.06430 • 43 citations
#446

Beyond ELBOs: A Large-Scale Evaluation of Variational Methods for Sampling

Denis Blessing, Xiaogang Jia, Johannes Esslinger et al.

ICML 2024 • arXiv:2406.07423 • 43 citations
#447

Graph Attention Retrospective

Kimon Fountoulakis, Amit Levi, Shenghao Yang et al.

ICML 2024 • arXiv:2202.13060 • 43 citations
#448

The good, the bad and the ugly sides of data augmentation: An implicit spectral regularization perspective

Chi-Heng Lin, Chiraag Kaushik, Eva Dyer et al.

ICML 2024 • arXiv:2210.05021 • 43 citations
#449

MEMORYLLM: Towards Self-Updatable Large Language Models

Yu Wang, Yifan Gao, Xiusi Chen et al.

ICML 2024 • arXiv:2402.04624 • 43 citations
#450

SpargeAttention: Accurate and Training-free Sparse Attention Accelerating Any Model Inference

Jintao Zhang, Chendong Xiang, Haofeng Huang et al.

ICML 2025 • arXiv:2502.18137 • 43 citations
#451

CollabLLM: From Passive Responders to Active Collaborators

Shirley Wu, Michel Galley, Baolin Peng et al.

ICML 2025 (oral) • arXiv:2502.00640 • 43 citations
#452

Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs

Yeonhong Park, Jake Hyun, SangLyul Cho et al.

ICML 2024 • arXiv:2402.10517 • 43 citations
#453

STP: Self-play LLM Theorem Provers with Iterative Conjecturing and Proving

Kefan Dong, Tengyu Ma

ICML 2025 • arXiv:2502.00212 • 43 citations
#454

Diffusion Model-Augmented Behavioral Cloning

Shang-Fu Chen, Hsiang-Chun Wang, Ming-Hao Hsu et al.

ICML 2024 (oral) • arXiv:2302.13335 • 42 citations
#455

From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning

Wei Chen, Zhen Huang, Liang Xie et al.

ICML 2024 • arXiv:2409.01658 • 42 citations
#456

CARTE: Pretraining and Transfer for Tabular Learning

Myung Jun Kim, Leo Grinsztajn, Gael Varoquaux

ICML 2024 • arXiv:2402.16785 • 42 citations
#457

Outlier-Efficient Hopfield Layers for Large Transformer-Based Models

Jerry Yao-Chieh Hu, Pei-Hsuan Chang, Haozheng Luo et al.

ICML 2024 • arXiv:2404.03828 • 42 citations
#458

Fast Decision Boundary based Out-of-Distribution Detector

Litian Liu, Yao Qin

ICML 2024 • arXiv:2312.11536 • 42 citations
#459

Interpreting and Improving Large Language Models in Arithmetic Calculation

Wei Zhang, Wan Chaoqun, Yonggang Zhang et al.

ICML 2024 • arXiv:2409.01659 • 42 citations
#460

Probabilistic Forecasting with Stochastic Interpolants and Föllmer Processes

Yifan Chen, Mark Goldstein, Mengjian Hua et al.

ICML 2024 • arXiv:2403.13724 • 42 citations
#461

Is In-Context Learning in Large Language Models Bayesian? A Martingale Perspective

Fabian Falck, Ziyu Wang, Christopher Holmes

ICML 2024 • arXiv:2406.00793 • 42 citations
#462

TroVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic Tasks

Zhiruo Wang, Graham Neubig, Daniel Fried

ICML 2024 • arXiv:2401.12869 • 42 citations
#463

Distinguishing the Knowable from the Unknowable with Language Models

Gustaf Ahdritz, Tian Qin, Nikhil Vyas et al.

ICML 2024 • arXiv:2402.03563 • 41 citations
#464

Codebook Features: Sparse and Discrete Interpretability for Neural Networks

Alex Tamkin, Mohammad Taufeeque, Noah Goodman

ICML 2024 • arXiv:2310.17230 • 41 citations
#465

Non-Vacuous Generalization Bounds for Large Language Models

Sanae Lotfi, Marc Finzi, Yilun Kuang et al.

ICML 2024 • arXiv:2312.17173 • 41 citations
#466

Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning

Michal Nauman, Michał Bortkiewicz, Piotr Milos et al.

ICML 2024 • arXiv:2403.00514 • 41 citations
#467

Adjoint Sampling: Highly Scalable Diffusion Samplers via Adjoint Matching

Aaron Havens, Benjamin Kurt Miller, Bing Yan et al.

ICML 2025 • arXiv:2504.11713 • 41 citations
#468

Image Restoration Through Generalized Ornstein-Uhlenbeck Bridge

Yue Conghan, Zhengwei Peng, Junlong Ma et al.

ICML 2024 • arXiv:2312.10299 • 41 citations
#469

Symbolic Music Generation with Non-Differentiable Rule Guided Diffusion

Yujia Huang, Adishree Ghatare, Yuanzhe Liu et al.

ICML 2024 • arXiv:2402.14285 • 41 citations
#470

Consistent Diffusion Meets Tweedie: Training Exact Ambient Diffusion Models with Noisy Data

Giannis Daras, Alexandros Dimakis, Constantinos Daskalakis

ICML 2024 • arXiv:2404.10177 • 41 citations
#471

DiTAR: Diffusion Transformer Autoregressive Modeling for Speech Generation

Dongya Jia, Zhuo Chen, Jiawei Chen et al.

ICML 2025 • arXiv:2502.03930 • 41 citations
#472

Towards LLM Unlearning Resilient to Relearning Attacks: A Sharpness-Aware Minimization Perspective and Beyond

Chongyu Fan, Jinghan Jia, Yihua Zhang et al.

ICML 2025 • arXiv:2502.05374 • 41 citations
#473

IBD-PSC: Input-level Backdoor Detection via Parameter-oriented Scaling Consistency

Linshan Hou, Ruili Feng, Zhongyun Hua et al.

ICML 2024 • arXiv:2405.09786 • 41 citations
#474

Improving Diffusion Models for Inverse Problems Using Optimal Posterior Covariance

Xinyu Peng, Ziyang Zheng, Wenrui Dai et al.

ICML 2024 • arXiv:2402.02149 • 41 citations
#475

ReconBoost: Boosting Can Achieve Modality Reconcilement

Cong Hua, Qianqian Xu, Shilong Bao et al.

ICML 2024 • arXiv:2405.09321 • 41 citations
#476

Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search

Maohao Shen, Guangtao Zeng, Zhenting Qi et al.

ICML 2025 • arXiv:2502.02508 • 40 citations
#477

Agent Instructs Large Language Models to be General Zero-Shot Reasoners

Nicholas Crispino, Kyle Montgomery, Fankun Zeng et al.

ICML 2024 • arXiv:2310.03710 • 40 citations
#478

Improved Operator Learning by Orthogonal Attention

Zipeng Xiao, Zhongkai Hao, Bokai Lin et al.

ICML 2024 (spotlight) • arXiv:2310.12487 • 40 citations
#479

Superpoint Gaussian Splatting for Real-Time High-Fidelity Dynamic Scene Reconstruction

Diwen Wan, Ruijie Lu, Gang Zeng

ICML 2024 • arXiv:2406.03697 • 40 citations
#480

Multimodal Prototyping for cancer survival prediction

Andrew Song, Richard Chen, Guillaume Jaume et al.

ICML 2024 • arXiv:2407.00224 • 40 citations
#481

Generalized Neural Collapse for a Large Number of Classes

Jiachen Jiang, Jinxin Zhou, Peng Wang et al.

ICML 2024 • arXiv:2310.05351 • 40 citations
#482

In-Context Principle Learning from Mistakes

Tianjun Zhang, Aman Madaan, Luyu Gao et al.

ICML 2024 • arXiv:2402.05403 • 40 citations
#483

MoH: Multi-Head Attention as Mixture-of-Head Attention

Peng Jin, Bo Zhu, Li Yuan et al.

ICML 2025 • arXiv:2410.11842 • 40 citations
#484

Learning with 3D rotations, a hitchhiker's guide to SO(3)

Andreas René Geist, Jonas Frey, Mikel Zhobro et al.

ICML 2024 • arXiv:2404.11735 • 40 citations
#485

Feynman-Kac Correctors in Diffusion: Annealing, Guidance, and Product of Experts

Marta Skreta, Tara Akhound-Sadegh, Viktor Ohanesian et al.

ICML 2025 (spotlight) • arXiv:2503.02819 • 40 citations
#486

CVE-Bench: A Benchmark for AI Agents’ Ability to Exploit Real-World Web Application Vulnerabilities

Yuxuan Zhu, Antony Kellermann, Dylan Bowman et al.

ICML 2025 (spotlight) • arXiv:2503.17332 • 40 citations
#487

BAT: Learning to Reason about Spatial Sounds with Large Language Models

Zhisheng Zheng, Puyuan Peng, Ziyang Ma et al.

ICML 2024 • arXiv:2402.01591 • 40 citations
#488

Position: Graph Learning Will Lose Relevance Due To Poor Benchmarks

Maya Bechler-Speicher, Ben Finkelshtein, Fabrizio Frasca et al.

ICML 2025 • arXiv:2502.14546 • 40 citations
#489

Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension

Fan Yin, Jayanth Srinivasa, Kai-Wei Chang

ICML 2024 • arXiv:2402.18048 • 40 citations
#490

The Diffusion Duality

Subham Sekhar Sahoo, Justin Deschenaux, Aaron Gokaslan et al.

ICML 2025 • arXiv:2506.10892 • 40 citations
#491

Scalable AI Safety via Doubly-Efficient Debate

Jonah Brown-Cohen, Geoffrey Irving, Georgios Piliouras

ICML 2024 • arXiv:2311.14125 • 39 citations
#492

Conformal prediction for multi-dimensional time series by ellipsoidal sets

Chen Xu, Hanyang Jiang, Yao Xie

ICML 2024 (spotlight) • arXiv:2403.03850 • 39 citations
#493

Graph Generation with Diffusion Mixture

Jaehyeong Jo, Dongki Kim, Sung Ju Hwang

ICML 2024 • arXiv:2302.03596 • 39 citations
#494

OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction

Huang Huang, Fangchen Liu, Letian Fu et al.

ICML 2025 • arXiv:2503.03734 • 39 citations
#495

FlatQuant: Flatness Matters for LLM Quantization

Yuxuan Sun, Ruikang Liu, Haoli Bai et al.

ICML 2025 • arXiv:2410.09426 • 39 citations
#496

Improving the Diffusability of Autoencoders

Ivan Skorokhodov, Sharath Girish, Benran Hu et al.

ICML 2025 • arXiv:2502.14831 • 39 citations
#497

MuxServe: Flexible Spatial-Temporal Multiplexing for Multiple LLM Serving

Jiangfei Duan, Runyu Lu, Haojie Duanmu et al.

ICML 2024 (oral) • arXiv:2404.02015 • 39 citations
#498

Copyright Traps for Large Language Models

Matthieu Meeus, Igor Shilov, Manuel Faysse et al.

ICML 2024 • arXiv:2402.09363 • 39 citations
#499

The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents

Yatin Dandi, Emanuele Troiani, Luca Arnaboldi et al.

ICML 2024 • arXiv:2402.03220 • 39 citations
#500

Modeling Caption Diversity in Contrastive Vision-Language Pretraining

Samuel Lavoie, Polina Kirichenko, Mark Ibrahim et al.

ICML 2024 • arXiv:2405.00740 • 39 citations
#501

SafeArena: Evaluating the Safety of Autonomous Web Agents

Ada Tur, Nicholas Meade, Xing Han Lù et al.

ICML 2025 • arXiv:2503.04957 • 39 citations
#502

SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models

Wei Huang, Haotong Qin, Yangdong Liu et al.

ICML 2025 • arXiv:2405.14917 • 39 citations
#503

Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models

Tanmay Gautam, Youngsuk Park, Hao Zhou et al.

ICML 2024 • arXiv:2404.08080 • 39 citations
#504

Revisiting the Role of Language Priors in Vision-Language Models

Zhiqiu Lin, Xinyue Chen, Deepak Pathak et al.

ICML 2024 • arXiv:2306.01879 • 39 citations
#505

WMAdapter: Adding WaterMark Control to Latent Diffusion Models

Hai Ci, Yiren Song, Pei Yang et al.

ICML 2025 • arXiv:2406.08337 • 38 citations
#506

Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics on the Attention Landscape

Juno Kim, Taiji Suzuki

ICML 2024 • arXiv:2402.01258 • 38 citations
#507

Subgoal-based Demonstration Learning for Formal Theorem Proving

Xueliang Zhao, Wenda Li, Lingpeng Kong

ICML 2024 • arXiv:2305.16366 • 38 citations
#508

Smooth Tchebycheff Scalarization for Multi-Objective Optimization

Xi Lin, Xiaoyuan Zhang, Zhiyuan Yang et al.

ICML 2024 • arXiv:2402.19078 • 38 citations
#509

Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning?

Khashayar Gatmiry, Nikunj Saunshi, Sashank J. Reddi et al.

ICML 2024 • arXiv:2410.08292 • 38 citations
#510

FG-CLIP: Fine-Grained Visual and Textual Alignment

Chunyu Xie, Bin Wang, Fanjing Kong et al.

ICML 2025 • arXiv:2505.05071 • 38 citations
#511

Hypergraph-enhanced Dual Semi-supervised Graph Classification

Wei Ju, Zhengyang Mao, Siyu Yi et al.

ICML 2024 • arXiv:2405.04773 • 38 citations
#512

Graph-constrained Reasoning: Faithful Reasoning on Knowledge Graphs with Large Language Models

Linhao Luo, Zicheng Zhao, Reza Haffari et al.

ICML 2025 • arXiv:2410.13080 • 38 citations
#513

Efficient Exploration for LLMs

Vikranth Dwaracherla, Seyed Mohammad Asghari, Botao Hao et al.

ICML 2024 • arXiv:2402.00396 • 37 citations
#514

A Computational Framework for Solving Wasserstein Lagrangian Flows

Kirill Neklyudov, Rob Brekelmans, Alexander Tong et al.

ICML 2024 • arXiv:2310.10649 • 37 citations
#515

Potential Based Diffusion Motion Planning

Yunhao Luo, Chen Sun, Josh Tenenbaum et al.

ICML 2024 • arXiv:2407.06169 • 37 citations
#516

SAeUron: Interpretable Concept Unlearning in Diffusion Models with Sparse Autoencoders

Bartosz Cywiński, Kamil Deja

ICML 2025 • arXiv:2501.18052 • 37 citations
#517

On the Trajectory Regularity of ODE-based Diffusion Sampling

Defang Chen, Zhenyu Zhou, Can Wang et al.

ICML 2024 • arXiv:2405.11326 • 37 citations
#518

Diving into Underwater: Segment Anything Model Guided Underwater Salient Instance Segmentation and A Large-scale Dataset

Shijie Lian, Ziyi Zhang, Hua Li et al.

ICML 2024 • arXiv:2406.06039 • 37 citations
#519

Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models

Xin Zou, Yizhou Wang, Yibo Yan et al.

ICML 2025 • arXiv:2410.03577 • 37 citations
#520

Collapse or Thrive: Perils and Promises of Synthetic Data in a Self-Generating World

Joshua Kazdan, Rylan Schaeffer, Apratim Dey et al.

ICML 2025 • arXiv:2410.16713 • 37 citations
#521

Generalization to New Sequential Decision Making Tasks with In-Context Learning

Sharath Chandra Raparthy, Eric Hambro, Robert Kirk et al.

ICML 2024 • arXiv:2312.03801 • 37 citations
#522

Don't trust your eyes: on the (un)reliability of feature visualizations

Robert Geirhos, Roland S. Zimmermann, Blair Bilodeau et al.

ICML 2024 • arXiv:2306.04719 • 37 citations
#523

Multicalibration for Confidence Scoring in LLMs

Gianluca Detommaso, Martin A Bertran, Riccardo Fogliato et al.

ICML 2024 • arXiv:2404.04689 • 37 citations
#524

Which Attention Heads Matter for In-Context Learning?

Kayo Yin, Jacob Steinhardt

ICML 2025 • arXiv:2502.14010 • 37 citations
#525

Time Weaver: A Conditional Time Series Generation Model

Sai Shankar Narasimhan, Shubhankar Agarwal, Oguzhan Akcin et al.

ICML 2024 (spotlight) • arXiv:2403.02682 • 37 citations
#526

Compositional Text-to-Image Generation with Dense Blob Representations

Weili Nie, Sifei Liu, Morteza Mardani et al.

ICML 2024 • arXiv:2405.08246 • 37 citations
#527

Adversarial Robustness Limits via Scaling-Law and Human-Alignment Studies

Brian Bartoldson, James Diffenderfer, Konstantinos Parasyris et al.

ICML 2024 • arXiv:2404.09349 • 37 citations
#528

An Information-Theoretic Analysis of In-Context Learning

Hong Jun Jeon, Jason Lee, Qi Lei et al.

ICML 2024 • arXiv:2401.15530 • 37 citations
#529

Equivariant Frames and the Impossibility of Continuous Canonicalization

Nadav Dym, Hannah Lawrence, Jonathan Siegel

ICML 2024 • arXiv:2402.16077 • 36 citations
#530

VideoRoPE: What Makes for Good Video Rotary Position Embedding?

Xilin Wei, Xiaoran Liu, Yuhang Zang et al.

ICML 2025 (oral) • arXiv:2502.05173 • 36 citations
#531

AutoEval Done Right: Using Synthetic Data for Model Evaluation

Pierre Boyeau, Anastasios Angelopoulos, Tianle Li et al.

ICML 2025 • arXiv:2403.07008 • 36 citations
#532

On the Generalization of Stochastic Gradient Descent with Momentum

Ali Ramezani-Kebrya, Kimon Antonakopoulos, Volkan Cevher et al.

ICML 2024 • arXiv:1809.04564 • 36 citations
#533

Privacy-Preserving Instructions for Aligning Large Language Models

Da Yu, Peter Kairouz, Sewoong Oh et al.

ICML 2024 • arXiv:2402.13659 • 36 citations
#534

Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices

Nathaniel Cohen, Vladimir Kulikov, Matan Kleiner et al.

ICML 2024 (oral) • arXiv:2405.12211 • 36 citations
#535

Time-VLM: Exploring Multimodal Vision-Language Models for Augmented Time Series Forecasting

Siru Zhong, Weilin Ruan, Ming Jin et al.

ICML 2025 (oral) • arXiv:2502.04395 • 36 citations
#536

From Self-Attention to Markov Models: Unveiling the Dynamics of Generative Transformers

Muhammed Emrullah Ildiz, Yixiao Huang, Yingcong Li et al.

ICML 2024 • arXiv:2402.13512 • 36 citations
#537

DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching

Guanghe Li, Yixiang Shan, Zhengbang Zhu et al.

ICML 2024 • arXiv:2402.02439 • 36 citations
#538

Learning to Route LLMs with Confidence Tokens

Yu-Neng Chuang, Prathusha Sarma, Parikshit Gopalan et al.

ICML 2025 • arXiv:2410.13284 • 36 citations
#539

FedBPT: Efficient Federated Black-box Prompt Tuning for Large Language Models

Jingwei Sun, Ziyue Xu, Hongxu Yin et al.

ICML 2024 • arXiv:2310.01467 • 36 citations
#540

Transolver++: An Accurate Neural Solver for PDEs on Million-Scale Geometries

Huakun Luo, Haixu Wu, Hang Zhou et al.

ICML 2025 • arXiv:2502.02414 • 36 citations
#541

PepTune: De Novo Generation of Therapeutic Peptides with Multi-Objective-Guided Discrete Diffusion

Sophia Tang, Yinuo Zhang, Pranam Chatterjee

ICML 2025 • arXiv:2412.17780 • 35 citations
#542

Riemannian Preconditioned LoRA for Fine-Tuning Foundation Models

Fangzhao Zhang, Mert Pilanci

ICML 2024 • arXiv:2402.02347 • 35 citations
#543

Fair Resource Allocation in Multi-Task Learning

Hao Ban, Kaiyi Ji

ICML 2024 • arXiv:2402.15638 • 35 citations
#544

LoRA Training in the NTK Regime has No Spurious Local Minima

Uijeong Jang, Jason Lee, Ernest Ryu

ICML 2024 • arXiv:2402.11867 • 35 citations
#545

Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive?

Rylan Schaeffer, Hailey Schoelkopf, Brando Miranda et al.

ICML 2025 • arXiv:2406.04391 • 35 citations
#546

DeFoG: Discrete Flow Matching for Graph Generation

Yiming Qin, Manuel Madeira, Dorina Thanou et al.

ICML 2025 (oral) • arXiv:2410.04263 • 35 citations
#547

RepoAudit: An Autonomous LLM-Agent for Repository-Level Code Auditing

Jinyao Guo, Chengpeng Wang, Xiangzhe Xu et al.

ICML 2025 • arXiv:2501.18160 • 35 citations
#548

Universality of Linear Recurrences Followed by Non-linear Projections: Finite-Width Guarantees and Benefits of Complex Eigenvalues

Antonio Orvieto, Soham De, Caglar Gulcehre et al.

ICML 2024 • arXiv:2307.11888 • 35 citations
#549

A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks

Behrad Moniri, Donghwan Lee, Hamed Hassani et al.

ICML 2024 • arXiv:2310.07891 • 35 citations
#550

Navigating Complexity: Toward Lossless Graph Condensation via Expanding Window Matching

Yuchen Zhang, Tianle Zhang, Kai Wang et al.

ICML 2024 • arXiv:2402.05011 • 35 citations
#551

Larimar: Large Language Models with Episodic Memory Control

Payel Das, Subhajit Chaudhury, Elliot Nelson et al.

ICML 2024 • arXiv:2403.11901 • 35 citations
#552

Detecting Strategic Deception with Linear Probes

Nicholas Goldowsky-Dill, Bilal Chughtai, Stefan Heimersheim et al.

ICML 2025 • arXiv:2502.03407 • 35 citations
#553

CLIF: Complementary Leaky Integrate-and-Fire Neuron for Spiking Neural Networks

Yulong Huang, Xiaopeng Lin, Hongwei Ren et al.

ICML 2024 (oral) • arXiv:2402.04663 • 35 citations
#554

Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback

Yafu Li, Xuyang Hu, Xiaoye Qu et al.

ICML 2025 • arXiv:2501.12895 • 35 citations
#555

Modular Duality in Deep Learning

Jeremy Bernstein, Laker Newhouse

ICML 2025 • arXiv:2410.21265 • 35 citations
#556

Offline Training of Language Model Agents with Functions as Learnable Weights

Shaokun Zhang, Jieyu Zhang, Jiale Liu et al.

ICML 2024 • arXiv:2402.11359 • 35 citations
#557

Full-Atom Peptide Design based on Multi-modal Flow Matching

Jiahan Li, Chaoran Cheng, Zuofan Wu et al.

ICML 2024 • arXiv:2406.00735 • 35 citations
#558

Offline Actor-Critic Reinforcement Learning Scales to Large Models

Jost Tobias Springenberg, Abbas Abdolmaleki, Jingwei Zhang et al.

ICML 2024 (oral) • arXiv:2402.05546 • 35 citations
#559

RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences

Jie Cheng, Gang Xiong, Xingyuan Dai et al.

ICML 2024 (spotlight) • arXiv:2402.17257 • 35 citations
#560

SAM2Act: Integrating Visual Foundation Model with A Memory Architecture for Robotic Manipulation

Haoquan Fang, Markus Grotz, Wilbert Pumacay et al.

ICML 2025 • arXiv:2501.18564 • 35 citations
#561

Minimax Optimality of Score-based Diffusion Models: Beyond the Density Lower Bound Assumptions

Kaihong Zhang, Heqi Yin, Feng Liang et al.

ICML 2024 (spotlight) • arXiv:2402.15602 • 34 citations
#562

An Analysis of Quantile Temporal-Difference Learning

Mark Rowland, Remi Munos, Mohammad Gheshlaghi Azar et al.

ICML 2025 (oral) • arXiv:2301.04462 • 34 citations
#563

NeWRF: A Deep Learning Framework for Wireless Radiation Field Reconstruction and Channel Prediction

Haofan Lu, Christopher Vattheuer, Baharan Mirzasoleiman et al.

ICML 2024 • arXiv:2403.03241 • 34 citations
#564

Massive Values in Self-Attention Modules are the Key to Contextual Knowledge Understanding

Mingyu Jin, Kai Mei, Wujiang Xu et al.

ICML 2025 • arXiv:2502.01563 • 34 citations
#565

How Do Nonlinear Transformers Learn and Generalize in In-Context Learning?

Hongkang Li, Meng Wang, Songtao Lu et al.

ICML 2024 • arXiv:2402.15607 • 34 citations
#566

AffectGPT: A New Dataset, Model, and Benchmark for Emotion Understanding with Multimodal Large Language Models

Zheng Lian, Haoyu Chen, Lan Chen et al.

ICML 2025 (oral) • arXiv:2501.16566 • 34 citations
#567

PARDEN, Can You Repeat That? Defending against Jailbreaks via Repetition

Ziyang Zhang, Qizhen Zhang, Jakob Foerster

ICML 2024 • arXiv:2405.07932 • 34 citations
#568

Exploration and Anti-Exploration with Distributional Random Network Distillation

Kai Yang, Jian Tao, Jiafei Lyu et al.

ICML 2024 • arXiv:2401.09750 • 34 citations
#569

No Task Left Behind: Isotropic Model Merging with Common and Task-Specific Subspaces

Daniel Marczak, Simone Magistri, Sebastian Cygert et al.

ICML 2025 • arXiv:2502.04959 • 34 citations
#570

HexGen: Generative Inference of Large Language Model over Heterogeneous Environment

Youhe Jiang, Ran Yan, Xiaozhe Yao et al.

ICML 2024 • arXiv:2311.11514 • 34 citations
#571

Robust Autonomy Emerges from Self-Play

Marco Cusumano-Towner, David Hafner, Alexander Hertzberg et al.

ICML 2025 • arXiv:2502.03349 • 34 citations
#572

UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction

Perampalli Shravan Nayak, Xiangru Jian, Kevin Qinghong Lin et al.

ICML 2025 • arXiv:2503.15661 • 34 citations
#573

Lyapunov-stable Neural Control for State and Output Feedback: A Novel Formulation

Lujie Yang, Hongkai Dai, Zhouxing Shi et al.

ICML 2024 • arXiv:2404.07956 • 34 citations
#574

Algorithm and Hardness for Dynamic Attention Maintenance in Large Language Models

Jan van den Brand, Zhao Song, Tianyi Zhou

ICML 2024 • arXiv:2304.02207 • 34 citations
#575

SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization

Jialong Guo, Xinghao Chen, Yehui Tang et al.

ICML 2024 • arXiv:2405.11582 • 34 citations
#576

EquiPocket: an E(3)-Equivariant Geometric Graph Neural Network for Ligand Binding Site Prediction

Yang Zhang, Zhewei Wei, Ye Yuan et al.

ICML 2024 • arXiv:2302.12177 • 34 citations
#577

Assessing Large Language Models on Climate Information

Jannis Bulian, Mike Schäfer, Afra Amini et al.

ICML 2024 • arXiv:2310.02932 • 34 citations
#578

Flextron: Many-in-One Flexible Large Language Model

Ruisi Cai, Saurav Muralidharan, Greg Heinrich et al.

ICML 2024 • arXiv:2406.10260 • 34 citations
#579

On the Emergence of Position Bias in Transformers

Xinyi Wu, Yifei Wang, Stefanie Jegelka et al.

ICML 2025 • arXiv:2502.01951 • 34 citations
#580

SleepFM: Multi-modal Representation Learning for Sleep Across Brain Activity, ECG and Respiratory Signals

Rahul Thapa, Bryan He, Magnus Ruud Kjaer et al.

ICML 2024 • arXiv:2405.17766 • 34 citations
#581

LaMAGIC: Language-Model-based Topology Generation for Analog Integrated Circuits

Chen-Chia Chang, Yikang Shen, Shaoze Fan et al.

ICML 2024 • arXiv:2407.18269 • 33 citations
#582

Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation

Gauthier Guinet, Behrooz Tehrani, Anoop Deoras et al.

ICML 2024 • arXiv:2405.13622 • 33 citations
#583

Optimizing Large Language Model Training Using FP4 Quantization

Ruizhe Wang, Yeyun Gong, Xiao Liu et al.

ICML 2025 • arXiv:2501.17116 • 33 citations
#584

CATS: Enhancing Multivariate Time Series Forecasting by Constructing Auxiliary Time Series as Exogenous Variables

Jiecheng Lu, Xu Han, Sun et al.

ICML 2024 (oral) • arXiv:2403.01673 • 33 citations
#585

Locally Estimated Global Perturbations are Better than Local Perturbations for Federated Sharpness-aware Minimization

Ziqing Fan, Shengchao Hu, Jiangchao Yao et al.

ICML 2024 (spotlight) • arXiv:2405.18890 • 33 citations
#586

Time Series Diffusion in the Frequency Domain

Jonathan Crabbé, Nicolas Huynh, Jan Stanczuk et al.

ICML 2024 • arXiv:2402.05933 • 33 citations
#587

AST-T5: Structure-Aware Pretraining for Code Generation and Understanding

Linyuan Gong, Mostafa Elhoushi, Alvin Cheung

ICML 2024 • arXiv:2401.03003 • 33 citations
#588

In value-based deep reinforcement learning, a pruned network is a good network

Johan Obando Ceron, Aaron Courville, Pablo Samuel Castro

ICML 2024 • arXiv:2402.12479 • 33 citations
#589

Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks

Atli Kosson, Bettina Messmer, Martin Jaggi

ICML 2024 • arXiv:2305.17212 • 33 citations
#590

Second-Order Uncertainty Quantification: A Distance-Based Approach

Yusuf Sale, Viktor Bengs, Michele Caprio et al.

ICML 2024 (spotlight) • arXiv:2312.00995 • 33 citations
#591

Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization

Haocheng Xi, Yuxiang Chen, Kang Zhao et al.

ICML 2024 (spotlight) • arXiv:2403.12422 • 33 citations
#592

Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making

Vivek Myers, Chongyi Zheng, Anca Dragan et al.

ICML 2024 (oral) • arXiv:2406.17098 • 33 citations
#593

The dark side of the forces: assessing non-conservative force models for atomistic machine learning

Filippo Bigi, Marcel Langer, Michele Ceriotti

ICML 2025 (oral) • arXiv:2412.11569 • 33 citations
#594

High-Dimensional Prediction for Sequential Decision Making

Georgy Noarov, Ramya Ramalingam, Aaron Roth et al.

ICML 2025 (oral) • arXiv:2310.17651 • 33 citations
#595

DEFAME: Dynamic Evidence-based FAct-checking with Multimodal Experts

Tobias Braun, Mark Rothermel, Marcus Rohrbach et al.

ICML 2025 (oral) • arXiv:2412.10510 • 33 citations
#596

Diverging Preferences: When do Annotators Disagree and do Models Know?

Michael Zhang, Zhilin Wang, Jena Hwang et al.

ICML 2025 • arXiv:2410.14632 • 33 citations
#597

Auto-Encoding Morph-Tokens for Multimodal LLM

Kaihang Pan, Siliang Tang, Juncheng Li et al.

ICML 2024 (spotlight) • arXiv:2405.01926 • 32 citations
#598

Position: Measure Dataset Diversity, Don't Just Claim It

Dora Zhao, Jerone Andrews, Orestis Papakyriakopoulos et al.

ICML 2024 • arXiv:2407.08188 • 32 citations
#599

INViT: A Generalizable Routing Problem Solver with Invariant Nested View Transformer

Han Fang, Zhihao Song, Paul Weng et al.

ICML 2024 • arXiv:2402.02317 • 32 citations
#600

Archetypal SAE: Adaptive and Stable Dictionary Learning for Concept Extraction in Large Vision Models

Thomas Fel, Ekdeep Singh Lubana, Jacob Prince et al.

ICML 2025 • arXiv:2502.12892 • 32 citations