Most Cited ICML "clustering cohesion" Papers
5,975 papers found • Page 5 of 30
Position: Why We Must Rethink Empirical Research in Machine Learning
Moritz Herrmann, F. Julian D. Lange, Katharina Eggensperger et al.
Equivariant Diffusion for Crystal Structure Prediction
Peijia Lin, Pin Chen, Rui Jiao et al.
Position: Key Claims in LLM Research Have a Long Tail of Footnotes
Anna Rogers, Sasha Luccioni
Comparing Graph Transformers via Positional Encodings
Mitchell Black, Zhengchao Wan, Gal Mishne et al.
Efficient and Effective Time-Series Forecasting with Spiking Neural Networks
Changze Lv, Yansen Wang, Dongqi Han et al.
On Mechanistic Knowledge Localization in Text-to-Image Generative Models
Samyadeep Basu, Keivan Rezaei, Priyatham Kattakinda et al.
Sampling in Unit Time with Kernel Fisher-Rao Flow
Aimee Maurais, Youssef Marzouk
Position: Evaluating Generative AI Systems Is a Social Science Measurement Challenge
Hanna Wallach, Meera Desai, A. Feder Cooper et al.
Compute Better Spent: Replacing Dense Layers with Structured Matrices
Shikai Qiu, Andres Potapczynski, Marc Finzi et al.
PolySketchFormer: Fast Transformers via Sketching Polynomial Kernels
Praneeth Kacham, Vahab Mirrokni, Peilin Zhong
Rethinking Data Shapley for Data Selection Tasks: Misleads and Merits
Jiachen Wang, Tianji Yang, James Zou et al.
DySLIM: Dynamics Stable Learning by Invariant Measure for Chaotic Systems
Yair Schiff, Zhong Yi Wan, Jeffrey Parker et al.
Exploring Criteria of Loss Reweighting to Enhance LLM Unlearning
Puning Yang, Qizhou Wang, Zhuo Huang et al.
From Language Models over Tokens to Language Models over Characters
Tim Vieira, Benjamin LeBrun, Mario Giulianelli et al.
On Least Square Estimation in Softmax Gating Mixture of Experts
Huy Nguyen, Nhat Ho, Alessandro Rinaldo
Optimizing Language Models for Inference Time Objectives using Reinforcement Learning
Yunhao Tang, Kunhao Zheng, Gabriel Synnaeve et al.
Machine Vision Therapy: Multimodal Large Language Models Can Enhance Visual Robustness via Denoising In-Context Learning
Zhuo Huang, Chang Liu, Yinpeng Dong et al.
Mastering Board Games by External and Internal Planning with Language Models
John Schultz, Jakub Adamek, Matej Jusup et al.
Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation
Zhenyu He, Guhao Feng, Shengjie Luo et al.
The Good, The Bad, and Why: Unveiling Emotions in Generative AI
Cheng Li, Jindong Wang, Yixuan Zhang et al.
Structure-based drug design by denoising voxel grids
Pedro O. Pinheiro, Arian Jamasb, Omar Mahmood et al.
A Simple Model of Inference Scaling Laws
Noam Levi
TabPFN Unleashed: A Scalable and Effective Solution to Tabular Classification Problems
Si-Yang Liu, Han-Jia Ye
Hyperbolic Geometric Latent Diffusion Model for Graph Generation
Xingcheng Fu, Yisen Gao, Yuecen Wei et al.
Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT
Jon Saad-Falcon, Daniel Y Fu, Simran Arora et al.
SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation
Zihan Liu, Shuangrui Ding, Zhixiong Zhang et al.
M+: Extending MemoryLLM with Scalable Long-Term Memory
Yu Wang, Dmitry Krotov, Yuanzhe Hu et al.
Automated Hypothesis Validation with Agentic Sequential Falsifications
Kexin Huang, Ying Jin, Ryan Li et al.
TinyTrain: Resource-Aware Task-Adaptive Sparse Training of DNNs at the Data-Scarce Edge
Young Kwon, Rui Li, Stylianos Venieris et al.
Projecting Molecules into Synthesizable Chemical Spaces
Shitong Luo, Wenhao Gao, Zuofan Wu et al.
Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise
Kwangjun Ahn, Zhiyu Zhang, Yunbum Kook et al.
Reinforced Lifelong Editing for Language Models
Zherui Li, Houcheng Jiang, Hao Chen et al.
Candidate Pseudolabel Learning: Enhancing Vision-Language Models by Prompt Tuning with Unlabeled Data
Jiahan Zhang, Qi Wei, Feng Liu et al.
STEER: Assessing the Economic Rationality of Large Language Models
Narun Raman, Taylor Lundy, Samuel Joseph Amouyal et al.
Attribution-based Explanations that Provide Recourse Cannot be Robust
Hidde Fokkema, Rianne de Heide, Tim van Erven
How Private are DP-SGD Implementations?
Lynn Chua, Badih Ghazi, Pritish Kamath et al.
Nonparametric Modern Hopfield Models
Jerry Yao-Chieh Hu, Bo-Yu Chen, Dennis Wu et al.
Demystifying Cost-Efficiency in LLM Serving over Heterogeneous GPUs
Youhe Jiang, Fangcheng Fu, Xiaozhe Yao et al.
Latent Space Symmetry Discovery
Jianke Yang, Nima Dehmamy, Robin Walters et al.
Learning Reward for Robot Skills Using Large Language Models via Self-Alignment
Yuwei Zeng, Yao Mu, Lin Shao
RNAFlow: RNA Structure & Sequence Design via Inverse Folding-Based Flow Matching
Divya Nori, Wengong Jin
DPZero: Private Fine-Tuning of Language Models without Backpropagation
Liang Zhang, Bingcong Li, Kiran Thekumparampil et al.
Optimal Ridge Regularization for Out-of-Distribution Prediction
Pratik Patil, Jin-Hong Du, Ryan Tibshirani
Is Noise Conditioning Necessary for Denoising Generative Models?
Qiao Sun, Zhicheng Jiang, Hanhong Zhao et al.
OneForecast: A Universal Framework for Global and Regional Weather Forecasting
Yuan Gao, Hao Wu, Ruiqi Shu et al.
Causal Representation Learning Made Identifiable by Grouping of Observational Variables
Hiroshi Morioka, Aapo Hyvarinen
Towards a General Time Series Forecasting Model with Unified Representation and Adaptive Transfer
Yihang Wang, Yuying Qiu, Peng Chen et al.
Rethinking Momentum Knowledge Distillation in Online Continual Learning
Nicolas Michel, Maorong Wang, Ling Xiao et al.
Estimating Canopy Height at Scale
Jan Pauls, Max Zimmer, Una Kelly et al.
Position: Understanding LLMs Requires More Than Statistical Generalization
Patrik Reizinger, Szilvia Ujváry, Anna Mészáros et al.
Position: AI Evaluation Should Learn from How We Test Humans
Yan Zhuang, Qi Liu, Zachary Pardos et al.
Re-Dock: Towards Flexible and Realistic Molecular Docking with Diffusion Bridge
Yufei Huang, Odin Zhang, Lirong Wu et al.
Compute or Load KV Cache? Why Not Both?
Shuowei Jin, Xueshen Liu, Qingzhao Zhang et al.
Position: Rethinking Post-Hoc Search-Based Neural Approaches for Solving Large-Scale Traveling Salesman Problems
Yifan Xia, Xianliang Yang, Zichuan Liu et al.
Towards Theoretical Understandings of Self-Consuming Generative Models
Shi Fu, Sen Zhang, Yingjie Wang et al.
Erasing the Bias: Fine-Tuning Foundation Models for Semi-Supervised Learning
Kai Gan, Tong Wei
A Sparsity Principle for Partially Observable Causal Representation Learning
Danru Xu, Dingling Yao, Sébastien Lachapelle et al.
Fixing the Double Penalty in Data-Driven Weather Forecasting Through a Modified Spherical Harmonic Loss Function
Christopher Subich, Syed Husain, Leo Separovic et al.
Generalization in Kernel Regression Under Realistic Assumptions
Daniel Barzilai, Ohad Shamir
Fourier Position Embedding: Enhancing Attention’s Periodic Extension for Length Generalization
Ermo Hua, Che Jiang, Xingtai Lv et al.
Self-Improving Transformers Overcome Easy-to-Hard and Length Generalization Challenges
Nayoung Lee, Jack Cai, Avi Schwarzschild et al.
Converting Transformers to Polynomial Form for Secure Inference Over Homomorphic Encryption
Itamar Zimerman, Moran Baruch, Nir Drucker et al.
Approximate Nearest Neighbor Search with Window Filters
Josh Engels, Ben Landrum, Shangdi Yu et al.
Transformers Provably Learn Sparse Token Selection While Fully-Connected Nets Cannot
Zixuan Wang, Stanley Wei, Daniel Hsu et al.
Discovering Environments with XRM
Mohammad Pezeshki, Diane Bouchacourt, Mark Ibrahim et al.
Delta Decompression for MoE-based LLMs Compression
Hao Gu, Wei Li, Lujun Li et al.
Membership Inference Attacks on Diffusion Models via Quantile Regression
Shuai Tang, Steven Wu, Sergul Aydore et al.
Rethinking External Slow-Thinking: From Snowball Errors to Probability of Correct Reasoning
Zeyu Gan, Yun Liao, Yong Liu
TimeSiam: A Pre-Training Framework for Siamese Time-Series Modeling
Jiaxiang Dong, Haixu Wu, Yuxuan Wang et al.
Monte Carlo Tree Diffusion for System 2 Planning
Jaesik Yoon, Hyeonseo Cho, Doojin Baek et al.
Reducing Tool Hallucination via Reliability Alignment
Hongshen Xu, Zichen Zhu, Lei Pan et al.
Moreau Envelope for Nonconvex Bi-Level Optimization: A Single-Loop and Hessian-Free Solution Strategy
Risheng Liu, Zhu Liu, Wei Yao et al.
Learning to Keep a Promise: Scaling Language Model Decoding Parallelism with Learned Asynchronous Decoding
Tian Jin, Ellie Cheng, Zachary Ankner et al.
Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy Actor-Critic
Tianying Ji, Yu Luo, Fuchun Sun et al.
Unsupervised Evaluation of Code LLMs with Round-Trip Correctness
Miltiadis Allamanis, Sheena Panthaplackel, Pengcheng Yin
Decision Theoretic Foundations for Conformal Prediction: Optimal Uncertainty Quantification for Risk-Averse Agents
Shayan Kiyani, George Pappas, Aaron Roth et al.
Residual Quantization with Implicit Neural Codebooks
Iris Huijben, Matthijs Douze, Matthew Muckley et al.
Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead
Rickard Gabrielsson, Jiacheng Zhu, Onkar Bhardwaj et al.
Memorization Through the Lens of Curvature of Loss Function Around Samples
Isha Garg, Deepak Ravikumar, Kaushik Roy
InfAlign: Inference-aware language model alignment
Ananth Balashankar, Ziteng Sun, Jonathan Berant et al.
Improving LLM Safety Alignment with Dual-Objective Optimization
Xuandong Zhao, Will Cai, Tianneng Shi et al.
Purify Unlearnable Examples via Rate-Constrained Variational Autoencoders
Yi Yu, Yufei Wang, Song Xia et al.
TimeBridge: Non-Stationarity Matters for Long-term Time Series Forecasting
Peiyuan Liu, Beiliang Wu, Yifan Hu et al.
DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents
Yilun Xu, Gabriele Corso, Tommi Jaakkola et al.
Scaling Video-Language Models to 10K Frames via Hierarchical Differential Distillation
Chuanqi Cheng, Jian Guan, Wei Wu et al.
IIANet: An Intra- and Inter-Modality Attention Network for Audio-Visual Speech Separation
Kai Li, Runxuan Yang, Fuchun Sun et al.
Compositional Few-Shot Class-Incremental Learning
Yixiong Zou, Shanghang Zhang, Haichen Zhou et al.
Few-Shot Character Understanding in Movies as an Assessment to Meta-Learning of Theory-of-Mind
Mo Yu, Qiujing Wang, Shunchi Zhang et al.
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback
Songyang Gao, Qiming Ge, Wei Shen et al.
Understanding the Effects of Iterative Prompting on Truthfulness
Satyapriya Krishna, Chirag Agarwal, Himabindu Lakkaraju
A sampling theory perspective on activations for implicit neural representations
Hemanth Saratchandran, Sameera Ramasinghe, Violetta Shevchenko et al.
MELON: Provable Defense Against Indirect Prompt Injection Attacks in AI Agents
Kaijie Zhu, Xianjun Yang, Jindong Wang et al.
ReVISE: Learning to Refine at Test-Time via Intrinsic Self-Verification
Hyunseok Lee, Seunghyuk Oh, Jaehyung Kim et al.
BaxBench: Can LLMs Generate Correct and Secure Backends?
Mark Vero, Niels Mündler, Viktor Chibotaru et al.
How to Trace Latent Generative Model Generated Images without Artificial Watermark?
Zhenting Wang, Vikash Sehwag, Chen Chen et al.
BECoTTA: Input-dependent Online Blending of Experts for Continual Test-time Adaptation
Daeun Lee, Jaehong Yoon, Sung Ju Hwang
Graph-based Time Series Clustering for End-to-End Hierarchical Forecasting
Andrea Cini, Danilo Mandic, Cesare Alippi
Reward Model Learning vs. Direct Policy Optimization: A Comparative Analysis of Learning from Human Preferences
Andi Nika, Debmalya Mandal, Parameswaran Kamalaruban et al.
Homomorphism Counts for Graph Neural Networks: All About That Basis
Emily Jin, Michael Bronstein, Ismail Ceylan et al.
Gaussian Processes on Cellular Complexes
Mathieu Alain, So Takao, Brooks Paige et al.
Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More
Feng Wang, Yaodong Yu, Wei Shao et al.
Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning
Shibo Jie, Yehui Tang, Ning Ding et al.
Discovering Symmetry Breaking in Physical Systems with Relaxed Group Convolution
Rui Wang, Elyssa Hofgard, Han Gao et al.
Symmetry Induces Structure and Constraint of Learning
Liu Ziyin
EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM
Zhuofan Zong, Dongzhi Jiang, Bingqi Ma et al.
BEST-Route: Adaptive LLM Routing with Test-Time Optimal Compute
Dujian Ding, Ankur Mallick, Shaokun Zhang et al.
DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs
Jongwoo Ko, Tianyi Chen, Sungnyun Kim et al.
Wyckoff Transformer: Generation of Symmetric Crystals
Nikita Kazeev, Wei Nong, Ignat Romanov et al.
AlphaVerus: Bootstrapping Formally Verified Code Generation through Self-Improving Translation and Treefinement
Pranjal Aggarwal, Bryan Parno, Sean Welleck
What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks
Xingwu Chen, Difan Zou
VersaPRM: Multi-Domain Process Reward Model via Synthetic Reasoning Data
Thomas Zeng, Shuibai Zhang, Shutong Wu et al.
LSEnet: Lorentz Structural Entropy Neural Network for Deep Graph Clustering
Li Sun, Zhenhao Huang, Hao Peng et al.
CRANE: Reasoning with constrained LLM generation
Debangshu Banerjee, Tarun Suresh, Shubham Ugare et al.
SpikeZIP-TF: Conversion is All You Need for Transformer-based SNN
Kang You, Zekai Xu, Chen Nie et al.
StableMask: Refining Causal Masking in Decoder-only Transformer
Qingyu Yin, Xuzheng He, Xiang Zhuang et al.
ReGAL: Refactoring Programs to Discover Generalizable Abstractions
Elias Stengel-Eskin, Archiki Prasad, Mohit Bansal
LEAPS: A discrete neural sampler via locally equivariant networks
Peter Holderrieth, Michael Albergo, Tommi Jaakkola
Investigating Non-Transitivity in LLM-as-a-Judge
Yi Xu, Laura Ruis, Tim Rocktäschel et al.
P(all-atom) Is Unlocking New Path For Protein Design
Wei Qu, Jiawei Guan, Rui Ma et al.
Great Models Think Alike and this Undermines AI Oversight
Shashwat Goel, Joschka Strüber, Ilze Amanda Auzina et al.
Is Temperature Sample Efficient for Softmax Gaussian Mixture of Experts?
Huy Nguyen, Pedram Akbarian, Nhat Ho
KAN-AD: Time Series Anomaly Detection with Kolmogorov–Arnold Networks
Quan Zhou, Changhua Pei, Fei Sun et al.
Position: Uncertainty Quantification Needs Reassessment for Large Language Model Agents
Michael Kirchhof, Gjergji Kasneci, Enkelejda Kasneci
Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation
Yunheng Li, Zhong-Yu Li, Quan-Sheng Zeng et al.
CLIPZyme: Reaction-Conditioned Virtual Screening of Enzymes
Peter Mikhael, Itamar Chinn, Regina Barzilay
Graph-based Forecasting with Missing Data through Spatiotemporal Downsampling
Ivan Marisca, Cesare Alippi, Filippo Maria Bianchi
OptMATH: A Scalable Bidirectional Data Synthesis Framework for Optimization Modeling
Hongliang Lu, Zhonglin Xie, Yaoyu Wu et al.
Rethinking Aleatoric and Epistemic Uncertainty
Freddie Bickford Smith, Jannik Kossen, Eleanor Trollope et al.
On the Guidance of Flow Matching
Ruiqi Feng, Chenglei Yu, Wenhao Deng et al.
LaRA: Benchmarking Retrieval-Augmented Generation and Long-Context LLMs – No Silver Bullet for LC or RAG Routing
Kuan Li, Liwen Zhang, Yong Jiang et al.
MoMo: Momentum Models for Adaptive Learning Rates
Fabian Schaipp, Ruben Ohana, Michael Eickenberg et al.
From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories, and Applications
Ajay Jaiswal, Yifan Wang, Lu Yin et al.
Interpretability Illusions in the Generalization of Simplified Models
Dan Friedman, Andrew Lampinen, Lucas Dixon et al.
Parrot: Multilingual Visual Instruction Tuning
Hai-Long Sun, Da-Wei Zhou, Yang Li et al.
Neural Networks Learn Statistics of Increasing Complexity
Nora Belrose, Quintin Pope, Lucia Quirke et al.
ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models
Rohan Wadhawan, Hritik Bansal, Kai-Wei Chang et al.
Local vs. Global Interpretability: A Computational Complexity Perspective
Shahaf Bassan, Guy Amir, Guy Katz
DreamDPO: Aligning Text-to-3D Generation with Human Preferences via Direct Preference Optimization
Zhenglin Zhou, Xiaobo Xia, Fan Ma et al.
Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings
Kevin Frans, Seohong Park, Pieter Abbeel et al.
Inference-Time Alignment of Diffusion Models with Direct Noise Optimization
Zhiwei Tang, Jiangweizhi Peng, Jiasheng Tang et al.
Auditing $f$-differential privacy in one run
Saeed Mahloujifar, Luca Melis, Kamalika Chaudhuri
Emoji Attack: Enhancing Jailbreak Attacks Against Judge LLM Detection
Zhipeng Wei, Yuqi Liu, N. Benjamin Erichson
Adaptive Message Passing: A General Framework to Mitigate Oversmoothing, Oversquashing, and Underreaching
Federico Errica, Henrik Christiansen, Viktor Zaverkin et al.
Prompting a Pretrained Transformer Can Be a Universal Approximator
Aleksandar Petrov, Phil Torr, Adel Bibi
Universal Length Generalization with Turing Programs
Kaiying Hou, David Brandfonbrener, Sham Kakade et al.
Emergent Symbolic Mechanisms Support Abstract Reasoning in Large Language Models
Yukang Yang, Declan Campbell, Kaixuan Huang et al.
Flexible and Efficient Grammar-Constrained Decoding
Kanghee Park, Timothy Zhou, Loris D'Antoni
Conformal Prediction with Learned Features
Shayan Kiyani, George J. Pappas, Hamed Hassani
Trustless Audits without Revealing Data or Models
Suppakit Waiwitlikhit, Ion Stoica, Yi Sun et al.
Equivariance via Minimal Frame Averaging for More Symmetries and Efficiency
Yuchao Lin, Jacob Helwig, Shurui Gui et al.
Overcoming Data and Model heterogeneities in Decentralized Federated Learning via Synthetic Anchors
Chun-Yin Huang, Kartik Srinivas, Xin Zhang et al.
Sparse Autoencoders for Hypothesis Generation
Rajiv Movva, Kenny Peng, Nikhil Garg et al.
HAMLET: Graph Transformer Neural Operator for Partial Differential Equations
Andrey Bryutkin, Jiahao Huang, Zhongying Deng et al.
ITFormer: Bridging Time Series and Natural Language for Multi-Modal QA with Large-Scale Multitask Dataset
Yilin Wang, Peixuan Lei, Jie Song et al.
Softmax is not Enough (for Sharp Size Generalisation)
Petar Veličković, Christos Perivolaropoulos, Federico Barbero et al.
KBQA-o1: Agentic Knowledge Base Question Answering with Monte Carlo Tree Search
Haoran Luo, Haihong E, Yikai Guo et al.
Why Do You Grok? A Theoretical Analysis on Grokking Modular Addition
Mohamad Amin Mohamadi, Zhiyuan Li, Lei Wu et al.
What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding
Hongkang Li, Meng Wang, Tengfei Ma et al.
Conformal Validity Guarantees Exist for Any Data Distribution (and How to Find Them)
Drew Prinster, Samuel Stanton, Anqi Liu et al.
Interpreting CLIP with Hierarchical Sparse Autoencoders
Vladimir Zaigrajew, Hubert Baniecki, Przemysław Biecek
In-Context Decision Transformer: Reinforcement Learning via Hierarchical Chain-of-Thought
Sili Huang, Jifeng Hu, Hechang Chen et al.
Liouville Flow Importance Sampler
Yifeng Tian, Nishant Panda, Yen Ting Lin
Pre-training Auto-regressive Robotic Models with 4D Representations
Dantong Niu, Yuvan Sharma, Haoru Xue et al.
Beyond Regular Grids: Fourier-Based Neural Operators on Arbitrary Domains
Levi Lingsch, Mike Yan Michelis, Emmanuel de Bézenac et al.
Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective
Wu Lin, Felix Dangel, Runa Eschenhagen et al.
PDHG-Unrolled Learning-to-Optimize Method for Large-Scale Linear Programming
Bingheng Li, Linxin Yang, Yupeng Chen et al.
Position: Leverage Foundational Models for Black-Box Optimization
Xingyou Song, Yingtao Tian, Robert Lange et al.
CHEMREASONER: Heuristic Search over a Large Language Model’s Knowledge Space using Quantum-Chemical Feedback
Henry W. Sprueill, Carl Edwards, Khushbu Agarwal et al.
Empowering Graph Invariance Learning with Deep Spurious Infomax
Tianjun Yao, Yongqiang Chen, Zhenhao Chen et al.
Reinforce LLM Reasoning through Multi-Agent Reflection
Yurun Yuan, Tengyang Xie
Adversaries Can Misuse Combinations of Safe Models
Erik Jones, Anca Dragan, Jacob Steinhardt
Multi-Domain Graph Foundation Models: Robust Knowledge Transfer via Topology Alignment
Shuo Wang, Bokui Wang, Zhixiang Shen et al.
Temporal Query Network for Efficient Multivariate Time Series Forecasting
Shengsheng Lin, Haojun Chen, Haijie Wu et al.
FairProof: Confidential and Certifiable Fairness for Neural Networks
Chhavi Yadav, Amrita Roy Chowdhury, Dan Boneh et al.
Learn from Downstream and Be Yourself in Multimodal Large Language Models Fine-Tuning
Wenke Huang, Jian Liang, Zekun Shi et al.
Improved Generalization of Weight Space Networks via Augmentations
Aviv Shamsian, Aviv Navon, David Zhang et al.
Isometric Representation Learning for Disentangled Latent Space of Diffusion Models
Jaehoon Hahm, Junho Lee, Sunghyun Kim et al.
Automated Benchmark Generation for Repository-Level Coding Tasks
Konstantinos Vergopoulos, Mark Müller, Martin Vechev
Subobject-level Image Tokenization
Delong Chen, Samuel Cahyawijaya, Jianfeng Liu et al.
Drug Discovery with Dynamic Goal-aware Fragments
Seul Lee, Seanie Lee, Kenji Kawaguchi et al.
Memory Layers at Scale
Vincent-Pierre Berges, Barlas Oğuz, Daniel Haziza et al.
Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence
Shangbin Feng, Zifeng Wang, Yike Wang et al.
FreeBind: Free Lunch in Unified Multimodal Space via Knowledge Fusion
Zehan Wang, Ziang Zhang, Xize Cheng et al.
Target Concrete Score Matching: A Holistic Framework for Discrete Diffusion
Ruixiang Zhang, Shuangfei Zhai, Yizhe Zhang et al.
Explorations of Self-Repair in Language Models
Cody Rushing, Neel Nanda
The Surprising Agreement Between Convex Optimization Theory and Learning-Rate Scheduling for Large Model Training
Fabian Schaipp, Alexander Hägele, Adrien Taylor et al.
Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling
Raunaq Bhirangi, Chenyu Wang, Venkatesh Pattabiraman et al.
Position: Optimization in SciML Should Employ the Function Space Geometry
Johannes Müller, Marius Zeinhofer
Distillation of Discrete Diffusion through Dimensional Correlations
Satoshi Hayakawa, Yuhta Takida, Masaaki Imaizumi et al.
Inherent Trade-Offs between Diversity and Stability in Multi-Task Benchmarks
Guanhua Zhang, Moritz Hardt
KVTuner: Sensitivity-Aware Layer-Wise Mixed-Precision KV Cache Quantization for Efficient and Nearly Lossless LLM Inference
Xing Li, Zeyu Xing, Yiming Li et al.
The Privacy Power of Correlated Noise in Decentralized Learning
Youssef Allouah, Anastasiia Koloskova, Aymane Firdoussi et al.
Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs
Jordan Dotzel, Yuzong Chen, Bahaa Kotb et al.
On the Expressive Power of Spectral Invariant Graph Neural Networks
Bohang Zhang, Lingxiao Zhao, Haggai Maron
Multi-Source Conformal Inference Under Distribution Shift
Yi Liu, Alexander Levis, Sharon-Lise Normand et al.
Scaling Tractable Probabilistic Circuits: A Systems Perspective
Anji Liu, Kareem Ahmed, Guy Van den Broeck
MCU: An Evaluation Framework for Open-Ended Game Agents
Xinyue Zheng, Haowei Lin, Kaichen He et al.
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
Amrit Singh Bedi, Anjaly Parayil, Junyu Zhang et al.
ITBench: Evaluating AI Agents across Diverse Real-World IT Automation Tasks
Saurabh Jha, Rohan Arora, Yuji Watanabe et al.
Understanding the Learning Dynamics of Alignment with Human Feedback
Shawn Im, Sharon Li