Most Cited ICML "low precision integers" Papers
5,975 papers found • Page 3 of 30
FlipAttack: Jailbreak LLMs via Flipping
Yue Liu, Xiaoxin He, Miao Xiong et al.
FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching
Sucheng Ren, Qihang Yu, Ju He et al.
Deep Networks Always Grok and Here is Why
Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk
On the Embedding Collapse when Scaling up Recommendation Models
Xingzhuo Guo, Junwei Pan, Ximei Wang et al.
An Analysis of Linear Time Series Forecasting Models
William Toner, Luke Darlow
AnyEdit: Edit Any Knowledge Encoded in Language Models
Houcheng Jiang, Junfeng Fang, Ningyu Zhang et al.
Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration
Zhongzhi Yu, Zheng Wang, Yonggan Fu et al.
eCeLLM: Generalizing Large Language Models for E-commerce from Large-scale, High-quality Instruction Data
Bo Peng, Xinyi Ling, Ziru Chen et al.
Empirical Design in Reinforcement Learning
Andrew Patterson, Samuel F Neumann, Martha White et al.
Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas
Shiqi Chen, Tongyao Zhu, Ruochen Zhou et al.
Dual Operating Modes of In-Context Learning
Ziqian Lin, Kangwook Lee
CasCast: Skillful High-resolution Precipitation Nowcasting via Cascaded Modelling
Junchao Gong, Lei Bai, Peng Ye et al.
Active Preference Learning for Large Language Models
William Muldrew, Peter Hayes, Mingtian Zhang et al.
Improving fine-grained understanding in image-text pre-training
Ioana Bica, Anastasija Ilic, Matthias Bauer et al.
RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis
Yao Mu, Junting Chen, Qing-Long Zhang et al.
Which Agent Causes Task Failures and When? On Automated Failure Attribution of LLM Multi-Agent Systems
Shaokun Zhang, Ming Yin, Jieyu Zhang et al.
On Computational Limits of Modern Hopfield Models: A Fine-Grained Complexity Analysis
Jerry Yao-Chieh Hu, Thomas Lin, Zhao Song et al.
On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents
Jen-Tse Huang, Jiaxu Zhou, Tailin Jin et al.
Context is Key: A Benchmark for Forecasting with Essential Textual Information
Andrew Williams, Arjun Ashok, Étienne Marcotte et al.
A Multimodal Automated Interpretability Agent
Tamar Rott Shaham, Sarah Schwettmann, Franklin Wang et al.
A Resilient and Accessible Distribution-Preserving Watermark for Large Language Models
Yihan Wu, Zhengmian Hu, Junfeng Guo et al.
FireFlow: Fast Inversion of Rectified Flow for Image Semantic Editing
Yingying Deng, Xiangyu He, Changwang Mei et al.
UP-VLA: A Unified Understanding and Prediction Model for Embodied Agent
Jianke Zhang, Yanjiang Guo, Yucheng Hu et al.
The Surprising Effectiveness of Test-Time Training for Few-Shot Learning
Ekin Akyürek, Mehul Damani, Adam Zweiger et al.
One-Step Diffusion Policy: Fast Visuomotor Policies via Diffusion Distillation
Zhendong Wang, Max Li, Ajay Mandlekar et al.
RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers
Min Zhao, Guande He, Yixiao Chen et al.
Theoretical insights for diffusion guidance: A case study for Gaussian mixture models
Yuchen Wu, Minshuo Chen, Zihao Li et al.
Thinking LLMs: General Instruction Following with Thought Generation
Tianhao Wu, Janice Lan, Weizhe Yuan et al.
A Language Model’s Guide Through Latent Space
Dimitri von Rütte, Sotiris Anagnostidis, Gregor Bachmann et al.
Parameterized Physics-informed Neural Networks for Parameterized PDEs
Woojin Cho, Minju Jo, Haksoo Lim et al.
Feedback Efficient Online Fine-Tuning of Diffusion Models
Masatoshi Uehara, Yulai Zhao, Kevin Black et al.
Online conformal prediction with decaying step sizes
Anastasios Angelopoulos, Rina Barber, Stephen Bates
Equivariant Graph Neural Operator for Modeling 3D Dynamics
Minkai Xu, Jiaqi Han, Aaron Lou et al.
A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules?
Agustinus Kristiadi, Felix Strieth-Kalthoff, Marta Skreta et al.
AdaWorld: Learning Adaptable World Models with Latent Actions
Shenyuan Gao, Siyuan Zhou, Yilun Du et al.
ShieldAgent: Shielding Agents via Verifiable Safety Policy Reasoning
Zhaorun Chen, Mintong Kang, Bo Li
An Architecture Search Framework for Inference-Time Techniques
Jon Saad-Falcon, Adrian Lafuente, Shlok Natarajan et al.
In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation
Shiqi Chen, Miao Xiong, Junteng Liu et al.
Is Best-of-N the Best of Them? Coverage, Scaling, and Optimality in Inference-Time Alignment
Audrey Huang, Adam Block, Qinghua Liu et al.
Can AI Assistants Know What They Don't Know?
Qinyuan Cheng, Tianxiang Sun, Xiangyang Liu et al.
Orthogonal Subspace Decomposition for Generalizable AI-Generated Image Detection
Zhiyuan Yan, Jiangming Wang, Peng Jin et al.
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers
Dachuan Shi, Chaofan Tao, Anyi Rao et al.
Uniform Memory Retrieval with Larger Capacity for Modern Hopfield Models
Dennis Wu, Jerry Yao-Chieh Hu, Teng-Yun Hsiao et al.
AI Alignment with Changing and Influenceable Reward Functions
Micah Carroll, Davis Foote, Anand Siththaranjan et al.
Conformal Prediction for Deep Classifier via Label Ranking
Jianguo Huang, HuaJun Xi, Linjun Zhang et al.
Beyond ELBOs: A Large-Scale Evaluation of Variational Methods for Sampling
Denis Blessing, Xiaogang Jia, Johannes Esslinger et al.
Graph Attention Retrospective
Kimon Fountoulakis, Amit Levi, Shenghao Yang et al.
The good, the bad and the ugly sides of data augmentation: An implicit spectral regularization perspective
Chi-Heng Lin, Chiraag Kaushik, Eva Dyer et al.
MEMORYLLM: Towards Self-Updatable Large Language Models
Yu Wang, Yifan Gao, Xiusi Chen et al.
SpargeAttention: Accurate and Training-free Sparse Attention Accelerating Any Model Inference
Jintao Zhang, Chendong Xiang, Haofeng Huang et al.
CollabLLM: From Passive Responders to Active Collaborators
Shirley Wu, Michel Galley, Baolin Peng et al.
Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
Yeonhong Park, Jake Hyun, SangLyul Cho et al.
STP: Self-play LLM Theorem Provers with Iterative Conjecturing and Proving
Kefan Dong, Tengyu Ma
Diffusion Model-Augmented Behavioral Cloning
Shang-Fu Chen, Hsiang-Chun Wang, Ming-Hao Hsu et al.
From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning
Wei Chen, Zhen Huang, Liang Xie et al.
CARTE: Pretraining and Transfer for Tabular Learning
Myung Jun Kim, Leo Grinsztajn, Gael Varoquaux
Outlier-Efficient Hopfield Layers for Large Transformer-Based Models
Jerry Yao-Chieh Hu, Pei-Hsuan Chang, Haozheng Luo et al.
Fast Decision Boundary based Out-of-Distribution Detector
Litian Liu, Yao Qin
Interpreting and Improving Large Language Models in Arithmetic Calculation
Wei Zhang, Chaoqun Wan, Yonggang Zhang et al.
Probabilistic Forecasting with Stochastic Interpolants and Föllmer Processes
Yifan Chen, Mark Goldstein, Mengjian Hua et al.
Is In-Context Learning in Large Language Models Bayesian? A Martingale Perspective
Fabian Falck, Ziyu Wang, Christopher Holmes
TroVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic Tasks
Zhiruo Wang, Graham Neubig, Daniel Fried
Distinguishing the Knowable from the Unknowable with Language Models
Gustaf Ahdritz, Tian Qin, Nikhil Vyas et al.
Codebook Features: Sparse and Discrete Interpretability for Neural Networks
Alex Tamkin, Mohammad Taufeeque, Noah Goodman
Non-Vacuous Generalization Bounds for Large Language Models
Sanae Lotfi, Marc Finzi, Yilun Kuang et al.
Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning
Michal Nauman, Michał Bortkiewicz, Piotr Milos et al.
Adjoint Sampling: Highly Scalable Diffusion Samplers via Adjoint Matching
Aaron Havens, Benjamin Kurt Miller, Bing Yan et al.
Image Restoration Through Generalized Ornstein-Uhlenbeck Bridge
Conghan Yue, Zhengwei Peng, Junlong Ma et al.
Symbolic Music Generation with Non-Differentiable Rule Guided Diffusion
Yujia Huang, Adishree Ghatare, Yuanzhe Liu et al.
Consistent Diffusion Meets Tweedie: Training Exact Ambient Diffusion Models with Noisy Data
Giannis Daras, Alexandros Dimakis, Constantinos Daskalakis
DiTAR: Diffusion Transformer Autoregressive Modeling for Speech Generation
Dongya Jia, Zhuo Chen, Jiawei Chen et al.
Towards LLM Unlearning Resilient to Relearning Attacks: A Sharpness-Aware Minimization Perspective and Beyond
Chongyu Fan, Jinghan Jia, Yihua Zhang et al.
IBD-PSC: Input-level Backdoor Detection via Parameter-oriented Scaling Consistency
Linshan Hou, Ruili Feng, Zhongyun Hua et al.
Improving Diffusion Models for Inverse Problems Using Optimal Posterior Covariance
Xinyu Peng, Ziyang Zheng, Wenrui Dai et al.
ReconBoost: Boosting Can Achieve Modality Reconcilement
Cong Hua, Qianqian Xu, Shilong Bao et al.
Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search
Maohao Shen, Guangtao Zeng, Zhenting Qi et al.
Agent Instructs Large Language Models to be General Zero-Shot Reasoners
Nicholas Crispino, Kyle Montgomery, Fankun Zeng et al.
Improved Operator Learning by Orthogonal Attention
Zipeng Xiao, Zhongkai Hao, Bokai Lin et al.
Superpoint Gaussian Splatting for Real-Time High-Fidelity Dynamic Scene Reconstruction
Diwen Wan, Ruijie Lu, Gang Zeng
Multimodal Prototyping for cancer survival prediction
Andrew Song, Richard Chen, Guillaume Jaume et al.
Generalized Neural Collapse for a Large Number of Classes
Jiachen Jiang, Jinxin Zhou, Peng Wang et al.
In-Context Principle Learning from Mistakes
Tianjun Zhang, Aman Madaan, Luyu Gao et al.
MoH: Multi-Head Attention as Mixture-of-Head Attention
Peng Jin, Bo Zhu, Li Yuan et al.
Learning with 3D rotations, a hitchhiker's guide to SO(3)
Andreas René Geist, Jonas Frey, Mikel Zhobro et al.
Feynman-Kac Correctors in Diffusion: Annealing, Guidance, and Product of Experts
Marta Skreta, Tara Akhound-Sadegh, Viktor Ohanesian et al.
CVE-Bench: A Benchmark for AI Agents’ Ability to Exploit Real-World Web Application Vulnerabilities
Yuxuan Zhu, Antony Kellermann, Dylan Bowman et al.
BAT: Learning to Reason about Spatial Sounds with Large Language Models
Zhisheng Zheng, Puyuan Peng, Ziyang Ma et al.
Position: Graph Learning Will Lose Relevance Due To Poor Benchmarks
Maya Bechler-Speicher, Ben Finkelshtein, Fabrizio Frasca et al.
Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension
Fan Yin, Jayanth Srinivasa, Kai-Wei Chang
The Diffusion Duality
Subham Sekhar Sahoo, Justin Deschenaux, Aaron Gokaslan et al.
Scalable AI Safety via Doubly-Efficient Debate
Jonah Brown-Cohen, Geoffrey Irving, Georgios Piliouras
Conformal prediction for multi-dimensional time series by ellipsoidal sets
Chen Xu, Hanyang Jiang, Yao Xie
Graph Generation with Diffusion Mixture
Jaehyeong Jo, Dongki Kim, Sung Ju Hwang
OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction
Huang Huang, Fangchen Liu, Letian Fu et al.
FlatQuant: Flatness Matters for LLM Quantization
Yuxuan Sun, Ruikang Liu, Haoli Bai et al.
Improving the Diffusability of Autoencoders
Ivan Skorokhodov, Sharath Girish, Benran Hu et al.
MuxServe: Flexible Spatial-Temporal Multiplexing for Multiple LLM Serving
Jiangfei Duan, Runyu Lu, Haojie Duanmu et al.
Copyright Traps for Large Language Models
Matthieu Meeus, Igor Shilov, Manuel Faysse et al.
The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents
Yatin Dandi, Emanuele Troiani, Luca Arnaboldi et al.
Modeling Caption Diversity in Contrastive Vision-Language Pretraining
Samuel Lavoie, Polina Kirichenko, Mark Ibrahim et al.
SafeArena: Evaluating the Safety of Autonomous Web Agents
Ada Tur, Nicholas Meade, Xing Han Lù et al.
SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
Wei Huang, Haotong Qin, Yangdong Liu et al.
Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models
Tanmay Gautam, Youngsuk Park, Hao Zhou et al.
Revisiting the Role of Language Priors in Vision-Language Models
Zhiqiu Lin, Xinyue Chen, Deepak Pathak et al.
WMAdapter: Adding WaterMark Control to Latent Diffusion Models
Hai Ci, Yiren Song, Pei Yang et al.
Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics on the Attention Landscape
Juno Kim, Taiji Suzuki
Subgoal-based Demonstration Learning for Formal Theorem Proving
Xueliang Zhao, Wenda Li, Lingpeng Kong
Smooth Tchebycheff Scalarization for Multi-Objective Optimization
Xi Lin, Xiaoyuan Zhang, Zhiyuan Yang et al.
Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning?
Khashayar Gatmiry, Nikunj Saunshi, Sashank J. Reddi et al.
FG-CLIP: Fine-Grained Visual and Textual Alignment
Chunyu Xie, Bin Wang, Fanjing Kong et al.
Hypergraph-enhanced Dual Semi-supervised Graph Classification
Wei Ju, Zhengyang Mao, Siyu Yi et al.
Graph-constrained Reasoning: Faithful Reasoning on Knowledge Graphs with Large Language Models
Linhao Luo, Zicheng Zhao, Reza Haffari et al.
Efficient Exploration for LLMs
Vikranth Dwaracherla, Seyed Mohammad Asghari, Botao Hao et al.
A Computational Framework for Solving Wasserstein Lagrangian Flows
Kirill Neklyudov, Rob Brekelmans, Alexander Tong et al.
Potential Based Diffusion Motion Planning
Yunhao Luo, Chen Sun, Josh Tenenbaum et al.
SAeUron: Interpretable Concept Unlearning in Diffusion Models with Sparse Autoencoders
Bartosz Cywiński, Kamil Deja
On the Trajectory Regularity of ODE-based Diffusion Sampling
Defang Chen, Zhenyu Zhou, Can Wang et al.
Diving into Underwater: Segment Anything Model Guided Underwater Salient Instance Segmentation and A Large-scale Dataset
Shijie Lian, Ziyi Zhang, Hua Li et al.
Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models
Xin Zou, Yizhou Wang, Yibo Yan et al.
Collapse or Thrive: Perils and Promises of Synthetic Data in a Self-Generating World
Joshua Kazdan, Rylan Schaeffer, Apratim Dey et al.
Generalization to New Sequential Decision Making Tasks with In-Context Learning
Sharath Chandra Raparthy, Eric Hambro, Robert Kirk et al.
Don't trust your eyes: on the (un)reliability of feature visualizations
Robert Geirhos, Roland S. Zimmermann, Blair Bilodeau et al.
Multicalibration for Confidence Scoring in LLMs
Gianluca Detommaso, Martin A Bertran, Riccardo Fogliato et al.
Which Attention Heads Matter for In-Context Learning?
Kayo Yin, Jacob Steinhardt
Time Weaver: A Conditional Time Series Generation Model
Sai Shankar Narasimhan, Shubhankar Agarwal, Oguzhan Akcin et al.
Compositional Text-to-Image Generation with Dense Blob Representations
Weili Nie, Sifei Liu, Morteza Mardani et al.
Adversarial Robustness Limits via Scaling-Law and Human-Alignment Studies
Brian Bartoldson, James Diffenderfer, Konstantinos Parasyris et al.
An Information-Theoretic Analysis of In-Context Learning
Hong Jun Jeon, Jason Lee, Qi Lei et al.
Equivariant Frames and the Impossibility of Continuous Canonicalization
Nadav Dym, Hannah Lawrence, Jonathan Siegel
VideoRoPE: What Makes for Good Video Rotary Position Embedding?
Xilin Wei, Xiaoran Liu, Yuhang Zang et al.
AutoEval Done Right: Using Synthetic Data for Model Evaluation
Pierre Boyeau, Anastasios Angelopoulos, Tianle Li et al.
On the Generalization of Stochastic Gradient Descent with Momentum
Ali Ramezani-Kebrya, Kimon Antonakopoulos, Volkan Cevher et al.
Privacy-Preserving Instructions for Aligning Large Language Models
Da Yu, Peter Kairouz, Sewoong Oh et al.
Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices
Nathaniel Cohen, Vladimir Kulikov, Matan Kleiner et al.
Time-VLM: Exploring Multimodal Vision-Language Models for Augmented Time Series Forecasting
Siru Zhong, Weilin Ruan, Ming Jin et al.
From Self-Attention to Markov Models: Unveiling the Dynamics of Generative Transformers
Muhammed Emrullah Ildiz, Yixiao Huang, Yingcong Li et al.
DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching
Guanghe Li, Yixiang Shan, Zhengbang Zhu et al.
Learning to Route LLMs with Confidence Tokens
Yu-Neng Chuang, Prathusha Sarma, Parikshit Gopalan et al.
FedBPT: Efficient Federated Black-box Prompt Tuning for Large Language Models
Jingwei Sun, Ziyue Xu, Hongxu Yin et al.
Transolver++: An Accurate Neural Solver for PDEs on Million-Scale Geometries
Huakun Luo, Haixu Wu, Hang Zhou et al.
PepTune: De Novo Generation of Therapeutic Peptides with Multi-Objective-Guided Discrete Diffusion
Sophia Tang, Yinuo Zhang, Pranam Chatterjee
Riemannian Preconditioned LoRA for Fine-Tuning Foundation Models
Fangzhao Zhang, Mert Pilanci
Fair Resource Allocation in Multi-Task Learning
Hao Ban, Kaiyi Ji
LoRA Training in the NTK Regime has No Spurious Local Minima
Uijeong Jang, Jason Lee, Ernest Ryu
Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive?
Rylan Schaeffer, Hailey Schoelkopf, Brando Miranda et al.
DeFoG: Discrete Flow Matching for Graph Generation
Yiming Qin, Manuel Madeira, Dorina Thanou et al.
RepoAudit: An Autonomous LLM-Agent for Repository-Level Code Auditing
Jinyao Guo, Chengpeng Wang, Xiangzhe Xu et al.
Universality of Linear Recurrences Followed by Non-linear Projections: Finite-Width Guarantees and Benefits of Complex Eigenvalues
Antonio Orvieto, Soham De, Caglar Gulcehre et al.
A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks
Behrad Moniri, Donghwan Lee, Hamed Hassani et al.
Navigating Complexity: Toward Lossless Graph Condensation via Expanding Window Matching
Yuchen Zhang, Tianle Zhang, Kai Wang et al.
Larimar: Large Language Models with Episodic Memory Control
Payel Das, Subhajit Chaudhury, Elliot Nelson et al.
Detecting Strategic Deception with Linear Probes
Nicholas Goldowsky-Dill, Bilal Chughtai, Stefan Heimersheim et al.
CLIF: Complementary Leaky Integrate-and-Fire Neuron for Spiking Neural Networks
Yulong Huang, Xiaopeng Lin, Hongwei Ren et al.
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback
Yafu Li, Xuyang Hu, Xiaoye Qu et al.
Modular Duality in Deep Learning
Jeremy Bernstein, Laker Newhouse
Offline Training of Language Model Agents with Functions as Learnable Weights
Shaokun Zhang, Jieyu Zhang, Jiale Liu et al.
Full-Atom Peptide Design based on Multi-modal Flow Matching
Jiahan Li, Chaoran Cheng, Zuofan Wu et al.
Offline Actor-Critic Reinforcement Learning Scales to Large Models
Jost Tobias Springenberg, Abbas Abdolmaleki, Jingwei Zhang et al.
RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences
Jie Cheng, Gang Xiong, Xingyuan Dai et al.
SAM2Act: Integrating Visual Foundation Model with A Memory Architecture for Robotic Manipulation
Haoquan Fang, Markus Grotz, Wilbert Pumacay et al.
Minimax Optimality of Score-based Diffusion Models: Beyond the Density Lower Bound Assumptions
Kaihong Zhang, Heqi Yin, Feng Liang et al.
An Analysis of Quantile Temporal-Difference Learning
Mark Rowland, Remi Munos, Mohammad Gheshlaghi Azar et al.
NeWRF: A Deep Learning Framework for Wireless Radiation Field Reconstruction and Channel Prediction
Haofan Lu, Christopher Vattheuer, Baharan Mirzasoleiman et al.
Massive Values in Self-Attention Modules are the Key to Contextual Knowledge Understanding
Mingyu Jin, Kai Mei, Wujiang Xu et al.
How Do Nonlinear Transformers Learn and Generalize in In-Context Learning?
Hongkang Li, Meng Wang, Songtao Lu et al.
AffectGPT: A New Dataset, Model, and Benchmark for Emotion Understanding with Multimodal Large Language Models
Zheng Lian, Haoyu Chen, Lan Chen et al.
PARDEN, Can You Repeat That? Defending against Jailbreaks via Repetition
Ziyang Zhang, Qizhen Zhang, Jakob Foerster
Exploration and Anti-Exploration with Distributional Random Network Distillation
Kai Yang, Jian Tao, Jiafei Lyu et al.
No Task Left Behind: Isotropic Model Merging with Common and Task-Specific Subspaces
Daniel Marczak, Simone Magistri, Sebastian Cygert et al.
HexGen: Generative Inference of Large Language Model over Heterogeneous Environment
Youhe Jiang, Ran Yan, Xiaozhe Yao et al.
Robust Autonomy Emerges from Self-Play
Marco Cusumano-Towner, David Hafner, Alexander Hertzberg et al.
UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction
Perampalli Shravan Nayak, Xiangru Jian, Kevin Qinghong Lin et al.
Lyapunov-stable Neural Control for State and Output Feedback: A Novel Formulation
Lujie Yang, Hongkai Dai, Zhouxing Shi et al.
Algorithm and Hardness for Dynamic Attention Maintenance in Large Language Models
Jan van den Brand, Zhao Song, Tianyi Zhou
SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization
Jialong Guo, Xinghao Chen, Yehui Tang et al.
EquiPocket: an E(3)-Equivariant Geometric Graph Neural Network for Ligand Binding Site Prediction
Yang Zhang, Zhewei Wei, Ye Yuan et al.
Assessing Large Language Models on Climate Information
Jannis Bulian, Mike Schäfer, Afra Amini et al.
Flextron: Many-in-One Flexible Large Language Model
Ruisi Cai, Saurav Muralidharan, Greg Heinrich et al.
On the Emergence of Position Bias in Transformers
Xinyi Wu, Yifei Wang, Stefanie Jegelka et al.
SleepFM: Multi-modal Representation Learning for Sleep Across Brain Activity, ECG and Respiratory Signals
Rahul Thapa, Bryan He, Magnus Ruud Kjaer et al.
LaMAGIC: Language-Model-based Topology Generation for Analog Integrated Circuits
Chen-Chia Chang, Yikang Shen, Shaoze Fan et al.
Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation
Gauthier Guinet, Behrooz Tehrani, Anoop Deoras et al.
Optimizing Large Language Model Training Using FP4 Quantization
Ruizhe Wang, Yeyun Gong, Xiao Liu et al.
CATS: Enhancing Multivariate Time Series Forecasting by Constructing Auxiliary Time Series as Exogenous Variables
Jiecheng Lu, Xu Han, Sun et al.
Locally Estimated Global Perturbations are Better than Local Perturbations for Federated Sharpness-aware Minimization
Ziqing Fan, Shengchao Hu, Jiangchao Yao et al.
Time Series Diffusion in the Frequency Domain
Jonathan Crabbé, Nicolas Huynh, Jan Stanczuk et al.
AST-T5: Structure-Aware Pretraining for Code Generation and Understanding
Linyuan Gong, Mostafa Elhoushi, Alvin Cheung
In value-based deep reinforcement learning, a pruned network is a good network
Johan Obando Ceron, Aaron Courville, Pablo Samuel Castro
Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks
Atli Kosson, Bettina Messmer, Martin Jaggi
Second-Order Uncertainty Quantification: A Distance-Based Approach
Yusuf Sale, Viktor Bengs, Michele Caprio et al.
Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization
Haocheng Xi, Yuxiang Chen, Kang Zhao et al.
Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making
Vivek Myers, Chongyi Zheng, Anca Dragan et al.
The dark side of the forces: assessing non-conservative force models for atomistic machine learning
Filippo Bigi, Marcel Langer, Michele Ceriotti
High-Dimensional Prediction for Sequential Decision Making
Georgy Noarov, Ramya Ramalingam, Aaron Roth et al.
DEFAME: Dynamic Evidence-based FAct-checking with Multimodal Experts
Tobias Braun, Mark Rothermel, Marcus Rohrbach et al.
Diverging Preferences: When do Annotators Disagree and do Models Know?
Michael Zhang, Zhilin Wang, Jena Hwang et al.
Auto-Encoding Morph-Tokens for Multimodal LLM
Kaihang Pan, Siliang Tang, Juncheng Li et al.
Position: Measure Dataset Diversity, Don't Just Claim It
Dora Zhao, Jerone Andrews, Orestis Papakyriakopoulos et al.
INViT: A Generalizable Routing Problem Solver with Invariant Nested View Transformer
Han Fang, Zhihao Song, Paul Weng et al.
Archetypal SAE: Adaptive and Stable Dictionary Learning for Concept Extraction in Large Vision Models
Thomas Fel, Ekdeep Singh Lubana, Jacob Prince et al.