Most Cited ICML "sparse attention computation" Papers
5,975 papers found • Page 16 of 30
Settling the Maximin Share Fairness for Scheduling among Groups of Machines
Bo Li, Fangxiao Wang, Xing Shiji
ETTA: Elucidating the Design Space of Text-to-Audio Models
Sang-gil Lee, Zhifeng Kong, Arushi Goel et al.
Do Vision-Language Models Really Understand Visual Language?
Yifan Hou, Buse Giledereli, Yilei Tu et al.
Concept-Based Unsupervised Domain Adaptation
Xinyue Xu, Yueying Hu, Hui Tang et al.
Unraveling the Interplay between Carryover Effects and Reward Autocorrelations in Switchback Experiments
Qianglin Wen, Chengchun Shi, Ying Yang et al.
Provable Benefits of Unsupervised Pre-training and Transfer Learning via Single-Index Models
Taj Jones-McCormick, Aukosh Jagannath, Subhabrata Sen
Adversarial Reasoning at Jailbreaking Time
Mahdi Sabbaghi, Paul Kassianik, George Pappas et al.
$\infty$-Video: A Training-Free Approach to Long Video Understanding via Continuous-Time Memory Consolidation
Saúl Santos, António Farinhas, Daniel McNamee et al.
Non-stationary Diffusion For Probabilistic Time Series Forecasting
Weiwei Ye, Zhuopeng Xu, Ning Gui
Learning-Augmented Algorithms for MTS with Bandit Access to Multiple Predictors
Matei Gabriel Cosa, Marek Elias
Efficiently Access Diffusion Fisher: Within the Outer Product Span Space
Fangyikang Wang, Hubery Yin, Shaobin Zhuang et al.
The Surprising Agreement Between Convex Optimization Theory and Learning-Rate Scheduling for Large Model Training
Fabian Schaipp, Alexander Hägele, Adrien Taylor et al.
A Machine Learning Approach to Duality in Statistical Physics
Prateek Gupta, Andrea Ferrari, Nabil Iqbal
Asymmetric Decision-Making in Online Knowledge Distillation: Unifying Consensus and Divergence
Zhaowei Chen, Borui Zhao, Yuchen Ge et al.
AdaSplash: Adaptive Sparse Flash Attention
Nuno Gonçalves, Marcos V. Treviso, Andre Martins
µnit Scaling: Simple and Scalable FP8 LLM Training
Saaketh Narayan, Abhay Gupta, Mansheej Paul et al.
Generative Human Trajectory Recovery via Embedding-Space Conditional Diffusion
Kaijun Liu, Sijie Ruan, Liang Zhang et al.
Optimal Auction Design in the Joint Advertising
Yang Li, Yuchao Ma, Qi Qi
LASER: Attention with Exponential Transformation
Sai Surya Duvvuri, Inderjit Dhillon
Probabilistic Interactive 3D Segmentation with Hierarchical Neural Processes
Jie Liu, Pan Zhou, Zehao Xiao et al.
HPS: Hard Preference Sampling for Human Preference Alignment
Xiandong Zou, Wanyu Lin, Yuchen Li et al.
AMPO: Active Multi Preference Optimization for Self-play Preference Selection
Taneesh Gupta, Rahul Madhavan, Xuchao Zhang et al.
Data Mixing Optimization for Supervised Fine-Tuning of Large Language Models
Yuan Li, Zhengzhong Liu, Eric Xing
Guided Structural Inference: Leveraging Priors with Soft Gating Mechanisms
Aoran Wang, Xinnan Dai, Jun Pang
Beyond Minimax Rates in Group Distributionally Robust Optimization via a Novel Notion of Sparsity
Quan Nguyen, Nishant Mehta, Cristóbal Guzmán
Randomized Dimensionality Reduction for Euclidean Maximization and Diversity Measures
Jie Gao, Rajesh Jayaram, Benedikt Kolbe et al.
RepLoRA: Reparameterizing Low-rank Adaptation via the Perspective of Mixture of Experts
Tuan Truong, Chau Nguyen, Huy Nguyen et al.
Provable In-Context Vector Arithmetic via Retrieving Task Concepts
Dake Bu, Wei Huang, Andi Han et al.
TimeStep Master: Asymmetrical Mixture of Timestep LoRA Experts for Versatile and Efficient Diffusion Models in Vision
Shaobin Zhuang, Yiwei Guo, Yanbo Ding et al.
PipeOffload: Improving Scalability of Pipeline Parallelism with Memory Optimization
Xinyi Wan, Penghui Qi, Guangxing Huang et al.
LEMoN: Label Error Detection using Multimodal Neighbors
Haoran Zhang, Aparna Balagopalan, Nassim Oufattole et al.
From Uncertain to Safe: Conformal Adaptation of Diffusion Models for Safe PDE Control
Peiyan Hu, Xiaowei Qian, Wenhao Deng et al.
Defending LVLMs Against Vision Attacks Through Partial-Perception Supervision
Qi Zhou, Dongxia Wang, Tianlin Li et al.
Towards Efficient Online Tuning of VLM Agents via Counterfactual Soft Reinforcement Learning
Lang Feng, Weihao Tan, Zhiyi Lyu et al.
Generalization Analysis for Controllable Learning
Yi-Fan Zhang, Xiao Zhang, Min-Ling Zhang
Tight and Fast Bounds for Multi-Label Learning
Yi-Fan Zhang, Min-Ling Zhang
COKE: Core Kernel for More Efficient Approximation of Kernel Weights in Multiple Kernel Clustering
Weixuan Liang, Xinwang Liu, Ke Liang et al.
Socialized Coevolution: Advancing a Better World through Cross-Task Collaboration
Xinjie Yao, Yu Wang, Pengfei Zhu et al.
Mitigating Over-Exploration in Latent Space Optimization Using LES
Omer Ronen, Ahmed Imtiaz Humayun, Richard Baraniuk et al.
Enhancing Graph Invariant Learning from a Negative Inference Perspective
Kuo Yang, Zhengyang Zhou, Qihe Huang et al.
OOD-Chameleon: Is Algorithm Selection for OOD Generalization Learnable?
Liangze Jiang, Damien Teney
Kona: An Efficient Privacy-Preservation Framework for KNN Classification by Communication Optimization
Guopeng Lin, Ruisheng Zhou, Shuyu Chen et al.
La RoSA: Enhancing LLM Efficiency via Layerwise Rotated Sparse Activation
Kai Liu, Bowen Xu, Shaoyu Wu et al.
Doubly Robust Fusion of Many Treatments for Policy Learning
Ke Zhu, Jianing Chu, Ilya Lipkovich et al.
Improving Reward Model Generalization from Adversarial Process Enhanced Preferences
Zhilong Zhang, Tian Xu, Xinghao Du et al.
AlphaQCM: Alpha Discovery in Finance with Distributional Reinforcement Learning
Zhoufan Zhu, Ke Zhu
HyperTree Planning: Enhancing LLM Reasoning via Hierarchical Thinking
Runquan Gui, Zhihai Wang, Jie Wang et al.
Winner-takes-all for Multivariate Probabilistic Time Series Forecasting
Adrien Cortes, Remi Rehm, Victor Letzelter
Inference-Time Decomposition of Activations (ITDA): A Scalable Approach to Interpreting Large Language Models
Patrick Leask, Neel Nanda, Noura Al Moubayed
Linearization Turns Neural Operators into Function-Valued Gaussian Processes
Emilia Magnani, Marvin Pförtner, Tobias Weber et al.
Decision Mixer: Integrating Long-term and Local Dependencies via Dynamic Token Selection for Decision-Making
Hongling Zheng, Li Shen, Yong Luo et al.
VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters
Mouxiang Chen, Lefei Shen, Zhuo Li et al.
Hide & Seek: Transformer Symmetries Obscure Sharpness & Riemannian Geometry Finds It
Marvin F. da Silva, Felix Dangel, Sageev Oore
Graph Attention is Not Always Beneficial: A Theoretical Analysis of Graph Attention Mechanisms via Contextual Stochastic Block Models
Zhongtian Ma, Qiaosheng Zhang, Bocheng Zhou et al.
Riemannian Diffusion Adaptation for Distributed Optimization on Manifolds
Xiuheng Wang, Ricardo Borsoi, Cédric Richard et al.
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
Xinyu Guan, Li Lyna Zhang, Yifei Liu et al.
EduLLM: Leveraging Large Language Models and Framelet-Based Signed Hypergraph Neural Networks for Student Performance Prediction
Ming Li, Yukang Cheng, Lu Bai et al.
Test-Time Selective Adaptation for Uni-Modal Distribution Shift in Multi-Modal Data
MingCai Chen, Baoming Zhang, Zongbo Han et al.
Stochastic Layer-Wise Shuffle for Improving Vision Mamba Training
Zizheng Huang, Haoxing Chen, Jiaqi Li et al.
Omni-Angle Assault: An Invisible and Powerful Physical Adversarial Attack on Face Recognition
Shuai Yuan, Hongwei Li, Rui Zhang et al.
Confidence Difference Reflects Various Supervised Signals in Confidence-Difference Classification
Yuanchao Dai, Ximing Li, Changchun Li
Test-Time Graph Neural Dataset Search With Generative Projection
Xin Zheng, Wei Huang, Chuan Zhou et al.
OmiAD: One-Step Adaptive Masked Diffusion Model for Multi-class Anomaly Detection via Adversarial Distillation
Yaoxuan Feng, Wenchao Chen, Yuxin Li et al.
All-atom Diffusion Transformers: Unified generative modelling of molecules and materials
Chaitanya Joshi, Xiang Fu, Yi-Lun Liao et al.
Free Process Rewards without Process Labels
Lifan Yuan, Wendi Li, Huayu Chen et al.
Clustering Items through Bandit Feedback: Finding the Right Feature out of Many
Maximilian Graf, Victor Thuot, Nicolas Verzelen
Feature-Mapping Topology Optimization with Neural Heaviside Signed Distance Functions
Aleksandr Kolomeitsev, Anh-Huy Phan
Efficient ANN-SNN Conversion with Error Compensation Learning
Chang Liu, Jiangrong Shen, Xuming Ran et al.
OneForecast: A Universal Framework for Global and Regional Weather Forecasting
Yuan Gao, Hao Wu, Ruiqi Shu et al.
Learning Progress Driven Multi-Agent Curriculum
Wenshuai Zhao, Zhiyuan Li, Joni Pajarinen
DIME: Diffusion-Based Maximum Entropy Reinforcement Learning
Onur Celik, Zechu Li, Denis Blessing et al.
EAGLES: Towards Effective, Efficient, and Economical Federated Graph Learning via Unified Sparsification
Zitong Shi, Guancheng Wan, Wenke Huang et al.
RBench: Graduate-level Multi-disciplinary Benchmarks for LLM & MLLM Complex Reasoning Evaluation
Meng-Hao Guo, Jiajun Xu, Yi Zhang et al.
Multi-Stage Manipulation with Demonstration-Augmented Reward, Policy, and World Model Learning
Adrià López Escoriza, Nicklas Hansen, Stone Tao et al.
Revisiting Noise Resilience Strategies in Gesture Recognition: Short-Term Enhancement in sEMG Analysis
Weiyu Guo, Ziyue Qiao, Ying Sun et al.
From Black Boxes to Transparent Minds: Evaluating and Enhancing the Theory of Mind in Multimodal Large Language Models
Xinyang Li, Siqi Liu, Bochao Zou et al.
Evaluating Judges as Evaluators: The JETTS Benchmark of LLM-as-Judges as Test-Time Scaling Evaluators
Yilun Zhou, Austin Xu, PeiFeng Wang et al.
Adjustment for Confounding using Pre-Trained Representations
Rickmer Schulte, David Rügamer, Thomas Nagler
Distilling the Knowledge in Data Pruning
Emanuel Ben Baruch, Adam Botach, Igor Kviatkovsky et al.
Spectral-Aware Reservoir Computing for Fast and Accurate Time Series Classification
Shikang Liu, Chuyang Wei, Xiren Zhou et al.
am-ELO: A Stable Framework for Arena-based LLM Evaluation
Zirui Liu, Jiatong Li, Yan Zhuang et al.
Improving Flow Matching by Aligning Flow Divergence
Yuhao Huang, Taos Transue, Shih-Hsin Wang et al.
ITFormer: Bridging Time Series and Natural Language for Multi-Modal QA with Large-Scale Multitask Dataset
Yilin Wang, Peixuan Lei, Jie Song et al.
Shortcut-connected Expert Parallelism for Accelerating Mixture of Experts
Weilin Cai, Juyong Jiang, Le Qin et al.
Protriever: End-to-End Differentiable Protein Homology Search for Fitness Prediction
Ruben Weitzman, Peter Mørch Groth, Lood van Niekerk et al.
LSCD: Lomb-Scargle Conditioned Diffusion for Time series Imputation
Elizabeth M Fons Etcheverry, Alejandro Sztrajman, Yousef El-Laham et al.
Efficient Quantification of Multimodal Interaction at Sample Level
Zequn Yang, Hongfa Wang, Di Hu
Diagonal Symmetrization of Neural Network Solvers for the Many-Electron Schrödinger Equation
Kevin Han Huang, Ni Zhan, Elif Ertekin et al.
What If We Recaption Billions of Web Images with LLaMA-3?
Xianhang Li, Haoqin Tu, Mude Hui et al.
Improving Generalization in Federated Learning with Highly Heterogeneous Data via Momentum-Based Stochastic Controlled Weight Averaging
Junkang Liu, Yuanyuan Liu, Fanhua Shang et al.
Automatic Reward Shaping from Confounded Offline Data
Mingxuan Li, Junzhe Zhang, Elias Bareinboim
ExLM: Rethinking the Impact of $\texttt{[MASK]}$ Tokens in Masked Language Models
Kangjie Zheng, Junwei Yang, Siyue Liang et al.
Diff-MoE: Diffusion Transformer with Time-Aware and Space-Adaptive Experts
Kun Cheng, Xiao He, Lei Yu et al.
Learning Event Completeness for Weakly Supervised Video Anomaly Detection
Yu Wang, Shiwei Chen
A Mixed-Curvature based Pre-training Paradigm for Multi-Task Vehicle Routing Solver
Suyu Liu, Zhiguang Cao, Shanshan Feng et al.
Clipping Improves Adam-Norm and AdaGrad-Norm when the Noise Is Heavy-Tailed
Savelii Chezhegov, Klyukin Yaroslav, Andrei Semenov et al.
Generalized Smooth Bilevel Optimization with Nonconvex Lower-Level
Siqi Zhang, Xing Huang, Feihu Huang
Function Encoders: A Principled Approach to Transfer Learning in Hilbert Spaces
Tyler Ingebrand, Adam Thorpe, Ufuk Topcu
HarmoniCa: Harmonizing Training and Inference for Better Feature Caching in Diffusion Transformer Acceleration
Yushi Huang, Zining Wang, Ruihao Gong et al.
Agent Workflow Memory
Zhiruo Wang, Jiayuan Mao, Daniel Fried et al.
Underestimated Privacy Risks for Minority Populations in Large Language Model Unlearning
Rongzhe Wei, Mufei Li, Mohsen Ghassemi et al.
A Versatile Influence Function for Data Attribution with Non-Decomposable Loss
Junwei Deng, Weijing Tang, Jiaqi Ma
Auditing $f$-differential privacy in one run
Saeed Mahloujifar, Luca Melis, Kamalika Chaudhuri
Q-VDiT: Towards Accurate Quantization and Distillation of Video-Generation Diffusion Transformers
Weilun Feng, Chuanguang Yang, Haotong Qin et al.
EnIGMA: Interactive Tools Substantially Assist LM Agents in Finding Security Vulnerabilities
Talor Abramovich, Meet Udeshi, Minghao Shao et al.
Revisiting Continuity of Image Tokens for Cross-domain Few-shot Learning
Shuai Yi, Yixiong Zou, Yuhua Li et al.
ExPLoRA: Parameter-Efficient Extended Pre-Training to Adapt Vision Transformers under Domain Shifts
Samar Khanna, Medhanie Irgau, David Lobell et al.
Enhancing Adversarial Robustness with Conformal Prediction: A Framework for Guaranteed Model Reliability
Jie Bao, Chuangyin Dang, Rui Luo et al.
Meta-Reinforcement Learning with Adaptation from Human Feedback via Preference-Order-Preserving Task Embedding
Siyuan Xu, Minghui Zhu
QEM-Bench: Benchmarking Learning-based Quantum Error Mitigation and QEMFormer as a Multi-ranged Context Learning Baseline
Tianyi Bao, Ruizhe Zhong, Xinyu Ye et al.
e-GAI: e-value-based Generalized $\alpha$-Investing for Online False Discovery Rate Control
Yifan Zhang, Zijian Wei, Haojie Ren et al.
Representative Ranking for Deliberation in the Public Sphere
Manon Revel, Smitha Milli, Tyler Lu et al.
A Generalization Result for Convergence in Learning-to-Optimize
Michael Sucker, Peter Ochs
One-Shot Heterogeneous Federated Learning with Local Model-Guided Diffusion Models
Mingzhao Yang, Shangchao Su, Bin Li et al.
REINFORCE Adversarial Attacks on Large Language Models: An Adaptive, Distributional, and Semantic Objective
Simon Geisler, Tom Wollschläger, M. Hesham Abdalla et al.
Propagate and Inject: Revisiting Propagation-Based Feature Imputation for Graphs with Partially Observed Features
Daeho Um, Sunoh Kim, Jiwoong Park et al.
Set Valued Predictions For Robust Domain Generalization
Ron Tsibulsky, Daniel Nevo, Uri Shalit
EpiCoder: Encompassing Diversity and Complexity in Code Generation
Yaoxiang Wang, Haoling Li, Xin Zhang et al.
Temperature-Annealed Boltzmann Generators
Henrik Schopmans, Pascal Friederich
Sparsing Law: Towards Large Language Models with Greater Activation Sparsity
Yuqi Luo, Chenyang Song, Xu Han et al.
DriveGPT: Scaling Autoregressive Behavior Models for Driving
Xin Huang, Eric M. Wolff, Paul Vernaza et al.
Selective Response Strategies for GenAI
Boaz Taitler, Omer Ben-Porat
Boosting Multi-Domain Fine-Tuning of Large Language Models through Evolving Interactions between Samples
Xize Liang, Lin Yang, Jie Wang et al.
Graph Diffusion for Robust Multi-Agent Coordination
Xianghua Zeng, Hang Su, Zhengyi Wang et al.
TGDPO: Harnessing Token-Level Reward Guidance for Enhancing Direct Preference Optimization
Mingkang Zhu, Xi Chen, Zhongdao Wang et al.
OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction
Huang Huang, Fangchen Liu, Letian Fu et al.
Noise Conditional Variational Score Distillation
Xinyu Peng, Ziyang Zheng, Yaoming Wang et al.
Sub-Sequential Physics-Informed Learning with State Space Model
Chenhui Xu, Dancheng Liu, Yuting Hu et al.
Enhancing Performance of Explainable AI Models with Constrained Concept Refinement
Geyu Liang, Senne Michielssen, Salar Fattahi
Towards Lifelong Model Editing via Simulating Ideal Editor
Yaming Guo, Siyang Guo, Hengshu Zhu et al.
LightningDrag: Lightning Fast and Accurate Drag-based Image Editing Emerging from Videos
Yujun Shi, Jun Hao Liew, Hanshu Yan et al.
DreamDPO: Aligning Text-to-3D Generation with Human Preferences via Direct Preference Optimization
Zhenglin Zhou, Xiaobo Xia, Fan Ma et al.
SUICA: Learning Super-high Dimensional Sparse Implicit Neural Representations for Spatial Transcriptomics
Qingtian Zhu, Yumin Zheng, Yuling Sang et al.
LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding
Xiaoqian Shen, Yunyang Xiong, Changsheng Zhao et al.
Rethink the Role of Deep Learning towards Large-scale Quantum Systems
Yusheng Zhao, Chi Zhang, Yuxuan Du
Incorporating Arbitrary Matrix Group Equivariance into KANs
Lexiang Hu, Yisen Wang, Zhouchen Lin
Approximation to Smooth Functions by Low-Rank Swish Networks
Zimeng Li, Hongjun Li, Jingyuan Wang et al.
Come Together, But Not Right Now: A Progressive Strategy to Boost Low-Rank Adaptation
Zhan Zhuang, Xiequn Wang, Wei Li et al.
BCE vs. CE in Deep Feature Learning
Qiufu Li, Huibin Xiao, Linlin Shen
On Zero-Initialized Attention: Optimal Prompt and Gating Factor Estimation
Nghiem Diep, Huy Nguyen, Chau Nguyen et al.
DragLoRA: Online Optimization of LoRA Adapters for Drag-based Image Editing in Diffusion Model
Siwei Xia, Li Sun, Tiantian Sun et al.
Flexible, Efficient, and Stable Adversarial Attacks on Machine Unlearning
Zihan Zhou, Yang Zhou, Zijie Zhang et al.
Guided Zeroth-Order Methods for Stochastic Non-convex Problems with Decision-Dependent Distributions
Yuya Hikima, Hiroshi Sawada, Akinori Fujino
Tuning LLM Judge Design Decisions for 1/1000 of the Cost
David Salinas, Omar Swelam, Frank Hutter
Certifiably Robust Model Evaluation in Federated Learning under Meta-Distributional Shifts
Amir Najafi, Samin Mahdizadeh Sani, Farzan Farnia
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Tianzhe Chu, Yuexiang Zhai, Jihan Yang et al.
Unified Analysis of Continuous Weak Features Learning with Applications to Learning from Missing Data
Kosuke Sugiyama, Masato Uchida
Policy-Regret Minimization in Markov Games with Function Approximation
Thanh Nguyen-Tang, Raman Arora
Reflection-Bench: Evaluating Epistemic Agency in Large Language Models
Lingyu Li, Yixu Wang, Haiquan Zhao et al.
Proactive Agents for Multi-Turn Text-to-Image Generation Under Uncertainty
Meera Hahn, Wenjun Zeng, Nithish Kannen et al.
A Bayesian Model Selection Criterion for Selecting Pretraining Checkpoints
Michael Munn, Susan Wei
Maximum Update Parametrization and Zero-Shot Hyperparameter Transfer for Fourier Neural Operators
Shanda Li, Shinjae Yoo, Yiming Yang
Calibrated Value-Aware Model Learning with Probabilistic Environment Models
Claas Voelcker, Anastasiia Pedan, Arash Ahmadian et al.
Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM’s Reasoning Capability
Zicheng Lin, Tian Liang, Jiahao Xu et al.
MIPT: Multilevel Informed Prompt Tuning for Robust Molecular Property Prediction
Yeyun Chen, Jiangming Shi
On the Benefits of Active Data Collection in Operator Learning
Unique Subedi, Ambuj Tewari
Energy-Based Flow Matching for Generating 3D Molecular Structure
Wenyin Zhou, Christopher I Sprague, Vsevolod Viliuga et al.
Retraining-free Merging of Sparse MoE via Hierarchical Clustering
I-Chun Chen, Hsu-Shen Liu, Wei-Fang Sun et al.
Divide and Conquer: Learning Label Distribution with Subtasks
Haitao Wu, Weiwei Li, Xiuyi Jia
ENSUR: Equitable and Statistically Unbiased Recommendation
Nitin Bisht, Xiuwen Gong, Guandong Xu
Random Feature Representation Boosting
Nikita Zozoulenko, Thomas Cass, Lukas Gonon
FrameBridge: Improving Image-to-Video Generation with Bridge Models
Yuji Wang, Zehua Chen, Chen Xiaoyu et al.
Fair Clustering via Alignment
Kunwoong Kim, Jihu Lee, Sangchul Park et al.
On Learning Parallel Pancakes with Mostly Uniform Weights
Ilias Diakonikolas, Daniel Kane, Sushrut Karmalkar et al.
Hyperspherical Normalization for Scalable Deep Reinforcement Learning
Hojoon Lee, Youngdo Lee, Takuma Seno et al.
SPRI: Aligning Large Language Models with Context-Situated Principles
Hongli Zhan, Muneeza Azmat, Raya Horesh et al.
Curvature Enhanced Data Augmentation for Regression
Ilya Kaufman, Omri Azencot
Faster Rates for Private Adversarial Bandits
Hilal Asi, Vinod Raman, Kunal Talwar
Primitive Vision: Improving Diagram Understanding in MLLMs
Shan Zhang, Aotian Chen, Yanpeng Sun et al.
Enhancing Logits Distillation with Plug&Play Kendall's $\tau$ Ranking Loss
Yuchen Guan, Runxi Cheng, Kang Liu et al.
GHOST: Generalizable One-Shot Federated Graph Learning with Proxy-Based Topology Knowledge Retention
Jiaru Qian, Guancheng Wan, Wenke Huang et al.
Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging
Shiqi Chen, Jinghan Zhang, Tongyao Zhu et al.
SPACE: Your Genomic Profile Predictor is a Powerful DNA Foundation Model
Zhao Yang, Jiwei Zhu, Bing Su
Potemkin Understanding in Large Language Models
Marina Mancoridis, Bec Weeks, Keyon Vafa et al.
Generation from Noisy Examples
Ananth Raman, Vinod Raman
Improved Expressivity of Hypergraph Neural Networks through High-Dimensional Generalized Weisfeiler-Leman Algorithms
Detian Zhang, Zhang Chengqiang, Yanghui Rao et al.
UDora: A Unified Red Teaming Framework against LLM Agents by Dynamically Hijacking Their Own Reasoning
Jiawei Zhang, Shuang Yang, Bo Li
POROver: Improving Safety and Reducing Overrefusal in Large Language Models with Overgeneration and Preference Optimization
Batuhan K. Karaman, Ishmam Zabir, Alon Benhaim et al.
On the Convergence of Continuous Single-timescale Actor-critic
Xuyang Chen, Lin Zhao
MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design
Haojie Duanmu, Xiuhong Li, Zhihang Yuan et al.
When Data-Free Knowledge Distillation Meets Non-Transferable Teacher: Escaping Out-of-Distribution Trap is All You Need
Ziming Hong, Runnan Chen, Zengmao Wang et al.
Matryoshka Quantization
Pranav Nair, Puranjay Datta, Jeff Dean et al.
G-Adaptivity: optimised graph-based mesh relocation for finite element methods
James Rowbottom, Georg Maierhofer, Teo Deveney et al.
Discrepancies are Virtue: Weak-to-Strong Generalization through Lens of Intrinsic Dimension
Yijun Dong, Yicheng Li, Yunai Li et al.
Stream-level Flow Matching with Gaussian Processes
Ganchao Wei, Li Ma
Latent Imputation before Prediction: A New Computational Paradigm for De Novo Peptide Sequencing
Ye Du, Chen Yang, Nanxi Yu et al.
Parallel Simulation for Log-concave Sampling and Score-based Diffusion Models
Huanjian Zhou, Masashi Sugiyama
Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples
Fangxu Yu, Lai Jiang, Haoqiang Kang et al.
Decision Making under the Exponential Family: Distributionally Robust Optimisation with Bayesian Ambiguity Sets
Charita Dellaporta, Patrick O'Hara, Theodoros Damoulas
Expressive Power of Graph Neural Networks for (Mixed-Integer) Quadratic Programs
Ziang Chen, Xiaohan Chen, Jialin Liu et al.
Are High-Quality AI-Generated Images More Difficult for Models to Detect?
Yao Xiao, Binbin Yang, Weiyan Chen et al.
Gamma Distribution PCA-Enhanced Feature Learning for Angle-Robust SAR Target Recognition
Chong Zhang, Peng Zhang, Mengke Li
Plausible Token Amplification for Improving Accuracy of Differentially Private In-Context Learning Based on Implicit Bayesian Inference
Yusuke Yamasaki, Kenta Niwa, Daiki Chijiwa et al.
Long-Short Alignment for Effective Long-Context Modeling in LLMs
Tianqi Du, Haotian Huang, Yifei Wang et al.
Adaptive Localization of Knowledge Negation for Continual LLM Unlearning
Abudukelimu Wuerkaixi, Qizhou Wang, Sen Cui et al.
LADA: Scalable Label-Specific CLIP Adapter for Continual Learning
Mao-Lin Luo, Zi-Hao Zhou, Tong Wei et al.
Constrained Exploitability Descent: An Offline Reinforcement Learning Method for Finding Mixed-Strategy Nash Equilibrium
Runyu Lu, Yuanheng Zhu, Dongbin Zhao
UniDB: A Unified Diffusion Bridge Framework via Stochastic Optimal Control
Kaizhen Zhu, Mokai Pan, Yuexin Ma et al.
Few-Shot Learner Generalizes Across AI-Generated Image Detection
Shiyu Wu, Jing Liu, Jing Li et al.
RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers
Min Zhao, Guande He, Yixiao Chen et al.