Most Cited ICLR "memory retrieval capacity" Papers
Denoising with a Joint-Embedding Predictive Architecture
Chen Dengsheng, Jie Hu, Xiaoming Wei et al.
Unlocking Point Processes through Point Set Diffusion
David Lüdke, Enric Rabasseda Raventós, Marcel Kollovieh et al.
Learning Spatial-Semantic Features for Robust Video Object Segmentation
Xin Li, Deshui Miao, Zhenyu He et al.
Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation
Yi-Chen Li, Fuxiang Zhang, Wenjie Qiu et al.
NoVo: Norm Voting off Hallucinations with Attention Heads in Large Language Models
Zhengyi Ho, Siyuan Liang, Sen Zhang et al.
CryoFM: A Flow-based Foundation Model for Cryo-EM Densities
Yi Zhou, Yilai Li, Jing Yuan et al.
LICORICE: Label-Efficient Concept-Based Interpretable Reinforcement Learning
Zhuorui Ye, Stephanie Milani, Geoff Gordon et al.
Credal Wrapper of Model Averaging for Uncertainty Estimation in Classification
Kaizheng Wang, Fabio Cuzzolin, Keivan Shariatmadar et al.
Active Learning for Continual Learning: Keeping the Past Alive in the Present
Jaehyun Park, Dongmin Park, Jae-Gil Lee
Let SSMs be ConvNets: State-space Modeling with Optimal Tensor Contractions
Yan Ru Pei
Visually Guided Decoding: Gradient-Free Hard Prompt Inversion with Language Models
Donghoon Kim, Minji Bae, Kyuhong Shim et al.
How to Verify Any (Reasonable) Distribution Property: Computationally Sound Argument Systems for Distributions
Tal Herman, Guy Rothblum
UniGS: Unified Language-Image-3D Pretraining with Gaussian Splatting
Haoyuan Li, Yanpeng Zhou, Tao Tang et al.
Generating Physical Dynamics under Priors
Zihan Zhou, Xiaoxue Wang, Tianshu Yu
Refine-by-Align: Reference-Guided Artifacts Refinement through Semantic Alignment
Yizhi Song, Liu He, Zhifei Zhang et al.
Group-robust Sample Reweighting for Subpopulation Shifts via Influence Functions
Rui Qiao, Zhaoxuan Wu, Jingtan Wang et al.
A Solvable Attention for Neural Scaling Laws
Bochen Lyu, Di Wang, Zhanxing Zhu
3D Reconstruction with Generalizable Neural Fields using Scene Priors
Yang Fu, Shalini De Mello, Xueting Li et al.
ZIP: An Efficient Zeroth-order Prompt Tuning for Black-box Vision-Language Models
Seonghwan Park, Jaehyeon Jeong, Yongjun Kim et al.
Temporal Flexibility in Spiking Neural Networks: Towards Generalization Across Time Steps and Deployment Friendliness
Kangrui Du, Yuhang Wu, Shikuang Deng et al.
Finding Shared Decodable Concepts and their Negations in the Brain
Cory Efird, Alex Murphy, Joel Zylberberg et al.
Compute-Optimal LLMs Provably Generalize Better with Scale
Marc Finzi, Sanyam Kapoor, Diego Granziol et al.
Epistemic Monte Carlo Tree Search
Yaniv Oren, Viliam Vadocz, Matthijs T. J. Spaan et al.
BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks
Juan A. Rodriguez, Xiangru Jian, Siba Smarak Panigrahi et al.
Chemistry-Inspired Diffusion with Non-Differentiable Guidance
Yuchen Shen, Chenhao Zhang, Sijie Fu et al.
NetFormer: An interpretable model for recovering dynamical connectivity in neuronal population dynamics
Ziyu Lu, Wuwei Zhang, Trung Le et al.
Towards Improving Exploration through Sibling Augmented GFlowNets
Kanika Madan, Alex Lamb, Emmanuel Bengio et al.
A Riemannian Framework for Learning Reduced-order Lagrangian Dynamics
Katharina Friedl, Noémie Jaquier, Jens Lundell et al.
DRSM: De-Randomized Smoothing on Malware Classifier Providing Certified Robustness
Shoumik Saha, Wenxiao Wang, Yigitcan Kaya et al.
Unify ML4TSP: Drawing Methodological Principles for TSP and Beyond from Streamlined Design Space of Learning and Search
Yang Li, Jiale Ma, Wenzheng Pan et al.
U-shaped and Inverted-U Scaling behind Emergent Abilities of Large Language Models
Tung-Yu Wu, Melody Lo
Don't Take Things Out of Context: Attention Intervention for Enhancing Chain-of-Thought Reasoning in Large Language Models
Shaotian Yan, Chen Shen, Wenxiao Wang et al.
RA-TTA: Retrieval-Augmented Test-Time Adaptation for Vision-Language Models
Youngjun Lee, Doyoung Kim, Junhyeok Kang et al.
Novel Quadratic Constraints for Extending LipSDP beyond Slope-Restricted Activations
Patricia Pauli, Aaron Havens, Alexandre Araujo et al.
Revisiting Source-Free Domain Adaptation: a New Perspective via Uncertainty Control
Gezheng Xu, Hui Guo, Li Yi et al.
SPDIM: Source-Free Unsupervised Conditional and Label Shift Adaptation in EEG
Shanglin Li, Motoaki Kawanabe, Reinmar Kobler
Model-based Offline Reinforcement Learning with Lower Expectile Q-Learning
Kwanyoung Park, Youngwoon Lee
Zero-shot Model-based Reinforcement Learning using Large Language Models
Abdelhakim Benechehab, Youssef Attia El Hili, Ambroise Odonnat et al.
An Intuitive Multi-Frequency Feature Representation for SO(3)-Equivariant Networks
Dongwon Son, Jaehyung Kim, Sanghyeon Son et al.
Robust Feature Learning for Multi-Index Models in High Dimensions
Alireza Mousavi-Hosseini, Adel Javanmard, Murat A Erdogdu
Learning in reverse causal strategic environments with ramifications on two sided markets
Seamus Somerstep, Yuekai Sun, Yaacov Ritov
Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning
Xiaochuan Li, Zichun Yu, Chenyan Xiong
Sensitivity-Constrained Fourier Neural Operators for Forward and Inverse Problems in Parametric Differential Equations
Abdolmehdi Behroozi, Chaopeng Shen, Daniel Kifer
SeRA: Self-Reviewing and Alignment of LLMs using Implicit Reward Margins
Jongwoo Ko, Saket Dingliwal, Bhavana Ganesh et al.
Understanding Matrix Function Normalizations in Covariance Pooling through the Lens of Riemannian Geometry
Ziheng Chen, Yue Song, Xiaojun Wu et al.
Streamlining Prediction in Bayesian Deep Learning
Rui Li, Marcus Klasson, Arno Solin et al.
Latent Radiance Fields with 3D-aware 2D Representations
Chaoyi Zhou, Xi Liu, Feng Luo et al.
Toward Generalizing Visual Brain Decoding to Unseen Subjects
Xiangtao Kong, Kexin Huang, Ping Li et al.
Correlating instruction-tuning (in multimodal models) with vision-language processing (in the brain)
Subba Reddy Oota, Akshett Rai Jindal, Ishani Mondal et al.
Convergence of Distributed Adaptive Optimization with Local Updates
Ziheng Cheng, Margalit Glasgow
Feedback Favors the Generalization of Neural ODEs
Jindou Jia, Zihan Yang, Meng Wang et al.
ADMM for Nonconvex Optimization under Minimal Continuity Assumption
Ganzhao Yuan
Leveraging Sub-Optimal Data for Human-in-the-Loop Reinforcement Learning
Calarina Muslimani, Matthew E Taylor
Beyond Random Masking: When Dropout meets Graph Convolutional Networks
Yuankai Luo, Xiao-Ming Wu, Hao Zhu
A Simple Approach to Unifying Diffusion-based Conditional Generation
Xirui Li, Charles Herrmann, Kelvin Chan et al.
CHAMP: Conformalized 3D Human Multi-Hypothesis Pose Estimators
Harry Zhang, Luca Carlone
Learning-Guided Rolling Horizon Optimization for Long-Horizon Flexible Job-Shop Scheduling
Sirui Li, Wenbin Ouyang, Yining Ma et al.
Topological Schrödinger Bridge Matching
Maosheng Yang
Differentially Private Federated Learning with Time-Adaptive Privacy Spending
Shahrzad Kianidehkordi, Nupur Kulkarni, Adam Dziedzic et al.
Rodimus*: Breaking the Accuracy-Efficiency Trade-Off with Efficient Attentions
Zhihao He, Hang Yu, Zi Gong et al.
Optimization by Parallel Quasi-Quantum Annealing with Gradient-Based Sampling
Yuma Ichikawa, Yamato Arai
ContextRef: Evaluating Referenceless Metrics for Image Description Generation
Elisa Kreiss, Eric Zelikman et al.
Advancing Graph Generation through Beta Diffusion
Xinyang Liu, Yilin He, Bo Chen et al.
Studying the Interplay Between the Actor and Critic Representations in Reinforcement Learning
Samuel Garcin, Trevor McInroe, Pablo Samuel Castro et al.
Improving Probabilistic Diffusion Models With Optimal Diagonal Covariance Matching
Zijing Ou, Mingtian Zhang, Andi Zhang et al.
ComaDICE: Offline Cooperative Multi-Agent Reinforcement Learning with Stationary Distribution Shift Regularization
The Viet Bui, Thanh Nguyen, Tien Mai
When LLMs Play the Telephone Game: Cultural Attractors as Conceptual Tools to Evaluate LLMs in Multi-turn Settings
Jérémy Perez, Grgur Kovac, Corentin Léger et al.
VLMaterial: Procedural Material Generation with Large Vision-Language Models
Beichen Li, Rundi Wu, Armando Solar-Lezama et al.
Uncertainty and Influence aware Reward Model Refinement for Reinforcement Learning from Human Feedback
Zexu Sun, Yiju Guo, Yankai Lin et al.
On the Hölder Stability of Multiset and Graph Neural Networks
Yair Davidson, Nadav Dym
Decongestion by Representation: Learning to Improve Economic Welfare in Marketplaces
Omer Nahum, Gali Noti, David Parkes et al.
DreamCatalyst: Fast and High-Quality 3D Editing via Controlling Editability and Identity Preservation
Jiwook Kim, Seonho Lee, Jaeyo Shin et al.
KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models
Fan Wang, Juyong Jiang, Chansung Park et al.
SMI-Editor: Edit-based SMILES Language Model with Fragment-level Supervision
Kangjie Zheng, Siyue Liang, Junwei Yang et al.
Interpreting Language Reward Models via Contrastive Explanations
Junqi Jiang, Tom Bewley, Saumitra Mishra et al.
Composable Interventions for Language Models
Arinbjörn Kolbeinsson, Kyle O'Brien, Tianjin Huang et al.
Consciousness-Inspired Spatio-Temporal Abstractions for Better Generalization in Reinforcement Learning
Mingde Zhao, Safa Alver, Harm Seijen et al.
Shifting the Paradigm: A Diffeomorphism Between Time Series Data Manifolds for Achieving Shift-Invariancy in Deep Learning
Berken Utku Demirel, Christian Holz
Risk-Controlling Model Selection via Guided Bayesian Optimization
Adam Fisch, Regina Barzilay, Bracha Laufer-Goldshtein et al.
Solving Inverse Problems with Model Mismatch using Untrained Neural Networks within Model-based Architectures
Peimeng Guan, Naveed Iqbal, Mark Davenport et al.
Handling Delay in Real-Time Reinforcement Learning
Ivan Anokhin, Rishav Rishav, Matt Riemer et al.
Lift Your Molecules: Molecular Graph Generation in Latent Euclidean Space
Mohamed Amine Ketata, Nicholas Gao, Johanna Sommer et al.
Pushing the Limits of All-Atom Geometric Graph Neural Networks: Pre-Training, Scaling, and Zero-Shot Transfer
Zihan Pengmei, Zhengyuan Shen, Zichen Wang et al.
Hierarchical Uncertainty Estimation for Learning-based Registration in Neuroimaging
Xiaoling Hu, Karthik Gopinath, Peirong Liu et al.
Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning
Yujian Liu, Shiyu Chang, Tommi Jaakkola et al.
iFormer: Integrating ConvNet and Transformer for Mobile Application
Chuanyang Zheng
Synergy Between Sufficient Changes and Sparse Mixing Procedure for Disentangled Representation Learning
Zijian Li, Shunxing Fan, Yujia Zheng et al.
Reconsidering Faithfulness in Regular, Self-Explainable and Domain Invariant GNNs
Steve Azzolin, Antonio Longa, Stefano Teso et al.
Hessian-Free Online Certified Unlearning
Xinbao Qiao, Meng Zhang, Ming Tang et al.
Unraveling the Enigma of Double Descent: An In-depth Analysis through the Lens of Learned Feature Space
Yufei Gu, Xiaoqing Zheng, Tomaso Aste
MAPS: Advancing Multi-Modal Reasoning in Expert-Level Physical Science
Erle Zhu, Yadi Liu, Zhe Zhang et al.
Neuron based Personality Trait Induction in Large Language Models
Jia Deng, Tianyi Tang, Yanbin Yin et al.
Unknown Domain Inconsistency Minimization for Domain Generalization
Seungjae Shin, HeeSun Bae, Byeonghu Na et al.
Robust Barycenter Estimation using Semi-Unbalanced Neural Optimal Transport
Milena Gazdieva, Jaemoo Choi, Alexander Kolesov et al.
Selective Task Group Updates for Multi-Task Optimization
Wooseong Jeong, Kuk-Jin Yoon
EBMDock: Neural Probabilistic Protein-Protein Docking via a Differentiable Energy Model
Huaijin Wu, Wei Liu, Yatao Bian et al.
SpaCE: The Spatial Confounding Environment
Mauricio Tec, Ana Trisovic, Michelle Audirac et al.
AutoCLIP: Auto-tuning Zero-Shot Classifiers for Vision-Language Models
Jan Metzen, Piyapat Saranrittichai, Chaithanya Kumar Mummadi
Plugin estimators for selective classification with out-of-distribution detection
Harikrishna Narasimhan, Aditya Krishna Menon, Wittawat Jitkrittum et al.
Differentiable and Learnable Wireless Simulation with Geometric Transformers
Thomas Hehn, Markus Peschl, Tribhuvanesh Orekondy et al.
Motion Control of High-Dimensional Musculoskeletal Systems with Hierarchical Model-Based Planning
Yunyue Wei, Shanning Zhuang, Vincent Zhuang et al.
TS-LIF: A Temporal Segment Spiking Neuron Network for Time Series Forecasting
Shibo Feng, Wanjin Feng, Xingyu Gao et al.
Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation
Sungnyun Kim, Sungwoo Cho, Sangmin Bae et al.
Learning Equivariant Non-Local Electron Density Functionals
Nicholas Gao, Eike Eberhard, Stephan Günnemann
Exponential Topology-enabled Scalable Communication in Multi-agent Reinforcement Learning
Xinran Li, Xiaolu Wang, Chenjia Bai et al.
Learning Robust Generalizable Radiance Field with Visibility and Feature Augmented Point Representation
Jiaxu Wang, Ziyi Zhang, Renjing Xu
Towards the Fundamental Limits of Knowledge Transfer over Finite Domains
Qingyue Zhao, Banghua Zhu
Improving Large Language Model Planning with Action Sequence Similarity
Xinran Zhao, Hanie Sedghi, Bernd Bohnet et al.
Treatment Effects Estimation By Uniform Transformer
Ruoqi Yu, Shulei Wang
Efficient Perplexity Bound and Ratio Matching in Discrete Diffusion Language Models
Etrit Haxholli, Yeti Z. Gurbuz, Oğul Can et al.
DenoiseVAE: Learning Molecule-Adaptive Noise Distributions for Denoising-based 3D Molecular Pre-training
Yurou Liu, Jiahao Chen, Rui Jiao et al.
A Coefficient Makes SVRG Effective
Yida Yin, Zhiqiu Xu, Zhiyuan Li et al.
MixEval-X: Any-to-any Evaluations from Real-world Data Mixture
Jinjie Ni, Yifan Song, Deepanway Ghosal et al.
Wasserstein-Regularized Conformal Prediction under General Distribution Shift
Rui Xu, Chao Chen, Yue Sun et al.
Zeroth-Order Fine-Tuning of LLMs with Transferable Static Sparsity
Wentao Guo, Jikai Long, Yimeng Zeng et al.
HASARD: A Benchmark for Vision-Based Safe Reinforcement Learning in Embodied Agents
Tristan Tomilin, Meng Fang, Mykola Pechenizkiy
ANaGRAM: A Natural Gradient Relative to Adapted Model for efficient PINNs learning
Nilo Schwencke, Cyril Furtlehner
HALL-E: Hierarchical Neural Codec Language Model for Minute-Long Zero-Shot Text-to-Speech Synthesis
Yuto Nishimura, Takumi Hirose, Masanari Ohi et al.
Beyond Next Token Prediction: Patch-Level Training for Large Language Models
Chenze Shao, Fandong Meng, Jie Zhou
Calibrating Expressions of Certainty
Peiqi Wang, Barbara Lam, Yingcheng Liu et al.
From an LLM Swarm to a PDDL-empowered Hive: Planning Self-executed Instructions in a Multi-modal Jungle
Kaustubh Vyas, Damien Graux, Yijun Yang et al.
Context Clues: Evaluating Long Context Models for Clinical Prediction Tasks on EHR Data
Michael Wornow, Suhana Bedi, Miguel Angel Fuentes Hernandez et al.
Self-Updatable Large Language Models by Integrating Context into Model Parameters
Yu Wang, Xinshuang Liu, Xiusi Chen et al.
HR-Extreme: A High-Resolution Dataset for Extreme Weather Forecasting
Nian Ran, Peng Xiao, Yue Wang et al.
Shallow diffusion networks provably learn hidden low-dimensional structure
Nicholas Boffi, Arthur Jacot, Stephen Tu et al.
Taming Transformer Without Using Learning Rate Warmup
Xianbiao Qi, Yelin He, Jiaquan Ye et al.
Learning Interleaved Image-Text Comprehension in Vision-Language Large Models
Chenyu Zhou, Mengdan Zhang, Peixian Chen et al.
Energy-conserving equivariant GNN for elasticity of lattice architected metamaterials
Ivan Grega, Ilyes Batatia, Gábor Csányi et al.
Constructing Confidence Intervals for Average Treatment Effects from Multiple Datasets
Yuxin Wang, Maresa Schröder, Dennis Frauen et al.
MrSteve: Instruction-Following Agents in Minecraft with What-Where-When Memory
Junyeong Park, Junmo Cho, Sungjin Ahn
Learning Causal Alignment for Reliable Disease Diagnosis
Mingzhou Liu, Ching-Wen Lee, Xinwei Sun et al.
QA-Calibration of Language Model Confidence Scores
Putra Manggala, Atalanti A Mastakouri, Elke Kirschbaum et al.
ComPC: Completing a 3D Point Cloud with 2D Diffusion Priors
Tianxin Huang, Zhiwen Yan, Yuyang Zhao et al.
Sharpness-Aware Black-Box Optimization
Feiyang Ye, Yueming Lyu, Xuehao Wang et al.
Hierarchically Encapsulated Representation for Protocol Design in Self-Driving Labs
Yu-Zhe Shi, Mingchen Liu, Fanxu Meng et al.
ECHOPulse: ECG Controlled Echocardio-gram Video Generation
Yiwei Li, Sekeun Kim, Zihao Wu et al.
Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions
Xiaoran Jiao, Weian Mao, Wengong Jin et al.
Unsupervised Pretraining for Fact Verification by Language Model Distillation
Adrian Bazaga, Pietro Lio, Gos Micklem
SetCSE: Set Operations using Contrastive Learning of Sentence Embeddings
Kang Liu
SEPARATE: A Simple Low-rank Projection for Gradient Compression in Modern Large-scale Model Training Process
Hanzhen Zhao, Xingyu Xie, Cong Fang et al.
MOFFlow: Flow Matching for Structure Prediction of Metal-Organic Frameworks
Nayoung Kim, Seongsu Kim, Minsu Kim et al.
SymmetricDiffusers: Learning Discrete Diffusion on Finite Symmetric Groups
Yongxing Zhang, Donglin Yang, Renjie Liao
Learning High-Degree Parities: The Crucial Role of the Initialization
Emmanuel Abbe, Elisabetta Cornacchia, Jan Hązła et al.
Microcanonical Langevin Ensembles: Advancing the Sampling of Bayesian Neural Networks
Emanuel Sommer, Jakob Robnik, Giorgi Nozadze et al.
Amortized Control of Continuous State Space Feynman-Kac Model for Irregular Time Series
Byoungwoo Park, Hyungi Lee, Juho Lee
Automatic Functional Differentiation in JAX
Min Lin
CSA: Data-efficient Mapping of Unimodal Features to Multimodal Features
Po-han Li, Sandeep Chinchali, Ufuk Topcu
A Generic Framework for Conformal Fairness
Aditya Vadlamani, Anutam Srinivasan, Pranav Maneriker et al.
Conditional Diffusion Models are Minimax-Optimal and Manifold-Adaptive for Conditional Distribution Estimation
Rong Tang, Lizhen Lin, Yun Yang
Progressive Compression with Universally Quantized Diffusion Models
Yibo Yang, Justus Will, Stephan Mandt
Multi-Label Test-Time Adaptation with Bound Entropy Minimization
Xiangyu Wu, Feng Yu, Yang Yang et al.
Flavors of Margin: Implicit Bias of Steepest Descent in Homogeneous Neural Networks
Nikolaos Tsilivis, Gal Vardi, Julia Kempe
HiGen: Hierarchical Graph Generative Networks
Mahdi Karami
Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models
Chengyu Du, Jinyi Han, Yizhou Ying et al.
Structural Estimation of Partially Observed Linear Non-Gaussian Acyclic Model: A Practical Approach with Identifiability
Songyao Jin, Feng Xie, Guangyi Chen et al.
Optimal Non-Asymptotic Rates of Value Iteration for Average-Reward Markov Decision Processes
Jongmin Lee, Ernest Ryu
Demonstration-Regularized RL
Daniil Tiapkin, Denis Belomestny, Daniele Calandriello et al.
Convolutional Deep Kernel Machines
Edward Milsom, Ben Anson, Laurence Aitchison
Privacy-Preserving Personalized Federated Prompt Learning for Multimodal Large Language Models
Linh Tran, Wei Sun, Stacy Patterson et al.
Leveraging Hyperbolic Embeddings for Coarse-to-Fine Robot Design
Heng Dong, Junyu Zhang, Chongjie Zhang
Beyond FVD: An Enhanced Evaluation Metrics for Video Generation Distribution Quality
Ge Ya Luo, Gian M Favero, Zhi Hao Luo et al.
The Update-Equivalence Framework for Decision-Time Planning
Samuel Sokota, Gabriele Farina, David Wu et al.
Towards Understanding Text Hallucination of Diffusion Models via Local Generation Bias
Rui Lu, Runzhe Wang, Kaifeng Lyu et al.
Model-Free Offline Reinforcement Learning with Enhanced Robustness
Chi Zhang, Zain Ulabedeen Farhat, George Atia et al.
Conformal Generative Modeling with Improved Sample Efficiency through Sequential Greedy Filtering
Klaus-Rudolf Kladny, Bernhard Schölkopf, Michael Muehlebach
Two Sparse Matrices are Better than One: Sparsifying Neural Networks with Double Sparse Factorization
Vladimir Boza, Vladimir Macko
RaSA: Rank-Sharing Low-Rank Adaptation
Zhiwei He, Zhaopeng Tu, Xing Wang et al.
GLoRa: A Benchmark to Evaluate the Ability to Learn Long-Range Dependencies in Graphs
Dongzhuoran Zhou, Evgeny Kharlamov, Egor Kostylev
Feature-Based Online Bilateral Trade
Solenne Gaucher, Martino Bernasconi, Matteo Castiglioni et al.
Towards Continuous Reuse of Graph Models via Holistic Memory Diversification
Ziyue Qiao, Junren Xiao, Qingqiang Sun et al.
Understanding the Generalization of In-Context Learning in Transformers: An Empirical Study
Xingxuan Zhang, Haoran Wang, Jiansheng Li et al.
BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL
Yu Heng Hung, Kai-Jie Lin, Yu-Heng Lin et al.
From Probability to Counterfactuals: the Increasing Complexity of Satisfiability in Pearl's Causal Hierarchy
Julian Dörfler, Benito van der Zander, Markus Bläser et al.
Natural Language Inference Improves Compositionality in Vision-Language Models
Paola Cascante-Bonilla, Yu (Hope) Hou, Yang Cao et al.
Learning to Communicate Through Implicit Communication Channels
Han Wang, Binbin Chen, Zhang et al.
MGDA Converges under Generalized Smoothness, Provably
Qi Zhang, Peiyao Xiao, Shaofeng Zou et al.
On Generalization Across Environments In Multi-Objective Reinforcement Learning
Jayden Teoh, Pradeep Varakantham, Peter Vamplew
On-the-fly Preference Alignment via Principle-Guided Decoding
Mingye Zhu, Yi Liu, Lei Zhang et al.
SCOPE: A Self-supervised Framework for Improving Faithfulness in Conditional Text Generation
Song Duong, Florian Le Bronnec, Alexandre Allauzen et al.
NUDGE: Lightweight Non-Parametric Fine-Tuning of Embeddings for Retrieval
Sepanta Zeighami, Zac Wellmer, Aditya Parameswaran
ORSO: Accelerating Reward Design via Online Reward Selection and Policy Optimization
Chen Bo Calvin Zhang, Zhang-Wei Hong, Aldo Pacchiano et al.
Self-Supervised Diffusion MRI Denoising via Iterative and Stable Refinement
Chenxu Wu, Qingpeng Kong, Zihang Jiang et al.
Towards Understanding the Universality of Transformers for Next-Token Prediction
Michael Sander, Gabriel Peyré
Neural Interactive Proofs
Lewis Hammond, Sam Adam-Day
Det-CGD: Compressed Gradient Descent with Matrix Stepsizes for Non-Convex Optimization
Hanmin Li, Avetik Karagulyan, Peter Richtarik
Data-adaptive Differentially Private Prompt Synthesis for In-Context Learning
Fengyu Gao, Ruida Zhou, Tianhao Wang et al.
ZETA: Leveraging $Z$-order Curves for Efficient Top-$k$ Attention
Qiuhao Zeng, Jierui Huang, Peng Lu et al.
Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization
Yuxin Jiang, Bo Huang, Yufei Wang et al.
Simple, Good, Fast: Self-Supervised World Models Free of Baggage
Jan Robine, Marc Höftmann, Stefan Harmeling
A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement
Hui Yuan, Yifan Zeng, Yue Wu et al.
What should a neuron aim for? Designing local objective functions based on information theory
Andreas C. Schneider, Valentin Neuhaus, David Ehrlich et al.
ReFusion: Improving Natural Language Understanding with Computation-Efficient Retrieval Representation Fusion
Shangyu Wu, Ying Xiong, Yufei Cui et al.
Lightweight Predictive 3D Gaussian Splats
Junli Cao, Vidit Goel, Chaoyang Wang et al.
Understanding the Robustness of Multi-modal Contrastive Learning to Distribution Shift
Yihao Xue, Siddharth Joshi, Dang Nguyen et al.
LeanVec: Searching vectors faster by making them fit
Ishwar Bhati, Cecilia Aguerrebere, Mark Hildebrand et al.
No Free Lunch: Fundamental Limits of Learning Non-Hallucinating Generative Models
Changlong Wu, Ananth Grama, Wojciech Szpankowski
Attribute-based Visual Reprogramming for Vision-Language Models
Chengyi Cai, Zesheng Ye, Lei Feng et al.
Estimating Shape Distances on Neural Representations with Limited Samples
Dean Pospisil, Brett Larsen, Sarah Harvey et al.
On Representation Complexity of Model-based and Model-free Reinforcement Learning
Hanlin Zhu, Baihe Huang, Stuart Russell
Optimizing Backward Policies in GFlowNets via Trajectory Likelihood Maximization
Timofei Gritsaev, Nikita Morozov, Sergey Samsonov et al.