Most Cited ICLR "language model applications" Papers
Self-Improving Robust Preference Optimization
Eugene Choi, Arash Ahmadian, Matthieu Geist et al.
Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data
Sreyan Ghosh, Sonal Kumar, Zhifeng Kong et al.
Fugatto 1: Foundational Generative Audio Transformer Opus 1
Rafael Valle, Rohan Badlani, Zhifeng Kong et al.
Linear Representations of Political Perspective Emerge in Large Language Models
Junsol Kim, James Evans, Aaron Schein
Looking Backward: Streaming Video-to-Video Translation with Feature Banks
Feng Liang, Akio Kodaira, Chenfeng Xu et al.
Why Does the Effective Context Length of LLMs Fall Short?
Chenxin An, Jun Zhang, Ming Zhong et al.
Dobi-SVD: Differentiable SVD for LLM Compression and Some New Perspectives
Qinsi Wang, Jinghan Ke, Masayoshi Tomizuka et al.
Think-on-Graph 2.0: Deep and Faithful Large Language Model Reasoning with Knowledge-guided Retrieval Augmented Generation
Shengjie Ma, Chengjin Xu, Xuhui Jiang et al.
Model-Agnostic Knowledge Guided Correction for Improved Neural Surrogate Rollout
Bharat Srikishan, Daniel O'Malley, Mohamed Mehana et al.
Singular Subspace Perturbation Bounds via Rectangular Random Matrix Diffusions
Peiyao Lai, Oren Mangoubi
Leveraging Variable Sparsity to Refine Pareto Stationarity in Multi-Objective Optimization
Zeou Hu, Yaoliang Yu
On the Convergence of No-Regret Dynamics in Information Retrieval Games with Proportional Ranking Functions
Omer Madmon, Idan Pipano, Itamar Jacob Reinman et al.
Fine-Tuning Attention Modules Only: Enhancing Weight Disentanglement in Task Arithmetic
Ruochen Jin, Bojian Hou, Jiancong Xiao et al.
ToolGen: Unified Tool Retrieval and Calling via Generation
Renxi Wang, Xudong Han, Lei Ji et al.
When do GFlowNets learn the right distribution?
Tiago Silva, Rodrigo Alves, Eliezer de Souza da Silva et al.
CoRNStack: High-Quality Contrastive Data for Better Code Retrieval and Reranking
Tarun Suresh, Revanth Gangi Reddy, Yifei Xu et al.
Equivariant Denoisers Cannot Copy Graphs: Align Your Graph Diffusion Models
Najwa Laabid, Severi Rissanen, Markus Heinonen et al.
Towards a Complete Logical Framework for GNN Expressiveness
Tuo Xu
Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models
Zachary Ankner, Cody Blakeney, Kartik Sreenivasan et al.
Efficient and Context-Aware Label Propagation for Zero-/Few-Shot Training-Free Adaptation of Vision-Language Model
Yushu Li, Yongyi Su, Adam Goodge et al.
Rethinking Shapley Value for Negative Interactions in Non-convex Games
Wonjoon Chang, Myeongjin Lee, Jaesik Choi
Matérn Kernels for Tunable Implicit Surface Reconstruction
Maximilian Weiherer, Bernhard Egger
In Search of the Engram in LLMs: A Neuroscience Perspective on the Memory Functions in AI Models
Minsung Kim, Jea Kwon, Dong-Kyum Kim et al.
MAESTRO: Masked Encoding Set Transformer with Self-Distillation
Matthew Lee, Jaesik Kim, Matei Ionita et al.
dEBORA: Efficient Bilevel Optimization-based low-Rank Adaptation
Emanuele Zangrando, Sara Venturini, Francesco Rinaldi et al.
Multi-session, multi-task neural decoding from distinct cell-types and brain regions
Mehdi Azabou, Krystal Pan, Vinam Arora et al.
Offline Model-Based Optimization by Learning to Rank
Rong-Xi Tan, Ke Xue, Shen-Huan Lyu et al.
Open-CK: A Large Multi-Physics Fields Coupling benchmarks in Combustion Kinetics
Zaige Fei, Fan Xu, Junyuan Mao et al.
RecDreamer: Consistent Text-to-3D Generation via Uniform Score Distillation
Chenxi Zheng, Yihong Lin, Bangzhen Liu et al.
CameraCtrl: Enabling Camera Control for Video Diffusion Models
Hao He, Yinghao Xu, Yuwei Guo et al.
No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images
Botao Ye, Sifei Liu, Haofei Xu et al.
Comparing Targeting Strategies for Maximizing Social Welfare with Limited Resources
Vibhhu Sharma, Bryan Wilder
MotionClone: Training-Free Motion Cloning for Controllable Video Generation
Pengyang Ling, Jiazi Bu, Pan Zhang et al.
Systematic Relational Reasoning With Epistemic Graph Neural Networks
Irtaza Khalid, Steven Schockaert
LLM-wrapper: Black-Box Semantic-Aware Adaptation of Vision-Language Models for Referring Expression Comprehension
Amaia Cardiel, Eloi Zablocki, Elias Ramzi et al.
Continuity-Preserving Convolutional Autoencoders for Learning Continuous Latent Dynamical Models from Images
Aiqing Zhu, Yuting Pan, Qianxiao Li
pMoE: Prompting Diverse Experts Together Wins More in Visual Adaptation
Shentong Mo, Xufang Luo, Dongsheng Li
Comparing noisy neural population dynamics using optimal transport distances
Amin Nejatbakhsh, Victor Geadah, Alex Williams et al.
ParFam -- (Neural Guided) Symbolic Regression via Continuous Global Optimization
Philipp Scholl, Katharina Bieker, Hillary Hauger et al.
ACES: Automatic Cohort Extraction System for Event-Stream Datasets
Justin Xu, Jack Gallifant, Alistair Johnson et al.
GRAIN: Exact Graph Reconstruction from Gradients
Maria Drencheva, Ivo Petrov, Maximilian Baader et al.
Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models
Shuhong Zheng, Zhipeng Bao, Ruoyu Zhao et al.
Exposing and Addressing Cross-Task Inconsistency in Unified Vision-Language Models
Aniruddha Kembhavi, Mohit Bansal, Amita Kamath et al.
PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation
Sang-Hoon Lee, Ha-Yeong Choi, Seong-Whan Lee
See It from My Perspective: How Language Affects Cultural Bias in Image Understanding
Amith Ananthram, Elias Stengel-Eskin, Mohit Bansal et al.
Streaming Algorithms For $\ell_p$ Flows and $\ell_p$ Regression
Amit Chakrabarti, Jeffrey Jiang, David Woodruff et al.
Improving the Sparse Structure Learning of Spiking Neural Networks from the View of Compression Efficiency
Jiangrong Shen, Qi Xu, Gang Pan et al.
A Theory for Token-Level Harmonization in Retrieval-Augmented Generation
Shicheng Xu, Liang Pang, Huawei Shen et al.
Peeking Behind Closed Doors: Risks of LLM Evaluation by Private Data Curators
Pratyush Maini, Hritik Bansal
Reassessing EMNLP 2024’s Best Paper: Does Divergence-Based Calibration for MIAs Hold Up?
Pratyush Maini, Anshuman Suri
SoftCVI: Contrastive variational inference with self-generated soft labels
Daniel Ward, Mark Beaumont, Matteo Fasiolo
ContraDiff: Planning Towards High Return States via Contrastive Learning
Yixiang Shan, Zhengbang Zhu, Ting Long et al.
Class Distribution-induced Attention Map for Open-vocabulary Semantic Segmentations
Dong Un Kang, Hayeon Kim, Se Young Chun
Concept Pinpoint Eraser for Text-to-image Diffusion Models via Residual Attention Gate
Byung Hyun Lee, Sungjin Lim, Seunggyu Lee et al.
Reconstruction-Guided Policy: Enhancing Decision-Making through Agent-Wise State Consistency
Qifan Liang, Yixiang Shan, Haipeng Liu et al.
BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL
Yu Heng Hung, Kai-Jie Lin, Yu-Heng Lin et al.
Reasoning Elicitation in Language Models via Counterfactual Feedback
Alihan Hüyük, Xinnuo Xu, Jacqueline Maasch et al.
TSVD: Bridging Theory and Practice in Continual Learning with Pre-trained Models
Liangzu Peng, Juan Elenter, Joshua Agterberg et al.
Coreset Spectral Clustering
Ben Jourdan, Gregory Schwartzman, Peter Macgregor et al.
Diffusion Attribution Score: Evaluating Training Data Influence in Diffusion Models
Jinxu Lin, Linwei Tao, Minjing Dong et al.
Fundamental Limitations on Subquadratic Alternatives to Transformers
Josh Alman, Hantao Yu
Start Smart: Leveraging Gradients For Enhancing Mask-based XAI Methods
Buelent Uendes, Shujian Yu, Mark Hoogendoorn
Improving Instruction-Following in Language Models through Activation Steering
Alessandro Stolfo, Vidhisha Balachandran, Safoora Yousefi et al.
Unearthing Skill-level Insights for Understanding Trade-offs of Foundation Models
Mazda Moayeri, Vidhisha Balachandran, Varun Chandrasekaran et al.
Intelligence at the Edge of Chaos
Shiyang Zhang, Aakash Patel, Syed Rizvi et al.
Multimodal Situational Safety
Kaiwen Zhou, Chengzhi Liu, Xuandong Zhao et al.
Analysing The Spectral Biases in Generative Models
Amitoj Miglani, Shweta Singh, Vidit Aggarwal
Learning Continually by Spectral Regularization
Alex Lewandowski, Michał Bortkiewicz, Saurabh Kumar et al.
MGDA Converges under Generalized Smoothness, Provably
Qi Zhang, Peiyao Xiao, Shaofeng Zou et al.
Boundary constrained Gaussian processes for robust physics-informed machine learning of linear partial differential equations
David Dalton, Alan Lazarus, Hao Gao et al.
Bayesian Regularization of Latent Representation
Chukwudi Paul Obite, Zhi Chang, Keyan Wu et al.
Boosting Ray Search Procedure of Hard-label Attacks with Transfer-based Priors
Chen Ma, Xinjie Xu, Shuyu Cheng et al.
Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning
Moritz Reuss, Jyothish Pari, Pulkit Agrawal et al.
DEPfold: RNA Secondary Structure Prediction as Dependency Parsing
Ke Wang, Shay B Cohen
Open-Source vs Close-Source: The Context Utilization Challenge
Litu Ou
Aria-MIDI: A Dataset of Piano MIDI Files for Symbolic Music Modeling
Louis Bradshaw, Simon Colton
Variational Bayesian Pseudo-Coreset
Hyungi Lee, Seungyoo Lee, Juho Lee
ThinK: Thinner Key Cache by Query-Driven Pruning
Yuhui Xu, Zhanming Jie, Hanze Dong et al.
Scaling up the Banded Matrix Factorization Mechanism for Large Scale Differentially Private ML
Ryan McKenna
BoneMet: An Open Large-Scale Multi-Modal Murine Dataset for Breast Cancer Bone Metastasis Diagnosis and Prognosis
Tiankuo Chu, Fudong Lin, Shubo Wang et al.
Adaptive Camera Sensor for Vision Models
Eunsu Baek, Sung-hwan Han, Taesik Gong et al.
Size-Generalizable RNA Structure Evaluation by Exploring Hierarchical Geometries
Zongzhao Li, Jiacheng Cen, Wenbing Huang et al.
Semialgebraic Neural Networks: From roots to representations
S David Mis, Matti Lassas, Maarten V de Hoop
Nova: Generative Language Models for Assembly Code with Hierarchical Attention and Contrastive Learning
Nan Jiang, Chengxiao Wang, Kevin Liu et al.
ADAPT: Attentive Self-Distillation and Dual-Decoder Prediction Fusion for Continual Panoptic Segmentation
Ze Yang, Shichao Dong, Ruibo Li et al.
Flow With What You Know
Scott Hawley
KLay: Accelerating Arithmetic Circuits for Neurosymbolic AI
Jaron Maene, Vincent Derkinderen, Pedro Zuidberg Dos Martires
Difference-of-submodular Bregman Divergence
Masanari Kimura, Takahiro Kawashima, Tasuku Soma et al.
Transformers are Universal In-context Learners
Takashi Furuya, Maarten V de Hoop, Gabriel Peyré
Differential learning kinetics govern the transition from memorization to generalization during in-context learning
Alex Nguyen, Gautam Reddy Nallamala
GameGen-X: Interactive Open-world Game Video Generation
Haoxuan Che, Xuanhua He, Quande Liu et al.
Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers
Shijie Chen, Bernal Jimenez Gutierrez, Yu Su
DexTrack: Towards Generalizable Neural Tracking Control for Dexterous Manipulation from Human References
Xueyi Liu, Jianibieke Adalibieke, Qianwei Han et al.
SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning
Hojoon Lee, Dongyoon Hwang, Donghu Kim et al.
Revealing and Mitigating Over-Attention in Knowledge Editing
Pinzheng Wang, Zecheng Tang, Keyan Zhou et al.
MotherNet: Fast Training and Inference via Hyper-Network Transformers
Andreas Mueller, Carlo Curino, Raghu Ramakrishnan
Optimistic Games for Combinatorial Bayesian Optimization with Application to Protein Design
Melis Ilayda Bal, Pier Giuseppe Sessa, Mojmir Mutny et al.
POGEMA: A Benchmark Platform for Cooperative Multi-Agent Pathfinding
Alexey Skrynnik, Anton Andreychuk, Anatolii Borzilov et al.
Probabilistic Conformal Prediction with Approximate Conditional Validity
Vincent Plassier, Alexander Fishkov, Mohsen Guizani et al.
Provable Convergence Bounds for Hybrid Dynamical Sampling and Optimization
Matthew Burns, Qingyuan Hou, Michael Huang
Overcoming Slow Decision Frequencies in Continuous Control: Model-Based Sequence Reinforcement Learning for Model-Free Control
Devdhar Patel, Hava Siegelmann
Large Scale Knowledge Washing
Yu Wang, Ruihan Wu, Zexue He et al.
MorphoDiff: Cellular Morphology Painting with Diffusion Models
Zeinab Navidi, Jun Ma, Esteban Miglietta et al.
PN-GAIL: Leveraging Non-optimal Information from Imperfect Demonstrations
Qiang Liu, Huiqiao Fu, Kaiqiang Tang et al.
Rethinking Graph Neural Networks From A Geometric Perspective Of Node Features
Feng Ji, Yanan Zhao, Kai Zhao et al.
InfoGS: Efficient Structure-Aware 3D Gaussians via Lightweight Information Shaping
Yunchao Zhang, Guandao Yang, Leonidas Guibas et al.
IV-mixed Sampler: Leveraging Image Diffusion Models for Enhanced Video Synthesis
Shitong Shao, Zikai Zhou, Lichen Bai et al.
Animate Your Thoughts: Reconstruction of Dynamic Natural Vision from Human Brain Activity
Yizhuo Lu, Changde Du, Chong Wang et al.
Breaking Mental Set to Improve Reasoning through Diverse Multi-Agent Debate
Yexiang Liu, Jie Cao, Zekun Li et al.
Multi-objective Differentiable Neural Architecture Search
Rhea Sukthanker, Arber Zela, Benedikt Staffler et al.
Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues
Riccardo Grazzi, Julien Siems, Arber Zela et al.
RAPID: Retrieval Augmented Training of Differentially Private Diffusion Models
Tanqiu Jiang, Changjiang Li, Fenglong Ma et al.
CAX: Cellular Automata Accelerated in JAX
Maxence Faldor, Antoine Cully
Modeling dynamic social vision highlights gaps between deep learning and humans
Kathy Garcia, Emalie McMahon, Colin Conwell et al.
MoLEx: Mixture of Layer Experts for Fine-tuning with Sparse Upcycling
Rachel Teo, Tan Nguyen
Transformer Meets Twicing: Harnessing Unattended Residual Information
Laziz Abdullaev, Tan Nguyen
RAG-SR: Retrieval-Augmented Generation for Neural Symbolic Regression
Hengzhe Zhang, Qi Chen, Bing Xue et al.
INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge
Angelika Romanou, Negar Foroutan, Anna Sotnikova et al.
Joint Reward and Policy Learning with Demonstrations and Human Feedback Improves Alignment
Chenliang Li, Siliang Zeng, Zeyi Liao et al.
Token-Supervised Value Models for Enhancing Mathematical Problem-Solving Capabilities of Large Language Models
Jung Hyun Lee, June Yong Yang, Byeongho Heo et al.
Inverse Attention Agents for Multi-Agent Systems
Qian Long, Ruoyan Li, Minglu Zhao et al.
SelectFormer in Data Markets: Privacy-Preserving and Efficient Data Selection for Transformers with Multi-Party Computation
Xu Ouyang, Felix Xiaozhu Lin, Yangfeng Ji
Generalizing Reasoning Problems to Longer Lengths
Changnan Xiao, Bing Liu
Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models
Andy K Zhang, Neil Perry, Riya Dulepet et al.
Learn-by-interact: A Data-Centric Framework For Self-Adaptive Agents in Realistic Environments
Hongjin Su, Ruoxi Sun, Jinsung Yoon et al.
Verifying Properties of Binary Neural Networks Using Sparse Polynomial Optimization
Jianting Yang, Srecko Durasinovic, Jean Bernard Lasserre et al.
GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS
Saman Kazemkhani, Aarav Pandya, Daphne Cornelisse et al.
EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation
Jiaxiang Tang, Max Li, Zekun Hao et al.
Action Sequence Augmentation for Action Anticipation
Yihui Qiu, Deepu Rajan
IterGen: Iterative Semantic-aware Structured LLM Generation with Backtracking
Shubham Dipak Ugare, Rohan Gumaste, Tarun Suresh et al.
AnalogGenie: A Generative Engine for Automatic Discovery of Analog Circuit Topologies
Jian Gao, Weidong Cao, Junyi Yang et al.
The Last Iterate Advantage: Empirical Auditing and Principled Heuristic Analysis of Differentially Private SGD
Milad Nasr, Thomas Steinke, Borja Balle et al.
Near-Exact Privacy Amplification for Matrix Mechanisms
Christopher Choquette-Choo, Arun Ganesh, Saminul Haque et al.
Learning to Select Nodes in Branch and Bound with Sufficient Tree Representation
Sijia Zhang, Shuli Zeng, Shaoang Li et al.
Looking into User’s Long-term Interests through the Lens of Conservative Evidential Learning
Dingrong Wang, Krishna Neupane, Ervine Zheng et al.
Computational Explorations of Total Variation Distance
Arnab Bhattacharyya, Sutanu Gayen, Kuldeep S. Meel et al.
TopoLM: brain-like spatio-functional organization in a topographic language model
Neil Rathi, Johannes Mehrer, Badr AlKhamissi et al.
Do LLMs Recognize Your Preferences? Evaluating Personalized Preference Following in LLMs
Siyan Zhao, Mingyi Hong, Yang Liu et al.
Convex Formulations for Training Two-Layer ReLU Neural Networks
Karthik Prakhya, Tolga Birdal, Alp Yurtsever
Accelerating Task Generalisation with Multi-Level Skill Hierarchies
Thomas Cannon, Özgür Şimşek
SSOLE: Rethinking Orthogonal Low-rank Embedding for Self-Supervised Learning
Lun Huang, Qiang Qiu, Guillermo Sapiro
Large Convolutional Model Tuning via Filter Subspace
Wei Chen, Zichen Miao, Qiang Qiu
Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization
Yuxin Jiang, Bo Huang, Yufei Wang et al.
DPaI: Differentiable Pruning at Initialization with Node-Path Balance Principle
Lichuan Xiang, Quan Nguyen-Tri, Lan-Cuong Nguyen et al.
Balancing Act: Diversity and Consistency in Large Language Model Ensembles
Ahmed Abdulaal, Chen Jin, Nina Montaña-Brown et al.
DELTA: Dense Efficient Long-range 3D Tracking for Any Video
Tuan Ngo, Peiye Zhuang, Evangelos Kalogerakis et al.
Tailoring Mixup to Data for Calibration
Quentin Bouniot, Pavlo Mozharovskyi, Florence d'Alché-Buc
BlendRL: A Framework for Merging Symbolic and Neural Policy Learning
Hikaru Shindo, Quentin Delfosse, Devendra Singh Dhami et al.
MGMapNet: Multi-Granularity Representation Learning for End-to-End Vectorized HD Map Construction
Jing Yang, Minyue Jiang, Sen Yang et al.
On Evaluating the Durability of Safeguards for Open-Weight LLMs
Xiangyu Qi, Boyi Wei, Nicholas Carlini et al.
LeanVec: Searching vectors faster by making them fit
Ishwar Bhati, Cecilia Aguerrebere, Mark Hildebrand et al.
Efficient Dictionary Learning with Switch Sparse Autoencoders
Anish Mudide, Josh Engels, Eric Michaud et al.
Curriculum-aware Training for Discriminating Molecular Property Prediction Models
Hansi Yang, Quanming Yao, James Kwok
Rationalizing and Augmenting Dynamic Graph Neural Networks
Guibin Zhang, Yiyan Qi, Ziyang Cheng et al.
Maintaining Structural Integrity in Parameter Spaces for Parameter Efficient Fine-tuning
Chongjie Si, Xuehui Wang, Xue Yang et al.
On the Adversarial Risk of Test Time Adaptation: An Investigation into Realistic Test-Time Data Poisoning
Yongyi Su, Yushu Li, Nanqing Liu et al.
Evidential Learning-based Certainty Estimation for Robust Dense Feature Matching
Lile Cai, Chuan Sheng Foo, Xun Xu et al.
Policy Design in Long-run Welfare Dynamics
Jiduan Wu, Rediet Abebe, Moritz Hardt et al.
KAA: Kolmogorov-Arnold Attention for Enhancing Attentive Graph Neural Networks
Taoran Fang, Tianhong Gao, Chunping Wang et al.
DECO: Unleashing the Potential of ConvNets for Query-based Detection and Segmentation
Xinghao Chen, Siwei Li, Yijing Yang et al.
A Theoretical Framework for Partially-Observed Reward States in RLHF
Chinmaya Kausik, Mirco Mutti, Aldo Pacchiano et al.
SPA-Bench: A Comprehensive Benchmark for Smartphone Agent Evaluation
Jingxuan Chen, Derek Yuen, Bin Xie et al.
DeeperForward: Enhanced Forward-Forward Training for Deeper and Better Performance
Liang Sun, Yang Zhang, Weizhao He et al.
SVBench: A Benchmark with Temporal Multi-Turn Dialogues for Streaming Video Understanding
Zhenyu Yang, Yuhang Hu, Zemin Du et al.
Spherical Tree-Sliced Wasserstein Distance
Viet-Hoang Tran, Thanh Chu, Minh-Khoi Nguyen-Nhat et al.
MAI: A Multi-turn Aggregation-Iteration Model for Composed Image Retrieval
Yanzhe Chen, Zhiwen Yang, Jinglin Xu et al.
Disentangled Representation Learning with the Gromov-Monge Gap
Théo Uscidda, Luca Eyring, Karsten Roth et al.
kNN Attention Demystified: A Theoretical Exploration for Scalable Transformers
Themistoklis Haris
Towards Unified Human Motion-Language Understanding via Sparse Interpretable Characterization
Guangtao Lyu, Chenghao Xu, Jiexi Yan et al.
Efficient Low-Bit Quantization with Adaptive Scales for Multi-Task Co-Training
Boyu Liu, Haoyu Huang, Linlin Yang et al.
Regularizing Energy among Training Samples for Out-of-Distribution Generalization
Yiting Chen, Qitian Wu, Junchi Yan
Rethinking and Improving Autoformalization: Towards a Faithful Metric and a Dependency Retrieval-based Approach
Qi Liu, Xinhao Zheng, Xudong Lu et al.
Learning Structured Universe Graph with Outlier OOD Detection for Partial Matching
Zetian Jiang, Jiaxin Lu, Haizhao Fan et al.
What Secrets Do Your Manifolds Hold? Understanding the Local Geometry of Generative Models
Ahmed Imtiaz Humayun, Ibtihel Amara, Cristina Nader Vasconcelos et al.
To Clip or not to Clip: the Dynamics of SGD with Gradient Clipping in High-Dimensions
Noah Marshall, Ke Liang Xiao, Atish Agarwala et al.
UniCO: On Unified Combinatorial Optimization via Problem Reduction to Matrix-Encoded General TSP
Wenzheng Pan, Hao Xiong, Jiale Ma et al.
Forget the Data and Fine-Tuning! Just Fold the Network to Compress
Dong Wang, Haris Šikić, Lothar Thiele et al.
Statistical Advantages of Perturbing Cosine Router in Mixture of Experts
Huy Nguyen, Pedram Akbarian Saravi, Trang Pham et al.
Learning Geometric Reasoning Networks For Robot Task And Motion Planning
Smail Ait Bouhsain, Rachid Alami, Thierry Simeon
Prompting Fairness: Integrating Causality to Debias Large Language Models
Jingling Li, Zeyu Tang, Xiaoyu Liu et al.
Dynamic Negative Guidance of Diffusion Models
Felix Koulischer, Johannes Deleu, Gabriel Raya et al.
Bilinear MLPs enable weight-based mechanistic interpretability
Michael Pearce, Thomas Dooms, Alice Rigg et al.
Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization
Wenkai Yang, Shiqi Shen, Guangyao Shen et al.
DocMIA: Document-Level Membership Inference Attacks against DocVQA Models
Khanh Nguyen, Raouf Kerkouche, Mario Fritz et al.
Fine-Tuning Token-Based Large Multimodal Models: What Works, What Doesn’t and What’s Next
Zhulin Hu, Yan Ma, Jiadi Su et al.
Training-Free Diffusion Model Alignment with Sampling Demons
Po-Hung Yeh, Kuang-Huei Lee, Jun-Cheng Chen
Uncertainty-Aware Decoding with Minimum Bayes Risk
Nico Daheim, Clara Meister, Thomas Möllenhoff et al.
LDAdam: Adaptive Optimization from Low-Dimensional Gradient Statistics
Thomas Robert, Mher Safaryan, Ionut-Vlad Modoranu et al.
Tracking objects that change in appearance with phase synchrony
Sabine Muzellec, Drew Linsley, Alekh Ashok et al.
Descent with Misaligned Gradients and Applications to Hidden Convexity
Aditya Bhaskara, Ashok Cutkosky, Ravi Kumar et al.
Diffusion State-Guided Projected Gradient for Inverse Problems
Rayhan Zirvi, Bahareh Tolooshams, Anima Anandkumar
Learning from weak labelers as constraints
Vishwajeet Agrawal, Rattana Pukdee, Nina Balcan et al.
A Distributional Approach to Uncertainty-Aware Preference Alignment Using Offline Demonstrations
Sheng Xu, Bo Yue, Hongyuan Zha et al.
Estimating the Probabilities of Rare Outputs in Language Models
Gabriel Wu, Jacob Hilton
Self-Normalized Resets for Plasticity in Continual Learning
Vivek Farias, Adam Jozefiak
Training on the Test Task Confounds Evaluation and Emergence
Ricardo Dominguez-Olmedo, Florian Eddie Dorner, Moritz Hardt
COME: Test-time Adaption by Conservatively Minimizing Entropy
Qingyang Zhang, Yatao Bian, Xinke Kong et al.
Oracle efficient truncated statistics
Konstantinos Karatapanis, Vasilis Kontonis, Christos Tzamos
Training Free Guided Flow-Matching with Optimal Control
Luran Wang, Chaoran Cheng, Yizhen Liao et al.
SaLoRA: Safety-Alignment Preserved Low-Rank Adaptation
Mingjie Li, Wai Man Si, Michael Backes et al.