ICLR 2025 Papers

Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers

Yuchen Liang, Peizhong Ju, Yingbin Liang et al.

ICLR 2025 poster, arXiv:2412.07684

The Pitfalls of Memorization: When Memorization Hurts Generalization

Reza Bayat, Mohammad Pezeshki, Elvis Dohmatob et al.

The Power of LLM-Generated Synthetic Data for Stance Detection in Online Political Discussions

Stefan Sylvius Wagner, Maike Behrendt, Marc Ziegele et al.

The Ramanujan Library - Automated Discovery on the Hypergraph of Integer Relations

Itay Beit Halachmi, Ido Kaminer

ICLR 2025 poster, arXiv:2412.07298

The Rise and Down of Babel Tower: Investigating the Evolution Process of Multilingual Code Large Language Model

Jiawei Chen, Wentao Chen, Jing Su et al.

ICLR 2025 poster, arXiv:2409.07200

ThermalGaussian: Thermal 3D Gaussian Splatting

Rongfeng Lu, Hangyu Chen, Zunjie Zhu et al.

The Robustness of Differentiable Causal Discovery in Misspecified Scenarios

Huiyang Yi, Yanyan He, Duxin Chen et al.

The Same but Different: Structural Similarities and Differences in Multilingual Language Modeling

Ruochen Zhang, Qinan Yu, Matianyu Zang et al.

The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities

Zhaofeng Wu, Xinyan Yu, Dani Yogatama et al.

The Superposition of Diffusion Models Using the Itô Density Estimator

Marta Skreta, Lazar Atanackovic, Joey Bose et al.

ICLR 2025 poster, arXiv:2403.17887

The Unreasonable Ineffectiveness of the Deeper Layers

Andrey Gromov, Kushal Tirumala, Hassan Shapourian et al.

160

ICLR 2025 poster, arXiv:2412.09119

The Utility and Complexity of In- and Out-of-Distribution Machine Unlearning

Youssef Allouah, Joshua Kazdan, Rachid Guerraoui et al.

The Value of Sensory Information to a Robot

Arjun Krishna, Edward Hu, Dinesh Jayaraman

ThinkBot: Embodied Instruction Following with Thought Chain Reasoning

Guanxing Lu, Ziwei Wang, Changliu Liu et al.

Think-on-Graph 2.0: Deep and Faithful Large Language Model Reasoning with Knowledge-guided Retrieval Augmented Generation

Shengjie Ma, Chengjin Xu, Xuhui Jiang et al.

Think Then React: Towards Unconstrained Action-to-Reaction Motion Generation

Wenhui Tan, Boyuan Li, Chuhao Jin et al.

ThinK: Thinner Key Cache by Query-Driven Pruning

Yuhui Xu, Zhanming Jie, Hanze Dong et al.

ICLR 2025 poster, arXiv:2410.13413

Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models

Chengyu Du, Jinyi Han, Yizhou Ying et al.

Think while You Generate: Discrete Diffusion with Planned Denoising

Sulin Liu, Juno Nam, Andrew Campbell et al.

Three-in-One: Fast and Accurate Transducer for Hybrid-Autoregressive ASR

Hainan Xu, Travis Bartley, Vladimir Bataev et al.

Three Mechanisms of Feature Learning in a Linear Network

Yizhou Xu, Liu Ziyin

ThunderKittens: Simple, Fast, and $\textit{Adorable}$ Kernels

Benjamin Spector, Simran Arora, Aaryan Singhal et al.

TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention

Lijie Yang, Zhihao Zhang, Zhuofu Chen et al.

ICLR 2025 oral, arXiv:2410.01469

TIGER: Time-frequency Interleaved Gain Extraction and Reconstruction for Efficient Speech Separation

Mohan Xu, Kai Li, Guo Chen et al.

TIGeR: Unifying Text-to-Image Generation and Retrieval with Large Multimodal Models

Leigang Qu, Haochuan Li, Tan Wang et al.

ICLR 2025 poster, arXiv:2502.15315

Tight Clusters Make Specialized Experts

Stefan Nielsen, Rachel Teo, Laziz Abdullaev et al.

Tighter Privacy Auditing of DP-SGD in the Hidden State Threat Model

Tudor Cebere, Aurélien Bellet, Nicolas Papernot

Tight Lower Bounds under Asymmetric High-Order Hölder Smoothness and Uniform Convexity

Cedar Site Bai, Brian Bullins

Tight Time Complexities in Parallel Stochastic Optimization with Arbitrary Computation Dynamics

Alexander Tyurin

ICLR 2025 poster, arXiv:2503.15890

Time After Time: Deep-Q Effect Estimation for Interventions on When and What to do

Yoav Wald, Mark Goldstein, Yonathan Efroni et al.

TimeInf: Time Series Data Contribution via Influence Functions

Yizi Zhang, Jingyan Shen, Xiaoxue Xiong et al.

TimeKAN: KAN-based Frequency Decomposition Learning Architecture for Long-term Time Series Forecasting

Songtao Huang, Zhen Zhao, Can Li et al.

TimeMixer++: A General Time Series Pattern Machine for Universal Predictive Analysis

Shiyu Wang, Jiawei Li, Xiaoming Shi et al.

Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts

Xiaoming Shi, Shiyu Wang, Yuqi Nie et al.

170

Timer-XL: Long-Context Transformers for Unified Time Series Forecasting

Yong Liu, Guo Qin, Xiangdong Huang et al.

TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning

Xiangyu Zeng, Kunchang Li, Chenting Wang et al.

Time-to-Event Pretraining for 3D Medical Imaging

Zepeng Frazier Huo, Jason Fries, Alejandro Lozano et al.

TIPS: Text-Image Pretraining with Spatial awareness

Kevis-Kokitsi Maninis, Kaifeng Chen, Soham Ghosh et al.

TIS-DPO: Token-level Importance Sampling for Direct Preference Optimization With Estimated Weights

Aiwei Liu, Haoping Bai, Zhiyun Lu et al.

T-JEPA: Augmentation-Free Self-Supervised Learning for Tabular Data

Hugo Thimonier, José Lucas De Melo Costa, Fabrice Popineau et al.

TLDR: Token-Level Detective Reward Model for Large Vision Language Models

Deqing Fu, Tong Xiao, Rui Wang et al.

To Clip or not to Clip: the Dynamics of SGD with Gradient Clipping in High-Dimensions

Noah Marshall, Ke Liang Xiao, Atish Agarwala et al.

To Code or Not To Code? Exploring Impact of Code in Pre-training

Viraat Aryabumi, Yixuan Su, Raymond Ma et al.

ICLR 2025 poster, arXiv:2409.12183

To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning

Zayne Sprague, Fangcong Yin, Juan Rodriguez et al.

239

ToddlerDiffusion: Interactive Structured Image Generation with Cascaded Schrödinger Bridge

Eslam Abdelrahman, Liangbing Zhao, Tao Hu et al.

TODO: Enhancing LLM Alignment with Ternary Preferences

Yuxiang Guo, Lu Yin, Bo Jiang et al.

TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters

Haiyang Wang, Yue Fan, Muhammad Ferjad Naeem et al.

Token Statistics Transformer: Linear-Time Attention via Variational Rate Reduction

Ziyang Wu, Tianjiao Ding, Yifu Lu et al.

Token-Supervised Value Models for Enhancing Mathematical Problem-Solving Capabilities of Large Language Models

Jung Hyun Lee, June Yong Yang, Byeongho Heo et al.