NeurIPS 2025 Papers

5,858 papers found • Page 116 of 118

What Happens During the Loss Plateau? Understanding Abrupt Learning in Transformers

Pulkit Gopalani, Wei Hu

NeurIPS 2025 poster • arXiv:2506.13688 • 1 citation

What is Your Data Worth to GPT? LLM-Scale Data Valuation with Influence Functions

Sang Choe, Hwijeen Ahn, Juhan Bae et al.

NeurIPS 2025 poster

What Makes a Reward Model a Good Teacher? An Optimization Perspective

Noam Razin, Zixuan Wang, Hubert Strauss et al.

NeurIPS 2025 spotlight

What Makes Math Problems Hard for Reinforcement Learning: A Case Study

Ali Shehper, Anibal Medina-Mardones, Lucas Fagan et al.

NeurIPS 2025 poster • 7 citations

What Matters in Data for DPO?

Yu Pan, Zhongze Cai, Huaiyang Zhong et al.

NeurIPS 2025 poster • 5 citations

What Moves the Eyes: Doubling Mechanistic Model Performance Using Deep Networks to Discover and Test Cognitive Hypotheses

Federico D'Agostino, Lisa Schwetlick, Matthias Bethge et al.

NeurIPS 2025 oral

What One Cannot, Two Can: Two-Layer Transformers Provably Represent Induction Heads on Any-Order Markov Chains

Chanakya Ekbote, Ashok Vardhan Makkuva, Marco Bondaschi et al.

NeurIPS 2025 spotlight • arXiv:2508.07208

What Really is a Member? Discrediting Membership Inference via Poisoning

Neal Mangaokar, Ashish Hooda, Zhuohang Li et al.

NeurIPS 2025 poster

What’s in Common? Multimodal Models Hallucinate When Reasoning Across Scenes

Candace Ross, Florian Bordes, Adina Williams et al.

NeurIPS 2025 poster

What's Producible May Not Be Reachable: Measuring the Steerability of Generative Models

Keyon Vafa, Sarah Bentley, Jon Kleinberg et al.

NeurIPS 2025 poster • 2 citations

What We Miss Matters: Learning from the Overlooked in Point Cloud Transformers

Yi Wang, Jiaze Wang, Ziyu Guo et al.

NeurIPS 2025 poster

When Additive Noise Meets Unobserved Mediators: Bivariate Denoising Diffusion for Causal Discovery

Dominik Meier, Sujai Hiremath, Promit Ghosal et al.

NeurIPS 2025 poster

When and how can inexact generative models still sample from the data manifold?

Nisha Chandramoorthy, Adriaan de Clercq

NeurIPS 2025 poster • arXiv:2508.07581

When Are Concepts Erased From Diffusion Models?

Kevin Lu, Nicky Kriplani, Rohit Gandikota et al.

NeurIPS 2025 poster • 5 citations

When Can Model-Free Reinforcement Learning be Enough for Thinking?

Josiah Hanna, Nicholas Corrado

NeurIPS 2025 poster

When Causal Dynamics Matter: Adapting Causal Strategies through Meta-Aware Interventions

Moritz Willig, Tim Woydt, Devendra Singh Dhami et al.

NeurIPS 2025 poster

When Data Can't Meet: Estimating Correlation Across Privacy Barriers

Abhinav Chakraborty, Arnab Auddy, T. Tony Cai

NeurIPS 2025 spotlight

When Does Closeness in Distribution Imply Representational Similarity? An Identifiability Perspective

Beatrix Nielsen, Emanuele Marconato, Andrea Dittadi et al.

NeurIPS 2025 poster

When Does Curriculum Learning Help? A Theoretical Perspective

Raman Arora, Yunjuan Wang, Kaibo Zhang

NeurIPS 2025 poster

When Do Transformers Outperform Feedforward and Recurrent Networks? A Statistical Perspective

Alireza Mousavi-Hosseini, Clayton Sanford, Denny Wu et al.

NeurIPS 2025 poster

When Kernels Multiply, Clusters Unify: Fusing Embeddings with the Kronecker Product

Youqi Wu, Jingwei Zhang, Farzan Farnia

NeurIPS 2025 poster

When Less Language is More: Language-Reasoning Disentanglement Makes LLMs Better Multilingual Reasoners

Weixiang Zhao, Jiahe Guo, Yang Deng et al.

NeurIPS 2025 spotlight

When Lower-Order Terms Dominate: Adaptive Expert Algorithms for Heavy-Tailed Losses

Antoine Moulin, Emmanuel Esposito, Dirk van der Hoeven

NeurIPS 2025 poster

When majority rules, minority loses: bias amplification of gradient descent

François Bachoc, Jerome Bolte, Ryan Boustany et al.

NeurIPS 2025 poster • 1 citation

When Models Don’t Collapse: On the Consistency of Iterative MLE

Daniel Barzilai, Ohad Shamir

NeurIPS 2025 poster

When Models Know More Than They Can Explain: Quantifying Knowledge Transfer in Human-AI Collaboration

Quan Shi, Carlos Jimenez, Shunyu Yao et al.

NeurIPS 2025 oral • 2 citations

When No Paths Lead to Rome: Benchmarking Systematic Neural Relational Reasoning

Anirban Das, Muhammad Irtaza Khalid, Rafael Peñaloza et al.

NeurIPS 2025 poster

When One Moment Isn't Enough: Multi-Moment Retrieval with Cross-Moment Interactions

Zhuo Cao, Heming Du, Bingqing Zhang et al.

NeurIPS 2025 oral • arXiv:2510.17218

When Semantics Mislead Vision: Mitigating Large Multimodal Models Hallucinations in Scene Text Spotting and Understanding

Yan Shu, Hangui Lin, Yexin Liu et al.

NeurIPS 2025 poster

When Thinking Drifts: Evidential Grounding for Robust Video Reasoning

Romy Luo, Zihui (Sherry) Xue, Alex Dimakis et al.

NeurIPS 2025 poster • 4 citations

When Thinking Fails: The Pitfalls of Reasoning for Instruction-Following in LLMs

Xiaomin Li, Zhou Yu, Zhiwei Zhang et al.

NeurIPS 2025 spotlight

When Worse is Better: Navigating the Compression-Generation Trade-off in Visual Tokenization

Vivek Ramanujan, Kushal Tirumala, Armen Aghajanyan et al.

NeurIPS 2025 spotlight

Where and How to Perturb: On the Design of Perturbation Guidance in Diffusion and Flow Models

Donghoon Ahn, Jiwon Kang, Sanghyun Lee et al.

NeurIPS 2025 poster • 1 citation

Where Does It Exist from the Low-Altitude: Spatial Aerial Video Grounding

Yang Zhan, Yuan Yuan

NeurIPS 2025 oral

Where Graph Meets Heterogeneity: Multi-View Collaborative Graph Experts

Zhihao Wu, Jinyu Cai, Yunhe Zhang et al.

NeurIPS 2025 poster

Which Algorithms Have Tight Generalization Bounds?

Michael Gastpar, Ido Nachum, Jonathan Shafer et al.

NeurIPS 2025 spotlight • arXiv:2410.01969

Which Data Attributes Stimulate Math and Code Reasoning? An Investigation via Influence Functions

Siqi Kou, Qingyuan Tian, Hanwen Xu et al.

NeurIPS 2025 poster

Whitened Score Diffusion: A Structured Prior for Imaging Inverse Problems

Jeffrey Alido, Tongyu Li, Yu Sun et al.

NeurIPS 2025 poster • 1 citation

Whole-Body Conditioned Egocentric Video Prediction

Yutong Bai, Danny Tran, Amir Bar et al.

NeurIPS 2025 poster

Who Reasons in the Large Language Models?

Jie Shao, Jianxin Wu

NeurIPS 2025 poster

Whose Instructions Count? Resolving Preference Bias in Instruction Fine-Tuning

Jiayu Zhang, Changbang Li, Yinan Peng et al.

NeurIPS 2025 poster

Whose View of Safety? A Deep DIVE Dataset for Pluralistic Alignment of Text-to-Image Models

Charvi Rastogi, Tian Huey Teh, Pushkar Mishra et al.

NeurIPS 2025 spotlight • arXiv:2507.13383 • 3 citations

Who Speaks for the Trigger? Dynamic Expert Routing in Backdoored Mixture-of-Experts Transformers

Xin Zhao, Xiaojun Chen, Bingshan Liu et al.

NeurIPS 2025 poster

Who You Are Matters: Bridging Interests and Social Roles via LLM-Enhanced Logic Recommendation

Qing Yu, Xiaobei Wang, Shuchang Liu et al.

NeurIPS 2025 oral

Why 1 + 1 < 1 in Visual Token Pruning: Beyond Naive Integration via Multi-Objective Balanced Covering

Yangfu Li, Hongjian Zhan, Tianyi Chen et al.

NeurIPS 2025 poster • 1 citation

Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations

Yiyou Sun, Yu Gai, Lijie Chen et al.

NeurIPS 2025 poster • 10 citations

Why Diffusion Models Don’t Memorize: The Role of Implicit Dynamical Regularization in Training

Tony Bonnaire, Raphaël Urfin, Giulio Biroli et al.

NeurIPS 2025 oral

Why Do Multi-Agent LLM Systems Fail?

Mert Cemri, Melissa Z Pan, Shuyi Yang et al.

NeurIPS 2025 spotlight • arXiv:2503.13657 • 188 citations

Why Do Some Language Models Fake Alignment While Others Don't?

Abhay Sheshadri, John Hughes, Julian Michael et al.

NeurIPS 2025 spotlight

Why Knowledge Distillation Works in Generative Models: A Minimal Working Explanation

Sungmin Cha, Kyunghyun Cho

NeurIPS 2025 poster