NeurIPS 2025 Papers
5,858 papers found • Page 116 of 118
What Happens During the Loss Plateau? Understanding Abrupt Learning in Transformers
Pulkit Gopalani, Wei Hu
What is Your Data Worth to GPT? LLM-Scale Data Valuation with Influence Functions
Sang Choe, Hwijeen Ahn, Juhan Bae et al.
What Makes a Reward Model a Good Teacher? An Optimization Perspective
Noam Razin, Zixuan Wang, Hubert Strauss et al.
What Makes Math Problems Hard for Reinforcement Learning: A Case Study
Ali Shehper, Anibal Medina-Mardones, Lucas Fagan et al.
What Matters in Data for DPO?
Yu Pan, Zhongze Cai, Huaiyang Zhong et al.
What Moves the Eyes: Doubling Mechanistic Model Performance Using Deep Networks to Discover and Test Cognitive Hypotheses
Federico D'Agostino, Lisa Schwetlick, Matthias Bethge et al.
What One Cannot, Two Can: Two-Layer Transformers Provably Represent Induction Heads on Any-Order Markov Chains
Chanakya Ekbote, Ashok Vardhan Makkuva, Marco Bondaschi et al.
What Really is a Member? Discrediting Membership Inference via Poisoning
Neal Mangaokar, Ashish Hooda, Zhuohang Li et al.
What’s in Common? Multimodal Models Hallucinate When Reasoning Across Scenes
Candace Ross, Florian Bordes, Adina Williams et al.
What's Producible May Not Be Reachable: Measuring the Steerability of Generative Models
Keyon Vafa, Sarah Bentley, Jon Kleinberg et al.
What We Miss Matters: Learning from the Overlooked in Point Cloud Transformers
Yi Wang, Jiaze Wang, Ziyu Guo et al.
When Additive Noise Meets Unobserved Mediators: Bivariate Denoising Diffusion for Causal Discovery
Dominik Meier, Sujai Hiremath, Promit Ghosal et al.
When and how can inexact generative models still sample from the data manifold?
Nisha Chandramoorthy, Adriaan de Clercq
When Are Concepts Erased From Diffusion Models?
Kevin Lu, Nicky Kriplani, Rohit Gandikota et al.
When Can Model-Free Reinforcement Learning be Enough for Thinking?
Josiah Hanna, Nicholas Corrado
When Causal Dynamics Matter: Adapting Causal Strategies through Meta-Aware Interventions
Moritz Willig, Tim Woydt, Devendra Singh Dhami et al.
When Data Can't Meet: Estimating Correlation Across Privacy Barriers
Abhinav Chakraborty, Arnab Auddy, T. Tony Cai
When Does Closeness in Distribution Imply Representational Similarity? An Identifiability Perspective
Beatrix Nielsen, Emanuele Marconato, Andrea Dittadi et al.
When Does Curriculum Learning Help? A Theoretical Perspective
Raman Arora, Yunjuan Wang, Kaibo Zhang
When Do Transformers Outperform Feedforward and Recurrent Networks? A Statistical Perspective
Alireza Mousavi-Hosseini, Clayton Sanford, Denny Wu et al.
When Kernels Multiply, Clusters Unify: Fusing Embeddings with the Kronecker Product
Youqi Wu, Jingwei Zhang, Farzan Farnia
When Less Language is More: Language-Reasoning Disentanglement Makes LLMs Better Multilingual Reasoners
Weixiang Zhao, Jiahe Guo, Yang Deng et al.
When Lower-Order Terms Dominate: Adaptive Expert Algorithms for Heavy-Tailed Losses
Antoine Moulin, Emmanuel Esposito, Dirk van der Hoeven
When majority rules, minority loses: bias amplification of gradient descent
François Bachoc, Jerome Bolte, Ryan Boustany et al.
When Models Don’t Collapse: On the Consistency of Iterative MLE
Daniel Barzilai, Ohad Shamir
When Models Know More Than They Can Explain: Quantifying Knowledge Transfer in Human-AI Collaboration
Quan Shi, Carlos Jimenez, Shunyu Yao et al.
When No Paths Lead to Rome: Benchmarking Systematic Neural Relational Reasoning
Anirban Das, Muhammad Irtaza Khalid, Rafael Peñaloza et al.
When One Moment Isn't Enough: Multi-Moment Retrieval with Cross-Moment Interactions
Zhuo Cao, Heming Du, Bingqing Zhang et al.
When Semantics Mislead Vision: Mitigating Large Multimodal Models Hallucinations in Scene Text Spotting and Understanding
Yan Shu, Hangui Lin, Yexin Liu et al.
When Thinking Drifts: Evidential Grounding for Robust Video Reasoning
Romy Luo, Zihui (Sherry) Xue, Alex Dimakis et al.
When Thinking Fails: The Pitfalls of Reasoning for Instruction-Following in LLMs
Xiaomin Li, Zhou Yu, Zhiwei Zhang et al.
When Worse is Better: Navigating the Compression Generation Trade-off In Visual Tokenization
Vivek Ramanujan, Kushal Tirumala, Armen Aghajanyan et al.
Where and How to Perturb: On the Design of Perturbation Guidance in Diffusion and Flow Models
Donghoon Ahn, Jiwon Kang, Sanghyun Lee et al.
Where Does It Exist from the Low-Altitude: Spatial Aerial Video Grounding
Yang Zhan, Yuan Yuan
Where Graph Meets Heterogeneity: Multi-View Collaborative Graph Experts
Zhihao Wu, Jinyu Cai, Yunhe Zhang et al.
Which Algorithms Have Tight Generalization Bounds?
Michael Gastpar, Ido Nachum, Jonathan Shafer et al.
Which Data Attributes Stimulate Math and Code Reasoning? An Investigation via Influence Functions
Siqi Kou, Qingyuan Tian, Hanwen Xu et al.
Whitened Score Diffusion: A Structured Prior for Imaging Inverse Problems
Jeffrey Alido, Tongyu Li, Yu Sun et al.
Whole-Body Conditioned Egocentric Video Prediction
Yutong Bai, Danny Tran, Amir Bar et al.
Who Reasons in the Large Language Models?
Jie Shao, Jianxin Wu
Whose Instructions Count? Resolving Preference Bias in Instruction Fine-Tuning
Jiayu Zhang, Changbang Li, Yinan Peng et al.
Whose View of Safety? A Deep DIVE Dataset for Pluralistic Alignment of Text-to-Image Models
Charvi Rastogi, Tian Huey Teh, Pushkar Mishra et al.
Who Speaks for the Trigger? Dynamic Expert Routing in Backdoored Mixture-of-Experts Transformers
Xin Zhao, Xiaojun Chen, Bingshan Liu et al.
Who You Are Matters: Bridging Interests and Social Roles via LLM-Enhanced Logic Recommendation
Qing Yu, Xiaobei Wang, Shuchang Liu et al.
Why 1 + 1 < 1 in Visual Token Pruning: Beyond Naive Integration via Multi-Objective Balanced Covering
Yangfu Li, Hongjian Zhan, Tianyi Chen et al.
Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations
Yiyou Sun, Yu Gai, Lijie Chen et al.
Why Diffusion Models Don’t Memorize: The Role of Implicit Dynamical Regularization in Training
Tony Bonnaire, Raphaël Urfin, Giulio Biroli et al.
Why Do Multi-Agent LLM Systems Fail?
Mert Cemri, Melissa Z Pan, Shuyi Yang et al.
Why Do Some Language Models Fake Alignment While Others Don't?
Abhay Sheshadri, John Hughes, Julian Michael et al.
Why Knowledge Distillation Works in Generative Models: A Minimal Working Explanation
Sungmin Cha, Kyunghyun Cho