Most Cited NEURIPS "self-introspection capabilities" Papers

5,858 papers found • Page 3 of 30

#401

Progressive Inference-Time Annealing of Diffusion Models for Sampling from Boltzmann Densities

Tara Akhound-Sadegh, Jungyoon Lee, Joey Bose et al.

NEURIPS 2025spotlightarXiv:2506.16471
16
citations
#402

LeVo: High-Quality Song Generation with Multi-Preference Alignment

Shun Lei, Yaoxun XU, ZhiweiLin et al.

NEURIPS 2025arXiv:2506.07520
16
citations
#403

ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference

Xiang Liu, Zhenheng Tang, Peijie Dong et al.

NEURIPS 2025arXiv:2502.00299
16
citations
#404

Breaking the Batch Barrier (B3) of Contrastive Learning via Smart Batch Mining

Raghuveer Thirukovalluru, Rui Meng, Ye Liu et al.

NEURIPS 2025spotlightarXiv:2505.11293
16
citations
#405

SEC-bench: Automated Benchmarking of LLM Agents on Real-World Software Security Tasks

Hwiwon Lee, Ziqi Zhang, Hanxiao Lu et al.

NEURIPS 2025arXiv:2506.11791
16
citations
#406

Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)

Liwei Jiang, Yuanjun Chai, Margaret Li et al.

NEURIPS 2025oralarXiv:2510.22954
16
citations
#407

Metis: A Foundation Speech Generation Model with Masked Generative Pre-training

Yuancheng Wang, Jiachen Zheng, Junan Zhang et al.

NEURIPS 2025arXiv:2502.03128
16
citations
#408

Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization

Tao Zhang, Cheng Da, Kun Ding et al.

NEURIPS 2025arXiv:2502.01051
16
citations
#409

VideoChat-R1.5: Visual Test-Time Scaling to Reinforce Multimodal Reasoning by Iterative Perception

Ziang Yan, Yinan He, Xinhao Li et al.

NEURIPS 2025oralarXiv:2509.21100
16
citations
#410

OpenVLThinker: Complex Vision-Language Reasoning via Iterative SFT-RL Cycles

Yihe Deng, Hritik Bansal, Fan Yin et al.

NEURIPS 2025arXiv:2503.17352
16
citations
#411

SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning

Xiao Liang, Zhong-Zhi Li, Yeyun Gong et al.

NEURIPS 2025arXiv:2506.08989
16
citations
#412

GVPO: Group Variance Policy Optimization for Large Language Model Post-Training

Kaichen Zhang, Yuzhong Hong, Junwei Bao et al.

NEURIPS 2025arXiv:2504.19599
16
citations
#413

AutoRedTeamer: Autonomous Red Teaming with Lifelong Attack Integration

Andy Zhou, Kevin Wu, Francesco Pinto et al.

NEURIPS 2025arXiv:2503.15754
16
citations
#414

Repo2Run: Automated Building Executable Environment for Code Repository at Scale

Ruida Hu, Chao Peng, XinchenWang et al.

NEURIPS 2025spotlightarXiv:2502.13681
16
citations
#415

FastVID: Dynamic Density Pruning for Fast Video Large Language Models

Leqi Shen, Guoqiang Gong, Tao He et al.

NEURIPS 2025oralarXiv:2503.11187
16
citations
#416

Let LRMs Break Free from Overthinking via Self-Braking Tuning

Haoran Zhao, Yuchen Yan, Yongliang Shen et al.

NEURIPS 2025arXiv:2505.14604
16
citations
#417

Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search

Yuxian Gu, Qinghao Hu, Haocheng Xi et al.

NEURIPS 2025arXiv:2508.15884
16
citations
#418

ThinkSound: Chain-of-Thought Reasoning in Multimodal LLMs for Audio Generation and Editing

Huadai Liu, Kaicheng Luo, Jialei Wang et al.

NEURIPS 2025oral
16
citations
#419

Dynamical Decoupling of Generalization and Overfitting in Large Two-Layer Networks

Andrea Montanari, Pierfrancesco Urbani

NEURIPS 2025oralarXiv:2502.21269
16
citations
#420

InfiniPot-V: Memory-Constrained KV Cache Compression for Streaming Video Understanding

Minsoo Kim, Kyuhong Shim, Jungwook Choi et al.

NEURIPS 2025oralarXiv:2506.15745
16
citations
#421

CARES: Comprehensive Evaluation of Safety and Adversarial Robustness in Medical LLMs

Sijia Chen, Xiaomin Li, mengxue zhang et al.

NEURIPS 2025arXiv:2505.11413
16
citations
#422

Sharp Analysis for KL-Regularized Contextual Bandits and RLHF

Heyang Zhao, Chenlu Ye, Quanquan Gu et al.

NEURIPS 2025arXiv:2411.04625
16
citations
#423

Language Models Are Capable of Metacognitive Monitoring and Control of Their Internal Activations

Ji-An Li, Huadong Xiong, Robert Wilson et al.

NEURIPS 2025arXiv:2505.13763
16
citations
#424

Memory Injection Attacks on LLM Agents via Query-Only Interaction

Shen Dong, Shaochen Xu, Pengfei He et al.

NEURIPS 2025arXiv:2503.03704
16
citations
#425

Improving LLM General Preference Alignment via Optimistic Online Mirror Descent

Yuheng Zhang, Dian Yu, Tao Ge et al.

NEURIPS 2025spotlightarXiv:2502.16852
16
citations
#426

DisasterM3: A Remote Sensing Vision-Language Dataset for Disaster Damage Assessment and Response

Junjue Wang, Weihao Xuan, Heli Qi et al.

NEURIPS 2025oralarXiv:2505.21089
15
citations
#427

Vision Transformers Don't Need Trained Registers

Nicholas Jiang, Amil Dravid, Alexei Efros et al.

NEURIPS 2025spotlightarXiv:2506.08010
15
citations
#428

Transformer Copilot: Learning from The Mistake Log in LLM Fine-tuning

Jiaru Zou, Yikun Ban, Zihao Li et al.

NEURIPS 2025spotlightarXiv:2505.16270
15
citations
#429

RelationAdapter: Learning and Transferring Visual Relation with Diffusion Transformers

Yan Gong, Yiren Song, Yicheng Li et al.

NEURIPS 2025arXiv:2506.02528
15
citations
#430

Guided Diffusion Sampling on Function Spaces with Applications to PDEs

Jiachen Yao, Abbas Mammadov, Julius Berner et al.

NEURIPS 2025arXiv:2505.17004
15
citations
#431

Think or Not? Exploring Thinking Efficiency in Large Reasoning Models via an Information-Theoretic Lens

Xixian Yong, Xiao Zhou, Yingying Zhang et al.

NEURIPS 2025spotlightarXiv:2505.18237
15
citations
#432

Pseudo-Labeling for Kernel Ridge Regression under Covariate Shift

Kaizheng Wang

NEURIPS 2025arXiv:2302.10160
15
citations
#433

EffiBench-X: A Multi-Language Benchmark for Measuring Efficiency of LLM-Generated Code

Yuhao Qing, Boyu Zhu, Mingzhe Du et al.

NEURIPS 2025arXiv:2505.13004
15
citations
#434

ThinkBench: Dynamic Out-of-Distribution Evaluation for Robust LLM Reasoning

Shulin Huang, Linyi Yang, Yan Song et al.

NEURIPS 2025arXiv:2502.16268
15
citations
#435

TS-RAG: Retrieval-Augmented Generation based Time Series Foundation Models are Stronger Zero-Shot Forecaster

Kanghui Ning, Zijie Pan, Yu Liu et al.

NEURIPS 2025arXiv:2503.07649
15
citations
#436

Universal Cross-Tokenizer Distillation via Approximate Likelihood Matching

Benjamin Minixhofer, Ivan Vulić, Edoardo Maria Ponti

NEURIPS 2025arXiv:2503.20083
15
citations
#437

QFFT, Question-Free Fine-Tuning for Adaptive Reasoning

Wanlong Liu, Junxiao Xu, Fei Yu et al.

NEURIPS 2025spotlightarXiv:2506.12860
15
citations
#438

VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations on Synthetic Video Understanding

Zongxia Li, Xiyang Wu, Guangyao Shi et al.

NEURIPS 2025arXiv:2505.01481
15
citations
#439

MDNS: Masked Diffusion Neural Sampler via Stochastic Optimal Control

Yuchen Zhu, Wei Guo, Jaemoo Choi et al.

NEURIPS 2025arXiv:2508.10684
15
citations
#440

Solver-Informed RL: Grounding Large Language Models for Authentic Optimization Modeling

Yitian Chen, Jingfan Xia, Siyu Shao et al.

NEURIPS 2025arXiv:2505.11792
15
citations
#441

From Flat to Hierarchical: Extracting Sparse Representations with Matching Pursuit

Valérie Costa, Thomas Fel, Ekdeep S Lubana et al.

NEURIPS 2025arXiv:2506.03093
15
citations
#442

Establishing Best Practices in Building Rigorous Agentic Benchmarks

Yuxuan Zhu, Tengjun Jin, Yada Pruksachatkun et al.

NEURIPS 2025arXiv:2507.02825
15
citations
#443

Benchmarking End-To-End Performance of AI-Based Chip Placement Algorithms

Zhihai Wang, Zijie Geng, Zhaojie Tu et al.

NEURIPS 2025arXiv:2407.15026
15
citations
#444

AutoPartGen: Autoregressive 3D Part Generation and Discovery

Minghao Chen, Jianyuan Wang, Roman Shapovalov et al.

NEURIPS 2025
15
citations
#445

DrivingRecon: Large 4D Gaussian Reconstruction Model For Autonomous Driving

Hao LU, Tianshuo Xu, Wenzhao Zheng et al.

NEURIPS 2025arXiv:2412.09043
15
citations
#446

MedAgentBoard: Benchmarking Multi-Agent Collaboration with Conventional Methods for Diverse Medical Tasks

Yinghao Zhu, Ziyi He, Haoran Hu et al.

NEURIPS 2025arXiv:2505.12371
15
citations
#447

OmniSync: Towards Universal Lip Synchronization via Diffusion Transformers

Ziqiao Peng, Jiwen Liu, Haoxian Zhang et al.

NEURIPS 2025oralarXiv:2505.21448
15
citations
#448

AutoToM: Scaling Model-based Mental Inference via Automated Agent Modeling

Zhining Zhang, Chuanyang Jin, Mung Yao Jia et al.

NEURIPS 2025spotlightarXiv:2502.15676
15
citations
#449

1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities

Kevin Wang, Ishaan Javali, Michał Bortkiewicz et al.

NEURIPS 2025oralarXiv:2503.14858
15
citations
#450

DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents

Hao Li, Xiaogeng Liu, CHIU Chun et al.

NEURIPS 2025arXiv:2506.12104
15
citations
#451

AGENTIF: Benchmarking Large Language Models Instruction Following Ability in Agentic Scenarios

Yunjia Qi, Hao Peng, Xiaozhi Wang et al.

NEURIPS 2025spotlight
15
citations
#452

Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections

Bo Wang, Qinyuan Cheng, Runyu Peng et al.

NEURIPS 2025arXiv:2507.00018
15
citations
#453

MobileUse: A Hierarchical Reflection-Driven GUI Agent for Autonomous Mobile Operation

Ning Li, Xiangmou Qu, Jiamu Zhou et al.

NEURIPS 2025oral
15
citations
#454

Adversarial Locomotion and Motion Imitation for Humanoid Policy Learning

Jiyuan Shi, Xinzhe Liu, Dewei Wang et al.

NEURIPS 2025arXiv:2504.14305
15
citations
#455

STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis

Jiatao Gu, Tianrong Chen, David Berthelot et al.

NEURIPS 2025spotlightarXiv:2506.06276
15
citations
#456

NFIG: Multi-Scale Autoregressive Image Generation via Frequency Ordering

Zhihao Huang, Xi Qiu, Yukuo Ma et al.

NEURIPS 2025arXiv:2503.07076
15
citations
#457

On Reasoning Strength Planning in Large Reasoning Models

Leheng Sheng, An Zhang, Zijian Wu et al.

NEURIPS 2025arXiv:2506.08390
15
citations
#458

4DGT: Learning a 4D Gaussian Transformer Using Real-World Monocular Videos

Zhen Xu, Zhengqin Li, Zhao Dong et al.

NEURIPS 2025spotlightarXiv:2506.08015
15
citations
#459

ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding

junliang ye, Zhengyi Wang, Ruowen Zhao et al.

NEURIPS 2025spotlightarXiv:2506.01853
15
citations
#460

Conformal Prediction for Causal Effects of Continuous Treatments

Maresa Schröder, Dennis Frauen, Jonas Schweisthal et al.

NEURIPS 2025arXiv:2407.03094
14
citations
#461

Mix Data or Merge Models? Balancing the Helpfulness, Honesty, and Harmlessness of Large Language Model via Model Merging

Jinluan Yang, Dingnan Jin, Anke Tang et al.

NEURIPS 2025arXiv:2502.06876
14
citations
#462

Joint Velocity-Growth Flow Matching for Single-Cell Dynamics Modeling

Dongyi Wang, Yuanwei Jiang, Zhenyi Zhang et al.

NEURIPS 2025arXiv:2505.13413
14
citations
#463

RepoMaster: Autonomous Exploration and Understanding of GitHub Repositories for Complex Task Solving

Huacan Wang, Ziyi Ni, Shuo Zhang et al.

NEURIPS 2025spotlightarXiv:2505.21577
14
citations
#464

Noise Hypernetworks: Amortizing Test-Time Compute in Diffusion Models

Luca Eyring, Shyamgopal Karthik, Alexey Dosovitskiy et al.

NEURIPS 2025arXiv:2508.09968
14
citations
#465

SeePhys: Does Seeing Help Thinking? – Benchmarking Vision-Based Physics Reasoning

Kun Xiang, Heng Li, Terry Jingchen Zhang et al.

NEURIPS 2025arXiv:2505.19099
14
citations
#466

Enhancing Time Series Forecasting through Selective Representation Spaces: A Patch Perspective

Xingjian Wu, Xiangfei Qiu, Hanyin Cheng et al.

NEURIPS 2025arXiv:2510.14510
14
citations
#467

UniRelight: Learning Joint Decomposition and Synthesis for Video Relighting

Kai He, Ruofan Liang, Jacob Munkberg et al.

NEURIPS 2025oralarXiv:2506.15673
14
citations
#468

REArtGS: Reconstructing and Generating Articulated Objects via 3D Gaussian Splatting with Geometric and Motion Constraints

Di Wu, Liu Liu, Zhou Linli et al.

NEURIPS 2025arXiv:2503.06677
14
citations
#469

DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models

Ziyi Wu, Anil Kag, Ivan Skorokhodov et al.

NEURIPS 2025oralarXiv:2506.03517
14
citations
#470

MLIP Arena: Advancing Fairness and Transparency in Machine Learning Interatomic Potentials via an Open, Accessible Benchmark Platform

Yuan Chiang, Tobias Kreiman, Christine Zhang et al.

NEURIPS 2025spotlightarXiv:2509.20630
14
citations
#471

Multi-step Visual Reasoning with Visual Tokens Scaling and Verification

Tianyi Bai, Zengjie Hu, Fupeng Sun et al.

NEURIPS 2025arXiv:2506.07235
14
citations
#472

Recognition through Reasoning: Reinforcing Image Geo-localization with Large Vision-Language Models

Ling Li, Yao Zhou, Yuxuan Liang et al.

NEURIPS 2025arXiv:2506.14674
14
citations
#473

MiniMax-Remover: Taming Bad Noise Helps Video Object Removal

Bojia Zi, Weixuan Peng, Xianbiao Qi et al.

NEURIPS 2025arXiv:2505.24873
14
citations
#474

Consistent Sampling and Simulation: Molecular Dynamics with Energy-Based Diffusion Models

Michael Plainer, Hao Wu, Leon Klein et al.

NEURIPS 2025arXiv:2506.17139
14
citations
#475

PipeFusion: Patch-level Pipeline Parallelism for Diffusion Transformers Inference

Jiarui Fang, Jinzhe Pan, Aoyu Li et al.

NEURIPS 2025arXiv:2405.14430
14
citations
#476

Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding

Yixiong Fang, Ziran Yang, Zhaorun Chen et al.

NEURIPS 2025arXiv:2412.06474
14
citations
#477

Puppeteer: Rig and Animate Your 3D Models

Chaoyue Song, Xiu Li, Fan Yang et al.

NEURIPS 2025oralarXiv:2508.10898
14
citations
#478

Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning

Chaofan Lin, Jiaming Tang, Shuo Yang et al.

NEURIPS 2025spotlightarXiv:2502.02770
14
citations
#479

VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning

Senqiao Yang, Junyi Li, Xin Lai et al.

NEURIPS 2025arXiv:2507.13348
14
citations
#480

Meta-World+: An Improved, Standardized, RL Benchmark

Reginald McLean, Evangelos Chatzaroulas, Luc McCutcheon et al.

NEURIPS 2025arXiv:2505.11289
14
citations
#481

Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning

Yana Wei, Liang Zhao, Jianjian Sun et al.

NEURIPS 2025arXiv:2507.05255
14
citations
#482

Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration

Hao Zhong, Muzhi Zhu, Zongze Du et al.

NEURIPS 2025oralarXiv:2505.20256
14
citations
#483

Backdoor Cleaning without External Guidance in MLLM Fine-tuning

Xuankun Rong, Wenke Huang, Jian Liang et al.

NEURIPS 2025arXiv:2505.16916
14
citations
#484

TESTING STATIONARITY AND CHANGE POINT DETECTION IN REINFORCEMENT LEARNING

Mengbing Li, Chengchun Shi, Zhenke Wu et al.

NEURIPS 2025arXiv:2203.01707
14
citations
#485

Panacea: Mitigating Harmful Fine-tuning for Large Language Models via Post-fine-tuning Perturbation

Yibo Wang, Tiansheng Huang, Li Shen et al.

NEURIPS 2025arXiv:2501.18100
14
citations
#486

MemSim: A Bayesian Simulator for Evaluating Memory of LLM-based Personal Assistants

Zeyu Zhang, Quanyu Dai, Luyu Chen et al.

NEURIPS 2025arXiv:2409.20163
14
citations
#487

OLinear: A Linear Model for Time Series Forecasting in Orthogonally Transformed Domain

Wenzhen Yue, Yong Liu, Hao Wang et al.

NEURIPS 2025oralarXiv:2505.08550
14
citations
#488

Nested Learning: The Illusion of Deep Learning Architectures

Ali Behrouz, Meisam Razaviyayn, Peilin Zhong et al.

NEURIPS 2025arXiv:2512.24695
14
citations
#489

ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness

Yijun Liang, Ming Li, Chenrui Fan et al.

NEURIPS 2025arXiv:2504.10514
14
citations
#490

Communication-Efficient Language Model Training Scales Reliably and Robustly: Scaling Laws for DiLoCo

Zachary Charles, Gabriel Teston, Lucio Dery et al.

NEURIPS 2025spotlightarXiv:2503.09799
14
citations
#491

Touch in the Wild: Learning Fine-Grained Manipulation with a Portable Visuo-Tactile Gripper

Xinyue Zhu, Binghao Huang, Yunzhu Li

NEURIPS 2025arXiv:2507.15062
14
citations
#492

Adaptive Classifier-Free Guidance via Dynamic Low-Confidence Masking

Pengxiang Li, Shilin Yan, Jiayin Cai et al.

NEURIPS 2025arXiv:2505.20199
14
citations
#493

UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface

Hao Tang, Chen-Wei Xie, Haiyang Wang et al.

NEURIPS 2025spotlightarXiv:2503.01342
14
citations
#494

Scaling Laws for Optimal Data Mixtures

Mustafa Shukor, Louis Bethune, Dan Busbridge et al.

NEURIPS 2025arXiv:2507.09404
14
citations
#495

Do-PFN: In-Context Learning for Causal Effect Estimation

Jake Robertson, Arik Reuter, Siyuan Guo et al.

NEURIPS 2025spotlightarXiv:2506.06039
14
citations
#496

Look Before You Leap: A GUI-Critic-R1 Model for Pre-Operative Error Diagnosis in GUI Automation

Yuyang Wanyan, Xi Zhang, Haiyang Xu et al.

NEURIPS 2025arXiv:2506.04614
13
citations
#497

UFM: A Simple Path towards Unified Dense Correspondence with Flow

Yuchen Zhang, Nikhil Keetha, Chenwei Lyu et al.

NEURIPS 2025arXiv:2506.09278
13
citations
#498

Hyperbolic Fine-Tuning for Large Language Models

Menglin Yang, Ram Samarth B B, Aosong Feng et al.

NEURIPS 2025spotlightarXiv:2410.04010
13
citations
#499

R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing

Tianyu Fu, Yi Ge, Yichen You et al.

NEURIPS 2025arXiv:2505.21600
13
citations
#500

Enhancing Personalized Multi-Turn Dialogue with Curiosity Reward

Yanming Wan, Jiaxing Wu, Marwa Abdulhai et al.

NEURIPS 2025arXiv:2504.03206
13
citations
#501

Jasmine: Harnessing Diffusion Prior for Self-supervised Depth Estimation

Jiyuan Wang, Chunyu Lin, cheng guan et al.

NEURIPS 2025arXiv:2503.15905
13
citations
#502

Geometry Aware Operator Transformer as an efficient and accurate neural surrogate for PDEs on arbitrary domains

Shizheng Wen, Arsh Kumbhat, Levi Lingsch et al.

NEURIPS 2025arXiv:2505.18781
13
citations
#503

OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data

Yiren Song, Cheng Liu, Mike Zheng Shou

NEURIPS 2025arXiv:2505.18445
13
citations
#504

Heterogeneous Swarms: Jointly Optimizing Model Roles and Weights for Multi-LLM Systems

Shangbin Feng, Zifeng Wang, Palash Goyal et al.

NEURIPS 2025arXiv:2502.04510
13
citations
#505

Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning

Kongcheng Zhang, QI YAO, Shunyu Liu et al.

NEURIPS 2025arXiv:2506.08745
13
citations
#506

Topology of Reasoning: Understanding Large Reasoning Models through Reasoning Graph Properties

Gouki Minegishi, Hiroki Furuta, Takeshi Kojima et al.

NEURIPS 2025arXiv:2506.05744
13
citations
#507

Linguini: A benchmark for language-agnostic linguistic reasoning

Eduardo Sánchez, Belen Alastruey, Christophe Ropers et al.

NEURIPS 2025arXiv:2409.12126
13
citations
#508

Imagine360: Immersive 360 Video Generation from Perspective Anchor

Jing Tan, Shuai Yang, Tong Wu et al.

NEURIPS 2025arXiv:2412.03552
13
citations
#509

KARMA: Leveraging Multi-Agent LLMs for Automated Knowledge Graph Enrichment

Yuxing Lu, Wei Wu, Xukai Zhao et al.

NEURIPS 2025spotlightarXiv:2502.06472
13
citations
#510

Pixel-Perfect Depth with Semantics-Prompted Diffusion Transformers

Gangwei Xu, Haotong Lin, Hongcheng Luo et al.

NEURIPS 2025arXiv:2510.07316
13
citations
#511

Worse than Zero-shot? A Fact-Checking Dataset for Evaluating the Robustness of RAG Against Misleading Retrievals

Linda Zeng, Rithwik Gupta, Divij Motwani et al.

NEURIPS 2025arXiv:2502.16101
13
citations
#512

BadVLA: Towards Backdoor Attacks on Vision-Language-Action Models via Objective-Decoupled Optimization

Xueyang Zhou, Guiyao Tie, Guowen Zhang et al.

NEURIPS 2025arXiv:2505.16640
13
citations
#513

Classical Planning with LLM-Generated Heuristics: Challenging the State of the Art with Python Code

Augusto B. Corrêa, André G. Pereira, Jendrik Seipp

NEURIPS 2025arXiv:2503.18809
13
citations
#514

Contextual Integrity in LLMs via Reasoning and Reinforcement Learning

Guangchen (Eric) Lan, Huseyin A. Inan, Sahar Abdelnabi et al.

NEURIPS 2025arXiv:2506.04245
13
citations
#515

JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent

Yunlong Lin, Zixu Lin, Kunjie Lin et al.

NEURIPS 2025arXiv:2506.17612
13
citations
#516

CoT Red-Handed: Stress Testing Chain-of-Thought Monitoring

Benjamin Arnav, Pablo Bernabeu-Perez, Nathan Helm-Burger et al.

NEURIPS 2025arXiv:2505.23575
13
citations
#517

CLEVER: A Curated Benchmark for Formally Verified Code Generation

Amitayush Thakur, Jasper Lee, George Tsoukalas et al.

NEURIPS 2025arXiv:2505.13938
13
citations
#518

Teaching Language Models to Evolve with Users: Dynamic Profile Modeling for Personalized Alignment

Weixiang Zhao, Xingyu Sui, Yulin Hu et al.

NEURIPS 2025arXiv:2505.15456
13
citations
#519

Latent Chain-of-Thought for Visual Reasoning

Guohao Sun, Hang Hua, Jian Wang et al.

NEURIPS 2025arXiv:2510.23925
13
citations
#520

Solving Inequality Proofs with Large Language Models

Jiayi Sheng, Luna Lyu, Jikai Jin et al.

NEURIPS 2025spotlightarXiv:2506.07927
13
citations
#521

Large Language Models Miss the Multi-agent Mark

Emanuele La Malfa, Gabriele La Malfa, Samuele Marro et al.

NEURIPS 2025arXiv:2505.21298
13
citations
#522

The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization

Jae-Won Chung, Jeff J. Ma, Ruofan Wu et al.

NEURIPS 2025spotlightarXiv:2505.06371
13
citations
#523

SURDS: Benchmarking Spatial Understanding and Reasoning in Driving Scenarios with Vision Language Models

Xianda Guo, Ruijun Zhang, Yiqun Duan et al.

NEURIPS 2025arXiv:2411.13112
13
citations
#524

R&D-Agent-Quant: A Multi-Agent Framework for Data-Centric Factors and Model Joint Optimization

Yuante Li, Xu Yang, Xiao Yang et al.

NEURIPS 2025arXiv:2505.15155
13
citations
#525

Detecting High-Stakes Interactions with Activation Probes

Alex McKenzie, Urja Pawar, Phil Blandfort et al.

NEURIPS 2025arXiv:2506.10805
13
citations
#526

Ambient Diffusion Omni: Training Good Models with Bad Data

Giannis Daras, Adrian Rodriguez-Munoz, Adam Klivans et al.

NEURIPS 2025spotlightarXiv:2506.10038
13
citations
#527

FocalCodec: Low-Bitrate Speech Coding via Focal Modulation Networks

Luca Della Libera, Francesco Paissan, Cem Subakan et al.

NEURIPS 2025arXiv:2502.04465
13
citations
#528

ATLAS: Autoformalizing Theorems through Lifting, Augmentation, and Synthesis of Data

Xiaoyang Liu, Kangjie Bao, Jiashuo Zhang et al.

NEURIPS 2025arXiv:2502.05567
13
citations
#529

Distilling LLM Agent into Small Models with Retrieval and Code Tools

Minki Kang, Jongwon Jeong, Seanie Lee et al.

NEURIPS 2025spotlightarXiv:2505.17612
13
citations
#530

Large language models can learn and generalize steganographic chain-of-thought under process supervision

ROBERT MC CARTHY, Joey SKAF, Luis Ibanez-Lissen et al.

NEURIPS 2025arXiv:2506.01926
13
citations
#531

CoRe: Benchmarking LLMs’ Code Reasoning Capabilities through Static Analysis Tasks

Danning Xie, Mingwei Zheng, Xuwei Liu et al.

NEURIPS 2025spotlightarXiv:2507.05269
13
citations
#532

This Time is Different: An Observability Perspective on Time Series Foundation Models

Ben Cohen, Emaad Khwaja, Youssef Doubli et al.

NEURIPS 2025arXiv:2505.14766
13
citations
#533

AI-Researcher: Autonomous Scientific Innovation

Jiabin Tang, Lianghao Xia, Zhonghang Li et al.

NEURIPS 2025spotlightarXiv:2505.18705
13
citations
#534

VLForgery Face Triad: Detection, Localization and Attribution via Multimodal Large Language Models

Xinan He, Yue Zhou, Bing Fan et al.

NEURIPS 2025arXiv:2503.06142
13
citations
#535

Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL

Jiarui Yao, Yifan Hao, Hanning Zhang et al.

NEURIPS 2025arXiv:2505.02391
13
citations
#536

A Closer Look at TabPFN v2: Understanding Its Strengths and Extending Its Capabilities

Han-Jia Ye, Si-Yang Liu, Wei-Lun (Harry) Chao

NEURIPS 2025arXiv:2502.17361
13
citations
#537

Point-RFT: Improving Multimodal Reasoning with Visually Grounded Reinforcement Finetuning

Minheng Ni, Zhengyuan Yang, Linjie Li et al.

NEURIPS 2025arXiv:2505.19702
13
citations
#538

The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise

Shuze Daniel Liu, Shuhang Chen, Shangtong Zhang

NEURIPS 2025oralarXiv:2401.07844
13
citations
#539

BountyBench: Dollar Impact of AI Agent Attackers and Defenders on Real-World Cybersecurity Systems

Andy Zhang, Joey Ji, Celeste Menders et al.

NEURIPS 2025arXiv:2505.15216
13
citations
#540

70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float (DFloat11)

Tianyi Zhang, Mohsen Hariri, Shaochen (Henry) Zhong et al.

NEURIPS 2025arXiv:2504.11651
13
citations
#541

AdvPrefix: An Objective for Nuanced LLM Jailbreaks

Sicheng Zhu, Brandon Amos, Yuandong Tian et al.

NEURIPS 2025arXiv:2412.10321
12
citations
#542

Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers

Zeyuan Allen-Zhu

NEURIPS 2025arXiv:2512.17351
12
citations
#543

LinPrim: Linear Primitives for Differentiable Volumetric Rendering

Nicolas von Lützow, Matthias Niessner

NEURIPS 2025arXiv:2501.16312
12
citations
#544

Breaking Latent Prior Bias in Detectors for Generalizable AIGC Image Detection

Yue Zhou, Xinan He, Kaiqing Lin et al.

NEURIPS 2025arXiv:2506.00874
12
citations
#545

QoQ-Med: Building Multimodal Clinical Foundation Models with Domain-Aware GRPO Training

David Dai, Peilin Chen, Chanakya Ekbote et al.

NEURIPS 2025oralarXiv:2506.00711
12
citations
#546

Rewind-to-Delete: Certified Machine Unlearning for Nonconvex Functions

Siqiao Mu, Diego Klabjan

NEURIPS 2025arXiv:2409.09778
12
citations
#547

Universal Video Temporal Grounding with Generative Multi-modal Large Language Models

Zeqian Li, Shangzhe Di, Zhonghua Zhai et al.

NEURIPS 2025oralarXiv:2506.18883
12
citations
#548

Scaling Laws for Gradient Descent and Sign Descent for Linear Bigram Models under Zipf’s Law

Frederik Kunstner, Francis Bach

NEURIPS 2025arXiv:2505.19227
12
citations
#549

Diffusion Tree Sampling: Scalable inference‑time alignment of diffusion models

Vineet Jain, Kusha Sareen, Mohammad Pedramfar et al.

NEURIPS 2025arXiv:2506.20701
12
citations
#550

DyG-Mamba: Continuous State Space Modeling on Dynamic Graphs

Dongyuan Li, Shiyin Tan, Ying Zhang et al.

NEURIPS 2025arXiv:2408.06966
12
citations
#551

VLA-OS: Structuring and Dissecting Planning Representations and Paradigms in Vision-Language-Action Models

Chongkai Gao, Zixuan Liu, Zhenghao Chi et al.

NEURIPS 2025arXiv:2506.17561
12
citations
#552

Mol-LLaMA: Towards General Understanding of Molecules in Large Molecular Language Model

Dongki Kim, Wonbin Lee, Sung Ju Hwang

NEURIPS 2025arXiv:2502.13449
12
citations
#553

GASP: Efficient Black-Box Generation of Adversarial Suffixes for Jailbreaking LLMs

Advik Basani, Xiao Zhang

NEURIPS 2025arXiv:2411.14133
12
citations
#554

Searching Latent Program Spaces

Matthew Macfarlane, Clem Bonnet

NEURIPS 2025spotlightarXiv:2411.08706
12
citations
#555

RealMath: A Continuous Benchmark for Evaluating Language Models on Research-Level Mathematics

Jie Zhang, Cezara Petrui, Kristina Nikolić et al.

NEURIPS 2025arXiv:2505.12575
12
citations
#556

Theoretically Grounded Framework for LLM Watermarking: A Distribution-Adaptive Approach

Haiyun He, Yepeng Liu, Ziqiao Wang et al.

NEURIPS 2025arXiv:2410.02890
12
citations
#557

UniGen: Enhanced Training & Test-Time Strategies for Unified Multimodal Understanding and Generation

Rui Tian, Mingfei Gao, Mingze Xu et al.

NEURIPS 2025arXiv:2505.14682
12
citations
#558

GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution

Fengxiang Wang, Mingshuo Chen, Yueying Li et al.

NEURIPS 2025spotlightarXiv:2505.21375
12
citations
#559

AC-DiT: Adaptive Coordination Diffusion Transformer for Mobile Manipulation

Sixiang Chen, Jiaming Liu, Siyuan Qian et al.

NEURIPS 2025arXiv:2507.01961
12
citations
#560

Enhancing Multilingual LLM Pretraining with Model-Based Data Selection

Bettina Messmer, Vinko Sabolčec, Martin Jaggi

NEURIPS 2025arXiv:2502.10361
12
citations
#561

Bag of Tricks for Inference-time Computation of LLM Reasoning

Fan LIU, Wen-Shuo Chao, Naiqiang Tan et al.

NEURIPS 2025arXiv:2502.07191
12
citations
#562

KTAE: A Model-Free Algorithm to Key-Tokens Advantage Estimation in Mathematical Reasoning

Wei Sun, Wen Yang, Pu Jian et al.

NEURIPS 2025arXiv:2505.16826
12
citations
#563

Exploring the limits of strong membership inference attacks on large language models

Jamie Hayes, I Shumailov, Christopher A. Choquette-Choo et al.

NEURIPS 2025arXiv:2505.18773
12
citations
#564

Training-Free Guidance Beyond Differentiability: Scalable Path Steering with Tree Search in Diffusion and Flow Models

Yingqing Guo, Yukang Yang, Hui Yuan et al.

NEURIPS 2025arXiv:2502.11420
12
citations
#565

Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning of Vision Language Models

Huajie Tan, Yuheng Ji, Xiaoshuai Hao et al.

NEURIPS 2025arXiv:2503.20752
12
citations
#566

UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents

Han Xiao, Guozhi Wang, Yuxiang Chai et al.

NEURIPS 2025arXiv:2505.21496
12
citations
#567

On the Effect of Negative Gradient in Group Relative Deep Reinforcement Optimization

wenlong deng, Yi Ren, Muchen Li et al.

NEURIPS 2025arXiv:2505.18830
12
citations
#568

NAVIX: Scaling MiniGrid Environments with JAX

Eduardo Pignatelli, Jarek Liesen, Robert Lange et al.

NEURIPS 2025arXiv:2407.19396
12
citations
#569

Energy Matching: Unifying Flow Matching and Energy-Based Models for Generative Modeling

Michal Balcerak, Tamaz Amiranashvili, Antonio Terpin et al.

NEURIPS 2025arXiv:2504.10612
12
citations
#570

VORTA: Efficient Video Diffusion via Routing Sparse Attention

Wenhao Sun, Rong-Cheng Tu, Yifu Ding et al.

NEURIPS 2025arXiv:2505.18809
12
citations
#571

Accelerating RL for LLM Reasoning with Optimal Advantage Regression

Kianté Brantley, Mingyu Chen, Zhaolin Gao et al.

NEURIPS 2025arXiv:2505.20686
12
citations
#572

UniTraj: Learning a Universal Trajectory Foundation Model from Billion-Scale Worldwide Traces

Yuanshao Zhu, James Yu, Xiangyu Zhao et al.

NEURIPS 2025arXiv:2411.03859
12
citations
#573

$Q\sharp$: Provably Optimal Distributional RL for LLM Post-Training

Jin Zhou, Kaiwen Wang, Jonathan Chang et al.

NEURIPS 2025arXiv:2502.20548
12
citations
#574

DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization

Gang Li, Ming Lin, Tomer Galanti et al.

NEURIPS 2025arXiv:2505.12366
12
citations
#575

Equivariance Everywhere All At Once: A Recipe for Graph Foundation Models

Ben Finkelshtein, Ismail Ilkan Ceylan, Michael Bronstein et al.

NEURIPS 2025arXiv:2506.14291
12
citations
#576

VAGEN: Reinforcing World Model Reasoning for Multi-Turn VLM Agents

Kangrui Wang, Pingyue Zhang, Zihan Wang et al.

NEURIPS 2025arXiv:2510.16907
12
citations
#577

Test-Time Scaling of Diffusion Models via Noise Trajectory Search

Vignav Ramesh, Morteza Mardani

NEURIPS 2025arXiv:2506.03164
12
citations
#578

LIMOPro: Reasoning Refinement for Efficient and Effective Test-time Scaling

Yang Xiao, Jiashuo WANG, Ruifeng Yuan et al.

NEURIPS 2025arXiv:2505.19187
12
citations
#579

Flow-Based Policy for Online Reinforcement Learning

Lei Lv, Yunfei Li, Yu Luo et al.

NEURIPS 2025arXiv:2506.12811
12
citations
#580

FEAT: Free energy Estimators with Adaptive Transport

Yuanqi Du, Jiajun He, Francisco Vargas et al.

NEURIPS 2025arXiv:2504.11516
11
citations
#581

Quantization Error Propagation: Revisiting Layer-Wise Post-Training Quantization

Yamato Arai, Yuma Ichikawa

NEURIPS 2025arXiv:2504.09629
11
citations
#582

NeuralPLexer3: Accurate Biomolecular Complex Structure Prediction with Flow Models

Jarren Zhuoran Qiao, Feizhi Ding, Thomas Dresselhaus et al.

NEURIPS 2025arXiv:2412.10743
11
citations
#583

ConTextTab: A Semantics-Aware Tabular In-Context Learner

Marco Spinaci, Marek Polewczyk, Maximilian Schambach et al.

NEURIPS 2025spotlightarXiv:2506.10707
11
citations
#584

DOVE: Efficient One-Step Diffusion Model for Real-World Video Super-Resolution

Zheng Chen, Zichen Zou, Kewei Zhang et al.

NEURIPS 2025arXiv:2505.16239
11
citations
#585

FlowCut: Rethinking Redundancy via Information Flow for Efficient Vision-Language Models

Jintao Tong, Wenwei Jin, Pengda Qin et al.

NEURIPS 2025arXiv:2505.19536
11
citations
#586

Scaling Embedding Layers in Language Models

Da Yu, Edith Cohen, Badih Ghazi et al.

NEURIPS 2025arXiv:2502.01637
11
citations
#587

Reviving DSP for Advanced Theorem Proving in the Era of Reasoning Models

Chenrui Cao, Liangcheng Song, Zenan Li et al.

NEURIPS 2025arXiv:2506.11487
11
citations
#588

Whole-Body Conditioned Egocentric Video Prediction

Yutong Bai, Danny Tran, Amir Bar et al.

NEURIPS 2025arXiv:2506.21552
11
citations
#589

Preference Optimization on Pareto Sets: On a Theory of Multi-Objective Optimization

Abhishek Roy, Geelon So, Yian Ma

NEURIPS 2025
11
citations
#590

GUI-Reflection: Empowering Multimodal GUI Models with Self-Reflection Behavior

Penghao Wu, Shengnan Ma, Bo Wang et al.

NEURIPS 2025arXiv:2506.08012
11
citations
#591

Reward-Instruct: A Reward-Centric Approach to Fast Photo-Realistic Image Generation

Yihong Luo, Tianyang Hu, Weijian Luo et al.

NEURIPS 2025arXiv:2503.13070
11
citations
#592

Vision-centric Token Compression in Large Language Model

Ling Xing, Alex Jinpeng Wang, Rui Yan et al.

NEURIPS 2025spotlightarXiv:2502.00791
11
citations
#593

EmergentTTS-Eval: Evaluating TTS Models on Complex Prosodic, Expressiveness, and Linguistic Challenges Using Model-as-a-Judge

Ruskin Raj Manku, Yuzhi Tang, Xingjian Shi et al.

NEURIPS 2025arXiv:2505.23009
11
citations
#594

The Lighthouse of Language: Enhancing LLM Agents via Critique-Guided Improvement

Ruihan Yang, Fanghua Ye, Jian Li et al.

NEURIPS 2025arXiv:2503.16024
11
citations
#595

Locality in Image Diffusion Models Emerges from Data Statistics

Artem Lukoianov, Chenyang Yuan, Justin Solomon et al.

NEURIPS 2025spotlightarXiv:2509.09672
11
citations
#596

EgoExoBench: A Benchmark for First- and Third-person View Video Understanding in MLLMs

Yuping He, Yifei Huang, Guo Chen et al.

NEURIPS 2025oralarXiv:2507.18342
11
citations
#597

The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text

Nikhil Kandpal, Brian Lester, Colin Raffel et al.

NEURIPS 2025arXiv:2506.05209
11
citations
#598

UAV-Flow Colosseo: A Real-World Benchmark for Flying-on-a-Word UAV Imitation Learning

Xiangyu Wang, Donglin Yang, Yue Liao et al.

NEURIPS 2025arXiv:2505.15725
11
citations
#599

3DLLM-Mem: Long-Term Spatial-Temporal Memory for Embodied 3D Large Language Model

Wenbo Hu, Yining Hong, Yanjun Wang et al.

NEURIPS 2025oralarXiv:2505.22657
11
citations
#600

GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents

Manish Shetty, Naman Jain, Jinjian Liu et al.

NEURIPS 2025arXiv:2505.23671
11
citations