Most Cited NeurIPS Oral "adaptive image tokenization" Papers
5,858 papers found • Page 1 of 30
MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models
Chaoyou Fu, Peixian Chen, Yunhang Shen et al.
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
Yang Yue, Zhiqi Chen, Rui Lu et al.
The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity
Parshin Shojaee, Iman Mirzadeh, Keivan Alizadeh-Vahid et al.
Video-R1: Reinforcing Video Reasoning in MLLMs
Kaituo Feng, Kaixiong Gong, Bohao Li et al.
Why Do Multi-Agent LLM Systems Fail?
Mert Cemri, Melissa Z Pan, Shuyi Yang et al.
WebThinker: Empowering Large Reasoning Models with Deep Research Capability
Xiaoxi Li, Jiajie Jin, Guanting Dong et al.
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
Haozhe Wang, Chao Qu, Zuming Huang et al.
Training Language Models to Reason Efficiently
Daman Arora, Andrea Zanette
ToolRL: Reward is All Tool Learning Needs
Cheng Qian, Emre Can Acikgoz, Qi He et al.
Mean Flows for One-step Generative Modeling
Zhengyang Geng, Mingyang Deng, Xingjian Bai et al.
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
Jonas Geiping, Sean McLeish, Neel Jain et al.
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
Chaoyou Fu, Haojia Lin, Xiong Wang et al.
Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion
Xun Huang, Zhengqi Li, Guande He et al.
TTRL: Test-Time Reinforcement Learning
Yuxin Zuo, Kaiyan Zhang, Li Sheng et al.
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines
Xeron Du, Yifan Yao, Kaijing Ma et al.
Rethinking Joint Maximum Mean Discrepancy for Visual Domain Adaptation
Wei Wang, Haifeng Xia, Chao Huang et al.
Improving Video Generation with Human Feedback
Jie Liu, Gongye Liu, Jiajun Liang et al.
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search
Huanjin Yao, Jiaxing Huang, Wenhao Wu et al.
EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test
Yuhui Li, Fangyun Wei, Chao Zhang et al.
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
Mingjie Liu, Shizhe Diao, Ximing Lu et al.
Towards Thinking-Optimal Scaling of Test-Time Compute for LLM Reasoning
Wenkai Yang, Shuming Ma, Yankai Lin et al.
Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference
Yuan Feng, Junlin Lv, Yukun Cao et al.
AREAL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning
Wei Fu, Jiaxuan Gao, Xujie Shen et al.
MoBA: Mixture of Block Attention for Long-Context LLMs
Enzhe Lu, Zhejun Jiang, Jingyuan Liu et al.
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
Dongzhi Jiang, Ziyu Guo, Renrui Zhang et al.
Show-o2: Improved Native Unified Multimodal Models
Jinheng Xie, Zhenheng Yang, Mike Zheng Shou
Remarkable Robustness of LLMs: Stages of Inference?
Vedang Lad, Jin Hwa Lee, Wes Gurnee et al.
ImgEdit: A Unified Image Editing Dataset and Benchmark
Yang Ye, Xianyi He, Zongjian Li et al.
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
Di Liu, Meng Chen, Baotong Lu et al.
WebDancer: Towards Autonomous Information Seeking Agency
Jialong Wu, Baixuan Li, Runnan Fang et al.
LMFusion: Adapting Pretrained Language Models for Multimodal Generation
Weijia Shi, Xiaochuang Han, Chunting Zhou et al.
OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
Mengkang Hu, Yuhang Zhou, Wendong Fan et al.
A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders
David Chanin, James Wilken-Smith, Tomáš Dulka et al.
d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning
Siyan Zhao, Devaansh Gupta, Qinqing Zheng et al.
Audio Flamingo 3: Advancing Audio Intelligence with Fully Open Large Audio Language Models
Sreyan Ghosh, Arushi Goel, Jaehyeon Kim et al.
General-Reasoner: Advancing LLM Reasoning Across All Domains
Xueguang Ma, Qian Liu, Dongfu Jiang et al.
The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning
Xinyu Zhu, Mengzhou Xia, Zhepei Wei et al.
Remasking Discrete Diffusion Models with Inference-Time Scaling
Guanghan Wang, Yair Schiff, Subham Sahoo et al.
Offline Actor-Critic for Average Reward MDPs
William Powell, Jeongyeol Kwon, Qiaomin Xie et al.
APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay
Akshara Prabhakar, Zuxin Liu, Ming Zhu et al.
Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning
Chongyu Fan, Jiancheng Liu, Licong Lin et al.
Right Question is Already Half the Answer: Fully Unsupervised LLM Reasoning Incentivization
Qingyang Zhang, Haitao Wu, Changqing Zhang et al.
Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension
Yongdong Luo, Xiawu Zheng, Guilin Li et al.
Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence
Diankun Wu, Fangfu Liu, Yi-Hsin Hung et al.
dKV-Cache: The Cache for Diffusion Language Models
Xinyin Ma, Runpeng Yu, Gongfan Fang et al.
SWE-smith: Scaling Data for Software Engineering Agents
John Yang, Kilian Lieret, Carlos Jimenez et al.
UMA: A Family of Universal Models for Atoms
Brandon Wood, Misko Dzamba, Xiang Fu et al.
GoT: Unleashing Reasoning Capability of MLLM for Visual Generation and Editing
Rongyao Fang, Chengqi Duan, Kun Wang et al.
CSGO: Content-Style Composition in Text-to-Image Generation
Peng Xing, Haofan Wang, Yanpeng Sun et al.
Perception-R1: Pioneering Perception Policy with Reinforcement Learning
En Yu, Kangheng Lin, Liang Zhao et al.
Thinkless: LLM Learns When to Think
Gongfan Fang, Xinyin Ma, Xinchao Wang
LoRA vs Full Fine-tuning: An Illusion of Equivalence
Reece Shuttleworth, Jacob Andreas, Antonio Torralba et al.
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?
Zhaorun Chen, Zichen Wen, Yichao Du et al.
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning
Mingyang Chen, Linzhuang Sun, Tianpeng Li et al.
Frame Context Packing and Drift Prevention in Next-Frame-Prediction Video Diffusion Models
Lvmin Zhang, Shengqu Cai, Muyang Li et al.
Q-Insight: Understanding Image Quality via Visual Reinforcement Learning
Weiqi Li, Xuanyu Zhang, Shijie Zhao et al.
Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis
Tianbao Xie, Jiaqi Deng, Xiaochuan Li et al.
ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning
Chi-Pin Huang, Yueh-Hua Wu, Min-Hung Chen et al.
MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix
Ziyang Ma, Yinghao Ma, Yanqiao Zhu et al.
OmniBench: Towards The Future of Universal Omni-Language Models
Yizhi Li, Ge Zhang, Yinghao Ma et al.
RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics
Enshen Zhou, Jingkun An, Cheng Chi et al.
GRIT: Teaching MLLMs to Think with Images
Yue Fan, Xuehai He, Diji Yang et al.
WorldMem: Long-term Consistent World Simulation with Memory
Zeqi Xiao, Yushi Lan, Yifan Zhou et al.
CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models
Zhihang Lin, Mingbao Lin, Yuan Xie et al.
NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions
Weizhe Yuan, Jane Yu, Song Jiang et al.
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving
Daoguang Zan, Zhirong Huang, Wei Liu et al.
The Matrix: Infinite-Horizon World Generation with Real-Time Moving Control
Ruili Feng, Han Zhang, Zhilei Shu et al.
Knowledge Insulating Vision-Language-Action Models: Train Fast, Run Fast, Generalize Better
Danny Driess, Jost Springenberg, Brian Ichter et al.
S-GRPO: Early Exit via Reinforcement Learning in Reasoning Models
Muzhi Dai, Chenxu Yang, Qingyi Si
WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks
Ivan Evtimov, Arman Zharmagambetov, Aaron Grattafiori et al.
SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning
Peixian Ma, Xialie Zhuang, Chengjin Xu et al.
Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing
Junfei Wu, Jian Guan, Kaituo Feng et al.
VGGT-SLAM: Dense RGB SLAM Optimized on the SL(4) Manifold
Dominic Maggio, Hyungtae Lim, Luca Carlone
Atom of Thoughts for Markov LLM Test-Time Scaling
Fengwei Teng, Quan Shi, Zhaoyang Yu et al.
Detecting Data Deviations in Electronic Health Records
Kaiping Zheng, Horng-Ruey Chua, Beng Chin Ooi
RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning
Hao Gao, Shaoyu Chen, Bo Jiang et al.
LLM Generated Persona is a Promise with a Catch
Leon Li, Haozhe Chen, Hongseok Namkoong et al.
Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning
Yibin Wang, Zhimin Li, Yuhang Zang et al.
Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding
Ye Wang, Ziheng Wang, Boshen Xu et al.
WritingBench: A Comprehensive Benchmark for Generative Writing
Yuning Wu, Jiahao Mei, Ming Yan et al.
Video World Models with Long-term Spatial Memory
Tong Wu, Shuai Yang, Ryan Po et al.
Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking
Heli Ben-Hamu, Itai Gat, Daniel Severo et al.
PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding
Jang Hyun Cho, Andrea Madotto, Effrosyni Mavroudi et al.
Reasoning Gym: Reasoning Environments for Reinforcement Learning with Verifiable Rewards
Zafir Stojanovski, Oliver Stanley, Joe Sharratt et al.
Sampling-Efficient Test-Time Scaling: Self-Estimating the Best-of-N Sampling in Early Decoding
Yiming Wang, Pei Zhang, Siyuan Huang et al.
Scaling RL to Long Videos
Yukang Chen, Wei Huang, Baifeng Shi et al.
Agentic RL Scaling Law: Spontaneous Code Execution for Mathematical Problem Solving
Xinji Mai, Haotian Xu, Xing W et al.
Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space
Zhen Zhang, Xuehai He, Weixiang Yan et al.
Real-Time Execution of Action Chunking Flow Policies
Kevin Black, Manuel Galliker, Sergey Levine
Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation
Shanchuan Lin, Ceyuan Yang, Hao He et al.
GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents
Qianhui Wu, Kanzhi Cheng, Rui Yang et al.
Generalizing Verifiable Instruction Following
Valentina Pyatkin, Saumya Malik, Victoria Graf et al.
Sparse MeZO: Less Parameters for Better Performance in Zeroth-Order LLM Fine-Tuning
Yong Liu, Zirui Zhu, Chaoyu Gong et al.
ReMA: Learning to Meta-Think for LLMs with Multi-agent Reinforcement Learning
Ziyu Wan, Yunxiang Li, Xiaoyu Wen et al.
EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation
Siyuan Huang, Liliang Chen, Pengfei Zhou et al.
Sparse Meets Dense: Unified Generative Recommendations with Cascaded Sparse-Dense Representations
Yuhao Yang, Zhi Ji, Zhaopeng Li et al.
Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective
Jorge (Zhoujun) Cheng, Shibo Hao, Tianyang Liu et al.
SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning
Rui Pan, Yinwei Dai, Zhihao Zhang et al.
Think Only When You Need with Large Hybrid-Reasoning Models
Lingjie Jiang, Xun Wu, Shaohan Huang et al.
Accelerating Diffusion LLMs via Adaptive Parallel Decoding
Daniel Israel, Guy Van den Broeck, Aditya Grover
Sparc3D: Sparse Representation and Construction for High-Resolution 3D Shapes Modeling
Zhihao Li, Yufei Wang, Heliang Zheng et al.
Direct3D-S2: Gigascale 3D Generation Made Easy with Spatial Sparse Attention
Shuang Wu, Youtian Lin, Feihu Zhang et al.
WISA: World simulator assistant for physics-aware text-to-video generation
Jing Wang, Ao Ma, Ke Cao et al.
Tensor Product Attention Is All You Need
Yifan Zhang, Yifeng Liu, Huizhuo Yuan et al.
UniCTokens: Boosting Personalized Understanding and Generation via Unified Concept Tokens
Ruichuan An, Sihan Yang, Renrui Zhang et al.
SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation
Zekun Qi, Wenyao Zhang, Yufei Ding et al.
Gradient Alignment in Physics-informed Neural Networks: A Second-Order Optimization Perspective
Sifan Wang, Ananyae Bhartari, Bowen Li et al.
Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing
Xiangyu Zhao, Peiyuan Zhang, Kexian Tang et al.
Reasoning Models Better Express Their Confidence
Dongkeun Yoon, Seungone Kim, Sohee Yang et al.
Align Your Flow: Scaling Continuous-Time Flow Map Distillation
Amirmojtaba Sabour, Sanja Fidler, Karsten Kreis
HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages
Zhilin Wang, Jiaqi Zeng, Olivier Delalleau et al.
Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs
Mantas Mazeika, Xuwang Yin, Rishub Tamirisa et al.
WorldModelBench: Judging Video Generation Models As World Models
Dacheng Li, Yunhao Fang, Yukang Chen et al.
OpenCUA: Open Foundations for Computer-Use Agents
Xinyuan Wang, Bowen Wang, Dunjie Lu et al.
TabArena: A Living Benchmark for Machine Learning on Tabular Data
Nick Erickson, Lennart Purucker, Andrej Tschalzev et al.
KungfuBot: Physics-Based Humanoid Whole-Body Control for Learning Highly-Dynamic Skills
Weiji Xie, Jinrui Han, Jiakun Zheng et al.
TiRex: Zero-Shot Forecasting Across Long and Short Horizons with Enhanced In-Context Learning
Andreas Auer, Patrick Podest, Daniel Klotz et al.
Informed Correctors for Discrete Diffusion Models
Yixiu Zhao, Jiaxin Shi, Feng Chen et al.
EfficientVLA: Training-Free Acceleration and Compression for Vision-Language-Action Models
Yantai Yang, Yuhao Wang, Zichen Wen et al.
Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation
Shuo Yang, Haocheng Xi, Yilong Zhao et al.
VisualQuality-R1: Reasoning-Induced Image Quality Assessment via Reinforcement Learning to Rank
Tianhe Wu, Jian Zou, Jie Liang et al.
A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers
Will Merrill, Ashish Sabharwal
VideoRFT: Incentivizing Video Reasoning Capability in MLLMs via Reinforced Fine-Tuning
Qi Wang, Yanrui Yu, Ye Yuan et al.
Policy learning “without” overlap: Pessimism and generalized empirical Bernstein’s inequality
Ying Jin, Zhimei Ren, Zhuoran Yang et al.
Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation
Zhe Kong, Feng Gao, Yong Zhang et al.
Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation
Siwei Wen, Junyan Ye, Peilin Feng et al.
Unlocking Multimodal Mathematical Reasoning via Process Reward Model
Ruilin Luo, Zhuofan Zheng, Lei Wang et al.
Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos
Weifeng Lin, Xinyu Wei, Ruichuan An et al.
Fast Solvers for Discrete Diffusion Models: Theory and Applications of High-Order Algorithms
Yinuo Ren, Haoxuan Chen, Yuchen Zhu et al.
PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers
Yuchen Lin, Chenguo Lin, Panwang Pan et al.
Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models
Zemin Huang, Zhiyang Chen, Zijun Wang et al.
When Thinking Fails: The Pitfalls of Reasoning for Instruction-Following in LLMs
Xiaomin Li, Zhou Yu, Zhiwei Zhang et al.
MAT-Agent: Adaptive Multi-Agent Training Optimization
Jusheng Zhang, Kaitong Cai, Yijia Fan et al.
ReAgent-V: A Reward-Driven Multi-Agent Framework for Video Understanding
Yiyang Zhou, Yangfan He, Yaofeng Su et al.
OMEGA: Can LLMs Reason Outside the Box in Math? Evaluating Exploratory, Compositional, and Transformative Generalization
Yiyou Sun, Shawn Hu, Georgia Zhou et al.
More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models
Zhongxing Xu, Chengzhi Liu, Qingyue Wei et al.
Fast-in-Slow: A Dual-System VLA Model Unifying Fast Manipulation within Slow Reasoning
Hao Chen, Jiaming Liu, Chenyang Gu et al.
VLA-Cache: Efficient Vision-Language-Action Manipulation via Adaptive Token Caching
Siyu Xu, Yunke Wang, Chenghao Xia et al.
BridgeVLA: Input-Output Alignment for Efficient 3D Manipulation Learning with Vision-Language Models
Peiyan Li, Yixiang Chen, Hongtao Wu et al.
Theoretical Benefit and Limitation of Diffusion Language Model
Guhao Feng, Yihan Geng, Jian Guan et al.
Improving Retrieval-Augmented Generation through Multi-Agent Reinforcement Learning
Yiqun Chen, Lingyong Yan, Weiwei Sun et al.
Representation Entanglement for Generation: Training Diffusion Transformers Is Much Easier Than You Think
Ge Wu, Shen Zhang, Ruijing Shi et al.
Towards Understanding Camera Motions in Any Video
Zhiqiu Lin, Siyuan Cen, Daniel Jiang et al.
Projecting Assumptions: The Duality Between Sparse Autoencoders and Concept Geometry
Sai Sumedh R. Hindupur, Ekdeep S Lubana, Thomas Fel et al.
ASGO: Adaptive Structured Gradient Optimization
Kang An, Yuxing Liu, Rui Pan et al.
Point3R: Streaming 3D Reconstruction with Explicit Spatial Pointer Memory
Yuqi Wu, Wenzhao Zheng, Jie Zhou et al.
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
Sicong Leng, Yun Xing, Zesen Cheng et al.
AbstentionBench: Reasoning LLMs Fail on Unanswerable Questions
Polina Kirichenko, Mark Ibrahim, Kamalika Chaudhuri et al.
Chain-of-Retrieval Augmented Generation
Liang Wang, Haonan Chen, Nan Yang et al.
PHYBench: Holistic Evaluation of Physical Perception and Reasoning in Large Language Models
Shi Qiu, Shaoyang Guo, Zhuo-Yang Song et al.
LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming?
Zihan Zheng, Zerui Cheng, Zeyu Shen et al.
OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation
Shenghai Yuan, Xianyi He, Yufan Deng et al.
Stationary Kernels and Gaussian Processes on Lie Groups and their Homogeneous Spaces I: the compact case
Iskander Azangulov, Andrei Smolensky, Alexander Terenin et al.
XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation
Bowen Chen, Brynn Zhao, Haomiao Sun et al.
Grounded Reinforcement Learning for Visual Reasoning
Gabriel Sarch, Snigdha Saha, Naitik Khandelwal et al.
KGGen: Extracting Knowledge Graphs from Plain Text with Language Models
Belinda Mo, Kyssen Yu, Joshua Kazdan et al.
Señorita-2M: A High-Quality Instruction-based Dataset for General Video Editing by Video Specialists
Bojia Zi, Penghui Ruan, Marco Chen et al.
Multi-Agent Collaboration via Evolving Orchestration
Yufan Dang, Chen Qian, Xueheng Luo et al.
Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO
Chengzhuo Tong, Ziyu Guo, Renrui Zhang et al.
How to build a consistency model: Learning flow maps via self-distillation
Nicholas Boffi, Michael Albergo, Eric Vanden-Eijnden
EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Videos Generation
Xiaofeng Wang, Kang Zhao, Feng Liu et al.
Results of the Big ANN: NeurIPS’23 competition
Harsha Vardhan Simhadri, Martin Aumüller, Matthijs Douze et al.
Diffusion Beats Autoregressive in Data-Constrained Settings
Mihir Prabhudesai, Mengning Wu, Amir Zadeh et al.
Towards Understanding Safety Alignment: A Mechanistic Perspective from Safety Neurons
Jianhui Chen, Xiaozhi Wang, Zijun Yao et al.
Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors
Duo Zheng, Shijia Huang, Yanyang Li et al.
KVLink: Accelerating Large Language Models via Efficient KV Cache Reuse
Jingbo Yang, Bairu Hou, Wei Wei et al.
Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles
Jiangjie Chen, Qianyu He, Siyu Yuan et al.
DeltaProduct: Improving State-Tracking in Linear RNNs via Householder Products
Julien Siems, Timur Carstensen, Arber Zela et al.
ParetoQ: Improving Scaling Laws in Extremely Low-bit LLM Quantization
Zechun Liu, Changsheng Zhao, Hanxian Huang et al.
Checklists Are Better Than Reward Models For Aligning Language Models
Vijay Viswanathan, Yanchao Sun, Xiang Kong et al.
KRIS-Bench: Benchmarking Next-Level Intelligent Image Editing Models
Yongliang Wu, Zonghui Li, Xinting Hu et al.
DeepVideo-R1: Video Reinforcement Fine-Tuning via Difficulty-aware Regressive GRPO
Jinyoung Park, Jeehye Na, Jinyoung Kim et al.
SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents
Ibragim Badertdinov, Alexander Golubev, Maksim Nekrashevich et al.
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers
Wei Pang, Kevin Qinghong Lin, Xiangru Jian et al.
G-Memory: Tracing Hierarchical Memory for Multi-Agent Systems
Guibin Zhang, Muxin Fu, Kun Wang et al.
ClinicalLab: Aligning Agents for Multi-Departmental Clinical Diagnostics in the Real World
Weixiang Yan, Haitian Liu, Tengxiao Wu et al.
Scaling Unlocks Broader Generation and Deeper Functional Understanding of Proteins
Aadyot Bhatnagar, Sarthak Jain, Joel Beazer et al.
SWE-bench Goes Live!
Linghao Zhang, Shilin He, Chaoyun Zhang et al.
The Best Instruction-Tuning Data are Those That Fit
Dylan Zhang, Qirun Dai, Hao Peng
MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning
Xinyan Chen, Renrui Zhang, Dongzhi Jiang et al.
Demystifying Reasoning Dynamics with Mutual Information: Thinking Tokens are Information Peaks in LLM Reasoning
Chen Qian, Dongrui Liu, Hao Wen et al.
ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs
Jiaru Zou, Ling Yang, Jingwen Gu et al.
rStar-Coder: Scaling Competitive Code Reasoning with a Large-Scale Verified Dataset
Yifei Liu, Li Lyna Zhang, Yi Zhu et al.
Truthful Aggregation of LLMs with an Application to Online Advertising
Ermis Soumalias, Michael Curry, Sven Seuken
Training a Scientific Reasoning Model for Chemistry
Siddharth Narayanan, James Braza, Ryan-Rhys Griffiths et al.
Unlocking Dataset Distillation with Diffusion Models
Brian Moser, Federico Raue, Sebastian Palacio et al.
OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation
Jingjing Chang, Yixiao Fang, Peng Xing et al.
SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning
Jiaqi Chen, Bang Zhang, Ruotian Ma et al.
SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond
Junteng Liu, Yuanxiang Fan, Jiang Zhuo et al.
SpatialLM: Training Large Language Models for Structured Indoor Modeling
Yongsen Mao, Junhao Zhong, Chuan Fang et al.
Pass@K Policy Optimization: Solving Harder Reinforcement Learning Problems
Christian Walder, Deep Tejas Karkhanis
SpatialReasoner: Towards Explicit and Generalizable 3D Spatial Reasoning
Wufei Ma, Yu-Cheng Chou, Qihao Liu et al.
Rope to Nope and Back Again: A New Hybrid Attention Strategy
Bowen Yang, Bharat Venkitesh, Dwaraknath Gnaneshwar Talupuru et al.
Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge
Boyu Gou, Zanming Huang, Yuting Ning et al.
Parallel Scaling Law for Language Models
Mouxiang Chen, Binyuan Hui, Zeyu Cui et al.
CogVLA: Cognition-Aligned Vision-Language-Action Models via Instruction-Driven Routing & Sparsification
Wei Li, Renshan Zhang, Rui Shao et al.
CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up
Songhua Liu, Zhenxiong Tan, Xinchao Wang
Hogwild! Inference: Parallel LLM Generation via Concurrent Attention
Gleb Rodionov, Roman Garipov, Alina Shutova et al.
Generative Trajectory Stitching through Diffusion Composition
Yunhao Luo, Utkarsh Mishra, Yilun Du et al.
FUDOKI: Discrete Flow-based Unified Understanding and Generation via Kinetic-Optimal Velocities
Jin Wang, Yao Lai, Aoxue Li et al.