Most Cited NeurIPS "gradient redistribution" Papers
5,858 papers found • Page 1 of 30
MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models
Chaoyou Fu, Peixian Chen, Yunhang Shen et al.
DAPO: An Open-Source LLM Reinforcement Learning System at Scale
Qiying Yu, Zheng Zhang, Ruofei Zhu et al.
YOLOv12: Attention-Centric Real-Time Object Detectors
Yunjie Tian, Qixiang Ye, David Doermann
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
Yang Yue, Zhiqi Chen, Rui Lu et al.
Gymnasium: A Standard Interface for Reinforcement Learning Environments
Mark Towers, Ariel Kwiatkowski, John Balis et al.
Large Language Diffusion Models
Shen Nie, Fengqi Zhu, Zebin You et al.
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model
Jingcheng Hu, Yinmin Zhang, Qi Han et al.
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
Shenzhi Wang, Le Yu, Chang Gao et al.
The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity
Parshin Shojaee, Iman Mirzadeh, Keivan Alizadeh Vahid et al.
Video-R1: Reinforcing Video Reasoning in MLLMs
Kaituo Feng, Kaixiong Gong, Bohao Li et al.
A-Mem: Agentic Memory for LLM Agents
Wujiang Xu, Zujie Liang, Kai Mei et al.
Flow-GRPO: Training Flow Matching Models via Online RL
Jie Liu, Gongye Liu, Jiajun Liang et al.
Why Do Multi-Agent LLM Systems Fail?
Mert Cemri, Melissa Z Pan, Shuyi Yang et al.
WebThinker: Empowering Large Reasoning Models with Deep Research Capability
Xiaoxi Li, Jiajie Jin, Guanting Dong et al.
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
Yiping Wang, Qing Yang, Zhiyuan Zeng et al.
Mean Flows for One-step Generative Modeling
Zhengyang Geng, Mingyang Deng, Xingjian Bai et al.
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
Haozhe Wang, Chao Qu, Zuming Huang et al.
ToolRL: Reward is All Tool Learning Needs
Cheng Qian, Emre Can Acikgoz, Qi He et al.
Training Language Models to Reason Efficiently
Daman Arora, Andrea Zanette
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
Jonas Geiping, Sean McLeish, Neel Jain et al.
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
Yuxiang Wei, Olivier Duchenne, Jade Copet et al.
Absolute Zero: Reinforced Self-play Reasoning with Zero Data
Andrew Zhao, Yiran Wu, Yang Yue et al.
Titans: Learning to Memorize at Test Time
Ali Behrouz, Peilin Zhong, Vahab Mirrokni
Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion
Xun Huang, Zhengqi Li, Guande He et al.
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
Chaoyou Fu, Haojia Lin, Xiong Wang et al.
MMaDA: Multimodal Large Diffusion Language Models
Ling Yang, Ye Tian, Bowen Li et al.
Perception Encoder: The best visual embeddings are not at the output of the network
Daniel Bolya, Po-Yao Huang, Peize Sun et al.
Learning to Reason under Off-Policy Guidance
Jianhao Yan, Yafu Li, Zican Hu et al.
TTRL: Test-Time Reinforcement Learning
Yuxin Zuo, Kaiyan Zhang, Li Sheng et al.
Improving Video Generation with Human Feedback
Jie Liu, Gongye Liu, Jiajun Liang et al.
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines
Xeron Du, Yifan Yao, Kaijing Ma et al.
AREAL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning
Wei Fu, Jiaxuan Gao, Xujie Shen et al.
EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test
Yuhui Li, Fangyun Wei, Chao Zhang et al.
Rethinking Joint Maximum Mean Discrepancy for Visual Domain Adaptation
Wei Wang, Haifeng Xia, Chao Huang et al.
MoBA: Mixture of Block Attention for Long-Context LLMs
Enzhe Lu, Zhejun Jiang, Jingyuan Liu et al.
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search
Huanjin Yao, Jiaxing Huang, Wenhao Wu et al.
Show-o2: Improved Native Unified Multimodal Models
Jinheng Xie, Zhenheng Yang, Mike Zheng Shou
Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference
Yuan Feng, Junlin Lv, Yukun Cao et al.
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks
Frank (Fangzheng) Xu, Yufan Song, Boxuan Li et al.
Group-in-Group Policy Optimization for LLM Agent Training
Lang Feng, Zhenghai Xue, Tingcong Liu et al.
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
Mingjie Liu, Shizhe Diao, Ximing Lu et al.
Towards Thinking-Optimal Scaling of Test-Time Compute for LLM Reasoning
Wenkai Yang, Shuming Ma, Yankai Lin et al.
The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning
Shivam Agarwal, Zimin Zhang, Lifan Yuan et al.
Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer
Zechuan Zhang, Ji Xie, Yu Lu et al.
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
Dongzhi Jiang, Ziyu Guo, Renrui Zhang et al.
ImgEdit: A Unified Image Editing Dataset and Benchmark
Yang Ye, Xianyi He, Zongjian Li et al.
WebDancer: Towards Autonomous Information Seeking Agency
Jialong Wu, Baixuan Li, Runnan Fang et al.
Remarkable Robustness of LLMs: Stages of Inference?
Vedang Lad, Jin Hwa Lee, Wes Gurnee et al.
Audio Flamingo 3: Advancing Audio Intelligence with Fully Open Large Audio Language Models
Sreyan Ghosh, Arushi Goel, Jaehyeon Kim et al.
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
Di Liu, Meng Chen, Baotong Lu et al.
Remasking Discrete Diffusion Models with Inference-Time Scaling
Guanghan Wang, Yair Schiff, Subham Sahoo et al.
Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-based Decoding
Xiner Li, Yulai Zhao, Chenyu Wang et al.
SWE-smith: Scaling Data for Software Engineering Agents
John Yang, Kilian Lieret, Carlos Jimenez et al.
Scalable Best-of-N Selection for Large Language Models via Self-Certainty
Zhewei Kang, Xuandong Zhao, Dawn Song
The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning
Xinyu Zhu, Mengzhou Xia, Zhepei Wei et al.
d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning
Siyan Zhao, Devaansh Gupta, Qinqing Zheng et al.
OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
Mengkang Hu, Yuhang Zhou, Wendong Fan et al.
General-Reasoner: Advancing LLM Reasoning Across All Domains
Xueguang Ma, Qian Liu, Dongfu Jiang et al.
LMFusion: Adapting Pretrained Language Models for Multimodal Generation
Weijia Shi, Xiaochuang Han, Chunting Zhou et al.
FutureSightDrive: Thinking Visually with Spatio-Temporal CoT for Autonomous Driving
Shuang Zeng, Xinyuan Chang, Mengwei Xie et al.
APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay
Akshara Prabhakar, Zuxin Liu, Ming Zhu et al.
SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement
Xiyao Wang, Zhengyuan Yang, Chao Feng et al.
A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders
David Chanin, James Wilken-Smith, Tomáš Dulka et al.
Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning
Chongyu Fan, Jiancheng Liu, Licong Lin et al.
UniTok: a Unified Tokenizer for Visual Generation and Understanding
Chuofan Ma, Yi Jiang, Junfeng Wu et al.
dKV-Cache: The Cache for Diffusion Language Models
Xinyin Ma, Runpeng Yu, Gongfan Fang et al.
Right Question is Already Half the Answer: Fully Unsupervised LLM Reasoning Incentivization
Qingyang Zhang, Haitao Wu, Changqing Zhang et al.
Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free
Zihan Qiu, Zekun Wang, Bo Zheng et al.
Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence
Diankun Wu, Fangfu Liu, Yi-Hsin Hung et al.
What is Your Data Worth to GPT? LLM-Scale Data Valuation with Influence Functions
Sang Choe, Hwijeen Ahn, Juhan Bae et al.
AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning
Zewei Zhou, Tianhui Cai, Seth Zhao et al.
UMA: A Family of Universal Models for Atoms
Brandon Wood, Misko Dzamba, Xiang Fu et al.
Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension
Yongdong Luo, Xiawu Zheng, Guilin Li et al.
Offline Actor-Critic for Average Reward MDPs
William Powell, Jeongyeol Kwon, Qiaomin Xie et al.
OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning
Ling Fu, Zhebin Kuang, Jiajun Song et al.
MoGe-2: Accurate Monocular Geometry with Metric Scale and Sharp Details
Ruicheng Wang, Sicheng Xu, Yue Dong et al.
LoRA vs Full Fine-tuning: An Illusion of Equivalence
Reece Shuttleworth, Jacob Andreas, Antonio Torralba et al.
Thinkless: LLM Learns When to Think
Gongfan Fang, Xinyin Ma, Xinchao Wang
Q-Insight: Understanding Image Quality via Visual Reinforcement Learning
Weiqi Li, Xuanyu Zhang, Shijie Zhao et al.
CSGO: Content-Style Composition in Text-to-Image Generation
Peng Xing, Haofan Wang, Yanpeng Sun et al.
Perception-R1: Pioneering Perception Policy with Reinforcement Learning
En Yu, Kangheng Lin, Liang Zhao et al.
DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge
Wenyao Zhang, Hongsi Liu, Zekun Qi et al.
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving
Daoguang Zan, Zhirong Huang, Wei Liu et al.
GoT: Unleashing Reasoning Capability of MLLM for Visual Generation and Editing
Rongyao Fang, Chengqi Duan, Kun Wang et al.
Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis
Tianbao Xie, Jiaqi Deng, Xiaochuan Li et al.
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?
Zhaorun Chen, Zichen Wen, Yichao Du et al.
GRIT: Teaching MLLMs to Think with Images
Yue Fan, Xuehai He, Diji Yang et al.
ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning
Chi-Pin Huang, Yueh-Hua Wu, Min-Hung Chen et al.
Frame Context Packing and Drift Prevention in Next-Frame-Prediction Video Diffusion Models
Lvmin Zhang, Shengqu Cai, Muyang Li et al.
RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics
Enshen Zhou, Jingkun An, Cheng Chi et al.
MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix
Ziyang Ma, Yinghao Ma, Yanqiao Zhu et al.
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning
Mingyang Chen, Linzhuang Sun, Tianpeng Li et al.
NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation
Xiangyan Liu, Jinjie Ni, Zijian Wu et al.
CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models
Zhihang Lin, Mingbao Lin, Yuan Xie et al.
Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing
Junfei Wu, Jian Guan, Kaituo Feng et al.
WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks
Ivan Evtimov, Arman Zharmagambetov, Aaron Grattafiori et al.
Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking
Heli Ben-Hamu, Itai Gat, Daniel Severo et al.
Knowledge Insulating Vision-Language-Action Models: Train Fast, Run Fast, Generalize Better
Danny Driess, Jost Springenberg, Brian Ichter et al.
OmniBench: Towards The Future of Universal Omni-Language Models
Yizhi Li, Ge Zhang, Yinghao Ma et al.
WorldMem: Long-term Consistent World Simulation with Memory
Zeqi Xiao, Yushi Lan, Yifan Zhou et al.
What Makes a Reward Model a Good Teacher? An Optimization Perspective
Noam Razin, Zixuan Wang, Hubert Strauss et al.
S-GRPO: Early Exit via Reinforcement Learning in Reasoning Models
Muzhi Dai, Chenxu Yang, Qingyi Si
GAM-Agent: Game-Theoretic and Uncertainty-Aware Collaboration for Complex Visual Reasoning
Jusheng Zhang, Yijia Fan, Wenjun Lin et al.
AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning
Yang Chen, Zhuolin Yang, Zihan Liu et al.
Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding
Ye Wang, Ziheng Wang, Boshen Xu et al.
Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning
Yibin Wang, Zhimin Li, Yuhang Zang et al.
LLM Generated Persona is a Promise with a Catch
Leon Li, Haozhe Chen, Hongseok Namkoong et al.
NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions
Weizhe Yuan, Jane Yu, Song Jiang et al.
Atom of Thoughts for Markov LLM Test-Time Scaling
Fengwei Teng, Quan Shi, Zhaoyang Yu et al.
Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space
Zhen Zhang, Xuehai He, Weixiang Yan et al.
VGGT-SLAM: Dense RGB SLAM Optimized on the SL(4) Manifold
Dominic Maggio, Hyungtae Lim, Luca Carlone
The Matrix: Infinite-Horizon World Generation with Real-Time Moving Control
Ruili Feng, Han Zhang, Zhilei Shu et al.
Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing
Xiangyu Zhao, Peiyuan Zhang, Kexian Tang et al.
SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning
Peixian Ma, Xialie Zhuang, Chengjin Xu et al.
Reasoning Gym: Reasoning Environments for Reinforcement Learning with Verifiable Rewards
Zafir Stojanovski, Oliver Stanley, Joe Sharratt et al.
PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding
Jang Hyun Cho, Andrea Madotto, Effrosyni Mavroudi et al.
Real-Time Execution of Action Chunking Flow Policies
Kevin Black, Manuel Galliker, Sergey Levine
What Can RL Bring to VLA Generalization? An Empirical Study
Jijia Liu, Feng Gao, Bingwen Wei et al.
WritingBench: A Comprehensive Benchmark for Generative Writing
Yuning Wu, Jiahao Mei, Ming Yan et al.
RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning
Hao Gao, Shaoyu Chen, Bo Jiang et al.
Accelerating Diffusion LLMs via Adaptive Parallel Decoding
Daniel Israel, Guy Van den Broeck, Aditya Grover
Sampling-Efficient Test-Time Scaling: Self-Estimating the Best-of-N Sampling in Early Decoding
Yiming Wang, Pei Zhang, Siyuan Huang et al.
Video World Models with Long-term Spatial Memory
Tong Wu, Shuai Yang, Ryan Po et al.
Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation
Shanchuan Lin, Ceyuan Yang, Hao He et al.
Scaling RL to Long Videos
Yukang Chen, Wei Huang, Baifeng Shi et al.
Act Only When It Pays: Efficient Reinforcement Learning for LLM Reasoning via Selective Rollouts
Haizhong Zheng, Yang Zhou, Brian Bartoldson et al.
ShorterBetter: Guiding Reasoning Models to Find Optimal Inference Length for Efficient Reasoning
Jingyang Yi, Jiazheng Wang, Sida Li
GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents
Qianhui Wu, Kanzhi Cheng, Rui Yang et al.
TabArena: A Living Benchmark for Machine Learning on Tabular Data
Nick Erickson, Lennart Purucker, Andrej Tschalzev et al.
Detecting Data Deviations in Electronic Health Records
Kaiping Zheng, Horng-Ruey Chua, Beng Chin Ooi
Faster Video Diffusion with Trainable Sparse Attention
Peiyuan Zhang, Yongqi Chen, Haofeng Huang et al.
Sparse Meets Dense: Unified Generative Recommendations with Cascaded Sparse-Dense Representations
Yuhao Yang, Zhi Ji, Zhaopeng Li et al.
KungfuBot: Physics-Based Humanoid Whole-Body Control for Learning Highly-Dynamic Skills
Weiji Xie, Jinrui Han, Jiakun Zheng et al.
A Statistical Framework of Watermarks for Large Language Models: Pivot, Detection Efficiency and Optimal Rules
Xiang Li, Feng Ruan, Huiyuan Wang et al.
Agentic RL Scaling Law: Spontaneous Code Execution for Mathematical Problem Solving
Xinji Mai, Haotian Xu, Xing W et al.
Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective
Jorge (Zhoujun) Cheng, Shibo Hao, Tianyang Liu et al.
ReMA: Learning to Meta-Think for LLMs with Multi-agent Reinforcement Learning
Ziyu Wan, Yunxiang Li, Xiaoyu Wen et al.
From Flatland to Space: Teaching Vision-Language Models to Perceive and Reason in 3D
Jiahui Zhang, Yurui Chen, Yueming Xu et al.
Think Only When You Need with Large Hybrid-Reasoning Models
Lingjie Jiang, Xun Wu, Shaohan Huang et al.
Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation
Shuo Yang, Haocheng Xi, Yilong Zhao et al.
EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation
Siyuan Huang, Liliang Chen, Pengfei Zhou et al.
Learning When to Think: Shaping Adaptive Reasoning in R1-Style Models via Multi-Stage RL
Songjun Tu, Jiahao Lin, Qichao Zhang et al.
HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages
Zhilin Wang, Jiaqi Zeng, Olivier Delalleau et al.
UniCTokens: Boosting Personalized Understanding and Generation via Unified Concept Tokens
Ruichuan An, Sihan Yang, Renrui Zhang et al.
Align Your Flow: Scaling Continuous-Time Flow Map Distillation
Amirmojtaba Sabour, Sanja Fidler, Karsten Kreis
Generalizing Verifiable Instruction Following
Valentina Pyatkin, Saumya Malik, Victoria Graf et al.
Gradient Alignment in Physics-informed Neural Networks: A Second-Order Optimization Perspective
Sifan Wang, Ananyae Bhartari, Bowen Li et al.
Informed Correctors for Discrete Diffusion Models
Yixiu Zhao, Jiaxin Shi, Feng Chen et al.
Direct3D-S2: Gigascale 3D Generation Made Easy with Spatial Sparse Attention
Shuang Wu, Youtian Lin, Feihu Zhang et al.
OpenCUA: Open Foundations for Computer-Use Agents
Xinyuan Wang, Bowen Wang, Dunjie Lu et al.
Sparse MeZO: Less Parameters for Better Performance in Zeroth-Order LLM Fine-Tuning
Yong Liu, Zirui Zhu, Chaoyu Gong et al.
WorldModelBench: Judging Video Generation Models As World Models
Dacheng Li, Yunhao Fang, Yukang Chen et al.
SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning
Rui Pan, Yinwei Dai, Zhihao Zhang et al.
Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models
Zemin Huang, Zhiyang Chen, Zijun Wang et al.
SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation
Zekun Qi, Wenyao Zhang, Yufei Ding et al.
TiRex: Zero-Shot Forecasting Across Long and Short Horizons with Enhanced In-Context Learning
Andreas Auer, Patrick Podest, Daniel Klotz et al.
Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs
Mantas Mazeika, Xuwang Yin, Rishub Tamirisa et al.
Process vs. Outcome Reward: Which is Better for Agentic RAG Reinforcement Learning
Wenlin Zhang, Xiangyang Li, Kuicai Dong et al.
WISA: World simulator assistant for physics-aware text-to-video generation
Jing Wang, Ao Ma, Ke Cao et al.
Multi-Agent Collaboration via Evolving Orchestration
Yufan Dang, Chen Qian, Xueheng Luo et al.
Harnessing the Universal Geometry of Embeddings
Rishi Jha, Collin Zhang, Vitaly Shmatikov et al.
VisualQuality-R1: Reasoning-Induced Image Quality Assessment via Reinforcement Learning to Rank
Tianhe Wu, Jian Zou, Jie Liang et al.
Sparc3D: Sparse Representation and Construction for High-Resolution 3D Shapes Modeling
Zhihao Li, Yufei Wang, Heliang Zheng et al.
PerturBench: Benchmarking Machine Learning Models for Cellular Perturbation Analysis
Yan Wu, Esther Wershof, Sebastian Schmon et al.
More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models
Zhongxing Xu, Chengzhi Liu, Qingyue Wei et al.
On Vanishing Gradients, Over-Smoothing, and Over-Squashing in GNNs: Bridging Recurrent and Graph Learning
Alvaro Arroyo, Alessio Gravina, Benjamin Gutteridge et al.
Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation
Zhe Kong, Feng Gao, Yong Zhang et al.
EfficientVLA: Training-Free Acceleration and Compression for Vision-Language-Action Models
Yantai Yang, Yuhao Wang, Zichen Wen et al.
Tensor Product Attention Is All You Need
Yifan Zhang, Yifeng Liu, Huizhuo Yuan et al.
VideoRFT: Incentivizing Video Reasoning Capability in MLLMs via Reinforced Fine-Tuning
Qi Wang, Yanrui Yu, Ye Yuan et al.
Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation
Siwei Wen, Junyan Ye, Peilin Feng et al.
Fira: Can We Achieve Full-rank Training of LLMs Under Low-rank Constraint?
Xi Chen, Kaituo Feng, Changsheng Li et al.
Projecting Assumptions: The Duality Between Sparse Autoencoders and Concept Geometry
Sai Sumedh R. Hindupur, Ekdeep S Lubana, Thomas Fel et al.
The Leaderboard Illusion
Shivalika Singh, Yiyang Nan, Alex Wang et al.
GUI-G1: Understanding R1-Zero-Like Training for Visual Grounding in GUI Agents
Yuqi Zhou, Sunhao Dai, Shuai Wang et al.
Representation Entanglement for Generation: Training Diffusion Transformers Is Much Easier Than You Think
Ge Wu, Shen Zhang, Ruijing Shi et al.
SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents
Ibragim Badertdinov, Alexander Golubev, Maksim Nekrashevich et al.
A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers
Will Merrill, Ashish Sabharwal
MAT-Agent: Adaptive Multi-Agent Training Optimization
Jusheng Zhang, Kaitong Cai, Yijia Fan et al.
Reasoning Models Better Express Their Confidence
Dongkeun Yoon, Seungone Kim, Sohee Yang et al.
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation
Sangmin Bae, Yujin Kim, Reza Bayat et al.
Don't be lazy: CompleteP enables compute-efficient deep transformers
Nolan Dey, Bin Zhang, Lorenzo Noci et al.
SRPO: Enhancing Multimodal LLM Reasoning via Reflection-Aware Reinforcement Learning
Zhongwei Wan, Zhihao Dou, Che Liu et al.
OMEGA: Can LLMs Reason Outside the Box in Math? Evaluating Exploratory, Compositional, and Transformative Generalization
Yiyou Sun, Shawn Hu, Georgia Zhou et al.
Checklists Are Better Than Reward Models For Aligning Language Models
Vijay Viswanathan, Yanchao Sun, Xiang Kong et al.
Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models
Guo Chen, Zhiqi Li, Shihao Wang et al.
Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought
Hanlin Zhu, Shibo Hao, Zhiting Hu et al.
Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos
Weifeng Lin, Xinyu Wei, Ruichuan An et al.
Fast Solvers for Discrete Diffusion Models: Theory and Applications of High-Order Algorithms
Yinuo Ren, Haoxuan Chen, Yuchen Zhu et al.
Think Silently, Think Fast: Dynamic Latent Compression of LLM Reasoning Chains
Wenhui Tan, Jiaze Li, Jianzhong Ju et al.
Feed-Forward Bullet-Time Reconstruction of Dynamic Scenes from Monocular Videos
Hanxue Liang, Jiawei Ren, Ashkan Mirzaei et al.
AbstentionBench: Reasoning LLMs Fail on Unanswerable Questions
Polina Kirichenko, Mark Ibrahim, Kamalika Chaudhuri et al.
How to build a consistency model: Learning flow maps via self-distillation
Nicholas Boffi, Michael Albergo, Eric Vanden-Eijnden
SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training
Jintao Zhang, Jia Wei, Haoxu Wang et al.
G-Memory: Tracing Hierarchical Memory for Multi-Agent Systems
Guibin Zhang, Muxin Fu, Kun Wang et al.
Best-of-N Jailbreaking
John Hughes, Sara Price, Aengus Lynch et al.
VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning
Qiuchen Wang, Ruixue Ding, Yu Zeng et al.
ASGO: Adaptive Structured Gradient Optimization
Kang An, Yuxing Liu, Rui Pan et al.
Policy learning "without" overlap: Pessimism and generalized empirical Bernstein's inequality
Ying Jin, Zhimei Ren, Zhuoran Yang et al.
PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers
Yuchen Lin, Chenguo Lin, Panwang Pan et al.