Most Cited NEURIPS "noisy partial label learning" Papers
5,858 papers found • Page 2 of 30
Conference
Inference-Time Text-to-Video Alignment with Diffusion Latent Beam Search
Yuta Oshima, Masahiro Suzuki, Yutaka Matsuo et al.
V2X-Radar: A Multi-modal Dataset with 4D Radar for Cooperative Perception
Lei Yang, Xinyu Zhang, Jun Li et al.
Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing
Jaihoon Kim, Taehoon Yoon, Jisung Hwang et al.
On the Closed-Form of Flow Matching: Generalization Does Not Arise from Target Stochasticity
Quentin Bertrand, Anne Gagneux, Mathurin Massias et al.
HyperGraphRAG: Retrieval-Augmented Generation via Hypergraph-Structured Knowledge Representation
Haoran Luo, Haihong E, Guanting Chen et al.
FlexVAR: Flexible Visual Autoregressive Modeling without Residual Prediction
Siyu Jiao, Gengwei Zhang, Yinlong Qian et al.
Efficiently Scaling LLM Reasoning Programs with Certaindex
Yichao Fu, Junda Chen, Siqi Zhu et al.
Think Silently, Think Fast: Dynamic Latent Compression of LLM Reasoning Chains
Wenhui Tan, Jiaze Li, Jianzhong Ju et al.
Does Thinking More Always Help? Mirage of Test-Time Scaling in Reasoning Models
Soumya Suvra Ghosal, Souradip Chakraborty, Avinash Reddy et al.
Mechanism Design for LLM Fine-tuning with Multiple Reward Models
Haoran Sun, Yurong Chen, Siwei Wang et al.
SeRL: Self-play Reinforcement Learning for Large Language Models with Limited Data
Wenkai Fang, Shunyu Liu, Yang Zhou et al.
Self-Adapting Language Models
Adam Zweiger, Jyo Pari, Han Guo et al.
TwinMarket: A Scalable Behavioral and Social Simulation for Financial Markets
Yuzhe YANG, Yifei Zhang, Minghao Wu et al.
REAL: Benchmarking Autonomous Agents on Deterministic Simulations of Real Websites
Div Garg, Diego Caples, Andis Draguns et al.
Nemotron-CLIMB: Clustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training
Shizhe Diao, Yu Yang, Yonggan Fu et al.
Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs
Qizhe Zhang, Mengzhen Liu, Lichen Li et al.
VLM-R³: Region Recognition, Reasoning, and Refinement for Enhanced Multimodal Chain-of-Thought
Chaoya Jiang, Yongrui Heng, Wei Ye et al.
Diversity-Aware Policy Optimization for Large Language Model Reasoning
Jian Yao, Ran Cheng, Xingyu Wu et al.
Merging on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging
Anke Tang, Enneng Yang, Li Shen et al.
SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning
Yang Liu, Ming Ma, Xiaomin Yu et al.
UltraVideo: High-Quality UHD Video Dataset with Comprehensive Captions
Xue zhucun, Jiangning Zhang, Teng Hu et al.
HoliTom: Holistic Token Merging for Fast Video Large Language Models
Kele Shao, Keda TAO, Can Qin et al.
OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents
Thomas Kuntz, Agatha Duzan, Hao Zhao et al.
SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning
Wanjia Zhao, Mert Yuksekgonul, Shirley Wu et al.
Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search
Yuichi Inoue, Kou Misaki, Yuki Imajuku et al.
Bridging Distributional and Risk-sensitive Reinforcement Learning with Provable Regret Bounds
Hao Liang, Zhiquan Luo
MoonCast: High-Quality Zero-Shot Podcast Generation
Zeqian Ju, Dongchao Yang, Shen Kai et al.
MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO
Yicheng Xiao, Lin Song, Yukang Chen et al.
Logic-in-Frames: Dynamic Keyframe Search via Visual Semantic-Logical Verification for Long Video Understanding
Weiyu Guo, Ziyang Chen, Shaoguang WANG et al.
Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models
Uladzislau Sobal, Wancong Zhang, Kyunghyun Cho et al.
Raw2Drive: Reinforcement Learning with Aligned World Models for End-to-End Autonomous Driving (in CARLA v2)
Zhenjie Yang, Xiaosong Jia, Qifeng Li et al.
Sloth: scaling laws for LLM skills to predict multi-benchmark performance across families
Felipe Maia Polo, Seamus Somerstep, Leshem Choshen et al.
Beyond Chemical QA: Evaluating LLM's Chemical Reasoning with Modular Chemical Operations
Li Hao, He CAO, Bin Feng et al.
GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning
Yue Liu, Shengfang Zhai, Mingzhe Du et al.
GEM: Empowering MLLM for Grounded ECG Understanding with Time Series and Images
Xiang Lan, Feng Wu, Kai He et al.
Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods
Oussama Zekri, Nicolas Boulle
Learning 3D Persistent Embodied World Models
Siyuan Zhou, Yilun Du, Yuncong Yang et al.
GraphMaster: Automated Graph Synthesis via LLM Agents in Data-Limited Environments
Enjun Du, Xunkai Li, Tian Jin et al.
Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding
Xiaoyi Zhang, Zhaoyang Jia, Zongyu Guo et al.
VITA-Audio: Fast Interleaved Audio-Text Token Generation for Efficient Large Speech-Language Model
Zuwei Long, Yunhang Shen, Chaoyou Fu et al.
MLLMs Need 3D-Aware Representation Supervision for Scene Understanding
Xiaohu Huang, Jingjing Wu, Qunyi Xie et al.
Mixture of Noise for Pre-Trained Model-Based Class-Incremental Learning
Kai Jiang, Zhengyan Shi, Dell Zhang et al.
xLSTM-Mixer: Multivariate Time Series Forecasting by Mixing via Scalar Memories
Maurice Kraus, Felix Divo, Devendra Singh Dhami et al.
Stationary Kernels and Gaussian Processes on Lie Groups and their Homogeneous Spaces II: non-compact symmetric spaces
Iskander Azangulov, Andrei Smolensky, Alexander Terenin et al.
ThinkSound: Chain-of-Thought Reasoning in Multimodal LLMs for Audio Generation and Editing
Huadai Liu, Kaicheng Luo, Jialei Wang et al.
VeriThinker: Learning to Verify Makes Reasoning Model Efficient
Zigeng Chen, Xinyin Ma, Gongfan Fang et al.
Exploring the Limits of Vision-Language-Action Manipulation in Cross-task Generalization
Jiaming Zhou, Ke Ye, Jiayi Liu et al.
BioCLIP 2: Emergent Properties from Scaling Hierarchical Contrastive Learning
Jianyang Gu, Sam Stevens, Elizabeth Campolongo et al.
Roboflow100-VL: A Multi-Domain Object Detection Benchmark for Vision-Language Models
Matvei Popov, Peter Robicheaux, Anish Madan et al.
SensorLM: Learning the Language of Wearable Sensors
Yuwei Zhang, Kumar Ayush, Siyuan Qiao et al.
Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation
Xinyu Yang, Yuwei An, Hongyi Liu et al.
Efficient Part-level 3D Object Generation via Dual Volume Packing
Jiaxiang Tang, Ruijie Lu, Max Li et al.
Memory Injection Attacks on LLM Agents via Query-Only Interaction
Shen Dong, Shaochen Xu, Pengfei He et al.
FastVID: Dynamic Density Pruning for Fast Video Large Language Models
Leqi Shen, Guoqiang Gong, Tao He et al.
WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch
Zimu Lu, Yunqiao Yang, Houxing Ren et al.
Reinforcement Learning Finetunes Small Subnetworks in Large Language Models
Sagnik Mukherjee, Lifan Yuan, Dilek Hakkani-Tur et al.
Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models
Yiran Guo, Lijie Xu, Jie Liu et al.
SAFE: Multitask Failure Detection for Vision-Language-Action Models
Qiao Gu, Yuanliang Ju, Shengxiang Sun et al.
The VLLM Safety Paradox: Dual Ease in Jailbreak Attack and Defense
Yangyang Guo, Fangkai Jiao, Liqiang Nie et al.
AutoToM: Scaling Model-based Mental Inference via Automated Agent Modeling
Zhining Zhang, Chuanyang Jin, Mung Yao Jia et al.
Horizon Reduction Makes RL Scalable
Seohong Park, Kevin Frans, Deepinder Mann et al.
Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards
Xiaoyuan Liu, Tian Liang, Zhiwei He et al.
Training-Free Efficient Video Generation via Dynamic Token Carving
Yuechen Zhang, Jinbo Xing, bin xia et al.
Boosting Generative Image Modeling via Joint Image-Feature Synthesis
Theodoros Kouzelis, Efstathios Karypidis, Ioannis Kakogeorgiou et al.
AutoPartGen: Autoregressive 3D Part Generation and Discovery
Minghao Chen, Jianyuan Wang, Roman Shapovalov et al.
LeVo: High-Quality Song Generation with Multi-Preference Alignment
Shun Lei, Yaoxun XU, ZhiweiLin et al.
OpenVLThinker: Complex Vision-Language Reasoning via Iterative SFT-RL Cycles
Yihe Deng, Hritik Bansal, Fan Yin et al.
DINO-Foresight: Looking into the Future with DINO
Efstathios Karypidis, Ioannis Kakogeorgiou, Spyridon Gidaris et al.
RoboScape: Physics-informed Embodied World Model
Yu Shang, Xin Zhang, Yinzhou Tang et al.
AutoRedTeamer: Autonomous Red Teaming with Lifelong Attack Integration
Andy Zhou, Kevin Wu, Francesco Pinto et al.
AGENTIF: Benchmarking Large Language Models Instruction Following Ability in Agentic Scenarios
Yunjia Qi, Hao Peng, Xiaozhi Wang et al.
Dual Data Alignment Makes AI-Generated Image Detector Easier Generalizable
Ruoxin Chen, Junwei Xi, Zhiyuan Yan et al.
RelationAdapter: Learning and Transferring Visual Relation with Diffusion Transformers
Yan Gong, Yiren Song, Yicheng Li et al.
MobileUse: A Hierarchical Reflection-Driven GUI Agent for Autonomous Mobile Operation
Ning Li, Xiangmou Qu, Jiamu Zhou et al.
Power Lines: Scaling laws for weight decay and batch size in LLM pre-training
Shane Bergsma, Nolan Dey, Gurpreet Gosal et al.
Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning
Jaehun Jung, Seungju Han, Ximing Lu et al.
AI Research Agents for Machine Learning: Search, Exploration, and Generalization in MLE-bench
Edan Toledo, Karen Hambardzumyan, Martin Josifoski et al.
Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning
Yuhao Zhou, Yiheng Wang, Xuming He et al.
Is Artificial Intelligence Generated Image Detection a Solved Problem?
Ziqiang Li, Jiazhen Yan, Ziwen He et al.
Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search
Yuxian Gu, Qinghao Hu, Haocheng Xi et al.
AgentAuditor: Human-level Safety and Security Evaluation for LLM Agents
Hanjun Luo, Shenyu Dai, Chiming Ni et al.
OpenUnlearning: Accelerating LLM Unlearning via Unified Benchmarking of Methods and Metrics
Vineeth Dorna, Anmol Mekala, Wenlong Zhao et al.
Universal Cross-Tokenizer Distillation via Approximate Likelihood Matching
Benjamin Minixhofer, Ivan Vulić, Edoardo Maria Ponti
Code Graph Model (CGM): A Graph-Integrated Large Language Model for Repository-Level Software Engineering Tasks
Hongyuan Tao, Ying Zhang, Zhenhao Tang et al.
Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning
Yana Wei, Liang Zhao, Jianjian Sun et al.
NFIG: Multi-Scale Autoregressive Image Generation via Frequency Ordering
Zhihao Huang, Xi Qiu, Yukuo Ma et al.
SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning
Xiao Liang, Zhong-Zhi Li, Yeyun Gong et al.
ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference
Xiang Liu, Zhenheng Tang, Peijie Dong et al.
ThinkBench: Dynamic Out-of-Distribution Evaluation for Robust LLM Reasoning
Shulin Huang, Linyi Yang, Yan Song et al.
Inference-Time Hyper-Scaling with KV Cache Compression
Adrian Łańcucki, Konrad Staniszewski, Piotr Nawrot et al.
Sharp Analysis for KL-Regularized Contextual Bandits and RLHF
Heyang Zhao, Chenlu Ye, Quanquan Gu et al.
Physics-Constrained Flow Matching: Sampling Generative Models with Hard Constraints
Utkarsh Utkarsh, Pengfei Cai, Alan Edelman et al.
REArtGS: Reconstructing and Generating Articulated Objects via 3D Gaussian Splatting with Geometric and Motion Constraints
Di Wu, Liu Liu, Zhou Linli et al.
Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections
Bo Wang, Qinyuan Cheng, Runyu Peng et al.
Pseudo-Labeling for Kernel Ridge Regression under Covariate Shift
Kaizheng Wang
UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface
Hao Tang, Chen-Wei Xie, Haiyang Wang et al.
OmniSync: Towards Universal Lip Synchronization via Diffusion Transformers
Ziqiao Peng, Jiwen Liu, Haoxian Zhang et al.
QiMeng-CodeV-R1: Reasoning-Enhanced Verilog Generation
Yaoyu Zhu, Di Huang, Hanqi Lyu et al.
Joint Velocity-Growth Flow Matching for Single-Cell Dynamics Modeling
Dongyi Wang, Yuanwei Jiang, Zhenyi Zhang et al.
4DGT: Learning a 4D Gaussian Transformer Using Real-World Monocular Videos
Zhen Xu, Zhengqin Li, Zhao Dong et al.
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis
Jiatao Gu, Tianrong Chen, David Berthelot et al.
Debate or Vote: Which Yields Better Decisions in Multi-Agent Large Language Models?
Hyeong Kyu Choi, Jerry Zhu, Sharon Li
Detecting High-Stakes Interactions with Activation Probes
Alex McKenzie, Urja Pawar, Phil Blandfort et al.
Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding
Yixiong Fang, Ziran Yang, Zhaorun Chen et al.
Asymmetric REINFORCE for off-Policy Reinforcement Learning: Balancing positive and negative rewards
Charles Arnal, Gaëtan Narozniak, Vivien Cabannes et al.
Force Prompting: Video Generation Models Can Learn And Generalize Physics-based Control Signals
Nate Gillman, Charles Herrmann, Michael Freeman et al.
Conformal Prediction for Causal Effects of Continuous Treatments
Maresa Schröder, Dennis Frauen, Jonas Schweisthal et al.
Language Models Are Capable of Metacognitive Monitoring and Control of Their Internal Activations
Ji-An Li, Huadong Xiong, Robert Wilson et al.
Let LRMs Break Free from Overthinking via Self-Braking Tuning
Haoran Zhao, Yuchen Yan, Yongliang Shen et al.
MindJourney: Test-Time Scaling with World Models for Spatial Reasoning
Yuncong Yang, Jiageng Liu, Zheyuan Zhang et al.
PipeFusion: Patch-level Pipeline Parallelism for Diffusion Transformers Inference
Jiarui Fang, Jinzhe Pan, Aoyu Li et al.
VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations on Synthetic Video Understanding
Zongxia Li, Xiyang Wu, Guangyao Shi et al.
Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization
Tao Zhang, Cheng Da, Kun Ding et al.
UFM: A Simple Path towards Unified Dense Correspondence with Flow
Yuchen Zhang, Nikhil Keetha, Chenwei Lyu et al.
Classical Planning with LLM-Generated Heuristics: Challenging the State of the Art with Python Code
Augusto B. Corrêa, André G. Pereira, Jendrik Seipp
Adversarial Locomotion and Motion Imitation for Humanoid Policy Learning
Jiyuan Shi, Xinzhe Liu, Dewei Wang et al.
The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise
Shuze Daniel Liu, Shuhang Chen, Shangtong Zhang
ReinFlow: Fine-tuning Flow Matching Policy with Online Reinforcement Learning
Tonghe Zhang, Chao Yu, Sichang Su et al.
MemSim: A Bayesian Simulator for Evaluating Memory of LLM-based Personal Assistants
Zeyu Zhang, Quanyu Dai, Luyu Chen et al.
ATLAS: Autoformalizing Theorems through Lifting, Augmentation, and Synthesis of Data
Xiaoyang Liu, Kangjie Bao, Jiashuo Zhang et al.
Emergence and scaling laws in SGD learning of shallow neural networks
Yunwei Ren, Eshaan Nichani, Denny Wu et al.
Mix Data or Merge Models? Balancing the Helpfulness, Honesty, and Harmlessness of Large Language Model via Model Merging
Jinluan Yang, Dingnan Jin, Anke Tang et al.
Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation is Wasteful
Martin Marek, Sanae Lotfi, Aditya Somasundaram et al.
VideoChat-R1.5: Visual Test-Time Scaling to Reinforce Multimodal Reasoning by Iterative Perception
Ziang Yan, Yinan He, Xinhao Li et al.
Topology of Reasoning: Understanding Large Reasoning Models through Reasoning Graph Properties
Gouki Minegishi, Hiroki Furuta, Takeshi Kojima et al.
EffiBench-X: A Multi-Language Benchmark for Measuring Efficiency of LLM-Generated Code
Yuhao Qing, Boyu Zhu, Mingzhe Du et al.
MedAgentBoard: Benchmarking Multi-Agent Collaboration with Conventional Methods for Diverse Medical Tasks
Yinghao Zhu, Ziyi He, Haoran Hu et al.
Establishing Best Practices in Building Rigorous Agentic Benchmarks
Yuxuan Zhu, Tengjun Jin, Yada Pruksachatkun et al.
In Search of Adam’s Secret Sauce
Antonio Orvieto, Robert Gower
KVzip: Query-Agnostic KV Cache Compression with Context Reconstruction
Jang-Hyun Kim, Jinuk Kim, Sangwoo Kwon et al.
Imagine360: Immersive 360 Video Generation from Perspective Anchor
Jing Tan, Shuai Yang, Tong Wu et al.
TabDPT: Scaling Tabular Foundation Models on Real Data
Junwei Ma, Valentin Thomas, Rasa Hosseinzadeh et al.
PaTH Attention: Position Encoding via Accumulating Householder Transformations
Songlin Yang, Yikang Shen, Kaiyue Wen et al.
Point-RFT: Improving Multimodal Reasoning with Visually Grounded Reinforcement Finetuning
Minheng Ni, Zhengyuan Yang, Linjie Li et al.
Backdoor Cleaning without External Guidance in MLLM Fine-tuning
Xuankun Rong, Wenke Huang, Jian Liang et al.
Panacea: Mitigating Harmful Fine-tuning for Large Language Models via Post-fine-tuning Perturbation
Yibo Wang, Tiansheng Huang, Li Shen et al.
CoT Red-Handed: Stress Testing Chain-of-Thought Monitoring
Benjamin Arnav, Pablo Bernabeu-Perez, Nathan Helm-Burger et al.
Teaching Language Models to Evolve with Users: Dynamic Profile Modeling for Personalized Alignment
Weixiang Zhao, Xingyu Sui, Yulin Hu et al.
Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration
Hao Zhong, Muzhi Zhu, Zongze Du et al.
Searching Latent Program Spaces
Matthew Macfarlane, Clem Bonnet
Accelerating RL for LLM Reasoning with Optimal Advantage Regression
Kianté Brantley, Mingyu Chen, Zhaolin Gao et al.
Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning
Kongcheng Zhang, QI YAO, Shunyu Liu et al.
Enhancing Personalized Multi-Turn Dialogue with Curiosity Reward
Yanming Wan, Jiaxing Wu, Marwa Abdulhai et al.
Nested Learning: The Illusion of Deep Learning Architectures
Ali Behrouz, Meisam Razaviyayn, Peilin Zhong et al.
NAVIX: Scaling MiniGrid Environments with JAX
Eduardo Pignatelli, Jarek Liesen, Robert Lange et al.
Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment
Xiaojun Jia, Sensen Gao, Simeng Qin et al.
GASP: Efficient Black-Box Generation of Adversarial Suffixes for Jailbreaking LLMs
Advik Basani, Xiao Zhang
Ambient Diffusion Omni: Training Good Models with Bad Data
Giannis Daras, Adrian Rodriguez-Munoz, Adam Klivans et al.
Puppeteer: Rig and Animate Your 3D Models
Chaoyue Song, Xiu Li, Fan Yang et al.
Vision Transformers Don't Need Trained Registers
Nicholas Jiang, Amil Dravid, Alexei Efros et al.
InfiniPot-V: Memory-Constrained KV Cache Compression for Streaming Video Understanding
Minsoo Kim, Kyuhong Shim, Jungwook Choi et al.
BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model
Adibvafa Fallahpour, Andrew Magnuson, Purav Gupta et al.
Heterogeneous Swarms: Jointly Optimizing Model Roles and Weights for Multi-LLM Systems
Shangbin Feng, Zifeng Wang, Palash Goyal et al.
Jasmine: Harnessing Diffusion Prior for Self-supervised Depth Estimation
Jiyuan Wang, Chunyu Lin, cheng guan et al.
ReSim: Reliable World Simulation for Autonomous Driving
Jiazhi Yang, Kashyap Chitta, Shenyuan Gao et al.
Communication-Efficient Language Model Training Scales Reliably and Robustly: Scaling Laws for DiLoCo
Zachary Charles, Gabriel Teston, Lucio Dery et al.
Bag of Tricks for Inference-time Computation of LLM Reasoning
Fan LIU, Wen-Shuo Chao, Naiqiang Tan et al.
Linguini: A benchmark for language-agnostic linguistic reasoning
Eduardo Sánchez, Belen Alastruey, Christophe Ropers et al.
Adaptive Classifier-Free Guidance via Dynamic Low-Confidence Masking
Pengxiang Li, Shilin Yan, Jiayin Cai et al.
Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning
Chaofan Lin, Jiaming Tang, Shuo Yang et al.
Touch in the Wild: Learning Fine-Grained Manipulation with a Portable Visuo-Tactile Gripper
Xinyue Zhu, Binghao Huang, Yunzhu Li
KTAE: A Model-Free Algorithm to Key-Tokens Advantage Estimation in Mathematical Reasoning
Wei Sun, Wen Yang, Pu Jian et al.
Multi-step Visual Reasoning with Visual Tokens Scaling and Verification
Tianyi Bai, Zengjie Hu, Fupeng Sun et al.
Distilling LLM Agent into Small Models with Retrieval and Code Tools
Minki Kang, Jongwon Jeong, Seanie Lee et al.
Large language models can learn and generalize steganographic chain-of-thought under process supervision
ROBERT MC CARTHY, Joey SKAF, Luis Ibanez-Lissen et al.
DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models
Ziyi Wu, Anil Kag, Ivan Skorokhodov et al.
SEC-bench: Automated Benchmarking of LLM Agents on Real-World Software Security Tasks
Hwiwon Lee, Ziqi Zhang, Hanxiao Lu et al.
GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution
Fengxiang Wang, Mingshuo Chen, Yueying Li et al.
This Time is Different: An Observability Perspective on Time Series Foundation Models
Ben Cohen, Emaad Khwaja, Youssef Doubli et al.
CoRe: Benchmarking LLMs’ Code Reasoning Capabilities through Static Analysis Tasks
Danning Xie, Mingwei Zheng, Xuwei Liu et al.
BadVLA: Towards Backdoor Attacks on Vision-Language-Action Models via Objective-Decoupled Optimization
Xueyang Zhou, Guiyao Tie, Guowen Zhang et al.
Breaking Latent Prior Bias in Detectors for Generalizable AIGC Image Detection
Yue Zhou, Xinan He, Kaiqing Lin et al.
Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)
Liwei Jiang, Yuanjun Chai, Margaret Li et al.
Unifying Appearance Codes and Bilateral Grids for Driving Scene Gaussian Splatting
Nan Wang, Lixing Xiao, Yuantao Chen et al.
Preference Optimization on Pareto Sets: On a Theory of Multi-Objective Optimization
Abhishek Roy, Geelon So, Yian Ma
Hyperbolic Fine-Tuning for Large Language Models
Menglin Yang, Ram Samarth B B, Aosong Feng et al.
AI-Researcher: Autonomous Scientific Innovation
Jiabin Tang, Lianghao Xia, Zhonghang Li et al.
Think or Not? Exploring Thinking Efficiency in Large Reasoning Models via an Information-Theoretic Lens
Xixian Yong, Xiao Zhou, Yingying Zhang et al.
Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL
Jiarui Yao, Yifan Hao, Hanning Zhang et al.
PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation
Chen Wang, Chuhao Chen, Yiming Huang et al.
TS-RAG: Retrieval-Augmented Generation based Time Series Foundation Models are Stronger Zero-Shot Forecaster
Kanghui Ning, Zijie Pan, Yu Liu et al.
Don’t Think Longer, Think Wisely: Optimizing Thinking Dynamics for Large Reasoning Models
Sohyun An, Ruochen Wang, Tianyi Zhou et al.
KeyDiff: Key Similarity-Based KV Cache Eviction for Long-Context LLM Inference in Resource-Constrained Environments
Junyoung Park, Dalton Jones, Matthew Morse et al.
Solver-Informed RL: Grounding Large Language Models for Authentic Optimization Modeling
Yitian Chen, Jingfan Xia, Siyu Shao et al.
DisasterM3: A Remote Sensing Vision-Language Dataset for Disaster Damage Assessment and Response
Junjue Wang, Weihao Xuan, Heli Qi et al.
DOTA: Distributional Test-time Adaptation of Vision-Language Models
Zongbo Han, Jialong Yang, Guangyu Wang et al.
Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations
Yiyou Sun, Yu Gai, Lijie Chen et al.
AdvPrefix: An Objective for Nuanced LLM Jailbreaks
Sicheng Zhu, Brandon Amos, Yuandong Tian et al.
Emergent Temporal Correspondences from Video Diffusion Transformers
Jisu Nam, Soowon Son, Dahyun Chung et al.
Momentum-SAM: Sharpness Aware Minimization without Computational Overhead
Marlon Becker, Frederick Altrock, Benjamin Risse
A Closer Look at TabPFN v2: Understanding Its Strengths and Extending Its Capabilities
Han-Jia Ye, Si-Yang Liu, Wei-Lun (Harry) Chao
MDNS: Masked Diffusion Neural Sampler via Stochastic Optimal Control
Yuchen Zhu, Wei Guo, Jaemoo Choi et al.
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float (DFloat11)
Tianyi Zhang, Mohsen Hariri, Shaochen (Henry) Zhong et al.
$Q\sharp$: Provably Optimal Distributional RL for LLM Post-Training
Jin Zhou, Kaiwen Wang, Jonathan Chang et al.
Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning of Vision Language Models
Huajie Tan, Yuheng Ji, Xiaoshuai Hao et al.
EgoExoBench: A Benchmark for First- and Third-person View Video Understanding in MLLMs
Yuping He, Yifei Huang, Guo Chen et al.
Scaling Laws for Optimal Data Mixtures
Mustafa Shukor, Louis Bethune, Dan Busbridge et al.
VideoUFO: A Million-Scale User-Focused Dataset for Text-to-Video Generation
Wenhao Wang, Yi Yang
Rewind-to-Delete: Certified Machine Unlearning for Nonconvex Functions
Siqiao Mu, Diego Klabjan
Harmony in Divergence: Towards Fast, Accurate, and Memory-efficient Zeroth-order LLM Fine-tuning
Qitao Tan, Jun Liu, Zheng Zhan et al.