Most Cited NeurIPS "lateral stream components" Papers
MLE-STAR: Machine Learning Engineering Agent via Search and Targeted Refinement
Jaehyun Nam, Jinsung Yoon, Jiefeng Chen et al.
Training-Free Constrained Generation With Stable Diffusion Models
Stefano Zampini, Jacob K Christopher, Luca Oneto et al.
Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents
Yaxin Luo, Zhaoyi Li, Jiacheng Liu et al.
Hyperbolic Dataset Distillation
Wenyuan Li, Guang Li, Keisuke Maeda et al.
SnapMoGen: Human Motion Generation from Expressive Texts
Chuan Guo, Inwoo Hwang, Jian Wang et al.
Extrapolation by Association: Length Generalization Transfer In Transformers
Ziyang Cai, Nayoung Lee, Avi Schwarzschild et al.
Generalized Gradient Norm Clipping & Non-Euclidean $(L_0,L_1)$-Smoothness
Thomas Pethick, Wanyun Xie, Mete Erdogan et al.
Mitra: Mixed Synthetic Priors for Enhancing Tabular Foundation Models
Xiyuan Zhang, Danielle Maddix Robinson, Junming Yin et al.
Beyond Verifiable Rewards: Scaling Reinforcement Learning in Language Models to Unverifiable Data
Yunhao Tang, Sid Wang, Lovish Madaan et al.
Value-Guided Search for Efficient Chain-of-Thought Reasoning
Kaiwen Wang, Jin Zhou, Jonathan Chang et al.
FreshStack: Building Realistic Benchmarks for Evaluating Retrieval on Technical Documents
Nandan Thakur, Jimmy Lin, Samuel Havens et al.
MLZero: A Multi-Agent System for End-to-end Machine Learning Automation
Haoyang Fang, Boran Han, Nick Erickson et al.
DNAEdit: Direct Noise Alignment for Text-Guided Rectified Flow Editing
Chenxi Xie, Minghan Li, Shuai Li et al.
UltraHR-100K: Enhancing UHR Image Synthesis with A Large-Scale High-Quality Dataset
Chen Zhao, En Ci, Yunzhe Xu et al.
Do LLMs Really Forget? Evaluating Unlearning with Knowledge Correlation and Confidence Awareness
Rongzhe Wei, Peizhi Niu, Hans Hao-Hsun Hsu et al.
Noise Injection Reveals Hidden Capabilities of Sandbagging Language Models
Cameron Tice, Philipp Kreer, Nathan Helm-Burger et al.
Short-length Adversarial Training Helps LLMs Defend Long-length Jailbreak Attacks: Theoretical and Empirical Evidence
Shaopeng Fu, Liang Ding, Jingfeng Zhang et al.
Lifelong Safety Alignment for Language Models
Haoyu Wang, Yifei Zhao, Zeyu Qin et al.
TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs
Yunheng Li, Jing Cheng, Shaoyong Jia et al.
Structured Linear CDEs: Maximally Expressive and Parallel-in-Time Sequence Models
Benjamin Walker, Lingyi Yang, Nicola Muca Cirone et al.
SoPo: Text-to-Motion Generation Using Semi-Online Preference Optimization
Xiaofeng Tan, Hongsong Wang, Xin Geng et al.
Týr-the-Pruner: Structural Pruning LLMs via Global Sparsity Distribution Optimization
Guanchen Li, Yixing Xu, Zeping Li et al.
Uniform Generalization Bounds on Data-Dependent Hypothesis Sets via PAC-Bayesian Theory on Random Sets
Benjamin Dupuis, Paul Viallard, George Deligiannidis et al.
HumanoidGen: Data Generation for Bimanual Dexterous Manipulation via LLM Reasoning
Zhi Jing, Siyuan Yang, Jicong Ao et al.
Large Language Models Think Too Fast To Explore Effectively
Lan Pan, Hanbo Xie, Robert Wilson
DexGarmentLab: Dexterous Garment Manipulation Environment with Generalizable Policy
Yuran Wang, Ruihai Wu, Yue Chen et al.
Towards Doctor-Like Reasoning: Medical RAG Fusing Knowledge with Patient Analogy through Textual Gradients
Yuxing Lu, Gecheng Fu, Wei Wu et al.
Probing Equivariance and Symmetry Breaking in Convolutional Networks
Sharvaree Vadgama, Mohammad Islam, Domas Buracas et al.
Provable Scaling Laws for the Test-Time Compute of Large Language Models
Yanxi Chen, Xuchen Pan, Yaliang Li et al.
T2V-OptJail: Discrete Prompt Optimization for Text-to-Video Jailbreak Attacks
Jiayang Liu, Siyuan Liang, Shiqian Zhao et al.
Zebra-Llama: Towards Extremely Efficient Hybrid Models
Mingyu Yang, Mehdi Rezagholizadeh, Guihong Li et al.
Same Task, Different Circuits: Disentangling Modality-Specific Mechanisms in VLMs
Yaniv Nikankin, Dana Arad, Yossi Gandelsman et al.
LibriBrain: Over 50 Hours of Within-Subject MEG to Improve Speech Decoding Methods at Scale
Miran Özdogan, Gilad Landau, Gereon Elvers et al.
LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers
Yusuf Dalva, Hidir Yesiltepe, Pinar Yanardag
CoVoMix2: Advancing Zero-Shot Dialogue Generation with Fully Non-Autoregressive Flow Matching
Leying Zhang, Yao Qian, Xiaofei Wang et al.
Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization
Mingzhe Du, Anh Tuan Luu, Yue Liu et al.
Decoder-Hybrid-Decoder Architecture for Efficient Reasoning with Long Generation
Liliang Ren, Congcong Chen, Haoran Xu et al.
FlowCut: Rethinking Redundancy via Information Flow for Efficient Vision-Language Models
Jintao Tong, Wenwei Jin, Pengda Qin et al.
Fast Inference for Augmented Large Language Models
Rana Shahout, Cong Liang, Shiji Xin et al.
Split Gibbs Discrete Diffusion Posterior Sampling
Wenda Chu, Zihui Wu, Yifan Chen et al.
Janus-Pro-R1: Advancing Collaborative Visual Comprehension and Generation via Reinforcement Learning
Kaihang Pan, Yang Wu, Wendong Bu et al.
Decompile-Bench: Million-Scale Binary-Source Function Pairs for Real-World Binary Decompilation
Hanzhuo Tan, Xiaolong Tian, Hanrui Qi et al.
Trust Region Constrained Measure Transport in Path Space for Stochastic Optimal Control and Inference
Denis Blessing, Julius Berner, Lorenz Richter et al.
Sherlock: Self-Correcting Reasoning in Vision-Language Models
Yi Ding, Ruqi Zhang
Truth over Tricks: Measuring and Mitigating Shortcut Learning in Misinformation Detection
Herun Wan, Jiaying Wu, Minnan Luo et al.
Attention Mechanism, Max-Affine Partition, and Universal Approximation
Hude Liu, Jerry Yao-Chieh Hu, Zhao Song et al.
Probabilistic Stability Guarantees for Feature Attributions
Helen Jin, Anton Xue, Weiqiu You et al.
EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World?
Yuqian Yuan, Ronghao Dang, Long Li et al.
Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression
Kunjun Li, Zigeng Chen, Cheng-Yen Yang et al.
Simultaneous Swap Regret Minimization via KL-Calibration
Haipeng Luo, Spandan Senapati, Vatsal Sharan
Understanding Adam Requires Better Rotation Dependent Assumptions
Tianyue Zhang, Lucas Maes, Alan Milligan et al.
Prediction-Powered Causal Inferences
Riccardo Cadei, Ilker Demirel, Piersilvio De Bartolomeis et al.
Blending Complementary Memory Systems in Hybrid Quadratic-Linear Transformers
Kazuki Irie, Morris Yau, Samuel J Gershman
Diversity as a Reward: Fine-Tuning LLMs on a Mixture of Domain-Undetermined Data
Zhenqing Ling, Daoyuan Chen, Liuyi Yao et al.
Enhancing Text-to-Image Diffusion Transformer via Split-Text Conditioning
Yu Zhang, Jialei Zhou, Xinchen Li et al.
Understanding Representation Dynamics of Diffusion Models via Low-Dimensional Modeling
Xiao Li, Zekai Zhang, Xiang Li et al.
Statistically Valid Post-Deployment Monitoring Should Be Standard for AI-Based Digital Health
Pavel Dolin, Weizhi Li, Gautam Dasarathy et al.
The Persistence of Neural Collapse Despite Low-Rank Bias
Connall Garrod, Jonathan Keating
A multiscale analysis of mean-field transformers in the moderate interaction regime
Giuseppe Bruno, Federico Pasqualotto, Andrea Agazzi
1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering
Yuheng Yuan, Qiuhong Shen, Xingyi Yang et al.
Dynamical Low-Rank Compression of Neural Networks with Robustness under Adversarial Attacks
Steffen Schotthöfer, Lexie Yang, Stefan Schnake
Keeping an Eye on LLM Unlearning: The Hidden Risk and Remedy
Jie Ren, Zhenwei Dai, Xianfeng Tang et al.
Fused View-Time Attention and Feedforward Reconstruction for 4D Scene Generation
Chaoyang Wang, Ashkan Mirzaei, Vidit Goel et al.
SceneSplat++: A Large Dataset and Comprehensive Benchmark for Language Gaussian Splatting
Mengjiao Ma, Qi Ma, Yue Li et al.
CausalPFN: Amortized Causal Effect Estimation via In-Context Learning
Vahid Balazadeh, Hamidreza Kamkari, Valentin Thomas et al.
HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models
Yu Zhou, Xingyu Wu, Jibin Wu et al.
BOOM: Benchmarking Out-Of-distribution Molecular Property Predictions of Machine Learning Models
Evan Antoniuk, Shehtab Zaman, Tal Ben-Nun et al.
SimpleStrat: Diversifying Language Model Generation with Stratification
Justin Wong, Yury Orlovskiy, Alexander Shypula et al.
SUMO: Subspace-Aware Moment-Orthogonalization for Accelerating Memory-Efficient LLM Training
Yehonathan Refael, Guy Smorodinsky, Tom Tirer et al.
Multimodal Tabular Reasoning with Privileged Structured Information
Jun-Peng Jiang, Yu Xia, Hai-Long Sun et al.
Momentum Multi-Marginal Schrödinger Bridge Matching
Panagiotis Theodoropoulos, Augustinos Saravanos, Evangelos Theodorou et al.
Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners
Michal Nauman, Marek Cygan, Carmelo Sferrazza et al.
Benign Overfitting in Single-Head Attention
Roey Magen, Shuning Shang, Zhiwei Xu et al.
Straight-Line Diffusion Model for Efficient 3D Molecular Generation
Yuyan Ni, Shikun Feng, Haohan Chi et al.
A Smooth Sea Never Made a Skilled SAILOR: Robust Imitation via Learning to Search
Arnav Kumar Jain, Vibhakar Mohta, Subin Kim et al.
Validating LLM-as-a-Judge Systems under Rating Indeterminacy
Luke Guerdan, Solon Barocas, Kenneth Holstein et al.
Cross-Domain Graph Data Scaling: A Showcase with Diffusion Models
Wenzhuo Tang, Haitao Mao, Danial Dervovic et al.
Vgent: Graph-based Retrieval-Reasoning-Augmented Generation For Long Video Understanding
Xiaoqian Shen, Wenxuan Zhang, Jun Chen et al.
Improving Energy Natural Gradient Descent through Woodbury, Momentum, and Randomization
Andrés Guzmán-Cordero, Felix Dangel, Gil Goldshlager et al.
Rendering-Aware Reinforcement Learning for Vector Graphics Generation
Juan Rodriguez, Haotian Zhang, Abhay Puri et al.
LayerCraft: Enhancing Text-to-Image Generation with CoT Reasoning and Layered Object Integration
Yuyao Zhang, Jinghao Li, Yu-Wing Tai
Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers
Zhengliang Shi, Lingyong Yan, Dawei Yin et al.
ZPressor: Bottleneck-Aware Compression for Scalable Feed-Forward 3DGS
Weijie Wang, Donny Y. Chen, Zeyu Zhang et al.
T-SHIRT: Token-Selective Hierarchical Data Selection for Instruction Tuning
Yanjun Fu, Faisal Hamman, Sanghamitra Dutta
Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning
Ali Taghibakhshi, Sharath Turuvekere Sreenivas, Saurav Muralidharan et al.
Scaffolding Dexterous Manipulation with Vision-Language Models
Vincent de Bakker, Joey Hejna, Tyler Lum et al.
Data-Juicer 2.0: Cloud-Scale Adaptive Data Processing for and with Foundation Models
Daoyuan Chen, Yilun Huang, Xuchen Pan et al.
Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents
Han Lin, Jaemin Cho, Amir Zadeh et al.
Seg4Diff: Unveiling Open-Vocabulary Semantic Segmentation in Text-to-Image Diffusion Transformers
Chaehyun Kim, Heeseong Shin, Eunbeen Hong et al.
MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems
Xuanming Zhang, Yuxuan Chen, Samuel (Min-Hsuan) Yeh et al.
CAM: A Constructivist View of Agentic Memory for LLM-Based Reading Comprehension
Rui Li, Zeyu Zhang, Xiaohe Bo et al.
DataRater: Meta-Learned Dataset Curation
Dan Andrei Calian, Greg Farquhar, Iurii Kemaev et al.
DyMU: Dynamic Merging and Virtual Unmerging for Efficient Variable-Length VLMs
Zhenhailong Wang, Senthil Purushwalkam, Caiming Xiong et al.
SteerConf: Steering LLMs for Confidence Elicitation
Ziang Zhou, Tianyuan Jin, Jieming Shi et al.
Space Group Equivariant Crystal Diffusion
Rees Chang, Angela Pak, Alex Guerra et al.
Grounding Language with Vision: A Conditional Mutual Information Calibrated Decoding Strategy for Reducing Hallucinations in LVLMs
Hao Fang, Changle Zhou, Jiawei Kong et al.
Broken Tokens? Your Language Model can Secretly Handle Non-Canonical Tokenizations
Brian Zheng, Alisa Liu, Orevaoghene Ahia et al.
Test3R: Learning to Reconstruct 3D at Test Time
Yuheng Yuan, Qiuhong Shen, Shizun Wang et al.
Manipulating Feature Visualizations with Gradient Slingshots
Dilyara Bareeva, Marina Höhne, Alexander Warnecke et al.
Token Perturbation Guidance for Diffusion Models
Javad Rajabi, Soroush Mehraban, Seyedmorteza Sadat et al.
OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding
Jingli Lin, Chenming Zhu, Runsen Xu et al.
DMWM: Dual-Mind World Model with Long-Term Imagination
Lingyi Wang, Rashed Shelim, Walid Saad et al.
Video Perception Models for 3D Scene Synthesis
Rui Huang, Guangyao Zhai, Zuria Bauer et al.
A solvable model of learning generative diffusion: theory and insights
Hugo Cui, Cengiz Pehlevan, Yue Lu
Dense SAE Latents Are Features, Not Bugs
Xiaoqing Sun, Alessandro Stolfo, Joshua Engels et al.
Why Do Some Language Models Fake Alignment While Others Don't?
Abhay Sheshadri, John Hughes, Julian Michael et al.
SQLens: An End-to-End Framework for Error Detection and Correction in Text-to-SQL
Yue Gong, Chuan Lei, Xiao Qin et al.
Thousand Voices of Trauma: A Large-Scale Synthetic Dataset for Modeling Prolonged Exposure Therapy Conversations
Suhas BN, Andrew Sherrill, Rosa I. Arriaga et al.
Constrained Optimization From a Control Perspective via Feedback Linearization
Runyu Zhang, Arvind Raghunathan, Jeff Shamma et al.
Language Models Can Predict Their Own Behavior
Dhananjay Ashok, Jonathan May
On the Relation between Rectified Flows and Optimal Transport
Johannes Hertrich, Antonin Chambolle, Julie Delon
DynamicRAG: Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation
Jiashuo Sun, Xianrui Zhong, Sizhe Zhou et al.
WorldWeaver: Generating Long-Horizon Video Worlds via Rich Perception
Zhiheng Liu, Xueqing Deng, Shoufa Chen et al.
PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following Models Need for Efficient Generation
Ao Wang, Hui Chen, Jianchao Tan et al.
FlexEvent: Towards Flexible Event-Frame Object Detection at Varying Operational Frequencies
Dongyue Lu, Lingdong Kong, Gim Hee Lee et al.
Thinker: Learning to Think Fast and Slow
Stephen Chung, Wenyu Du, Jie Fu
Every Rollout Counts: Optimal Resource Allocation for Efficient Test-Time Scaling
Xinglin Wang, Yiwei Li, Shaoxiong Feng et al.
Seeking and Updating with Live Visual Knowledge
Mingyang Fu, Yuyang Peng, Dongping Chen et al.
Frame In-N-Out: Unbounded Controllable Image-to-Video Generation
Boyang Wang, Xuweiyi Chen, Matheus Gadelha et al.
On Union-Closedness of Language Generation
Steve Hanneke, Amin Karbasi, Anay Mehrotra et al.
VeriThoughts: Enabling Automated Verilog Code Generation using Reasoning and Formal Verification
Patrick Yubeaton, Andre Nakkab, Weihua Xiao et al.
On Extending Direct Preference Optimization to Accommodate Ties
Jinghong Chen, Guangyu Yang, Weizhe Lin et al.
MetaBox-v2: A Unified Benchmark Platform for Meta-Black-Box Optimization
Zeyuan Ma, Yue-Jiao Gong, Hongshu Guo et al.
New Perspectives on the Polyak Stepsize: Surrogate Functions and Negative Results
Francesco Orabona, Ryan D'Orazio
BEDLAM2.0: Synthetic humans and cameras in motion
Joachim Tesch, Giorgio Becherini, Prerana Achar et al.
Accelerating Multimodal Large Language Models via Dynamic Visual-Token Exit and the Empirical Findings
Qiong Wu, Wenhao Lin, Yiyi Zhou et al.
Thinking in Character: Advancing Role-Playing Agents with Role-Aware Reasoning
Yihong Tang, Kehai Chen, Muyun Yang et al.
Purifying Shampoo: Investigating Shampoo's Heuristics by Decomposing its Preconditioner
Runa Eschenhagen, Aaron Defazio, Tsung-Hsien Lee et al.
SoMi-ToM: Evaluating Multi-Perspective Theory of Mind in Embodied Social Interactions
Xianzhe Fan, Xuhui Zhou, Chuanyang Jin et al.
InfantAgent-Next: A Multimodal Generalist Agent for Automated Computer Interaction
Bin Lei, Weitai Kang, Zijian Zhang et al.
DynaGuide: Steering Diffusion Polices with Active Dynamic Guidance
Maximilian Du, Shuran Song
Multi-Modal View Enhanced Large Vision Models for Long-Term Time Series Forecasting
ChengAo Shen, Wenchao Yu, Ziming Zhao et al.
DrVD-Bench: Do Vision-Language Models Reason Like Human Doctors in Medical Image Diagnosis?
Tianhong Zhou, Xu Yin, Yingtao Zhu et al.
ConStellaration: A dataset of QI-like stellarator plasma boundaries and optimization benchmarks
Santiago Cadena, Andrea Merlo, Emanuel Laude et al.
Chiron-o1: Igniting Multimodal Large Language Models towards Generalizable Medical Reasoning via Mentor-Intern Collaborative Search
Haoran Sun, Yankai Jiang, Wenjie Lou et al.
MAGNET: A Multi-agent Framework for Finding Audio-Visual Needles by Reasoning over Multi-Video Haystacks
Sanjoy Chowdhury, Mohamed Elmoghany, Yohan Abeysinghe et al.
Stochastic Process Learning via Operator Flow Matching
Yaozhong Shi, Zachary Ross, Domniki Asimaki et al.
Compliant Residual DAgger: Improving Real-World Contact-Rich Manipulation with Human Corrections
Xiaomeng Xu, Yifan Hou, Zeyi Liu et al.
MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds
Bingquan Dai, Luo Li, Qihong Tang et al.
Learning Orthogonal Multi-Index Models: A Fine-Grained Information Exponent Analysis
Yunwei Ren, Jason Lee
The Fluorescent Veil: A Stealthy and Effective Physical Adversarial Patch Against Traffic Sign Recognition
Shuai Yuan, Xingshuo Han, Hongwei Li et al.
Uncertainty Quantification with the Empirical Neural Tangent Kernel
Joseph Wilson, Chris van der Heide, Liam Hodgkinson et al.
Revisiting End-to-End Learning with Slide-level Supervision in Computational Pathology
Wenhao Tang, Rong Qin, Heng Fang et al.
MEGADance: Mixture-of-Experts Architecture for Genre-Aware 3D Dance Generation
Kaixing Yang, Xulong Tang, Ziqiao Peng et al.
DreamPRM: Domain-reweighted Process Reward Model for Multimodal Reasoning
Qi Cao, Ruiyi Wang, Ruiyi Zhang et al.
CAT: Content-Adaptive Image Tokenization
Junhong Shen, Kushal Tirumala, Michihiro Yasunaga et al.
Learning quadratic neural networks in high dimensions: SGD dynamics and scaling laws
Gerard Ben Arous, Murat Erdogdu, Nuri Mert Vural et al.
ChartSketcher: Reasoning with Multimodal Feedback and Reflection for Chart Understanding
Muye Huang, Lingling Zhang, Jie Ma et al.
Hybrid Latent Reasoning via Reinforcement Learning
Zhenrui Yue, Bowen Jin, Huimin Zeng et al.
nvBench 2.0: Resolving Ambiguity in Text-to-Visualization through Stepwise Reasoning
Tianqi Luo, Chuhan Huang, Leixian Shen et al.
Predicting Empirical AI Research Outcomes with Language Models
Jiaxin Wen, Chenglei Si, Yueh-Han Chen et al.
Auditing Meta-Cognitive Hallucinations in Reasoning Large Language Models
Haolang Lu, Yilian Liu, Jingxin Xu et al.
PhyBlock: A Progressive Benchmark for Physical Understanding and Planning via 3D Block Assembly
Liang Ma, Jiajun Wen, Min Lin et al.
ResearchCodeBench: Benchmarking LLMs on Implementing Novel Machine Learning Research Code
Tianyu Hua, Harper Hua, Violet Xiang et al.
On scalable and efficient training of diffusion samplers
Minkyu Kim, Kiyoung Seong, Dongyeop Woo et al.
Critical Batch Size Revisited: A Simple Empirical Approach to Large-Batch Language Model Training
Will Merrill, Shane Arora, Dirk Groeneveld et al.
Demystifying Language Model Forgetting with Low-rank Example Associations
Xisen Jin, Xiang Ren
InfinityStar: Unified Spacetime AutoRegressive Modeling for Visual Generation
Jinlai Liu, Jian Han, Bin Yan et al.
When Are Concepts Erased From Diffusion Models?
Kevin Lu, Nicky Kriplani, Rohit Gandikota et al.
Distillation Robustifies Unlearning
Bruce W. Lee, Addie Foote, Alex Infanger et al.
In-Context Learning of Stochastic Differential Equations with Foundation Inference Models
Patrick Seifner, Kostadin Cvejoski, David Berghaus et al.
COME: Adding Scene-Centric Forecasting Control to Occupancy World Model
Yining Shi, Kun Jiang, Qiang Meng et al.
Scaling Physical Reasoning with the PHYSICS Dataset
Shenghe Zheng, Qianjia Cheng, Junchi Yao et al.
Learning with Calibration: Exploring Test-Time Computing of Spatio-Temporal Forecasting
Wei Chen, Yuxuan Liang
On the Value of Cross-Modal Misalignment in Multimodal Representation Learning
Yichao Cai, Yuhang Liu, Erdun Gao et al.
Vid-SME: Membership Inference Attacks against Large Video Understanding Models
Qi Li, Runpeng Yu, Xinchao Wang
Flow Matching-Based Autonomous Driving Planning with Advanced Interactive Behavior Modeling
Tianyi Tan, Yinan Zheng, Ruiming Liang et al.
Efficient Quadratic Corrections for Frank-Wolfe Algorithms
Jannis Halbey, Seta Rakotomandimby, Mathieu Besançon et al.
Improved Balanced Classification with Theoretically Grounded Loss Functions
Corinna Cortes, Mehryar Mohri, Yutao Zhong
What Matters in Data for DPO?
Yu Pan, Zhongze Cai, Huaiyang Zhong et al.
We Should Chart an Atlas of All the World's Models
Eliahu Horwitz, Nitzan Kurer, Jonathan Kahana et al.
Estimating Model Performance Under Covariate Shift Without Labels
Jakub Białek, Juhani Kivimäki, Wojciech Kuberski et al.
MedMax: Mixed-Modal Instruction Tuning for Training Biomedical Assistants
Hritik Bansal, Daniel Israel, Siyan Zhao et al.
The emergence of sparse attention: impact of data distribution and benefits of repetition
Nicolas Zucchet, Francesco D'Angelo, Andrew Lampinen et al.
Curly Flow Matching for Learning Non-gradient Field Dynamics
Katarina Petrović, Lazar Atanackovic, Viggo Moro et al.
MindGYM: What Matters in Question Synthesis for Thinking-Centric Fine-Tuning?
Zhe Xu, Daoyuan Chen, Zhenqing Ling et al.
From Bytes to Ideas: Language Modeling with Autoregressive U-Nets
Mathurin Videau, Badr Youbi Idrissi, Alessandro Leite et al.
AOR: Anatomical Ontology-Guided Reasoning for Medical Large Multimodal Model in Chest X-Ray Interpretation
Qingqiu Li, Zihang Cui, Seongsu Bae et al.
Scaling Offline RL via Efficient and Expressive Shortcut Models
Nicolas Espinosa-Dice, Yiyi Zhang, Yiding Chen et al.
Geo-Sign: Hyperbolic Contrastive Regularisation for Geometrically Aware Sign Language Translation
Edward Fish, Richard Bowden
BiggerGait: Unlocking Gait Recognition with Layer-wise Representations from Large Vision Models
Dingqiang Ye, Chao Fan, Zhanbo Huang et al.
RTV-Bench: Benchmarking MLLM Continuous Perception, Understanding and Reasoning through Real-Time Video
ShuHang Xun, Sicheng Tao, Jungang Li et al.
Understanding LLM Behaviors via Compression: Data Generation, Knowledge Acquisition and Scaling Laws
Zhixuan Pan, Shaowen Wang, Liao Pengfei et al.
SAVVY: Spatial Awareness via Audio-Visual LLMs through Seeing and Hearing
Mingfei Chen, Zijun Cui, Xiulong Liu et al.
Centralized Reward Agent for Knowledge Sharing and Transfer in Multi-Task Reinforcement Learning
Haozhe Ma, Zhengding Luo, Thanh Vinh Vo et al.
True Zero-Shot Inference of Dynamical Systems Preserving Long-Term Statistics
Christoph Jürgen Hemmer, Daniel Durstewitz
Time-IMM: A Dataset and Benchmark for Irregular Multimodal Multivariate Time Series
Ching Chang, Jeehyun Hwang, Yidan Shi et al.
PanoWan: Lifting Diffusion Video Generation Models to 360$^\circ$ with Latitude/Longitude-aware Mechanisms
Yifei Xia, Shuchen Weng, Siqi Yang et al.
L-MTP: Leap Multi-Token Prediction Beyond Adjacent Context for Large Language Models
Xiaohao Liu, Xiaobo Xia, Weixiang Zhao et al.
Spiral: Semantic-Aware Progressive LiDAR Scene Generation and Understanding
Dekai Zhu, Yixuan Hu, Youquan Liu et al.
Multiplayer Federated Learning: Reaching Equilibrium with Less Communication
TaeHo Yoon, Sayantan Choudhury, Nicolas Loizou
EgoExOR: An Ego-Exo-Centric Operating Room Dataset for Surgical Activity Understanding
Ege Özsoy, Arda Mamur, Felix Tristram et al.
Learning Diffusion Models with Flexible Representation Guidance
Chenyu Wang, Cai Zhou, Sharut Gupta et al.
Logic.py: Bridging the Gap between LLMs and Constraint Solvers
Pascal Kesseli, Peter O'Hearn, Ricardo Cabral
Scaling Speculative Decoding with Lookahead Reasoning
Yichao Fu, Rui Ge, Zelei Shao et al.
Latent Policy Barrier: Learning Robust Visuomotor Policies by Staying In-Distribution
Zhanyi Sun, Shuran Song
MINGLE: Mixture of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging
Zihuan Qiu, Yi Xu, Chiyuan He et al.
Time-o1: Time-Series Forecasting Needs Transformed Label Alignment
Hao Wang, Licheng Pan, Zhichao Chen et al.
Efficient Prompt Compression with Evaluator Heads for Long-Context Transformer Inference
Weizhi Fei, Xueyan Niu, Xie Guoqing et al.
Alligat0R: Pre-Training through Covisibility Segmentation for Relative Camera Pose Regression
Thibaut Loiseau, Guillaume Bourmaud, Vincent Lepetit