Most Cited NEURIPS "guided generation" Papers
5,858 papers found • Page 4 of 30
Conference
MLZero: A Multi-Agent System for End-to-end Machine Learning Automation
Haoyang Fang, Boran Han, Nick Erickson et al.
Value-Guided Search for Efficient Chain-of-Thought Reasoning
Kaiwen Wang, Jin Zhou, Jonathan Chang et al.
Training-Free Constrained Generation With Stable Diffusion Models
Stefano Zampini, Jacob K Christopher, Luca Oneto et al.
MimeQA: Towards Socially-Intelligent Nonverbal Foundation Models
Hengzhi Li, Megan Tjandrasuwita, Yi R. (May) Fung et al.
Quantization Error Propagation: Revisiting Layer-Wise Post-Training Quantization
Yamato Arai, Yuma Ichikawa
VORTA: Efficient Video Diffusion via Routing Sparse Attention
Wenhao Sun, Rong-Cheng Tu, Yifu Ding et al.
FlashMD: long-stride, universal prediction of molecular dynamics
Filippo Bigi, Sanggyu Chong, Agustinus Kristiadi et al.
Locality in Image Diffusion Models Emerges from Data Statistics
Artem Lukoianov, Chenyang Yuan, Justin Solomon et al.
Agentic Plan Caching: Test-Time Memory for Fast and Cost-Efficient LLM Agents
Qizheng Zhang, Michael Wornow, Kunle Olukotun
Enhancing 3D Reconstruction for Dynamic Scenes
Jisang Han, Honggyu An, Jaewoo Jung et al.
LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS
Wanhua Li, Yujie Zhao, Minghan Qin et al.
Depth-Bounds for Neural Networks via the Braid Arrangement
Moritz Grillo, Christoph Hertrich, Georg Loho
Don't Just Chase “Highlighted Tokens” in MLLMs: Revisiting Visual Holistic Context Retention
Xin Zou, Di Lu, Yizhou Wang et al.
Generative RLHF-V: Learning Principles from Multi-modal Human Preference
Jiayi Zhou, Jiaming Ji, Boyuan Chen et al.
FlowCut: Rethinking Redundancy via Information Flow for Efficient Vision-Language Models
Jintao Tong, Wenwei Jin, Pengda Qin et al.
Uniform Generalization Bounds on Data-Dependent Hypothesis Sets via PAC-Bayesian Theory on Random Sets
Benjamin Dupuis, Paul Viallard, George Deligiannidis et al.
Diversity as a Reward: Fine-Tuning LLMs on a Mixture of Domain-Undetermined Data
Zhenqing Ling, Daoyuan Chen, Liuyi Yao et al.
Provable Scaling Laws for the Test-Time Compute of Large Language Models
Yanxi Chen, Xuchen Pan, Yaliang Li et al.
SoPo: Text-to-Motion Generation Using Semi-Online Preference Optimization
Xiaofeng Tan, Hongsong Wang, Xin Geng et al.
Scaffolding Dexterous Manipulation with Vision-Language Models
Vincent de Bakker, Joey Hejna, Tyler Lum et al.
Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning
Ali Taghibakhshi, Sharath Turuvekere Sreenivas, Saurav Muralidharan et al.
Sherlock: Self-Correcting Reasoning in Vision-Language Models
Yi Ding, Ruqi Zhang
TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs
Yunheng Li, Jing Cheng, Shaoyong Jia et al.
DataRater: Meta-Learned Dataset Curation
Dan Andrei Calian, Greg Farquhar, Iurii Kemaev et al.
T-SHIRT: Token-Selective Hierarchical Data Selection for Instruction Tuning
Yanjun Fu, Faisal Hamman, Sanghamitra Dutta
Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers
Zhengliang Shi, Lingyong Yan, Dawei Yin et al.
Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners
Michal Nauman, Marek Cygan, Carmelo Sferrazza et al.
SimpleStrat: Diversifying Language Model Generation with Stratification
Justin Wong, Yury Orlovskiy, Alexander Shypula et al.
CausalPFN: Amortized Causal Effect Estimation via In-Context Learning
Vahid Balazadeh, Hamidreza Kamkari, Valentin Thomas et al.
SceneSplat++: A Large Dataset and Comprehensive Benchmark for Language Gaussian Splatting
Mengjiao Ma, Qi Ma, Yue Li et al.
Fast Inference for Augmented Large Language Models
Rana Shahout, Cong Liang, Shiji Xin et al.
OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding
Jingli Lin, Chenming Zhu, Runsen Xu et al.
EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World?
Yuqian Yuan, Ronghao Dang, long li et al.
Decompile-Bench: Million-Scale Binary-Source Function Pairs for Real-World Binary Decompilation
hanzhuo tan, Xiaolong Tian, Hanrui Qi et al.
LibriBrain: Over 50 Hours of Within-Subject MEG to Improve Speech Decoding Methods at Scale
Miran Özdogan, Gilad Landau, Gereon Elvers et al.
Same Task, Different Circuits: Disentangling Modality-Specific Mechanisms in VLMs
Yaniv Nikankin, Dana Arad, Yossi Gandelsman et al.
The Persistence of Neural Collapse Despite Low-Rank Bias
Connall Garrod, Jonathan Keating
Data-Juicer 2.0: Cloud-Scale Adaptive Data Processing for and with Foundation Models
Daoyuan Chen, Yilun Huang, Xuchen Pan et al.
Validating LLM-as-a-Judge Systems under Rating Indeterminacy
Luke Guerdan, Solon Barocas, Kenneth Holstein et al.
Dynamical Low-Rank Compression of Neural Networks with Robustness under Adversarial Attacks
Steffen Schotthöfer, Lexie Yang, Stefan Schnake
Understanding Representation Dynamics of Diffusion Models via Low-Dimensional Modeling
Xiao Li, Zekai Zhang, Xiang Li et al.
Prediction-Powered Causal Inferences
Riccardo Cadei, Ilker Demirel, Piersilvio De Bartolomeis et al.
Understanding Adam Requires Better Rotation Dependent Assumptions
Tianyue Zhang, Lucas Maes, Alan Milligan et al.
Attention Mechanism, Max-Affine Partition, and Universal Approximation
Hude Liu, Jerry Yao-Chieh Hu, Zhao Song et al.
Split Gibbs Discrete Diffusion Posterior Sampling
Wenda Chu, Zihui Wu, Yifan Chen et al.
LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers
Yusuf Dalva, Hidir Yesiltepe, Pinar Yanardag
T2V-OptJail: Discrete Prompt Optimization for Text-to-Video Jailbreak Attacks
Jiayang Liu, Siyuan Liang, Shiqian Zhao et al.
Towards Doctor-Like Reasoning: Medical RAG Fusing Knowledge with Patient Analogy through Textual Gradients
Yuxing Lu, Gecheng Fu, Wei Wu et al.
Probing Equivariance and Symmetry Breaking in Convolutional Networks
Sharvaree Vadgama, Mohammad Islam, Domas Buracas et al.
Do LLMs Really Forget? Evaluating Unlearning with Knowledge Correlation and Confidence Awareness
Rongzhe Wei, Peizhi Niu, Hans Hao-Hsun Hsu et al.
MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems
Xuanming Zhang, Yuxuan Chen, Samuel (Min-Hsuan) Yeh et al.
LayerCraft: Enhancing Text-to-Image Generation with CoT Reasoning and Layered Object Integration
Yuyao Zhang, Jinghao Li, Yu-Wing Tai
Straight-Line Diffusion Model for Efficient 3D Molecular Generation
Yuyan Ni, Shikun Feng, Haohan Chi et al.
ZPressor: Bottleneck-Aware Compression for Scalable Feed-Forward 3DGS
Weijie Wang, Donny Y. Chen, Zeyu Zhang et al.
BOOM: Benchmarking Out-Of-distribution Molecular Property Predictions of Machine Learning Models
Evan Antoniuk, Shehtab Zaman, Tal Ben-Nun et al.
Keeping an Eye on LLM Unlearning: The Hidden Risk and Remedy
Jie Ren, Zhenwei Dai, Xianfeng Tang et al.
A multiscale analysis of mean-field transformers in the moderate interaction regime
Giuseppe Bruno, Federico Pasqualotto, Andrea Agazzi
Statistically Valid Post-Deployment Monitoring Should Be Standard for AI-Based Digital Health
Pavel Dolin, Weizhi Li, Gautam Dasarathy et al.
Blending Complementary Memory Systems in Hybrid Quadratic-Linear Transformers
Kazuki Irie, Morris Yau, Samuel J Gershman
Simultaneous Swap Regret Minimization via KL-Calibration
Haipeng Luo, Spandan Senapati, Vatsal Sharan
Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression
Kunjun Li, Zigeng Chen, Cheng-Yen Yang et al.
Janus-Pro-R1: Advancing Collaborative Visual Comprehension and Generation via Reinforcement Learning
Kaihang Pan, Yang Wu, Wendong Bu et al.
Probabilistic Stability Guarantees for Feature Attributions
Helen Jin, Anton Xue, Weiqiu You et al.
Test3R: Learning to Reconstruct 3D at Test Time
Yuheng Yuan, Qiuhong Shen, Shizun Wang et al.
CoVoMix2: Advancing Zero-Shot Dialogue Generation with Fully Non-Autoregressive Flow Matching
Leying Zhang, Yao Qian, Xiaofei Wang et al.
Decoder-Hybrid-Decoder Architecture for Efficient Reasoning with Long Generation
Liliang Ren, Congcong Chen, Haoran Xu et al.
Large Language Models Think Too Fast To Explore Effectively
Lan Pan, Hanbo Xie, Robert Wilson
Structured Linear CDEs: Maximally Expressive and Parallel-in-Time Sequence Models
Benjamin Walker, Lingyi Yang, Nicola Muca Cirone et al.
Short-length Adversarial Training Helps LLMs Defend Long-length Jailbreak Attacks: Theoretical and Empirical Evidence
Shaopeng Fu, Liang Ding, Jingfeng ZHANG et al.
Grounding Language with Vision: A Conditional Mutual Information Calibrated Decoding Strategy for Reducing Hallucinations in LVLMs
Hao Fang, Changle Zhou, Jiawei Kong et al.
Broken Tokens? Your Language Model can Secretly Handle Non-Canonical Tokenizations
Brian Zheng, Alisa Liu, Orevaoghene Ahia et al.
SteerConf: Steering LLMs for Confidence Elicitation
Ziang Zhou, Tianyuan Jin, Jieming Shi et al.
Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents
Han Lin, Jaemin Cho, Amir Zadeh et al.
Improving Energy Natural Gradient Descent through Woodbury, Momentum, and Randomization
Andrés Guzmán-Cordero, Felix Dangel, Gil Goldshlager et al.
Vgent: Graph-based Retrieval-Reasoning-Augmented Generation For Long Video Understanding
Xiaoqian Shen, Wenxuan Zhang, Jun Chen et al.
Multimodal Tabular Reasoning with Privileged Structured Information
Jun-Peng Jiang, Yu Xia, Hai-Long Sun et al.
Momentum Multi-Marginal Schrödinger Bridge Matching
Panagiotis Theodoropoulos, Augustinos Saravanos, Evangelos Theodorou et al.
SUMO: Subspace-Aware Moment-Orthogonalization for Accelerating Memory-Efficient LLM Training
Yehonathan Refael, Guy Smorodinsky, Tom Tirer et al.
HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models
Yu Zhou, Xingyu Wu, Jibin Wu et al.
Trust Region Constrained Measure Transport in Path Space for Stochastic Optimal Control and Inference
Denis Blessing, Julius Berner, Lorenz Richter et al.
Zebra-Llama: Towards Extremely Efficient Hybrid Models
Mingyu Yang, Mehdi Rezagholizadeh, Guihong Li et al.
Týr-the-Pruner: Structural Pruning LLMs via Global Sparsity Distribution Optimization
Guanchen Li, Yixing Xu, Zeping Li et al.
HumanoidGen: Data Generation for Bimanual Dexterous Manipulation via LLM Reasoning
Zhi Jing, Siyuan Yang, Jicong Ao et al.
Noise Injection Reveals Hidden Capabilities of Sandbagging Language Models
Cameron Tice, Philipp Kreer, Nathan Helm-Burger et al.
DyMU: Dynamic Merging and Virtual Unmerging for Efficient Variable-Length VLMs
Zhenhailong Wang, Senthil Purushwalkam, Caiming Xiong et al.
Space Group Equivariant Crystal Diffusion
Rees Chang, Angela Pak, Alex Guerra et al.
Seg4Diff: Unveiling Open-Vocabulary Semantic Segmentation in Text-to-Image Diffusion Transformers
Chaehyun Kim, Heeseong Shin, Eunbeen Hong et al.
CAM: A Constructivist View of Agentic Memory for LLM-Based Reading Comprehension
Rui Li, Zeyu Zhang, Xiaohe Bo et al.
Benign Overfitting in Single-Head Attention
Roey Magen, Shuning Shang, Zhiwei Xu et al.
A Smooth Sea Never Made a Skilled SAILOR: Robust Imitation via Learning to Search
Arnav Kumar Jain, Vibhakar Mohta, Subin Kim et al.
Fused View-Time Attention and Feedforward Reconstruction for 4D Scene Generation
Chaoyang Wang, Ashkan Mirzaei, Vidit Goel et al.
1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering
Yuheng Yuan, Qiuhong Shen, Xingyi Yang et al.
Enhancing Text-to-Image Diffusion Transformer via Split-Text Conditioning
Yu Zhang, Jialei Zhou, Xinchen Li et al.
Truth over Tricks: Measuring and Mitigating Shortcut Learning in Misinformation Detection
Herun Wan, Jiaying Wu, Minnan Luo et al.
DexGarmentLab: Dexterous Garment Manipulation Environment with Generalizable Policy
Yuran Wang, Ruihai Wu, Yue Chen et al.
Lifelong Safety Alignment for Language Models
Haoyu Wang, Yifei Zhao, Zeyu Qin et al.
Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization
Mingzhe Du, Anh Tuan Luu, Yue Liu et al.
Manipulating Feature Visualizations with Gradient Slingshots
Dilyara Bareeva, Marina Höhne, Alexander Warnecke et al.
Token Perturbation Guidance for Diffusion Models
Javad Rajabi, Soroush Mehraban, Seyedmorteza Sadat et al.
Cross-Domain Graph Data Scaling: A Showcase with Diffusion Models
Wenzhuo Tang, Haitao Mao, Danial Dervovic et al.
Rendering-Aware Reinforcement Learning for Vector Graphics Generation
Juan Rodriguez, Haotian Zhang, Abhay Puri et al.
Integral Imprecise Probability Metrics
Siu Lun (Alan) Chau, Michele Caprio, Krikamol Muandet
PatientSim: A Persona-Driven Simulator for Realistic Doctor-Patient Interactions
Daeun Kyung, Hyunseung Chung, Seongsu Bae et al.
MindGYM: What Matters in Question Synthesis for Thinking-Centric Fine-Tuning?
Zhe Xu, Daoyuan Chen, Zhenqing Ling et al.
BEDLAM2.0: Synthetic humans and cameras in motion
Joachim Tesch, Giorgio Becherini, Prerana Achar et al.
Seeking and Updating with Live Visual Knowledge
Mingyang Fu, Yuyang Peng, Dongping Chen et al.
Logic.py: Bridging the Gap between LLMs and Constraint Solvers
Pascal Kesseli, Peter O'Hearn, Ricardo Cabral
PanoWan: Lifting Diffusion Video Generation Models to 360$^\circ$ with Latitude/Longitude-aware Mechanisms
Yifei Xia, Shuchen Weng, Siqi Yang et al.
ChartSketcher: Reasoning with Multimodal Feedback and Reflection for Chart Understanding
Muye Huang, Lingling Zhang, Jie Ma et al.
MAGNET: A Multi-agent Framework for Finding Audio-Visual Needles by Reasoning over Multi-Video Haystacks
Sanjoy Chowdhury, Mohamed Elmoghany, Yohan Abeysinghe et al.
ConStellaration: A dataset of QI-like stellarator plasma boundaries and optimization benchmarks
Santiago Cadena, Andrea Merlo, Emanuel Laude et al.
Exact Expressive Power of Transformers with Padding
Will Merrill, Ashish Sabharwal
Shape it Up! Restoring LLM Safety during Finetuning
ShengYun Peng, Pin-Yu Chen, Jianfeng Chi et al.
Time-o1: Time-Series Forecasting Needs Transformed Label Alignment
Hao Wang, Licheng Pan, Zhichao Chen et al.
Spiral: Semantic-Aware Progressive LiDAR Scene Generation and Understanding
Dekai Zhu, Yixuan Hu, Youquan Liu et al.
CAT: Content-Adaptive Image Tokenization
Junhong Shen, Kushal Tirumala, Michihiro Yasunaga et al.
DrVD-Bench: Do Vision-Language Models Reason Like Human Doctors in Medical Image Diagnosis?
Tianhong Zhou, xu yin, Yingtao Zhu et al.
SQLens: An End-to-End Framework for Error Detection and Correction in Text-to-SQL
Yue Gong, Chuan Lei, Xiao Qin et al.
Dense SAE Latents Are Features, Not Bugs
Xiaoqing Sun, Alessandro Stolfo, Joshua Engels et al.
Multi-head Transformers Provably Learn Symbolic Multi-step Reasoning via Gradient Descent
Tong Yang, Yu Huang, Yingbin Liang et al.
Poison as Cure: Visual Noise for Mitigating Object Hallucinations in LVMs
Kejia Zhang, Keda TAO, Jiasheng Tang et al.
Alligat0R: Pre-Training through Covisibility Segmentation for Relative Camera Pose Regression
Thibaut Loiseau, Guillaume Bourmaud, Vincent Lepetit
Efficient Prompt Compression with Evaluator Heads for Long-Context Transformer Inference
Weizhi Fei, Xueyan Niu, XIE GUOQING et al.
Latent Policy Barrier: Learning Robust Visuomotor Policies by Staying In-Distribution
Zhanyi Sun, Shuran Song
Scaling Speculative Decoding with Lookahead Reasoning
Yichao Fu, Rui Ge, Zelei Shao et al.
True Zero-Shot Inference of Dynamical Systems Preserving Long-Term Statistics
Christoph Jürgen Hemmer, Daniel Durstewitz
Scaling Offline RL via Efficient and Expressive Shortcut Models
Nicolas Espinosa-Dice, Yiyi Zhang, Yiding Chen et al.
MedMax: Mixed-Modal Instruction Tuning for Training Biomedical Assistants
Hritik Bansal, Daniel Israel, Siyan Zhao et al.
We Should Chart an Atlas of All the World's Models
Eliahu Horwitz, Nitzan Kurer, Jonathan Kahana et al.
What Matters in Data for DPO?
Yu Pan, Zhongze Cai, Huaiyang Zhong et al.
Scaling Physical Reasoning with the PHYSICS Dataset
Shenghe Zheng, Qianjia Cheng, Junchi Yao et al.
ResearchCodeBench: Benchmarking LLMs on Implementing Novel Machine Learning Research Code
Tianyu Hua, Harper Hua, Violet Xiang et al.
Auditing Meta-Cognitive Hallucinations in Reasoning Large Language Models
Haolang Lu, Yilian Liu, Jingxin Xu et al.
nvBench 2.0: Resolving Ambiguity in Text-to-Visualization through Stepwise Reasoning
Tianqi Luo, Chuhan Huang, Leixian Shen et al.
Learning quadratic neural networks in high dimensions: SGD dynamics and scaling laws
Gerard Ben Arous, Murat Erdogdu, Nuri Mert Vural et al.
Multiplayer Federated Learning: Reaching Equilibrium with Less Communication
TaeHo Yoon, Sayantan Choudhury, Nicolas Loizou
Centralized Reward Agent for Knowledge Sharing and Transfer in Multi-Task Reinforcement Learning
Haozhe Ma, Zhengding Luo, Thanh Vinh Vo et al.
AOR: Anatomical Ontology-Guided Reasoning for Medical Large Multimodal Model in Chest X-Ray Interpretation
Qingqiu Li, Zihang Cui, Seongsu Bae et al.
The emergence of sparse attention: impact of data distribution and benefits of repetition
Nicolas Zucchet, Francesco D'Angelo, Andrew Lampinen et al.
In-Context Learning of Stochastic Differential Equations with Foundation Inference Models
Patrick Seifner, Kostadin Cvejoski, David Berghaus et al.
PhyBlock: A Progressive Benchmark for Physical Understanding and Planning via 3D Block Assembly
Liang Ma, Jiajun Wen, Min Lin et al.
Hybrid Latent Reasoning via Reinforcement Learning
Zhenrui Yue, Bowen Jin, Huimin Zeng et al.
On Union-Closedness of Language Generation
Steve Hanneke, Amin Karbasi, Anay Mehrotra et al.
Thousand Voices of Trauma: A Large-Scale Synthetic Dataset for Modeling Prolonged Exposure Therapy Conversations
Suhas BN, Andrew Sherrill, Rosa I. Arriaga et al.
SymMaP: Improving Computational Efficiency in Linear Solvers through Symbolic Preconditioning
Hong Wang, Jie Wang, Minghao Ma et al.
MoRE-Brain: Routed Mixture of Experts for Interpretable and Generalizable Cross-Subject fMRI Visual Decoding
YUXIANG WEI, Yanteng Zhang, Xi Xiao et al.
SwitchLingua: The First Large-Scale Multilingual and Multi-Ethnic Code-Switching Dataset
Peng Xie, Xingyuan Liu, Yequan Bie et al.
Robust Federated Finetuning of LLMs via Alternating Optimization of LoRA
Shuangyi Chen, Yuanxin Guo, Yue Ju et al.
EgoExOR: An Ego-Exo-Centric Operating Room Dataset for Surgical Activity Understanding
Ege Özsoy, Arda Mamur, Felix Tristram et al.
Time-IMM: A Dataset and Benchmark for Irregular Multimodal Multivariate Time Series
Ching Chang, Jeehyun Hwang, Yidan Shi et al.
Understanding LLM Behaviors via Compression: Data Generation, Knowledge Acquisition and Scaling Laws
Zhixuan Pan, Shaowen Wang, Liao Pengfei et al.
Curly Flow Matching for Learning Non-gradient Field Dynamics
Katarina Petrović, Lazar Atanackovic, Viggo Moro et al.
InfinityStar: Unified Spacetime AutoRegressive Modeling for Visual Generation
Jinlai Liu, Jian Han, Bin Yan et al.
Uncertainty Quantification with the Empirical Neural Tangent Kernel
Joseph Wilson, Chris van der Heide, Liam Hodgkinson et al.
The Fluorescent Veil: A Stealthy and Effective Physical Adversarial Patch Against Traffic Sign Recognition
Shuai Yuan, Xingshuo Han, Hongwei Li et al.
Language Models Can Predict Their Own Behavior
Dhananjay Ashok, Jonathan May
A Unified Solution to Video Fusion: From Multi-Frame Learning to Benchmarking
Zixiang Zhao, Haowen Bai, Bingxin Ke et al.
LuxDiT: Lighting Estimation with Video Diffusion Transformer
Ruofan Liang, Kai He, Zan Gojcic et al.
EgoBridge: Domain Adaptation for Generalizable Imitation from Egocentric Human Data
Ryan Punamiya, Dhruv Patel, Patcharapong Aphiwetsa et al.
Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation
Zihan Wang, Seungjun Lee, Gim Hee Lee
LASeR: Learning to Adaptively Select Reward Models with Multi-Arm Bandits
Duy Nguyen, Archiki Prasad, Elias Stengel-Eskin et al.
Geo-Sign: Hyperbolic Contrastive Regularisation for Geometrically Aware Sign Language Translation
Edward Fish, Richard Bowden
From Bytes to Ideas: Language Modeling with Autoregressive U-Nets
Mathurin VIDEAU, Badr Youbi Idrissi, Alessandro Leite et al.
SAVVY: Spatial Awareness via Audio-Visual LLMs through Seeing and Hearing
Mingfei Chen, Zijun Cui, Xiulong Liu et al.
Distillation Robustifies Unlearning
Bruce W, Lee, Addie Foote, Alex Infanger et al.
Efficient Quadratic Corrections for Frank-Wolfe Algorithms
Jannis Halbey, Seta Rakotomandimby, Mathieu Besançon et al.
On the Value of Cross-Modal Misalignment in Multimodal Representation Learning
Yichao Cai, Yuhang Liu, Erdun Gao et al.
MEGADance: Mixture-of-Experts Architecture for Genre-Aware 3D Dance Generation
kaixing yang, Xulong Tang, Ziqiao Peng et al.
Stochastic Process Learning via Operator Flow Matching
Yaozhong Shi, Zachary Ross, Domniki Asimaki et al.
SoMi-ToM: Evaluating Multi-Perspective Theory of Mind in Embodied Social Interactions
Xianzhe Fan, Xuhui Zhou, Chuanyang Jin et al.
MetaBox-v2: A Unified Benchmark Platform for Meta-Black-Box Optimization
Zeyuan Ma, Yue-Jiao Gong, Hongshu Guo et al.
Thinking in Character: Advancing Role-Playing Agents with Role-Aware Reasoning
Yihong Tang, Kehai Chen, Muyun Yang et al.
Frame In-N-Out: Unbounded Controllable Image-to-Video Generation
Boyang Wang, Xuweiyi Chen, Matheus Gadelha et al.
Constrained Optimization From a Control Perspective via Feedback Linearization
Runyu Zhang, Arvind Raghunathan, Jeff Shamma et al.
Video Perception Models for 3D Scene Synthesis
Rui Huang, Guangyao Zhai, Zuria Bauer et al.
A solvable model of learning generative diffusion: theory and insights
Hugo Cui, Cengiz Pehlevan, Yue Lu
Why Do Some Language Models Fake Alignment While Others Don't?
Abhay Sheshadri, John Hughes, Julian Michael et al.
Co-Reinforcement Learning for Unified Multimodal Understanding and Generation
Jingjing Jiang, Chongjie Si, Jun Luo et al.
Atomic Thinking of LLMs: Decoupling and Exploring Mathematical Reasoning Abilities
Jiayi Kuang, Haojing Huang, Yinghui Li et al.
Online Estimation and Inference for Robust Policy Evaluation in Reinforcement Learning
Weidong Liu, Jiyuan Tu, Xi Chen et al.
DeblurDiff: Real-Word Image Deblurring with Generative Diffusion Models
Lingshun Kong, Jiawei Zhang, Dongqing Zou et al.
MINGLE: Mixture of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging
Zihuan Qiu, Yi Xu, Chiyuan He et al.
Learning Diffusion Models with Flexible Representation Guidance
Chenyu Wang, Cai Zhou, Sharut Gupta et al.
BiggerGait: Unlocking Gait Recognition with Layer-wise Representations from Large Vision Models
Dingqiang Ye, Chao Fan, Zhanbo Huang et al.
Estimating Model Performance Under Covariate Shift Without Labels
Jakub Białek, Juhani Kivimäki, Wojciech Kuberski et al.
Improved Balanced Classification with Theoretically Grounded Loss Functions
Corinna Cortes, Mehryar Mohri, Yutao Zhong
When Are Concepts Erased From Diffusion Models?
Kevin Lu, Nicky Kriplani, Rohit Gandikota et al.
Critical Batch Size Revisited: A Simple Empirical Approach to Large-Batch Language Model Training
Will Merrill, Shane Arora, Dirk Groeneveld et al.
On scalable and efficient training of diffusion samplers
Minkyu Kim, Kiyoung Seong, Dongyeop Woo et al.
Predicting Empirical AI Research Outcomes with Language Models
Jiaxin Wen, Chenglei Si, Yueh-Han Chen et al.
Compliant Residual DAgger: Improving Real-World Contact-Rich Manipulation with Human Corrections
Xiaomeng Xu, Yifan Hou, Zeyi Liu et al.
MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds
Bingquan Dai, Luo Li, Qihong Tang et al.
Learning Orthogonal Multi-Index Models: A Fine-Grained Information Exponent Analysis
Yunwei Ren, Jason Lee
Revisiting End-to-End Learning with Slide-level Supervision in Computational Pathology
Wenhao Tang, Rong Qin, Heng Fang et al.
Chiron-o1: Igniting Multimodal Large Language Models towards Generalizable Medical Reasoning via Mentor-Intern Collaborative Search
Haoran Sun, Yankai Jiang, Wenjie Lou et al.
Multi-Modal View Enhanced Large Vision Models for Long-Term Time Series Forecasting
ChengAo Shen, Wenchao Yu, Ziming Zhao et al.
Purifying Shampoo: Investigating Shampoo's Heuristics by Decomposing its Preconditioner
Runa Eschenhagen, Aaron Defazio, Tsung-Hsien Lee et al.
VeriThoughts: Enabling Automated Verilog Code Generation using Reasoning and Formal Verification
Patrick Yubeaton, Andre Nakkab, Weihua Xiao et al.
FlexEvent: Towards Flexible Event-Frame Object Detection at Varying Operational Frequencies
Dongyue Lu, Lingdong Kong, Gim Hee Lee et al.
Thinker: Learning to Think Fast and Slow
Stephen Chung, Wenyu Du, Jie Fu