Most Cited ICLR "compositional text-to-image generation" Papers
6,124 papers found • Page 4 of 31
Preserving Diversity in Supervised Fine-Tuning of Large Language Models
Ziniu Li, Congliang Chen, Tian Xu et al.
LaMP: Language-Motion Pretraining for Motion Generation, Retrieval, and Captioning
Zhe Li, Weihao Yuan, Yisheng He et al.
CogCoM: A Visual Language Model with Chain-of-Manipulations Reasoning
Ji Qi, Ming Ding, Weihan Wang et al.
LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection
Sifan Zhou, Liang Li, Xinyu Zhang et al.
It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition
Chen Chen, Ruizhe Li, Yuchen Hu et al.
Jointly Training Large Autoregressive Multimodal Models
Emanuele Aiello, Lili Yu, Yixin Nie et al.
SCBench: A KV Cache-Centric Analysis of Long-Context Methods
Yucheng Li, Huiqiang Jiang, Qianhui Wu et al.
Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation
Yiming Wang, Pei Zhang, Baosong Yang et al.
Can LLMs Understand Time Series Anomalies?
Zihao Zhou, Rose Yu
Random Feature Amplification: Feature Learning and Generalization in Neural Networks
Spencer Frei, Niladri Chatterji, Peter L. Bartlett
AI as Humanity’s Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text
Ximing Lu, Melanie Sclar, Skyler Hallinan et al.
CPPO: Continual Learning for Reinforcement Learning with Human Feedback
Han Zhang, Yu Lei, Lin Gui et al.
Emergence of a High-Dimensional Abstraction Phase in Language Transformers
Emily Cheng, Diego Doimo, Corentin Kervadec et al.
SpikePoint: An Efficient Point-based Spiking Neural Network for Event Cameras Action Recognition
Hongwei Ren, Yue Zhou, Xiaopeng Lin et al.
Training Unbiased Diffusion Models From Biased Dataset
Yeongmin Kim, Byeonghu Na, Minsang Park et al.
Fair and Efficient Contribution Valuation for Vertical Federated Learning
Zhenan Fan, Huang Fang, Xinglu Wang et al.
A Closer Look at Machine Unlearning for Large Language Models
Xiaojian Yuan, Tianyu Pang, Chao Du et al.
Finite-Time Analysis of On-Policy Heterogeneous Federated Reinforcement Learning
Chenyu Zhang, Han Wang, Aritra Mitra et al.
Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step
Mingyuan Zhou, Huangjie Zheng, Yi Gu et al.
VideoGrain: Modulating Space-Time Attention for Multi-Grained Video Editing
Xiangpeng Yang, Linchao Zhu, Hehe Fan et al.
REEF: Representation Encoding Fingerprints for Large Language Models
Jie Zhang, Dongrui Liu, Chen Qian et al.
RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards
Xinze Li, Sen Mei, Zhenghao Liu et al.
CityGaussianV2: Efficient and Geometrically Accurate Reconstruction for Large-Scale Scenes
Yang Liu, Chuanchen Luo, Zhongkai Mao et al.
ForecastBench: A Dynamic Benchmark of AI Forecasting Capabilities
Ezra Karger, Houtan Bastani, Chen Yueh-Han et al.
Spatio-Temporal Few-Shot Learning via Diffusive Neural Network Generation
Yuan Yuan, Chenyang Shao, Jingtao Ding et al.
GTA: A Geometry-Aware Attention Mechanism for Multi-View Transformers
Takeru Miyato, Bernhard Jaeger, Max Welling et al.
PivotMesh: Generic 3D Mesh Generation via Pivot Vertices Guidance
Haohan Weng, Yikai Wang, Tong Zhang et al.
Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning
Yuheng Zhang, Dian Yu, Baolin Peng et al.
STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
Peijie Dong, Lujun Li, Yuedong Zhong et al.
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
Peng Jin, Bo Zhu, Yuan Li et al.
System 1.x: Learning to Balance Fast and Slow Planning with Language Models
Swarnadeep Saha, Archiki Prasad, Justin Chen et al.
Number Cookbook: Number Understanding of Language Models and How to Improve It
Haotong Yang, Yi Hu, Shijia Kang et al.
A New Perspective on Shampoo's Preconditioner
Depen Morwani, Itai Shapira, Nikhil Vyas et al.
Image Inpainting via Tractable Steering of Diffusion Models
Anji Liu, Mathias Niepert, Guy Van den Broeck
From Tokens to Words: On the Inner Lexicon of LLMs
Guy Kaplan, Matanel Oren, Yuval Reif et al.
What Makes Large Language Models Reason in (Multi-Turn) Code Generation?
Kunhao Zheng, Juliette Decugis, Jonas Gehring et al.
Unveiling the Secret Recipe: A Guide For Supervised Fine-Tuning Small LLMs
Aldo Pareja, Nikhil Shivakumar Nayak, Hao Wang et al.
Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark
Tsung-Han Wu, Giscard Biamby, Jerome Quenum et al.
One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt
Tao Liu, Kai Wang, Senmao Li et al.
Ambient Diffusion Posterior Sampling: Solving Inverse Problems with Diffusion Models Trained on Corrupted Data
Asad Aali, Giannis Daras, Brett Levac et al.
NECO: NEural Collapse Based Out-of-distribution detection
Mouïn Ben Ammar, Nacim Belkhir, Sebastian Popescu et al.
McEval: Massively Multilingual Code Evaluation
Linzheng Chai, Shukai Liu, Jian Yang et al.
Motion-Agent: A Conversational Framework for Human Motion Generation with LLMs
Qi Wu, Yubo Zhao, Yifan Wang et al.
Can Knowledge Editing Really Correct Hallucinations?
Baixiang Huang, Canyu Chen, Xiongxiao Xu et al.
Self-Boosting Large Language Models with Synthetic Preference Data
Qingxiu Dong, Li Dong, Xingxing Zhang et al.
From Isolated Conversations to Hierarchical Schemas: Dynamic Tree Memory Representation for LLMs
Alireza Rezazadeh, Zichao Li, Wei Wei et al.
LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior
Hanyu Wang, Saksham Suri, Yixuan Ren et al.
Copula Conformal prediction for multi-step time series prediction
Sophia Sun, Rose Yu
Biased Temporal Convolution Graph Network for Time Series Forecasting with Missing Values
Xiaodan Chen, Xiucheng Li, Bo Liu et al.
InstantSwap: Fast Customized Concept Swapping across Sharp Shape Differences
Chenyang Zhu, Kai Li, Yue Ma et al.
Reducing Hallucinations in Large Vision-Language Models via Latent Space Steering
Sheng Liu, Haotian Ye, James Y Zou
Logical Languages Accepted by Transformer Encoders with Hard Attention
Pablo Barcelo, Alexander Kozachinskiy, Anthony W. Lin et al.
PolyVoice: Language Models for Speech to Speech Translation
Qianqian Dong, Zhiying Huang, Qiao Tian et al.
Understanding In-Context Learning from Repetitions
Jianhao Yan, Jin Xu, Chiyu Song et al.
MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models
Wenbo Hu, Jia-Chen Gu, Zi-Yi Dou et al.
Ghost on the Shell: An Expressive Representation of General 3D Shapes
Zhen Liu, Yao Feng, Yuliang Xiu et al.
Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation
Xiaojuan Wang, Boyang Zhou, Brian Curless et al.
Exploring Prosocial Irrationality for LLM Agents: A Social Cognition View
Xuan Liu, Jie Zhang, Haoyang Shang et al.
Longhorn: State Space Models are Amortized Online Learners
Bo Liu, Rui Wang, Lemeng Wu et al.
The OMG dataset: An Open MetaGenomic corpus for mixed-modality genomic language modeling
Andre Cornman, Jacob West-Roberts, Antonio Camargo et al.
STAMP: Scalable Task- And Model-agnostic Collaborative Perception
Xiangbo Gao, Runsheng Xu, Jiachen Li et al.
Gramian Multimodal Representation Learning and Alignment
Giordano Cicchetti, Eleonora Grassucci, Luigi Sigillo et al.
MoDeGPT: Modular Decomposition for Large Language Model Compression
Chi-Heng Lin, Shangqian Gao, James Smith et al.
Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse
Maojia Song, Shang Hong Sim, Rishabh Bhardwaj et al.
Second-Order Fine-Tuning without Pain for LLMs: A Hessian Informed Zeroth-Order Optimizer
Yanjun Zhao, Sizhe Dang, Haishan Ye et al.
OmniPhysGS: 3D Constitutive Gaussians for General Physics-Based Dynamics Generation
Yuchen Lin, Chenguo Lin, Jianjin Xu et al.
LANTERN: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding
Doohyuk Jang, Sihwan Park, June Yong Yang et al.
Parallelizing non-linear sequential models over the sequence length
Yi Heng Lim, Qi Zhu, Joshua Selfridge et al.
Unified Generative Modeling of 3D Molecules with Bayesian Flow Networks
Yuxuan Song, Jingjing Gong, Hao Zhou et al.
Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge
Haomiao Xiong, Zongxin Yang, Jiazuo Yu et al.
Spurious Forgetting in Continual Learning of Language Models
Junhao Zheng, Xidi Cai, Shengjie Qiu et al.
Diffusion-based Neural Network Weights Generation
Bedionita Soro, Bruno Andreis, Hayeon Lee et al.
Learning stochastic dynamics from snapshots through regularized unbalanced optimal transport
Zhenyi Zhang, Tiejun Li, Peijie Zhou
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models
Junyan Ye, Baichuan Zhou, Zilong Huang et al.
Analyzing Feed-Forward Blocks in Transformers through the Lens of Attention Maps
Goro Kobayashi, Tatsuki Kuribayashi, Sho Yokoi et al.
Image Watermarks are Removable using Controllable Regeneration from Clean Noise
Yepeng Liu, Yiren Song, Hai Ci et al.
Simple ReFlow: Improved Techniques for Fast Flow Models
Beomsu Kim, Yu-Guan Hsieh, Michal Klein et al.
DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors
Keon Lee, Dong Won Kim, Jaehyeon Kim et al.
Herald: A Natural Language Annotated Lean 4 Dataset
Guoxiong Gao, Yutong Wang, Jiedong Jiang et al.
DreamFlow: High-quality text-to-3D generation by Approximating Probability Flow
Kyungmin Lee, Kihyuk Sohn, Jinwoo Shin
DREAM: Dual Structured Exploration with Mixup for Open-set Graph Domain Adaption
Nan Yin, Mengzhu Wang et al.
Machine Unlearning Fails to Remove Data Poisoning Attacks
Martin Pawelczyk, Jimmy Di, Yiwei Lu et al.
Early Neuron Alignment in Two-layer ReLU Networks with Small Initialization
Hancheng Min, Enrique Mallada, Rene Vidal
GOFA: A Generative One-For-All Model for Joint Graph Language Modeling
Lecheng Kong, Jiarui Feng, Hao Liu et al.
Can Large Language Models Understand Symbolic Graphics Programs?
Zeju Qiu, Weiyang Liu, Haiwen Feng et al.
MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks
Jiacheng Chen, Tianhao Liang, Sherman Siu et al.
Context-Alignment: Activating and Enhancing LLMs Capabilities in Time Series
Yuxiao Hu, Qian Li, Dongxiao Zhang et al.
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing
Ziteng Wang, Jun Zhu, Jianfei Chen
Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling
Wenda Xu, Rujun Han, Zifeng Wang et al.
Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data
Zhiyuan Zhou, Andy Peng, Qiyang Li et al.
Masked Structural Growth for 2x Faster Language Model Pre-training
Yiqun Yao, Zheng Zhang, Jing Li et al.
Improving Uncertainty Estimation through Semantically Diverse Language Generation
Lukas Aichberger, Kajetan Schweighofer, Mykyta Ielanskyi et al.
Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models
Zhenyu Pan, Haozheng Luo, Manling Li et al.
ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation
Zongyi Li, Shujie Hu, Shujie Liu et al.
Energy-guided Entropic Neural Optimal Transport
Petr Mokrov, Alexander Korotin, Alexander Kolesov et al.
PersonalLLM: Tailoring LLMs to Individual Preferences
Thomas Zollo, Andrew Siah, Naimeng Ye et al.
Language-Image Models with 3D Understanding
Jang Hyun Cho, Boris Ivanovic, Yulong Cao et al.
MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code
Zimu Lu, Aojun Zhou, Ke Wang et al.
Backdoor Federated Learning by Poisoning Backdoor-Critical Layers
Haomin Zhuang, Mingxian Yu, Hao Wang et al.
Sparse autoencoders reveal selective remapping of visual concepts during adaptation
Hyesu Lim, Jinho Choi, Jaegul Choo et al.
Your Mixture-of-Experts LLM Is Secretly an Embedding Model for Free
Ziyue Li, Tianyi Zhou
Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization
Junkang Wu, Yuexiang Xie, Zhengyi Yang et al.
OptiBench Meets ReSocratic: Measure and Improve LLMs for Optimization Modeling
Zhicheng Yang, Yiwei Wang, Yinya Huang et al.
Style Outweighs Substance: Failure Modes of LLM Judges in Alignment Benchmarking
Benjamin Feuer, Micah Goldblum, Teresa Datta et al.
R&B: Region and Boundary Aware Zero-shot Grounded Text-to-image Generation
Jiayu Xiao, Henglei Lv et al.
CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery
Xiaoshuai Song, Muxi Diao, Guanting Dong et al.
Large-scale and Fine-grained Vision-language Pre-training for Enhanced CT Image Understanding
Zhongyi Shui, Jianpeng Zhang, Weiwei Cao et al.
GenSE: Generative Speech Enhancement via Language Models using Hierarchical Modeling
Jixun Yao, Hexin Liu, Chen Chen et al.
Policy Decorator: Model-Agnostic Online Refinement for Large Policy Model
Xiu Yuan, Tongzhou Mu, Stone Tao et al.
Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity
Eduard Gorbunov, Nazarii Tupitsa, Sayantan Choudhury et al.
DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life
Yu Ying Chiu, Liwei Jiang, Yejin Choi
CViT: Continuous Vision Transformer for Operator Learning
Sifan Wang, Jacob Seidman, Shyam Sankaran et al.
Predicting Emergent Abilities with Infinite Resolution Evaluation
Shengding Hu, Xin Liu, Xu Han et al.
Zigzag Diffusion Sampling: Diffusion Models Can Self-Improve via Self-Reflection
Lichen Bai, Shitong Shao, Zikai Zhou et al.
Improved baselines for vision-language pre-training
Jakob Verbeek, Enrico Fini, Michal Drozdzal et al.
CAT-3DGS: A Context-Adaptive Triplane Approach to Rate-Distortion-Optimized 3DGS Compression
Yu-Ting Zhan, Cheng-Yuan Ho, He-Bi Yang et al.
Self-Improvement for Neural Combinatorial Optimization: Sample Without Replacement, but Improvement
Dominik Grimm, Jonathan Pirnay
InverseBench: Benchmarking Plug-and-Play Diffusion Priors for Inverse Problems in Physical Sciences
Hongkai Zheng, Wenda Chu, Bingliang Zhang et al.
ShortcutsBench: A Large-Scale Real-world Benchmark for API-based Agents
Haiyang Shen, Yue Li, Desong Meng et al.
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions
Weifeng Lin, Xinyu Wei, Renrui Zhang et al.
InterMask: 3D Human Interaction Generation via Collaborative Masked Modeling
Muhammad Gohar Javed, Chuan Guo, Li Cheng et al.
Efficient and Scalable Graph Generation through Iterative Local Expansion
Andreas Bergmeister, Karolis Martinkus, Nathanaël Perraudin et al.
Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models
Cong Lu, Shengran Hu, Jeff Clune
Peering Through Preferences: Unraveling Feedback Acquisition for Aligning Large Language Models
Hritik Bansal, John Dang, Aditya Grover
Unmasking and Improving Data Credibility: A Study with Datasets for Training Harmless Language Models
Zhaowei Zhu, Jialu Wang, Hao Cheng et al.
Transformer-VQ: Linear-Time Transformers via Vector Quantization
Lucas D. Lingle
SD-LoRA: Scalable Decoupled Low-Rank Adaptation for Class Incremental Learning
Yichen Wu, Hongming Piao, Long-Kai Huang et al.
Fast Feedforward 3D Gaussian Splatting Compression
Yihang Chen, Qianyi Wu, Mengyao Li et al.
On Large Language Model Continual Unlearning
Chongyang Gao, Lixu Wang, Kaize Ding et al.
Zero Bubble (Almost) Pipeline Parallelism
Penghui Qi, Xinyi Wan, Guangxing Huang et al.
MoDGS: Dynamic Gaussian Splatting from Casually-captured Monocular Videos with Depth Priors
Qingming Liu, Yuan Liu, Jiepeng Wang et al.
Multimodal Patient Representation Learning with Missing Modalities and Labels
Zhenbang Wu, Anant Dadu, Nicholas Tustison et al.
Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond
Qizhou Wang, Jin Zhou, Zhanke Zhou et al.
Empirical Analysis of Model Selection for Heterogeneous Causal Effect Estimation
Divyat Mahajan, Ioannis Mitliagkas, Brady Neal et al.
Beyond Autoregression: Fast LLMs via Self-Distillation Through Time
Justin Deschenaux, Caglar Gulcehre
Can LLMs Solve Longer Math Word Problems Better?
Xin Xu, Tong Xiao, Zitong Chao et al.
An Intelligent Agentic System for Complex Image Restoration Problems
Kaiwen Zhu, Jinjin Gu, Zhiyuan You et al.
Adversarial Search Engine Optimization for Large Language Models
Fredrik Nestaas, Edoardo Debenedetti, Florian Tramer
Scaling Laws for Associative Memories
Vivien Cabannes, Elvis Dohmatob, Alberto Bietti
Understanding Factual Recall in Transformers via Associative Memories
Eshaan Nichani, Jason Lee, Alberto Bietti
Moral Alignment for LLM Agents
Elizaveta Tennant, Stephen Hailes, Mirco Musolesi
VideoWebArena: Evaluating Long Context Multimodal Agents with Video Understanding Web Tasks
Lawrence Jang, Yinheng Li, Dan Zhao et al.
A Formal Framework for Understanding Length Generalization in Transformers
Xinting Huang, Andy Yang, Satwik Bhattamishra et al.
RLIF: Interactive Imitation Learning as Reinforcement Learning
Jianlan Luo, Perry Dong, Yuexiang Zhai et al.
Steering Large Language Models between Code Execution and Textual Reasoning
Yongchao Chen, Harsh Jhamtani, Srinagesh Sharma et al.
PerturboLLaVA: Reducing Multimodal Hallucinations with Perturbative Visual Training
Cong Chen, Mingyu Liu, Chenchen Jing et al.
Learning to design protein-protein interactions with enhanced generalization
Anton Bushuiev, Roman Bushuiev, Petr Kouba et al.
Divide and Translate: Compositional First-Order Logic Translation and Verification for Complex Logical Reasoning
Hyun Ryu, Gyeongman Kim, Hyemin S. Lee et al.
TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio Motion Embedding and Diffusion Interpolation
Haiyang Liu, Xingchao Yang, Tomoya Akiyama et al.
GoodDrag: Towards Good Practices for Drag Editing with Diffusion Models
Zewei Zhang, Huan Liu, Jun Chen et al.
T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching
Zizheng Pan, Bohan Zhuang, De-An Huang et al.
On Error Propagation of Diffusion Models
Yangming Li, Mihaela van der Schaar
Entity-Centric Reinforcement Learning for Object Manipulation from Pixels
Dan Haramati, Tal Daniel, Aviv Tamar
Revisiting Plasticity in Visual Reinforcement Learning: Data, Modules and Training Stages
Guozheng Ma, Lu Li, Sen Zhang et al.
STORM: Spatio-TempOral Reconstruction Model For Large-Scale Outdoor Scenes
Jiawei Yang, Jiahui Huang, Boris Ivanovic et al.
Large Language Models Meet Symbolic Provers for Logical Reasoning Evaluation
Chengwen Qi, Ren Ma, Bowen Li et al.
Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold
Lazar Atanackovic, Xi Zhang, Brandon Amos et al.
Inverse Constitutional AI: Compressing Preferences into Principles
Arduin Findeis, Timo Kaufmann, Eyke Hüllermeier et al.
Instructive Decoding: Instruction-Tuned Large Language Models are Self-Refiner from Noisy Instructions
Taehyeon Kim, Joonkee Kim, Gihun Lee et al.
ADBM: Adversarial Diffusion Bridge Model for Reliable Adversarial Purification
Xiao Li, Wenxuan Sun, Huanran Chen et al.
Min-K%++: Improved Baseline for Pre-Training Data Detection from Large Language Models
Jingyang Zhang, Jingwei Sun, Eric Yeats et al.
Context-Aware Meta-Learning
Christopher Fifty, Dennis Duan, Ronald Junkins et al.
Dream to Manipulate: Compositional World Models Empowering Robot Imitation Learning with Imagination
Leonardo Barcellona, Andrii Zadaianchuk, Davide Allegro et al.
Faster Diffusion Sampling with Randomized Midpoints: Sequential and Parallel
Shivam Gupta, Linda Cai, Sitan Chen
Quasi-Monte Carlo for 3D Sliced Wasserstein
Khai Nguyen, Nicola Bariletto, Nhat Ho
RouteLLM: Learning to Route LLMs from Preference Data
Isaac Ong, Amjad Almahairi, Vincent Wu et al.
FLIP: Flow-Centric Generative Planning as General-Purpose Manipulation World Model
Chongkai Gao, Haozhuo Zhang, Zhixuan Xu et al.
Answer, Assemble, Ace: Understanding How LMs Answer Multiple Choice Questions
Sarah Wiegreffe, Oyvind Tafjord, Yonatan Belinkov et al.
Probabilistically Rewired Message-Passing Neural Networks
Chendi Qian, Andrei Manolache, Kareem Ahmed et al.
Specialized Foundation Models Struggle to Beat Supervised Baselines
Zongzhe Xu, Ritvik Gupta, Wenduo Cheng et al.
SPA: 3D Spatial-Awareness Enables Effective Embodied Representation
Haoyi Zhu, Honghui Yang, Yating Wang et al.
Surgical, Cheap, and Flexible: Mitigating False Refusal in Language Models via Single Vector Ablation
Xinpeng Wang, Chengzhi Hu, Paul Röttger et al.
Generating CAD Code with Vision-Language Models for 3D Designs
Kamel Alrashedy, Pradyumna Tambwekar, Zulfiqar Haider Zaidi et al.
What Makes a Good Diffusion Planner for Decision Making?
Haofei Lu, Dongqi Han, Yifei Shen et al.
Energy-Weighted Flow Matching for Offline Reinforcement Learning
Shiyuan Zhang, Weitong Zhang, Quanquan Gu
The Superposition of Diffusion Models Using the Itô Density Estimator
Marta Skreta, Lazar Atanackovic, Joey Bose et al.
Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models
Laura Ruis, Maximilian Mozes, Juhan Bae et al.
Whole-Song Hierarchical Generation of Symbolic Music Using Cascaded Diffusion Models
Ziyu Wang, Lejun Min, Gus Xia
Asynchronous Federated Reinforcement Learning with Policy Gradient Updates: Algorithm Design and Convergence Analysis
Guangchen Lan, Dong-Jun Han, Abolfazl Hashemi et al.
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
Weihao Zeng, Yuzhen Huang, Lulu Zhao et al.
FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning
Chenhao Li, Elijah Stanger-Jones, Steve Heim et al.
GUI-World: A Video Benchmark and Dataset for Multimodal GUI-oriented Understanding
Dongping Chen, Yue Huang, Siyuan Wu et al.
LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging
Ke Wang, Nikos Dimitriadis, Alessandro Favero et al.
ValUES: A Framework for Systematic Validation of Uncertainty Estimation in Semantic Segmentation
Kim-Celine Kahl, Carsten Lüth, Maximilian Zenk et al.
Text-to-Image Rectified Flow as Plug-and-Play Priors
Xiaofeng Yang, Cheng Chen, Xulei Yang et al.
Reward Guided Latent Consistency Distillation
William Wang, Jiachen Li, Weixi Feng et al.
Fantastic Copyrighted Beasts and How (Not) to Generate Them
Luxi He, Yangsibo Huang, Weijia Shi et al.
HELMET: How to Evaluate Long-context Models Effectively and Thoroughly
Howard Yen, Tianyu Gao, Minmin Hou et al.
What's in a Prior? Learned Proximal Networks for Inverse Problems
Zhenghan Fang, Sam Buchanan, Jeremias Sulam
SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters
Teng Xiao, Yige Yuan, Zhengyu Chen et al.
Facilitating Multi-turn Function Calling for LLMs via Compositional Instruction Tuning
Mingyang Chen, Haoze Sun, Tianpeng Li et al.
NovelQA: Benchmarking Question Answering on Documents Exceeding 200K Tokens
Cunxiang Wang, Ruoxi Ning, Boqi Pan et al.
ICLR: In-Context Learning of Representations
Core Francisco Park, Andrew Lee, Ekdeep Singh Lubana et al.
Nonconvex Stochastic Optimization under Heavy-Tailed Noises: Optimal Convergence without Gradient Clipping
Zijian Liu, Zhengyuan Zhou
Some Fundamental Aspects about Lipschitz Continuity of Neural Networks
Grigory Khromov, Sidak Pal Singh
miniCTX: Neural Theorem Proving with (Long-)Contexts
Jiewen Hu, Thomas Zhu, Sean Welleck
The AdEMAMix Optimizer: Better, Faster, Older
Matteo Pagliardini, Pierre Ablin, David Grangier
L2MAC: Large Language Model Automatic Computer for Extensive Code Generation
Samuel Holt, Max Ruiz Luyten, Mihaela van der Schaar
Limits to scalable evaluation at the frontier: LLM as judge won’t beat twice the data
Florian Eddie Dorner, Vivian Nastl, Moritz Hardt
Bi-Factorial Preference Optimization: Balancing Safety-Helpfulness in Language Models
Wenxuan Zhang, Philip Torr, Mohamed Elhoseiny et al.