Most Cited ICLR "linear recurrent networks" Papers
6,124 papers found • Page 7 of 31
Jointly Training Large Autoregressive Multimodal Models
Emanuele Aiello, Lili Yu, Yixin Nie et al.
Beyond Accuracy: Evaluating Self-Consistency of Code Large Language Models with IdentityChain
Marcus J. Min, Yangruibo Ding, Luca Buratti et al.
O(d/T) Convergence Theory for Diffusion Probabilistic Models under Minimal Assumptions
Gen Li, Yuling Yan
Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training
Maximillian Chen, Ruoxi Sun, Tomas Pfister et al.
Computational Limits of Low-Rank Adaptation (LoRA) Fine-Tuning for Transformer Models
Jerry Yao-Chieh Hu, Maojiang Su, En-Jui Kuo et al.
SD-LoRA: Scalable Decoupled Low-Rank Adaptation for Class Incremental Learning
Yichen Wu, Hongming Piao, Long-Kai Huang et al.
Low Rank Matrix Completion via Robust Alternating Minimization in Nearly Linear Time
Yuzhou Gu, Zhao Song, Junze Yin et al.
Planning Anything with Rigor: General-Purpose Zero-Shot Planning with LLM-based Formalized Programming
Yilun Hao, Yang Zhang, Chuchu Fan
Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning
Yuheng Zhang, Dian Yu, Baolin Peng et al.
ForecastBench: A Dynamic Benchmark of AI Forecasting Capabilities
Ezra Karger, Houtan Bastani, Chen Yueh-Han et al.
Text4Seg: Reimagining Image Segmentation as Text Generation
Mengcheng Lan, Chaofeng Chen, Yue Zhou et al.
RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards
Xinze Li, Sen Mei, Zhenghao Liu et al.
Judge Decoding: Faster Speculative Sampling Requires Going Beyond Model Alignment
Gregor Bachmann, Sotiris Anagnostidis, Albert Pumarola et al.
Scalable Language Model with Generalized Continual Learning
Bohao Peng, Zhuotao Tian, Shu Liu et al.
Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning
Ruizhe Shi, Yuyao Liu, Yanjie Ze et al.
Early Stopping Against Label Noise Without Validation Data
Suqin Yuan, Lei Feng, Tongliang Liu
Let Models Speak Ciphers: Multiagent Debate through Embeddings
Chau Pham, Boyi Liu, Yingxiang Yang et al.
How Does Unlabeled Data Provably Help Out-of-Distribution Detection?
Xuefeng Du, Zhen Fang, Ilias Diakonikolas et al.
What to align in multimodal contrastive learning?
Benoit Dufumier, Javiera Castillo Navarro, Devis Tuia et al.
Neural Monge Map estimation and its applications
Shaojun Ma, Yongxin Chen, Hao-Min Zhou et al.
Revisiting text-to-image evaluation with Gecko: on metrics, prompts, and human rating
Olivia Wiles, Chuhan Zhang, Isabela Albuquerque et al.
On the Relation between Trainability and Dequantization of Variational Quantum Learning Models
Elies Gil-Fuster, Casper Gyurik, Adrian Perez-Salinas et al.
Random Feature Amplification: Feature Learning and Generalization in Neural Networks
Spencer Frei, Niladri Chatterji, Peter L. Bartlett
AI as Humanity’s Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text
Ximing Lu, Melanie Sclar, Skyler Hallinan et al.
Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo
Haque Ishfaq, Qingfeng Lan, Pan Xu et al.
LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection
Sifan Zhou, Liang Li, Xinyu Zhang et al.
Scaling Wearable Foundation Models
Girish Narayanswamy, Xin Liu, Kumar Ayush et al.
Poisoned Forgery Face: Towards Backdoor Attacks on Face Forgery Detection
Jiawei Liang, Siyuan Liang, Aishan Liu et al.
Interpreting the Second-Order Effects of Neurons in CLIP
Yossi Gandelsman, Alexei Efros, Jacob Steinhardt
gRNAde: Geometric Deep Learning for 3D RNA inverse design
Chaitanya Joshi, Arian Jamasb, Ramon Viñas et al.
Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling
Wenda Xu, Rujun Han, Zifeng Wang et al.
Gramian Multimodal Representation Learning and Alignment
Giordano Cicchetti, Eleonora Grassucci, Luigi Sigillo et al.
Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization
Junkang Wu, Yuexiang Xie, Zhengyi Yang et al.
The Hidden Language of Diffusion Models
Hila Chefer, Oran Lang, Mor Geva et al.
Training Unbiased Diffusion Models From Biased Dataset
Yeongmin Kim, Byeonghu Na, Minsang Park et al.
Logical Languages Accepted by Transformer Encoders with Hard Attention
Pablo Barcelo, Alexander Kozachinskiy, Anthony W. Lin et al.
CABINET: Content Relevance-based Noise Reduction for Table Question Answering
Sohan Patnaik, Heril Changwal, Milan Aggarwal et al.
Memorization Capacity of Multi-Head Attention in Transformers
Sadegh Mahdavi, Renjie Liao, Christos Thrampoulidis
Don't Play Favorites: Minority Guidance for Diffusion Models
Soobin Um, Suhyeon Lee, Jong Chul Ye
SNIP: Bridging Mathematical Symbolic and Numeric Realms with Unified Pre-training
Kazem Meidani, Parshin Shojaee, Chandan Reddy et al.
VideoGrain: Modulating Space-Time Attention for Multi-Grained Video Editing
Xiangpeng Yang, Linchao Zhu, Hehe Fan et al.
Hyper-Connections
Defa Zhu, Hongzhi Huang, Zihao Huang et al.
AgentRefine: Enhancing Agent Generalization through Refinement Tuning
Dayuan Fu, Keqing He, Yejie Wang et al.
It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition
Chen Chen, Ruizhe Li, Yuchen Hu et al.
MoDeGPT: Modular Decomposition for Large Language Model Compression
Chi-Heng Lin, Shangqian Gao, James Smith et al.
GTA: A Geometry-Aware Attention Mechanism for Multi-View Transformers
Takeru Miyato, Bernhard Jaeger, Max Welling et al.
Self-Boosting Large Language Models with Synthetic Preference Data
Qingxiu Dong, Li Dong, Xingxing Zhang et al.
Get more for less: Principled Data Selection for Warming Up Fine-Tuning in LLMs
Feiyang Kang, Hoang Anh Just, Yifan Sun et al.
CPPO: Continual Learning for Reinforcement Learning with Human Feedback
Han Zhang, Yu Lei, Lin Gui et al.
InverseBench: Benchmarking Plug-and-Play Diffusion Priors for Inverse Problems in Physical Sciences
Hongkai Zheng, Wenda Chu, Bingliang Zhang et al.
Walk the Talk? Measuring the Faithfulness of Large Language Model Explanations
Katie Matton, Robert Ness, John Guttag et al.
Context-Alignment: Activating and Enhancing LLMs Capabilities in Time Series
Yuxiao Hu, Qian Li, Dongxiao Zhang et al.
Harnessing Diversity for Important Data Selection in Pretraining Large Language Models
Chi Zhang, Huaping Zhong, Kuan Zhang et al.
OmniPhysGS: 3D Constitutive Gaussians for General Physics-Based Dynamics Generation
Yuchen Lin, Chenguo Lin, Jianjin Xu et al.
Interpretable Diffusion via Information Decomposition
Xianghao Kong, Ollie Liu, Han Li et al.
CAKE: Cascading and Adaptive KV Cache Eviction with Layer Preferences
Ziran Qin, Yuchen Cao, Mingbao Lin et al.
CityGaussianV2: Efficient and Geometrically Accurate Reconstruction for Large-Scale Scenes
Yang Liu, Chuanchen Luo, Zhongkai Mao et al.
ICLR: In-Context Learning of Representations
Core Francisco Park, Andrew Lee, Ekdeep Singh Lubana et al.
Number Cookbook: Number Understanding of Language Models and How to Improve It
Haotong Yang, Yi Hu, Shijia Kang et al.
Enhancing One-Shot Federated Learning Through Data and Ensemble Co-Boosting
Rong Dai, Yonggang Zhang, Ang Li et al.
CausalLM is not optimal for in-context learning
Nan Ding, Tomer Levinboim, Jialin Wu et al.
Policy Decorator: Model-Agnostic Online Refinement for Large Policy Model
Xiu Yuan, Tongzhou Mu, Stone Tao et al.
TIS-DPO: Token-level Importance Sampling for Direct Preference Optimization With Estimated Weights
Aiwei Liu, Haoping Bai, Zhiyun Lu et al.
Beyond Linear Approximations: A Novel Pruning Approach for Attention Matrix
Yingyu Liang, Jiangxuan Long, Zhenmei Shi et al.
Sin3DM: Learning a Diffusion Model from a Single 3D Textured Shape
Rundi Wu, Ruoshi Liu, Carl Vondrick et al.
Ambient Diffusion Posterior Sampling: Solving Inverse Problems with Diffusion Models Trained on Corrupted Data
Asad Aali, Giannis Daras, Brett Levac et al.
STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
Peijie Dong, Lujun Li, Yuedong Zhong et al.
PeFLL: Personalized Federated Learning by Learning to Learn
Jonathan Scott, Hossein Zakerinia, Christoph Lampert
Exploring Diffusion Time-steps for Unsupervised Representation Learning
Zhongqi Yue, Jiankun Wang et al.
Language Model Alignment in Multilingual Trolley Problems
Zhijing Jin, Max Kleiman-Weiner, Giorgio Piatti et al.
LILO: Learning Interpretable Libraries by Compressing and Documenting Code
Gabriel Grand, Lio Wong, Maddy Bowers et al.
Exploring Prosocial Irrationality for LLM Agents: A Social Cognition View
Xuan Liu, Jie Zhang, Haoyang Shang et al.
MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models
Peng Xia, Siwei Han, Shi Qiu et al.
From Tokens to Words: On the Inner Lexicon of LLMs
Guy Kaplan, Matanel Oren, Yuval Reif et al.
Initializing Models with Larger Ones
Zhiqiu Xu, Yanjie Chen, Kirill Vishniakov et al.
Image Inpainting via Tractable Steering of Diffusion Models
Anji Liu, Mathias Niepert, Guy Van den Broeck
InstructCV: Instruction-Tuned Text-to-Image Diffusion Models as Vision Generalists
Yulu Gan, Sung Woo Park, Alexander Schubert et al.
Spatio-Temporal Few-Shot Learning via Diffusive Neural Network Generation
Yuan Yuan, Chenyang Shao, Jingtao Ding et al.
SEGNO: Generalizing Equivariant Graph Neural Networks with Physical Inductive Biases
Yang Liu, Jiashun Cheng, Haihong Zhao et al.
GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS
Saman Kazemkhani, Aarav Pandya, Daphne Cornelisse et al.
Forgetting Transformer: Softmax Attention with a Forget Gate
Zhixuan Lin, Evgenii Nikishin, Xu He et al.
Herald: A Natural Language Annotated Lean 4 Dataset
Guoxiong Gao, Yutong Wang, Jiedong Jiang et al.
TempMe: Video Temporal Token Merging for Efficient Text-Video Retrieval
Leqi Shen, Tianxiang Hao, Tao He et al.
Domain-Agnostic Molecular Generation with Chemical Feedback
Yin Fang, Ningyu Zhang, Zhuo Chen et al.
CLIP the Bias: How Useful is Balancing Data in Multimodal Learning?
Ibrahim Alabdulmohsin, Xiao Wang, Andreas Steiner et al.
CViT: Continuous Vision Transformer for Operator Learning
Sifan Wang, Jacob Seidman, Shyam Sankaran et al.
CBQ: Cross-Block Quantization for Large Language Models
Xin Ding, Xiaoyu Liu, Zhijun Tu et al.
STAMP: Scalable Task- And Model-agnostic Collaborative Perception
Xiangbo Gao, Runsheng Xu, Jiachen Li et al.
CivRealm: A Learning and Reasoning Odyssey in Civilization for Decision-Making Agents
Siyuan Qi, Shuo Chen, Yexin Li et al.
FairerCLIP: Debiasing CLIP's Zero-Shot Predictions using Functions in RKHSs
Sepehr Dehdashtian, Lan Wang, Vishnu Boddeti
System 1.x: Learning to Balance Fast and Slow Planning with Language Models
Swarnadeep Saha, Archiki Prasad, Justin Chen et al.
SparseDFF: Sparse-View Feature Distillation for One-Shot Dexterous Manipulation
Qianxu Wang, Haotong Zhang, Congyue Deng et al.
Sparse autoencoders reveal selective remapping of visual concepts during adaptation
Hyesu Lim, Jinho Choi, Jaegul Choo et al.
Longhorn: State Space Models are Amortized Online Learners
Bo Liu, Rui Wang, Lemeng Wu et al.
GeneOH Diffusion: Towards Generalizable Hand-Object Interaction Denoising via Denoising Diffusion
Xueyi Liu, Li Yi
Transformer Fusion with Optimal Transport
Moritz Imfeld, Jacopo Graldi, Marco Giordano et al.
PivotMesh: Generic 3D Mesh Generation via Pivot Vertices Guidance
Haohan Weng, Yikai Wang, Tong Zhang et al.
OptiBench Meets ReSocratic: Measure and Improve LLMs for Optimization Modeling
Zhicheng Yang, Yiwei Wang, Yinya Huang et al.
LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior
Hanyu Wang, Saksham Suri, Yixuan Ren et al.
NECO: NEural Collapse Based Out-of-distribution detection
Mouïn Ben Ammar, Nacim Belkhir, Sebastian Popescu et al.
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing
Ziteng Wang, Jun Zhu, Jianfei Chen
Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity
Eduard Gorbunov, Nazarii Tupitsa, Sayantan Choudhury et al.
McEval: Massively Multilingual Code Evaluation
Linzheng Chai, Shukai Liu, Jian Yang et al.
Finite-Time Analysis of On-Policy Heterogeneous Federated Reinforcement Learning
Chenyu Zhang, Han Wang, Aritra Mitra et al.
Fair and Efficient Contribution Valuation for Vertical Federated Learning
Zhenan Fan, Huang Fang, Xinglu Wang et al.
From Isolated Conversations to Hierarchical Schemas: Dynamic Tree Memory Representation for LLMs
Alireza Rezazadeh, Zichao Li, Wei Wei et al.
REEF: Representation Encoding Fingerprints for Large Language Models
Jie Zhang, Dongrui Liu, Chen Qian et al.
Spurious Forgetting in Continual Learning of Language Models
Junhao Zheng, Xidi Cai, Shengjie Qiu et al.
Transformer-VQ: Linear-Time Transformers via Vector Quantization
Lucas D. Lingle
GOFA: A Generative One-For-All Model for Joint Graph Language Modeling
Lecheng Kong, Jiarui Feng, Hao Liu et al.
Closing the Curious Case of Neural Text Degeneration
Matthew Finlayson, John Hewitt, Alexander Koller et al.
Feature emergence via margin maximization: case studies in algebraic tasks
Depen Morwani, Benjamin Edelman, Costin-Andrei Oncescu et al.
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models
Junyan Ye, Baichuan Zhou, Zilong Huang et al.
LANTERN: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding
Doohyuk Jang, Sihwan Park, June Yong Yang et al.
SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding
Sihang Li, Jin Huang, Jiaxi Zhuang et al.
Image Watermarks are Removable using Controllable Regeneration from Clean Noise
Yepeng Liu, Yiren Song, Hai Ci et al.
Motion-Agent: A Conversational Framework for Human Motion Generation with LLMs
Qi Wu, Yubo Zhao, Yifan Wang et al.
Long-Short-Range Message-Passing: A Physics-Informed Framework to Capture Non-Local Interaction for Scalable Molecular Dynamics Simulation
Yunyang Li, Yusong Wang, Lin Huang et al.
Second-Order Fine-Tuning without Pain for LLMs: A Hessian Informed Zeroth-Order Optimizer
Yanjun Zhao, Sizhe Dang, Haishan Ye et al.
Achieving Human Parity in Content-Grounded Datasets Generation
Asaf Yehudai, Boaz Carmeli, Yosi Mass et al.
Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation
Xiaojuan Wang, Boyang Zhou, Brian Curless et al.
Early Neuron Alignment in Two-layer ReLU Networks with Small Initialization
Hancheng Min, Enrique Mallada, Rene Vidal
MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks
Jiacheng Chen, Tianhao Liang, Sherman Siu et al.
Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark
Tsung-Han Wu, Giscard Biamby, Jerome Quenum et al.
DIFFTACTILE: A Physics-based Differentiable Tactile Simulator for Contact-rich Robotic Manipulation
Zilin Si, Gu Zhang, Qingwei Ben et al.
Backdoor Federated Learning by Poisoning Backdoor-Critical Layers
Haomin Zhuang, Mingxian Yu, Hao Wang et al.
Learning to Reject with a Fixed Predictor: Application to Decontextualization
Christopher Mohri, Daniel Andor, Eunsol Choi et al.
MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models
Wenbo Hu, Jia-Chen Gu, Zi-Yi Dou et al.
Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge
Haomiao Xiong, Zongxin Yang, Jiazuo Yu et al.
Understanding In-Context Learning from Repetitions
Jianhao (Elliott) Yan, Jin Xu, Chiyu Song et al.
Multi-Scale Representations by Varying Window Attention for Semantic Segmentation
Haotian Yan, Ming Wu, Chuang Zhang
OmniBind: Large-scale Omni Multimodal Representation via Binding Spaces
Zehan Wang, Ziang Zhang, Minjie Hong et al.
Decision ConvFormer: Local Filtering in MetaFormer is Sufficient for Decision Making
Jeonghye Kim, Su Young Lee, Woojun Kim et al.
On Large Language Model Continual Unlearning
Chongyang Gao, Lixu Wang, Kaize Ding et al.
InstantSwap: Fast Customized Concept Swapping across Sharp Shape Differences
Chenyang Zhu, Kai Li, Yue Ma et al.
Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data
Zhiyuan Zhou, Andy Peng, Qiyang Li et al.
Efficient Dictionary Learning with Switch Sparse Autoencoders
Anish Mudide, Josh Engels, Eric Michaud et al.
RetroBridge: Modeling Retrosynthesis with Markov Bridges
Ilia Igashov, Arne Schneuing, Marwin Segler et al.
GenSE: Generative Speech Enhancement via Language Models using Hierarchical Modeling
Jixun Yao, Hexin Liu, Chen Chen et al.
Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning
Moritz Reuss, Jyothish Pari, Pulkit Agrawal et al.
Understanding Addition in Transformers
Philip Quirke, Fazl Barez
Learning the greatest common divisor: explaining transformer predictions
François Charton
Can Knowledge Editing Really Correct Hallucinations?
Baixiang Huang, Canyu Chen, Xiongxiao Xu et al.
ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding
Zhengzhuo Xu, Bowen Qu, Yiyan Qi et al.
Zigzag Diffusion Sampling: Diffusion Models Can Self-Improve via Self-Reflection
Lichen Bai, Shitong Shao, Zikai Zhou et al.
Multimodal Situational Safety
Kaiwen Zhou, Chengzhi Liu, Xuandong Zhao et al.
Object-Aware Inversion and Reassembly for Image Editing
Zhen Yang, Ganggui Ding, Wen Wang et al.
Parallelizing non-linear sequential models over the sequence length
Yi Heng Lim, Qi Zhu, Joshua Selfridge et al.
Semantics-Adaptive Activation Intervention for LLMs via Dynamic Steering Vectors
Weixuan Wang, Jingyuan Yang, Wei Peng
Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse
Maojia Song, Shang Hong Sim, Rishabh Bhardwaj et al.
Ghost on the Shell: An Expressive Representation of General 3D Shapes
Zhen Liu, Yao Feng, Yuliang Xiu et al.
Large-scale and Fine-grained Vision-language Pre-training for Enhanced CT Image Understanding
Zhongyi Shui, Jianpeng Zhang, Weiwei Cao et al.
Holistically Evaluating the Environmental Impact of Creating Language Models
Jacob Morrison, Clara Na, Jared Fernandez et al.
Learning Energy Decompositions for Partial Inference in GFlowNets
Hyosoon Jang, Minsu Kim, Sungsoo Ahn
How Discrete and Continuous Diffusion Meet: Comprehensive Analysis of Discrete Diffusion Models via a Stochastic Integral Framework
Yinuo Ren, Haoxuan Chen, Grant Rotskoff et al.
Energy-Based Concept Bottleneck Models: Unifying Prediction, Concept Intervention, and Probabilistic Interpretations
Xinyue Xu, Yi Qin, Lu Mi et al.
ASID: Active Exploration for System Identification in Robotic Manipulation
Marius Memmel, Andrew Wagenmaker, Chuning Zhu et al.
A Formal Framework for Understanding Length Generalization in Transformers
Xinting Huang, Andy Yang, Satwik Bhattamishra et al.
Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models
Zhenyu Pan, Haozheng Luo, Manling Li et al.
Biased Temporal Convolution Graph Network for Time Series Forecasting with Missing Values
Xiaodan Chen, Xiucheng Li, Bo Liu et al.
DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life
Yu Ying Chiu, Liwei Jiang, Yejin Choi
PolyVoice: Language Models for Speech to Speech Translation
Qianqian Dong, Zhiying Huang, Qiao Tian et al.
CAT-3DGS: A Context-Adaptive Triplane Approach to Rate-Distortion-Optimized 3DGS Compression
Yu-Ting Zhan, Cheng-Yuan Ho, He-Bi Yang et al.
Multimodal Patient Representation Learning with Missing Modalities and Labels
Zhenbang Wu, Anant Dadu, Nicholas Tustison et al.
Batched Low-Rank Adaptation of Foundation Models
Yeming Wen, Swarat Chaudhuri
On the Stability of Expressive Positional Encodings for Graphs
Yinan Huang, William Lu, Joshua Robinson et al.
Machine Unlearning Fails to Remove Data Poisoning Attacks
Martin Pawelczyk, Jimmy Di, Yiwei Lu et al.
Posterior Sampling Based on Gradient Flows of the MMD with Negative Distance Kernel
Paul Hagemann, Johannes Hertrich, Fabian Altekrüger et al.
Beyond Autoregression: Fast LLMs via Self-Distillation Through Time
Justin Deschenaux, Caglar Gulcehre
Understanding and Enhancing Safety Mechanisms of LLMs via Safety-Specific Neuron
Yiran Zhao, Wenxuan Zhang, Yuxi Xie et al.
Copula Conformal prediction for multi-step time series prediction
Sophia Sun, Rose Yu
GraphArena: Evaluating and Exploring Large Language Models on Graph Computation
Jianheng Tang, Qifan Zhang, Yuhan Li et al.
Can Large Language Models Understand Symbolic Graphics Programs?
Zeju Qiu, Weiyang Liu, Haiwen Feng et al.
UniTabE: A Universal Pretraining Protocol for Tabular Foundation Model in Data Science
Yazheng Yang, Yuqi Wang, Guang Liu et al.
Scaling physics-informed hard constraints with mixture-of-experts
Nithin Chalapathi, Yiheng Du, Aditi Krishnapriyan
Energy-Weighted Flow Matching for Offline Reinforcement Learning
Shiyuan Zhang, Weitong Zhang, Quanquan Gu
PhyloGFN: Phylogenetic inference with generative flow networks
Ming Yang Zhou, Zichao Yan, Elliot Layne et al.
Contrastive Learning is Spectral Clustering on Similarity Graph
Zhiquan Tan, Yifan Zhang, Jingqin Yang et al.
CURIE: Evaluating LLMs on Multitask Scientific Long-Context Understanding and Reasoning
Hao Cui, Zahra Shamsi, Gowoon Cheon et al.
Reducing Hallucinations in Large Vision-Language Models via Latent Space Steering
Sheng Liu, Haotian Ye, James Y Zou
PerturboLLaVA: Reducing Multimodal Hallucinations with Perturbative Visual Training
Cong Chen, Mingyu Liu, Chenchen Jing et al.
Your Mixture-of-Experts LLM Is Secretly an Embedding Model for Free
Ziyue Li, Tianyi Zhou
The OMG dataset: An Open MetaGenomic corpus for mixed-modality genomic language modeling
Andre Cornman, Jacob West-Roberts, Antonio Camargo et al.
NOLA: Compressing LoRA using Linear Combination of Random Basis
Soroush Abbasi Koohpayegani, K L Navaneet, Parsa Nooralinejad et al.
Competing Large Language Models in Multi-Agent Gaming Environments
Jen-Tse Huang, Eric John Li, Man Ho Lam et al.
MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code
Zimu Lu, Aojun Zhou, Ke Wang et al.
Improving Uncertainty Estimation through Semantically Diverse Language Generation
Lukas Aichberger, Kajetan Schweighofer, Mykyta Ielanskyi et al.
Nonconvex Stochastic Optimization under Heavy-Tailed Noises: Optimal Convergence without Gradient Clipping
Zijian Liu, Zhengyuan Zhou
GUI-World: A Video Benchmark and Dataset for Multimodal GUI-oriented Understanding
Dongping Chen, Yue Huang, Siyuan Wu et al.
Unified Generative Modeling of 3D Molecules with Bayesian Flow Networks
Yuxuan Song, Jingjing Gong, Hao Zhou et al.
DREAM: Dual Structured Exploration with Mixup for Open-set Graph Domain Adaption
Nan Yin, Mengzhu Wang et al.
Denoising Autoregressive Transformers for Scalable Text-to-Image Generation
Jiatao Gu, Yuyang Wang, Yizhe Zhang et al.
The LLM Surgeon
Tycho van der Ouderaa, Markus Nagel, Mart van Baalen et al.
Inherently Interpretable Time Series Classification via Multiple Instance Learning
Joseph Early, Gavin Cheung, Kurt Cutajar et al.
The Superposition of Diffusion Models Using the Itô Density Estimator
Marta Skreta, Lazar Atanackovic, Joey Bose et al.
Compressed Context Memory for Online Language Model Interaction
Jang-Hyun Kim, Junyoung Yeom, Sangdoo Yun et al.
A Geometric Framework for Understanding Memorization in Generative Models
Brendan Ross, Hamidreza Kamkari, Tongzi Wu et al.
CREAM: Consistency Regularized Self-Rewarding Language Models
Zhaoyang Wang, Weilei He, Zhiyuan Liang et al.
Diffusion-based Neural Network Weights Generation
Bedionita Soro, Bruno Andreis, Hayeon Lee et al.
From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions
Changle Qu, Sunhao Dai, Xiaochi Wei et al.