Most Cited ICLR "zero-shot approach" Papers
Air Quality Prediction with Physics-Guided Dual Neural ODEs in Open Systems
Jindong Tian, Yuxuan Liang, Ronghui Xu et al.
Minimum width for universal approximation using ReLU networks on compact domain
Namjun Kim, Chanho Min, Sejun Park
Locality Sensitive Sparse Encoding for Learning World Models Online
Zichen Liu, Chao Du, Wee Sun Lee et al.
Adversarial AutoMixup
Huafeng Qin, Xin Jin, Yun Jiang et al.
Benchmarking Algorithms for Federated Domain Generalization
Ruqi Bai, Saurabh Bagchi, David Inouye
Learning Energy-Based Models by Cooperative Diffusion Recovery Likelihood
Yaxuan Zhu, Jianwen Xie, Yingnian Wu et al.
SemiReward: A General Reward Model for Semi-supervised Learning
Siyuan Li, Weiyang Jin, Zedong Wang et al.
DarkBench: Benchmarking Dark Patterns in Large Language Models
Esben Kran, Hieu Minh Nguyen, Akash Kundu et al.
Variance-aware Regret Bounds for Stochastic Contextual Dueling Bandits
Qiwei Di, Tao Jin, Yue Wu et al.
SLMRec: Distilling Large Language Models into Small for Sequential Recommendation
Wujiang Xu, Qitian Wu, Zujie Liang et al.
Generalization through variance: how noise shapes inductive biases in diffusion models
John Vastola
Active Learning for Neural PDE Solvers
Daniel Musekamp, Marimuthu Kalimuthu, David Holzmüller et al.
Aioli: A Unified Optimization Framework for Language Model Data Mixing
Mayee Chen, Michael Hu, Nicholas Lourie et al.
Conformal Prediction via Regression-as-Classification
Etash Guha, Shlok Natarajan, Thomas Möllenhoff et al.
Adaptive Sharpness-Aware Pruning for Robust Sparse Networks
Anna Bair, Hongxu Yin, Maying Shen et al.
GS-CPR: Efficient Camera Pose Refinement via 3D Gaussian Splatting
Changkun Liu, Shuai Chen, Yash Bhalgat et al.
Palu: KV-Cache Compression with Low-Rank Projection
Chi-Chih Chang, Wei-Cheng Lin, Chien-Yu Lin et al.
Contextual Document Embeddings
John X. Morris, Alexander Rush
De novo Protein Design Using Geometric Vector Field Networks
Weian Mao, Muzhi Zhu, Zheng Sun et al.
Bio-xLSTM: Generative modeling, representation and in-context learning of biological and chemical sequences
Niklas Schmidinger, Lisa Schneckenreiter, Philipp Seidl et al.
Towards Faithful XAI Evaluation via Generalization-Limited Backdoor Watermark
Mengxi Ya, Yiming Li, Tao Dai et al.
Determine-Then-Ensemble: Necessity of Top-k Union for Large Language Model Ensembling
Yuxuan YAO, Han Wu, Mingyang LIU et al.
Understanding and Enhancing the Transferability of Jailbreaking Attacks
Runqi Lin, Bo Han, Fengwang Li et al.
VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning
Yongshuo Zong, Ondrej Bohdal, Timothy Hospedales
Limits of Deep Learning: Sequence Modeling through the Lens of Complexity Theory
Nikola Zubic, Federico Soldà, Aurelio Sulser et al.
MiniPLM: Knowledge Distillation for Pre-training Language Models
Yuxian Gu, Hao Zhou, Fandong Meng et al.
Improving Reasoning Performance in Large Language Models via Representation Engineering
Bertram Højer, Oliver Jarvis, Stefan Heinrich
Fundamental Limits of Prompt Tuning Transformers: Universality, Capacity and Efficiency
Jerry Yao-Chieh Hu, Wei-Po Wang, Ammar Gilani et al.
Scalable Influence and Fact Tracing for Large Language Model Pretraining
Tyler Chang, Dheeraj Rajagopal, Tolga Bolukbasi et al.
Label-Agnostic Forgetting: A Supervision-Free Unlearning in Deep Models
Shaofei Shen, Chenhao Zhang, Yawen Zhao et al.
Efficient Multi-agent Reinforcement Learning by Planning
Qihan Liu, Jianing Ye, Xiaoteng Ma et al.
Perm: A Parametric Representation for Multi-Style 3D Hair Modeling
Chengan He, Xin Sun, Zhixin Shu et al.
CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding
Junyan Li, Delin Chen, Yining Hong et al.
Discretization-invariance? On the Discretization Mismatch Errors in Neural Operators
Wenhan Gao, Ruichen Xu, Yuefan Deng et al.
Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression
Runtian Zhai, Bingbin Liu, Andrej Risteski et al.
MMDisCo: Multi-Modal Discriminator-Guided Cooperative Diffusion for Joint Audio and Video Generation
Akio Hayakawa, Masato Ishii, Takashi Shibuya et al.
Boosting Neural Combinatorial Optimization for Large-Scale Vehicle Routing Problems
Fu Luo, Xi Lin, Yaoxin Wu et al.
Bridging Context Gaps: Leveraging Coreference Resolution for Long Contextual Understanding
Yanming Liu, Xinyue Peng, Jiannan Cao et al.
Tracing Representation Progression: Analyzing and Enhancing Layer-Wise Similarity
Jiachen Jiang, Jinxin Zhou, Zhihui Zhu
End-to-End (Instance)-Image Goal Navigation through Correspondence as an Emergent Phenomenon
Guillaume Bono, Leonid Antsfeld, Boris Chidlovskii et al.
Energy-based Automated Model Evaluation
Ru Peng, Heming Zou, Haobo Wang et al.
Highly Efficient Self-Adaptive Reward Shaping for Reinforcement Learning
Haozhe Ma, Zhengding Luo, Thanh Vinh Vo et al.
No Preference Left Behind: Group Distributional Preference Optimization
Binwei Yao, Zefan Cai, Yun-Shiuan Chuang et al.
MMAD: A Comprehensive Benchmark for Multimodal Large Language Models in Industrial Anomaly Detection
Xi Jiang, Jian Li, Hanqiu Deng et al.
COPlanner: Plan to Roll Out Conservatively but to Explore Optimistically for Model-Based RL
Xiyao Wang, Ruijie Zheng, Yanchao Sun et al.
VideoShield: Regulating Diffusion-based Video Generation Models via Watermarking
Runyi Hu, Jie Zhang, Yiming Li et al.
From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency
Kaiyue Wen, Huaqing Zhang, Hongzhou Lin et al.
GEVRM: Goal-Expressive Video Generation Model For Robust Visual Manipulation
Hongyin Zhang, Pengxiang Ding, Shangke Lyu et al.
Learning Graph Quantized Tokenizers
Limei Wang, Kaveh Hassani, Si Zhang et al.
CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion
Shoubin Yu, Jaehong Yoon, Mohit Bansal
Spiking Vision Transformer with Saccadic Attention
Shuai Wang, Malu Zhang, Dehao Zhang et al.
An Engorgio Prompt Makes Large Language Model Babble on
Jianshuo Dong, Ziyuan Zhang, Qingjie Zhang et al.
Defining and extracting generalizable interaction primitives from DNNs
Lu Chen, Siyu Lou, Benhao Huang et al.
A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules
Kairong Luo, Haodong Wen, Shengding Hu et al.
Towards Universality: Studying Mechanistic Similarity Across Language Model Architectures
Junxuan Wang, Xuyang Ge, Wentao Shu et al.
OmniKV: Dynamic Context Selection for Efficient Long-Context LLMs
Jitai Hao, Yuke Zhu, Tian Wang et al.
Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model
Karsten Roth, Lukas Thede, A. Sophia Koepke et al.
Locality-aware Gaussian Compression for Fast and High-quality Rendering
Seungjoo Shin, Jaesik Park, Sunghyun Cho
Unleashing the Potential of Fractional Calculus in Graph Neural Networks with FROND
Qiyu Kang, Kai Zhao, Qinxu Ding et al.
CBGBench: Fill in the Blank of Protein-Molecule Complex Binding Graph
Haitao Lin, Guojiang Zhao, Odin Zhang et al.
SPADE: Semi-supervised Anomaly Detection under Distribution Mismatch
Chun-Liang Li, Tomas Pfister, Kihyuk Sohn et al.
Graph Sparsification via Mixture of Graphs
Guibin Zhang, Xiangguo SUN, Yanwei Yue et al.
What Makes a Good Prune? Maximal Unstructured Pruning for Maximal Cosine Similarity
Gabryel Mason-Williams, Fredrik Dahlqvist
Non-Exchangeable Conformal Risk Control
António Farinhas, Chrysoula Zerva, Dennis Ulmer et al.
DiffusionNAG: Predictor-guided Neural Architecture Generation with Diffusion Models
Sohyun An, Hayeon Lee, Jaehyeong Jo et al.
Towards Enhancing Time Series Contrastive Learning: A Dynamic Bad Pair Mining Approach
Xiang Lan, Hanshu Yan, Shenda Hong et al.
MetaMetrics: Calibrating Metrics for Generation Tasks Using Human Preferences
Genta Winata, David Anugraha, Lucky Susanto et al.
Intent3D: 3D Object Detection in RGB-D Scans Based on Human Intention
Weitai Kang, Mengxue Qu, Jyoti Kini et al.
Towards Green AI in Fine-tuning Large Language Models via Adaptive Backpropagation
Kai Huang, Hanyun Yin, Heng Huang et al.
Simulating Human-like Daily Activities with Desire-driven Autonomy
Yiding Wang, Yuxuan Chen, Fangwei Zhong et al.
Scalable Discrete Diffusion Samplers: Combinatorial Optimization and Statistical Physics
Sebastian Sanokowski, Wilhelm Berghammer, Haoyu Wang et al.
Decentralized Riemannian Conjugate Gradient Method on the Stiefel Manifold
Jun Chen, Haishan Ye, Mengmeng Wang et al.
RefactorBench: Evaluating Stateful Reasoning in Language Agents Through Code
Dhruv Gautam, Spandan Garg, Jinu Jang et al.
Can We Talk Models Into Seeing the World Differently?
Paul Gavrikov, Jovita Lukasik, Steffen Jung et al.
Sliced Denoising: A Physics-Informed Molecular Pre-Training Method
Yuyan Ni, Shikun Feng, Wei-Ying Ma et al.
Unifying Unsupervised Graph-Level Anomaly Detection and Out-of-Distribution Detection: A Benchmark
Yili Wang, Yixin Liu, Xu Shen et al.
Efficient Learning with Sine-Activated Low-Rank Matrices
Yiping Ji, Hemanth Saratchandran, Cameron Gordon et al.
Blending Imitation and Reinforcement Learning for Robust Policy Improvement
Xuefeng Liu, Takuma Yoneda, Rick Stevens et al.
FedCompass: Efficient Cross-Silo Federated Learning on Heterogeneous Client Devices Using a Computing Power-Aware Scheduler
Zilinghan Li, Pranshu Chaturvedi, Shilan He et al.
Self-Supervised Contrastive Learning for Long-term Forecasting
Junwoo Park, Daehoon Gwak, Jaegul Choo et al.
Learnable Expansion of Graph Operators for Multi-Modal Feature Fusion
Dexuan Ding, Lei Wang, Liyun Zhu et al.
u-$\mu$P: The Unit-Scaled Maximal Update Parametrization
Charles Blake, Constantin Eichenberg, Josef Dean et al.
Cross-Entropy Is All You Need To Invert the Data Generating Process
Patrik Reizinger, Alice Bizeul, Attila Juhos et al.
Sparse Spiking Neural Network: Exploiting Heterogeneity in Timescales for Pruning Recurrent SNN
Biswadeep Chakraborty, Beomseok Kang, Harshit Kumar et al.
PointOBB-v2: Towards Simpler, Faster, and Stronger Single Point Supervised Oriented Object Detection
Botao Ren, Xue Yang, Yi Yu et al.
Prompting Fairness: Integrating Causality to Debias Large Language Models
Jingling Li, Zeyu Tang, Xiaoyu Liu et al.
Predictive, scalable and interpretable knowledge tracing on structured domains
Hanqi Zhou, Robert Bamler, Charley Wu et al.
Leave-one-out Distinguishability in Machine Learning
Jiayuan Ye, Anastasia Borovykh, Soufiane Hayou et al.
Permute-and-Flip: An optimally stable and watermarkable decoder for LLMs
Xuandong Zhao, Lei Li, Yu-Xiang Wang
LMUFormer: Low Complexity Yet Powerful Spiking Model With Legendre Memory Units
Zeyu Liu, Gourav Datta, Anni Li et al.
Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning
Qinghao Ye, Xianhan Zeng, Fu Li et al.
Tighter Privacy Auditing of DP-SGD in the Hidden State Threat Model
Tudor Cebere, Aurélien Bellet, Nicolas Papernot
Optimal Transport for Time Series Imputation
Hao Wang, Zhengnan Li, Haoxuan Li et al.
Towards Understanding Factual Knowledge of Large Language Models
Xuming Hu, Junzhe Chen, Xiaochuan Li et al.
Visual Description Grounding Reduces Hallucinations and Boosts Reasoning in LVLMs
Sreyan Ghosh, Chandra Kiran Evuru, Sonal Kumar et al.
DeLLMa: Decision Making Under Uncertainty with Large Language Models
Ollie Liu, Deqing Fu, Dani Yogatama et al.
U-Nets as Belief Propagation: Efficient Classification, Denoising, and Diffusion in Generative Hierarchical Models
Song Mei
Dynamic Multimodal Evaluation with Flexible Complexity by Vision-Language Bootstrapping
Yue Yang, Shuibo Zhang, Kaipeng Zhang et al.
NeuroBack: Improving CDCL SAT Solving using Graph Neural Networks
Wenxi Wang, Yang Hu, Mohit Tiwari et al.
Adaptive Methods through the Lens of SDEs: Theoretical Insights on the Role of Noise
Enea Monzio Compagnoni, Tianlin Liu, Rustem Islamov et al.
Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs
Yuzhe Gu, Wenwei Zhang, Chengqi Lyu et al.
Learning Interpretable Hierarchical Dynamical Systems Models from Time Series Data
Manuel Brenner, Elias Weber, Georgia Koppe et al.
What Matters in Learning from Large-Scale Datasets for Robot Manipulation
Vaibhav Saxena, Matthew Bronars, Nadun Ranawaka Arachchige et al.
TULIP: Token-length Upgraded CLIP
Ivona Najdenkoska, Mohammad Mahdi Derakhshani, Yuki Asano et al.
AutoBencher: Towards Declarative Benchmark Construction
XIANG LI, Farzaan Kaiyom, Evan Liu et al.
Quamba: A Post-Training Quantization Recipe for Selective State Space Models
Hung-Yueh Chiang, Chi-Chih Chang, Natalia Frumkin et al.
ThinkBot: Embodied Instruction Following with Thought Chain Reasoning
Guanxing Lu, Ziwei Wang, Changliu Liu et al.
From Risk to Uncertainty: Generating Predictive Uncertainty Measures via Bayesian Estimation
Nikita Kotelevskii, Vladimir Kondratyev, Martin Takáč et al.
A Probabilistic Perspective on Unlearning and Alignment for Large Language Models
Yan Scholten, Stephan Günnemann, Leo Schwinn
EControl: Fast Distributed Optimization with Compression and Error Control
Yuan Gao, Rustem Islamov, Sebastian Stich
Closed-Form Merging of Parameter-Efficient Modules for Federated Continual Learning
Riccardo Salami, Pietro Buzzega, Matteo Mosconi et al.
SparseFormer: Sparse Visual Recognition via Limited Latent Tokens
Ziteng Gao, Zhan Tong, Limin Wang et al.
Data Distillation Can Be Like Vodka: Distilling More Times For Better Quality
Xuxi Chen, Yu Yang, Zhangyang Wang et al.
CIFAR-10-Warehouse: Broad and More Realistic Testbeds in Model Generalization Analysis
Xiaoxiao Sun, Xingjian Leng, Zijian Wang et al.
Image Inpainting via Iteratively Decoupled Probabilistic Modeling
Wenbo Li, Xin Yu, Kun Zhou et al.
Controllable Context Sensitivity and the Knob Behind It
Julian Minder, Kevin Du, Niklas Stoehr et al.
Training Neural Networks as Recognizers of Formal Languages
Alexandra Butoi, Ghazal Khalighinejad, Anej Svete et al.
Learning Efficient Positional Encodings with Graph Neural Networks
Charilaos Kanatsoulis, Evelyn Choi, Stefanie Jegelka et al.
CAMIL: Context-Aware Multiple Instance Learning for Cancer Detection and Subtyping in Whole Slide Images
Olga Fourkioti, Matt De Vries, Chris Bakal
Learning Clustering-based Prototypes for Compositional Zero-Shot Learning
Hongyu Qu, Jianan Wei, Xiangbo Shu et al.
A CLIP-Powered Framework for Robust and Generalizable Data Selection
Suorong Yang, Peng Ye, Wanli Ouyang et al.
R-MAE: Regions Meet Masked Autoencoders
Duy-Kien Nguyen, Yanghao Li, Vaibhav Aggarwal et al.
Adam Exploits $\ell_\infty$-geometry of Loss Landscape via Coordinate-wise Adaptivity
Shuo Xie, Mohamad Amin Mohamadi, Zhiyuan Li
Swift4D: Adaptive divide-and-conquer Gaussian Splatting for compact and efficient reconstruction of dynamic scene
Jiahao Wu, Rui Peng, Zhiyan Wang et al.
Identifying Representations for Intervention Extrapolation
Sorawit (James) Saengkyongam, Elan Rosenfeld, Pradeep K Ravikumar et al.
Generative Flows on Synthetic Pathway for Drug Design
Seonghwan Seo, Minsu Kim, Tony Shen et al.
Crystalformer: Infinitely Connected Attention for Periodic Structure Encoding
Tatsunori Taniai, Ryo Igarashi, Yuta Suzuki et al.
Improving Domain Generalization with Domain Relations
Huaxiu Yao, Xinyu Yang, Xinyi Pan et al.
When should we prefer Decision Transformers for Offline Reinforcement Learning?
Prajjwal Bhargava, Rohan Chitnis, Alborz Geramifard et al.
Generating Images with 3D Annotations Using Diffusion Models
Wufei Ma, Qihao Liu, Jiahao Wang et al.
CyberHost: A One-stage Diffusion Framework for Audio-driven Talking Body Generation
Gaojie Lin, Jianwen Jiang, Chao Liang et al.
Block-Attention for Efficient Prefilling
Dongyang Ma, Yan Wang, Tian Lan
Self-supervised Representation Learning from Random Data Projectors
Yi Sui, Tongzi Wu, Jesse Cresswell et al.
CORN: Contact-based Object Representation for Nonprehensile Manipulation of General Unseen Objects
Yoonyoung Cho, Junhyek Han, Yoontae Cho et al.
DiffEnc: Variational Diffusion with a Learned Encoder
Beatrix M. G. Nielsen, Anders Christensen, Andrea Dittadi et al.
Endless Jailbreaks with Bijection Learning
Brian R.Y. Huang, Max Li, Leonard Tang
Multimarginal Generative Modeling with Stochastic Interpolants
Michael Albergo, Nicholas Boffi, Michael Lindsey et al.
DRoC: Elevating Large Language Models for Complex Vehicle Routing via Decomposed Retrieval of Constraints
Xia Jiang, Yaoxin Wu, Chenhao Zhang et al.
Mirage: Model-agnostic Graph Distillation for Graph Classification
Mridul Gupta, Sahil Manchanda, HARIPRASAD KODAMANA et al.
Jointly-Learned Exit and Inference for a Dynamic Neural Network
Florence Regol, Joud Chataoui, Mark Coates
COME: Test-time Adaption by Conservatively Minimizing Entropy
Qingyang Zhang, Yatao Bian, Xinke Kong et al.
MACPO: Weak-to-Strong Alignment via Multi-Agent Contrastive Preference Optimization
Yougang Lyu, Lingyong Yan, Zihan Wang et al.
Is attention required for ICL? Exploring the Relationship Between Model Architecture and In-Context Learning Ability
Ivan Lee, Nan Jiang, Taylor Berg-Kirkpatrick
Mixture of Experts Made Personalized: Federated Prompt Learning for Vision-Language Models
Jun Luo, Chen Chen, Shandong Wu
Masked Completion via Structured Diffusion with White-Box Transformers
Druv Pai, Sam Buchanan, Ziyang Wu et al.
InstructDET: Diversifying Referring Object Detection with Generalized Instructions
Ronghao Dang, Jiangyan Feng, Haodong Zhang et al.
On Calibration of LLM-based Guard Models for Reliable Content Moderation
Hongfu Liu, Hengguan Huang, Xiangming Gu et al.
MetaOOD: Automatic Selection of OOD Detection Models
Yuehan Qin, Yichi Zhang, Yi Nian et al.
OATS: Outlier-Aware Pruning Through Sparse and Low Rank Decomposition
Stephen Zhang, Vardan Papyan
Learning Stackable and Skippable LEGO Bricks for Efficient, Reconfigurable, and Variable-Resolution Diffusion Modeling
Huangjie Zheng, Zhendong Wang, Jianbo Yuan et al.
Do LLMs estimate uncertainty well in instruction-following?
Juyeon Heo, Miao Xiong, Christina Heinze-Deml et al.
What Matters to You? Towards Visual Representation Alignment for Robot Learning
Thomas Tian, Chenfeng Xu, Masayoshi Tomizuka et al.
Learning Flexible Body Collision Dynamics with Hierarchical Contact Mesh Transformer
Youn-Yeol Yu, Jeongwhan Choi, Woojin Cho et al.
S$^2$AC: Energy-Based Reinforcement Learning with Stein Soft Actor Critic
Safa Messaoud, Billel Mokeddem, Zhenghai Xue et al.
Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval
Sheryl Hsu, Omar Khattab, Chelsea Finn et al.
Is Factuality Enhancement a Free Lunch For LLMs? Better Factuality Can Lead to Worse Context-Faithfulness
Baolong Bi, Shenghua Liu, Yiwei Wang et al.
SiReRAG: Indexing Similar and Related Information for Multihop Reasoning
Nan Zhang, Prafulla Kumar Choubey, Alexander Fabbri et al.
Robustifying State-space Models for Long Sequences via Approximate Diagonalization
Annan Yu, Arnur Nigmetov, Dmitriy Morozov et al.
EgoExo-Gen: Ego-centric Video Prediction by Watching Exo-centric Videos
Jilan Xu, Yifei Huang, Baoqi Pei et al.
MrT5: Dynamic Token Merging for Efficient Byte-level Language Models
Julie Kallini, Shikhar Murty, Christopher Manning et al.
CoT3DRef: Chain-of-Thoughts Data-Efficient 3D Visual Grounding
Eslam Abdelrahman, Mohamed Ayman Mohamed, Mahmoud Ahmed et al.
Label-Noise Robust Diffusion Models
Byeonghu Na, Yeongmin Kim, HeeSun Bae et al.
Dynamic Gaussians Mesh: Consistent Mesh Reconstruction from Dynamic Scenes
Isabella Liu, Hao Su, Xiaolong Wang
Can In-context Learning Really Generalize to Out-of-distribution Tasks?
Qixun Wang, Yifei Wang, Xianghua Ying et al.
LeanQuant: Accurate and Scalable Large Language Model Quantization with Loss-error-aware Grid
Tianyi Zhang, Anshumali Shrivastava
Track-On: Transformer-based Online Point Tracking with Memory
Görkay Aydemir, Xiongyi Cai, Weidi Xie et al.
3D-AffordanceLLM: Harnessing Large Language Models for Open-Vocabulary Affordance Detection in 3D Worlds
Hengshuo Chu, Xiang Deng, Qi Lv et al.
Signature Kernel Conditional Independence Tests in Causal Discovery for Stochastic Processes
Georg Manten, Cecilia Casolo, Emilio Ferrucci et al.
Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition
Jiyeon Kim, Hyunji Lee, Hyowon Cho et al.
PolyPythias: Stability and Outliers across Fifty Language Model Pre-Training Runs
Oskar van der Wal, Pietro Lesci, Max Müller-Eberstein et al.
BIRD: A Trustworthy Bayesian Inference Framework for Large Language Models
Yu Feng, Ben Zhou, Weidong Lin et al.
Black-Box Detection of Language Model Watermarks
Thibaud Gloaguen, Nikola Jovanović, Robin Staab et al.
Refine Knowledge of Large Language Models via Adaptive Contrastive Learning
Yinghui Li, Haojing Huang, Jiayi Kuang et al.
Concept Bottleneck Language Models For Protein Design
Aya Ismail, Tuomas Oikarinen, Amy Wang et al.
ADDP: Learning General Representations for Image Recognition and Generation with Alternating Denoising Diffusion Process
Changyao Tian, Chenxin Tao, Jifeng Dai et al.
LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization
Jui-Nan Yen, Si Si, Zhao Meng et al.
Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning
Haque Ishfaq, Guangyuan Wang, Sami Islam et al.
Adapting Multi-modal Large Language Model to Concept Drift From Pre-training Onwards
Xiaoyu Yang, Jie Lu, En Yu
FlexCAD: Unified and Versatile Controllable CAD Generation with Fine-tuned Large Language Models
Zhanwei Zhang, Shizhao Sun, Wenxiao Wang et al.
Probabilistic Language-Image Pre-Training
Sanghyuk Chun, Wonjae Kim, Song Park et al.
Overcoming the Pitfalls of Vision-Language Model Finetuning for OOD Generalization
Yuhang Zang, Hanlin Goh, Joshua Susskind et al.
Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion
Marco Mistretta, Alberto Baldrati, Lorenzo Agnolucci et al.
EMOS: Embodiment-aware Heterogeneous Multi-robot Operating System with LLM Agents
Junting Chen, Checheng Yu, Xunzhe Zhou et al.
Re-Thinking Inverse Graphics With Large Language Models
Haiwen Feng, Michael J Black, Weiyang Liu et al.
Revisiting Deep Audio-Text Retrieval Through the Lens of Transportation
Tien Manh Luong, Khai Nguyen, Nhat Ho et al.
Provably Accurate Shapley Value Estimation via Leverage Score Sampling
Christopher Musco, R. Teal Witter
Quadratic models for understanding catapult dynamics of neural networks
Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan et al.
GOPlan: Goal-conditioned Offline Reinforcement Learning by Planning with Learned Models
Mianchu Wang, Rui Yang, Xi Chen et al.
APE: Faster and Longer Context-Augmented Generation via Adaptive Parallel Encoding
Xinyu Yang, Tianqi Chen, Beidi Chen
FlowDec: A flow-based full-band general audio codec with high perceptual quality
Simon Welker, Matthew Le, Ricky T. Q. Chen et al.
To Trust or Not to Trust? Enhancing Large Language Models' Situated Faithfulness to External Contexts
Yukun Huang, Sanxing Chen, Hongyi Cai et al.
DaWin: Training-free Dynamic Weight Interpolation for Robust Adaptation
Changdae Oh, Yixuan Li, Kyungwoo Song et al.
RobustTSF: Towards Theory and Design of Robust Time Series Forecasting with Anomalies
Hao Cheng, Qingsong Wen, Yang Liu et al.
TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention
Lijie Yang, Zhihao Zhang, Zhuofu Chen et al.
Presto! Distilling Steps and Layers for Accelerating Music Generation
Zachary Novack, Ge Zhu, Jonah Casebeer et al.
Functional Interpolation for Relative Positions improves Long Context Transformers
Shanda Li, Chong You, Guru Guruganesh et al.
Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective
Ming Zhong, Chenxin An, Weizhu Chen et al.
OBI-Bench: Can LMMs Aid in Study of Ancient Script on Oracle Bones?
Zijian Chen, Tingzhu Chen, Wenjun Zhang et al.
BayesDiff: Estimating Pixel-wise Uncertainty in Diffusion via Bayesian Inference
Siqi Kou, Lei Gan, Dequan Wang et al.
RevisEval: Improving LLM-as-a-Judge via Response-Adapted References
Qiyuan Zhang, Yufei Wang, Tiezheng YU et al.