On Stronger Computational Separations Between Multimodal and Unimodal Machine Learning

ICML 2024

Abstract

Recently, multimodal machine learning has enjoyed huge empirical success (e.g., GPT-4). Motivated to develop theoretical justification for this empirical success, Lu (NeurIPS '23, ALT '24) introduces a theory of multimodal learning and considers possible separations between theoretical models of multimodal and unimodal learning. In particular, Lu (ALT '24) shows a computational separation, which is relevant to worst-case instances of the learning task. In this paper, we give a stronger average-case computational separation, where for "typical" instances of the learning task, unimodal learning is computationally hard, but multimodal learning is easy. We then question how "natural" the average-case separation is: would it be encountered in practice? To this end, we prove that under basic conditions, any given computational separation between average-case unimodal and multimodal learning tasks implies a corresponding cryptographic key agreement protocol. We suggest interpreting this as evidence that very strong computational advantages of multimodal learning may arise infrequently in practice, since they exist only for the "pathological" case of inherently cryptographic distributions. However, this does not apply to possible (super-polynomial) statistical advantages.
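
One way to read the average-case separation claimed in the abstract is sketched below. This is an illustrative formalization only: the task distribution D, the two-modality split (x, y), and the binary label z are assumptions made for the sketch, not the paper's exact definitions.

```latex
% Illustrative formalization only (not the paper's exact definitions).
% Instances are pairs of modalities (x, y) with a binary label z, drawn
% from a task distribution D over inputs of size n.

% Multimodal easiness: some polynomial-time learner A succeeds on typical instances.
\[
  \exists\, \text{poly-time } A:\quad
    \Pr_{(x,y,z)\sim \mathcal{D}}\bigl[A(x,y)=z\bigr] \;\ge\; 1-\varepsilon .
\]

% Unimodal average-case hardness: every polynomial-time learner, given only one
% modality, does little better than random guessing.
\[
  \forall\, \text{poly-time } B:\quad
    \Pr_{(x,y,z)\sim \mathcal{D}}\bigl[B(x)=z\bigr] \;\le\; \tfrac{1}{2}+\mathrm{negl}(n) .
\]
```

Under this reading, "average-case" means the hardness holds with respect to instances drawn from D itself (typical instances), rather than only for adversarially chosen worst-case instances as in Lu (ALT '24).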
