Sequential Representation Learning via Static-Dynamic Conditional Disentanglement

ECCV 2024 · #1189 of 2387 papers · 4 citations · 3 authors

Abstract

This paper explores self-supervised disentangled representation learning for sequential data, focusing on separating time-independent and time-varying factors in videos. We propose a new model that breaks the usual independence assumption between these factors by explicitly modeling the causal relationship between the static and dynamic variables, and that improves model expressivity through additional Normalizing Flows. A formal definition of the factors is proposed. This formalism leads to the derivation of sufficient conditions under which the ground-truth factors are identifiable, and to the introduction of a novel, theoretically grounded disentanglement constraint that can be directly and efficiently incorporated into our framework. Experiments show that the proposed approach outperforms previous, more complex state-of-the-art techniques in scenarios where the dynamics of a scene are influenced by its content.
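To make the core modeling idea concrete, the sketch below illustrates one way a dynamic prior can be conditioned on a static latent instead of assuming the two factors are independent. This is a minimal, hypothetical illustration, not the authors' architecture: the layer sizes, the GRU-based prior, and the omission of the encoder, decoder, and Normalizing Flows are all assumptions made for brevity.

```python
# Minimal sketch (assumed, not the paper's model): an autoregressive dynamic
# prior p(z_t | z_<t, s) that conditions each dynamic latent z_t on the
# static (time-independent) latent s, breaking static/dynamic independence.
import torch
import torch.nn as nn


class ConditionalDynamicPrior(nn.Module):
    def __init__(self, static_dim=16, dynamic_dim=8, hidden_dim=64):
        super().__init__()
        self.rnn = nn.GRUCell(dynamic_dim + static_dim, hidden_dim)
        self.to_stats = nn.Linear(hidden_dim, 2 * dynamic_dim)
        self.dynamic_dim = dynamic_dim
        self.hidden_dim = hidden_dim

    def forward(self, s, num_steps):
        """Sample a dynamic latent trajectory conditioned on the static code s."""
        batch = s.size(0)
        h = s.new_zeros(batch, self.hidden_dim)
        z = s.new_zeros(batch, self.dynamic_dim)
        trajectory = []
        for _ in range(num_steps):
            # Feeding s at every step makes the dynamics depend on the content.
            h = self.rnn(torch.cat([z, s], dim=-1), h)
            mu, log_var = self.to_stats(h).chunk(2, dim=-1)
            # Reparameterized sample from the conditional Gaussian prior.
            z = mu + torch.randn_like(mu) * (0.5 * log_var).exp()
            trajectory.append(z)
        return torch.stack(trajectory, dim=1)  # (batch, num_steps, dynamic_dim)


# Usage: sample dynamics for a batch of static codes.
prior = ConditionalDynamicPrior()
static_codes = torch.randn(4, 16)
dynamics = prior(static_codes, num_steps=10)
print(dynamics.shape)  # torch.Size([4, 10, 8])
```

In a sequential VAE of this kind, such a conditional prior replaces the factorized prior p(z_t)p(s); the paper's contribution additionally includes Normalizing Flows for expressivity and an identifiability-motivated disentanglement constraint, neither of which is shown here.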

Citation History

0 citations recorded Jan 26–27, 2026; 4 citations as of Feb 2, 2026.