Disentangled representation learning aims to factor a data point’s latent encoding so that each dimension (or chunk) aligns with one underlying generative factor, like object pose, scale, or lighting. Further, we want each such dimension to be decoupled from the others: statistically independent or orthogonal, and (roughly) insensitive to changes in the other factors. What exactly we require depends on the architecture, AFAICT.
Whatever the precise requirement, the upshot is the same. Imagine encoding face images: one latent should “turn the head,” another “brighten the cheek,” and another “open the mouth.” When we achieve this, downstream tasks become easier, since we can e.g. turn someone’s head without changing their expression.
Justifications for this include:
- Interpretability: disentangled representations are easier to understand and visualise.
- Control: disentangled representations allow for more precise control over the generated samples.
- Generalization: disentangled representations can improve the generalization of models to unseen data. Or at least that’s what people claim. I’m sceptical of this as a blanket statement but think it might be interesting in causal settings.
Disentangling was big business in early generative AI, when we weren’t sure how to condition GANs or VAEs on specific features. We now use other tools to condition diffusion models, but disentangling may still be relevant today for interpretability / robustness.
1 Grandaddy example: β-VAE
The β-VAE augments the standard variational autoencoder objective by weighting the KL term with a factor β > 1:

$$\mathcal{L}_{\beta\text{-VAE}} = \mathbb{E}_{q_\phi(z\mid x)}\left[\log p_\theta(x\mid z)\right] - \beta\, D_{\mathrm{KL}}\left(q_\phi(z\mid x)\,\|\,p(z)\right)$$
This stronger bottleneck (larger β) encourages each latent dimension to carry only the minimal information needed, which pushes them toward independence—and often yields interpretable axes like “rotate” or “zoom.”
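For concreteness, here is a minimal PyTorch-style sketch of that loss, assuming a diagonal-Gaussian encoder that returns (mu, logvar) and a Bernoulli decoder that returns logits; the function name and conventions are illustrative, not from any particular codebase:

```python
import torch
import torch.nn.functional as F

def beta_vae_loss(x, x_logits, mu, logvar, beta=4.0):
    # Reconstruction term: negative log-likelihood under a Bernoulli decoder,
    # averaged over the batch.
    recon = F.binary_cross_entropy_with_logits(x_logits, x, reduction="sum") / x.size(0)
    # KL(q(z|x) || N(0, I)) in closed form for a diagonal-Gaussian posterior.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp()) / x.size(0)
    # beta > 1 tightens the information bottleneck on the latent code.
    return recon + beta * kl
```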
2 Fancier: total-correlation in β-TCVAE
β-TCVAE decomposes the VAE’s KL into three parts (mutual information, dimension-wise KL, and total correlation, TC) and then penalizes TC more heavily. Concretely,

$$\mathbb{E}_{p(x)}\left[D_{\mathrm{KL}}\left(q(z\mid x)\,\|\,p(z)\right)\right] = \underbrace{I_q(z;x)}_{\text{mutual information}} + \underbrace{D_{\mathrm{KL}}\left(q(z)\,\Big\|\,\textstyle\prod_j q(z_j)\right)}_{\text{total correlation}} + \underbrace{\textstyle\sum_j D_{\mathrm{KL}}\left(q(z_j)\,\|\,p(z_j)\right)}_{\text{dimension-wise KL}}$$

and the objective becomes

$$\mathcal{L}_{\beta\text{-TCVAE}} = \mathbb{E}_{q(z\mid x)}\left[\log p(x\mid z)\right] - \alpha\, I_q(z;x) - \beta\, D_{\mathrm{KL}}\left(q(z)\,\Big\|\,\textstyle\prod_j q(z_j)\right) - \gamma \sum_j D_{\mathrm{KL}}\left(q(z_j)\,\|\,p(z_j)\right)$$
By dialling up β on the TC term, we more directly push the joint q(z) toward a product of its marginals, which sharpens disentanglement.
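A minimal sketch of a minibatch estimate of the TC term, in the spirit of the β-TCVAE estimator but omitting the dataset-size correction (so it is biased); shapes and names are assumptions:

```python
import math
import torch

def log_normal_diag(z, mu, logvar):
    # Elementwise log-density of a diagonal Gaussian, one value per dimension.
    return -0.5 * (math.log(2 * math.pi) + logvar + (z - mu) ** 2 / logvar.exp())

def total_correlation(z, mu, logvar):
    # Naive minibatch estimate of TC = KL(q(z) || prod_j q(z_j)),
    # where q(z) is the aggregate posterior estimated over the batch.
    B, D = z.shape
    # log q(z_i | x_j) for every pair (i, j), per dimension: shape (B, B, D)
    log_qz_ij = log_normal_diag(z.unsqueeze(1), mu.unsqueeze(0), logvar.unsqueeze(0))
    # log q(z_i): aggregate posterior, joint over dimensions
    log_qz = torch.logsumexp(log_qz_ij.sum(dim=2), dim=1) - math.log(B)
    # log prod_j q(z_ij): product of the aggregate marginals
    log_qz_marginals = (torch.logsumexp(log_qz_ij, dim=1) - math.log(B)).sum(dim=1)
    return (log_qz - log_qz_marginals).mean()
```

With the `beta_vae_loss` sketch above, the objective for α = γ = 1 is then roughly `beta_vae_loss(..., beta=1.0) + (beta - 1.0) * total_correlation(z, mu, logvar)`, i.e. the plain ELBO plus an extra (β − 1)·TC penalty.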
3 InfoGAN
InfoGAN (Chen et al. 2016) is a GAN-based disentangling method in the same family. It augments the usual GAN min–max game with a mutual-information term to coax one part of the latent code to line up with a single factor:

$$\min_{G,Q}\,\max_{D}\; V_{\mathrm{GAN}}(D,G) - \lambda\, I\left(c;\, G(z,c)\right)$$

where $V_{\mathrm{GAN}}(D,G)$ is the standard GAN loss, $z$ is noise, $c$ is a “code” you hope will become interpretable (e.g. rotation, thickness), and the mutual-information term encourages $c$ to actually control something in the output. By maximizing a variational lower bound on $I(c; G(z,c))$, estimated with an auxiliary network $Q(c\mid x)$ that tries to recover the code from the generated image, InfoGAN learns codes whose traversals produce interpretable changes in the output.
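A minimal sketch of that mutual-information term for a categorical code; the networks `G`, `D`, `Q` and the helper `generator_gan_loss` are assumed to exist and are named only for illustration:

```python
import torch
import torch.nn.functional as F

def infogan_mi_loss(q_logits, c_idx, lam=1.0):
    # Variational lower bound on I(c; G(z, c)) for a categorical code:
    # the auxiliary head Q tries to recover the code c that was fed to G.
    # q_logits: Q's predicted logits for the code, shape (B, K)
    # c_idx:    the sampled code indices, shape (B,)
    return lam * F.cross_entropy(q_logits, c_idx)

# Hypothetical use inside the generator update (G, D, Q assumed defined):
#   z      = torch.randn(B, z_dim)
#   c_idx  = torch.randint(0, K, (B,))
#   c      = F.one_hot(c_idx, K).float()
#   fake   = G(torch.cat([z, c], dim=1))
#   g_loss = generator_gan_loss(D(fake)) + infogan_mi_loss(Q(fake), c_idx)
```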
4 Other examples
In general, it seems you pick a backbone (VAE, GAN, diffusion, …) and add one of these disentangling penalties. You have succeeded if traversing a single latent smoothly transforms just one aspect of the output (see the sketch below). A handful of dimensions are then devoted to “rotation”, “colour”, “thickness” etc., and the rest of the model behaves as before, except that now our face generator has a colour knob.
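A tiny sketch of that traversal check, assuming a decoder network and a batch of encoded latents `z`; names are illustrative:

```python
import torch

@torch.no_grad()
def traverse_latent(decoder, z, dim, values):
    # Sweep one latent dimension while holding the rest fixed; if the model is
    # disentangled, only one factor (pose, colour, ...) should change in the output.
    frames = []
    for v in values:
        z_mod = z.clone()
        z_mod[:, dim] = v
        frames.append(decoder(z_mod))
    return torch.stack(frames)

# e.g. traverse_latent(decoder, z, dim=3, values=torch.linspace(-3, 3, 9))
```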
5 Causal
Interesting, and why I got involved in this field in the first place, after seeing Yang et al. (2021). See Causal abstraction for more.
6 Questions
- How many factors can we disentangle?
- How much can this be made unsupervised?
- Are there underexploited tools in this toolkit for Mechanistic interpretability?
- Are there underexploited tools in this toolkit for Developmental interpretability?