Optimal conditioning

2023-10-16 — 2023-10-16

Wherein the Problem of Selecting Conditioning Variables for Optimal Prediction Is Treated, the Role of Learned Features in Transformers Is Examined, and Compressibility Is Proposed as an Adjunct.

algebra

graphical models

how do science

machine learning

meta learning

networks

probability

statistics

Working out what we need to condition on to make the best possible prediction in generic learning algorithms.
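As a toy illustration of what "conditioning for the best prediction" can mean: under squared-error loss, the best predictor of a target given some variables is the conditional mean, and conditioning on a richer set of variables cannot increase the in-sample error of that predictor. The sketch below checks this on simulated data. The data-generating process, the function names `simulate` and `conditional_mean_mse`, and the choice of squared error are all assumptions made for illustration; none of them come from the note itself.

```python
import random
from collections import defaultdict
from statistics import fmean


def simulate(n=5000, seed=0):
    """Hypothetical toy data: two binary covariates and a noisy linear target."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = (rng.randint(0, 1), rng.randint(0, 1))
        y = x[0] + 2 * x[1] + rng.gauss(0, 0.5)
        rows.append((x, y))
    return rows


def conditional_mean_mse(rows, cond_idx):
    """In-sample MSE when y is predicted by its mean within each cell
    defined by the covariates listed in cond_idx."""
    cells = defaultdict(list)
    for x, y in rows:
        cells[tuple(x[i] for i in cond_idx)].append(y)
    means = {key: fmean(ys) for key, ys in cells.items()}
    errors = [(y - means[tuple(x[i] for i in cond_idx)]) ** 2 for x, y in rows]
    return fmean(errors)


rows = simulate()
# Condition on nothing, on the first covariate only, then on both.
mse = {s: conditional_mean_mse(rows, s) for s in [(), (0,), (1,), (0, 1)]}
for subset, value in mse.items():
    print(subset, round(value, 3))
```

In this toy setup the error drops as the conditioning set grows and approaches the noise variance (0.25) once both covariates are included. The interesting cases are the ones the toy omits: many candidate variables, finite data per cell, and features that have to be learned rather than given.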

Thinkbubble: in a transformer, could we think of words or concepts as learned conditioning features? I think we might need something extra to make that work, such as compressibility.
