Conditional expectation and probability

February 4, 2020 — September 21, 2022

algebra
Bayes
functional analysis
networks
probability

Things I would like to re-derive for my own entertainment:

Conditioning in the sense of measure-theoretic probability: the Kolmogorov formulation, conditioning as a Radon–Nikodym derivative, and the clunkiness of the definition forced on us by the niceties of Lebesgue integration.

H.H. Rugh’s answer is nice.

1 Conditional algebra

TBC

2 Nonparametric

Conditioning in full measure-theoretic glory, as needed for Bayesian nonparametrics. E.g. conditioning of Gaussian processes is also fun.
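
As a concrete instance, here is a minimal sketch of conditioning a Gaussian process on noisy observations. It is plain numpy; the kernel, the toy data, and the helper names `rbf_kernel` / `gp_condition` are all made up for illustration, not from any of the cited papers:

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between two sets of 1-d inputs."""
    sqdist = (x1[:, None] - x2[None, :]) ** 2
    return variance * np.exp(-0.5 * sqdist / lengthscale**2)

def gp_condition(x_train, y_train, x_test, noise_var=1e-2):
    """Mean and covariance of a zero-mean GP at x_test, conditioned
    on noisy observations (x_train, y_train)."""
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_star = rbf_kernel(x_test, x_train)
    # Solve via Cholesky for numerical stability.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_star @ alpha
    v = np.linalg.solve(L, K_star.T)
    cov = rbf_kernel(x_test, x_test) - v.T @ v
    return mean, cov

# Toy data: a noisy sinusoid observed at a handful of points.
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 5.0, 8)
y_train = np.sin(x_train) + 0.1 * rng.standard_normal(8)
x_test = np.linspace(0.0, 5.0, 50)
mean, cov = gp_condition(x_train, y_train, x_test)
```

The posterior mean and covariance here are just the Gaussian conditional expectation and conditional covariance of the BLUE section below, applied to the joint Gaussian of training and test function values.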

3 Disintegration

Chang and Pollard (1997), Kallenberg (2002).

4 BLUE in Gaussian conditioning

E.g. Wilson et al. (2021):

Let \((\Omega, \mathcal{F}, \mathbb{P})\) be a probability space and denote by \((\boldsymbol{a}, \boldsymbol{b})\) a pair of square-integrable, centred random variables on \(\mathbb{R}^{n_{a}} \times \mathbb{R}^{n_{b}}\). The conditional expectation is the unique random variable that solves the optimization problem \[ \mathbb{E}(\boldsymbol{a} \mid \boldsymbol{b})=\underset{\hat{\boldsymbol{a}}=f(\boldsymbol{b})}{\arg \min }\, \mathbb{E}\|\hat{\boldsymbol{a}}-\boldsymbol{a}\|^{2}. \] In words, \(\mathbb{E}(\boldsymbol{a} \mid \boldsymbol{b})\) is the measurable function of \(\boldsymbol{b}\) that best predicts \(\boldsymbol{a}\) in the sense of minimizing the mean squared error above.

Uncorrelated, jointly Gaussian random variables are independent. Consequently, when \(\boldsymbol{a}\) and \(\boldsymbol{b}\) are jointly Gaussian, the optimal predictor \(\mathbb{E}(\boldsymbol{a} \mid \boldsymbol{b})\) manifests as the best linear unbiased estimator \(\hat{\boldsymbol{a}}=\mathbf{S} \boldsymbol{b}\) of \(\boldsymbol{a}\).
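
To spell out the linear step (a standard derivation, not specific to the quoted paper): restricting to linear predictors \(\hat{\boldsymbol{a}}=\mathbf{S}\boldsymbol{b}\) and setting the gradient of \(\mathbb{E}\|\boldsymbol{a}-\mathbf{S}\boldsymbol{b}\|^{2}\) with respect to \(\mathbf{S}\) to zero gives the normal equations \[ \mathbf{S}\operatorname{Cov}(\boldsymbol{b}, \boldsymbol{b})=\operatorname{Cov}(\boldsymbol{a}, \boldsymbol{b}) \quad\Rightarrow\quad \mathbf{S}=\boldsymbol{\Sigma}_{ab}\boldsymbol{\Sigma}_{bb}^{-1}, \] so for centred, jointly Gaussian \((\boldsymbol{a}, \boldsymbol{b})\) we recover the familiar conditional mean \(\mathbb{E}(\boldsymbol{a} \mid \boldsymbol{b})=\boldsymbol{\Sigma}_{ab}\boldsymbol{\Sigma}_{bb}^{-1}\boldsymbol{b}\).

A quick numerical sanity check of that identity, in plain numpy with made-up covariance values (everything below is illustrative, not from Wilson et al.):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical joint covariance for (a, b): a scalar, b 2-dimensional.
Sigma = np.array([[2.0, 0.8, 0.3],
                  [0.8, 1.0, 0.2],
                  [0.3, 0.2, 1.5]])
Sigma_ab = Sigma[:1, 1:]   # Cov(a, b), shape (1, 2)
Sigma_bb = Sigma[1:, 1:]   # Cov(b, b), shape (2, 2)
S = Sigma_ab @ np.linalg.inv(Sigma_bb)  # analytic BLUE coefficients

# Sample jointly Gaussian (a, b) and compare S @ b against a
# least-squares regression of a on b, which estimates E(a | b).
x = rng.multivariate_normal(np.zeros(3), Sigma, size=100_000)
a, b = x[:, 0], x[:, 1:]
S_hat = np.linalg.lstsq(b, a, rcond=None)[0]
print(S.ravel())  # analytic coefficients
print(S_hat)      # empirical coefficients
```

With \(10^{5}\) samples the empirical regression coefficients agree with the analytic \(\mathbf{S}\) to a couple of decimal places.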

5 References

Alquier, and Gerber. 2024. “Universal Robust Regression via Maximum Mean Discrepancy.” Biometrika.
Chang, and Pollard. 1997. “Conditioning as Disintegration.” Statistica Neerlandica.
Cherief-Abdellatif, and Alquier. 2020. “MMD-Bayes: Robust Bayesian Estimation via Maximum Mean Discrepancy.” In Proceedings of The 2nd Symposium on Advances in Approximate Bayesian Inference.
Cuzzolin. 2021. “A Geometric Approach to Conditioning Belief Functions.”
Kallenberg. 2002. Foundations of Modern Probability. Probability and Its Applications.
Matthies, Zander, Rosić, et al. 2016. “Parameter Estimation via Conditional Expectation: A Bayesian Inversion.” Advanced Modeling and Simulation in Engineering Sciences.
Schervish. 2012. Theory of Statistics. Springer Series in Statistics.
Wilson, Borovitskiy, Terenin, et al. 2021. “Pathwise Conditioning of Gaussian Processes.” Journal of Machine Learning Research.