Conditional expectation and probability
February 4, 2020 — September 21, 2022
Things I would like to re-derive for my own entertainment:
Conditioning in the sense of measure-theoretic probability: the Kolmogorov formulation, conditioning as a Radon-Nikodym derivative, and the clunkiness of the definition due to the niceties of Lebesgue integration.
H.H. Rugh’s answer is nice.
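For the record, the definition in question (standard textbook material, stated from memory): given an integrable random variable \(X\) on \((\Omega, \mathcal{F}, \mathbb{P})\) and a sub-\(\sigma\)-algebra \(\mathcal{G} \subseteq \mathcal{F}\), the conditional expectation \(\mathbb{E}(X \mid \mathcal{G})\) is the (almost surely unique) \(\mathcal{G}\)-measurable, integrable random variable satisfying \[ \int_{G} \mathbb{E}(X \mid \mathcal{G}) \,\mathrm{d}\mathbb{P}=\int_{G} X \,\mathrm{d}\mathbb{P} \quad \text{for all } G \in \mathcal{G}. \] Existence is exactly where the Radon-Nikodym theorem enters: the (signed) measure \(\nu(G)=\int_{G} X \,\mathrm{d}\mathbb{P}\) is absolutely continuous with respect to \(\left.\mathbb{P}\right|_{\mathcal{G}}\), so \(\mathbb{E}(X \mid \mathcal{G})=\mathrm{d}\nu / \mathrm{d}\left.\mathbb{P}\right|_{\mathcal{G}}\).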
1 Conditional algebra
TBC
2 Nonparametric
Conditioning in full measure-theoretic glory, as required for Bayesian nonparametrics. E.g. conditioning of Gaussian processes is also fun; a quick sketch below.
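A minimal sketch of Gaussian process conditioning in the finite-dimensional textbook form (the kernel, data, and noise level are illustrative assumptions of mine, not from any particular source): the posterior at test points is again Gaussian, with mean \(K_{*n}(K_{nn}+\sigma^{2} I)^{-1}\boldsymbol{y}\) and covariance \(K_{**}-K_{*n}(K_{nn}+\sigma^{2} I)^{-1}K_{n*}\).

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale=0.5):
    """Squared-exponential kernel on scalar inputs."""
    return np.exp(-0.5 * (x1[:, None] - x2[None, :]) ** 2 / lengthscale**2)

rng = np.random.default_rng(1)
x_train = rng.uniform(0.0, 5.0, size=8)
y_train = np.sin(x_train) + 0.1 * rng.standard_normal(8)  # noisy observations
x_test = np.linspace(0.0, 5.0, 100)
noise_var = 0.1**2

# Gaussian conditioning: the posterior over f(x_test) given the noisy
# observations is again Gaussian, with the usual mean/covariance formulae.
K_nn = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
K_sn = rbf_kernel(x_test, x_train)
K_ss = rbf_kernel(x_test, x_test)

post_mean = K_sn @ np.linalg.solve(K_nn, y_train)
post_cov = K_ss - K_sn @ np.linalg.solve(K_nn, K_sn.T)

# Pointwise posterior standard deviation (clip tiny negative round-off).
post_std = np.sqrt(np.maximum(np.diag(post_cov), 0.0))
print(post_mean[:3], post_std[:3])
```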
3 Disintegration
4 BLUE in Gaussian conditioning
e.g. Wilson et al. (2021):
Let \((\Omega, \mathcal{F}, \mathbb{P})\) be a probability space and let \((\boldsymbol{a}, \boldsymbol{b})\) be a pair of square-integrable, centred random variables taking values in \(\mathbb{R}^{n_{a}} \times \mathbb{R}^{n_{b}}\). The conditional expectation is the (almost surely unique) random variable that solves the optimization problem \[ \mathbb{E}(\boldsymbol{a} \mid \boldsymbol{b})=\underset{\hat{\boldsymbol{a}}=f(\boldsymbol{b})}{\arg \min }\, \mathbb{E}\|\hat{\boldsymbol{a}}-\boldsymbol{a}\|^{2}, \] where the minimum is taken over measurable functions \(f\) of \(\boldsymbol{b}\). In words then, \(\mathbb{E}(\boldsymbol{a} \mid \boldsymbol{b})\) is the measurable function of \(\boldsymbol{b}\) that best predicts \(\boldsymbol{a}\) in the sense of minimizing the mean squared error above.
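Equivalently (a standard fact worth recording here): this arg-min is an orthogonal projection in \(L^{2}\), so the residual is orthogonal to every square-integrable function of \(\boldsymbol{b}\), \[ \mathbb{E}\left[\left(\boldsymbol{a}-\mathbb{E}(\boldsymbol{a} \mid \boldsymbol{b})\right) g(\boldsymbol{b})\right]=\mathbf{0} \quad \text{for all square-integrable scalar } g, \] which is the property that makes the jointly Gaussian case below come out linear.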
Uncorrelated, jointly Gaussian random variables are independent. Consequently, when \(\boldsymbol{a}\) and \(\boldsymbol{b}\) are jointly Gaussian, the optimal predictor \(\mathbb{E}(\boldsymbol{a} \mid \boldsymbol{b})\) is linear in \(\boldsymbol{b}\): it coincides with the best linear unbiased estimator \(\hat{\boldsymbol{a}}=\mathbf{S} \boldsymbol{b}\) of \(\boldsymbol{a}\).
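To pin down \(\mathbf{S}\): restricting the minimization to linear predictors \(\hat{\boldsymbol{a}}=\mathbf{S}\boldsymbol{b}\) and setting the gradient of the mean squared error to zero yields the normal equations \[ \mathbf{S} \operatorname{Cov}(\boldsymbol{b}, \boldsymbol{b})=\operatorname{Cov}(\boldsymbol{a}, \boldsymbol{b}), \quad \text{i.e.} \quad \mathbf{S}=\operatorname{Cov}(\boldsymbol{a}, \boldsymbol{b}) \operatorname{Cov}(\boldsymbol{b}, \boldsymbol{b})^{-1}, \] assuming \(\operatorname{Cov}(\boldsymbol{b}, \boldsymbol{b})\) is invertible.

A minimal numerical sketch of this (my own, not from Wilson et al.; all names are illustrative): build a random centred jointly Gaussian pair, form \(\mathbf{S}\) from the covariance blocks, and check that \(\mathbf{S}\boldsymbol{b}\) attains the theoretical minimum mean squared error while perturbed linear predictors do worse.

```python
import numpy as np

rng = np.random.default_rng(0)
n_a, n_b, n_samples = 2, 3, 200_000

# Random centred jointly Gaussian pair (a, b) with a known joint covariance.
L = rng.standard_normal((n_a + n_b, n_a + n_b))
C = L @ L.T  # symmetric positive definite (almost surely)
C_aa, C_ab, C_bb = C[:n_a, :n_a], C[:n_a, n_a:], C[n_a:, n_a:]

# BLUE coefficients from the normal equations: S = Cov(a,b) Cov(b,b)^{-1}.
S = C_ab @ np.linalg.inv(C_bb)

z = rng.multivariate_normal(np.zeros(n_a + n_b), C, size=n_samples)
a, b = z[:, :n_a], z[:, n_a:]

def mse(S_hat):
    """Empirical mean squared error of the linear predictor b -> S_hat b."""
    resid = a - b @ S_hat.T
    return np.mean(np.sum(resid**2, axis=1))

print("MSE of S b:          ", mse(S))
print("theoretical minimum: ", np.trace(C_aa - S @ C_ab.T))
print("MSE of perturbed S b:", mse(S + 0.1 * rng.standard_normal(S.shape)))
```

The theoretical minimum here is \(\operatorname{tr}\left(\operatorname{Cov}(\boldsymbol{a}, \boldsymbol{a})-\mathbf{S} \operatorname{Cov}(\boldsymbol{b}, \boldsymbol{a})\right)\), the trace of the conditional covariance; the empirical MSE of \(\mathbf{S}\boldsymbol{b}\) should match it up to Monte Carlo error.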