Kalman-Bucy filter and variants, recursive estimation, predictive state models, data assimilation: a particular sub-field of signal processing for models with hidden state.

In statistical terms, state filters are a kind of online-updating hierarchical model for sequential observations of a dynamical system, where the random state is unobserved but you can get an optimal estimate of it from incoming measurements and known parameters.

A unifying feature of all these is that, by assuming a sparse influence graph between observations and dynamics, you can estimate behaviour using efficient message passing.

This is a twin problem to
optimal control.
If I wish to tackle this problem from the perspective of *observations* rather than true state, perhaps I could do it from the perspective of Koopman operators.

## Linear dynamical systems

In Kalman filters *per se* you are usually concerned with multivariate real
vector signals representing different axes of some telemetry data problem.
In the degenerate case, where there is no observation noise, you can just
design a linear filter.

The classic Kalman filter (R. E. Kalman 1960) assumes a linear model with Gaussian noise, although it might work with not-quite-Gaussian, not-quite-linear models if you prod it. You can extend this flavour to somewhat more general dynamics; see below.

NB I’m conflating linear observation and linear process models, but this is all easier to explain once we have the basic model laid out.

There are a large number of equivalent formulations of the Kalman filter. The notation of Fearnhead and Künsch (2018) is representative. They start from the usual state filter setting: The state process \(\left(\mathbf{X}_{t}\right)\) is assumed to be Markovian and the \(i\)-th observation, \(\mathbf{Y}_{i}\), depends only on the state at time \(i, \mathbf{X}_{i}\), so that the evolution and observation variates are defined by \[ \begin{aligned} \mathbf{X}_{t} \mid\left(\mathbf{x}_{0: t-1}, \mathbf{y}_{1: t-1}\right) & \sim P\left(d \mathbf{x}_{t} \mid \mathbf{x}_{t-1}\right), \quad \mathbf{X}_{0} \sim \pi_{0}\left(d \mathbf{x}_{0}\right) \\ \mathbf{Y}_{t} \mid\left(\mathbf{x}_{0: t}, \mathbf{y}_{1: t-1}\right) & \sim g\left(\mathbf{y}_{t} \mid \mathbf{x}_{t}\right) d \nu\left(\mathbf{y}_{t}\right) \end{aligned} \] with joint distribution \[ \left(\mathbf{X}_{0: s}, \mathbf{Y}_{1: t}\right) \sim \pi_{0}\left(d \mathbf{x}_{0}\right) \prod_{i=1}^{s} P\left(d \mathbf{x}_{i} \mid \mathbf{x}_{i-1}\right) \prod_{j=1}^{t} g\left(\mathbf{y}_{j} \mid \mathbf{x}_{j}\right) \nu\left(d \mathbf{y}_{j}\right), \quad s \geq t. \]

Integrating out the path of the state process, we obtain that \[\begin{aligned} \mathbf{Y}_{1: t} &\sim p\left(\mathbf{y}_{1: t}\right) \prod_{j} \nu\left(d \mathbf{y}_{j}\right)\text{, where}\\ p\left(\mathbf{y}_{1: t}\right) &=\int \pi_{0}\left(d \mathbf{x}_{0}\right) \prod_{i=1}^{t} P\left(d \mathbf{x}_{i} \mid \mathbf{x}_{i-1}\right) \prod_{j=1}^{t} g\left(\mathbf{y}_{j} \mid \mathbf{x}_{j}\right). \end{aligned} \] We wish to find the distribution \(\pi_{0: s \mid t}=\frac{p(\mathbf{y}_{1: t},\mathbf{x}_{0:s})}{p(\mathbf{y}_{1: t})}\) (by Bayes’ rule). We deduce the recursion \[ \begin{aligned} \pi_{0: t \mid t-1}\left(d \mathbf{x}_{0: t} \mid \mathbf{y}_{1: t-1}\right) &=\pi_{0: t-1 \mid t-1}\left(d \mathbf{x}_{0: t-1} \mid \mathbf{y}_{1: t-1}\right) P\left(d \mathbf{x}_{t} \mid \mathbf{x}_{t-1}\right) &\text{ prediction}\\ \pi_{0: t \mid t}\left(d \mathbf{x}_{0: t} \mid \mathbf{y}_{1: t}\right) &=\pi_{0: t \mid t-1}\left(d \mathbf{x}_{0: t} \mid \mathbf{y}_{1: t-1}\right) \frac{g\left(\mathbf{y}_{t} \mid \mathbf{x}_{t}\right)}{p\left(\mathbf{y}_{t} \mid \mathbf{y}_{1: t-1}\right)} &\text{ correction} \end{aligned} \] where \[ p\left(\mathbf{y}_{t} \mid \mathbf{y}_{1: t-1}\right)=\frac{p\left(\mathbf{y}_{1: t}\right)}{p\left(\mathbf{y}_{1: t-1}\right)}=\int \pi_{t \mid t-1}\left(d \mathbf{x}_{t} \mid \mathbf{y}_{1: t-1}\right) g\left(\mathbf{y}_{t} \mid \mathbf{x}_{t}\right) . 
\] Integrating out all but the latest state (marginalising over \(\mathbf{x}_{0: t-1}\)) gives us the one-step recursion \[ \begin{aligned} \pi_{t \mid t-1}\left(d \mathbf{x}_{t} \mid \mathbf{y}_{1: t-1}\right) &=\int \pi_{t-1}\left(d \mathbf{x}_{t-1} \mid \mathbf{y}_{1: t-1}\right) P\left(d \mathbf{x}_{t} \mid \mathbf{x}_{t-1}\right) &\text{ prediction}\\ \pi_{t}\left(d \mathbf{x}_{t} \mid \mathbf{y}_{1: t}\right) &=\pi_{t \mid t-1}\left(d \mathbf{x}_{t} \mid \mathbf{y}_{1: t-1}\right) \frac{g\left(\mathbf{y}_{t} \mid \mathbf{x}_{t}\right)}{p\left(\mathbf{y}_{t} \mid \mathbf{y}_{1: t-1}\right)}&\text{ correction} \end{aligned} \]
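In the linear-Gaussian special case both steps have closed forms, and the correction becomes a gain-weighted update. A minimal sketch in NumPy (the interface and the matrices `A`, `Q`, `H`, `R` are illustrative, not from any particular library):

```python
import numpy as np

def kalman_step(m, P, y, A, Q, H, R):
    """One predict/correct cycle of the classic linear-Gaussian Kalman filter.

    m, P : posterior mean and covariance of x_{t-1} given y_{1:t-1}
    A, Q : transition matrix and process-noise covariance
    H, R : observation matrix and observation-noise covariance
    """
    # Prediction: push the Gaussian posterior through the linear dynamics.
    m_pred = A @ m
    P_pred = A @ P @ A.T + Q
    # Correction: condition the predictive Gaussian on the new observation.
    S = H @ P_pred @ H.T + R              # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)   # Kalman gain
    m_new = m_pred + K @ (y - H @ m_pred)
    P_new = (np.eye(len(m)) - K @ H) @ P_pred
    return m_new, P_new

# Scalar random-walk example, observed with unit noise.
A = np.array([[1.0]]); Q = np.array([[0.1]])
H = np.array([[1.0]]); R = np.array([[1.0]])
m, P = np.zeros(1), np.eye(1)
m, P = kalman_step(m, P, np.array([1.0]), A, Q, H, R)
```

Note that the gain and covariance recursion do not depend on the observed values, only on the model matrices, which is why the covariance can be precomputed offline in the time-invariant case.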

If we approximate the filter distribution \(\pi_t\) with a Monte Carlo sample, we are doing particle filtering, which Fearnhead and Künsch (2018) refer to as *bootstrap filtering*.
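In that Monte Carlo approximation the same predict/correct recursion becomes propagate/reweight/resample. A sketch of one bootstrap-filter step, where the `propagate` and `loglik` callables stand in for the transition kernel \(P\) and the observation density \(g\) of a toy model, not any specific library API:

```python
import numpy as np

rng = np.random.default_rng(42)

def bootstrap_step(particles, y, propagate, loglik):
    """One predict/correct step of the bootstrap particle filter.

    particles : (N, d) array of samples approximating the filter at t-1
    propagate : samples from the transition kernel P(dx_t | x_{t-1})
    loglik    : evaluates the observation log-density log g(y_t | x_t)
    """
    # Prediction: propagate every particle through the state dynamics.
    particles = propagate(particles)
    # Correction: weight each particle by the observation likelihood.
    logw = loglik(y, particles)
    w = np.exp(logw - logw.max())
    w /= w.sum()
    # Multinomial resampling keeps the approximation from degenerating.
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]

# Toy model: AR(1) state, Gaussian observation of the first coordinate.
propagate = lambda x: 0.9 * x + 0.5 * rng.standard_normal(x.shape)
loglik = lambda y, x: -0.5 * (y - x[:, 0]) ** 2
particles = np.zeros((500, 1))
particles = bootstrap_step(particles, 2.0, propagate, loglik)
```

Resampling at every step, as here, is the plain bootstrap flavour; fancier variants resample adaptively based on the effective sample size.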

TODO: implied Kalman gain etc.

## Non-linear dynamical systems

Cute exercise: you can derive an analytic Kalman-type filter for any observation noise and process dynamics with a Bayesian conjugate structure, and this leads to filters with nonlinear behaviour. Multivariate distributions are a bit of a mess for non-Gaussians, though, and a beta-Kalman filter feels contrived.

The upshot is that the non-linear extensions don’t usually rely on non-Gaussian conjugate distributions and analytic forms, but rather make some Gaussian/linear approximation, or use randomised methods such as particle filters.

For some examples in Stan see Sinhrks’ stan-statespace.

## As errors-in-variables models

see, e.g. Bagge Carlson (2018).

## Unscented Kalman filter

i.e. using the unscented transform.
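The core idea: instead of linearising, propagate a small deterministic set of sigma points through the nonlinearity and re-fit a Gaussian to their images. A sketch with the standard Merwe-style scaling parameters (illustrative, not any particular library’s API):

```python
import numpy as np

def unscented_sigma_points(m, P, alpha=1e-3, beta=2.0, kappa=0.0):
    """Generate 2n+1 sigma points and their mean/covariance weights."""
    n = len(m)
    lam = alpha**2 * (n + kappa) - n
    L = np.linalg.cholesky((n + lam) * P)   # columns are scaled sqrt(P) axes
    pts = np.vstack([m, m + L.T, m - L.T])  # centre point plus symmetric pairs
    wm = np.full(2 * n + 1, 1.0 / (2 * (n + lam)))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = lam / (n + lam) + (1.0 - alpha**2 + beta)
    return pts, wm, wc

def unscented_transform(f, m, P):
    """Approximate the pushforward of N(m, P) through a nonlinearity f."""
    pts, wm, wc = unscented_sigma_points(m, P)
    fx = np.array([f(p) for p in pts])
    m_new = wm @ fx                          # weighted mean of the images
    d = fx - m_new
    P_new = (wc[:, None] * d).T @ d          # weighted covariance of the images
    return m_new, P_new

# The transform is exact for affine maps; in particular a Gaussian N(1, 2)
# is recovered unchanged under the identity.
m_new, P_new = unscented_transform(lambda x: x, np.array([1.0]), np.array([[2.0]]))
```

The unscented Kalman filter applies this transform twice per step, once for the dynamics and once for the observation map, then does the usual Gaussian conditioning.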

## Variational state filters

## Kalman filtering Gaussian processes

## Ensemble Kalman filters

## State filter inference

How about learning the *parameters* of the model generating your states?
Ways that you can do this in dynamical systems include basic
linear system identification and general system identification.

## References

*IEEE Transactions on Automatic Control*18 (6): 601–7.

*IEEE Transactions on Signal Processing*40 (6): 1548–62.

*The Annals of Statistics*13 (4): 1286–316.

*IEEE Transactions on Signal Processing*50 (2): 174–88.

*Journal of Multivariate Analysis*120 (September): 1–17.

*arXiv:1702.05390 [Physics, Stat]*, February.

*International Conference on Machine Learning*, 544–52.

*Geophysical Prospecting*24 (1): 141–97.

*International Computer Science Institute*4 (510): 126.

*SIAM Journal on Control and Optimization*55 (6): 4015–47.

*arXiv:1701.05978 [Math]*, January.

*The Annals of Applied Statistics*3 (1): 319–48.

*Proceedings of the National Academy of Sciences*113 (15): 3932–37.

*Digital Signal Processing*23 (3): 751–70.

*Compressed Sensing & Sparse Filtering*, edited by Avishy Y. Carmi, Lyudmila Mihaylova, and Simon J. Godsill, 281–324. Signals and Communication Technology. Springer Berlin Heidelberg.

*IEEE Transactions on Medical Imaging*34 (4): 846–60.

*Journal of The Royal Society Interface*5 (25): 885–97.

*IEEE Transactions on Signal Processing*64 (21): 5644–56.

*Econometric Theory*28 (01): 130–78.

*IEEE Transactions on Signal Processing*60 (8): 3978–87.

*Advances in Neural Information Processing Systems 28*, edited by C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, 2980–88. Curran Associates, Inc.

*Ecology*85 (11): 3140–50.

*An Introduction to State Space Time Series Analysis*. 1 edition. Oxford ; New York: Oxford University Press.

*International Journal of Approximate Reasoning*104 (January): 185–204.

*Journal of the American Statistical Association*94 (448): 1330–39.

*Journal of Computational and Graphical Statistics*19 (3): 724–45.

*Statistics for Spatio-Temporal Data*. Wiley Series in Probability and Statistics 2.0. John Wiley and Sons.

*SIAM Journal on Control and Optimization*55 (1): 119–55.

*arXiv:1304.5768 [Stat]*, April.

*Biometrika*84 (3): 669–84.

*Time Series Analysis by State Space Methods*. 2nd ed. Oxford Statistical Science Series 38. Oxford: Oxford University Press.

*IEEE Transactions on Information Theory*19 (1): 19–28.

*IEEE Transactions on Information Theory*19 (1): 29–37.

*arXiv:2006.13429 [Cs, Math]*, June.

*Current Opinion in Structural Biology*6 (3): 361–65.

*Neural Computation*16 (5): 971–98.

*Statistical Modelling*15 (4): 301–25.

*Advances in Neural Information Processing Systems 30*, edited by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, 5309–19. Curran Associates, Inc.

*Annual Review of Statistics and Its Application*5 (1): 421–49.

*arXiv:1606.08650 [Stat]*, June.

*arXiv:1711.00799 [Stat]*, November.

*Advances in Neural Information Processing Systems 29*, edited by D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, 2199–2207. Curran Associates, Inc.

*Hidden Markov Models and Dynamical Systems*. Philadelphia, PA: Society for Industrial and Applied Mathematics.

*Cambridge University Engineering Department, Cambridge, England, Technical Report TR-328*.

*Proceedings of the 11th International Conference on Neural Information Processing Systems*, 410–16. NIPS’98. Cambridge, MA, USA: MIT Press.

*1975 IEEE Conference on Decision and Control Including the 14th Symposium on Adaptive Processes*, 57–58.

*Advances in Neural Information Processing Systems 27*, edited by Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, 3680–88. Curran Associates, Inc.

*Advances in Neural Information Processing Systems 26*, edited by C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Q. Weinberger, 3156–64. Curran Associates, Inc.

*NeuroImage*41 (3): 747–66.

*IEEE Transactions on Automatic Control*18 (6): 588–600.

*arXiv:2007.07383 [Physics, Stat]*, July.

*Journal of Time Series Analysis*, January, n/a–.

*Advances in Neural Information Processing Systems*, 34:572–85. Curran Associates, Inc.

*arXiv:1805.08034 [Cs, Math]*, May.

*arXiv:1611.05414 [Physics, Stat]*, November.

*2010 IEEE International Workshop on Machine Learning for Signal Processing*, 379–84. Kittila, Finland: IEEE.

*Encyclopedia of Biostatistics*. John Wiley & Sons, Ltd.

*Journal of the American Statistical Association*109 (507): 1112–22.

*Journal of The Royal Society Interface*7 (43): 271–83.

*arXiv:1505.05310 [Cs, Stat]*, May.

*International Journal of Systems Science*39 (10): 925–46.

*arXiv:1610.00195 [Physics, Stat]*, October.

*In Proceedings of INTERSPEECH*.

*Journal of Computer and System Sciences*, JCSS Special Issue: Cloud Computing 2011, 78 (5): 1460–80.

*Pattern Recognition Letters*45 (August): 85–91.

*Proceedings of the National Academy of Sciences*103 (49): 18438–43.

*The Annals of Statistics*39 (3): 1776–1802.

*Scis & Isis*2006: 1866–71.

*arXiv:1204.2477 [Cs, Stat]*, April.

*American Control Conference, Proceedings of the 1995*, 3:1628–1632 vol.3.

*IEEE Transactions on Information Theory*17 (5): 530–49.

*IEEE Transactions on Information Theory*20 (2): 146–81.

*IEEE Transactions on Information Theory*18 (6): 730–45.

*IEEE Transactions on Automatic Control*16 (6): 720–27.

*IEEE Transactions on Automatic Control*18 (5): 435–53.

*IEEE Transactions on Information Theory*21 (1): 15–23.

*IRE Transactions on Automatic Control*4 (3): 110–10.

*Journal of Basic Engineering*82 (1): 35.

*Signal Processing*91 (8): 1910–19.

*2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP)*, 1–6. Vietri sul Mare, Salerno, Italy: IEEE.

*Nonlinearity*27 (10): 2579.

*Bayesian Analysis*14 (4): 1037–73.

*Journal of the American Statistical Association*82 (400): 1032–41.

*Journal of Computational and Graphical Statistics*5 (1): 1–25.

*Smoothness Priors Analysis of Time Series*. Lecture notes in statistics 116. New York, NY: Springer New York : Imprint : Springer.

*Probability, Random Processes, and Statistical Analysis: Applications to Communications, Signal Processing, Queueing Theory and Mathematical Finance*. Cambridge University Press.

*Journal of Time Series Analysis*21 (3): 281–96.

*Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence*, 2101–9.

*BMC Neuroscience*16 (Suppl 1): P196.

*Journal of Machine Learning Research*11 (Jun): 1865–81.

*Monthly Weather Review*138 (4): 1293–1306.

*arXiv:1703.08596 [Cs, Math, Stat]*, March.

*Journal of the Royal Statistical Society: Series B (Statistical Methodology)*73 (4): 423–98.

*IEEE Transactions on Information Theory*22 (4): 488–91.

*1975 IEEE Conference on Decision and Control Including the 14th Symposium on Adaptive Processes*, 55–56.

*Proceedings of the IEEE*95 (6): 1295–1322.

*IEEE Transactions on Signal Processing*46 (9): 2431–47.

*Journal of Process Control*, DYCOPS-CAB 2016, 60 (December): 82–94.

*Proceedings of ICLR*.

*WIREs Computational Statistics*n/a (n/a): e1532.

*Journal of Computational and Applied Mathematics*119 (1–2): 301–31.

*Journal of Agricultural, Biological and Environmental Statistics*25 (1): 1–16.

*International Conference on Machine Learning*, 3789–98.

*44th IEEE Conference on Decision and Control, 2005 and 2005 European Control Conference. CDC-ECC ’05*, 8179–84. Seville, Spain: IEEE.

*arXiv:1703.00209 [Math, Stat]*, March.

*Principles and Practice of Constraint Programming*, 341–50. Lecture Notes in Computer Science. Switzerland: Springer, Cham.

*IEEE Spectrum*47 (5): 47–50.

*IEEE Control Systems*33 (3): 40–54.

*Stochastic systems: theory and applications*. River Edge, NJ: World Scientific.

*Automatica*18 (6): 685–96.

*Journal of Machine Learning Research*6 (Dec): 1939–59.

*IEEE ASSP Magazine*3 (1): 4–16.

*Proceedings of the IEEE*77 (2): 257–86.

*Stochastic Control*, edited by N. K. Sinha and L. A. Telksnys, 183–88. IFAC Symposia Series. Oxford: Pergamon.

*2010 13th International Conference on Information Fusion*, 1–9.

*ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)*, 3905–9.

*Proceedings of the 7th International Conference on New Interfaces for Musical Expression*, 234–37. NIME ’07. New York, NY, USA: ACM.

*Proceedings of the International Computer Music Conference 2011*.

*Proceedings of European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases*.

*Journal of Time Series Analysis*30 (2): 167–78.

*EURASIP Journal on Advances in Signal Processing*2017 (1): 56.

*Journal of Computer and Systems Sciences International*52 (6): 866–92.

*2013 IEEE International Workshop on Machine Learning for Signal Processing (MLSP)*, 1–6.

*IEEE Transactions on Automatic Control*52 (9): 1631–41.

*Bayesian Filtering and Smoothing*. Institute of Mathematical Statistics Textbooks 3. Cambridge, U.K. ; New York: Cambridge University Press.

*Artificial Intelligence and Statistics*.

*IEEE Transactions on Automatic Control*54 (3): 596–600.

*IEEE Signal Processing Magazine*30 (4): 51–61.

*Advances In Neural Information Processing Systems*, 5006–14.

*arXiv:2103.10153 [Cs, Stat]*, June.

*IEEE Transactions on Information Theory*21 (2): 143–49.

*IEEE Spectrum*7 (7): 63–68.

*The Annals of Applied Statistics*7 (4): 2157–79.

*Journal of the American Statistical Association*, March, 1–31.

*Proceedings of the International Conference on Machine Learning*. Bled, Slovenia.

*Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics*, 868–75.

*Physica D: Nonlinear Phenomena*, Data Assimilation, 230 (1): 1–16.

*Environmental and Ecological Statistics*5 (2): 117–54.

*2007 5th International Symposium on Image and Signal Processing and Analysis*, 435–40. Istanbul, Turkey: IEEE.
