State filtering

Kalman and friends

Kalman-Bucy filter and variants, recursive estimation, predictive state models, data assimilation. A particular sub-field of signal processing for models with hidden state.

In statistical terms, state filters are a kind of online-updating hierarchical model for sequential observations of a dynamical system where the random state is unobserved, but you can get an optimal estimate of it based on incoming measurements and known parameters.

A unifying feature of all these is that, by assuming a sparse influence graph between observations and dynamics, you can estimate behaviour using efficient message passing.

This is a twin problem to optimal control.

Linear systems

In Kalman filters per se you are usually concerned with multivariate real vector signals representing different axes of some telemetry data problem. In the degenerate case, where there is no observation noise, you can just design a linear filter.

The classic Kalman filter (Kalm60) assumes a linear model with Gaussian noise, although it might work with not-quite Gaussian, not-quite linear models if you prod it. You can extend this flavour to somewhat more general dynamics.
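To make the linear-Gaussian case concrete, here is a minimal sketch of the predict/update recursion for a scalar state. The model and the names `a`, `q`, `h`, `r` (transition coefficient, process noise variance, observation coefficient, observation noise variance) are illustrative assumptions, not anything taken from Kalm60; the multivariate version replaces the scalars with matrices.

```python
def kalman_step(mean, var, y, a=1.0, q=0.1, h=1.0, r=0.5):
    """One predict/update cycle of a scalar linear-Gaussian Kalman filter.

    Assumed model (for illustration only):
        x_t = a * x_{t-1} + process noise with variance q
        y_t = h * x_t     + observation noise with variance r
    """
    # Predict: push the previous posterior through the linear dynamics.
    pred_mean = a * mean
    pred_var = a * a * var + q
    # Update: correct the prediction using the new measurement.
    innovation = y - h * pred_mean
    innovation_var = h * h * pred_var + r
    gain = pred_var * h / innovation_var
    new_mean = pred_mean + gain * innovation
    new_var = (1.0 - gain * h) * pred_var
    return new_mean, new_var


def kalman_filter(observations, mean0=0.0, var0=1.0, **params):
    """Run the filter over a sequence; return posterior means and variances."""
    mean, var = mean0, var0
    means, variances = [], []
    for y in observations:
        mean, var = kalman_step(mean, var, y, **params)
        means.append(mean)
        variances.append(var)
    return means, variances
```

Calling `kalman_filter(ys, a=1.0, q=0.0, h=1.0, r=1.0)` reduces to recursively averaging the observations against a standard normal prior, which is a handy sanity check.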

If you are doing telemetry then you probably know a priori that your model is not linear, in which case extensions are advisable.

(NB I’m conflating linear observation and linear process models here, but this is fine for a link list, I think.)

Non-linear dynamical systems

Cute exercise: you can derive an analytic Kalman-style filter for any noise and process dynamics that have a Bayesian conjugate prior, and this leads to filters with nonlinear behaviour. Multivariate distributions are a bit of a mess for non-Gaussians, though, and a beta-Kalman filter feels contrived.

Upshot is, the non-linear extensions don’t usually rely on non-Gaussian conjugate distributions and analytic forms, but rather do some Gaussian/linear approximation, or use randomised methods such as particle filters.
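As an illustration of the randomised route, here is a minimal sketch of a bootstrap particle filter for a scalar state. The callables `transition`, `likelihood` and `init_sampler` are hypothetical hooks invented for this example, and multinomial resampling at every step is just the simplest choice, not a recommendation.

```python
import random


def bootstrap_particle_filter(observations, transition, likelihood,
                              init_sampler, n_particles=500, rng=None):
    """Bootstrap particle filter; returns the filtered mean at each step.

    transition(x, rng) -> a sample of the next state given current state x
    likelihood(y, x)   -> p(y | x), up to a constant factor
    init_sampler(rng)  -> a sample from the prior over the initial state
    """
    rng = rng or random.Random()
    particles = [init_sampler(rng) for _ in range(n_particles)]
    filtered_means = []
    for y in observations:
        # Propagate each particle through the (possibly nonlinear) dynamics.
        particles = [transition(x, rng) for x in particles]
        # Weight each particle by how well it explains the observation.
        weights = [likelihood(y, x) for x in particles]
        total = sum(weights)
        if total == 0.0:
            # All weights underflowed; fall back to uniform weights.
            weights = [1.0] * n_particles
            total = float(n_particles)
        filtered_means.append(
            sum(w * x for w, x in zip(weights, particles)) / total)
        # Multinomial resampling concentrates particles on plausible states.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return filtered_means
```

Nothing here requires Gaussian noise or linear dynamics; the price is Monte Carlo error, which grows badly with state dimension.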

For an example of doing this in Stan, see Sinhrks’ stan-statespace.

Discrete state Hidden Markov models

🏗 Viterbi algorithm.
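Until this section is filled in, here is a minimal sketch of the Viterbi recursion for a discrete hidden Markov model, which finds the single most probable hidden state path. The dictionary-based interface (`start_p`, `trans_p`, `emit_p`) is an assumption made for readability, and the observation sequence is assumed to be non-empty.

```python
import math


def viterbi(observations, start_p, trans_p, emit_p):
    """Most probable hidden state path for a discrete HMM.

    start_p[s]    : probability of starting in state s
    trans_p[s][t] : probability of moving from state s to state t
    emit_p[s][o]  : probability of emitting symbol o from state s
    Works in log space to avoid underflow on long sequences.
    Returns (path, log-probability of that path and the observations).
    """
    def log(p):
        return math.log(p) if p > 0.0 else float("-inf")

    states = list(start_p)
    # best[s] = log-probability of the best path so far ending in state s.
    best = {s: log(start_p[s]) + log(emit_p[s][observations[0]])
            for s in states}
    backpointers = []
    for obs in observations[1:]:
        new_best, pointers = {}, {}
        for t in states:
            # Pick the best predecessor for state t.
            prev = max(states, key=lambda s: best[s] + log(trans_p[s][t]))
            new_best[t] = (best[prev] + log(trans_p[prev][t])
                           + log(emit_p[t][obs]))
            pointers[t] = prev
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]
```

This is the max-product cousin of the forward (filtering) recursion: swap the `max` for a sum and you get filtered state probabilities instead of a best path.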

Variational state filters

See Variational state filters.

Kalman Filtering Gaussian Processes

See Kalman Filtering Gaussian Processes.

State filter inference

How about learning the parameters of the model generating your states? Ways that you can do this in dynamical systems include basic linear system identification and general system identification. But can you identify the parameters (not just hidden states) with a state filter? Yes.
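One way to see why the answer is yes, sketched under strong assumptions: in the scalar linear-Gaussian model the Kalman filter's one-step-ahead prediction errors (innovations) give the exact likelihood of the observations, so you can score candidate parameters by running the filter. The model, the function names, and the crude grid search over the transition coefficient `a` are all illustrative choices made for this example; the iterated filtering and particle MCMC papers in the references are the serious versions.

```python
import math


def kalman_log_likelihood(observations, a, q, r, mean0=0.0, var0=1.0):
    """Log-likelihood of the observations under the assumed model
        x_t = a * x_{t-1} + N(0, q),   y_t = x_t + N(0, r),
    computed from the Kalman filter innovations.
    """
    mean, var, total = mean0, var0, 0.0
    for y in observations:
        pred_mean = a * mean
        pred_var = a * a * var + q
        innovation = y - pred_mean
        innovation_var = pred_var + r
        # Each innovation is Gaussian, so the log-likelihood adds up term by term.
        total += -0.5 * (math.log(2.0 * math.pi * innovation_var)
                         + innovation ** 2 / innovation_var)
        gain = pred_var / innovation_var
        mean = pred_mean + gain * innovation
        var = (1.0 - gain) * pred_var
    return total


def estimate_a(observations, q, r, grid):
    """Pick the transition coefficient on `grid` that maximises the likelihood."""
    return max(grid, key=lambda a: kalman_log_likelihood(observations, a, q, r))
```

A grid search only works for one or two parameters; in practice you would hand `kalman_log_likelihood` to a numerical optimiser or use EM.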

Aasnaes, H., and T. Kailath. 1973. “An Innovations Approach to Least-Squares Estimation–Part VII: Some Applications of Vector Autoregressive-Moving Average Models.” IEEE Transactions on Automatic Control 18 (6): 601–7. https://doi.org/10.1109/TAC.1973.1100412.

Alliney, S. 1992. “Digital Filters as Absolute Norm Regularizers.” IEEE Transactions on Signal Processing 40 (6): 1548–62. https://doi.org/10.1109/78.139258.

Ansley, Craig F., and Robert Kohn. 1985. “Estimation, Filtering, and Smoothing in State Space Models with Incompletely Specified Initial Conditions.” The Annals of Statistics 13 (4): 1286–1316. https://doi.org/10.1214/aos/1176349739.

Arulampalam, M. S., S. Maskell, N. Gordon, and T. Clapp. 2002. “A Tutorial on Particle Filters for Online Nonlinear/Non-Gaussian Bayesian Tracking.” IEEE Transactions on Signal Processing 50 (2): 174–88. https://doi.org/10.1109/78.978374.

Battey, Heather, and Alessio Sancetta. 2013. “Conditional Estimation for Dependent Functional Data.” Journal of Multivariate Analysis 120 (September): 1–17. https://doi.org/10.1016/j.jmva.2013.04.009.

Batz, Philipp, Andreas Ruttor, and Manfred Opper. 2017. “Approximate Bayes Learning of Stochastic Differential Equations,” February. http://arxiv.org/abs/1702.05390.

Becker, Philipp, Harit Pandya, Gregor Gebhardt, Cheng Zhao, C. James Taylor, and Gerhard Neumann. 2019. “Recurrent Kalman Networks: Factorized Inference in High-Dimensional Deep Feature Spaces.” In International Conference on Machine Learning, 544–52. http://proceedings.mlr.press/v97/becker19a.html.

Berkhout, A. J., and P. R. Zaanen. 1976. “A Comparison Between Wiener Filtering, Kalman Filtering, and Deterministic Least Squares Estimation.” Geophysical Prospecting 24 (1): 141–97. https://doi.org/10.1111/j.1365-2478.1976.tb00390.x.

Bilmes, Jeff A. 1998. “A Gentle Tutorial of the EM Algorithm and Its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models.” International Computer Science Institute 4 (510): 126. http://lasa.epfl.ch/teaching/lectures/ML_Phd/Notes/GP-GMM.pdf.

Bishop, Adrian N., and Pierre Del Moral. 2016. “On the Stability of Kalman-Bucy Diffusion Processes,” October. http://arxiv.org/abs/1610.04686.

Bishop, Adrian N., Pierre Del Moral, and Sahani D. Pathiraja. 2017. “Perturbations and Projections of Kalman-Bucy Semigroups Motivated by Methods in Data Assimilation,” January. http://arxiv.org/abs/1701.05978.

Bretó, Carles, Daihai He, Edward L. Ionides, and Aaron A. King. 2009. “Time Series Analysis via Mechanistic Models.” The Annals of Applied Statistics 3 (1): 319–48. https://doi.org/10.1214/08-AOAS201.

Brunton, Steven L., Joshua L. Proctor, and J. Nathan Kutz. 2016. “Discovering Governing Equations from Data by Sparse Identification of Nonlinear Dynamical Systems.” Proceedings of the National Academy of Sciences 113 (15): 3932–7. https://doi.org/10.1073/pnas.1517384113.

Carmi, Avishy Y. 2014. “Compressive System Identification.” In Compressed Sensing & Sparse Filtering, edited by Avishy Y. Carmi, Lyudmila Mihaylova, and Simon J. Godsill, 281–324. Signals and Communication Technology. Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-38398-4_9.

———. 2013. “Compressive System Identification: Sequential Methods and Entropy Bounds.” Digital Signal Processing 23 (3): 751–70. https://doi.org/10.1016/j.dsp.2012.12.006.

Cassidy, Ben, Caroline Rae, and Victor Solo. 2015. “Brain Activity: Connectivity, Sparsity, and Mutual Information.” IEEE Transactions on Medical Imaging 34 (4): 846–60. https://doi.org/10.1109/TMI.2014.2358681.

Cauchemez, Simon, and Neil M. Ferguson. 2008. “Likelihood-Based Estimation of Continuous-Time Epidemic Models from Time-Series Data: Application to Measles Transmission in London.” Journal of the Royal Society Interface 5 (25): 885–97. https://doi.org/10.1098/rsif.2007.1292.

Charles, Adam, Aurele Balavoine, and Christopher Rozell. 2016. “Dynamic Filtering of Time-Varying Sparse Signals via L1 Minimization.” IEEE Transactions on Signal Processing 64 (21): 5644–56. https://doi.org/10.1109/TSP.2016.2586745.

Chen, Bin, and Yongmiao Hong. 2012. “Testing for the Markov Property in Time Series.” Econometric Theory 28 (01): 130–78. https://doi.org/10.1017/S0266466611000065.

Chen, Y., and A. O. Hero. 2012. “Recursive ℓ1,∞ Group Lasso.” IEEE Transactions on Signal Processing 60 (8): 3978–87. https://doi.org/10.1109/TSP.2012.2192924.

Chung, Junyoung, Kyle Kastner, Laurent Dinh, Kratarth Goel, Aaron C Courville, and Yoshua Bengio. 2015. “A Recurrent Latent Variable Model for Sequential Data.” In Advances in Neural Information Processing Systems 28, edited by C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, 2980–8. Curran Associates, Inc. http://papers.nips.cc/paper/5653-a-recurrent-latent-variable-model-for-sequential-data.pdf.

Clark, James S., and Ottar N. Bjørnstad. 2004. “Population Time Series: Process Variability, Observation Errors, Missing Values, Lags, and Hidden States.” Ecology 85 (11): 3140–50. https://doi.org/10.1890/03-0520.

Commandeur, Jacques J. F., and Siem Jan Koopman. 2007. An Introduction to State Space Time Series Analysis. 1st ed. Oxford; New York: Oxford University Press.

Cox, Marco, Thijs van de Laar, and Bert de Vries. 2019. “A Factor Graph Approach to Automated Design of Bayesian Signal Processing Algorithms.” International Journal of Approximate Reasoning 104 (January): 185–204. https://doi.org/10.1016/j.ijar.2018.11.002.

Cressie, Noel, and Hsin-Cheng Huang. 1999. “Classes of Nonseparable, Spatio-Temporal Stationary Covariance Functions.” Journal of the American Statistical Association 94 (448): 1330–9. https://doi.org/10.1080/01621459.1999.10473885.

Cressie, Noel, Tao Shi, and Emily L. Kang. 2010. “Fixed Rank Filtering for Spatio-Temporal Data.” Journal of Computational and Graphical Statistics 19 (3): 724–45. http://gms.gsfc.nasa.gov/vis/a000000/a003800/a003812/2010_Cressie_et_al_JCGS.pdf.

Cressie, Noel, and Christopher K. Wikle. 2006. “Space-Time Kalman Filter.” In Encyclopedia of Environmetrics. John Wiley & Sons, Ltd. http://stat.missouri.edu/~wikle/s037-_o.pdf.

———. 2015. Statistics for Spatio-Temporal Data. John Wiley & Sons. http://books.google.com?id=4L_dCgAAQBAJ.

Del Moral, P., A. Kurtzmann, and J. Tugaut. 2017. “On the Stability and the Uniform Propagation of Chaos of a Class of Extended Ensemble Kalman–Bucy Filters.” SIAM Journal on Control and Optimization 55 (1): 119–55. https://doi.org/10.1137/16M1087497.

Doucet, Arnaud, Pierre E. Jacob, and Sylvain Rubenthaler. 2013. “Derivative-Free Estimation of the Score Vector and Observed Information Matrix with Application to State-Space Models,” April. http://arxiv.org/abs/1304.5768.

Durbin, J., and S. J. Koopman. 2012. Time Series Analysis by State Space Methods. 2nd ed. Oxford Statistical Science Series 38. Oxford: Oxford University Press.

———. 1997. “Monte Carlo Maximum Likelihood Estimation for Non-Gaussian State Space Models.” Biometrika 84 (3): 669–84. https://doi.org/10.1093/biomet/84.3.669.

Duttweiler, D., and T. Kailath. 1973a. “RKHS Approach to Detection and Estimation Problems–IV: Non-Gaussian Detection.” IEEE Transactions on Information Theory 19 (1): 19–28. https://doi.org/10.1109/TIT.1973.1054928.

———. 1973b. “RKHS Approach to Detection and Estimation Problems–V: Parameter Estimation.” IEEE Transactions on Information Theory 19 (1): 29–37. https://doi.org/10.1109/TIT.1973.1054949.

Eddy, Sean R. 1996. “Hidden Markov Models.” Current Opinion in Structural Biology 6 (3): 361–65. https://doi.org/10.1016/S0959-440X(96)80056-X.

Eden, U, L Frank, R Barbieri, V Solo, and E Brown. 2004. “Dynamic Analysis of Neural Encoding by Point Process Adaptive Filtering.” Neural Computation 16 (5): 971–98. https://doi.org/10.1162/089976604773135069.

Edwards, David, and Smitha Ankinakatte. 2015. “Context-Specific Graphical Models for Discrete Longitudinal Data.” Statistical Modelling 15 (4): 301–25. https://doi.org/10.1177/1471082X14551248.

Eleftheriadis, Stefanos, Tom Nicholson, Marc Deisenroth, and James Hensman. 2017. “Identification of Gaussian Process State Space Models.” In Advances in Neural Information Processing Systems 30, edited by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, 5309–19. Curran Associates, Inc. http://papers.nips.cc/paper/7115-identification-of-gaussian-process-state-space-models.pdf.

Fearnhead, Paul, and Hans R. Künsch. 2018. “Particle Filters and Data Assimilation.” Annual Review of Statistics and Its Application 5 (1): 421–49. https://doi.org/10.1146/annurev-statistics-031017-100232.

Finke, Axel, and Sumeetpal S. Singh. 2016. “Approximate Smoothing and Parameter Estimation in High-Dimensional State-Space Models,” June. http://arxiv.org/abs/1606.08650.

Föll, Roman, Bernard Haasdonk, Markus Hanselmann, and Holger Ulmer. 2017. “Deep Recurrent Gaussian Process with Variational Sparse Spectrum Approximation,” November. http://arxiv.org/abs/1711.00799.

Fraccaro, Marco, Søren Kaae Sønderby, Ulrich Paquet, and Ole Winther. 2016. “Sequential Neural Models with Stochastic Layers.” In Advances in Neural Information Processing Systems 29, edited by D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, 2199–2207. Curran Associates, Inc. http://papers.nips.cc/paper/6039-sequential-neural-models-with-stochastic-layers.pdf.

Fraser, Andrew M. 2008. Hidden Markov Models and Dynamical Systems. Philadelphia, PA: Society for Industrial and Applied Mathematics.

Friedlander, B., T. Kailath, and L. Ljung. 1975. “Scattering Theory and Linear Least Squares Estimation: Part II: Discrete-Time Problems.” In 1975 IEEE Conference on Decision and Control Including the 14th Symposium on Adaptive Processes, 57–58. https://doi.org/10.1109/CDC.1975.270648.

Frigola, Roger, Yutian Chen, and Carl Edward Rasmussen. 2014. “Variational Gaussian Process State-Space Models.” In Advances in Neural Information Processing Systems 27, edited by Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, 3680–8. Curran Associates, Inc. http://papers.nips.cc/paper/5375-variational-gaussian-process-state-space-models.pdf.

Frigola, Roger, Fredrik Lindsten, Thomas B Schön, and Carl Edward Rasmussen. 2013. “Bayesian Inference and Learning in Gaussian Process State-Space Models with Particle MCMC.” In Advances in Neural Information Processing Systems 26, edited by C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Q. Weinberger, 3156–64. Curran Associates, Inc. http://papers.nips.cc/paper/5085-bayesian-inference-and-learning-in-gaussian-process-state-space-models-with-particle-mcmc.pdf.

Friston, K. J. 2008. “Variational Filtering.” NeuroImage 41 (3): 747–66. https://doi.org/10.1016/j.neuroimage.2008.03.017.

Gevers, M., and T. Kailath. 1973. “An Innovations Approach to Least-Squares Estimation–Part VI: Discrete-Time Innovations Representations and Recursive Estimation.” IEEE Transactions on Automatic Control 18 (6): 588–600. https://doi.org/10.1109/TAC.1973.1100419.

Gourieroux, Christian, and Joann Jasiak. 2015. “Filtering, Prediction and Simulation Methods for Noncausal Processes.” Journal of Time Series Analysis, January. https://doi.org/10.1111/jtsa.12165.

Haber, Eldad, Felix Lucka, and Lars Ruthotto. 2018. “Never Look Back - A Modified EnKF Method and Its Application to the Training of Neural Networks Without Back Propagation,” May. http://arxiv.org/abs/1805.08034.

Hamilton, Franz, Tyrus Berry, and Timothy Sauer. 2016. “Kalman-Takens Filtering in the Presence of Dynamical Noise,” November. http://arxiv.org/abs/1611.05414.

Hartikainen, J., and S. Särkkä. 2010. “Kalman Filtering and Smoothing Solutions to Temporal Gaussian Process Regression Models.” In 2010 IEEE International Workshop on Machine Learning for Signal Processing, 379–84. Kittila, Finland: IEEE. https://doi.org/10.1109/MLSP.2010.5589113.

Harvey, A., and S. J. Koopman. 2005. “Structural Time Series Models.” In Encyclopedia of Biostatistics. John Wiley & Sons, Ltd. http://onlinelibrary.wiley.com/doi/10.1002/0470011815.b2a12069/abstract.

Harvey, Andrew, and Alessandra Luati. 2014. “Filtering with Heavy Tails.” Journal of the American Statistical Association 109 (507): 1112–22. https://doi.org/10.1080/01621459.2014.887011.

He, Daihai, Edward L. Ionides, and Aaron A. King. 2010. “Plug-and-Play Inference for Disease Dynamics: Measles in Large and Small Populations as a Case Study.” Journal of the Royal Society Interface 7 (43): 271–83. https://doi.org/10.1098/rsif.2009.0151.

Hefny, Ahmed, Carlton Downey, and Geoffrey Gordon. 2015. “A New View of Predictive State Methods for Dynamical System Learning,” May. http://arxiv.org/abs/1505.05310.

Hong, X., R. J. Mitchell, S. Chen, C. J. Harris, K. Li, and G. W. Irwin. 2008. “Model Selection Approaches for Non-Linear System Identification: A Review.” International Journal of Systems Science 39 (10): 925–46. https://doi.org/10.1080/00207720802083018.

Hou, Elizabeth, Earl Lawrence, and Alfred O. Hero. 2016. “Penalized Ensemble Kalman Filters for High Dimensional Non-Linear Systems,” October. http://arxiv.org/abs/1610.00195.

Hsiao, Roger, and Tanja Schultz. 2011. “Generalized Baum-Welch Algorithm and Its Implication to a New Extended Baum-Welch Algorithm.” In Proceedings of INTERSPEECH.

Hsu, Daniel, Sham M. Kakade, and Tong Zhang. 2012. “A Spectral Algorithm for Learning Hidden Markov Models.” Journal of Computer and System Sciences, JCSS Special Issue: Cloud Computing 2011, 78 (5): 1460–80. https://doi.org/10.1016/j.jcss.2011.12.025.

Huber, Marco F. 2014. “Recursive Gaussian Process: On-Line Regression and Learning.” Pattern Recognition Letters 45 (August): 85–91. https://doi.org/10.1016/j.patrec.2014.03.004.

Ionides, Edward L., Anindya Bhadra, Yves Atchadé, and Aaron King. 2011. “Iterated Filtering.” The Annals of Statistics 39 (3): 1776–1802. https://doi.org/10.1214/11-AOS886.

Ionides, E. L., C. Bretó, and A. A. King. 2006. “Inference for Nonlinear Dynamical Systems.” Proceedings of the National Academy of Sciences 103 (49): 18438–43. https://doi.org/10.1073/pnas.0603181103.

Johnson, Matthew James. 2012. “A Simple Explanation of A Spectral Algorithm for Learning Hidden Markov Models,” April. http://arxiv.org/abs/1204.2477.

Julier, S. J., J. K. Uhlmann, and H. F. Durrant-Whyte. 1995. “A New Approach for Filtering Nonlinear Systems.” In American Control Conference, Proceedings of the 1995, 3:1628–32 vol.3. https://doi.org/10.1109/ACC.1995.529783.

Kailath, T. 1971. “RKHS Approach to Detection and Estimation Problems–I: Deterministic Signals in Gaussian Noise.” IEEE Transactions on Information Theory 17 (5): 530–49. https://doi.org/10.1109/TIT.1971.1054673.

———. 1974. “A View of Three Decades of Linear Filtering Theory.” IEEE Transactions on Information Theory 20 (2): 146–81. https://doi.org/10.1109/TIT.1974.1055174.

Kailath, T., and D. Duttweiler. 1972. “An RKHS Approach to Detection and Estimation Problems– III: Generalized Innovations Representations and a Likelihood-Ratio Formula.” IEEE Transactions on Information Theory 18 (6): 730–45. https://doi.org/10.1109/TIT.1972.1054925.

Kailath, T., and R. Geesey. 1971. “An Innovations Approach to Least Squares Estimation–Part IV: Recursive Estimation Given Lumped Covariance Functions.” IEEE Transactions on Automatic Control 16 (6): 720–27. https://doi.org/10.1109/TAC.1971.1099835.

———. 1973. “An Innovations Approach to Least-Squares Estimation–Part V: Innovations Representations and Recursive Estimation in Colored Noise.” IEEE Transactions on Automatic Control 18 (5): 435–53. https://doi.org/10.1109/TAC.1973.1100366.

Kailath, T., and H. Weinert. 1975. “An RKHS Approach to Detection and Estimation Problems–II: Gaussian Signal Detection.” IEEE Transactions on Information Theory 21 (1): 15–23. https://doi.org/10.1109/TIT.1975.1055328.

Kalman, R. 1959. “On the General Theory of Control Systems.” IRE Transactions on Automatic Control 4 (3): 110–10. https://doi.org/10.1109/TAC.1959.1104873.

Kalman, R. E. 1960. “A New Approach to Linear Filtering and Prediction Problems.” Journal of Basic Engineering 82 (1): 35. https://doi.org/10.1115/1.3662552.

Kalouptsidis, Nicholas, Gerasimos Mileounis, Behtash Babadi, and Vahid Tarokh. 2011. “Adaptive Algorithms for Sparse System Identification.” Signal Processing 91 (8): 1910–9. https://doi.org/10.1016/j.sigpro.2011.02.013.

Karvonen, Toni, and Simo Särkkä. 2016. “Approximate State-Space Gaussian Processes via Spectral Transformation.” In 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP), 1–6. Vietri sul Mare, Salerno, Italy: IEEE. https://doi.org/10.1109/MLSP.2016.7738812.

Kelly, D. T. B., K. J. H. Law, and A. M. Stuart. 2014. “Well-Posedness and Accuracy of the Ensemble Kalman Filter in Discrete and Continuous Time.” Nonlinearity 27 (10): 2579. https://doi.org/10.1088/0951-7715/27/10/2579.

Kitagawa, Genshiro. 1987. “Non-Gaussian State—Space Modeling of Nonstationary Time Series.” Journal of the American Statistical Association 82 (400): 1032–41. https://doi.org/10.1080/01621459.1987.10478534.

———. 1996. “Monte Carlo Filter and Smoother for Non-Gaussian Nonlinear State Space Models.” Journal of Computational and Graphical Statistics 5 (1): 1–25. https://doi.org/10.1080/10618600.1996.10474692.

Kitagawa, Genshiro, and Will Gersch. 1996. Smoothness Priors Analysis of Time Series. Lecture Notes in Statistics 116. New York, NY: Springer. http://dx.doi.org/10.1007/978-1-4612-0761-0.

Kobayashi, Hisashi, Brian L. Mark, and William Turin. 2011. Probability, Random Processes, and Statistical Analysis: Applications to Communications, Signal Processing, Queueing Theory and Mathematical Finance. Cambridge University Press.

Koopman, S. J., and J. Durbin. 2000. “Fast Filtering and Smoothing for Multivariate State Space Models.” Journal of Time Series Analysis 21 (3): 281–96. https://doi.org/10.1111/1467-9892.00186.

Krishnan, Rahul G., Uri Shalit, and David Sontag. 2017. “Structured Inference Networks for Nonlinear State Space Models.” In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2101–9. http://arxiv.org/abs/1609.09869.

Kutschireiter, Anna, Simone C Surace, Henning Sprekeler, and Jean-Pascal Pfister. 2015. “Approximate Nonlinear Filtering with a Recurrent Neural Network.” BMC Neuroscience 16 (Suppl 1): P196. https://doi.org/10.1186/1471-2202-16-S1-P196.

Lázaro-Gredilla, Miguel, Joaquin Quiñonero-Candela, Carl Edward Rasmussen, and Aníbal R. Figueiras-Vidal. 2010. “Sparse Spectrum Gaussian Process Regression.” Journal of Machine Learning Research 11 (Jun): 1865–81. http://www.jmlr.org/papers/v11/lazaro-gredilla10a.

Le Gland, François, Valerie Monbet, and Vu-Duc Tran. 2009. “Large Sample Asymptotics for the Ensemble Kalman Filter,” 25. https://hal.inria.fr/inria-00409060/document.

Lei, Jing, Peter Bickel, and Chris Snyder. 2009. “Comparison of Ensemble Kalman Filters Under Non-Gaussianity.” Monthly Weather Review 138 (4): 1293–1306. https://doi.org/10.1175/2009MWR3133.1.

Levin, David N. 2017. “The Inner Structure of Time-Dependent Signals,” March. http://arxiv.org/abs/1703.08596.

Lindgren, Finn, Håvard Rue, and Johan Lindström. 2011. “An Explicit Link Between Gaussian Fields and Gaussian Markov Random Fields: The Stochastic Partial Differential Equation Approach.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 73 (4): 423–98. https://doi.org/10.1111/j.1467-9868.2011.00777.x.

Ljung, L., and T. Kailath. 1976. “Backwards Markovian Models for Second-Order Stochastic Processes (Corresp.).” IEEE Transactions on Information Theory 22 (4): 488–91. https://doi.org/10.1109/TIT.1976.1055570.

Ljung, L., T. Kailath, and B. Friedlander. 1975. “Scattering Theory and Linear Least Squares Estimation: Part I: Continuous-Time Problems.” In 1975 IEEE Conference on Decision and Control Including the 14th Symposium on Adaptive Processes, 55–56. https://doi.org/10.1109/CDC.1975.270647.

Loeliger, Hans-Andrea, Justin Dauwels, Junli Hu, Sascha Korl, Li Ping, and Frank R. Kschischang. 2007. “The Factor Graph Approach to Model-Based Signal Processing.” Proceedings of the IEEE 95 (6): 1295–1322. https://doi.org/10.1109/JPROC.2007.896497.

Manton, J. H., V. Krishnamurthy, and H. V. Poor. 1998. “James-Stein State Filtering Algorithms.” IEEE Transactions on Signal Processing 46 (9): 2431–47. https://doi.org/10.1109/78.709532.

Mattos, César Lincoln C., Zhenwen Dai, Andreas Damianou, Guilherme A. Barreto, and Neil D. Lawrence. 2017. “Deep Recurrent Gaussian Processes for Outlier-Robust System Identification.” Journal of Process Control, DYCOPS-CAB 2016, 60 (December): 82–94. https://doi.org/10.1016/j.jprocont.2017.06.010.

Mattos, César Lincoln C., Zhenwen Dai, Andreas Damianou, Jeremy Forth, Guilherme A. Barreto, and Neil D. Lawrence. 2016. “Recurrent Gaussian Processes.” In Proceedings of ICLR. http://arxiv.org/abs/1511.06644.

Micchelli, Charles A., and Peder Olsen. 2000. “Penalized Maximum-Likelihood Estimation, the Baum–Welch Algorithm, Diagonal Balancing of Symmetric Matrices and Applications to Training Acoustic Data.” Journal of Computational and Applied Mathematics 119 (1–2): 301–31. https://doi.org/10.1016/S0377-0427(00)00385-X.

Nickisch, Hannes, Arno Solin, and Alexander Grigorevskiy. 2018. “State Space Gaussian Processes with Non-Gaussian Likelihood.” In International Conference on Machine Learning, 3789–98. http://proceedings.mlr.press/v80/nickisch18a.html.

Olfati-Saber, R. 2005. “Distributed Kalman Filter with Embedded Consensus Filters.” In 44th IEEE Conference on Decision and Control, 2005 and 2005 European Control Conference. CDC-ECC ’05, 8179–84. Seville, Spain: IEEE. https://doi.org/10.1109/CDC.2005.1583486.

Ollivier, Yann. 2017. “Online Natural Gradient as a Kalman Filter,” March. http://arxiv.org/abs/1703.00209.

Papadopoulos, Alexandre, François Pachet, Pierre Roy, and Jason Sakellariou. 2015. “Exact Sampling for Regular and Markov Constraints with Belief Propagation.” In Principles and Practice of Constraint Programming, 341–50. Lecture Notes in Computer Science. Switzerland: Springer, Cham. https://doi.org/10.1007/978-3-319-23219-5_24.

Perry, T. S. 2010. “Andrew Viterbi’s Fabulous Formula [Medal of Honor].” IEEE Spectrum 47 (5): 47–50. https://doi.org/10.1109/MSPEC.2010.5453141.

Psiaki, M. 2013. “The Blind Tricyclist Problem and a Comparative Study of Nonlinear Filters: A Challenging Benchmark for Evaluating Nonlinear Estimation Methods.” IEEE Control Systems 33 (3): 40–54. https://doi.org/10.1109/MCS.2013.2249422.

Quiñonero-Candela, Joaquin, and Carl Edward Rasmussen. 2005. “A Unifying View of Sparse Approximate Gaussian Process Regression.” Journal of Machine Learning Research 6 (Dec): 1939–59. http://jmlr.org/papers/volume6/quinonero-candela05a/quinonero-candela05a.pdf.

Rabiner, L., and B. H. Juang. 1986. “An Introduction to Hidden Markov Models.” IEEE ASSP Magazine 3 (1): 4–16. https://doi.org/10.1109/MASSP.1986.1165342.

Rabiner, L. R. 1989. “A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition.” Proceedings of the IEEE 77 (2): 257–86. https://doi.org/10.1109/5.18626.

Reece, S., and S. Roberts. 2010. “An Introduction to Gaussian Processes for the Kalman Filter Expert.” In 2010 13th International Conference on Information Fusion, 1–9. https://doi.org/10.1109/ICIF.2010.5711863.

Robertson, Andrew N. 2011. “A Bayesian Approach to Drum Tracking.” In. http://smcnetwork.org/system/files/smc2011_submission_185.pdf.

Robertson, Andrew, and Mark Plumbley. 2007. “B-Keeper: A Beat-Tracker for Live Performance.” In Proceedings of the 7th International Conference on New Interfaces for Musical Expression, 234–37. NIME ’07. New York, NY, USA: ACM. https://doi.org/10.1145/1279740.1279787.

Robertson, Andrew, Adam Stark, and Matthew EP Davies. 2013. “Percussive Beat Tracking Using Real-Time Median Filtering.” In Proceedings of European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases. http://www.ecmlpkdd2013.org/wp-content/uploads/2013/09/MLMU_Robertson.pdf.

Robertson, Andrew, Adam M. Stark, and Mark D. Plumbley. 2011. “Real-Time Visual Beat Tracking Using a Comb Filter Matrix.” In Proceedings of the International Computer Music Conference 2011. https://www.eecs.qmul.ac.uk/~markp/2011/RobertsonStarkPlumbleyICMC2011_accepted.pdf.

Rodriguez, Alejandro, and Esther Ruiz. 2009. “Bootstrap Prediction Intervals in State–Space Models.” Journal of Time Series Analysis 30 (2): 167–78. https://doi.org/10.1111/j.1467-9892.2008.00604.x.

Särkkä, S., and J. Hartikainen. 2013. “Non-Linear Noise Adaptive Kalman Filtering via Variational Bayes.” In 2013 IEEE International Workshop on Machine Learning for Signal Processing (MLSP), 1–6. https://doi.org/10.1109/MLSP.2013.6661935.

Särkkä, Simo. 2013. Bayesian Filtering and Smoothing. Institute of Mathematical Statistics Textbooks 3. Cambridge, U.K. ; New York: Cambridge University Press. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.461.4042&rep=rep1&type=pdf.

———. 2007. “On Unscented Kalman Filtering for State Estimation of Continuous-Time Nonlinear Systems.” IEEE Transactions on Automatic Control 52 (9): 1631–41. https://doi.org/10.1109/TAC.2007.904453.

Särkkä, Simo, and Jouni Hartikainen. 2012. “Infinite-Dimensional Kalman Filtering Approach to Spatio-Temporal Gaussian Process Regression.” In Artificial Intelligence and Statistics. http://www.jmlr.org/proceedings/papers/v22/sarkka12.html.

Särkkä, Simo, and A. Nummenmaa. 2009. “Recursive Noise Adaptive Kalman Filtering by Variational Bayesian Approximations.” IEEE Transactions on Automatic Control 54 (3): 596–600. https://doi.org/10.1109/TAC.2008.2008348.

Särkkä, Simo, A. Solin, and J. Hartikainen. 2013. “Spatiotemporal Learning via Infinite-Dimensional Bayesian Filtering and Smoothing: A Look at Gaussian Process Regression Through Kalman Filtering.” IEEE Signal Processing Magazine 30 (4): 51–61. https://doi.org/10.1109/MSP.2013.2246292.

Schein, Aaron, Hanna Wallach, and Mingyuan Zhou. 2016. “Poisson-Gamma Dynamical Systems.” In Advances in Neural Information Processing Systems, 5006–14. http://papers.nips.cc/paper/6082-poisson-gamma-dynamical-systems.

Segall, A., M. Davis, and T. Kailath. 1975. “Nonlinear Filtering with Counting Observations.” IEEE Transactions on Information Theory 21 (2): 143–49. https://doi.org/10.1109/TIT.1975.1055360.

Sorenson, H. W. 1970. “Least-Squares Estimation: From Gauss to Kalman.” IEEE Spectrum 7 (7): 63–68. https://doi.org/10.1109/MSPEC.1970.5213471.

Städler, Nicolas, and Sach Mukherjee. 2013. “Penalized Estimation in High-Dimensional Hidden Markov Models with State-Specific Graphical Models.” The Annals of Applied Statistics 7 (4): 2157–79. https://doi.org/10.1214/13-AOAS662.

Surace, Simone Carlo, and Jean-Pascal Pfister. 2016. “Online Maximum Likelihood Estimation of the Parameters of Partially Observed Diffusion Processes.” In.

Tavakoli, Shahin, and Victor M. Panaretos. 2016. “Detecting and Localizing Differences in Functional Time Series Dynamics: A Case Study in Molecular Biophysics.” Journal of the American Statistical Association, March, 1–31. https://doi.org/10.1080/01621459.2016.1147355.

Thrun, Sebastian, and John Langford. 1998. “Monte Carlo Hidden Markov Models.” DTIC Document. http://oai.dtic.mil/oai/oai?verb=getRecord&metadataPrefix=html&identifier=ADA363714.

Thrun, Sebastian, John Langford, and Dieter Fox. 1999. “Monte Carlo Hidden Markov Models: Learning Non-Parametric Models of Partially Observable Stochastic Processes.” In Proceedings of the International Conference on Machine Learning. Bled, Slovenia. http://robots.stanford.edu/papers/thrun.mchmm.pdf.

Turner, Ryan, Marc Deisenroth, and Carl Rasmussen. 2010. “State-Space Inference and Learning with Gaussian Processes.” In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 868–75. http://proceedings.mlr.press/v9/turner10a.html.

Wikle, Christopher K., L. Mark Berliner, and Noel Cressie. 1998. “Hierarchical Bayesian Space-Time Models.” Environmental and Ecological Statistics 5 (2): 117–54. https://doi.org/10.1023/A:1009662704779.