Probabilistic spectral analysis

Graphical introduction to nonstationary modelling of audio data. The input (bottom) is a sound recording of female speech. We seek to decompose the signal into Gaussian process carrier waveforms (blue block) multiplied by a spectrogram (green block). The spectrogram is learned from the data as a nonnegative matrix of weights times positive modulators (top) (W. J. Wilkinson et al. 2019b).

I am interested in probabilistic analogues of time-frequency analysis and, what is nearly the same thing, autocorrelation analysis, but for non-stationary signals. This is natural with Gaussian processes.

I am especially interested in this for audio signals, which can be very large but have certain simplifying features: they are scalar functions of a univariate time index, usually regularly sampled.

In signal processing we frequently use Fourier transforms as a notionally nonparametric model for such a system, or a source of features for analysis.

That is classic stuff but it is (for me) always unsatisfying just taking the Fourier transform of something and hoping to have learned stuff about the system. There are a lot of arbitrary tuning parameters and awkward assumptions about, e.g. local stationarity and arbitrary ways of introducing non-local correlation. The same holds for the deterministic autocorrelogram, on which I have recently published a paper. I got good results, but I had no principled way to select the regularisation and interpretation of the methods. Unsatisfying.

I think we can do better by looking at the probabilistic behaviour of Fourier transforms and treating these as Bayesian nonparametric problems. This could solve a few problems at once.

This is an active area, with several approaches.

Classic: stochastic processes studied via correlation function

I’ve discussed basic stationary signal analysis everywhere, but why not check out some backgrounders in (Wiener and Masani 1957; Yaglom 1987)?

Non-stationary spectral kernel

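The central tool here in practice is Bochner’s theorem, which states that a continuous stationary kernel is a valid covariance kernel if and only if it is the Fourier transform of a finite nonnegative spectral measure \(\mu\):

\[\kappa(\Delta t)=\int_{\mathbb{R}} e^{i\omega\Delta t}\,\mu(\mathrm{d}\omega).\]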
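As a numerical sanity check on Bochner’s theorem, here is a minimal sketch. It assumes, purely for illustration, a Gaussian spectral density with standard deviation `s`; the function name `kernel_from_spectral_density` and the grid sizes are my own hypothetical choices. The Fourier transform of that density should be the squared-exponential kernel \(\exp(-s^2\Delta t^2/2)\).

```python
import numpy as np

def kernel_from_spectral_density(density, lags, omega_max=50.0, n_omega=20001):
    """Approximate kappa(lag) = integral of exp(i*omega*lag) * density(omega) d omega
    by a Riemann sum on a symmetric frequency grid (illustrative only)."""
    omega = np.linspace(-omega_max, omega_max, n_omega)
    d_omega = omega[1] - omega[0]
    phases = np.exp(1j * np.outer(lags, omega))  # shape (n_lags, n_omega)
    return (phases * density(omega)).sum(axis=1) * d_omega

s = 2.0  # assumed spectral standard deviation

def gaussian_density(omega):
    # Normalised Gaussian spectral density centred at zero frequency
    return np.exp(-0.5 * (omega / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

lags = np.linspace(-3.0, 3.0, 61)
kappa = kernel_from_spectral_density(gaussian_density, lags)
kappa_exact = np.exp(-0.5 * (s * lags) ** 2)  # squared-exponential kernel

print(np.max(np.abs(kappa.real - kappa_exact)))
```

Because the density is symmetric about zero, the imaginary part of the sum cancels and the kernel comes out real, as a covariance function of a real-valued process should.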

Taking this insight and running with it you can do lots of fun stuff. Turner and Sahani (2014) is sometimes mentioned as ground-zero of this kind of research, although the connections are certainly much older, e.g. Curtain (1975). Wiener and Khintchine approaches were not far from this (Wiener and Masani 1958, 1957) and it is implicit in Kalman-Bucy filtering (R. Kalman 1959; R. E. Kalman 1960; Kailath 1971). There are natural extensions of classic results, e.g. a Shannon-Nyquist theorem (Tobar 2019). In modern times we have related but more specialised techniques such as the probabilistic phase vocoder (Godsill and Cemgil 2005). See also the connections to time series state models of Hartikainen and Särkkä (2010); Lindgren, Rue, and Lindström (2011); Reece and Roberts (2010); and Liutkus, Badeau, and Richard (2011).

There are nice introductions in some papers (Solin 2016; Alvarado, Alvarez, and Stowell 2019; W. J. Wilkinson et al. 2019b), which unite various pieces I was discussing above with actual applications. I will work through these methods here for my own edification.


The basic setting is the same as for typical audio signal analysis; we begin with a (random) signal \(f:\mathbb{R}\to\mathbb{R}\), where the argument is a continuous time index. We do not know this signal, but will infer its properties from a finite number of discrete observations, \(\mathbf{f}:=\{f(t_k);k=1,2,\dots,K\}.\)

We imagine observations from this signal are modelled by a Gaussian process, giving us the same setup as Gaussian process regression. We introduce the additional assumption here that the scalar index set \(\mathcal{I}:=\mathbb{R}\) represents time.

I suppose what we are doing here is requiring that there be some model for sampling error and that it may as well be the most convenient possible model to work with, which is additive Gaussian. More general noise models are indeed possible, and if we allow other Gaussian processes as additive noise models then we are on the way to constructing a source separation model. That is indeed what Liutkus, Badeau, and Richard (2011) do.[1]

Anyway, with these choices, this becomes absolutely the classic Gaussian process regression with some specialisation (univariate index, zero mean).
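To make the setup concrete, here is a minimal sketch of zero-mean Gaussian process regression over a univariate time index with additive Gaussian observation noise. The quasi-periodic kernel, its parameter values, the toy 5 Hz signal and the function names are all assumptions chosen for illustration, not anything prescribed by the papers cited here.

```python
import numpy as np

def kernel(t, t_prime, omega=2 * np.pi * 5.0, lengthscale=0.5, variance=1.0):
    """Illustrative quasi-periodic kernel: a cosine times a decaying exponential."""
    tau = t[:, None] - t_prime[None, :]
    return variance * np.cos(omega * tau) * np.exp(-np.abs(tau) / lengthscale)

def gp_posterior(t_obs, y_obs, t_new, noise_var=0.01):
    """Posterior mean and covariance of a zero-mean GP at t_new,
    given noisy observations y_obs at times t_obs."""
    K = kernel(t_obs, t_obs) + noise_var * np.eye(len(t_obs))
    K_cross = kernel(t_obs, t_new)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    mean = K_cross.T @ alpha
    v = np.linalg.solve(L, K_cross)
    cov = kernel(t_new, t_new) - v.T @ v
    return mean, cov

rng = np.random.default_rng(0)
t_obs = np.arange(0.0, 1.0, 0.01)  # one second sampled at 100 Hz
y_obs = np.sin(2 * np.pi * 5.0 * t_obs) + 0.1 * rng.standard_normal(t_obs.size)
t_new = np.array([0.255, 0.505, 0.755])  # times between the samples

mean, cov = gp_posterior(t_obs, y_obs, t_new)
print(mean, np.sqrt(np.diag(cov)))
```

The dense Cholesky solve costs \(O(K^3)\) in the number of observations, which is exactly why the state-space and other structured methods cited above matter for audio-length signals.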

It is also not far from the classic time-frequency spectral analysis setup, where we take Fourier transforms over fixed-size windows to estimate a kind of deterministic approximation to \(\kappa\) (thanks, Wiener-Khintchine theorem); in that context we are effectively assuming that for each window we have an independent estimation problem, and a periodic kernel. I should make that relationship precise. 🏗
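One small piece of that relationship can be checked directly. The sketch below (function names are hypothetical, and the window is just white noise) verifies that the inverse FFT of a single window's periodogram is the circular sample autocovariance of that window, which is the sense in which the windowed Fourier approach implicitly assumes a periodic kernel.

```python
import numpy as np

def circular_autocovariance(x):
    """Sample autocovariance with wrap-around: (1/n) * sum_t x[t] * x[(t + lag) mod n]."""
    n = len(x)
    return np.array([np.dot(x, np.roll(x, -lag)) / n for lag in range(n)])

def autocovariance_from_periodogram(x):
    """Same quantity via the discrete Wiener-Khintchine relation."""
    n = len(x)
    periodogram = np.abs(np.fft.fft(x)) ** 2 / n
    return np.fft.ifft(periodogram).real

rng = np.random.default_rng(1)
window = rng.standard_normal(256)
window -= window.mean()  # centre the window so lag 0 is the sample variance

direct = circular_autocovariance(window)
via_fft = autocovariance_from_periodogram(window)
print(np.max(np.abs(direct - via_fft)))
```

The two agree to floating-point precision. The wrap-around in `np.roll` is the periodicity assumption made explicit.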

There is clearly a lot wrapped up in the kernel, \(\kappa(t, t';\mathbf{\theta}).\)

Typically this is some kind of spectral mixture kernel (W. J. Wilkinson et al. 2019a; Wilson and Adams 2013). W. J. Wilkinson et al. (2019b) summarizes these as:

\[ \begin{aligned} \kappa_{\mathrm{sm}}\left(t, t^{\prime}\right) &=\sum_{d=1}^{D} \kappa_{z}^{(d)}\left(t, t^{\prime}\right) \\ \kappa_{z}^{(d)}\left(t, t^{\prime}\right) &=\sigma_{d}^{2} \cos \left(\omega_{d}\left(t-t^{\prime}\right)\right) \kappa_{d}\left(t, t^{\prime}\right) \end{aligned} \]

\(\kappa_{d}\) is free to be chosen, but is typically from the Matérn class of kernel functions. Parameters \(\omega_{d}\) determine the periodicity of the kernel components, which can be interpreted as the centre frequencies of the filters in a probabilistic filter bank. By choosing the exponential kernel \(\kappa_{d}\left(t, t^{\prime}\right)=\exp \left(-\left|t-t^{\prime}\right| / \ell_{d}\right)\) we recover exactly the probabilistic phase vocoder (Godsill and Cemgil 2005), and the lengthscales \(\ell_{d}\) control the filter bandwidths.
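A minimal sketch of this spectral mixture kernel, assuming the decaying exponential \(\exp(-|t-t'|/\ell_d)\) for each \(\kappa_d\). The two components, their centre frequencies and lengthscales are arbitrary illustrative values, and the function name is my own.

```python
import numpy as np

def spectral_mixture_kernel(t, t_prime, variances, omegas, lengthscales):
    """Sum over components d of sigma_d^2 * cos(omega_d * tau) * exp(-|tau| / ell_d)."""
    tau = t[:, None] - t_prime[None, :]
    K = np.zeros_like(tau)
    for var_d, omega_d, ell_d in zip(variances, omegas, lengthscales):
        K += var_d * np.cos(omega_d * tau) * np.exp(-np.abs(tau) / ell_d)
    return K

t = np.linspace(0.0, 1.0, 200)
K = spectral_mixture_kernel(
    t, t,
    variances=[1.0, 0.5],                       # sigma_d^2
    omegas=2 * np.pi * np.array([5.0, 12.0]),   # centre frequencies in rad/s
    lengthscales=[0.3, 0.1],                    # shorter lengthscale = wider band
)

# A valid covariance matrix is symmetric with no negative eigenvalues.
eigenvalues = np.linalg.eigvalsh(K)
print(eigenvalues.min())
```

Each component is a product of two positive semidefinite kernels (a cosine and an exponential), so the sum is positive semidefinite too; the eigenvalue check is just a numerical confirmation.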

More generally we would like this to be a non-stationary kernel, which requires a model for the density of these kernels. W. J. Wilkinson et al. (2019b) use an NMF model with a GP prior on some matrix rows, passed through a softplus link. (Remes, Heinonen, and Kaski (2018) seem to get a similar structure?)
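To get a feel for this kind of model, here is a loose generative sketch: quasi-periodic GP carriers multiplied by amplitudes built from a nonnegative weight matrix times positive modulators. It is not the authors' exact model. The softplus link, the Matérn-3/2 prior on the modulators, the dimensions and every parameter value are assumptions of this sketch.

```python
import numpy as np

def softplus(x):
    # Numerically stable log(1 + exp(x)); maps real values to positive ones
    return np.logaddexp(0.0, x)

def sample_gp(kernel_matrix, rng, jitter=1e-6):
    """Draw one zero-mean GP sample given its covariance matrix."""
    n = kernel_matrix.shape[0]
    L = np.linalg.cholesky(kernel_matrix + jitter * np.eye(n))
    return L @ rng.standard_normal(n)

rng = np.random.default_rng(2)
t = np.linspace(0.0, 1.0, 400)
tau = np.abs(t[:, None] - t[None, :])

# Carriers: one quasi-periodic GP per frequency channel (3 channels assumed)
omegas = 2 * np.pi * np.array([20.0, 45.0, 80.0])
carriers = np.stack(
    [sample_gp(np.cos(w * tau) * np.exp(-tau / 0.2), rng) for w in omegas]
)

# Modulators: slowly varying GPs (Matern-3/2, assumed) pushed through softplus
ell = 0.15
slow_kernel = (1.0 + np.sqrt(3.0) * tau / ell) * np.exp(-np.sqrt(3.0) * tau / ell)
modulators = softplus(np.stack([sample_gp(slow_kernel, rng) for _ in range(2)]))

# Nonnegative mixing weights map 2 modulators to 3 channel amplitudes
W = rng.uniform(0.0, 1.0, size=(3, 2))
amplitudes = W @ modulators                    # spectrogram-like, shape (3, 400)
signal = (amplitudes * carriers).sum(axis=0)   # synthetic nonstationary signal
```

Inference runs this in reverse, recovering `W`, the modulators and the carriers from `signal` alone, which is the hard part and is not attempted here.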

Locally stationary

Connection to the short-time Fourier transform, where signals are assumed stationary within each analysis window.

Change point detection version

There is an alternative approach which looks at switching between covariance kernels/spectrograms. One strand is the AdaptSPEC family of methods (Bertolacci et al. 2020; Rosen, Wood, and Stoffer 2012), which develop fast MCMC samplers by using Whittle likelihood approaches over randomised change points. Disclosure of bias: I just enjoyed a seminar by my colleague Michael Bertolacci on this theme, and Sally Cripps née Wood works 20m from me and was a co-author of these.
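For orientation, the Whittle likelihood for a single stationary segment can be sketched as follows. This is only the building block, not the AdaptSPEC sampler; the AR(1) test process, the grid search and all names are illustrative assumptions.

```python
import numpy as np

def ar1_spectral_density(omega, phi, sigma2=1.0):
    """Spectral density of x[t] = phi * x[t-1] + e[t], with e ~ N(0, sigma2)."""
    return sigma2 / (1.0 - 2.0 * phi * np.cos(omega) + phi ** 2)

def whittle_log_likelihood(x, spectral_density):
    """Whittle approximation: -sum_k [log S(w_k) + I(w_k) / S(w_k)] over the
    positive Fourier frequencies, excluding zero and Nyquist."""
    n = len(x)
    k = np.arange(1, (n - 1) // 2 + 1)
    omega = 2.0 * np.pi * k / n
    periodogram = np.abs(np.fft.fft(x)[k]) ** 2 / n
    s = spectral_density(omega)
    return -np.sum(np.log(s) + periodogram / s)

# Simulate one stationary AR(1) segment
rng = np.random.default_rng(3)
n, phi_true = 4096, 0.7
noise = rng.standard_normal(n)
x = np.zeros(n)
for i in range(1, n):
    x[i] = phi_true * x[i - 1] + noise[i]

# Crude grid search over phi, standing in for a proper optimiser or sampler
phi_grid = np.linspace(0.0, 0.95, 96)
loglik = np.array([
    whittle_log_likelihood(x, lambda w, p=phi: ar1_spectral_density(w, p))
    for phi in phi_grid
])
phi_hat = phi_grid[np.argmax(loglik)]
print(phi_hat)
```

The appeal is that each evaluation costs one FFT plus a sum over frequencies, rather than a dense covariance solve, which is what makes it cheap to re-evaluate as candidate change points move.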

Russell Tsuchida has made me aware of a parallel body of work (Adams and MacKay 2007; Edwards, Meyer, and Christensen 2019; Osborne 2010; Roberts et al. 2013; Saatçi, Turner, and Rasmussen 2010) which keeps the spectrogram implicit and changes the covariance kernel. This is still reasonably fast thanks to lattice GP tricks.

TODO: compare and contrast these methods. I suspect a major difference is that the former targets statisticians and the latter ML people, but they can probably be combined, or at least a neat cherry-picked method leveraging both should be feasible.

Non-Gaussian approaches

For now, see sparse stochastic processes.


Adams, Ryan Prescott, and David J. C. MacKay. 2007. “Bayesian Online Changepoint Detection.” arXiv:0710.3742 [Stat], October.
Alvarado, Pablo A., Mauricio A. Alvarez, and Dan Stowell. 2019. “Sparse Gaussian Process Audio Source Separation Using Spectrum Priors in the Time-Domain.” In ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 995–99.
Alvarado, Pablo A., and Dan Stowell. 2018. “Efficient Learning of Harmonic Priors for Pitch Detection in Polyphonic Music.” arXiv:1705.07104 [Cs, Stat], November.
Bertolacci, Michael. 2019. “Hierarchical Bayesian Mixture Models for Spatiotemporal Data with Nonstandard Features.”
Bertolacci, Michael, Edward Cripps, Ori Rosen, John W. Lau, and Sally Cripps. 2019. “Climate Inference on Daily Rainfall Across the Australian Continent, 1876–2015.” Annals of Applied Statistics 13 (2): 683–712.
Bertolacci, Michael, Ori Rosen, Edward Cripps, and Sally Cripps. 2020. “AdaptSPEC-X: Covariate Dependent Spectral Modeling of Multiple Nonstationary Time Series.” arXiv:1908.06622 [Stat], June.
Bruinsma, Wessel, and Richard E. Turner. 2018. “Learning Causally-Generated Stationary Time Series.” arXiv:1802.08167 [Stat], February.
Cemgil, Ali Taylan. 2009. “Bayesian Inference for Nonnegative Matrix Factorisation Models.” Computational Intelligence and Neuroscience.
Cheng, Changqing, Akkarapol Sa-Ngasoongsong, Omer Beyca, Trung Le, Hui Yang, Zhenyu (James) Kong, and Satish T. S. Bukkapatnam. 2015. “Time Series Forecasting for Nonlinear and Non-Stationary Processes: A Review and Comparative Study.” IIE Transactions 47 (10): 1053–71.
Choudhuri, Nidhan, Subhashis Ghosal, and Anindya Roy. 2004a. “Contiguity of the Whittle Measure for a Gaussian Time Series.” Biometrika 91 (1): 211–18.
Choudhuri, Nidhan, Subhashis Ghosal, and Anindya Roy. 2004b. “Bayesian Estimation of the Spectral Density of a Time Series.” Journal of the American Statistical Association 99 (468): 1050–59.
Cunningham, John P., Krishna V. Shenoy, and Maneesh Sahani. 2008. “Fast Gaussian Process Methods for Point Process Intensity Estimation.” In Proceedings of the 25th International Conference on Machine Learning, 192–99. ICML ’08. New York, NY, USA: ACM Press.
Curtain, Ruth F. 1975. “Infinite-Dimensional Filtering.” SIAM Journal on Control 13 (1): 89–104.
Duvenaud, David K., Hannes Nickisch, and Carl E. Rasmussen. 2011. “Additive Gaussian Processes.” In Advances in Neural Information Processing Systems, 226–34.
Dym, H., and Henry P. McKean. 2008. Gaussian Processes, Function Theory, and the Inverse Spectral Problem. Dover ed. Dover Books on Mathematics. Mineola, N.Y: Dover Publications.
Edwards, Matthew C., Renate Meyer, and Nelson Christensen. 2015. “Bayesian Semiparametric Power Spectral Density Estimation with Applications in Gravitational Wave Data Analysis,” May.
Edwards, Matthew C., Renate Meyer, and Nelson Christensen. 2019. “Bayesian Nonparametric Spectral Density Estimation Using B-Spline Priors.” Statistics and Computing 29 (1): 67–78.
Févotte, Cédric, Nancy Bertin, and Jean-Louis Durrieu. 2008. “Nonnegative Matrix Factorization with the Itakura-Saito Divergence: With Application to Music Analysis.” Neural Computation 21 (3): 793–830.
Girolami, Mark, and Simon Rogers. 2005. “Hierarchic Bayesian Models for Kernel Learning.” In Proceedings of the 22nd International Conference on Machine Learning - ICML ’05, 241–48. Bonn, Germany: ACM Press.
Godsill, Simon J, and Ali Taylan Cemgil. 2005. “Probabilistic Phase Vocoder and Its Application to Interpolation of Missing Values in Audio Signals.” In 2005 13th European Signal Processing Conference, 4.
Hartikainen, J., and S. Särkkä. 2010. “Kalman Filtering and Smoothing Solutions to Temporal Gaussian Process Regression Models.” In 2010 IEEE International Workshop on Machine Learning for Signal Processing, 379–84. Kittila, Finland: IEEE.
Hensman, James, Nicolas Durrande, and Arno Solin. 2018. “Variational Fourier Features for Gaussian Processes.” Journal of Machine Learning Research 18 (151): 1–52.
Hermansen, Gudmund Horn. 2008. “Bayesian nonparametric modelling of covariance functions, with application to time series and spatial statistics.”
Jesus, Joao, and Richard E. Chandler. 2017. “Inference with the Whittle Likelihood: A Tractable Approach Using Estimating Functions.” Journal of Time Series Analysis 38 (2): 204–24.
Kailath, Thomas. 1971. “The Structure of Radon-Nikodym Derivatives with Respect to Wiener and Related Measures.” The Annals of Mathematical Statistics 42 (3): 1054–67.
Kalman, R. 1959. “On the General Theory of Control Systems.” IRE Transactions on Automatic Control 4 (3): 110–10.
Kalman, R. E. 1960. “A New Approach to Linear Filtering and Prediction Problems.” Journal of Basic Engineering 82 (1): 35.
Karvonen, Toni, and Simo Särkkä. 2016. “Approximate State-Space Gaussian Processes via Spectral Transformation.” In 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP), 1–6. Vietri sul Mare, Salerno, Italy: IEEE.
Kirch, Claudia, Matthew C. Edwards, Alexander Meier, and Renate Meyer. 2019. “Beyond Whittle: Nonparametric Correction of a Parametric Likelihood with a Focus on Bayesian Time Series Analysis.” Bayesian Analysis 14 (4): 1037–73.
Lindgren, Finn, Håvard Rue, and Johan Lindström. 2011. “An Explicit Link Between Gaussian Fields and Gaussian Markov Random Fields: The Stochastic Partial Differential Equation Approach.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 73 (4): 423–98.
Liutkus, Antoine, Roland Badeau, and Gaël Richard. 2011. “Gaussian Processes for Underdetermined Source Separation.” IEEE Transactions on Signal Processing 59 (7): 3155–67.
Liutkus, Antoine, Zafar Rafii, Bryan Pardo, Derry Fitzgerald, and Laurent Daudet. 2014. “Kernel Spectrogram Models for Source Separation.” In, 6–10. IEEE.
Macaro, Christian, and Raquel Prado. 2014. “Spectral Decompositions of Multiple Time Series: A Bayesian Non-Parametric Approach.” Psychometrika 79 (1): 105–29.
Meier, Alexander, Claudia Kirch, and Renate Meyer. 2020. “Bayesian Nonparametric Analysis of Multivariate Time Series: A Matrix Gamma Process Approach.” Journal of Multivariate Analysis 175 (January): 104560.
Meyer, Renate, Matthew C. Edwards, Patricio Maturana-Russel, and Nelson Christensen. 2020. “Computational Techniques for Parameter Estimation of Gravitational Wave Signals.” WIREs Computational Statistics n/a (n/a): e1532.
Nickisch, Hannes, Arno Solin, and Alexander Grigorevskiy. 2018. “State Space Gaussian Processes with Non-Gaussian Likelihood.” In International Conference on Machine Learning, 3789–98.
Osborne, Michael A. 2010. “Bayesian Gaussian Processes for Sequential Prediction, Optimisation and Quadrature.” Http://, Oxford University, UK.
Rasmussen, Carl Edward, and Hannes Nickisch. 2010. “Gaussian Processes for Machine Learning (GPML) Toolbox.” Journal of Machine Learning Research 11 (Nov): 3011–15.
Reece, S., and S. Roberts. 2010. “An Introduction to Gaussian Processes for the Kalman Filter Expert.” In 2010 13th International Conference on Information Fusion, 1–9.
Remes, Sami, Markus Heinonen, and Samuel Kaski. 2017. “Non-Stationary Spectral Kernels.” In Advances in Neural Information Processing Systems 30, edited by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, 4642–51. Curran Associates, Inc.
Remes, Sami, Markus Heinonen, and Samuel Kaski. 2018. “Neural Non-Stationary Spectral Kernel.” arXiv:1811.10978 [Cs, Stat], November.
Roberts, S., M. Osborne, M. Ebden, S. Reece, N. Gibson, and S. Aigrain. 2013. “Gaussian Processes for Time-Series Modelling.” Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 371 (1984): 20110550.
Rosen, Ori, David S. Stoffer, and Sally Wood. 2009. “Local Spectral Analysis via a Bayesian Mixture of Smoothing Splines.” Journal of the American Statistical Association 104 (485): 249–62.
Rosen, Ori, Sally Wood, and David S. Stoffer. 2012. “AdaptSPEC: Adaptive Spectral Estimation for Nonstationary Time Series.” Journal of the American Statistical Association 107 (500): 1575–89.
Saatçi, Yunus, Ryan Turner, and Carl Edward Rasmussen. 2010. “Gaussian Process Change Point Models.” In Proceedings of the 27th International Conference on International Conference on Machine Learning, 927–34. ICML’10. Madison, WI, USA: Omnipress.
Särkkä, S., and J. Hartikainen. 2013. “Non-Linear Noise Adaptive Kalman Filtering via Variational Bayes.” In 2013 IEEE International Workshop on Machine Learning for Signal Processing (MLSP), 1–6.
Särkkä, Simo. 2007. “On Unscented Kalman Filtering for State Estimation of Continuous-Time Nonlinear Systems.” IEEE Transactions on Automatic Control 52 (9): 1631–41.
Särkkä, Simo, and A. Nummenmaa. 2009. “Recursive Noise Adaptive Kalman Filtering by Variational Bayesian Approximations.” IEEE Transactions on Automatic Control 54 (3): 596–600.
Särkkä, Simo, A. Solin, and J. Hartikainen. 2013. “Spatiotemporal Learning via Infinite-Dimensional Bayesian Filtering and Smoothing: A Look at Gaussian Process Regression Through Kalman Filtering.” IEEE Signal Processing Magazine 30 (4): 51–61.
Särkkä, Simo, and Arno Solin. 2019. Applied Stochastic Differential Equations. Institute of Mathematical Statistics Textbooks 10. Cambridge ; New York, NY: Cambridge University Press.
Solin, Arno. 2016. “Stochastic Differential Equation Methods for Spatio-Temporal Gaussian Process Regression.” Aalto University.
Solin, Arno, and Simo Särkkä. 2013. “Infinite-Dimensional Bayesian Filtering for Detection of Quasiperiodic Phenomena in Spatiotemporal Data.” Physical Review E 88 (5): 052909.
Solin, Arno, and Simo Särkkä. 2014. “Explicit Link Between Periodic Covariance Functions and State Space Models.” In Artificial Intelligence and Statistics, 904–12.
Sykulski, Adam M., Sofia C. Olhede, Arthur P. Guillaumin, Jonathan M. Lilly, and Jeffrey J. Early. 2019. “The Debiased Whittle Likelihood.” Biometrika 106 (2): 251–66.
Tobar, Felipe. 2019. “Band-Limited Gaussian Processes: The Sinc Kernel.” Advances in Neural Information Processing Systems 32: 12749–59.
Tobar, Felipe, Lerko Araya-Hernández, Pablo Huijse, and Petar M. Djurić. 2020. “Bayesian Reconstruction of Fourier Pairs.” arXiv:2011.04585 [Eess, Stat], November.
Tuft, Marie. 2020. “Statistical Learning for the Spectral Analysis of Time Series Data.”
Turner, Richard E., and Maneesh Sahani. 2014. “Time-Frequency Analysis as Probabilistic Inference.” IEEE Transactions on Signal Processing 62 (23): 6171–83.
Valenzuela, C., and F. Tobar. 2019. “Low-Pass Filtering as Bayesian Inference.” In ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 3367–71.
Virtanen, T., A. Taylan Cemgil, and S. Godsill. 2008. “Bayesian Extensions to Non-Negative Matrix Factorisation for Audio Signal Modelling.” In 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, 1825–28.
Wiener, N, and P Masani. 1957. “The Prediction Theory of Multivariate Stochastic Processes.” Acta Mathematica 98 (1): 111–50.
Wiener, N, and P Masani. 1958. “The Prediction Theory of Multivariate Stochastic Processes, II.” Acta Mathematica 99 (1): 93–137.
Wilkinson, W. 2019. “Gaussian Process Modelling for Audio Signals.” Thesis, London: Queen Mary University of London.
Wilkinson, William J., M. Riis Andersen, J. D. Reiss, D. Stowell, and A. Solin. 2019a. “Unifying Probabilistic Models for Time-Frequency Analysis.” In ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 3352–56.
Wilkinson, William J., Michael Riis Andersen, Joshua D. Reiss, Dan Stowell, and Arno Solin. 2019b. “End-to-End Probabilistic Inference for Nonstationary Audio Analysis.” arXiv:1901.11436 [Cs, Eess, Stat], January.
Wilson, Andrew Gordon, and Ryan Prescott Adams. 2013. “Gaussian Process Kernels for Pattern Discovery and Extrapolation.” In International Conference on Machine Learning.
Yaglom, A. M. 1987. Correlation Theory of Stationary and Related Random Functions. Volume II: Supplementary Notes and References. Springer Series in Statistics. New York, NY: Springer Science & Business Media.
Zheng, Yanbing, Jun Zhu, and Anindya Roy. 2010. “Nonparametric Bayesian Inference for the Spectral Density Function of a Random Field.” Biometrika 97 (1): 238–45.

[1] We might more generally consider a sampling problem where we observe the signal through inner products with some sampling kernel, possibly even a stochastic one, but that sounds complicated.
