Correlograms

Also covariances



This material is revised and expanded from the appendix of draft versions of a recent conference submission, for my own reference. I used (deterministic) correlograms a lot in that submission, and it was startlingly hard to find a decent summary of their properties anywhere. Nothing new here, but… see the material on doing this in a probabilistic way via the Wiener–Khintchine representation and covariance kernels, which leads to a natural probabilistic spectral analysis.

Ning Ma, autocorrelogram of a processed audio signal

\[\renewcommand{\var}{\operatorname{Var}} \renewcommand{\dd}{\mathrm{d}} \renewcommand{\pd}{\partial} \renewcommand{\bb}[1]{\mathbb{#1}} \renewcommand{\vv}[1]{\boldsymbol{#1}} \renewcommand{\mm}[1]{\mathrm{#1}} \renewcommand{\mmm}[1]{\mathrm{#1}} \renewcommand{\cc}[1]{\mathcal{#1}} \renewcommand{\ff}[1]{\mathfrak{#1}} \renewcommand{\oo}[1]{\operatorname{#1}} \renewcommand{\gvn}{\mid} \renewcommand{\II}[1]{\mathbb{I}\{#1\}} \renewcommand{\inner}[2]{\langle #1,#2\rangle} \renewcommand{\Inner}[2]{\left\langle #1,#2\right\rangle} \renewcommand{\finner}[3]{\langle #1,#2;#3\rangle} \renewcommand{\FInner}[3]{\left\langle #1,#2;#3\right\rangle} \renewcommand{\dinner}[2]{[ #1,#2]} \renewcommand{\DInner}[2]{\left[ #1,#2\right]} \renewcommand{\norm}[1]{\| #1\|} \renewcommand{\Norm}[1]{\left\| #1\right\|} \renewcommand{\fnorm}[2]{\| #1;#2\|} \renewcommand{\FNorm}[2]{\left\| #1;#2\right\|} \renewcommand{\argmax}{\operatorname{arg max}} \renewcommand{\argmin}{\operatorname{arg min}} \renewcommand{\omp}{\mathop{\mathrm{OMP}}}\]

Consider an \(L_2\) signal \(f: \bb{R}\to\bb{R}.\) We frequently overload notation and refer to a signal by its value at a free argument \(t\), so that \(f(rt-\xi),\) for example, refers to the signal \(t\mapsto f(rt-\xi).\) We write the inner product between signals \(t\mapsto f(t)\) and \(t\mapsto f'(t)\) as \(\inner{f(t)}{f'(t)}\). Where it is not clear what the free argument is, e.g. \(t\), we annotate it \(\finner{f(t)}{f'(t)}{t}\).

The correlogram \(\cc{A}:L_2(\bb{R}) \to L_2(\bb{R})\) maps signals to signals. Specifically, \(\mathcal{A}\{f\}\) is a signal \(\bb{R}\to\bb{R}\) such that

\[\mathcal{A}\{f\}:=\xi \mapsto \finner{ f(t) }{ f(t-\xi) }{t}\] This is the covariance between \(f(t)\) and \(f(t-\xi).\) (Note that we are here discussing the covariance between given deterministic signals, not between two stochastic sources; the covariance of stochastic processes is a broader topic, let alone the problem of inferring it from data.) Note also that this is what I would call an autocovariance rather than an auto-correlation, since it is not normalized, but I will stick with the latter term for reasons of convention.
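For my own reference, here is a minimal discrete sketch of this definition (assumptions mine: a uniformly sampled real signal treated as zero outside the observation window, lags measured in samples, and a helper name `correlogram` that belongs to no particular library):

```python
import numpy as np

def correlogram(f: np.ndarray, max_lag: int) -> np.ndarray:
    """Unnormalised correlogram A{f}(xi) = <f(t), f(t - xi)>_t at integer lags
    xi = 0, ..., max_lag, taking f to be zero outside the observed samples."""
    return np.array([f[lag:] @ f[:f.size - lag] for lag in range(max_lag + 1)])

# A noisy sinusoid: the correlogram peaks at (multiples of) its period.
t = np.arange(0, 1, 1 / 1000)                        # 1 s sampled at 1 kHz
f = np.sin(2 * np.pi * 50 * t) + 0.1 * np.random.randn(t.size)
A_f = correlogram(f, max_lag=100)
print(np.argmax(A_f[1:]) + 1)                        # ~20 samples, one 50 Hz period
```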

We derive the properties of this transform.

Multiplication by a constant. Consider a constant \(c\in \bb{R}.\)

\[\begin{aligned}\mathcal{A}\{cf\}(\xi)&= \finner{ cf(t) }{ cf(t-\xi) }{t}\\ &= c^2\finner{ f(t) }{ f(t-\xi) }{t}\\ &= c^2\mathcal{A}\{f\}(\xi).\\ \end{aligned}\]
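Continuing the discrete sketch above, the \(c^2\) scaling is easy to check numerically (again using the hypothetical `correlogram` helper and the test signal `f` defined there):

```python
c = 3.0
assert np.allclose(correlogram(c * f, max_lag=100),
                   c**2 * correlogram(f, max_lag=100))
```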

Time scaling. Consider a scale factor \(r>0.\)

\[\begin{aligned}\mathcal{A}\{f(r t)\}(\xi) &=\finner{ f(r t) }{ f(r (t-\xi)) }{t}\\ &= \int f(r t)f(r t-r\xi)\dd t\\ &= \frac{1}{r}\int f(u)f(u-r\xi)\dd u\\ &= \frac{1}{r} \mathcal{A}\{f\}\left(r\xi\right),\\ \end{aligned}\]

substituting \(u=rt.\)
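This scaling rule can be checked numerically by approximating the inner products with a Riemann sum on a fine grid. A small standalone sketch (assumptions mine: a Gaussian test pulse that is effectively zero outside the grid, and \(r>0\)):

```python
import numpy as np

def acorr(g, xi, grid, dt):
    """Riemann-sum approximation to A{g}(xi) = ∫ g(t) g(t - xi) dt."""
    return np.sum(g(grid) * g(grid - xi)) * dt

pulse = lambda t: np.exp(-t**2)          # decays fast enough to ignore truncation
grid = np.arange(-50.0, 50.0, 1e-3)
r, xi = 2.5, 0.7

lhs = acorr(lambda t: pulse(r * t), xi, grid, 1e-3)   # A{f(rt)}(xi)
rhs = acorr(pulse, r * xi, grid, 1e-3) / r            # (1/r) A{f}(r xi)
assert np.isclose(lhs, rhs)
```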

Addition:

\[\begin{aligned}\mathcal{A}\{f+f'\}(\xi) &=\finner{ f(t)+f'(t) }{ f(t-\xi)+f'(t-\xi) }{t}\\ &=\finner{ f(t) }{ f(t-\xi) }{t} +\finner{ f(t) }{ f'(t-\xi) }{t} +\finner{ f'(t) }{ f(t-\xi) }{t} +\finner{ f'(t) }{ f'(t-\xi) }{t}\\ &= \mathcal{A}\{f\}(\xi)+ \finner{ f'(t) }{ f(t-\xi)}{t} +\finner{f(t)}{f'(t-\xi) }{t} +\mathcal{A}\{f'\}(\xi)\\ &= \mathcal{A}\{f\}(\xi)+ \finner{ f'(t) }{ f(t-\xi)}{t} +\finner{f(t+\xi)}{f'(t) }{t} +\mathcal{A}\{f'\}(\xi)\\ &= \mathcal{A}\{f\}(\xi)+ \finner{ f'(t) }{ f(t-\xi)}{t} +\finner{f'(t) }{f(t+\xi)}{t} +\mathcal{A}\{f'\}(\xi).\\ \end{aligned}\]

We can say little about the term \(\finner{ f'(t) }{ f(t-\xi)}{t}+\finner{f'(t) }{f(t+\xi)}{t}\) without more information about the signals in question. However, we can handle a randomized version. Suppose \(S_i, \, i \in\bb{N}\) are i.i.d. Rademacher variables, i.e. they assume a value in \(\{+1,-1\}\) with equal probability. Then we can establish the following property:

Randomised addition:

\[\begin{aligned} \bb{E}[ \mathcal{A}\{S_1f + S_2f'\}(\xi)] &=\bb{E}\left[ \mathcal{A}\{S_1f\}(\xi) + \finner{ S_2 f'(t) }{ S_1 f(t-\xi)}{t} +\finner{S_2f'(t) }{S_1 f(t+\xi)}{t} +\mathcal{A}\{S_2f'\}(\xi)\right]\\ &=\bb{E}[ \mathcal{A}\{S_1f\}(\xi)] + \bb{E}\finner{ S_2 f'(t) }{ S_1 f(t-\xi)}{t} + \bb{E}\finner{S_2f'(t) }{S_1 f(t+\xi)}{t} +\bb{E}[ \mathcal{A}\{S_2f'\}(\xi)]\\ &=\mathcal{A}\{f\}(\xi)+ \bb{E}[ S_1S_2]\finner{ f'(t) }{ f(t-\xi) }{t} + \bb{E}[ S_1S_2]\finner{ f'(t) }{ f(t+\xi) }{t}+\mathcal{A}\{f'\}(\xi)\\ &=\mathcal{A}\{f\}(\xi)+ \mathcal{A}\{f'\}(\xi),\\ \end{aligned}\]

since \(S_1^2=S_2^2=1\) and, by independence, \(\bb{E}[S_1S_2]=\bb{E}[S_1]\,\bb{E}[S_2]=0.\)
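Since each \(S_i\) takes only two values, the expectation over the signs is just an average over the four equally likely sign combinations, so the cancellation can be checked exactly in the discrete sketch from above (reusing the hypothetical `correlogram` helper and the signal `f`, plus a second test signal of my own choosing):

```python
f2 = np.sign(np.sin(2 * np.pi * 7 * t))    # a second test signal on the same grid

# Average A{s1 f + s2 f2} over the four equally likely Rademacher sign pairs;
# the cross terms carry a factor s1 * s2, which averages to zero.
combos = [(s1, s2) for s1 in (-1.0, 1.0) for s2 in (-1.0, 1.0)]
mean_A = np.mean([correlogram(s1 * f + s2 * f2, max_lag=100) for s1, s2 in combos],
                 axis=0)
assert np.allclose(mean_A, correlogram(f, max_lag=100) + correlogram(f2, max_lag=100))
```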

References

Abrahamsen, Petter. 1997. “A Review of Gaussian Random Fields and Correlation Functions.”
Bochner, Salomon. 1959. Lectures on Fourier Integrals. Princeton University Press.
Brown, Judith C., and Miller S. Puckette. 1989. “Calculation of a ‘Narrowed’ Autocorrelation Function.” The Journal of the Acoustical Society of America 85 (4): 1595–1601.
Cariani, P. A., and B. Delgutte. 1996. “Neural Correlates of the Pitch of Complex Tones. I. Pitch and Pitch Salience.” Journal of Neurophysiology 76 (3): 1698–1716.
Cheveigné, Alain de, and Hideki Kawahara. 2002. “YIN, a Fundamental Frequency Estimator for Speech and Music.” The Journal of the Acoustical Society of America 111 (4): 1917–30.
Kaso, Artan. 2018. “Computation of the Normalized Cross-Correlation by Fast Fourier Transform.” PLOS ONE 13 (9): e0203434.
Khintchine, A. 1934. “Korrelationstheorie der stationären stochastischen Prozesse.” Mathematische Annalen 109 (1): 604–15.
Langner, Gerald. 1992. “Periodicity Coding in the Auditory System.” Hearing Research 60 (2): 115–42.
Lewis, J. P. 1995. “Fast Template Matching.” Quebec City, Canada: Canadian Image Processing and Pattern Recognition Society.
Licklider, J. C. R. 1951. “A Duplex Theory of Pitch Perception.” Experientia 7 (4): 128–34.
Loynes, R. M. 1968. “On the Concept of the Spectrum for Non-Stationary Processes.” Journal of the Royal Statistical Society. Series B (Methodological) 30 (1): 1–30.
Ma, Ning, Phil Green, Jon Barker, and André Coy. 2007. “Exploiting Correlogram Structure for Robust Speech Recognition with Multiple Speech Sources.” Speech Communication 49 (12): 874–91.
Morales-Cordovilla, J. A., A. M. Peinado, V. Sanchez, and J. A. Gonzalez. 2011. “Feature Extraction Based on Pitch-Synchronous Averaging for Robust Speech Recognition.” IEEE Transactions on Audio, Speech, and Language Processing 19 (3): 640–51.
Rabiner, L. 1977. “On the Use of Autocorrelation Analysis for Pitch Detection.” IEEE Transactions on Acoustics, Speech, and Signal Processing 25 (1): 24–33.
Slaney, M., and R. F. Lyon. 1990. “A Perceptual Pitch Detector.” In Proceedings of ICASSP, 357–60, vol. 1.
Sondhi, M. 1968. “New Methods of Pitch Extraction.” IEEE Transactions on Audio and Electroacoustics 16 (2): 262–66.
Tan, L. N., and A. Alwan. 2011. “Noise-Robust F0 Estimation Using SNR-Weighted Summary Correlograms from Multi-Band Comb Filters.” In 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 4464–67.
Wiener, Norbert. 1930. “Generalized Harmonic Analysis.” Acta Mathematica 55: 117–258.
