Path smoothness properties of stochastic processes

Continuity, differentiability and other smoothness properties



\[\renewcommand{\var}{\operatorname{Var}} \renewcommand{\dd}{\mathrm{d}} \renewcommand{\bb}[1]{\mathbb{#1}} \renewcommand{\vv}[1]{\boldsymbol{#1}} \renewcommand{\rv}[1]{\mathsf{#1}} \renewcommand{\gvn}{\mid} \renewcommand{\Ex}{\mathbb{E}} \renewcommand{\Pr}{\mathbb{P}}\]

“When are the paths of a stochastic process continuous?” is a question one might like to ask. But we need to ask more precise questions than that, because things are complicated in probability land. If we are concerned about whether the paths sampled from the process are almost-surely continuous functions then we probably mean something like:

“Does the process \(\{\rv{f}(t)\}_t\) admit a modification such that \(t\mapsto \rv{f}(t)\) is a.e. Hölder-continuous with probability 1?” or some other such mouthful. There are many notions of continuity of stochastic processes: continuous with respect to what, and with what probability? There is also Feller continuity, and so on. This notebook is not an exhaustive taxonomy; it is just a list of notions I need to remember. Commonly useful notions for a stochastic process \(\{\rv{f}(t)\}_{t\in T}\) include the following.

Continuity in probability:
\(\lim _{s \rightarrow t} \mathbb{P}\{|\rv{f}(t)-\rv{f}(s)| \geq \varepsilon\}=0, \quad\) for each \(t \in T\) and each \(\varepsilon>0.\)
Continuity in mean square, or \(L^{2}\) continuity:
\[ \lim _{s \rightarrow t} \mathbb{E}\left\{|\rv{f}(t)-\rv{f}(s)|^{2}\right\}=0, \quad \text { for each } t \in T. \]
Sample continuity:
\[ \mathbb{P}\left\{\lim _{s \rightarrow t}|\rv{f}(t)-\rv{f}(s)|=0, \text { for all } t \in T\right\}=1. \]

I have given these as continuity properties for all \(t\in T,\) but they can also be considered pointwise for fixed \(t\). Since the index set \(T\) is typically uncountable, passing from the pointwise statements to the "for all \(t\)" statements can lead to subtle problems with uncountable unions of events etc.

Jump processes show the difference between these. A Poisson process on \([0,\infty)\) has paths which are, with probability 1, not continuous, and yet the process is continuous in mean square and in probability.
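
To make the Poisson example concrete, here is a short check, assuming \(\rv{N}\) is a homogeneous Poisson process with rate \(\lambda>0\) (the symbol \(\rv{N}\) and the rate \(\lambda\) are my notation, not fixed anywhere above). For \(s \neq t\) the increment \(\rv{N}(t)-\rv{N}(s)\) is, up to sign, Poisson distributed with mean \(\lambda|t-s|\), so \[ \mathbb{E}\left\{|\rv{N}(t)-\rv{N}(s)|^{2}\right\}=\lambda|t-s|+\lambda^{2}|t-s|^{2} \rightarrow 0 \quad \text { as } s \rightarrow t, \] and, for each \(\varepsilon>0\), \[ \mathbb{P}\{|\rv{N}(t)-\rv{N}(s)| \geq \varepsilon\} \leq \mathbb{P}\{\rv{N}(t) \neq \rv{N}(s)\}=1-e^{-\lambda|t-s|} \rightarrow 0 \quad \text { as } s \rightarrow t. \] Both pointwise notions hold at every fixed \(t\), because a jump lands exactly at a given \(t\) with probability 0, even though a jump occurs somewhere in \([0,\infty)\) with probability 1.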

Kolmogorov continuity theorem

The Kolmogorov continuity theorem gives us sufficient conditions for a process to admit a modification with Hölder-continuous paths, based on how rapidly the moments of the process increments grow. Question: What gives us sufficient conditions? Lowther is good on this.
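
For my own reference, one common form of the statement, written from memory for a process indexed by \(t \in[0, T_0]\) (the constants \(\alpha, \beta, C, \gamma\) and the horizon \(T_0\) are notation I am introducing here): if there exist \(\alpha, \beta, C>0\) such that \[ \mathbb{E}\left\{|\rv{f}(t)-\rv{f}(s)|^{\alpha}\right\} \leq C|t-s|^{1+\beta} \quad \text { for all } s, t \in[0, T_0], \] then \(\rv{f}\) admits a modification whose paths are \(\gamma\)-Hölder continuous for every \(\gamma \in(0, \beta / \alpha)\). As a sanity check, standard Brownian motion \(\rv{B}\) satisfies \[ \mathbb{E}\left\{|\rv{B}(t)-\rv{B}(s)|^{2 n}\right\}=(2 n-1) ! !\,|t-s|^{n}, \] so we may take \(\alpha=2 n\) and \(\beta=n-1\), giving Hölder exponents \(\gamma<\frac{n-1}{2 n}\); letting \(n \rightarrow \infty\) recovers the familiar fact that Brownian paths are \(\gamma\)-Hölder for every \(\gamma<1 / 2\).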

SDEs with rough paths

Despite the name, this is useful for smooth paths. See signatures and rough paths.

Connection to strong solutions of SDEs

TBD.

Continuity of Gaussian processes

Todo: Read Kanagawa et al. (2018) section 4, for the startling revelations:

… it is easy to show that a GP sample path \(\rv{f} \sim \mathcal{G P}(0, K)\) does not belong to the corresponding RKHS \(\mathcal{H}_{K}\) with probability 1 if \(\mathcal{H}_{K}\) is infinite dimensional… This implies that GP samples are “rougher”, or less regular, than RKHS functions … Note that this fact has been well known in the literature; see e.g., (Wahba 1990, 5) and (Lukić and Beder 2001 Corollary 7.1).

Let \(K\) be a positive definite kernel on a set \(\mathcal{X}\) and \(\mathcal{H}_{K}\) be its RKHS, and consider \(\rv{f} \sim \mathcal{G} \mathcal{P}(m, K)\) with \(m: \mathcal{X} \rightarrow \mathbb{R}\) satisfying \(m \in \mathcal{H}_{K} .\) Then if \(\mathcal{H}_{K}\) is infinite dimensional, then \(\rv{f} \in \mathcal{H}_{K}\) with probability \(0 .\) If \(\mathcal{H}_{K}\) is finite dimensional, then there is a version \(\tilde{\rv{f}}\) of \(\rv{f}\) such that \(\tilde{\rv{f}} \in \mathcal{H}_{K}\) with probability 1.
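
A finite-dimensional sanity check of this claim, as a rough sketch rather than a proof. On a grid of \(n\) points, the squared RKHS norm of the minimum-norm interpolant of sampled values \(f\) is \(f^{\top} K^{-1} f\), which lower-bounds the norm of any RKHS function passing through those values. For a GP draw \(f=L z\) with \(K=L L^{\top}\) this equals \(z^{\top} z \sim \chi^2_n\), which grows like \(n\) rather than converging. The exponential kernel, the grid on \([0,1]\) and all function names below are my own illustrative choices.

```python
import numpy as np

def ou_kernel(x, lengthscale=1.0):
    """Exponential kernel matrix K[i, j] = exp(-|x_i - x_j| / lengthscale)."""
    return np.exp(-np.abs(x[:, None] - x[None, :]) / lengthscale)

def interpolant_rkhs_norm_sq(n, rng):
    """Squared RKHS norm of the minimum-norm interpolant of one GP draw on n grid points."""
    x = np.linspace(0.0, 1.0, n)
    K = ou_kernel(x)
    L = np.linalg.cholesky(K)                # K = L L^T
    f = L @ rng.standard_normal(n)           # f ~ N(0, K)
    return float(f @ np.linalg.solve(K, f))  # f^T K^{-1} f

rng = np.random.default_rng(0)
# The norm of the interpolant keeps growing as the grid is refined.
norms = {n: interpolant_rkhs_norm_sq(n, rng) for n in (10, 100, 1000)}
for n, value in norms.items():
    print(n, value)
```

The printed values should be of the same order as \(n\), which is the discrete shadow of the sample path having infinite RKHS norm.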

\(L^2\) derivatives of random fields

Adler, Taylor, and Worsley (2016) define \(L^2\) derivatives thus: Choose a point \(t \in \mathbb{R}^{d}\) and a sequence of \(k\) ‘directions’ \(t_{1}', \ldots, t_{k}'\) in \(\mathbb{R}^{d}\), and write these as \(t'=\left(t_{1}', \ldots, t_{k}'\right).\) From context I assume that these directions are supposed to have unit norm, \(\|t_j'\|=1.\) We say that \(\rv{f}\) has a \(k\)-th order \(L^{2}\) partial derivative at \(t\), in the direction \(t'\), if the limit \[ D_{L^{2}}^{k} \rv{f}\left(t, t'\right) \triangleq \lim _{h_{1}, \ldots, h_{k} \rightarrow 0} \frac{1}{\prod_{j=1}^{k} h_{j}} \Delta^{k} \rv{f}\left(t, t', h\right) \] exists in mean square, where \(h=\left(h_{1}, \ldots, h_{k}\right)\). Here \(\Delta^{k} \rv{f}\left(t, t', h\right)\) is the symmetrized difference \[ \Delta^{k} \rv{f}\left(t, t', h\right)=\sum_{s \in\{0,1\}^{k}}(-1)^{k-\sum_{j=1}^{k} s_{j}} \rv{f}\left(t+\sum_{j=1}^{k} s_{j} h_{j} t_{j}'\right) \] and the limit is taken sequentially, i.e. first send \(h_{1}\to 0,\) then \(h_{2}\), etc. The directions \(t_{j}'\) are usually axis aligned, e.g. \(t_{j}'=[\dots\, 0\, 1\,0\, \dots]^\top\).

That is a lot, so let us examine that for the special case of \(k=1\) and \(t_{1}'=[1\,0\dots]^\top=:e_1.\) We choose a point \(t \in \mathbb{R}^{d}\) and, w.l.o.g., the direction \(e_1.\) The symmetrised difference in this first order case becomes \[\begin{aligned} \Delta \rv{f}\left(t, e_1, h\right) &=\sum_{s \in\{0,1\}}(-1)^{1- s} \rv{f}\left(t+ s h e_1\right)\\ &=\rv{f}\left(t+ h e_1\right) - \rv{f}\left(t\right). \end{aligned}\] We say that \(\rv{f}\) has a first order \(L^{2}\) partial derivative at \(t\), in the direction \(e_1\), if the limit \[\begin{aligned} D_{L^{2}} \rv{f}\left(t, e_1\right) &= \lim _{h \rightarrow 0} \frac{1}{h} \Delta \rv{f}\left(t, e_1, h\right)\\ &= \lim _{h \rightarrow 0} \frac{\rv{f}\left(t+ h e_1\right) - \rv{f}\left(t\right)}{h} \end{aligned}\] exists in mean square. This should look like the usual first order (partial) derivative, just with the term mean-square thrown in front.
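
A quick numerical diagnostic for the first order case, as a sketch under my own assumptions: a centred field on \(\mathbb{R}\) with covariance \(K\), for which the second moment of the difference quotient is \(\left(K(t+h,t+h)-2K(t+h,t)+K(t,t)\right)/h^2\). If this blows up as \(h\to 0\) the mean-square derivative cannot exist; staying bounded is necessary, not a full proof of existence. The two kernels and all function names are illustrative choices, not taken from the reference.

```python
import numpy as np

def sq_exp(s, t):
    """Squared-exponential covariance, unit lengthscale."""
    return np.exp(-0.5 * (s - t) ** 2)

def exponential(s, t):
    """Exponential (Ornstein-Uhlenbeck) covariance, unit lengthscale."""
    return np.exp(-np.abs(s - t))

def diff_quotient_second_moment(kernel, t, h):
    """E[((f(t+h) - f(t)) / h)^2] for a centred field with covariance `kernel`."""
    return (kernel(t + h, t + h) - 2.0 * kernel(t + h, t) + kernel(t, t)) / h**2

hs = [1e-1, 1e-2, 1e-3]
smooth = [diff_quotient_second_moment(sq_exp, 0.0, h) for h in hs]
rough = [diff_quotient_second_moment(exponential, 0.0, h) for h in hs]
print("squared exponential:", smooth)  # settles near a finite value
print("exponential:", rough)           # grows roughly like 2 / h
```

The squared-exponential values settle near 1, while the exponential-kernel values diverge, matching the usual intuition that the latter field is mean-square continuous but not mean-square differentiable.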

By choosing \(t^{\prime}=\left(e_{j_{1}}, \ldots, e_{j_{k}}\right)\), where \(e_{j}\) is the vector with \(j\)-th element 1 and all others zero, we can talk of the mean square partial derivatives of various orders \[ \frac{\partial^{k}}{\partial t_{j_{1}} \ldots \partial t_{j_{k}}} \rv{f}(t) \triangleq D_{L^{2}}^{k} \rv{f}\left(t,\left(e_{j_{1}}, \ldots, e_{j_{k}}\right)\right) \] of \(\rv{f}.\) Then we see that the covariance function of partial derivatives of a (centred) random field with covariance function \(K\) must, if it exists and is finite, be given by \[ \mathbb{E}\left\{\frac{\partial^{k} \rv{f}(s)}{\partial s_{j_{1}} \ldots \partial s_{j_{k}}} \frac{\partial^{k} \rv{f}(t)}{\partial t_{j_{1}} \ldots \partial t_{j_{k}}}\right\}=\frac{\partial^{2 k} K(s, t)}{\partial s_{j_{1}} \partial t_{j_{1}} \ldots \partial s_{j_{k}} \partial t_{j_{k}}}. \] Note that we have not assumed stationarity here, or Gaussianity, and still this process covariance function encodes a lot of information.
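
A worked instance of the covariance formula, assuming a centred field on \(\mathbb{R}\) (so \(d=1\), \(k=1\)) with squared-exponential covariance and lengthscale \(\ell>0\); the kernel and the symbol \(\ell\) are my illustrative choices. With \[ K(s, t)=\exp \left(-\frac{(s-t)^{2}}{2 \ell^{2}}\right), \] differentiating once in each argument gives \[ \mathbb{E}\left\{\frac{\partial \rv{f}(s)}{\partial s} \frac{\partial \rv{f}(t)}{\partial t}\right\}=\frac{\partial^{2} K(s, t)}{\partial s\, \partial t}=\left(\frac{1}{\ell^{2}}-\frac{(s-t)^{2}}{\ell^{4}}\right) \exp \left(-\frac{(s-t)^{2}}{2 \ell^{2}}\right), \] so the mean-square derivative has variance \(1/\ell^{2}\) at every \(t\): shorter lengthscales give wigglier paths, as expected.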

In the case that \(\rv{f}\) is stationary, we can use the spectral representation to analyze these derivatives. In this case, the corresponding variances have an interpretation in terms of spectral moments. Writing \(\nu\) for the spectral measure of \(K\), and \(N\) for the dimension of the index space (called \(d\) above), we define the spectral moments \[ \omega_{j_{1} \ldots j_{N}} \triangleq \int_{\mathbb{R}^{N}} \omega_{1}^{j_{1}} \cdots \omega_{N}^{j_{N}} \nu(d \omega) \] for all multi-indices \(\left(j_{1}, \ldots, j_{N}\right)\) with \(j_{n} \geq 0\). Assuming that the underlying random field, and so the covariance function, are real valued, stationarity implies that \(K(t)=K(-t)\) and \(\nu(A)=\nu(-A)\), and it follows that the odd ordered spectral moments, when they exist, are zero; specifically, \[ \omega_{j_{1} \ldots j_{N}}=0 \quad \text { if } \sum_{n=1}^{N} j_{n} \text { is odd. } \]

For example, if \(\rv{f}\) has mean square partial derivatives of orders \(\alpha+\beta\) and \(\gamma+\delta\) for \(\alpha, \beta, \gamma, \delta \in\{0,1,2, \ldots\}\), then \[ \begin{aligned} \mathbb{E}\left\{\frac{\partial^{\alpha+\beta} \rv{f}(t)}{\partial^{\alpha} t_{j} \partial^{\beta} t_{k}} \frac{\partial^{\gamma+\delta} \rv{f}(t)}{\partial^{\gamma} t_{\ell} \partial^{\delta} t_{m}}\right\} &=\left.(-1)^{\alpha+\beta} \frac{\partial^{\alpha+\beta+\gamma+\delta}}{\partial^{\alpha} t_{j} \partial^{\beta} t_{k} \partial^{\gamma} t_{\ell} \partial^{\delta} t_{m}} K(t)\right|_{t=0} \\ &=(-1)^{\alpha+\beta}\, \mathrm{i}^{\alpha+\beta+\gamma+\delta} \int_{\mathbb{R}^{N}} \omega_{j}^{\alpha} \omega_{k}^{\beta} \omega_{\ell}^{\gamma} \omega_{m}^{\delta} \nu(d \omega), \end{aligned} \] where \(\mathrm{i}\) is the imaginary unit and \(j, k, \ell, m\) index coordinates. Note that although this equation seems to have some asymmetries in the powers, these disappear due to the fact that all odd ordered spectral moments, like all odd ordered derivatives of \(K\), are identically zero.
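
A small numerical check of the simplest case \(\alpha=\gamma=1\), \(\beta=\delta=0\) in one dimension, where the formula reduces to \(\mathbb{E}\{\rv{f}'(t)^2\}=-K''(0)=\int \omega^{2}\, \nu(d \omega)\). This is a sketch under my own assumptions: \(K(t)=e^{-t^{2}/2}\), whose spectral measure under the convention \(K(t)=\int e^{\mathrm{i} t \omega} \nu(d \omega)\) is the standard normal distribution. The grid, step size and function names are arbitrary illustrative choices.

```python
import numpy as np

# Assumed example: K(t) = exp(-t^2 / 2), with standard normal spectral density.
omega = np.linspace(-12.0, 12.0, 240001)
d_omega = omega[1] - omega[0]
density = np.exp(-0.5 * omega**2) / np.sqrt(2.0 * np.pi)

def spectral_moment(order):
    """Riemann-sum approximation of the integral of omega^order against nu."""
    return float(np.sum(omega**order * density) * d_omega)

def K(t):
    """Stationary covariance as a function of the lag t."""
    return np.exp(-0.5 * t**2)

# Central finite difference for -K''(0).
h = 1e-3
minus_K_second_deriv_at_0 = -(K(h) - 2.0 * K(0.0) + K(-h)) / h**2

first_moment = spectral_moment(1)   # odd order, should vanish
second_moment = spectral_moment(2)  # should match -K''(0)
print(first_moment, second_moment, minus_K_second_deriv_at_0)
```

Both the second spectral moment and \(-K''(0)\) come out close to 1, and the first spectral moment is numerically zero, consistent with the odd-moment remark.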

References

Adler, Robert J. 2010. The Geometry of Random Fields. SIAM ed. Philadelphia: Society for Industrial and Applied Mathematics.
Adler, Robert J., and Jonathan E. Taylor. 2007. Random Fields and Geometry. Springer Monographs in Mathematics 115. New York: Springer.
Adler, Robert J, Jonathan E Taylor, and Keith J Worsley. 2016. Applications of Random Fields and Geometry Draft.
Bongers, Stephan, and Joris M. Mooij. 2018. “From Random Differential Equations to Structural Causal Models: The Stochastic Case.” arXiv:1803.08784 [Cs, Stat], March.
Chevyrev, Ilya, and Andrey Kormilitzin. 2016. “A Primer on the Signature Method in Machine Learning.” arXiv:1603.03788 [Cs, Stat], March.
Kanagawa, Motonobu, Philipp Hennig, Dino Sejdinovic, and Bharath K. Sriperumbudur. 2018. “Gaussian Processes and Kernel Methods: A Review on Connections and Equivalences.” arXiv:1807.02582 [Cs, Stat], July.
Lukić, Milan, and Jay Beder. 2001. “Stochastic Processes with Sample Paths in Reproducing Kernel Hilbert Spaces.” Transactions of the American Mathematical Society 353 (10): 3945–69.
Lyons, Terry. 2014. “Rough Paths, Signatures and the Modelling of Functions on Streams.” arXiv:1405.4537 [Math, q-Fin, Stat], May.
Lyons, Terry J. 1998. “Differential Equations Driven by Rough Signals.” Revista Matemática Iberoamericana 14 (2): 215–310.
Lyons, Terry J., Michael Caruana, and Thierry Lévy. 2007. Differential Equations Driven by Rough Paths. Vol. 1908. Lecture Notes in Mathematics. Springer, Berlin.
Lyons, Terry J., and Nadia Sidorova. 2005. “Sound Compression: A Rough Path Approach.” In Proceedings of the 4th International Symposium on Information and Communication Technologies, 223–28. WISICT ’05. Cape Town, South Africa: Trinity College Dublin.
Pugachev, V. S., and I. N. Sinit︠s︡yn. 2001. Stochastic systems: theory and applications. River Edge, NJ: World Scientific.
Teye, Alfred Larm. 2010. “Stochastic Invariance via Wong-Zakai Theorem.” PhD Thesis, University of Amsterdam.
Wahba, Grace. 1990. Spline Models for Observational Data. SIAM.
