Path smoothness properties of stochastic processes

Continuity, differentiability and other smoothness properties

February 26, 2020 — September 1, 2021

functional analysis
probability
stochastic processes
time series

Figure 1

“When are the paths of a stochastic process continuous?” is a question one might like to ask. But we need to ask more precise questions than that, because things are complicated in probability land. If we are concerned about whether the paths sampled from the process are almost-surely continuous functions then we probably mean something like:

“Does the process $\{f(t)\}_{t}$ admit a modification such that $t \mapsto f(t)$ is a.e. Hölder-continuous with probability 1?” or some other such mouthful. There are many notions of continuity of stochastic processes. Continuous with respect to what, with what probability, etc.? There are also related notions such as Feller continuity. This notebook is not an exhaustive taxonomy; this is just a list of notions I need to remember. Commonly useful notions for a stochastic process $\{f(t)\}_{t \in T}$ include the following.

Continuity in probability:
$\lim_{s \to t} \mathbb{P}\{|f(t) - f(s)| \geq \varepsilon\} = 0$, for each $t \in T$ and each $\varepsilon > 0$.
Continuity in mean square, or $L^2$ continuity:
$\lim_{s \to t} \mathbb{E}\{|f(t) - f(s)|^2\} = 0$, for each $t \in T$.
Sample continuity:
$\mathbb{P}\{\lim_{s \to t} |f(t) - f(s)| = 0 \text{ for all } t \in T\} = 1$.

I have given these as continuity properties holding for all $t \in T$, but they can also be considered pointwise for fixed $t$. Since $T$ is a continuum, this can lead to subtle problems with uncountable unions of events, etc.

Jump processes show the difference between these. A Poisson process has paths which are not continuous with probability 1, but which are continuous in mean square and in probability.
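
A quick numerical sanity check of that claim, with an illustrative rate and window size (not from the text): the mean-square increment of a Poisson process shrinks with the window even though every path jumps.

```python
import numpy as np

rng = np.random.default_rng(42)
lam, dt, n_paths = 3.0, 1e-3, 200_000  # illustrative rate and window

# Increments N(t + dt) - N(t) of a rate-lam Poisson process are
# Poisson(lam * dt) distributed.
incr = rng.poisson(lam * dt, size=n_paths).astype(float)

# Mean-square increment: E|N(t + dt) - N(t)|^2 = lam*dt + (lam*dt)^2,
# which -> 0 as dt -> 0, so the process is L2-continuous...
ms = np.mean(incr**2)
print(ms, lam * dt + (lam * dt) ** 2)

# ...even though a sample path on [0, 1] has a Poisson(lam) number of
# jumps, and is therefore discontinuous with probability 1 - e^{-lam}.
```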

1 Kolmogorov continuity theorem

The Kolmogorov continuity theorem gives sufficient conditions for a process to admit a modification with Hölder-continuous sample paths, based on how rapidly the moments of the process increments grow. Question: when are these conditions also necessary? Lowther is good on this.
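
The standard statement, in the one-dimensional scalar case (stated from memory; check a careful source such as Lowther for the exact hypotheses):

```latex
\textbf{Theorem (Kolmogorov--Chentsov).}
Suppose $\{f(t)\}_{t \in [0,T]}$ satisfies, for some constants
$\alpha, \beta, C > 0$,
\[
  \mathbb{E}\,|f(t) - f(s)|^{\alpha} \le C\,|t - s|^{1+\beta}
  \quad \text{for all } s, t \in [0,T].
\]
Then $f$ admits a modification whose sample paths are almost surely
$\gamma$-H\"older continuous for every $\gamma \in (0, \beta/\alpha)$.
% For a $d$-dimensional index set, the exponent $1+\beta$ becomes $d+\beta$.
```

For Brownian motion, $\mathbb{E}|f(t)-f(s)|^{2n} \propto |t-s|^{n}$, so taking $\alpha = 2n$, $\beta = n-1$ and letting $n \to \infty$ yields Hölder continuity of every order $\gamma < 1/2$.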

2 SDEs with rough paths

Despite the name, this is useful for smooth paths. See signatures and rough paths.

3 Connection to strong solutions of SDEs

TBD.

4 Continuity of Gaussian processes

Todo: Read Kanagawa et al. () section 4, for the startling revelations:

… it is easy to show that a GP sample path $f \sim \mathcal{GP}(0, K)$ does not belong to the corresponding RKHS $\mathcal{H}_K$ with probability 1 if $\mathcal{H}_K$ is infinite dimensional… This implies that GP samples are “rougher”, or less regular, than RKHS functions … Note that this fact has been well known in the literature; see e.g., () and (, Corollary 7.1).

Let $K$ be a positive definite kernel on a set $\mathcal{X}$ and $\mathcal{H}_K$ its RKHS, and consider $f \sim \mathcal{GP}(m, K)$ with $m : \mathcal{X} \to \mathbb{R}$ satisfying $m \in \mathcal{H}_K$. If $\mathcal{H}_K$ is infinite dimensional, then $f \in \mathcal{H}_K$ with probability 0. If $\mathcal{H}_K$ is finite dimensional, then there is a version $\tilde{f}$ of $f$ such that $\tilde{f} \in \mathcal{H}_K$ with probability 1.
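
A discretised sketch of why this happens, using a Karhunen–Loève-style expansion (the grid, kernel, and lengthscale are illustrative choices, not from the text): the $n$-term truncated squared RKHS norm of a GP sample grows like $n$ instead of converging.

```python
import numpy as np

rng = np.random.default_rng(0)

# Discretised RBF kernel on [0, 1] (grid size and lengthscale are
# illustrative choices).
x = np.linspace(0.0, 1.0, 200)
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.1**2)

# Discrete Karhunen-Loeve expansion: K = Phi diag(lam) Phi^T.
lam, Phi = np.linalg.eigh(K)
lam, Phi = np.clip(lam[::-1], 0.0, None), Phi[:, ::-1]  # descending order

# A GP(0, K) sample is f = sum_i sqrt(lam_i) xi_i phi_i with xi_i ~ N(0, 1).
xi = rng.standard_normal(len(lam))
f = Phi @ (np.sqrt(lam) * xi)

# The squared RKHS norm of the n-term truncation of f is sum_{i<n} xi_i^2,
# a chi-squared(n) variable: it grows like n rather than converging, so in
# the infinite-dimensional limit the sample has infinite RKHS norm.
partial_sq_norms = np.cumsum(xi**2)
print(partial_sq_norms[[9, 49, 199]])
```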

5 L2 derivatives of random fields

Figure 2

Robert J. Adler, Taylor, and Worsley () define $L^2$ derivatives thus: Choose a point $t \in \mathbb{R}^d$ and a sequence of $k$ ‘directions’ $t_1', \dots, t_k'$ in $\mathbb{R}^d$, and write these as $t' = (t_1', \dots, t_k')$. From context I assume that these directions are supposed to have unit norm, $\|t_j'\| = 1$. We say that $f$ has a $k$-th order $L^2$ partial derivative at $t$, in the direction $t'$, if the limit
$$D^k_{L^2} f(t, t') \triangleq \lim_{h_1, \dots, h_k \to 0} \frac{1}{\prod_{j=1}^k h_j}\, \Delta^k f(t, t', h)$$
exists in mean square, where $h = (h_1, \dots, h_k)$. The $t_j'$ are usually axis aligned, e.g. $t_j' = [0\; 1\; 0]^\top$. Here $\Delta^k f(t, t', h)$ is the symmetrized difference
$$\Delta^k f(t, t', h) = \sum_{s \in \{0,1\}^k} (-1)^{\,k - \sum_{j=1}^k s_j}\, f\Big(t + \sum_{j=1}^k s_j h_j t_j'\Big)$$
and the limit is taken sequentially, i.e. first send $h_1 \to 0$, then $h_2$, etc.

That is a lot, so let us examine the special case of $k = 1$ and $t_1' = [1\; 0\; \cdots\; 0]^\top =: e_1$. We choose a point $t \in \mathbb{R}^d$ and, w.l.o.g., the direction $e_1$. The symmetrised difference in this first order case becomes
$$\Delta f(t, e_1, h) = \sum_{s \in \{0,1\}} (-1)^{1-s} f(t + s h e_1) = f(t + h e_1) - f(t).$$
We say that $f$ has a first order $L^2$ partial derivative at $t$, in the direction $e_1$, if the limit
$$D_{L^2} f(t, e_1) = \lim_{h \to 0} \frac{1}{h} \Delta f(t, e_1, h) = \lim_{h \to 0} \frac{f(t + h e_1) - f(t)}{h}$$
exists in mean square. This should look like the usual first order (partial) derivative, just with the term mean-square thrown in front.
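
For a zero-mean stationary field this limit can be checked directly from the covariance, since $\mathbb{E}\big[\big(\frac{f(t + h e_1) - f(t)}{h}\big)^2\big] = \frac{2K(0) - 2K(h)}{h^2}$. A small sketch with an RBF covariance (the lengthscale is an illustrative choice, not from the text):

```python
import numpy as np

# Stationary RBF covariance K(r) = exp(-r^2 / (2 ell^2)); the
# lengthscale ell is an illustrative choice.
ell = 0.5

def K(r):
    return np.exp(-r**2 / (2 * ell**2))

# Mean square of the difference quotient (f(t + h e_1) - f(t)) / h for a
# zero-mean stationary field: (2 K(0) - 2 K(h)) / h^2.
ms_quotient = {h: (2 * K(0) - 2 * K(h)) / h**2 for h in (1e-1, 1e-2, 1e-3)}
print(ms_quotient)

# As h -> 0 this converges to -K''(0) = 1 / ell^2, which is therefore the
# variance of the mean-square derivative D_{L^2} f(t, e_1).
print(1 / ell**2)
```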

By choosing $t' = (e_{j_1}, \dots, e_{j_k})$, where $e_j$ is the vector with $j$-th element 1 and all others zero, we can talk of the mean square partial derivatives of various orders
$$\frac{\partial^k}{\partial t_{j_1} \cdots \partial t_{j_k}} f(t) \triangleq D^k_{L^2} f\big(t, (e_{j_1}, \dots, e_{j_k})\big)$$
of $f$. Then we see that the covariance function of partial derivatives of a random field must, if it exists and is finite, be given by
$$\mathbb{E}\left\{\frac{\partial^k f(s)}{\partial s_{j_1} \cdots \partial s_{j_k}}\, \frac{\partial^k f(t)}{\partial t_{j_1} \cdots \partial t_{j_k}}\right\} = \frac{\partial^{2k} K(s, t)}{\partial s_{j_1} \partial t_{j_1} \cdots \partial s_{j_k} \partial t_{j_k}}.$$
Note that we have not assumed stationarity here, or Gaussianity, and still this process covariance function encodes a lot of information.

In the case that $f$ is stationary, we can use the spectral representation to analyse these derivatives; the corresponding variances have an interpretation in terms of spectral moments. We define the spectral moments
$$\lambda_{j_1 \cdots j_N} \triangleq \int_{\mathbb{R}^N} \omega_1^{j_1} \cdots \omega_N^{j_N}\, \nu(d\omega)$$
for all multi-indices $(j_1, \dots, j_N)$ with $j_\ell \geq 0$. Assume the underlying random field, and so the covariance function, are real valued, so that, as described above, stationarity implies $K(-t) = K(t)$ and $\nu(-A) = \nu(A)$. It then follows that the odd ordered spectral moments, when they exist, are zero; specifically, $\lambda_{j_1 \cdots j_N} = 0$ if $\sum_{\ell=1}^N j_\ell$ is odd.

For example, if $f$ has mean square partial derivatives of orders $\alpha + \beta$ and $\gamma + \delta$ for $\alpha, \beta, \gamma, \delta \in \{0, 1, 2, \dots\}$, then
$$\mathbb{E}\left\{\frac{\partial^{\alpha+\beta} f(t)}{\partial^\alpha t_j\, \partial^\beta t_k}\, \frac{\partial^{\gamma+\delta} f(t)}{\partial^\gamma t_\ell\, \partial^\delta t_m}\right\} = (-1)^{\alpha+\beta}\, \frac{\partial^{\alpha+\beta+\gamma+\delta}}{\partial^\alpha t_j\, \partial^\beta t_k\, \partial^\gamma t_\ell\, \partial^\delta t_m} K(t)\Big|_{t=0} = (-1)^{\alpha+\beta}\, i^{\alpha+\beta+\gamma+\delta} \int_{\mathbb{R}^N} \omega_j^\alpha\, \omega_k^\beta\, \omega_\ell^\gamma\, \omega_m^\delta\, \nu(d\omega).$$
Note that although this equation seems to have some asymmetries in the powers, these disappear due to the fact that all odd ordered spectral moments, like all odd ordered derivatives of $K$, are identically zero.
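
A numerical check of the moment identities for a standard example: the RBF covariance $K(t) = \exp(-t^2 / (2\ell^2))$ has a Gaussian spectral density with variance $1/\ell^2$, so the first spectral moment vanishes and the second equals $-K''(0) = 1/\ell^2$ (the lengthscale and quadrature grid are illustrative choices, not from the text).

```python
import numpy as np

# Spectral density of the RBF covariance K(t) = exp(-t^2 / (2 ell^2)):
# a Gaussian with variance 1/ell^2 (a standard, assumed example).
ell = 0.5
omega = np.linspace(-40.0, 40.0, 200_001)
density = (ell / np.sqrt(2 * np.pi)) * np.exp(-((ell * omega) ** 2) / 2)

# numpy.trapz was renamed to numpy.trapezoid in NumPy 2.0.
trapezoid = getattr(np, "trapezoid", None) or np.trapz

def spectral_moment(j):
    return trapezoid(omega**j * density, omega)

# Odd moments vanish by symmetry of the spectral measure; the second
# moment equals -K''(0) = 1/ell^2, i.e. the derivative-field variance.
print(spectral_moment(1), spectral_moment(2), 1 / ell**2)
```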

6 References

Adler, Robert J. 2010. The Geometry of Random Fields.
Adler, Robert J., and Taylor. 2007. Random Fields and Geometry. Springer Monographs in Mathematics 115.
Adler, Robert J., Taylor, and Worsley. 2016. Applications of Random Fields and Geometry Draft.
Bongers, and Mooij. 2018. From Random Differential Equations to Structural Causal Models: The Stochastic Case.” arXiv:1803.08784 [Cs, Stat].
Chevyrev, and Kormilitzin. 2016. A Primer on the Signature Method in Machine Learning.” arXiv:1603.03788 [Cs, Stat].
Kanagawa, Hennig, Sejdinovic, et al. 2018. Gaussian Processes and Kernel Methods: A Review on Connections and Equivalences.” arXiv:1807.02582 [Cs, Stat].
Lukić, and Beder. 2001. Stochastic Processes with Sample Paths in Reproducing Kernel Hilbert Spaces.” Transactions of the American Mathematical Society.
Lyons, Terry J. 1998. Differential Equations Driven by Rough Signals.” Revista Matemática Iberoamericana.
Lyons, Terry. 2014. Rough Paths, Signatures and the Modelling of Functions on Streams.” arXiv:1405.4537 [Math, q-Fin, Stat].
Lyons, Terry J., Caruana, and Lévy. 2007. Differential Equations Driven by Rough Paths. Lecture Notes in Mathematics.
Lyons, Terry J., and Sidorova. 2005. Sound Compression: A Rough Path Approach.” In Proceedings of the 4th International Symposium on Information and Communication Technologies. WISICT ’05.
Pugachev, and Sinitsyn. 2001. Stochastic Systems: Theory and Applications.
Teye. 2010. “Stochastic Invariance via Wong-Zakai Theorem.”
Wahba. 1990. Spline Models for Observational Data.