Reproducing kernels satisfying physical equations
August 12, 2023 — October 24, 2024
A placeholder to collect articles on this idea (or perhaps this collection of ideas), because the terminology is not always obvious. No new insights yet.
If we are doing function-valued Gaussian process regression and the solutions should satisfy some partial differential equation, how might we encode that requirement in the kernel itself? When is it worthwhile to do so? Closely related: learning physical operators.
When modelling physical systems, especially those governed by partial differential equations (PDEs), we often want to incorporate the underlying physical constraints directly into the kernel functions of reproducing kernel Hilbert spaces (RKHS). This approach ensures that the solutions not only fit the observed data but also adhere to known physical laws.
There are many methods that fit this description, and what works depends very much on which physical equations we are solving, on what domain, and so on.
The categories in this taxonomy are not mutually exclusive. I have not read the literature well enough to make claims about that. Some of them look very similar though.
1 Gaussian Processes and Their Derivatives
An important property of Gaussian processes is that linear transformations of GPs remain GPs under mild conditions. Specifically, the derivative of a GP is also a GP, provided the covariance function is sufficiently smooth. If \(f\) is a GP with mean function \(m(x)\) and covariance function \(k(x, x')\), then its derivative \(f'\) is a GP with mean \(m'(x)\) and covariance \(k'(x, x')\), where:
\[ k'(x, x') = \frac{\partial^2 k(x, x')}{\partial x \partial x'} \]
This property allows us to incorporate differential operators into the GP framework, enabling us to encode PDE constraints directly into the kernel.
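As a concrete sketch, autodiff makes this nearly free. Below is a minimal example in JAX, assuming a squared-exponential base kernel (the function names are mine, not from any particular library):

```python
import jax
import jax.numpy as jnp

def k(x, xp):
    """Squared-exponential base kernel on scalar inputs."""
    return jnp.exp(-0.5 * (x - xp) ** 2)

# Covariance of the derivative process f':
# k'(x, x') = d^2 k(x, x') / dx dx'.
dk_dboth = jax.grad(jax.grad(k, argnums=0), argnums=1)

# Cross-covariance Cov(f'(x), f(x')) needs only one derivative.
dk_dx = jax.grad(k, argnums=0)

x, xp = 0.3, 1.1
print(dk_dboth(x, xp))  # Cov(f'(x), f'(x'))
print(dk_dx(x, xp))     # Cov(f'(x), f(x'))
```

The same pattern gives the cross-covariances needed to condition jointly on observations of \(f\) and \(f'\).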
2 Latent Force Models
Latent force models are one of the earlier methods for integrating differential equations into GP regression (M. Álvarez, Luengo, and Lawrence 2009; M. A. Álvarez, Luengo, and Lawrence 2013; Moss et al. 2022). In LFMs, the idea is to model the unknown latent forces driving a system using GPs. These latent forces are then connected to the observed data through differential equations.
Consider, for example, a system governed by a linear ordinary differential equation (ODE):
\[ \frac{d f(t)}{d t} + a f(t) = u(t) \]
Here, \(f(t)\) is the observed function, \(a\) is a known constant, and \(u(t)\) is an unknown latent function modelled as a GP. By placing a GP prior on \(u(t)\), we induce a GP prior on \(f(t)\) that inherently satisfies the ODE.
The function \(f(t)\) resides in the Sobolev space \(W^{1,2}([0, T])\), which consists of functions whose first derivative is square-integrable over the interval \([0, T]\).
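To make "induces a GP prior on \(f(t)\)" concrete: with \(f(0) = 0\), the solution is \(f(t) = \int_0^t e^{-a(t-s)} u(s)\,\mathrm{d}s\), so the covariance of \(f\) is the latent covariance pushed through the Green's function. A crude quadrature sketch in JAX (assuming a squared-exponential prior on \(u\); all constants are arbitrary, and the LFM papers give a closed form for this case):

```python
import jax.numpy as jnp

def k_u(s, sp):
    """Squared-exponential prior covariance on the latent force u."""
    return jnp.exp(-0.5 * (s - sp) ** 2 / 0.5 ** 2)

def k_f(t, tp, a=2.0, n=400):
    """Induced covariance on f, assuming f(0) = 0, via the Green's
    function G(t, s) = exp(-a (t - s)) for 0 <= s <= t:
      k_f(t, t') = int_0^t int_0^t' G(t, s) k_u(s, s') G(t', s') ds ds'.
    Crude quadrature; the SE case has a closed form in the LFM papers."""
    s = jnp.linspace(0.0, t, n)
    sp = jnp.linspace(0.0, tp, n)
    ds, dsp = s[1] - s[0], sp[1] - sp[0]
    G = jnp.exp(-a * (t - s))          # (n,)
    Gp = jnp.exp(-a * (tp - sp))       # (n,)
    K = k_u(s[:, None], sp[None, :])   # (n, n)
    return jnp.sum(G[:, None] * K * Gp[None, :]) * ds * dsp

print(k_f(1.0, 1.5))  # one entry of the induced Gram matrix
```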
3 Divergence-Free and Curl-Free Kernels
Some fun tricks are of special relevance to fluids; e.g. for kernels which imply divergence-free or curl-free fields, especially on the surface of a sphere (Narcowich, Ward, and Wright 2007; E. J. Fuselier, Shankar, and Wright 2016; E. J. Fuselier and Wright 2009).
E. Fuselier (2008) says:
Constructing divergence-free and curl-free matrix-valued RBFs is fairly simple. If \(\phi\) is a scalar-valued function consider
\[ \begin{aligned} \Phi_{\text{div}} &:= \left(-\Delta I + \nabla \nabla^{T}\right) \phi, \\ \Phi_{\text{curl}} &:= -\nabla \nabla^{T} \phi. \end{aligned} \]
If \(\phi\) is an RBF, then these functions can be used to produce divergence-free and curl-free interpolants, respectively. We note that these are not radial functions, but because they are usually generated by an RBF \(\phi\), they are still commonly called “matrix-valued RBFs”.
AFAICT there is nothing RBF-specific here; I think it works for any sufficiently smooth stationary kernel. Do we even need stationarity?
The functions produced by these kernels reside in specific Sobolev spaces that respect the divergence-free or curl-free conditions. For instance, divergence-free vector fields in \(\mathbb{R}^3\) belong to the space:
\[ \mathbf{H}_{\text{div}} = \{\mathbf{f} \in [L^2(\Omega)]^3 : \nabla \cdot \mathbf{f} = 0\} \]
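Here is a numerical sanity check of this construction in JAX, a sketch assuming a Gaussian \(\phi\) (nothing below depends on that choice beyond smoothness):

```python
import jax
import jax.numpy as jnp

def phi(r):
    """Scalar Gaussian RBF as a function of the displacement r = x - x'."""
    return jnp.exp(-0.5 * jnp.sum(r ** 2))

def Phi_div(r):
    """Matrix-valued divergence-free kernel (-Delta I + grad grad^T) phi."""
    H = jax.hessian(phi)(r)            # H[i, j] = d^2 phi / dr_i dr_j
    return -jnp.trace(H) * jnp.eye(r.shape[0]) + H

# Each column of x |-> Phi_div(x - x') should be divergence-free in x.
# J[i, j, l] = d Phi_div[i, j] / d r_l, so the divergence of column j
# is sum_i J[i, j, i]: a trace over the first and last axes.
r = jnp.array([0.3, -0.7, 0.2])
J = jax.jacfwd(Phi_div)(r)
print(jnp.trace(J, axis1=0, axis2=2))  # ~ [0, 0, 0]
```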
3.1 On the Sphere
When dealing with fields on the surface of a sphere, such as global wind patterns, special considerations are required (E. J. Fuselier and Wright 2009). The construction of divergence-free and curl-free kernels on the sphere involves accounting for the manifold’s curvature and ensuring that the vector fields are tangent to the sphere’s surface.
For a scalar function \(\phi\) defined on the sphere \(\mathbb{S}^2\), divergence-free kernels can be constructed using surface differential operators. These kernels help model tangential vector fields that are essential in geophysical applications, e.g. flows over the surface of a planet.
4 Linearly-constrained Operator-Valued Kernels
Operator-valued kernels extend the concept of scalar kernels to vector or function outputs. They are particularly handy when the physical constraints can be expressed as linear operators acting on functions (Lange-Hegermann 2018, 2021).
Consider a linear operator \(\mathcal{L}\) acting on a function \(f\). An operator-valued kernel \(K(x, x')\) can be designed such that:
\[ \mathcal{L}_x K(x, x') = 0 \quad \text{for all } x', \]
where \(\mathcal{L}_x\) denotes \(\mathcal{L}\) applied to the first argument (by symmetry of the kernel, the same then holds in the second argument). Because every kernel section is annihilated by \(\mathcal{L}\), functions drawn from the associated RKHS satisfy \(\mathcal{L} f = 0\).
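One concrete route to such kernels, as far as I can tell the one taken in the Lange-Hegermann papers, is to parametrize the solution set. Suppose every solution of \(\mathcal{L} f = 0\) can be written as \(f = \mathcal{B} g\) for some linear operator \(\mathcal{B}\) with \(\mathcal{L} \mathcal{B} = 0\) and an unconstrained latent function \(g\). Then a GP prior \(g \sim \mathcal{GP}(0, k)\) induces
\[ f \sim \mathcal{GP}\left(0,\; \mathcal{B}_x \, k(x, x') \, \mathcal{B}_{x'}^{*}\right), \]
and every draw satisfies the constraint because \(\mathcal{L}_x \mathcal{B}_x = 0\). E.g. in two dimensions, divergence-free fields are parametrized by a stream function via \(\mathcal{B} = (\partial_{x_2}, -\partial_{x_1})^{\top}\), which recovers a kernel of the same flavour as the matrix-valued RBFs of Section 3.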
5 Physics-Informed Gaussian Processes
Physics-informed Gaussian processes (PIGPs) incorporate physical laws by modifying the GP’s prior or likelihood to enforce PDE constraints (Raissi and Karniadakis 2018). This can be done by penalizing deviations from the PDE in the loss function or by directly incorporating the differential operators into the kernel.
For a function \(f(x)\) that should satisfy \(\mathcal{L} f(x) = 0\), the GP prior can be adjusted such that:
\[ \text{Cov}(\mathcal{L} f(x), \mathcal{L} f(x')) = k_{\mathcal{L}}(x, x') \]
where \(k_{\mathcal{L}}(x, x') = \mathcal{L}_x \mathcal{L}_{x'} k(x, x')\) is obtained by applying the operator to each argument of the base covariance \(k\); observed values of \(\mathcal{L} f\) (zero, for a homogeneous equation) can then be conditioned on alongside the data.
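A sketch of this kernel bookkeeping in JAX, reusing the first-order operator \(\mathcal{L} = \mathrm{d}/\mathrm{d}t + a\) from the latent force section as the example (the base kernel and constants are arbitrary choices of mine):

```python
import jax
import jax.numpy as jnp

def k(x, xp):
    """Base covariance for f (squared exponential)."""
    return jnp.exp(-0.5 * (x - xp) ** 2)

a = 2.0  # coefficient in L f = f' + a f

# Apply L to each argument of k by autodiff.
Lk  = lambda x, xp: jax.grad(k, argnums=0)(x, xp) + a * k(x, xp)    # Cov(Lf(x), f(x'))
kL  = lambda x, xp: jax.grad(k, argnums=1)(x, xp) + a * k(x, xp)    # Cov(f(x), Lf(x'))
LkL = lambda x, xp: jax.grad(kL, argnums=0)(x, xp) + a * kL(x, xp)  # Cov(Lf(x), Lf(x'))

# The blocks [[k, kL], [Lk, LkL]] form a joint GP over (f, Lf), so
# observations of f and of the residual Lf can be conditioned on together.
x, xp = 0.4, 1.2
print(k(x, xp), Lk(x, xp), kL(x, xp), LkL(x, xp))
```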
More at (Perdikaris et al. 2017; Raissi and Karniadakis 2018; Raissi, Perdikaris, and Karniadakis 2017a, 2017b, 2018).
6 Implicit
Not quite sure what to call it, but Kian Ming A. Chai introduced us to Mora et al. (2024), which seems to be an interesting variant. Keyword match to Brouard, Szafranski, and D’Alché-Buc (2016).