Inverse problems where the unknown parameter lives in some function space. For me this usually implies a spatiotemporal model, typically in the context of PDE solvers, particularly approximate ones.
Suppose I have a PDE, possibly with some unknown parameters in the driving equation. I can do an adequate job of predicting the future behaviour of that system if I somehow know the governing equations, their parameters, and the current state. But what if I am missing some of that information and wish to infer the unknown inputs at the same time? Say, the starting state? This is the kind of problem we refer to as an inverse problem. Inverse problems arise naturally in tomography, compressed sensing, deconvolution, inverting PDEs, and many other areas.
The thing that is special about PDEs is their spatial structure, which is much richer than that of the low-dimensional inference problems statisticians have traditionally studied, so it is worth reasoning through them from first principles.
In particular, I would like to work through enough notation here that I can understand the various methods used to solve these inverse problems, for example, simulation-based inference, MCMC methods, GANs or variational inference.
Generally, I am interested in problems that use some kind of probabilistic network, so that we do not merely guess the solution but also quantify our uncertainty about it.
1 Discretisation
The first step is working out how to handle this infinite-dimensional problem on a finite computer. There are many ways we can discretise.
Lassas, Saksman, and Siltanen (2009) introduce a nice notation for the usual kind of discretisation, which I use here. It connects the problem of inference to sampling theory, via the realisation that we need to discretise the solution in order to compute it.
I also wish to ransack their literature review:
The study of Bayesian inversion in infinite-dimensional function spaces was initiated by Franklin (1970) and continued by Mandelbaum (1984); Lehtinen, Paivarinta, and Somersalo (1989); Fitzpatrick (1991), and Luschgy (1996). The concept of discretization invariance was formulated by Markku Lehtinen in the 1990s and has been studied by D’Ambrogi, Mäenpää, and Markkanen (1999); Sari Lasanen (2002); S. Lasanen and Roininen (2005); Piiroinen (2005). A definition of discretization invariance similar to the above was given in Lassas and Siltanen (2004). For other kinds of discretization of continuum objects in the Bayesian framework, see Battle, Cunningham, and Hanson (1997); Niinimäki, Siltanen, and Kolehmainen (2007)… For regularization-based approaches for statistical inverse problems, see Bissantz, Hohage, and Munk (2004); Engl, Hofinger, and Kindermann (2005); Engl and Nashed (1981); Pikkarainen (2006). The relationship between continuous and discrete (non-statistical) inversion is studied in Hilbert spaces in Vogel (1984). See Borcea, Druskin, and Knizhnerman (2005) for specialized discretizations for inverse problems.
The insight is that there are at least two discretisations in play: the discretisation of the measurements and the discretisation of the solution, and we need to keep track of both.

Consider a quantity $U$, an unknown function, say the starting state of our PDE, which we model as a random function. The idealised continuum measurement model is

$$M = \mathcal{A}U + \mathcal{E},$$

where $\mathcal{A}$ is the forward operator induced by the governing equations and $\mathcal{E}$ is measurement noise.

Next, we introduce the practical measurement model, which is the first kind of discretisation. We assume that the measurement device provides us with a $k$-dimensional measurement

$$M_k = P_k \mathcal{A} U + P_k \mathcal{E},$$

where $P_k$ projects onto $k$ measurement functionals.

In this notation, the inverse problem is: given a realization $M_k = m_k$, find the conditional distribution of $U$.

We cannot represent that distribution yet because $U$ is an infinite-dimensional object. That is what the second kind of discretisation, the computational model, is for: we replace the unknown with a finite-dimensional representation $U_n = T_n U$, so that

$$M_{kn} = P_k \mathcal{A} T_n U + P_k \mathcal{E}.$$

I said we would understand model inversion problems in Bayesian terms. We manufacture some prior density $\pi_n(u_n)$ for the discretised unknown. Denote the probability density function of the random variable $U_n$ conditional on the data $M_{kn} = m_k$ by $\pi_n(u_n \mid m_k)$; Bayes' rule gives

$$\pi_n(u_n \mid m_k) \propto \pi(m_k \mid u_n)\, \pi_n(u_n).$$
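To make the notation concrete, here is a minimal numerical sketch of the two discretisations. Everything in it is an illustrative assumption: the "PDE" forward operator is a made-up Gaussian blur standing in for a real solver, and the grid size $n$, measurement count $k$, and noise scale are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Computational discretisation: represent U_n = T_n U by n point values.
n = 200
x = np.linspace(0.0, 1.0, n)
u_true = np.sin(2 * np.pi * x) * np.exp(-4 * (x - 0.5) ** 2)

# Forward operator A: a Gaussian blur standing in for the PDE solve.
def blur_matrix(x, width=0.05):
    d = x[:, None] - x[None, :]
    A = np.exp(-0.5 * (d / width) ** 2)
    return A / A.sum(axis=1, keepdims=True)

A = blur_matrix(x)

# Measurement discretisation P_k: the device sees only k point samples.
k = 20
idx = np.linspace(0, n - 1, k).astype(int)
P_k = np.eye(n)[idx]                      # k x n selection matrix

# Practical measurement model: m_k = P_k A u_n + noise.
sigma = 0.01
m_k = P_k @ A @ u_true + sigma * rng.standard_normal(k)

# The inverse problem: given the realisation m_k, infer u_n. With a prior
# density pi_n(u_n), the unnormalised log-posterior is
#   log pi_n(u_n | m_k) = -||m_k - P_k A u_n||^2 / (2 sigma^2) + log pi_n(u_n).
def log_posterior(u_n, log_prior):
    resid = m_k - P_k @ A @ u_n
    return -0.5 * resid @ resid / sigma ** 2 + log_prior(u_n)
```

Note that the two discretisations vary independently: we can refine $n$ without touching $k$, which is exactly the regime where the discretisation-invariance questions above arise.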
TBC
2 Very nearly exact methods
For specific problems there are specific methods; see, for example, F. Sigrist, Künsch, and Stahel (2015b) and Liu, Yeo, and Lu (2020) for advection/diffusion equations.
3 Approximation of the posterior
Generic models are trickier, and we usually have to approximate _some_thing. See Bao et al. (2020); Jo et al. (2020); Lu et al. (2021); Raissi, Perdikaris, and Karniadakis (2019); Tait and Damoulas (2020); Xu and Darve (2020); Yang, Zhang, and Karniadakis (2020); D. Zhang, Guo, and Karniadakis (2020); D. Zhang et al. (2019).
4 Bayesian nonparametrics
Since this kind of problem naturally invites functional parameters, we can also consider it in the context of Bayesian nonparametrics, which uses a slightly different notation from what you usually see in Bayes textbooks. I suspect there is a useful role for diverse Bayesian nonparametrics here, especially non-smooth random measures, but the easiest of all is the Gaussian process, which I handle next.
5 Gaussian process parameters
Alexanderian (2021) states a ‘well-known’ result: the solution of a Bayesian linear inverse problem with Gaussian prior and noise models is itself Gaussian. Concretely, for the model $m = Au + e$ with prior $u \sim \mathcal{N}(u_0, \mathcal{C}_{\mathrm{pr}})$ and noise $e \sim \mathcal{N}(0, \Gamma)$, the posterior is $\mathcal{N}(u_{\mathrm{post}}, \mathcal{C}_{\mathrm{post}})$ with

$$\mathcal{C}_{\mathrm{post}} = \left(A^{*}\Gamma^{-1}A + \mathcal{C}_{\mathrm{pr}}^{-1}\right)^{-1}, \qquad u_{\mathrm{post}} = \mathcal{C}_{\mathrm{post}}\left(A^{*}\Gamma^{-1}m + \mathcal{C}_{\mathrm{pr}}^{-1}u_0\right).$$
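Here is a minimal sketch of that conjugate computation in the discretised setting. The forward matrix, prior covariance, and noise level below are placeholder choices for illustration, not a real PDE problem.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 50, 10

# A smooth (squared-exponential) GP prior on a grid, with jitter for stability.
x = np.linspace(0, 1, n)
C_pr = np.exp(-0.5 * ((x[:, None] - x[None, :]) / 0.1) ** 2) + 1e-6 * np.eye(n)
u0 = np.zeros(n)

A = rng.standard_normal((k, n)) / np.sqrt(n)   # stand-in linear forward map
G = 0.01 ** 2 * np.eye(k)                      # noise covariance Gamma

# Simulate data from the model: m = A u + e.
u_true = np.linalg.cholesky(C_pr) @ rng.standard_normal(n)
m = A @ u_true + rng.multivariate_normal(np.zeros(k), G)

# Posterior N(u_post, C_post):
#   C_post = (A^T G^{-1} A + C_pr^{-1})^{-1}
#   u_post = C_post (A^T G^{-1} m + C_pr^{-1} u0)
Gi_A = np.linalg.solve(G, A)
C_post = np.linalg.inv(A.T @ Gi_A + np.linalg.inv(C_pr))
u_post = C_post @ (A.T @ np.linalg.solve(G, m) + np.linalg.solve(C_pr, u0))
```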
Note the connection to Gaussian belief propagation and to functional Gaussian processes.
6 Finite Element Models and belief propagation
Finite Element Models of PDEs (and possibly other representations? orthogonal bases generally?) can be expressed through locally linear relationships and thus analysed using Gaussian Belief Propagation (Y. El-Kurdi et al. 2016; Y. M. El-Kurdi 2014; Y. El-Kurdi et al. 2015). Note that in this setting there is nothing special about the inversion process: inference proceeds the same way whether run forward or inverse, as a variational message-passing algorithm.
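As a toy illustration, here is scalar Gaussian belief propagation solving the stiffness system of a 1D Poisson problem; the mesh, load, and message schedule are my own illustrative choices. Because the graph of a 1D mesh is a chain (a tree), GaBP recovers the exact solution.

```python
import numpy as np

# Solve K u = f for the FEM discretisation of -u'' = 1 with zero Dirichlet
# boundaries, by passing Gaussian messages along the chain of mesh nodes.
n, h = 20, 1.0 / 21
K = (np.diag(2 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h           # tridiagonal stiffness matrix
f = h * np.ones(n)                                # uniform load vector

neighbours = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}
P = {}   # message precisions P[i, j] along edge i -> j
m = {}   # message means     m[i, j]
for i in range(n):
    for j in neighbours[i]:
        P[i, j], m[i, j] = 0.0, 0.0

for sweep in range(2 * n):                        # plenty for a length-n chain
    for i in range(n):
        for j in neighbours[i]:
            others = [c for c in neighbours[i] if c != j]
            P_i = K[i, i] + sum(P[c, i] for c in others)
            h_i = f[i] + sum(P[c, i] * m[c, i] for c in others)
            P[i, j] = -K[i, j] ** 2 / P_i         # standard GaBP updates
            m[i, j] = h_i / K[i, j]

# Read off the marginal means; on a tree these are the exact solution.
u = np.empty(n)
for i in range(n):
    prec = K[i, i] + sum(P[c, i] for c in neighbours[i])
    u[i] = (f[i] + sum(P[c, i] * m[c, i] for c in neighbours[i])) / prec

assert np.allclose(u, np.linalg.solve(K, f))
```

On loopy 2D or 3D meshes the same updates are only approximate and convergence needs more care; that is where the engineering in the El-Kurdi references comes in.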
7 Score-based generative models
A.k.a. neural diffusions etc. Powerful, and probably a worthy default starting point for new work.
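A schematic of how such a model plugs into an inverse problem, in the spirit of annealed Langevin posterior sampling with score models. The `prior_score` function below is a stand-in for a pretrained score network $s(x, \sigma) \approx \nabla_x \log p_\sigma(x)$; here it is the exact score of a Gaussian toy prior so that the sketch runs end-to-end, and the measurement operator, schedules, and step sizes are all invented.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 30, 5
A = rng.standard_normal((k, n)) / np.sqrt(n)   # linear measurement operator
x_true = rng.standard_normal(n)
sigma_noise = 0.05
m = A @ x_true + sigma_noise * rng.standard_normal(k)

def prior_score(x, sigma):
    # Exact noise-smoothed score of a N(0, I) prior; a trained network
    # would be substituted here.
    return -x / (1.0 + sigma ** 2)

def likelihood_score(x):
    # Gradient of log p(m | x) for Gaussian measurement noise.
    return A.T @ (m - A @ x) / sigma_noise ** 2

# Annealed Langevin dynamics over a decreasing noise schedule.
x = rng.standard_normal(n)
for sigma in np.geomspace(1.0, 0.01, 20):
    eps = 1e-4 * sigma ** 2                    # step size shrinks with sigma
    for _ in range(50):
        grad = prior_score(x, sigma) + likelihood_score(x)
        x = x + eps * grad + np.sqrt(2 * eps) * rng.standard_normal(n)
# x is now (approximately) a draw from the posterior p(x | m).
```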
8 Occupation Kernels
Another GP method, but different again, is to use occupation kernels, where we identify a function by its effect on trajectories (think: learning ocean currents from the motion of buoys). I think this is really quite nifty. See, e.g., Rielly et al. (2025).
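Here is a toy sketch of one version of this idea as I understand it: each trajectory $\gamma_j$ induces an occupation kernel $K_{\gamma_j}(x) = \int k(x, \gamma_j(t))\,dt$, and we regress the unknown velocity field on these features using only endpoint displacements, since $\gamma(T) - \gamma(0) = \int f(\gamma(t))\,dt$. The rotation field, kernel bandwidth, and regulariser below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

def f_true(x):                      # ground-truth field: a linear rotation
    return np.stack([-x[..., 1], x[..., 0]], axis=-1)

# Simulate a few trajectories with forward Euler (our 'buoys').
T, steps, n_traj = 1.0, 200, 12
dt = T / steps
trajs = []
for _ in range(n_traj):
    g = [rng.uniform(-1, 1, size=2)]
    for _ in range(steps):
        g.append(g[-1] + dt * f_true(g[-1]))
    trajs.append(np.array(g))       # shape (steps + 1, 2)

def k(x, y):                        # RBF base kernel
    d2 = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2 * 0.5 ** 2))

# Gram matrix of occupation kernels, G[i, j] = ∫∫ k(γ_i(s), γ_j(t)) ds dt,
# approximated by a Riemann sum over the stored time grids.
G = np.array([[k(gi, gj).sum() * dt * dt for gj in trajs] for gi in trajs])
D = np.array([g[-1] - g[0] for g in trajs])        # displacements, (n_traj, 2)
W = np.linalg.solve(G + 1e-6 * np.eye(n_traj), D)  # ridge-regularised fit

def f_hat(x):
    # Estimated field: f(x) = sum_j W[j] * ∫ k(x, γ_j(t)) dt.
    feats = np.array([k(x[None, :], g).sum() * dt for g in trajs])
    return feats @ W

print(f_hat(np.array([0.5, 0.0])), f_true(np.array([0.5, 0.0])))
```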