# Symbolic regression

March 14, 2023 — December 23, 2023

Fajardo-Fontiveros et al. (2023):

[C]onsider a dataset \(D=\left\{\left(y_i, \mathbf{x}_i\right)\right\}\), with \(i=1, \ldots, N\), generated using the closed form model \(m^*\), so that \(y_i=m^*\left(\mathbf{x}_i, \theta^*\right)+\epsilon_i\) with \(\theta^*\) being the parameters of the model, and \(\epsilon_i\) a random unbiased observation noise drawn from the normal distribution with variance \(s_\epsilon^2\). […] [T]he question we are interested in is: Assuming that \(m^*\) can be expressed in closed form, when is it possible to identify it as the true generating model among all possible closed-form mathematical models, for someone who does not know the true model beforehand? Note that our focus is on learning the structure of the model \(m^*\) and not the values of the parameters \(\theta^*\), a problem that has received much more attention from the theoretical point of view. Additionally, we are interested in situations in which the dimension of the feature space \(\mathbf{x} \in \mathbb{R}^k\) is relatively small (compared to typical feature spaces in machine learning settings), which is the relevant regime for symbolic regression and model discovery.

That paper is particularly interesting for the connection to the statistical mechanics of statistics.
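To make the setup in the quote concrete, here is a toy sketch of the structure-identification problem. It is not the method of Fajardo-Fontiveros et al.: the candidate set, the ground-truth model \(m^*(x, \theta^*) = 2\sin(x) + 0.5x\), and the use of BIC as a scoring rule are all assumptions chosen for illustration, and every candidate is linear in its parameters so that ordinary least squares suffices. The names `CANDIDATES`, `simulate`, `bic_for_design`, and `select_structure` are hypothetical.

```python
import numpy as np

# Hypothetical candidate structures, each a list of basis functions.
# All are linear in their parameters, so least squares fits them exactly.
CANDIDATES = {
    "a*x": [lambda x: x],
    "a*x + b": [lambda x: x, np.ones_like],
    "a*x**2 + b": [lambda x: x**2, np.ones_like],
    "a*sin(x) + b*x": [np.sin, lambda x: x],
    "a*x + b*x**3": [lambda x: x, lambda x: x**3],
}


def simulate(n, noise_sd, seed):
    """Draw y_i = m*(x_i, theta*) + eps_i for an assumed m* = 2 sin(x) + 0.5 x."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-3.0, 3.0, size=n)
    y = 2.0 * np.sin(x) + 0.5 * x + rng.normal(0.0, noise_sd, size=n)
    return x, y


def bic_for_design(design, y):
    """Fit by least squares, then score with BIC (Gaussian noise, unknown variance)."""
    n, k = design.shape
    theta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ theta) ** 2))
    return n * np.log(rss / n) + k * np.log(n)


def select_structure(x, y):
    """Return the name of the lowest-BIC candidate and all the scores."""
    scores = {
        name: bic_for_design(np.column_stack([f(x) for f in basis]), y)
        for name, basis in CANDIDATES.items()
    }
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    # Same true structure, increasing observation noise.
    for noise_sd in (0.05, 1.0, 10.0):
        x, y = simulate(n=200, noise_sd=noise_sd, seed=0)
        best, _ = select_structure(x, y)
        print(f"noise sd {noise_sd}: selected {best}")
```

The interesting knob is `noise_sd`: the data come from the same structure in every run, but whether the scoring rule recovers that structure depends on how the noise compares to the differences between candidates. A real symbolic regression system searches a vastly larger space of expressions than this five-element dictionary.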

## 1 SINDy et al

There is some nifty work in learning approximations to physics, like the *SINDy* method, which sits somewhere at the intersection of compressive sensing, state filters and maybe even Koopman operators (Brunton, Proctor, and Kutz 2016); but it’s hard to imagine scaling this up (at least directly) to big things like large image sensor arrays and other such weakly structured input.
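The core of SINDy is a sparse regression of measured time derivatives onto a library of candidate terms. Below is a minimal sketch using sequentially thresholded least squares, the sparse solver proposed in Brunton, Proctor, and Kutz (2016). Everything else here is an assumption made for illustration: the two-dimensional damped linear oscillator, the noise-free trajectory, the degree-2 polynomial library, finite-difference derivatives, and the threshold value. The function names are hypothetical and this is not the API of any SINDy package.

```python
import numpy as np

TERMS = ["1", "x", "y", "x^2", "x*y", "y^2"]


def rhs(s):
    """Assumed ground-truth dynamics: a damped linear oscillator."""
    x, y = s
    return np.array([-0.1 * x + 2.0 * y, -2.0 * x - 0.1 * y])


def simulate(dt=0.01, n_steps=2500, s0=(2.0, 0.0)):
    """Integrate the toy system with classical fourth-order Runge-Kutta."""
    traj = np.empty((n_steps, 2))
    s = np.array(s0, dtype=float)
    for i in range(n_steps):
        traj[i] = s
        k1 = rhs(s)
        k2 = rhs(s + 0.5 * dt * k1)
        k3 = rhs(s + 0.5 * dt * k2)
        k4 = rhs(s + dt * k3)
        s = s + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return traj


def poly_library(X):
    """Candidate terms up to degree 2, in the order given by TERMS."""
    x, y = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])


def stlsq(theta, dxdt, threshold=0.05, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    xi, *_ = np.linalg.lstsq(theta, dxdt, rcond=None)
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        for j in range(dxdt.shape[1]):
            keep = ~small[:, j]
            if keep.any():
                # Refit only the surviving terms for state variable j.
                sol = np.linalg.lstsq(theta[:, keep], dxdt[:, j], rcond=None)[0]
                xi[keep, j] = sol
    return xi


if __name__ == "__main__":
    dt = 0.01
    X = simulate(dt=dt)
    # Derivatives estimated from the trajectory by finite differences.
    dXdt = np.gradient(X, dt, axis=0, edge_order=2)
    xi = stlsq(poly_library(X), dXdt)
    for j, name in enumerate(["dx/dt", "dy/dt"]):
        active = [f"{xi[i, j]:+.3f}*{TERMS[i]}" for i in range(len(TERMS)) if xi[i, j] != 0.0]
        print(name, "=", " ".join(active))
```

The scaling worry is visible even in this toy: the library has one column per candidate term, and the number of polynomial terms grows combinatorially with the state dimension, so a dictionary that is fine for two state variables becomes unwieldy for something like a raw image sensor array.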

## 2 References

*Biology Letters*.

*Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control*.

*Proceedings of the National Academy of Sciences*.

*Nature Computational Science*.

*Machine Learning and the Physical Sciences Workshop at the 33rd Conference on Neural Information Processing Systems (NeurIPS)*.

*Science*.

*Nature Communications*.

*Science Advances*.

*Royal Society Open Science*.

*arXiv:1910.08892 [Stat]*.

*Physica D: Nonlinear Phenomena*.

*arXiv:2107.10127 [Math, Stat]*.

*arXiv:2107.10879 [Physics]*.

*arXiv:2003.11755 [Cs, Stat]*.

*Science*.

*Science Advances*.