Symbolic regression

Fajardo-Fontiveros et al. (2023):

[C]onsider a dataset \(D=\left\{\left(y_i, \mathbf{x}_i\right)\right\}\), with \(i=1, \ldots, N\), generated using the closed form model \(m^*\), so that \(y_i=m^*\left(\mathbf{x}_i, \theta^*\right)+\epsilon_i\) with \(\theta^*\) being the parameters of the model, and \(\epsilon_i\) a random unbiased observation noise drawn from the normal distribution with variance \(s_\epsilon^2\). […] The question we are interested in is: Assuming that \(m^*\) can be expressed in closed form, when is it possible to identify it as the true generating model among all possible closed-form mathematical models, for someone who does not know the true model beforehand? Note that our focus is on learning the structure of the model \(m^*\) and not the values of the parameters \(\theta^*\), a problem that has received much more attention from the theoretical point of view. Additionally, we are interested in situations in which the dimension of the feature space \(\mathbf{x} \in \mathbb{R}^k\) is relatively small (compared to typical feature spaces in machine learning settings), which is the relevant regime for symbolic regression and model discovery.

That paper is particularly interesting for the connection to the statistical mechanics of statistics.
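
To make the setup in the quote concrete, here is a minimal sketch of the data-generating process \(y_i=m^*(\mathbf{x}_i, \theta^*)+\epsilon_i\) followed by a toy structure-selection step. Everything specific in it is an assumption for illustration: the true model \(m^*\), the values of \(\theta^*\), the four hand-picked candidate structures, and the use of BIC as the selection score. A real symbolic regression method searches a far larger space of expressions, and this sketch is not the procedure used in the paper.

```python
import numpy as np


def simulate(n, noise_sd, rng):
    """Draw D = {(y_i, x_i)} from a hypothetical closed-form model m*.

    Assumed for illustration: m*(x, theta) = theta_0 * sin(x) + theta_1 * x^2
    with theta* = (1.0, 0.5), and Gaussian noise with standard deviation noise_sd.
    """
    theta_star = (1.0, 0.5)
    x = rng.uniform(-2.0, 2.0, size=n)
    eps = rng.normal(0.0, noise_sd, size=n)
    y = theta_star[0] * np.sin(x) + theta_star[1] * x**2 + eps
    return x, y


# Hypothetical candidate structures, each linear in its parameters so that
# ordinary least squares is enough to fit them.
CANDIDATES = {
    "a*x": lambda x: np.column_stack([x]),
    "a*x + b*x^2": lambda x: np.column_stack([x, x**2]),
    "a*sin(x) + b*x^2": lambda x: np.column_stack([np.sin(x), x**2]),
    "a*sin(x) + b*x^2 + c*x^3": lambda x: np.column_stack([np.sin(x), x**2, x**3]),
}


def bic(x, y, features):
    """BIC (up to an additive constant) for a Gaussian-noise least-squares fit."""
    X = features(x)
    n, n_coef = X.shape
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ coef) ** 2))
    k = n_coef + 1  # fitted coefficients plus the noise variance
    return n * np.log(rss / n) + k * np.log(n)


def select_model(x, y):
    """Score every candidate structure and return the lowest-BIC one."""
    scores = {name: bic(x, y, f) for name, f in CANDIDATES.items()}
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for noise_sd in (0.05, 5.0):
        x, y = simulate(200, noise_sd, rng)
        best, scores = select_model(x, y)
        print(f"noise_sd={noise_sd}: selected {best}")
```

Only the structure is being selected here; the parameters are refitted for each candidate, which mirrors the quote's distinction between learning \(m^*\) and learning \(\theta^*\). Rerunning with larger `noise_sd` or smaller `n` is a cheap way to see how noisy data can stop favouring the generating structure.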


Atkinson, Steven, Waad Subber, and Liping Wang. 2019. “Data-Driven Discovery of Free-Form Governing Differential Equations.” In, 7.
Brunton, Steven L., Joshua L. Proctor, and J. Nathan Kutz. 2016. “Discovering Governing Equations from Data by Sparse Identification of Nonlinear Dynamical Systems.” Proceedings of the National Academy of Sciences 113 (15): 3932–37.
Cranmer, Miles D., Rui Xu, Peter Battaglia, and Shirley Ho. 2019. “Learning Symbolic Physics with Graph Networks.” In Machine Learning and the Physical Sciences Workshop at the 33rd Conference on Neural Information Processing Systems (NeurIPS), 6.
Evans, James, and Andrey Rzhetsky. 2010. “Machine Science.” Science 329 (5990): 399–400.
Fajardo-Fontiveros, Oscar, Ignasi Reichardt, Harry R. De Los Ríos, Jordi Duch, Marta Sales-Pardo, and Roger Guimerà. 2023. “Fundamental Limits to Learning Closed-Form Mathematical Models from Data.” Nature Communications 14 (1): 1043.
Guimerà, Roger, Ignasi Reichardt, Antoni Aguilar-Mogas, Francesco A. Massucci, Manuel Miranda, Jordi Pallarès, and Marta Sales-Pardo. 2020. “A Bayesian Machine Scientist to Aid in the Solution of Challenging Scientific Problems.” Science Advances 6 (5): eaav6971.
Hirsh, Seth M., David A. Barajas-Solano, and J. Nathan Kutz. 2022. “Sparsifying Priors for Bayesian Uncertainty Quantification in Model Discovery.” Royal Society Open Science 9 (2): 211823.
Jin, Ying, Weilin Fu, Jian Kang, Jiadong Guo, and Jian Guo. 2020. “Bayesian Symbolic Regression.” arXiv:1910.08892 [Stat], January.
Schmidt, Michael, and Hod Lipson. 2009. “Distilling Free-Form Natural Laws from Experimental Data.” Science 324 (5923): 81–85.
Udrescu, Silviu-Marian, and Max Tegmark. 2020. “AI Feynman: A Physics-Inspired Method for Symbolic Regression.” Science Advances 6 (16): eaay2631.
