Functional regression

Statistics where the samples are not just data but whole curves and manifolds, or subsamples from them. Function approximation meets statistics, especially in the Karhunen-Loève expansion.

Regression using curves

To quote Jim Ramsay:

Functional data analysis, […] is about the analysis of information on curves or functions. For example, these twenty traces of the writing of “fda” are curves in two ways: first, as static traces on the page that you see after the writing is finished, and second, as two sets of functions of time, one for the horizontal “X” coordinate, and the other for the vertical “Y” coordinate.

FDA is a collection of statistical techniques for answering questions like, “What are the main ways in which the curves vary from one writing to another?” In fact, most of the questions and problems associated with the usual multivariate data analyzed by statistical packages like SAS and SPSS have their functional counterparts.

But what is unique about functional data is the possibility of also using information on the rates of change or derivatives of the curves. We use slopes, curvatures, and other characteristics made available because these curves are intrinsically smooth, and we can use this information in many useful ways. For example, our high school physics tells us that force = mass times acceleration, and that suggests that we look at the acceleration or second derivative of the pen’s position as a function of time. What we see in the plot of the magnitudes of the acceleration vector is that acceleration hits nearly ten meters/second/second. That’s a lot of energy! Equally remarkable is the stability of these acceleration records from one trial to the next. Also, note that where the acceleration magnitudes are near zero, both the X and Y accelerations must simultaneously be zero. The brain seems to know what it’s doing!
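To make the point about derivatives concrete, here is a minimal sketch of estimating acceleration from sampled pen positions. It does not use Ramsay's handwriting data or his method: the circular trajectory, the sampling grid, and the choice of an interpolating quintic spline via SciPy's `UnivariateSpline` are all assumptions made for illustration.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Hypothetical pen trajectory: a circle of radius 5 cm traced once per second.
t = np.linspace(0.0, 1.0, 200)          # time in seconds
x = 0.05 * np.sin(2 * np.pi * t)        # horizontal position in metres
y = 0.05 * np.cos(2 * np.pi * t)        # vertical position in metres

# Quintic splines, so that the second derivative is still a smooth cubic.
# s=0.0 interpolates the samples exactly; noisy data would need s > 0.
spline_x = UnivariateSpline(t, x, k=5, s=0.0)
spline_y = UnivariateSpline(t, y, k=5, s=0.0)

# Second derivatives of each coordinate, evaluated on the sampling grid.
accel_x = spline_x.derivative(n=2)(t)
accel_y = spline_y.derivative(n=2)(t)

# Magnitude of the acceleration vector at each time point.
accel = np.hypot(accel_x, accel_y)

# For uniform circular motion the exact magnitude is radius * (2 * pi) ** 2.
expected = 0.05 * (2 * np.pi) ** 2
```

Away from the endpoints, `accel` should sit close to `expected`. Derivative estimates are usually least reliable near the boundary of the observation window, which is one reason to trim or pad real recordings.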

Regression upon the shapes of entire curves. A stylishly nonparametric thing to do. It can be simpler than you’d think: often it is just typical statistics done in a functional basis, Hilbert-space-style. You can even try to infer the differential operator that defines continuous dynamics. Apropos that, see the kernel trick. Many other nonparametric methods of function approximation, such as spline bases, density estimation, and mixture models, are generalised by the functional data analysis representation.
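As a rough illustration of "typical statistics on a functional basis", here is a sketch of scalar-on-function regression by functional principal components. The curves are projected onto an empirical Karhunen-Loève basis, and ordinary least squares is run on the scores. Everything here is an assumption for the sake of the example: the simulated curves, the true coefficient function `beta`, the noise level, and the truncation level `K` are invented, and real data would need smoothing and a principled choice of `K`.

```python
import numpy as np

rng = np.random.default_rng(0)

n, m, K = 200, 101, 3                    # curves, grid points, components kept
t = np.linspace(0.0, 1.0, m)
dt = t[1] - t[0]

# Simulated curves: random combinations of three smooth basis functions.
Phi = np.sqrt(2.0) * np.vstack([
    np.sin(2 * np.pi * t),
    np.cos(2 * np.pi * t),
    np.sin(4 * np.pi * t),
])
xi = rng.standard_normal((n, 3)) * np.array([2.0, 1.0, 0.5])
X = xi @ Phi                             # shape (n, m), one curve per row

# Hypothetical true model: y_i = integral of beta(t) * X_i(t) dt + noise.
beta = 1.0 * Phi[0] - 0.5 * Phi[2]
y = X @ beta * dt + 0.01 * rng.standard_normal(n)

# Empirical Karhunen-Loève basis from the SVD of the centred curves.
X_c = X - X.mean(axis=0)
y_c = y - y.mean()
_, _, Vt = np.linalg.svd(X_c, full_matrices=False)
phi_hat = Vt[:K] / np.sqrt(dt)           # rescaled so sum(phi_hat**2) * dt == 1
scores = X_c @ phi_hat.T * dt            # inner products <X_i - mean, phi_k>

# Ordinary least squares on the scores, then map back to a function of t.
b, *_ = np.linalg.lstsq(scores, y_c, rcond=None)
beta_hat = b @ phi_hat

max_err = np.max(np.abs(beta_hat - beta))
```

In this toy setting `beta` lies in the span of the curves, so `beta_hat` recovers it up to noise. In general only the part of the coefficient function visible through the retained components is identifiable, which is why the truncation level acts as a regulariser.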

See (Wahba 1990) for the foundational spline-smoothing work, and check the big-name textbooks (Ramsay and Silverman 2005; Ferraty and Vieu 2006b) for the modern framing.

An interesting related question is how you align the curves that are your objects of study. That is a problem of warping.

Functional autoregression

I’m interested in functional autoregressive models. In these we are concerned with a curve evolving in time. AFAICT this idea originates from (Bosq 1998) but has been generalised since then.
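The simplest case is usually written as a first-order model, X_{n+1} = Ψ X_n + ε_{n+1}, where Ψ is an integral operator with kernel ψ(t, s). The sketch below simulates such a process on a grid and estimates the kernel by projecting onto a few functional principal components and fitting a vector autoregression to the scores. It is a toy version under stated assumptions, not the estimator of any particular paper: the finite-rank kernel, the coefficient matrix `A`, the noise model, and the truncation level `K` are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)

m, N, K = 50, 5000, 2                    # grid points, curves, components kept
t = (np.arange(m) + 0.5) / m             # midpoint grid on [0, 1]
dt = 1.0 / m

# Hypothetical finite-rank kernel psi(t, s) = sum_jk Phi_j(t) A_jk Phi_k(s).
Phi = np.sqrt(2.0) * np.vstack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])
A = np.array([[0.6, 0.2], [-0.2, 0.4]])  # eigenvalue moduli below 1: stationary
psi_true = Phi.T @ A @ Phi               # shape (m, m)

# Simulate X_{n+1}(t) = integral psi(t, s) X_n(s) ds + eps_{n+1}(t).
X = np.zeros((N, m))
for n in range(N - 1):
    eps = rng.standard_normal(2) @ Phi   # smooth random innovation curve
    X[n + 1] = psi_true @ X[n] * dt + eps

# Project the centred curves onto their leading principal components.
X_c = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(X_c, full_matrices=False)
phi_hat = Vt[:K] / np.sqrt(dt)           # orthonormal under the dt inner product
scores = X_c @ phi_hat.T * dt            # shape (N, K)

# Fit a VAR(1) to the scores: scores[n + 1] is approximately B @ scores[n].
M, *_ = np.linalg.lstsq(scores[:-1], scores[1:], rcond=None)
B = M.T

# Map the coefficient matrix back to a kernel on the grid.
psi_hat = phi_hat.T @ B @ phi_hat

rel_err = np.linalg.norm(psi_hat - psi_true) / np.linalg.norm(psi_true)
```

Comparing kernels rather than coefficient matrices sidesteps the fact that the estimated components are only determined up to rotation and sign. With an infinite-rank kernel, the truncation to `K` components becomes a genuine approximation.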


References

Arribas-Gil, Ana, and Juan Romo. 2012. “Robust Depth-Based Estimation in the Time Warping Model.” Biostatistics (Oxford, England) 13 (3): 398–414.
Bathia, Neil, Qiwei Yao, and Flavio Ziegelmann. 2010. “Identifying the Finite Dimensionality of Curve Time Series.” Annals of Statistics 38 (6): 3352–86.
Battey, Heather, Jianqing Fan, Han Liu, Junwei Lu, and Ziwei Zhu. 2015. “Distributed Estimation and Inference with Statistical Guarantees.” arXiv:1509.05457 [Math, Stat], September.
Battey, Heather, and Oliver Linton. 2014. “Nonparametric Estimation of Multivariate Elliptic Densities via Finite Mixture Sieves.” Journal of Multivariate Analysis 123 (January): 43–67.
Battey, Heather, and Han Liu. 2013. “Smooth Projected Density Estimation.” arXiv:1308.3968 [Stat], August.
———. 2016. “Nonparametrically Filtered Parametric Density Estimation.”
Battey, Heather, and Alessio Sancetta. 2013. “Conditional Estimation for Dependent Functional Data.” Journal of Multivariate Analysis 120 (September): 1–17.
Bosq, Denis. 1998. Nonparametric Statistics for Stochastic Processes: Estimation and Prediction. 2nd ed. Lecture Notes in Statistics 110. New York: Springer.
Dupont, Emilien, Hyunjik Kim, S. M. Ali Eslami, Danilo Jimenez Rezende, and Dan Rosenbaum. 2022. “From Data to Functa: Your Data Point Is a Function and You Can Treat It Like One.” In Proceedings of the 39th International Conference on Machine Learning, 5694–5725. PMLR.
Eilers, Paul H. C., and Brian D. Marx. 1996. “Flexible Smoothing with B-Splines and Penalties.” Statistical Science 11 (2): 89–121.
Ferraty, Frédéric, Ali Laksaci, Amel Tadj, and Philippe Vieu. 2011. “Kernel Regression with Functional Response.” Electronic Journal of Statistics 5: 159–71.
Ferraty, Frédéric, and Philippe Vieu, eds. 2006a. “Introduction to Functional Nonparametric Statistics.” In Nonparametric Functional Data Analysis: Theory and Practice, 5–10. Springer Series in Statistics. New York, NY: Springer.
———. 2006b. Nonparametric Functional Data Analysis: Theory and Practice. Springer Series in Statistics. New York: Springer-Verlag.
Han, Kyunghee, and Hyejin Shin. n.d. “Functional Linear Regression for Functional Response via Sparse Basis Selection.”
Heinonen, Markus, and Florence d’Alché-Buc. 2014. “Learning Nonparametric Differential Equations with Operator-Valued Kernels and Gradient Matching.” arXiv:1411.5172 [Cs, Stat], November.
Horváth, Lajos, Marie Hušková, and Piotr Kokoszka. 2010. “Testing the Stability of the Functional Autoregressive Process.” Journal of Multivariate Analysis, Statistical Methods and Problems in Infinite-dimensional Spaces, 101 (2): 352–67.
Horváth, Lajos, and Piotr Kokoszka. 2012a. “Functional Autoregressive Model.” In Inference for Functional Data with Applications, edited by Lajos Horváth and Piotr Kokoszka, 235–52. Springer Series in Statistics. New York, NY: Springer.
———. 2012b. Inference for Functional Data with Applications. Vol. 200. Springer Series in Statistics. New York: Springer.
Hsing, Tailen, and Randall L. Eubank. 2015. Theoretical Foundations of Functional Data Analysis, with an Introduction to Linear Operators. Wiley Series in Probability and Statistics. Chichester, West Sussex: John Wiley and Sons, Inc.
Kadri, Hachem, Emmanuel Duflos, Philippe Preux, Stéphane Canu, Alain Rakotomamonjy, and Julien Audiffren. 2016. “Operator-Valued Kernels for Learning from Functional Response Data.” The Journal of Machine Learning Research 17 (1): 613–66.
Koner, Salil, and Ana-Maria Staicu. 2023. “Second-Generation Functional Data.” Annual Review of Statistics and Its Application 10 (1): 547–72.
Lian, Heng. 2007. “Nonlinear Functional Models for Functional Responses in Reproducing Kernel Hilbert Spaces.” Canadian Journal of Statistics 35 (4): 597–606.
Liu, Chong, Surajit Ray, and Giles Hooker. 2014. “Functional Principal Components Analysis of Spatially Correlated Data.” arXiv:1411.4681 [Math, Stat], November.
Mirzargar, Mahsa, Ross T. Whitaker, and Robert M. Kirby. 2014. “Curve Boxplot: Generalization of Boxplot for Ensembles of Curves.” IEEE Transactions on Visualization and Computer Graphics 20 (12): 2654–63.
Morris, Jeffrey S. 2015. “Functional Regression.” Annual Review of Statistics and Its Application 2 (1): 321–59.
Paparoditis, Efstathios, and Theofanis Sapatinas. 2014. “Bootstrap-Based Testing for Functional Data.” arXiv:1409.4317 [Math, Stat], September.
Pham, Tung, and Victor Panaretos. 2016. “Methodology and Convergence Rates for Functional Time Series Regression.” arXiv:1612.07197 [Math, Stat], December.
Ramsay, Jim O., Giles Hooker, and Spencer Graves. 2009. Functional Data Analysis with R and MATLAB. 2009 edition. Dordrecht; New York: Springer.
Ramsay, Jim O., and B. W. Silverman. 2005. Functional Data Analysis. Springer Series in Statistics. New York: Springer-Verlag.
Saha, Akash, and Palaniappan Balamurugan. 2020. “Learning with Operator-Valued Kernels in Reproducing Kernel Krein Spaces.” In Advances in Neural Information Processing Systems. Vol. 33.
Shang, Han Lin. 2014. “A Survey of Functional Principal Component Analysis.” AStA Advances in Statistical Analysis 98 (2): 121–42.
Sun, Ying, and Marc G. Genton. 2011. “Functional Boxplots.” Journal of Computational and Graphical Statistics 20 (2): 316–34.
Tavakoli, Shahin, and Victor M. Panaretos. 2016. “Detecting and Localizing Differences in Functional Time Series Dynamics: A Case Study in Molecular Biophysics.” Journal of the American Statistical Association, March, 1–31.
Wahba, Grace. 1990. Spline Models for Observational Data. SIAM.
