# (Weighted) least squares fits

September 22, 2016 — October 4, 2022

A classic. Surprisingly deep.

A few non-comprehensive notes to approximating functions from data by the arbitrary-but-convenient expedient of minimising the sum of the squares of the deviances between two things; The linear algebra of least squares fits seems well-trodden and perenially classic. Used in many problems. e.g. lasso regression, Gaussian belief propagation. Least squares problems are a kind of platonic ideal of convex optimisation.

Francis Bach is interested in least squares relaxations, which he connects to Fourier domain. See his Sums-of-squares for dummies: a view from the Fourier domain.

## 1 Introduction

- the Ceres Solver Bibliography is a good start.
- Boyd and Vandenberghe’s Introduction to Applied Linear Algebra: Vectors, Matrices, and Least Squares is a solid introduction to both linear algebra and least squares. Julia Companion. Python Companion.

## 2 Iteratively reweighted

## 3 Nonlinear least squares

Trust region and Levenberg-Marquardt methods in 2nd order optimisation.

## 4 Tools

Nonlinear least squares with ceres-solver:

Ceres Solver is an open source C++ library for modeling and solving large, complicated optimization problems. It can be used to solve Non-linear Least Squares problems with bounds constraints and general unconstrained optimization problems. It is a mature, feature rich, and performant library that has been used in production at Google since 2010.

gradslam gradslam/gradslam: gradslam is an open source differentiable dense SLAM library for PyTorch (Jatavallabhula, Iyer, and Paull 2020)

## 5 Jaxopt

jax toolkit JAXopt includes lots of neatg Nonlinear least squares tooling.

## 6 KeOps

The KeOps library lets you compute reductions of large arrays whose entries are given by a mathematical formula or a neural network. It combines efficient C++ routines with an automatic differentiation engine and can be used with Python (NumPy, PyTorch), Matlab and R.

It is perfectly suited to the computation of kernel matrix-vector products, K-nearest neighbors queries, N-body interactions, point cloud convolutions and the associated gradients. Crucially, it performs well even when the corresponding kernel or distance matrices do not fit into the RAM or GPU memory. Compared with a PyTorch GPU baseline, KeOps provides a x10-x100 speed-up on a wide range of geometric applications, from kernel methods to geometric deep learning.

## 7 Incoming

- The Gauss–Markov Theorem “under certain conditions, the least squares estimator is the minimum-variance linear unbiased estimator of the model parameters.”
- Weighted Least Squares

## 8 References

*arXiv:1701.09120 [Math, Stat]*.

*Julia Companion to Introduction to Applied Linear Algebra: Vectors, Matrices, and Least Squares*.

*Management Science*.

*Journal of Machine Learning Research*.

*IEEE International Conference on Acoustics, Speech and Signal Processing, 2008. ICASSP 2008*.

*arXiv:1610.08244 [Stat]*.

*Mathematical Programming*.

*Computational Geosciences*.

*arXiv:1702.06429 [Math, Stat]*.

*Computational Statistics & Data Analysis*, Nonlinear Methods and Data Mining,.

*The Annals of Applied Statistics*.

*Journal of Statistical Software*.

*IEEE Transactions on Signal Processing*.

*Matrices, Moments and Quadrature with Applications*.

*A Distribution-Free Theory of Nonparametric Regression*. Springer Series in Statistics.

*2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)*.

*2020 IEEE International Conference on Robotics and Automation (ICRA)*.

*arXiv:1011.1576 [Cs]*.

*Python Language Companion to Introduction to Applied Linear Algebra: Vectors, Matrices, and Least Squares*.

*Randomized Algorithms for Matrices and Data*.

*Statistical Science*.

*Operations Research*.

*The Annals of Statistics*.

*Physical Review E*.

*Computational Optimization and Applications*.