a.k.a. improper learning

Notes on the general technique of increasing the number of slack parameters you have, especially in machine learning. Convex relaxations often hinge upon this.

Zeyuan Allen-Zhu, Yuanzhi Li and Zhao Song argue that the combination of overparameterization and SGD is the secret to how deep learning works.
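For intuition only, here is a toy sketch of why extra parameters can make training easy. It is not the deep-network setting those authors analyse: it assumes a linear least-squares model, and it swaps SGD for plain full-batch gradient descent to keep the example deterministic. The sizes, seed and variable names are all made up for illustration. With far more parameters than data points, gradient descent started from zero drives the training loss to (numerically) zero and lands on the minimum-norm interpolating solution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: 20 data points, 200 parameters (overparameterized).
n_samples, n_params = 20, 200
X = rng.normal(size=(n_samples, n_params))
y = rng.normal(size=n_samples)

# Full-batch gradient descent on 0.5 * ||X w - y||^2, started at zero.
w = np.zeros(n_params)
step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1 / (largest singular value)^2
for _ in range(500):
    grad = X.T @ (X @ w - y)
    w -= step * grad

# Training loss after descent, and distance to the minimum-norm interpolant.
train_loss = 0.5 * np.sum((X @ w - y) ** 2)
w_min_norm = np.linalg.pinv(X) @ y
gap = np.linalg.norm(w - w_min_norm)

print(f"training loss: {train_loss:.2e}")
print(f"distance to min-norm solution: {gap:.2e}")
```

The iterates never leave the row space of `X`, which is why the limit coincides with the pseudoinverse solution; in the underparameterized case (more data than parameters) the loss would generally not reach zero.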

RJ Lipton discusses Arno van den Essen’s incidental work on stabilisation methods of polynomials, which relates, AFAICT, to transfer-function-type stability. Does this connect to the overparameterisation of rational transfer function analysis I so enjoyed? (HaMR16)


