Online learning
September 30, 2018 — October 20, 2022
An online learning perspective gives bounds on the regret: the gap in cumulative performance between the online estimator and the best fixed estimator chosen in hindsight, with access to the entire data.
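In the online convex optimization setting this is usually written as something like (my notation, following the standard setup) \[\mathrm{Regret}_T = \sum_{t=1}^{T} f_t(x_t) - \min_{x \in \mathcal{K}} \sum_{t=1}^{T} f_t(x),\] where \(f_t\) is the loss revealed at round \(t\), \(x_t\) the learner’s decision, and \(\mathcal{K}\) the feasible set.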
A lot of things are sort of online learning; stochastic gradient descent, for example, is closely related. However, if you meet someone who claims to study “online learning” they usually mean to emphasize particular things. It comes up frequently in the context of bandit problems; connection TBD.
Hazan’s Introduction to online convex optimization looks fresh.
1 Follow-the-regularized leader
TBD
2 Parameter-free
3 Covariance
Learning covariance online is a much more basic application than the other fancy things considered here, but I guess it still fits. John D. Cook:
This better way of computing variance goes back to a 1962 paper by B. P. Welford and is presented in Donald Knuth’s Art of Computer Programming, Vol 2, page 232, 3rd edition. […]
- Initialize \(M_1 = x_1\) and \(S_1 = 0.\)
- For subsequent \(x\)s, use the recurrence formulas \[M_k = M_{k-1} + (x_k - M_{k-1})/k\] \[S_k = S_{k-1} + (x_k - M_{k-1})(x_k - M_k).\]
- For \(2 \leq k \leq n\), the \(k\)th estimate of the variance is \[s_k^2 = S_k/(k - 1).\]
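A minimal Python sketch of that recurrence (class and variable names are mine, not from Cook or Welford):

```python
# A sketch of Welford's running mean/variance recurrence, as quoted above.
class RunningVariance:
    """Accumulates mean and variance of a stream, one value at a time."""

    def __init__(self):
        self.k = 0    # number of observations seen so far
        self.M = 0.0  # running mean, M_k
        self.S = 0.0  # running sum of squared deviations, S_k

    def update(self, x):
        self.k += 1
        if self.k == 1:
            # Initialize M_1 = x_1 and S_1 = 0
            self.M, self.S = x, 0.0
        else:
            # M_k = M_{k-1} + (x_k - M_{k-1}) / k
            # S_k = S_{k-1} + (x_k - M_{k-1}) (x_k - M_k)
            M_prev = self.M
            self.M = M_prev + (x - M_prev) / self.k
            self.S += (x - M_prev) * (x - self.M)

    @property
    def mean(self):
        return self.M

    @property
    def variance(self):
        # s_k^2 = S_k / (k - 1), defined for k >= 2
        return self.S / (self.k - 1) if self.k >= 2 else float("nan")


rv = RunningVariance()
for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    rv.update(x)
print(rv.mean, rv.variance)  # 5.0, ~4.571 (matches the batch sample variance)
```

The same trick presumably extends to actual covariance, the heading’s namesake, by accumulating an outer-product analogue of \(S_k\) for each pair of coordinates.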
4 Anomaly detection
See also anomaly detection.