Meta learning
Few-shot learning, learning fast weights, learning to learn
September 16, 2021
Placeholder for what we now call few-shot learning, I think?
Is this what Schmidhuber means when he discusses neural nets learning to program neural nets with fast weights? He dates that idea to the 1990s (Schmidhuber 1992) and relates it via Schlag, Irie, and Schmidhuber (2021) to transformer models.
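If that reading is right, the mechanism is easy to sketch: a slow network maps a context to the weights of a fast network, which then processes the query input. Here is a toy JAX sketch; all shapes, names, and the single-linear-layer structure are my own illustrative assumptions, not Schmidhuber's exact construction:

```python
# Toy "fast weights" sketch in JAX: a slow network reads a context
# vector and emits the weights of a small linear layer, which is then
# applied to a query input. Shapes and names are illustrative only.
import jax
import jax.numpy as jnp

D_CTX, D_IN, D_OUT = 8, 4, 3

def init_slow_params(key):
    # The slow net is a single linear map from the context to the
    # flattened (D_OUT x D_IN) fast-weight matrix plus a fast bias.
    return {
        "W": jax.random.normal(key, (D_OUT * D_IN + D_OUT, D_CTX)) * 0.1,
        "b": jnp.zeros(D_OUT * D_IN + D_OUT),
    }

def fast_forward(slow_params, context, x):
    # The slow net "programs" the fast net: context -> fast weights.
    theta = slow_params["W"] @ context + slow_params["b"]
    W_fast = theta[: D_OUT * D_IN].reshape(D_OUT, D_IN)
    b_fast = theta[D_OUT * D_IN :]
    # The fast net processes the query input with the generated weights.
    return W_fast @ x + b_fast

params = init_slow_params(jax.random.PRNGKey(0))
context = jax.random.normal(jax.random.PRNGKey(1), (D_CTX,))
x = jax.random.normal(jax.random.PRNGKey(2), (D_IN,))
print(fast_forward(params, context, x))  # shape (D_OUT,)
```

The slow parameters are the only thing trained; the fast weights exist just for the duration of one forward pass, which is the sense in which one net learns to program another.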
The mainstream, current framing for this is meta-learning; some starting points:
- MAML Explained (a toy inner/outer-loop sketch follows this list)
- An Overview of Meta-Learning Algorithms
- learn2learn
- Who models the models that model models? An exploration of GPT-3’s in-context model fitting ability
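The MAML recipe itself is compact in JAX, where the outer gradient flows through the inner gradient step automatically. A minimal sketch on toy linear-regression tasks; the hyperparameters, task distribution, and single inner step are illustrative assumptions, not any particular paper's setup:

```python
# Minimal MAML-style sketch in JAX: one inner gradient step per task,
# with the outer (meta) gradient taken through the adapted parameters.
import jax
import jax.numpy as jnp

INNER_LR, OUTER_LR = 0.1, 0.01

def predict(params, x):
    return x @ params["w"] + params["b"]

def loss(params, x, y):
    return jnp.mean((predict(params, x) - y) ** 2)

def inner_adapt(params, x_support, y_support):
    # One step of task-specific ("inner loop") gradient descent.
    grads = jax.grad(loss)(params, x_support, y_support)
    return jax.tree_util.tree_map(lambda p, g: p - INNER_LR * g, params, grads)

def outer_loss(params, task):
    x_s, y_s, x_q, y_q = task
    adapted = inner_adapt(params, x_s, y_s)
    # Query-set loss after adaptation; the meta-gradient flows
    # through the inner update above (second-order by default).
    return loss(adapted, x_q, y_q)

@jax.jit
def meta_step(params, task):
    grads = jax.grad(outer_loss)(params, task)
    return jax.tree_util.tree_map(lambda p, g: p - OUTER_LR * g, params, grads)

# Toy usage: each task is a linear map with a freshly drawn true weight.
key = jax.random.PRNGKey(0)
params = {"w": jnp.zeros((3,)), "b": jnp.array(0.0)}
for _ in range(100):
    key, k1, k2, k3 = jax.random.split(key, 4)
    w_true = jax.random.normal(k1, (3,))
    x_s = jax.random.normal(k2, (5, 3))
    x_q = jax.random.normal(k3, (5, 3))
    params = meta_step(params, (x_s, x_s @ w_true, x_q, x_q @ w_true))
```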
On the futility of trying to be clever (the bitter lesson redux) summarises some recent negative results:

> two recent papers (Raghu et al. 2020; Tian et al. 2020) show that in practice the inner-loop run doesn't really do much in these algorithms, so much so that one can safely do away with the inner loop entirely. This means that the success of these algorithms can be explained completely by standard (single-loop) learning on the entire lumped meta-training dataset. Another recent beautiful theory paper (Du et al. 2021) sheds some light on these experimental results.
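In code, that lumped baseline is something like the sketch below (loosely in the spirit of Tian et al. 2020; the nearest-centroid head and all names are my assumptions): pretrain a single embedding on all meta-training classes with ordinary supervised learning, then solve each few-shot episode with a fixed, simple head and no inner loop at all.

```python
# Few-shot classification with no inner loop: a single pretrained
# embedding plus a nearest-class-centroid head per episode.
import jax.numpy as jnp

def nearest_centroid_predict(embed, support_x, support_y, query_x, n_classes):
    """Classify query points by distance to class centroids in embedding
    space. `embed` is the one encoder trained on the lumped dataset."""
    z_support = embed(support_x)                    # (n_support, d)
    z_query = embed(query_x)                        # (n_query, d)
    centroids = jnp.stack([
        z_support[support_y == c].mean(axis=0)      # per-class mean embedding
        for c in range(n_classes)
    ])                                              # (n_classes, d)
    # Squared Euclidean distance from each query to each centroid.
    d2 = ((z_query[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return jnp.argmin(d2, axis=1)                   # (n_query,) labels
```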