Meta learning

Few-shot learning, learning fast weights, learning to learn

Placeholder for what is now usually called few-shot learning, I think?

Schmidhuber discusses this in terms of “neural nets learn to program neural nets with fast weights”, dates it to the 1990s (Schmidhuber 1992), and relates it to transformer models (Schlag, Irie, and Schmidhuber 2021).
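To make the fast-weight/transformer connection concrete, here is a minimal sketch in plain Python. It assumes the simplest possible setting: additive outer-product writes into a fast weight matrix, an identity feature map, and causal attention with no softmax and no normalisation. Those simplifications, and all the function and variable names, are mine for illustration; the cited papers work with feature maps and other update rules. Under these assumptions the two views produce the same outputs, because reading `W q` with `W` equal to the sum of `v_i k_i^T` is the same as summing `(k_i . q) v_i`.

```python
# Fast-weight view vs. unnormalised linear attention, in plain Python.
# Hypothetical helper names; identity feature map and no normalisation assumed.

def outer(v, k):
    """Outer product v k^T, returned as a list of rows."""
    return [[vi * kj for kj in k] for vi in v]

def matvec(W, q):
    """Matrix-vector product W q."""
    return [sum(wij * qj for wij, qj in zip(row, q)) for row in W]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fast_weight_outputs(keys, values, queries):
    """At each step, write v k^T into the fast weights W, then read W q."""
    d_v, d_k = len(values[0]), len(keys[0])
    W = [[0.0] * d_k for _ in range(d_v)]
    outputs = []
    for k, v, q in zip(keys, values, queries):
        update = outer(v, k)
        W = [[w + u for w, u in zip(w_row, u_row)]
             for w_row, u_row in zip(W, update)]
        outputs.append(matvec(W, q))
    return outputs

def linear_attention_outputs(keys, values, queries):
    """Causal attention without softmax: y_t = sum over i <= t of (k_i . q_t) v_i."""
    outputs = []
    for t, q in enumerate(queries):
        y = [0.0] * len(values[0])
        for k, v in zip(keys[: t + 1], values[: t + 1]):
            score = dot(k, q)
            y = [yi + score * vi for yi, vi in zip(y, v)]
        outputs.append(y)
    return outputs

# Toy data, made up for the example: 3 time steps, key dim 2, value dim 3.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[2.0, -1.0, 0.5], [0.0, 3.0, 1.0], [1.0, 1.0, 1.0]]
queries = [[1.0, 2.0], [0.5, -1.0], [2.0, 0.0]]

fw = fast_weight_outputs(keys, values, queries)
la = linear_attention_outputs(keys, values, queries)

# The two computations agree step by step (up to floating-point error).
same = all(abs(a - b) < 1e-9 for ya, yb in zip(fw, la) for a, b in zip(ya, yb))
print(same)
```

The fast-weight version carries a fixed-size matrix forward in time, while the attention version revisits all earlier key/value pairs at every step; that is the trade-off the equivalence exposes.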

The current mainstream framing is meta-learning, e.g. model-agnostic meta-learning (Finn, Abbeel, and Levine 2017).


Antoniou, Antreas, Harrison Edwards, and Amos Storkey. 2019. “How to Train Your MAML.” arXiv:1810.09502 [Cs, Stat], March.
Arnold, Sébastien M. R., Praateek Mahajan, Debajyoti Datta, Ian Bunner, and Konstantinos Saitas Zarkias. 2020. “Learn2learn: A Library for Meta-Learning Research.” arXiv:2008.12284 [Cs, Stat], August.
Brown, Tom B., Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, et al. 2020. “Language Models Are Few-Shot Learners.” arXiv:2005.14165 [Cs], June.
Erven, Tim van, and Wouter M Koolen. 2016. “MetaGrad: Multiple Learning Rates in Online Learning.” In Advances in Neural Information Processing Systems 29, edited by D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, 3666–74. Curran Associates, Inc.
Fiebrink, Rebecca, Dan Trueman, and Perry R. Cook. 2009. “A Metainstrument for Interactive, on-the-Fly Machine Learning.” In Proceedings of NIME, 2:3.
Finn, Chelsea, Pieter Abbeel, and Sergey Levine. 2017. “Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks.” In Proceedings of the 34th International Conference on Machine Learning, 1126–35. PMLR.
Künzel, Sören R., Jasjeet S. Sekhon, Peter J. Bickel, and Bin Yu. 2019. “Metalearners for Estimating Heterogeneous Treatment Effects Using Machine Learning.” Proceedings of the National Academy of Sciences 116 (10): 4156–65.
Lee, Kwonjoon, Subhransu Maji, Avinash Ravichandran, and Stefano Soatto. 2019. “Meta-Learning with Differentiable Convex Optimization,” April.
Medasani, Bharat, Anthony Gamst, Hong Ding, Wei Chen, Kristin A. Persson, Mark Asta, Andrew Canning, and Maciej Haranczyk. 2016. “Predicting Defect Behavior in B2 Intermetallics by Merging Ab Initio Modeling and Machine Learning.” Npj Computational Materials 2 (1): 1.
Mikulik, Vladimir, Grégoire Delétang, Tom McGrath, Tim Genewein, Miljan Martic, Shane Legg, and Pedro A. Ortega. 2020. “Meta-Trained Agents Implement Bayes-Optimal Agents.” arXiv.
Munkhdalai, Tsendsuren, Alessandro Sordoni, Tong Wang, and Adam Trischler. 2019. “Metalearned Neural Memory.” In Advances in Neural Information Processing Systems.
Ortega, Pedro A., Jane X. Wang, Mark Rowland, Tim Genewein, Zeb Kurth-Nelson, Razvan Pascanu, Nicolas Heess, et al. 2019. “Meta-Learning of Sequential Strategies.” arXiv.
Pestourie, Raphaël, Youssef Mroueh, Thanh V. Nguyen, Payel Das, and Steven G. Johnson. 2020. “Active Learning of Deep Surrogates for PDEs: Application to Metasurface Design.” Npj Computational Materials 6 (1): 1–7.
Rajeswaran, Aravind, Chelsea Finn, Sham Kakade, and Sergey Levine. 2019. “Meta-Learning with Implicit Gradients,” September.
Schlag, Imanol, Kazuki Irie, and Jürgen Schmidhuber. 2021. “Linear Transformers Are Secretly Fast Weight Programmers.” arXiv:2102.11174 [Cs], June.
Schmidhuber, Jürgen. 1992. “Learning to Control Fast-Weight Memories: An Alternative to Dynamic Recurrent Networks.” Neural Computation 4 (1): 131–39.
Uttl, Bob, Carmela A. White, and Daniela Wong Gonzalez. 2017. “Meta-Analysis of Faculty’s Teaching Effectiveness: Student Evaluation of Teaching Ratings and Student Learning Are Not Related.” Studies in Educational Evaluation, Evaluation of teaching: Challenges and promises, 54 (September): 22–42.
Zhang, Kaiqi, and Yu-Xiang Wang. 2022. “Deep Learning Meets Nonparametric Regression: Are Weight-Decayed DNNs Locally Adaptive?” arXiv.
