Differentiable learning of automata

Learning stack machines, random access machines, nested hierarchical parsing machines, Turing machines and whatever other automata-with-memory you wish, from data. In other words, teaching computers to program themselves, via a deep learning formalism.

Differentiable pointers

This is a kind of obvious idea and there are some charming toy examples. Indeed this is sort-of what we have traditionally imagined AI might do.
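
One toy reading of "differentiable pointer": replace a hard memory index with a softmax-weighted average over all slots, in the spirit of the content-based addressing in neural Turing machines (Graves, Wayne, and Danihelka 2014). The following is a minimal pure-Python sketch under that assumption; the names `soft_read` and `sharpness` are made up for illustration and do not come from any library.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-8)

def soft_read(memory, key, sharpness):
    """Content-addressed soft read: returns (weights, read_vector).

    `sharpness` (hypothetical name) scales the similarities before the
    softmax; large values approach a hard, one-slot pointer.
    """
    scores = [sharpness * cosine(row, key) for row in memory]
    weights = softmax(scores)
    width = len(memory[0])
    read = [sum(w * row[j] for w, row in zip(weights, memory))
            for j in range(width)]
    return weights, read

# Three memory slots; the key mostly resembles slot 1.
memory = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
key = [0.1, 0.9, 0.0]

soft_w, soft_r = soft_read(memory, key, sharpness=1.0)    # blurry pointer
sharp_w, sharp_r = soft_read(memory, key, sharpness=50.0)  # nearly hard pointer
```

Every operation here is smooth in `key`, `memory` and `sharpness`, which is what lets gradient descent adjust where the pointer points.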

Obviously a hypothetical superhuman Artificial General Intelligence would be good at handling computer-science problems. It’s not the absolute hippest research area right now though, on account of being hard in general, just as we always imagined from earlier attempts. Some progress has been made. My sense is that most of the hyped research that looks like differentiable computer learning is in the slightly-better-contained area of reinforcement learning, where more progress can be made, or in the hot area of transformer networks, which are harder to explain but solve the same kind of problems whilst looking different inside.

Related: grammatical inference.


Blazek claims his neural networks implement predicate logic directly and yet are tractable, which would be interesting to look into (Blazek and Lin 2020, 2021; Blazek, Venkatesh, and Lin 2021).

Google branded: Differentiable neural computers.

Christopher Olah’s characteristically pedagogic intro.

Adrian Colyer’s introduction to neural Turing machines.

Andrej Karpathy’s memory machine list.

Facebook’s GTN might solve this kind of problem:

GTN is an open source framework for automatic differentiation with a powerful, expressive type of graph called weighted finite-state transducers (WFSTs). Just as PyTorch provides a framework for automatic differentiation with tensors, GTN provides such a framework for WFSTs. AI researchers and engineers can use GTN to more effectively train graph-based machine learning models.
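
To make "differentiable WFST" concrete, here is a small pure-Python sketch of the forward score of a weighted acyclic graph in the log semiring, plus a finite-difference gradient with respect to each arc weight. It does not use GTN or imitate its API; the function names, the arc tuple layout `(src, dst, label, weight)` and the assumption that states are numbered in topological order are all illustrative choices.

```python
import math

def logsumexp(values):
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_score(num_states, arcs, start, accept):
    """Log of the summed exp-scores of all start-to-accept paths.

    Assumes an acyclic graph whose arcs satisfy src < dst, so visiting
    states in increasing order is a valid topological sweep.
    """
    alpha = [-math.inf] * num_states
    alpha[start] = 0.0
    for state in range(num_states):
        incoming = [alpha[s] + w for (s, d, _, w) in arcs if d == state]
        if incoming:
            alpha[state] = logsumexp(incoming + [alpha[state]])
    return alpha[accept]

def numeric_gradient(num_states, arcs, start, accept, eps=1e-6):
    """Central finite differences of the forward score per arc weight."""
    grads = []
    for i in range(len(arcs)):
        def perturbed(delta):
            return [(s, d, l, w + (delta if j == i else 0.0))
                    for j, (s, d, l, w) in enumerate(arcs)]
        up = forward_score(num_states, perturbed(+eps), start, accept)
        down = forward_score(num_states, perturbed(-eps), start, accept)
        grads.append((up - down) / (2 * eps))
    return grads

# Toy graph: paths a-c, b-c and d all lead from state 0 to state 2.
arcs = [(0, 1, "a", 1.0), (0, 1, "b", 0.0), (1, 2, "c", 0.5), (0, 2, "d", 0.2)]
score = forward_score(3, arcs, start=0, accept=2)
grads = numeric_gradient(3, arcs, start=0, accept=2)
```

The gradient of the forward score with respect to an arc weight is the posterior probability that a path uses that arc, so the gradients of the three arcs leaving the start state sum to one. An automatic differentiation framework would obtain the same quantities by backpropagation rather than by finite differences.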


Blazek, Paul J., and Milo M. Lin. 2020. “A Neural Network Model of Perception and Reasoning.” arXiv:2002.11319 [Cs, q-Bio], February.
———. 2021. “Explainable Neural Networks That Simulate Reasoning.” Nature Computational Science 1 (9): 607–18.
Blazek, Paul J., Kesavan Venkatesh, and Milo M. Lin. 2021. “Deep Distilling: Automated Code Generation Using Explainable Deep Learning.” arXiv:2111.08275 [Cs], November.
Bottou, Leon. 2011. “From Machine Learning to Machine Reasoning.” arXiv:1102.1808 [Cs], February.
Bubeck, Sébastien, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, et al. 2023. “Sparks of Artificial General Intelligence: Early Experiments with GPT-4.” arXiv.
Clark, Peter, Oyvind Tafjord, and Kyle Richardson. 2020. “Transformers as Soft Reasoners over Language.” In IJCAI 2020.
Ellis, Kevin, Armando Solar-Lezama, and Josh Tenenbaum. 2016. “Sampling for Bayesian Program Learning.” In Advances in Neural Information Processing Systems 29, edited by D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, 1289–97. Curran Associates, Inc.
Garcez, Artur d’Avila, and Luis C. Lamb. 2020. “Neurosymbolic AI: The 3rd Wave.” arXiv.
Graves, Alex, Greg Wayne, and Ivo Danihelka. 2014. “Neural Turing Machines.” arXiv:1410.5401 [Cs], October.
Graves, Alex, Greg Wayne, Malcolm Reynolds, Tim Harley, Ivo Danihelka, Agnieszka Grabska-Barwińska, Sergio Gómez Colmenarejo, et al. 2016. “Hybrid Computing Using a Neural Network with Dynamic External Memory.” Nature advance online publication (October).
Grefenstette, Edward, Karl Moritz Hermann, Mustafa Suleyman, and Phil Blunsom. 2015. “Learning to Transduce with Unbounded Memory.” arXiv:1506.02516 [Cs], June.
Gulcehre, Caglar, Sarath Chandar, Kyunghyun Cho, and Yoshua Bengio. 2016. “Dynamic Neural Turing Machine with Soft and Hard Addressing Schemes.” arXiv:1607.00036 [Cs], June.
Hannun, Awni, Vineel Pratap, Jacob Kahn, and Wei-Ning Hsu. 2020. “Differentiable Weighted Finite-State Transducers.” arXiv:2010.01003 [Cs, Stat], October.
Ikeda, M. 1989. “Decentralized Control of Large Scale Systems.” In Three Decades of Mathematical System Theory: A Collection of Surveys at the Occasion of the 50th Birthday of Jan C. Willems, edited by Hendrik Nijmeijer and Johannes M. Schumacher, 219–42. Lecture Notes in Control and Information Sciences. Berlin, Heidelberg: Springer.
Jaitly, Navdeep, David Sussillo, Quoc V. Le, Oriol Vinyals, Ilya Sutskever, and Samy Bengio. 2015. “A Neural Transducer.” arXiv:1511.04868 [Cs], November.
Kaiser, Łukasz, and Ilya Sutskever. 2015. “Neural GPUs Learn Algorithms.” arXiv:1511.08228 [Cs], November.
Kim, Jason Z., and Dani S. Bassett. 2022. “A Neural Programming Language for the Reservoir Computer.” arXiv:2203.05032 [Cond-Mat, Physics:nlin], March.
Lamb, Luis C., Artur Garcez, Marco Gori, Marcelo Prates, Pedro Avelar, and Moshe Vardi. 2020. “Graph Neural Networks Meet Neural-Symbolic Computing: A Survey and Perspective.” In IJCAI 2020.
Lample, Guillaume, and François Charton. 2019. “Deep Learning for Symbolic Mathematics.” arXiv:1912.01412 [Cs], December.
Looks, Moshe, Marcello Herreshoff, DeLesley Hutchins, and Peter Norvig. 2017. “Deep Learning with Dynamic Computation Graphs.” In Proceedings of ICLR.
Perez, Julien, and Fei Liu. 2016. “Gated End-to-End Memory Networks.” arXiv:1610.04211 [Cs, Stat], October.
Putzky, Patrick, and Max Welling. 2017. “Recurrent Inference Machines for Solving Inverse Problems.” arXiv:1706.04008 [Cs], June.
Wang, Cheng, and Mathias Niepert. 2019. “State-Regularized Recurrent Neural Networks.” arXiv.
Wang, Xin, Yudong Chen, and Wenwu Zhu. 2021. “A Survey on Curriculum Learning.” arXiv.
Wei, Qi, Kai Fan, Lawrence Carin, and Katherine A. Heller. 2017. “An Inner-Loop Free Solution to Inverse Problems Using Deep Neural Networks.” arXiv:1709.01841 [Cs], September.
Weston, Jason, Sumit Chopra, and Antoine Bordes. 2014. “Memory Networks.” arXiv:1410.3916 [Cs, Stat], October.
Zhang, Yi, Arturs Backurs, Sébastien Bubeck, Ronen Eldan, Suriya Gunasekar, and Tal Wagner. 2022. “Unveiling Transformers with LEGO: A Synthetic Reasoning Task.” arXiv.
