# Differentiable learning of automata

October 14, 2016 — March 5, 2024

Learning stack machines, random access machines, nested hierarchical parsing machines, Turing machines and whatever other automata-with-memory that you wish, from data. In other words, teaching computers to program themselves, via a deep learning formalism.
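To make "differentiable" concrete, here is a minimal sketch of one step of a continuous stack. The function name `soft_stack_step`, the fixed stack depth, and the three blending weights are all illustrative assumptions and are not taken from any particular paper or library. Instead of choosing one discrete operation, the step blends the results of push, pop and no-op, so the output varies smoothly with the weights and could in principle be trained by gradient descent.

```python
def soft_stack_step(stack, value, p_push, p_pop, p_noop):
    """Blend the outcomes of push, pop and no-op on a fixed-depth stack.

    The three weights are assumed to be non-negative and to sum to one,
    e.g. the softmax output of a controller network (hypothetical here).
    """
    pushed = [value] + stack[:-1]   # push: shift down, bottom cell falls off
    popped = stack[1:] + [0.0]      # pop: shift up, pad the bottom with zero
    return [
        p_push * a + p_pop * b + p_noop * c
        for a, b, c in zip(pushed, popped, stack)
    ]


stack = [0.0, 0.0, 0.0]
stack = soft_stack_step(stack, 1.0, 1.0, 0.0, 0.0)    # hard push of 1.0
stack = soft_stack_step(stack, 2.0, 1.0, 0.0, 0.0)    # hard push of 2.0
blended = soft_stack_step(stack, 4.0, 0.5, 0.0, 0.5)  # half push, half no-op
print(stack)
print(blended)
```

With weights of exactly 0 or 1 this behaves like an ordinary stack; fractional weights give a superposition of the possible next stacks.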

This is a fairly obvious idea, and there are some charming toy examples. Indeed, this is sort of what we have traditionally imagined AI might do.

Obviously, a hypothetical superhuman Artificial General Intelligence would be good at handling computer-science problems. It’s not the absolute hippest research area right now though, on account of being hard in general, just like we always imagined from earlier attempts. Some progress has been made. My sense is that most of the hyped research that looks like differentiable computer learning is either in the slightly-better-contained area of reinforcement learning, where more progress can be made, or in the hot area of transformer networks, which are harder to explain but solve the same kind of problems whilst looking different inside.

Related: grammatical inference.

## 1 Incoming

Blazek claims his neural networks implement predicate logic directly and yet are tractable, which would be interesting to look into (Blazek and Lin 2021, 2020; Blazek, Venkatesh, and Lin 2021).

Google-branded: differentiable neural computers.

Christopher Olah’s characteristically pedagogic intro.

Adrian Colyer’s introduction to neural Turing machines.

Andrej Karpathy’s memory machine list.

Facebook’s GTN might solve this kind of problem:

> GTN is an open source framework for automatic differentiation with a powerful, expressive type of graph called weighted finite-state transducers (WFSTs). Just as PyTorch provides a framework for automatic differentiation with tensors, GTN provides such a framework for WFSTs. AI researchers and engineers can use GTN to more effectively train graph-based machine learning models.
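To illustrate what differentiating through a weighted automaton means, here is a small standard-library sketch. It does not use GTN or reproduce its API. The arc format `(src, dst, label, weight)` and the helper names `forward_score` and `grad_wrt_arc` are assumptions made for this example, and a finite difference stands in for real automatic differentiation. The score is the log-sum-exp over all accepting paths for an input string, and its derivative with respect to an arc weight is the posterior probability that a path uses that arc.

```python
import math


def logsumexp(xs):
    """Numerically stable log(sum(exp(x))) for a non-empty list."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def forward_score(arcs, start, accept, labels):
    """Log-sum of the weights of all paths reading `labels` from start to accept."""
    alpha = {start: 0.0}  # log-score of reaching each state so far
    for sym in labels:
        incoming = {}
        for src, dst, lab, w in arcs:
            if lab == sym and src in alpha:
                incoming.setdefault(dst, []).append(alpha[src] + w)
        alpha = {s: logsumexp(v) for s, v in incoming.items()}
    return alpha.get(accept, -math.inf)


def grad_wrt_arc(arcs, k, start, accept, labels, eps=1e-5):
    """Central finite-difference estimate of d(score)/d(weight of arc k)."""
    def score(delta):
        bumped = list(arcs)
        src, dst, lab, w = bumped[k]
        bumped[k] = (src, dst, lab, w + delta)
        return forward_score(bumped, start, accept, labels)
    return (score(eps) - score(-eps)) / (2 * eps)


# Two parallel 'a' arcs from state 0 to state 1, then a 'b' self-loop on state 1.
arcs = [(0, 1, "a", 0.0), (0, 1, "a", 0.0), (1, 1, "b", -1.0)]
score = forward_score(arcs, 0, 1, ["a", "b"])
grad_a = grad_wrt_arc(arcs, 0, 0, 1, ["a", "b"])  # one of two equally weighted arcs
grad_b = grad_wrt_arc(arcs, 2, 0, 1, ["a", "b"])  # arc shared by every path
print(score, grad_a, grad_b)
```

Because the two `a` arcs carry equal weight, each takes about half of the gradient, while the `b` arc lies on every accepting path and takes a gradient of about one. A framework with automatic differentiation would compute these quantities exactly rather than by perturbing weights.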

## 2 References

*arXiv:2002.11319 [Cs, q-Bio]*.

*Nature Computational Science*.

*arXiv:2111.08275 [Cs]*.

*arXiv:1102.1808 [Cs]*.

*IJCAI 2020*.

*Advances in Neural Information Processing Systems 29*.

*arXiv:1410.5401 [Cs]*.

*Nature*.

*arXiv:1506.02516 [Cs]*.

*arXiv:1607.00036 [Cs]*.

*arXiv:2010.01003 [Cs, Stat]*.

*Three Decades of Mathematical System Theory: A Collection of Surveys at the Occasion of the 50th Birthday of Jan C. Willems*. Lecture Notes in Control and Information Sciences.

*arXiv:1511.04868 [Cs]*.

*arXiv:1511.08228 [Cs]*.

*arXiv:2203.05032 [Cond-Mat, Physics:nlin]*.

*Proceedings of The 25th International Conference on Artificial Intelligence and Statistics*.

*IJCAI 2020*.

*arXiv:1912.01412 [Cs]*.

*Proceedings of ICLR*.

*arXiv:1610.04211 [Cs, Stat]*.

*arXiv:1706.04008 [Cs]*.

*Patterns*.

*arXiv:1709.01841 [Cs]*.

*arXiv:1410.3916 [Cs, Stat]*.