Computational complexity and computability results in neural nets
2025-01-17 — 2025-05-23
Wherein neural nets are represented as prover‑verifier games, and neural interactive proofs are shown to capture PSPACE and NEXP decision procedures, with zero‑knowledge variants obtained for certain protocols.
We can build automata from neural nets. And they do weird things, like learn languages, in a predictable way, which is wildly at odds with our traditional understanding of the difficulty of the task (Paging Doctor Chomsky). How can we analyse NNs in terms of computational complexity? What are the useful results in this domain?
Related: grammatical inference, memory machines, overparameterization, NN compression, learning automata, NN at scale, explainability…
1 Interactive proof and debate
2 Incoming
Sean Trott, Perceptrons, XOR, and the first “AI winter” is a decent summary of the history of connectionist approaches dying out for a while in AI.
I haven’t read Minsky and Papert (1972), which is commonly credited with burying NNs in favour of GOFAI for a while, but I gather the argument is essentially geometric rather than computational. I think the parallels can be made, though (heh).
Language Generation in the Limit (Kleinberg and Mullainathan 2024) looks interesting.
Cheng Soon Ong described it as a
talk by Jon Kleinberg on language generation being easier than language identification. He mentions an interesting tension between hallucination and mode collapse.
Blazek claims his non-gradient-based neural networks implement predicate logic directly and yet remain tractable, which would be interesting to look into (Blazek and Lin 2021, 2020; Blazek, Venkatesh, and Lin 2021). This sounds like a reboot of the GOFAI project; I wonder what secret sauce gives the author hope, and whether it would be enough to persuade me too? I don’t put much prior stock in this plan.
Are Transformers Turing-complete? A Good Disguise Is All You Need.
Blurry JPEG or Frozen Concentrate
What we want is a description of a program \(p\) that corresponds to a set of possible Wikipedia articles, of which the real article is a random example. An ideal version of ChatGPT would choose a random article from this set. Dall-E, generative AI for art, works in a similar way, creating art that is a random example of what art might have been.
In terms of Kolmogorov complexity, this corresponds to the Kolmogorov structure function: roughly, the smallest program \(p\) that describes a set \(S\) of size \(m\) containing \(x\), with \(|p| + \log m \approx K(x)\). The string \(x\) is then just a typical element of \(S\): you can get a string like it by picking an element of \(S\) uniformly at random.
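A minimal sketch of the two-part code idea behind this: transmit \(x\) as (a description of a set \(S\) containing it) plus (an index into \(S\)), where the index costs \(\log_2 |S|\) bits. The toy "model class" of fixed-prefix strings and the function name below are my own illustration, not from any of the cited papers.

```python
import math

def index_cost_bits(S):
    """Bits needed to single out one element of a finite set S."""
    return math.ceil(math.log2(len(S)))

# Toy stand-in for "the set of possible articles": all 20-bit strings
# sharing a fixed 8-bit prefix. Describing S is cheap (the prefix plus
# the lengths); any particular member x then needs only log2 |S| = 12
# further bits for its index. Total cost ~ |p| + log m, as in the text.
prefix = "10110001"
S = {prefix + format(i, "012b") for i in range(2 ** 12)}

print(len(S))              # 4096 strings in the model class
print(index_cost_bits(S))  # 12 bits to pick one out
```

An "ideal ChatGPT" in this picture would emit the cheap description of \(S\) and then sample the index uniformly, rather than memorizing any one member.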
