Computational complexity and computability results in neural nets
2025-01-17 — 2025-05-23
We can build automata from neural nets, and they do weird things, like learning languages in a predictable way, which is wildly at odds with our traditional understanding of how hard that task is supposed to be (paging Doctor Chomsky). How can we analyse NNs in terms of computational complexity? What are the useful results in this domain? (A toy construction is sketched below.)
Related: grammatical inference, memory machines, overparameterization, NN compression, learning automata, NN at scale, explainability…
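As a minimal sketch of the "automata from neural nets" point, here is a hand-wired recurrent update (a permutation matrix acting on a one-hot state, no nonlinearity, no learning) that implements a two-state parity DFA. The example and weights are my own toy illustration, not taken from any particular paper:

```python
# Toy sketch: a one-hot-state "RNN" whose hand-set weights implement a
# two-state parity automaton over {0, 1}. Purely illustrative.
import numpy as np

# States: 0 = even number of 1s seen so far, 1 = odd. One-hot encoded.
# W[input_symbol] maps the one-hot state to the one-hot next state.
W = np.array([
    [[1, 0], [0, 1]],   # on input 0: stay in the same state
    [[0, 1], [1, 0]],   # on input 1: flip state
])

def run(bits):
    state = np.array([1, 0])          # start in the "even" state
    for b in bits:
        state = W[b] @ state          # one recurrent step: a linear map
    return int(state[1] == 1)         # accept iff parity is odd

print(run([1, 0, 1, 1]))  # -> 1 (three ones: odd parity)
```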
1 Computational cost of explaining complex reasoning
Intuitively, there might be, indeed should be, something computationally difficult about explaining the reasoning of a large neural network. Surely simpler things cannot explain more complex things?
As an extreme example, we might run into incompleteness results (prove the consistency of the following system of axioms…). I can believe such extreme esoterica would not be relevant in practice, but could nearby problems arise there anyway? TBC.
2 Neural interactive proofs
The main theoretical challenges are to: i) represent a given protocol in the form of a prover-verifier game (Kirchner et al. 2024); and ii) train the models to approximate the right equilibria of this game. While the first challenge is reasonably straightforward, the power of different protocols can vary greatly depending on several subtle details such as the number of messages the agents can send to each other, their ability to randomise, and whether messages can be sent privately to different agents. By taking these subtleties into account, we can show an equivalence between the equilibria of the game and valid proof systems for a range of different protocols.
| Model | Rounds | Complexity | Zero-knowledge |
|---|---|---|---|
| adp | 2 | NP | ❌ |
| debate | $T$ | PSPACE | ❌ |
| mac | 2 | MA | ❌ |
| nip | $T$ | PSPACE | ❌ |
| mnip | $T$ | NEXP | ❌ |
| zk-nip | $T$ | PSPACE | ✅ |
| zk-mnip | $T$ | NEXP | ✅ |
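To fix ideas, here is a minimal sketch of the round structure of such a prover-verifier protocol. The `Prover`/`Verifier` classes, message formats, and the decision rule are illustrative stand-ins of my own, not the paper's models or API:

```python
# Sketch of a T-round prover-verifier exchange with toy stand-in agents.
import random

class Prover:
    """Large, untrusted model: tries to convince the verifier of the claim."""
    def message(self, claim, transcript):
        return f"evidence({claim}, round={len(transcript)})"

class Verifier:
    """Small, trusted model: challenges the prover, then accepts or rejects."""
    def query(self, claim, transcript):
        return f"challenge({len(transcript)})"
    def decide(self, claim, transcript):
        # Placeholder verdict; a trained verifier would score the transcript.
        return random.random() > 0.5

def interactive_proof(claim, prover, verifier, rounds=4):
    """Exchange messages for `rounds` rounds, then return the verifier's verdict."""
    transcript = []
    for _ in range(rounds):
        transcript.append(("verifier", verifier.query(claim, transcript)))
        transcript.append(("prover", prover.message(claim, transcript)))
    return verifier.decide(claim, transcript)

print(interactive_proof("x satisfies P", Prover(), Verifier()))
```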
The second theoretical challenge arises because the equilibria that form this equivalence are (approximate) Stackelberg equilibria over the worst-case loss, which are difficult to optimise for using conventional machine learning algorithms. We discuss several approaches to overcoming this challenge, including the use of Stackelberg policy gradient and opponent-shaping algorithms to approximate Stackelberg equilibria, and the efficacy of average-case optimisation and adversarial training when it comes to minimising worst-case losses.
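As a toy illustration of that leader-follower structure (not the paper's training procedure), here is an alternating-gradient sketch in which an inner loop approximates the follower's best response before each leader update; the bilinear loss and hand-written gradients are assumptions made for the sake of the example:

```python
# Stackelberg-style alternating gradient updates on a toy min-max objective.
import numpy as np

rng = np.random.default_rng(0)
leader, follower = rng.normal(size=2), rng.normal(size=2)  # e.g. verifier / prover params

def loss(leader, follower):
    # Toy objective: the leader minimises it, the follower maximises it.
    return float(leader @ follower + 0.1 * leader @ leader - 0.1 * follower @ follower)

lr, inner_steps = 0.05, 10
for step in range(200):
    # Inner loop: follower approximately best-responds to the current leader.
    for _ in range(inner_steps):
        grad_f = leader - 0.2 * follower       # d loss / d follower
        follower = follower + lr * grad_f      # ascent: follower maximises
    # Outer step: leader descends against the (approximate) best response.
    grad_l = follower + 0.2 * leader           # d loss / d leader
    leader = leader - lr * grad_l

print("leader:", leader, "follower:", follower, "loss:", loss(leader, follower))
```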
3 Incoming
Sean Trott’s Perceptrons, XOR, and the first “AI winter” is a decent summary of the history of connectionist approaches dying out in AI for a while.
I haven’t read Minsky and Papert (1972), which is commonly credited with burying NNs in favour of GOFAI for a while, but I gather the argument is essentially geometric rather than computational. I think the parallels can be drawn, though (heh).
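For reference (my own compression of the geometric point, not Minsky and Papert’s presentation): a single linear threshold unit computing $[w_1 x_1 + w_2 x_2 + b > 0]$ would need

$$
b \le 0,\qquad w_1 + b > 0,\qquad w_2 + b > 0,\qquad w_1 + w_2 + b \le 0,
$$

but summing the middle two inequalities gives $w_1 + w_2 + 2b > 0$, hence $w_1 + w_2 + b > -b \ge 0$, contradicting the last one. XOR’s four points are not linearly separable, so no single-layer perceptron computes it.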
Language Generation in the Limit (Kleinberg and Mullainathan 2024) looks interesting.
Cheng Soon Ong described it as a

> talk by Jon Kleinberg on language generation being easier than language identification. He mentions an interesting tension between hallucination and mode collapse.
Blazek claims his non-gradient-based neural networks implement predicate logic directly and yet are tractable, which would be interesting to look into (Blazek and Lin 2021, 2020; Blazek, Venkatesh, and Lin 2021). This sounds like a reboot of the GOFAI project; I wonder what secret sauce gives the author hope, and whether it would be enough to persuade me too. I don’t put much prior stock in this plan.
Are Transformers Turing-complete? A Good Disguise Is All You Need.