Can we work out how a complex or adaptive system works by destroying bits of it?
Can a biologist fix a radio (Lazebnik 2002)? Could a neuroscientist even understand a microprocessor (Jonas and Kording 2017)? Should an alien doctor stop bleeding by removing blood? How about if we use the phrase “edge of chaos”?
Great analogy, from Interpretability Creationism:
[…] Stochastic Gradient Descent is not literally biological evolution, but post-hoc analysis in machine learning has a lot in common with scientific approaches in biology, and likewise often requires an understanding of the origin of model behavior. Therefore, the following holds whether looking at parasitic brooding behavior or at the inner representations of a neural network: if we do not consider how a system develops, it is difficult to distinguish a pleasing story from a useful analysis. In this piece, I will discuss the tendency towards “interpretability creationism” – interpretability methods that only look at the final state of the model and ignore its evolution over the course of training—and propose a focus on the training process to supplement interpretability research.