Random neural networks



What happens if you do not bother to train your neural net? In the infinite-width limit you get a Gaussian process. There are, however, a number of network architectures that do not rely on that argument and are still random.

Recurrent: Echo State Networks / random reservoir networks

This sounds deliciously lazy. At a glance, the process is: construct a random recurrent network, i.e. a network of random saturating IIR filters; let the network converge to a steady state for a given stimulus; and use the resulting states as the features to which you fit your classifier/regressor/etc.

Easy to implement, that. I wonder when it actually works in practice, e.g. what constraints on topology are needed.

Some of the literature here claims these are based on spiking (i.e. event-driven) models, but AFAICT this is not necessary, although it might be convenient for convergence.

Various claims are made about how they avoid the training difficulties of similarly basic RNNs by being essentially untrained; you use them as a feature factory for another supervised output algorithm.
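A minimal echo-state-style sketch in numpy, to make the "feature factory" idea concrete. Everything here (dimensions, scaling constants, the toy one-step-memory task) is illustrative, not taken from any particular paper: a fixed random reservoir is run over the input sequence, and only a ridge-regularized linear readout is fitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: 1-d input, 100-unit reservoir.
n_in, n_res = 1, 100

# Random input and recurrent weights, fixed forever (never trained).
W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
W = rng.normal(size=(n_res, n_res))
# Rescale so the spectral radius is below 1, a common echo-state heuristic
# for the reservoir to have fading memory of its inputs.
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))

def reservoir_states(u):
    """Run the random reservoir over an input sequence u of shape (T, n_in)."""
    x = np.zeros(n_res)
    states = []
    for u_t in u:
        x = np.tanh(W_in @ u_t + W @ x)  # saturating recurrent update
        states.append(x.copy())
    return np.array(states)

# Toy supervised task: recover the previous input from the current state.
u = rng.normal(size=(200, n_in))
y = np.roll(u[:, 0], 1)
X = reservoir_states(u)

# The only training step: a ridge-regularized linear readout.
lam = 1e-6
w_out = np.linalg.solve(X.T @ X + lam * np.eye(n_res), X.T @ y)
pred = X @ w_out
```

The point is that all the "learning" lives in one least-squares solve; the recurrent dynamics are frozen at their random initialization.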

Suggestive parallel with random projections. Not strictly recurrent, but same general idea: He, Wang, and Hopcroft (2016).
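The non-recurrent version of the same idea is the "random kitchen sinks" recipe of Rahimi and Recht (2009): push the data through a fixed random feature map and train only a linear readout on top. A toy sketch, with invented data (two concentric rings, not linearly separable in the raw coordinates):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: two noisy concentric rings, labelled by radius.
n = 400
r = np.where(np.arange(n) < n // 2, 1.0, 3.0)
theta = rng.uniform(0, 2 * np.pi, n)
X = np.c_[r * np.cos(theta), r * np.sin(theta)] + 0.1 * rng.normal(size=(n, 2))
y = (r > 2).astype(float)

# Random, untrained feature map: random Fourier features.
D = 200
W = rng.normal(size=(2, D))
b = rng.uniform(0, 2 * np.pi, D)
Phi = np.cos(X @ W + b)

# Train only the linear readout (ridge regression), then check accuracy.
w = np.linalg.solve(Phi.T @ Phi + 1e-3 * np.eye(D), Phi.T @ y)
acc = np.mean(((Phi @ w) > 0.5) == (y > 0.5))
```

A problem that defeats a linear classifier in the raw coordinates becomes easy after a random, purely data-independent projection.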

Lukoševičius and Jaeger (2009) maps out various types, as much as that is possible in the shifting buzzword sand of neural network research.

From a dynamical systems perspective, there are two main classes of RNNs. Models from the first class are characterized by an energy-minimizing stochastic dynamics and symmetric connections. The best known instantiations are Hopfield networks, Boltzmann machines, and the recently emerging Deep Belief Networks. These networks are mostly trained in some unsupervised learning scheme. Typical targeted network functionalities in this field are associative memories, data compression, the unsupervised modeling of data distributions, and static pattern classification, where the model is run for multiple time steps per single input instance to reach some type of convergence or equilibrium (but see e.g., Taylor, Hinton, and Roweis (2006) for extension to temporal data). The mathematical background is rooted in statistical physics. In contrast, the second big class of RNN models typically features a deterministic update dynamics and directed connections. Systems from this class implement nonlinear filters, which transform an input time series into an output time series. The mathematical background here is nonlinear dynamical systems. The standard training mode is supervised.

Gauthier et al. (2021) remedies some of the annoying ad hoc flavour of reservoir NNs; see the press release, Scientists develop the next generation of reservoir computing.

Random convolutions

🏗

References

Auer, Peter, Harald Burgsteiner, and Wolfgang Maass. 2008. “A Learning Rule for Very Simple Universal Approximators Consisting of a Single Layer of Perceptrons.” Neural Networks 21 (5): 786–95. https://doi.org/10.1016/j.neunet.2007.12.036.
Baldi, Pierre, Peter Sadowski, and Zhiqin Lu. 2016. “Learning in the Machine: Random Backpropagation and the Learning Channel.” arXiv:1612.02734 [cs], December. http://arxiv.org/abs/1612.02734.
Cao, Feilong, Dianhui Wang, Houying Zhu, and Yuguang Wang. 2016. “An Iterative Learning Algorithm for Feedforward Neural Networks with Random Weights.” Information Sciences 328: 546–57. https://doi.org/10.1016/j.ins.2015.09.002.
Charles, Adam, Dong Yin, and Christopher Rozell. 2016. “Distributed Sequence Memory of Multidimensional Inputs in Recurrent Networks.” arXiv:1605.08346 [cs, Math, Stat], May. http://arxiv.org/abs/1605.08346.
Cover, T. M. 1965. “Geometrical and Statistical Properties of Systems of Linear Inequalities with Applications in Pattern Recognition.” IEEE Transactions on Electronic Computers EC-14 (3): 326–34. https://doi.org/10.1109/PGEC.1965.264137.
Gauthier, Daniel J., Erik Bollt, Aaron Griffith, and Wendson A. S. Barbosa. 2021. “Next Generation Reservoir Computing.” Nature Communications 12 (1): 5564. https://doi.org/10.1038/s41467-021-25801-2.
Giryes, R., G. Sapiro, and A. M. Bronstein. 2016. “Deep Neural Networks with Random Gaussian Weights: A Universal Classification Strategy?” IEEE Transactions on Signal Processing 64 (13): 3444–57. https://doi.org/10.1109/TSP.2016.2546221.
Globerson, Amir, and Roi Livni. 2016. “Learning Infinite-Layer Networks: Beyond the Kernel Trick.” arXiv:1606.05316 [cs], June. http://arxiv.org/abs/1606.05316.
Goudarzi, Alireza, Peter Banda, Matthew R. Lakin, Christof Teuscher, and Darko Stefanovic. 2014. “A Comparative Study of Reservoir Computing for Temporal Signal Processing.” arXiv:1401.2224 [cs], January. http://arxiv.org/abs/1401.2224.
Goudarzi, Alireza, and Christof Teuscher. 2016. “Reservoir Computing: Quo Vadis?” In Proceedings of the 3rd ACM International Conference on Nanoscale Computing and Communication, 13:1–6. NANOCOM’16. New York, NY, USA: ACM. https://doi.org/10.1145/2967446.2967448.
Grzyb, B. J., E. Chinellato, G. M. Wojcik, and W. A. Kaminski. 2009. “Which Model to Use for the Liquid State Machine?” In 2009 International Joint Conference on Neural Networks, 1018–24. https://doi.org/10.1109/IJCNN.2009.5178822.
Hazan, Hananel, and Larry M. Manevitz. 2012. “Topological Constraints and Robustness in Liquid State Machines.” Expert Systems with Applications 39 (2): 1597–1606. https://doi.org/10.1016/j.eswa.2011.06.052.
He, Kun, Yan Wang, and John Hopcroft. 2016. “A Powerful Generative Model Using Random Weights for the Deep Image Representation.” In Advances in Neural Information Processing Systems. http://arxiv.org/abs/1606.04801.
Huang, Guang-Bin, and Chee-Kheong Siew. 2005. “Extreme Learning Machine with Randomly Assigned RBF Kernels.” International Journal of Information Technology 11 (1): 16–24. http://pop.intjit.org/journal/volume/11/1/111_2.pdf.
Huang, Guang-Bin, Qin-Yu Zhu, and Chee-Kheong Siew. 2004. “Extreme Learning Machine: A New Learning Scheme of Feedforward Neural Networks.” In 2004 IEEE International Joint Conference on Neural Networks, 2004. Proceedings, 2:985–990 vol.2. https://doi.org/10.1109/IJCNN.2004.1380068.
———. 2006. “Extreme Learning Machine: Theory and Applications.” Neurocomputing, Neural Networks Selected Papers from the 7th Brazilian Symposium on Neural Networks (SBRN ’04) 7th Brazilian Symposium on Neural Networks, 70 (1–3): 489–501. https://doi.org/10.1016/j.neucom.2005.12.126.
Li, Ming, and Dianhui Wang. 2017. “Insights into Randomized Algorithms for Neural Networks: Practical Issues and Common Pitfalls.” Information Sciences 382–383 (March): 170–78. https://doi.org/10.1016/j.ins.2016.12.007.
Lukoševičius, Mantas, and Herbert Jaeger. 2009. “Reservoir Computing Approaches to Recurrent Neural Network Training.” Computer Science Review 3 (3): 127–49. https://doi.org/10.1016/j.cosrev.2009.03.005.
Maass, W., T. Natschläger, and H. Markram. 2004. “Computational Models for Generic Cortical Microcircuits.” In Computational Neuroscience: A Comprehensive Approach, 575–605. Chapman & Hall/CRC. http://www.igi.tu-graz.ac.at/maass/psfiles/149-v05.pdf.
Martinsson, Per-Gunnar. 2016. “Randomized Methods for Matrix Computations and Analysis of High Dimensional Data.” arXiv:1607.01649 [math], July. http://arxiv.org/abs/1607.01649.
Oyallon, Edouard, Eugene Belilovsky, and Sergey Zagoruyko. 2017. “Scaling the Scattering Transform: Deep Hybrid Networks.” arXiv Preprint arXiv:1703.08961. https://arxiv.org/abs/1703.08961.
Pathak, Jaideep, Brian Hunt, Michelle Girvan, Zhixin Lu, and Edward Ott. 2018. “Model-Free Prediction of Large Spatiotemporally Chaotic Systems from Data: A Reservoir Computing Approach.” Physical Review Letters 120 (2): 024102. https://doi.org/10.1103/PhysRevLett.120.024102.
Pathak, Jaideep, Zhixin Lu, Brian R. Hunt, Michelle Girvan, and Edward Ott. 2017. “Using Machine Learning to Replicate Chaotic Attractors and Calculate Lyapunov Exponents from Data.” Chaos: An Interdisciplinary Journal of Nonlinear Science 27 (12): 121102. https://doi.org/10.1063/1.5010300.
Perez, Carlos E. 2016. “Deep Learning: The Unreasonable Effectiveness of Randomness.” Medium (blog). November 6, 2016. https://medium.com/intuitionmachine/deep-learning-the-unreasonable-effectiveness-of-randomness-14d5aef13f87#.g5sjhxjrn.
Rahimi, Ali, and Benjamin Recht. 2009. “Weighted Sums of Random Kitchen Sinks: Replacing Minimization with Randomization in Learning.” In Advances in Neural Information Processing Systems, 1313–20. Curran Associates, Inc. http://papers.nips.cc/paper/3495-weighted-sums-of-random-kitchen-sinks-replacing-minimization-with-randomization-in-learning.
Scardapane, Simone, and Dianhui Wang. 2017. “Randomness in Neural Networks: An Overview.” Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 7 (2). https://doi.org/10.1002/widm.1200.
Steil, J. J. 2004. “Backpropagation-Decorrelation: Online Recurrent Learning with O(N) Complexity.” In 2004 IEEE International Joint Conference on Neural Networks, 2004. Proceedings, 2:843–848 vol.2. https://doi.org/10.1109/IJCNN.2004.1380039.
Taylor, Graham W., Geoffrey E. Hinton, and Sam T. Roweis. 2006. “Modeling Human Motion Using Binary Latent Variables.” In Advances in Neural Information Processing Systems, 1345–52. http://machinelearning.wustl.edu/mlpapers/paper_files/NIPS2006_693.pdf.
Tong, Matthew H., Adam D. Bickett, Eric M. Christiansen, and Garrison W. Cottrell. 2007. “Learning Grammatical Structure with Echo State Networks.” Neural Networks 20 (3): 424–32. https://doi.org/10.1016/j.neunet.2007.04.013.
Triefenbach, F., A. Jalalvand, K. Demuynck, and J. P. Martens. 2013. “Acoustic Modeling With Hierarchical Reservoirs.” IEEE Transactions on Audio, Speech, and Language Processing 21 (11): 2439–50. https://doi.org/10.1109/TASL.2013.2280209.
Zhang, Le, and P. N. Suganthan. 2016. “A Survey of Randomized Algorithms for Training Neural Networks.” Information Sciences 364–365 (C): 146–55. https://doi.org/10.1016/j.ins.2016.01.039.
