Agostinelli, Forest, Matthew Hoffman, Peter Sadowski, and Pierre Baldi. 2015. “Learning Activation Functions to Improve Deep Neural Networks.” In Proceedings of International Conference on Learning Representations (ICLR) 2015
Anil, Cem, James Lucas, and Roger Grosse. 2018. “Sorting Out Lipschitz Function Approximation.”
Arjovsky, Martin, Amar Shah, and Yoshua Bengio. 2016. “Unitary Evolution Recurrent Neural Networks.” In Proceedings of the 33rd International Conference on Machine Learning - Volume 48, 1120–28. ICML’16. New York, NY, USA: JMLR.org.
Balduzzi, David, Marcus Frean, Lennox Leary, J. P. Lewis, Kurt Wan-Duo Ma, and Brian McWilliams. 2017. “The Shattered Gradients Problem: If Resnets Are the Answer, Then What Is the Question?”
Cho, Youngmin, and Lawrence K. Saul. 2009. “Kernel Methods for Deep Learning.” In Proceedings of the 22nd International Conference on Neural Information Processing Systems, 22:342–50. NIPS’09. Red Hook, NY, USA: Curran Associates Inc.
Clevert, Djork-Arné, Thomas Unterthiner, and Sepp Hochreiter. 2016. “Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs).” In Proceedings of ICLR
Duch, Włodzisław, and Norbert Jankowski. 1999. “Survey of Neural Transfer Functions.”
Glorot, Xavier, Antoine Bordes, and Yoshua Bengio. 2011. “Deep Sparse Rectifier Neural Networks.”
Goodfellow, Ian J., David Warde-Farley, Mehdi Mirza, Aaron Courville, and Yoshua Bengio. 2013. “Maxout Networks.” In ICML (3)
Hayou, Soufiane, Arnaud Doucet, and Judith Rousseau. 2019. “On the Impact of the Activation Function on Deep Neural Networks Training.” In Proceedings of the 36th International Conference on Machine Learning, 2672–80. PMLR.
He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2015a. “Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification.” arXiv:1502.01852 [Cs]
Hochreiter, Sepp. 1998. “The Vanishing Gradient Problem During Learning Recurrent Neural Nets and Problem Solutions.” International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems
Hochreiter, Sepp, Yoshua Bengio, Paolo Frasconi, and Jürgen Schmidhuber. 2001. “Gradient Flow in Recurrent Nets: The Difficulty of Learning Long-Term Dependencies.” In A Field Guide to Dynamical Recurrent Neural Networks. IEEE Press.
Klambauer, Günter, Thomas Unterthiner, Andreas Mayr, and Sepp Hochreiter. 2017. “Self-Normalizing Neural Networks.” In Proceedings of the 31st International Conference on Neural Information Processing Systems, 972–81. Red Hook, NY, USA: Curran Associates Inc.
Laurent, Thomas. n.d. “The Multilinear Structure of ReLU Networks.”
Lee, Jaehoon, Yasaman Bahri, Roman Novak, Samuel S. Schoenholz, Jeffrey Pennington, and Jascha Sohl-Dickstein. 2018. “Deep Neural Networks as Gaussian Processes.”
Maas, Andrew L., Awni Y. Hannun, and Andrew Y. Ng. 2013. “Rectifier Nonlinearities Improve Neural Network Acoustic Models.” In Proceedings of ICML. Vol. 30.
Pascanu, Razvan, Tomas Mikolov, and Yoshua Bengio. 2013. “On the Difficulty of Training Recurrent Neural Networks.” In arXiv:1211.5063 [Cs]
Rahaman, Nasim, Aristide Baratin, Devansh Arpit, Felix Draxler, Min Lin, Fred A. Hamprecht, Yoshua Bengio, and Aaron Courville. 2019. “On the Spectral Bias of Neural Networks.” arXiv:1806.08734 [Cs, Stat]
Ramachandran, Prajit, Barret Zoph, and Quoc V. Le. 2017. “Searching for Activation Functions.” arXiv:1710.05941 [Cs]
Sitzmann, Vincent, Julien N. P. Martel, Alexander W. Bergman, David B. Lindell, and Gordon Wetzstein. 2020. “Implicit Neural Representations with Periodic Activation Functions.” arXiv:2006.09661 [Cs, Eess]
Srivastava, Rupesh Kumar, Klaus Greff, and Jürgen Schmidhuber. 2015. “Highway Networks.” In arXiv:1505.00387 [Cs]
Unser, Michael. 2019. “A Representer Theorem for Deep Neural Networks.” Journal of Machine Learning Research 20 (110): 30.
Wisdom, Scott, Thomas Powers, John Hershey, Jonathan Le Roux, and Les Atlas. 2016. “Full-Capacity Unitary Recurrent Neural Networks.” In Advances in Neural Information Processing Systems
Yang, Greg, and Hadi Salman. 2020. “A Fine-Grained Spectral Perspective on Neural Networks.” arXiv:1907.10599 [Cs, Stat]