Amari, Shun-ichi. 1998.
“Natural Gradient Works Efficiently in Learning.” Neural Computation 10 (2): 251–76.
Amari, Shun-ichi. 1967.
“A Theory of Adaptive Pattern Classifiers.” IEEE Transactions on Electronic Computers EC-16 (3): 299–307.
Andrychowicz, Marcin, Misha Denil, Sergio Gomez, Matthew W. Hoffman, David Pfau, Tom Schaul, and Nando de Freitas. 2016.
“Learning to Learn by Gradient Descent by Gradient Descent.” arXiv:1606.04474 [Cs], June.
Arel, I., D. C. Rose, and T. P. Karnowski. 2010.
“Deep Machine Learning - A New Frontier in Artificial Intelligence Research [Research Frontier].” IEEE Computational Intelligence Magazine 5 (4): 13–18.
Arora, Sanjeev, Rong Ge, Tengyu Ma, and Ankur Moitra. 2015.
“Simple, Efficient, and Neural Algorithms for Sparse Coding.” In
Proceedings of The 28th Conference on Learning Theory, 40:113–49. Paris, France: PMLR.
Bach, Francis. 2014.
“Breaking the Curse of Dimensionality with Convex Neural Networks.” arXiv:1412.8690 [Cs, Math, Stat], December.
Baldassi, Carlo, Christian Borgs, Jennifer T. Chayes, Alessandro Ingrosso, Carlo Lucibello, Luca Saglietti, and Riccardo Zecchina. 2016.
“Unreasonable Effectiveness of Learning Neural Networks: From Accessible States and Robust Ensembles to Basic Algorithmic Schemes.” Proceedings of the National Academy of Sciences 113 (48): E7655–62.
Barron, A. R. 1993.
“Universal Approximation Bounds for Superpositions of a Sigmoidal Function.” IEEE Transactions on Information Theory 39 (3): 930–45.
Baydin, Atılım Güneş, Barak A. Pearlmutter, and Jeffrey Mark Siskind. 2016.
“Tricks from Deep Learning.” arXiv:1611.03777 [Cs, Stat], November.
Bengio, Yoshua, Aaron Courville, and Pascal Vincent. 2013.
“Representation Learning: A Review and New Perspectives.” IEEE Transactions on Pattern Analysis and Machine Intelligence 35: 1798–828.
Bengio, Yoshua, and Yann LeCun. 2007.
“Scaling Learning Algorithms Towards AI.” Large-Scale Kernel Machines 34: 1–41.
Bengio, Yoshua, Nicolas L. Roux, Pascal Vincent, Olivier Delalleau, and Patrice Marcotte. 2005.
“Convex Neural Networks.” In
Advances in Neural Information Processing Systems, 18:123–30. MIT Press.
Boser, B. 1991.
“An Analog Neural Network Processor with Programmable Topology.” IEEE Journal of Solid-State Circuits 26: 2017–25.
Brock, Andrew, Theodore Lim, J. M. Ritchie, and Nick Weston. 2017.
“FreezeOut: Accelerate Training by Progressively Freezing Layers.” arXiv:1706.04983 [Cs, Stat], June.
Chen, Tianqi, Ian Goodfellow, and Jonathon Shlens. 2015.
“Net2Net: Accelerating Learning via Knowledge Transfer.” arXiv:1511.05641 [Cs], November.
Cho, Kyunghyun, Bart van Merriënboer, Dzmitry Bahdanau, and Yoshua Bengio. 2014.
“On the Properties of Neural Machine Translation: Encoder-Decoder Approaches.” arXiv Preprint arXiv:1409.1259.
Choromanska, Anna, Mikael Henaff, Michael Mathieu, Gerard Ben Arous, and Yann LeCun. 2015.
“The Loss Surfaces of Multilayer Networks.” In
Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, 192–204.
Cybenko, G. 1989.
“Approximation by Superpositions of a Sigmoidal Function.” Mathematics of Control, Signals and Systems 2: 303–14.
Dahl, G. E. 2012.
“Context-Dependent Pre-Trained Deep Neural Networks for Large Vocabulary Speech Recognition.” IEEE Transactions on Audio, Speech and Language Processing 20: 33–42.
Dauphin, Yann, Razvan Pascanu, Caglar Gulcehre, Kyunghyun Cho, Surya Ganguli, and Yoshua Bengio. 2014.
“Identifying and Attacking the Saddle Point Problem in High-Dimensional Non-Convex Optimization.” In
Advances in Neural Information Processing Systems 27, 2933–41. Curran Associates, Inc.
Dieleman, Sander, and Benjamin Schrauwen. 2014.
“End to End Learning for Music Audio.” In
2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 6964–68. IEEE.
Erhan, Dumitru, Yoshua Bengio, Aaron Courville, Pierre-Antoine Manzagol, Pascal Vincent, and Samy Bengio. 2010.
“Why Does Unsupervised Pre-Training Help Deep Learning?” Journal of Machine Learning Research 11 (Feb): 625–60.
Farabet, C. 2013.
“Learning Hierarchical Features for Scene Labeling.” IEEE Transactions on Pattern Analysis and Machine Intelligence 35: 1915–29.
Gal, Yarin, and Zoubin Ghahramani. 2016.
“A Theoretically Grounded Application of Dropout in Recurrent Neural Networks.” arXiv:1512.05287 [Stat].
Garcia, C. 2004.
“Convolutional Face Finder: A Neural Architecture for Fast and Robust Face Detection.” IEEE Transactions on Pattern Analysis and Machine Intelligence 26: 1408–23.
Gatys, Leon A., Alexander S. Ecker, and Matthias Bethge. 2015.
“A Neural Algorithm of Artistic Style.” arXiv:1508.06576 [Cs, q-Bio], August.
Giryes, Raja, Guillermo Sapiro, and Alex M. Bronstein. 2014.
“On the Stability of Deep Networks.” arXiv:1412.5896 [Cs, Math, Stat], December.
Giryes, R., G. Sapiro, and A. M. Bronstein. 2016.
“Deep Neural Networks with Random Gaussian Weights: A Universal Classification Strategy?” IEEE Transactions on Signal Processing 64 (13): 3444–57.
Globerson, Amir, and Roi Livni. 2016.
“Learning Infinite-Layer Networks: Beyond the Kernel Trick.” arXiv:1606.05316 [Cs], June.
Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy. 2014.
“Explaining and Harnessing Adversarial Examples.” arXiv:1412.6572 [Cs, Stat], December.
Goodfellow, Ian J., Oriol Vinyals, and Andrew M. Saxe. 2014.
“Qualitatively Characterizing Neural Network Optimization Problems.” arXiv:1412.6544 [Cs, Stat], December.
Goodfellow, Ian, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014.
“Generative Adversarial Nets.” In
Advances in Neural Information Processing Systems 27, edited by Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, 2672–80. NIPS’14. Cambridge, MA, USA: Curran Associates, Inc.
Hadsell, R., S. Chopra, and Y. LeCun. 2006.
“Dimensionality Reduction by Learning an Invariant Mapping.” In
2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2:1735–42.
He, Kun, Yan Wang, and John Hopcroft. 2016.
“A Powerful Generative Model Using Random Weights for the Deep Image Representation.” In
Advances in Neural Information Processing Systems.
Hinton, G. E. 1995.
“The Wake-Sleep Algorithm for Unsupervised Neural Networks.” Science 268 (5214): 1158–61.
Hinton, G., Li Deng, Dong Yu, G. E. Dahl, A. Mohamed, N. Jaitly, A. Senior, et al. 2012.
“Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups.” IEEE Signal Processing Magazine 29 (6): 82–97.
Hinton, Geoffrey. 2010.
“A Practical Guide to Training Restricted Boltzmann Machines.” In
Neural Networks: Tricks of the Trade, 9:926. Lecture Notes in Computer Science 7700. Springer Berlin Heidelberg.
Hinton, Geoffrey E. 2007.
“To Recognize Shapes, First Learn to Generate Images.” In
Progress in Brain Research, edited by Paul Cisek, Trevor Drew, and John F. Kalaska, 165:535–47. Computational Neuroscience: Theoretical Insights into Brain Function. Elsevier.
Hinton, Geoffrey E., and Ruslan R. Salakhutdinov. 2006.
“Reducing the Dimensionality of Data with Neural Networks.” Science 313 (5786): 504–7.
Hinton, G., S. Osindero, and Y. Teh. 2006.
“A Fast Learning Algorithm for Deep Belief Nets.” Neural Computation 18 (7): 1527–54.
Hornik, Kurt, Maxwell Stinchcombe, and Halbert White. 1989.
“Multilayer Feedforward Networks Are Universal Approximators.” Neural Networks 2 (5): 359–66.
Hu, Tao, Cengiz Pehlevan, and Dmitri B. Chklovskii. 2014.
“A Hebbian/Anti-Hebbian Network for Online Sparse Dictionary Learning Derived from Symmetric Matrix Factorization.” In
2014 48th Asilomar Conference on Signals, Systems and Computers.
Huang, Guang-Bin, and Chee-Kheong Siew. 2005.
“Extreme Learning Machine with Randomly Assigned RBF Kernels.” International Journal of Information Technology 11 (1): 16–24.
Huang, Guang-Bin, Dian Hui Wang, and Yuan Lan. 2011.
“Extreme Learning Machines: A Survey.” International Journal of Machine Learning and Cybernetics 2 (2): 107–22.
Huang, Guang-Bin, Qin-Yu Zhu, and Chee-Kheong Siew. 2004.
“Extreme Learning Machine: A New Learning Scheme of Feedforward Neural Networks.” In
2004 IEEE International Joint Conference on Neural Networks, 2004. Proceedings, 2:985–90.
———. 2006.
“Extreme Learning Machine: Theory and Applications.” Neurocomputing 70 (1–3): 489–501.
Jaderberg, Max, Wojciech Marian Czarnecki, Simon Osindero, Oriol Vinyals, Alex Graves, and Koray Kavukcuoglu. 2016.
“Decoupled Neural Interfaces Using Synthetic Gradients.” arXiv:1608.05343 [Cs], August.
Kaiser, Łukasz, and Ilya Sutskever. 2015.
“Neural GPUs Learn Algorithms.” arXiv:1511.08228 [Cs], November.
Kalchbrenner, Nal, Ivo Danihelka, and Alex Graves. 2016.
“Grid Long Short-Term Memory.” arXiv:1507.01526 [Cs], January.
Kavukcuoglu, Koray, Marc’Aurelio Ranzato, and Yann LeCun. 2010.
“Fast Inference in Sparse Coding Algorithms with Applications to Object Recognition.” arXiv:1010.3467 [Cs], October.
Kingma, Diederik P., Tim Salimans, Rafal Jozefowicz, Xi Chen, Ilya Sutskever, and Max Welling. 2016.
“Improving Variational Inference with Inverse Autoregressive Flow.” In
Advances in Neural Information Processing Systems 29. Curran Associates, Inc.
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. 2012.
“ImageNet Classification with Deep Convolutional Neural Networks.” In
Advances in Neural Information Processing Systems, 1097–1105.
Kulkarni, Tejas D., Will Whitney, Pushmeet Kohli, and Joshua B. Tenenbaum. 2015.
“Deep Convolutional Inverse Graphics Network.” arXiv:1503.03167 [Cs], March.
Larsen, Anders Boesen Lindbo, Søren Kaae Sønderby, Hugo Larochelle, and Ole Winther. 2015.
“Autoencoding Beyond Pixels Using a Learned Similarity Metric.” arXiv:1512.09300 [Cs, Stat], December.
Lawrence, S. 1997.
“Face Recognition: A Convolutional Neural-Network Approach.” IEEE Transactions on Neural Networks 8: 98–113.
LeCun, Y. 1998.
“Gradient-Based Learning Applied to Document Recognition.” Proceedings of the IEEE 86 (11): 2278–2324.
LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. 2015.
“Deep Learning.” Nature 521 (7553): 436–44.
LeCun, Yann, Sumit Chopra, Raia Hadsell, M. Ranzato, and F. Huang. 2006.
“A Tutorial on Energy-Based Learning.” In
Predicting Structured Data.
Lee, Honglak, Roger Grosse, Rajesh Ranganath, and Andrew Y. Ng. 2009.
“Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations.” In
Proceedings of the 26th Annual International Conference on Machine Learning, 609–16. ICML ’09. New York, NY, USA: ACM.
Lee, Wee Sun, Peter L. Bartlett, and Robert C. Williamson. 1996.
“Efficient Agnostic Learning of Neural Networks with Bounded Fan-in.” IEEE Transactions on Information Theory 42 (6): 2118–32.
Leung, M. K. 2014.
“Deep Learning of the Tissue-Regulated Splicing Code.” Bioinformatics 30: i121–29.
Liang, Feynman, Marcin Tomczak, Matt Johnson, Mark Gotham, Jamie Shotton, and Bill Byrne. n.d. “BachBot: Deep Generative Modeling of Bach Chorales,” 1.
Lin, Henry W., and Max Tegmark. 2016a.
“Critical Behavior from Deep Dynamics: A Hidden Dimension in Natural Language.” arXiv:1606.06737 [Cond-Mat], June.
———. 2016b.
“Why Does Deep and Cheap Learning Work so Well?” arXiv:1608.08225 [Cond-Mat, Stat], August.
Lipton, Zachary C. 2016a.
“Stuck in a What? Adventures in Weight Space.” arXiv:1602.07320 [Cs], February.
———. 2016b.
“The Mythos of Model Interpretability.” arXiv:1606.03490 [Cs, Stat].
Lipton, Zachary C., John Berkowitz, and Charles Elkan. 2015.
“A Critical Review of Recurrent Neural Networks for Sequence Learning.” arXiv:1506.00019 [Cs], May.
Maclaurin, Dougal, David Duvenaud, and Ryan Adams. 2015.
“Gradient-Based Hyperparameter Optimization Through Reversible Learning.” In
Proceedings of the 32nd International Conference on Machine Learning, 2113–22. PMLR.
Mallat, Stéphane. 2012.
“Group Invariant Scattering.” Communications on Pure and Applied Mathematics 65 (10): 1331–98.
———. 2016.
“Understanding Deep Convolutional Networks.” arXiv:1601.04920 [Cs, Stat], January.
Mehta, Pankaj, and David J. Schwab. 2014.
“An Exact Mapping Between the Variational Renormalization Group and Deep Learning.” arXiv:1410.3831 [Cond-Mat, Stat], October.
Mikolov, Tomas, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013.
“Efficient Estimation of Word Representations in Vector Space.” arXiv:1301.3781 [Cs], January.
Mikolov, Tomas, Quoc V. Le, and Ilya Sutskever. 2013.
“Exploiting Similarities Among Languages for Machine Translation.” arXiv:1309.4168 [Cs], September.
Mohamed, A.-r., G. E. Dahl, and G. Hinton. 2012.
“Acoustic Modeling Using Deep Belief Networks.” IEEE Transactions on Audio, Speech, and Language Processing 20 (1): 14–22.
Monner, Derek, and James A. Reggia. 2012.
“A Generalized LSTM-Like Training Algorithm for Second-Order Recurrent Neural Networks.” Neural Networks 25 (January): 70–83.
Ning, F. 2005.
“Toward Automatic Phenotyping of Developing Embryos from Videos.” IEEE Transactions on Image Processing 14: 1360–71.
Nøkland, Arild. 2016.
“Direct Feedback Alignment Provides Learning in Deep Neural Networks.” In
Advances In Neural Information Processing Systems.
Olshausen, B. A., and D. J. Field. 1996.
“Natural Image Statistics and Efficient Coding.” Network (Bristol, England) 7 (2): 333–39.
Olshausen, Bruno A, and David J Field. 2004.
“Sparse Coding of Sensory Inputs.” Current Opinion in Neurobiology 14 (4): 481–87.
Oord, Aäron van den, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, and Koray Kavukcuoglu. 2016.
“WaveNet: A Generative Model for Raw Audio.” In
9th ISCA Speech Synthesis Workshop.
Oord, Aäron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. 2016.
“Pixel Recurrent Neural Networks.” arXiv:1601.06759 [Cs], January.
Oord, Aäron van den, Nal Kalchbrenner, Oriol Vinyals, Lasse Espeholt, Alex Graves, and Koray Kavukcuoglu. 2016.
“Conditional Image Generation with PixelCNN Decoders.” arXiv:1606.05328 [Cs], June.
Pan, Wei, Hao Dong, and Yike Guo. 2016.
“DropNeuron: Simplifying the Structure of Deep Neural Networks.” arXiv:1606.07326 [Cs, Stat], June.
Parisotto, Emilio, and Ruslan Salakhutdinov. 2017.
“Neural Map: Structured Memory for Deep Reinforcement Learning.” arXiv:1702.08360 [Cs], February.
Pascanu, Razvan, Yann N. Dauphin, Surya Ganguli, and Yoshua Bengio. 2014.
“On the Saddle Point Problem for Non-Convex Optimization.” arXiv:1405.4604 [Cs], May.
Paul, Arnab, and Suresh Venkatasubramanian. 2014.
“Why Does Deep Learning Work? - A Perspective from Group Theory.” arXiv:1412.6621 [Cs, Stat], December.
Pinkus, Allan. 1999.
“Approximation Theory of the MLP Model in Neural Networks.” Acta Numerica 8 (January): 143–95.
Ranzato, M. 2013.
“Modeling Natural Images Using Gated MRFs.” IEEE Transactions on Pattern Analysis and Machine Intelligence 35 (9): 2206–22.
Ranzato, Marc’Aurelio, Y-Lan Boureau, and Yann LeCun. 2008.
“Sparse Feature Learning for Deep Belief Networks.” In
Advances in Neural Information Processing Systems 20, edited by J. C. Platt, D. Koller, Y. Singer, and S. T. Roweis, 1185–92. Curran Associates, Inc.
Rumelhart, David E., Geoffrey E. Hinton, and Ronald J. Williams. 1986.
“Learning Representations by Back-Propagating Errors.” Nature 323 (6088): 533–36.
Sagun, Levent, V. Ugur Guney, Gerard Ben Arous, and Yann LeCun. 2014.
“Explorations on High Dimensional Landscapes.” arXiv:1412.6615 [Cs, Stat], December.
Salimans, Tim, and Diederik P Kingma. 2016.
“Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks.” In
Advances in Neural Information Processing Systems 29, edited by D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, 901–9. Curran Associates, Inc.
Scardapane, Simone, Danilo Comminiello, Amir Hussain, and Aurelio Uncini. 2016.
“Group Sparse Regularization for Deep Neural Networks.” arXiv:1607.00485 [Cs, Stat], July.
Shazeer, Noam, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc Le, Geoffrey Hinton, and Jeff Dean. 2017.
“Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer.” arXiv:1701.06538 [Cs, Stat], January.
Shwartz-Ziv, Ravid, and Naftali Tishby. 2017.
“Opening the Black Box of Deep Neural Networks via Information.” arXiv:1703.00810 [Cs], March.
Smith, Leslie N., and Nicholay Topin. 2017.
“Exploring Loss Function Topology with Cyclical Learning Rates.” arXiv:1702.04283 [Cs], February.
Springenberg, Jost Tobias, Alexey Dosovitskiy, Thomas Brox, and Martin Riedmiller. 2014.
“Striving for Simplicity: The All Convolutional Net.” In
Proceedings of International Conference on Learning Representations (ICLR) 2015.
Starr, M. Allen (Moses Allen). 1913.
Organic and Functional Nervous Diseases: A Text-Book of Neurology. New York and Philadelphia: Lea & Febiger.
Steeg, Greg Ver, and Aram Galstyan. 2015.
“The Information Sieve.” arXiv:1507.02284 [Cs, Math, Stat], July.
Telgarsky, Matus. 2015.
“Representation Benefits of Deep Feedforward Networks.” arXiv:1509.08101 [Cs], September.
Urban, Gregor, Krzysztof J. Geras, Samira Ebrahimi Kahou, Ozlem Aslan, Shengjie Wang, Rich Caruana, Abdelrahman Mohamed, Matthai Philipose, and Matt Richardson. 2016.
“Do Deep Convolutional Nets Really Need to Be Deep (Or Even Convolutional)?” arXiv:1603.05691 [Cs, Stat], March.
Wiatowski, Thomas, and Helmut Bölcskei. 2015.
“A Mathematical Theory of Deep Convolutional Neural Networks for Feature Extraction.” In
Proceedings of IEEE International Symposium on Information Theory.
Wiatowski, Thomas, Philipp Grohs, and Helmut Bölcskei. 2018.
“Energy Propagation in Deep Convolutional Neural Networks.” IEEE Transactions on Information Theory 64 (7).
Xie, Bo, Yingyu Liang, and Le Song. 2016.
“Diversity Leads to Generalization in Neural Networks.” arXiv:1611.03131 [Cs, Stat], November.
Zhang, Chiyuan, Samy Bengio, Moritz Hardt, Benjamin Recht, and Oriol Vinyals. 2017.
“Understanding Deep Learning Requires Rethinking Generalization.” In
Proceedings of ICLR.
Zhang, Sixin, Anna Choromanska, and Yann LeCun. 2015.
“Deep Learning with Elastic Averaging SGD.” In
Advances In Neural Information Processing Systems.