Aicher, Christopher, Nicholas J. Foti, and Emily B. Fox. 2020.
“Adaptively Truncating Backpropagation Through Time to Control Gradient Bias.” In
Proceedings of The 35th Uncertainty in Artificial Intelligence Conference, 799–808. PMLR.
Allen-Zhu, Zeyuan, and Yuanzhi Li. 2019.
“Can SGD Learn Recurrent Neural Networks with Provable Generalization?” arXiv:1902.01028 [Cs, Math, Stat], February.
Anderson, Alexander G., and Cory P. Berg. 2017.
“The High-Dimensional Geometry of Binary Neural Networks.” arXiv:1705.07199 [Cs], May.
Arisoy, Ebru, Tara N. Sainath, Brian Kingsbury, and Bhuvana Ramabhadran. 2012. “Deep Neural Network Language Models.” In Proceedings of the NAACL-HLT 2012 Workshop: Will We Ever Really Replace the N-Gram Model? On the Future of Language Modeling for HLT, 20–28. WLM ’12. Montreal, Canada: Association for Computational Linguistics.
Arjovsky, Martin, Amar Shah, and Yoshua Bengio. 2016.
“Unitary Evolution Recurrent Neural Networks.” In
Proceedings of the 33rd International Conference on International Conference on Machine Learning - Volume 48, 1120–28. ICML’16. New York, NY, USA: JMLR.org.
Balduzzi, David, Marcus Frean, Lennox Leary, J. P. Lewis, Kurt Wan-Duo Ma, and Brian McWilliams. 2017.
“The Shattered Gradients Problem: If Resnets Are the Answer, Then What Is the Question?” In
Proceedings of the 34th International Conference on Machine Learning, 342–50. PMLR.
Bazzani, Loris, Lorenzo Torresani, and Hugo Larochelle. 2017. “Recurrent Mixture Density Network for Spatiotemporal Visual Attention.”
Ben Taieb, Souhaib, and Amir F. Atiya. 2016.
“A Bias and Variance Analysis for Multistep-Ahead Time Series Forecasting.” IEEE Transactions on Neural Networks and Learning Systems 27 (1): 62–76.
Bengio, Samy, Oriol Vinyals, Navdeep Jaitly, and Noam Shazeer. 2015.
“Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks.” In
Advances in Neural Information Processing Systems 28, 1171–79. NIPS’15. Cambridge, MA, USA: Curran Associates, Inc.
Bengio, Y., P. Simard, and P. Frasconi. 1994.
“Learning Long-Term Dependencies with Gradient Descent Is Difficult.” IEEE Transactions on Neural Networks 5 (2): 157–66.
Boulanger-Lewandowski, Nicolas, Yoshua Bengio, and Pascal Vincent. 2012.
“Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription.” In
29th International Conference on Machine Learning.
Bown, Oliver, and Sebastian Lexer. 2006.
“Continuous-Time Recurrent Neural Networks for Generative and Interactive Musical Performance.” In
Applications of Evolutionary Computing, edited by Franz Rothlauf, Jürgen Branke, Stefano Cagnoni, Ernesto Costa, Carlos Cotta, Rolf Drechsler, Evelyne Lutton, et al., 652–63. Lecture Notes in Computer Science 3907. Springer Berlin Heidelberg.
Buhusi, Catalin V., and Warren H. Meck. 2005.
“What Makes Us Tick? Functional and Neural Mechanisms of Interval Timing.” Nature Reviews Neuroscience 6 (10): 755–65.
Chang, Bo, Minmin Chen, Eldad Haber, and Ed H. Chi. 2019.
“AntisymmetricRNN: A Dynamical System View on Recurrent Neural Networks.” In
Proceedings of ICLR.
Charles, Adam, Dong Yin, and Christopher Rozell. 2016.
“Distributed Sequence Memory of Multidimensional Inputs in Recurrent Networks.” arXiv:1605.08346 [Cs, Math, Stat], May.
Chevillon, Guillaume. 2007.
“Direct Multi-Step Estimation and Forecasting.” Journal of Economic Surveys 21 (4): 746–85.
Cho, Kyunghyun, Bart van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014.
“Learning Phrase Representations Using RNN Encoder-Decoder for Statistical Machine Translation.” In
EMNLP 2014.
Cho, Kyunghyun, Bart van Merriënboer, Dzmitry Bahdanau, and Yoshua Bengio. 2014.
“On the Properties of Neural Machine Translation: Encoder-Decoder Approaches.” arXiv Preprint arXiv:1409.1259.
Chung, Junyoung, Sungjin Ahn, and Yoshua Bengio. 2016.
“Hierarchical Multiscale Recurrent Neural Networks.” arXiv:1609.01704 [Cs], September.
Chung, Junyoung, Caglar Gulcehre, KyungHyun Cho, and Yoshua Bengio. 2014.
“Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling.” In
NIPS.
Chung, Junyoung, Caglar Gulcehre, Kyunghyun Cho, and Yoshua Bengio. 2015.
“Gated Feedback Recurrent Neural Networks.” In
Proceedings of the 32nd International Conference on International Conference on Machine Learning - Volume 37, 2067–75. ICML’15. JMLR.org.
Chung, Junyoung, Kyle Kastner, Laurent Dinh, Kratarth Goel, Aaron C. Courville, and Yoshua Bengio. 2015.
“A Recurrent Latent Variable Model for Sequential Data.” In
Advances in Neural Information Processing Systems 28, edited by C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, 2980–88. Curran Associates, Inc.
Collins, Jasmine, Jascha Sohl-Dickstein, and David Sussillo. 2016.
“Capacity and Trainability in Recurrent Neural Networks.” arXiv:1611.09913 [Cs, Stat].
Cooijmans, Tim, Nicolas Ballas, César Laurent, Çağlar Gülçehre, and Aaron Courville. 2016.
“Recurrent Batch Normalization.” arXiv Preprint arXiv:1603.09025.
Dasgupta, Sakyasingha, Takayuki Yoshizumi, and Takayuki Osogami. 2016.
“Regularized Dynamic Boltzmann Machine with Delay Pruning for Unsupervised Learning of Temporal Sequences.” arXiv:1610.01989 [Cs, Stat], September.
Doelling, Keith B., and David Poeppel. 2015.
“Cortical Entrainment to Music and Its Modulation by Expertise.” Proceedings of the National Academy of Sciences 112 (45): E6233–42.
Elman, Jeffrey L. 1990.
“Finding Structure in Time.” Cognitive Science 14: 179–211.
Fortunato, Meire, Charles Blundell, and Oriol Vinyals. 2017.
“Bayesian Recurrent Neural Networks.” arXiv:1704.02798 [Cs, Stat], April.
Fraccaro, Marco, Søren Kaae Sønderby, Ulrich Paquet, and Ole Winther. 2016.
“Sequential Neural Models with Stochastic Layers.” In
Advances in Neural Information Processing Systems 29, edited by D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, 2199–2207. Curran Associates, Inc.
Gers, Felix A., Jürgen Schmidhuber, and Fred Cummins. 2000.
“Learning to Forget: Continual Prediction with LSTM.” Neural Computation 12 (10): 2451–71.
Gers, Felix A., Nicol N. Schraudolph, and Jürgen Schmidhuber. 2002.
“Learning Precise Timing with LSTM Recurrent Networks.” Journal of Machine Learning Research 3 (Aug): 115–43.
Graves, Alex. 2011.
“Practical Variational Inference for Neural Networks.” In
Proceedings of the 24th International Conference on Neural Information Processing Systems, 2348–56. NIPS’11. USA: Curran Associates Inc.
———. 2012.
Supervised Sequence Labelling with Recurrent Neural Networks. Studies in Computational Intelligence, v. 385. Heidelberg; New York: Springer.
Gregor, Karol, Ivo Danihelka, Alex Graves, Danilo Jimenez Rezende, and Daan Wierstra. 2015.
“DRAW: A Recurrent Neural Network For Image Generation.” arXiv:1502.04623 [Cs], February.
Gruslys, Audrunas, Remi Munos, Ivo Danihelka, Marc Lanctot, and Alex Graves. 2016.
“Memory-Efficient Backpropagation Through Time.” In
Advances in Neural Information Processing Systems 29, edited by D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, 4125–33. Curran Associates, Inc.
Grzyb, B. J., E. Chinellato, G. M. Wojcik, and W. A. Kaminski. 2009.
“Which Model to Use for the Liquid State Machine?” In
2009 International Joint Conference on Neural Networks, 1018–24.
Gu, Albert, Isys Johnson, Karan Goel, Khaled Saab, Tri Dao, Atri Rudra, and Christopher Ré. 2021.
“Combining Recurrent, Convolutional, and Continuous-Time Models with Linear State Space Layers.” In
Advances in Neural Information Processing Systems, 34:572–85. Curran Associates, Inc.
Hardt, Moritz, Tengyu Ma, and Benjamin Recht. 2018.
“Gradient Descent Learns Linear Dynamical Systems.” The Journal of Machine Learning Research 19 (1): 1025–68.
Hazan, Elad, Karan Singh, and Cyril Zhang. 2017.
“Learning Linear Dynamical Systems via Spectral Filtering.” In
NIPS.
Hazan, Hananel, and Larry M. Manevitz. 2012.
“Topological Constraints and Robustness in Liquid State Machines.” Expert Systems with Applications 39 (2): 1597–1606.
He, Kun, Yan Wang, and John Hopcroft. 2016.
“A Powerful Generative Model Using Random Weights for the Deep Image Representation.” In
Advances in Neural Information Processing Systems.
Hinton, G., Li Deng, Dong Yu, G.E. Dahl, A. Mohamed, N. Jaitly, A. Senior, et al. 2012.
“Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups.” IEEE Signal Processing Magazine 29 (6): 82–97.
Hochreiter, Sepp. 1998.
“The Vanishing Gradient Problem During Learning Recurrent Neural Nets and Problem Solutions.” International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 6: 107–15.
Hochreiter, Sepp, Yoshua Bengio, Paolo Frasconi, and Jürgen Schmidhuber. 2001.
“Gradient Flow in Recurrent Nets: The Difficulty of Learning Long-Term Dependencies.” In
A Field Guide to Dynamical Recurrent Neural Networks. IEEE Press.
Hochreiter, Sepp, and Jürgen Schmidhuber. 1997a.
“LSTM Can Solve Hard Long Time Lag Problems.” In
Advances in Neural Information Processing Systems: Proceedings of the 1996 Conference, 473–79.
Hochreiter, Sepp, and Jürgen Schmidhuber. 1997b.
“Long Short-Term Memory.” Neural Computation 9 (8): 1735–80.
Jing, Li, Yichen Shen, Tena Dubcek, John Peurifoy, Scott Skirlo, Yann LeCun, Max Tegmark, and Marin Soljačić. 2017.
“Tunable Efficient Unitary Neural Networks (EUNN) and Their Application to RNNs.” In
Proceedings of the 34th International Conference on Machine Learning, 1733–41. PMLR.
Jozefowicz, Rafal, Wojciech Zaremba, and Ilya Sutskever. 2015.
“An Empirical Exploration of Recurrent Network Architectures.” In
Proceedings of the 32nd International Conference on Machine Learning (ICML-15), 2342–50.
Karpathy, Andrej, Justin Johnson, and Li Fei-Fei. 2015.
“Visualizing and Understanding Recurrent Networks.” arXiv:1506.02078 [Cs], June.
Kingma, Diederik P., Tim Salimans, Rafal Jozefowicz, Xi Chen, Ilya Sutskever, and Max Welling. 2016.
“Improving Variational Inference with Inverse Autoregressive Flow.” In
Advances in Neural Information Processing Systems 29. Curran Associates, Inc.
Koutník, Jan, Klaus Greff, Faustino Gomez, and Jürgen Schmidhuber. 2014.
“A Clockwork RNN.” arXiv:1402.3511 [Cs], February.
Krishnan, Rahul G., Uri Shalit, and David Sontag. 2015.
“Deep Kalman Filters.” arXiv Preprint arXiv:1511.05121.
Lamb, Alex, Anirudh Goyal, Ying Zhang, Saizheng Zhang, Aaron Courville, and Yoshua Bengio. 2016.
“Professor Forcing: A New Algorithm for Training Recurrent Networks.” In
Advances In Neural Information Processing Systems.
Laurent, Thomas, and James von Brecht. 2016.
“A Recurrent Neural Network Without Chaos.” arXiv:1612.06212 [Cs], December.
LeCun, Y., L. Bottou, Y. Bengio, and P. Haffner. 1998.
“Gradient-Based Learning Applied to Document Recognition.” Proceedings of the IEEE 86 (11): 2278–2324.
Legenstein, Robert, Christian Naeger, and Wolfgang Maass. 2005.
“What Can a Neuron Learn with Spike-Timing-Dependent Plasticity?” Neural Computation 17 (11): 2337–82.
Lillicrap, Timothy P., and Adam Santoro. 2019.
“Backpropagation Through Time and the Brain.” Current Opinion in Neurobiology, Machine Learning, Big Data, and Neuroscience, 55 (April): 82–89.
Lipton, Zachary C., John Berkowitz, and Charles Elkan. 2015.
“A Critical Review of Recurrent Neural Networks for Sequence Learning.” arXiv:1506.00019 [Cs], May.
Lukoševičius, Mantas, and Herbert Jaeger. 2009.
“Reservoir Computing Approaches to Recurrent Neural Network Training.” Computer Science Review 3 (3): 127–49.
Maass, W., T. Natschläger, and H. Markram. 2004.
“Computational Models for Generic Cortical Microcircuits.” In
Computational Neuroscience: A Comprehensive Approach, 575–605. Chapman & Hall/CRC.
MacKay, Matthew, Paul Vicol, Jimmy Ba, and Roger Grosse. 2018.
“Reversible Recurrent Neural Networks.” In
Advances In Neural Information Processing Systems.
Maddison, Chris J., Dieterich Lawson, George Tucker, Nicolas Heess, Mohammad Norouzi, Andriy Mnih, Arnaud Doucet, and Yee Whye Teh. 2017.
“Filtering Variational Objectives.” arXiv Preprint arXiv:1705.09279.
Martens, James. 2010.
“Deep Learning via Hessian-Free Optimization.” In
Proceedings of the 27th International Conference on International Conference on Machine Learning, 735–42. ICML’10. USA: Omnipress.
Martens, James, and Ilya Sutskever. 2011.
“Learning Recurrent Neural Networks with Hessian-Free Optimization.” In
Proceedings of the 28th International Conference on International Conference on Machine Learning, 1033–40. ICML’11. USA: Omnipress.
———. 2012.
“Training Deep and Recurrent Networks with Hessian-Free Optimization.” In
Neural Networks: Tricks of the Trade, 479–535. Lecture Notes in Computer Science. Springer.
Mhammedi, Zakaria, Andrew Hellicar, Ashfaqur Rahman, and James Bailey. 2017.
“Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections.” In
Proceedings of the 34th International Conference on Machine Learning, 2401–9. PMLR.
Mikolov, Tomáš, Martin Karafiát, Lukáš Burget, Jan Černockỳ, and Sanjeev Khudanpur. 2010.
“Recurrent Neural Network Based Language Model.” In
Eleventh Annual Conference of the International Speech Communication Association.
Miller, John, and Moritz Hardt. 2018.
“When Recurrent Models Don’t Need To Be Recurrent.” arXiv:1805.10369 [Cs, Stat], May.
Mohamed, A.-r., G. E. Dahl, and G. Hinton. 2012.
“Acoustic Modeling Using Deep Belief Networks.” IEEE Transactions on Audio, Speech, and Language Processing 20 (1): 14–22.
Neil, Daniel, Michael Pfeiffer, and Shih-Chii Liu. 2016.
“Phased LSTM: Accelerating Recurrent Network Training for Long or Event-Based Sequences.” arXiv:1610.09513 [Cs], October.
Niu, Murphy Yuezhen, Lior Horesh, and Isaac Chuang. 2019.
“Recurrent Neural Networks in the Eye of Differential Equations.” arXiv:1904.12933 [Quant-Ph, Stat], April.
Nussbaum-Thom, Markus, Jia Cui, Bhuvana Ramabhadran, and Vaibhava Goel. 2016.
“Acoustic Modeling Using Bidirectional Gated Recurrent Convolutional Units.” In, 390–94.
Oliva, Junier B., Barnabas Poczos, and Jeff Schneider. 2017.
“The Statistical Recurrent Unit.” arXiv:1703.00381 [Cs, Stat], March.
Pascanu, Razvan, Tomas Mikolov, and Yoshua Bengio. 2013.
“On the Difficulty of Training Recurrent Neural Networks.” In
Proceedings of the 30th International Conference on Machine Learning, 1310–18. PMLR.
Patraucean, Viorica, Ankur Handa, and Roberto Cipolla. 2015.
“Spatio-Temporal Video Autoencoder with Differentiable Memory.” arXiv:1511.06309 [Cs], November.
Pillonetto, Gianluigi. 2016.
“The Interplay Between System Identification and Machine Learning.” arXiv:1612.09158 [Cs, Stat], December.
Ravanbakhsh, Siamak, Jeff Schneider, and Barnabas Poczos. 2016.
“Deep Learning with Sets and Point Clouds.” arXiv:1611.04500 [Cs, Stat].
Roberts, Adam, Jesse Engel, Colin Raffel, Curtis Hawthorne, and Douglas Eck. 2018.
“A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music.” arXiv:1803.05428 [Cs, Eess, Stat], March.
Rohrbach, Anna, Marcus Rohrbach, and Bernt Schiele. 2015.
“The Long-Short Story of Movie Description.” arXiv:1506.01698 [Cs], June.
Rumelhart, David E., Geoffrey E. Hinton, and Ronald J. Williams. 1986.
“Learning Representations by Back-Propagating Errors.” Nature 323 (6088): 533–36.
Ryder, Thomas, Andrew Golightly, A. Stephen McGough, and Dennis Prangle. 2018.
“Black-Box Variational Inference for Stochastic Differential Equations.” arXiv:1802.03335 [Stat], February.
Sjöberg, Jonas, Qinghua Zhang, Lennart Ljung, Albert Benveniste, Bernard Delyon, Pierre-Yves Glorennec, Håkan Hjalmarsson, and Anatoli Juditsky. 1995.
“Nonlinear Black-Box Modeling in System Identification: A Unified Overview.” Automatica, Trends in System Identification, 31 (12): 1691–1724.
Song, Yang, Chenlin Meng, Renjie Liao, and Stefano Ermon. 2020.
“Nonlinear Equation Solving: A Faster Alternative to Feedforward Computation.” arXiv:2002.03629 [Cs, Stat], February.
Steil, J. J. 2004.
“Backpropagation-Decorrelation: Online Recurrent Learning with O(N) Complexity.” In
Proceedings of the 2004 IEEE International Joint Conference on Neural Networks, 2:843–48.
Surace, Simone Carlo, and Jean-Pascal Pfister. 2016. “Online Maximum Likelihood Estimation of the Parameters of Partially Observed Diffusion Processes.” In.
Sutskever, Ilya. 2013.
“Training Recurrent Neural Networks.” PhD Thesis, Toronto, Ont., Canada: University of Toronto.
Takamoto, Makoto, Timothy Praditia, Raphael Leiteritz, Dan MacKinlay, Francesco Alesiani, Dirk Pflüger, and Mathias Niepert. 2022.
“PDEBench: An Extensive Benchmark for Scientific Machine Learning.” In.
Tallec, Corentin, and Yann Ollivier. 2017.
“Unbiasing Truncated Backpropagation Through Time.” arXiv.
Taylor, Graham W., Geoffrey E. Hinton, and Sam T. Roweis. 2006.
“Modeling Human Motion Using Binary Latent Variables.” In
Advances in Neural Information Processing Systems, 1345–52.
Theis, Lucas, and Matthias Bethge. 2015.
“Generative Image Modeling Using Spatial LSTMs.” arXiv:1506.03478 [Cs, Stat], June.
Visin, Francesco, Kyle Kastner, Kyunghyun Cho, Matteo Matteucci, Aaron Courville, and Yoshua Bengio. 2015.
“ReNet: A Recurrent Neural Network Based Alternative to Convolutional Networks.” arXiv:1505.00393 [Cs], May.
Voelker, Aaron R., Ivana Kajić, and Chris Eliasmith. 2019. “Legendre Memory Units: Continuous-Time Representation in Recurrent Neural Networks.” In Advances in Neural Information Processing Systems 32.
Wen, Ruofeng, Kari Torkkola, and Balakrishnan Narayanaswamy. 2017.
“A Multi-Horizon Quantile Recurrent Forecaster.” arXiv:1711.11053 [Stat], November.
Williams, Ronald J., and David Zipser. 1989.
“A Learning Algorithm for Continually Running Fully Recurrent Neural Networks.” Neural Computation 1 (2): 270–80.
Wisdom, Scott, Thomas Powers, John Hershey, Jonathan Le Roux, and Les Atlas. 2016.
“Full-Capacity Unitary Recurrent Neural Networks.” In
Advances in Neural Information Processing Systems, 4880–88.
Wisdom, Scott, Thomas Powers, James Pitton, and Les Atlas. 2016.
“Interpretable Recurrent Neural Networks Using Sequential Sparse Recovery.” In
Advances in Neural Information Processing Systems 29.
Wu, Yuhuai, Saizheng Zhang, Ying Zhang, Yoshua Bengio, and Ruslan R. Salakhutdinov. 2016.
“On Multiplicative Integration with Recurrent Neural Networks.” In
Advances in Neural Information Processing Systems 29, edited by D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, 2856–64. Curran Associates, Inc.
Yao, Li, Atousa Torabi, Kyunghyun Cho, Nicolas Ballas, Christopher Pal, Hugo Larochelle, and Aaron Courville. 2015.
“Describing Videos by Exploiting Temporal Structure.” arXiv:1502.08029 [Cs, Stat], February.