Learning Gamelan



Attention conservation notice: Crib notes for a two-year project, which I ultimately abandoned in late 2018, on approximating convnets with recurrent neural networks for analysing time series. The project currently exists purely as LaTeX files on my hard drive, which need to be imported here for future reference. I did learn some useful tricks along the way about controlling the poles of IIR filters when learning them by gradient descent, and those will actually be interesting.

I feel a certain class of audio signals should be easy to decompose, and thence to learn, in a musically useful way: those approximated by LTI (linear time-invariant), nearly-linear, nearly-additive filterbanks with sparse activations. Mostly we handle musical signals via convnets, which is not satisfying; one feels one could do better with a more appropriate architecture. This project was about finding that architecture.
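To make that signal class concrete, here is a minimal sketch, not the architecture this project arrived at. It assumes each "instrument" is a two-pole IIR resonator driven by a sparse train of strikes, and that the observed signal is the plain sum of the resonator outputs. The pole radius is squashed through a sigmoid, which is one common way (an assumption here, not necessarily the trick from the original notes) to keep poles inside the unit circle while the unconstrained parameter is adjusted by gradient descent. All names (`pole_to_coeffs`, `resonator`, `filterbank`) and all parameter values are hypothetical.

```python
import math


def stable_sigmoid(z):
    """Logistic function, written so that exp() never overflows."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def pole_to_coeffs(rho, omega):
    """Map unconstrained (rho, omega) to two-pole feedback coefficients.

    The pole radius r = sigmoid(rho) is mathematically inside (0, 1), so the
    conjugate pole pair r * exp(+/- i * omega) sits inside the unit circle.
    (In floating point r saturates for very large |rho|.)
    """
    r = stable_sigmoid(rho)
    a1 = 2.0 * r * math.cos(omega)
    a2 = -r * r
    return a1, a2


def resonator(x, rho, omega):
    """Run y[t] = a1 * y[t-1] + a2 * y[t-2] + x[t] over the input list x."""
    a1, a2 = pole_to_coeffs(rho, omega)
    y1 = y2 = 0.0
    out = []
    for xt in x:
        yt = a1 * y1 + a2 * y2 + xt
        out.append(yt)
        y2, y1 = y1, yt
    return out


def filterbank(activations, params):
    """Additive mix: resonator k is driven by the sparse list activations[k]."""
    mix = [0.0] * len(activations[0])
    for x, (rho, omega) in zip(activations, params):
        for t, yt in enumerate(resonator(x, rho, omega)):
            mix[t] += yt
    return mix


# Two hypothetical "gongs", each struck once at a different time.
n = 2000
strikes = [[0.0] * n for _ in range(2)]
strikes[0][0] = 1.0
strikes[1][500] = 0.5
params = [(6.0, 0.05), (5.0, 0.12)]  # illustrative (rho, omega) pairs
signal = filterbank(strikes, params)
```

Learning would then mean recovering `params` and the sparse `strikes` from `signal`; the sketch only covers the forward (synthesis) direction.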

🏗

Lei, Tao, and Yu Zhang. 2017. β€œTraining RNNs as Fast as CNNs.” arXiv:1709.02755 [Cs], September.
Lewicki, M S, and T J Sejnowski. 1999. β€œCoding Time-Varying Signals Using Sparse, Shift-Invariant Representations.” In NIPS, 11:730–36. Denver, CO: MIT Press.
Lewicki, Michael S. 2002. β€œEfficient Coding of Natural Sounds.” Nature Neuroscience 5 (4): 356–63.
Lewicki, Michael S., and Terrence J. Sejnowski. 2000. β€œLearning Overcomplete Representations.” Neural Computation 12 (2): 337–65.
Li, Shuai, Wanqing Li, Chris Cook, Ce Zhu, and Yanbo Gao. 2018. β€œIndependently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN.” In arXiv:1803.04831 [Cs].
Li, Yanghao, Naiyan Wang, Jiaying Liu, and Xiaodi Hou. 2017. β€œDemystifying Neural Style Transfer.” In IJCAI.
LindstrΓΆm, Erik, Edward Ionides, Jan Frydendall, and Henrik Madsen. 2012. β€œEfficient Iterated Filtering.” In IFAC-PapersOnLine (System Identification, Volume 16), 45:1785–90. 16th IFAC Symposium on System Identification. IFAC & Elsevier Ltd.
LindstrΓΆm, Erik, Jonas StrΓΆjby, Mats BrodΓ©n, Magnus Wiktorsson, and Jan Holst. 2008. β€œSequential Calibration of Options.” Computational Statistics & Data Analysis 52 (6): 2877–91.
Lipton, Zachary C. 2016. β€œStuck in a What? Adventures in Weight Space.” arXiv:1602.07320 [Cs], February.
Lipton, Zachary C., John Berkowitz, and Charles Elkan. 2015. β€œA Critical Review of Recurrent Neural Networks for Sequence Learning.” arXiv:1506.00019 [Cs], May.
Liu, Jane, and Mike West. 2001. β€œCombined Parameter and State Estimation in Simulation-Based Filtering.” In Sequential Monte Carlo Methods in Practice, 197–223. Statistics for Engineering and Information Science. Springer, New York, NY.
Liu, Jen-Yu, Shyh-Kang Jeng, and Yi-Hsuan Yang. 2016. β€œApplying Topological Persistence in Convolutional Neural Network for Music Audio Signals.” arXiv:1608.07373 [Cs], August.
Ljung, L. 1979. β€œAsymptotic Behavior of the Extended Kalman Filter as a Parameter Estimator for Linear Systems.” IEEE Transactions on Automatic Control 24 (1): 36–50.
Ljung, Lennart. 1999. System Identification: Theory for the User. 2nd ed. Prentice Hall Information and System Sciences Series. Upper Saddle River, NJ: Prentice Hall PTR.
Ljung, Lennart, Georg Ch Pflug, and Harro Walk. 2012. Stochastic Approximation and Optimization of Random Systems. Vol. 17. BirkhΓ€user.
Ljung, Lennart, and Torsten SΓΆderstrΓΆm. 1983. Theory and Practice of Recursive Identification. The MIT Press Series in Signal Processing, Optimization, and Control 4. Cambridge, Mass: MIT Press.
Mallat, StΓ©phane G., and Zhifeng Zhang. 1993. β€œMatching Pursuits with Time-Frequency Dictionaries.” IEEE Transactions on Signal Processing 41 (12): 3397–3415.
Marelli, D., and Minyue Fu. 2010. β€œA Recursive Method for the Approximation of LTI Systems Using Subband Processing.” IEEE Transactions on Signal Processing 58 (3): 1025–34.
Martens, James. 2010. β€œDeep Learning via Hessian-Free Optimization.” In Proceedings of the 27th International Conference on International Conference on Machine Learning, 735–42. ICML’10. USA: Omnipress.
Martens, James, and Ilya Sutskever. 2011. β€œLearning Recurrent Neural Networks with Hessian-Free Optimization.” In Proceedings of the 28th International Conference on International Conference on Machine Learning, 1033–40. ICML’11. USA: Omnipress.
β€”β€”β€”. 2012. β€œTraining Deep and Recurrent Networks with Hessian-Free Optimization.” In Neural Networks: Tricks of the Trade, 479–535. Lecture Notes in Computer Science. Springer.
Masri, Paul, Andrew Bateman, and Nishan Canagarajah. 1997a. β€œA Review of Time–Frequency Representations, with Application to Sound/Music Analysis–Resynthesis.” Organised Sound 2 (03): 193–205.
β€”β€”β€”. 1997b. β€œThe Importance of the Time–Frequency Representation for Sound/Music Analysis–Resynthesis.” Organised Sound 2 (03): 207–14.
Mattingley, J., and S. Boyd. 2010. β€œReal-Time Convex Optimization in Signal Processing.” IEEE Signal Processing Magazine 27 (3): 50–61.
McFee, Brian, Thierry Bertin-Mahieux, Daniel P.W. Ellis, and Gert R.G. Lanckriet. 2012. β€œThe Million Song Dataset Challenge.” In, 909. ACM Press.
McFee, Brian, and Daniel PW Ellis. 2011. β€œAnalyzing Song Structure with Spectral Clustering.” In IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Megretski, A. 2003. β€œPositivity of Trigonometric Polynomials.” In 42nd IEEE International Conference on Decision and Control (IEEE Cat. No.03CH37475), 4:3814–3817 vol.4.
Mehri, Soroush, Kundan Kumar, Ishaan Gulrajani, Rithesh Kumar, Shubham Jain, Jose Sotelo, Aaron Courville, and Yoshua Bengio. 2017. β€œSampleRNN: An Unconditional End-to-End Neural Audio Generation Model.” In Proceedings of International Conference on Learning Representations (ICLR) 2017.
Meinshausen, Nicolai, and Bin Yu. 2009. β€œLasso-Type Recovery of Sparse Representations for High-Dimensional Data.” The Annals of Statistics 37 (1): 246–70.
Mermelstein, Paul, and CH Chen. 1976. β€œDistance Measures for Speech Recognition: Psychological and Instrumental.” In Pattern Recognition and Artificial Intelligence, 101:374–88. Academic Press.
Meyer, Matthias, Jan Beutel, and Lothar Thiele. 2017. β€œUnsupervised Feature Learning for Audio Analysis.” In Proceedings of International Conference on Learning Representations (ICLR) 2017.
Mhammedi, Zakaria, Andrew Hellicar, Ashfaqur Rahman, and James Bailey. 2017. β€œEfficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections.” In PMLR, 2401–9.
Miron, Marius, Julio J. Carabias-Orti, Juan J. Bosch, Gó, Emilia Mez, and Jordi Janer. 2016. β€œScore-Informed Source Separation for Multichannel Orchestral Recordings.” Journal of Electrical and Computer Engineering 2016 (December): e8363507.
MΕ‚ynarski, Wiktor, and Josh H. McDermott. 2017. β€œLearning Mid-Level Auditory Codes from Natural Sound Statistics.” arXiv:1701.07138 [Cs, q-Bio], January.
β€œMobileNets: Open-Source Models for Efficient On-Device Vision.” n.d. Research Blog (blog).
Mohammed, Salah-Eldin A., and Michael K. R. Scheutzow. 1997. β€œLyapunov Exponents of Linear Stochastic Functional-Differential Equations. II. Examples and Case Studies.” The Annals of Probability 25 (3): 1210–40.
Monner, Derek, and James A. Reggia. 2012. β€œA Generalized LSTM-Like Training Algorithm for Second-Order Recurrent Neural Networks.” Neural Networks 25 (January): 70–83.
Moorer, J.A. 1974. β€œThe Optimum Comb Method of Pitch Period Analysis of Continuous Digitized Speech.” IEEE Transactions on Acoustics, Speech and Signal Processing 22 (5): 330–38.
Moradkhani, Hamid, Soroosh Sorooshian, Hoshin V. Gupta, and Paul R. Houser. 2005. β€œDual State–Parameter Estimation of Hydrological Models Using Ensemble Kalman Filter.” Advances in Water Resources 28 (2): 135–47.
Mozer, Michael C., Denis Kazakov, and Robert V. Lindsey. 2018. β€œState-Denoised Recurrent Neural Networks.” arXiv:1805.08394 [Cs], May.
MΓΌller, M, F Kurth, and M Clausen. 2005a. β€œAudio Matching via Chroma-Based Statistical Features.” In Proc. Int. Conf. Music Info. Retrieval, 288–95. London, U.K.
β€”β€”β€”. 2005b. β€œChroma-Based Statistical Audio Features for Audio Matching.” In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 275–78. New Paltz, NY.
Narayan, S. Shyamla, Andrei N. Temchin, Alberto Recio, and Mario A. Ruggero. 1998. β€œFrequency Tuning of Basilar Membrane and Auditory Nerve Fibers in the Same Cochleae.” Science 282 (5395): 1882–84.
Neal, Radford M., and Geoffrey E. Hinton. 1998. β€œA View of the EM Algorithm That Justifies Incremental, Sparse, and Other Variants.” In Learning in Graphical Models, edited by Michael I. Jordan, 355–68. NATO ASI Series 89. Springer Netherlands.
Needell, D., and J. A. Tropp. 2008. β€œCoSaMP: Iterative Signal Recovery from Incomplete and Inaccurate Samples.” arXiv:0803.2392 [Cs, Math], March.
Nerrand, O., P. Roussel-Ragot, L. Personnaz, G. Dreyfus, and S. Marcos. 1993. β€œNeural Networks and Nonlinear Adaptive Filtering: Unifying Concepts and New Algorithms.” Neural Computation 5 (2): 165–99.
Nussbaum-Thom, Markus, Jia Cui, Bhuvana Ramabhadran, and Vaibhava Goel. 2016. β€œAcoustic Modeling Using Bidirectional Gated Recurrent Convolutional Units.” In, 390–94.
Nyquist, H. 1928. β€œCertain Topics in Telegraph Transmission Theory.” Transactions of the American Institute of Electrical Engineers 47 (2): 617–44.
Oliveira, MaurΓ­cio C. de, and Robert E. Skelton. 2001. β€œStability Tests for Constrained Linear Systems.” In Perspectives in Robust Control, 241–57. Lecture Notes in Control and Information Sciences. Springer, London.
Oord, Aaron van den, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, and Koray Kavukcuoglu. 2016. β€œWaveNet: A Generative Model for Raw Audio.” In 9th ISCA Speech Synthesis Workshop.
Pascanu, Razvan, Yann N. Dauphin, Surya Ganguli, and Yoshua Bengio. 2014. β€œOn the Saddle Point Problem for Non-Convex Optimization.” arXiv:1405.4604 [Cs], May.
Pascanu, Razvan, Tomas Mikolov, and Yoshua Bengio. 2013. β€œOn the Difficulty of Training Recurrent Neural Networks.” In arXiv:1211.5063 [Cs], 1310–18.
Patel, Vivak. 2017. β€œOn SGD’s Failure in Practice: Characterizing and Overcoming Stalling.” arXiv:1702.00317 [Cs, Math, Stat], February.
Peeters, G. 2004. β€œA Large Set of Audio Features for Sound Description (Similarity and Classification) in the CUIDADO Project.”
Pillonetto, Gianluigi. 2016. β€œThe Interplay Between System Identification and Machine Learning.” arXiv:1612.09158 [Cs, Stat], December.
Polyak, B. T., and A. B. Juditsky. 1992. β€œAcceleration of Stochastic Approximation by Averaging.” SIAM Journal on Control and Optimization 30 (4): 838–55.
Pons, Jordi, Thomas Lidy, and Xavier Serra. 2016. β€œExperimenting with Musically Motivated Convolutional Neural Networks.” In 2016 14th International Workshop on Content-Based Multimedia Indexing (CBMI), 1–6. Bucharest, Romania: IEEE.
Pons, Jordi, Oriol Nieto, Matthew Prockup, Erik M. Schmidt, Andreas F. Ehmann, and Xavier Serra. 2017. β€œEnd to End Learning for Music Audio Tagging at Scale.” In Proceedings of ISMIR.
Pons, Jordi, and Xavier Serra. 2018. β€œRandomly Weighted CNNs for (Music) Audio Classification.” arXiv:1805.00237 [Cs, Eess], May.
Prandoni, Paolo, and Martin Vetterli. 2008. Signal processing for communications. Communication and information sciences. Lausanne: EPFL Press.
Preis, Douglas, and Voula Chris Georgopoulos. 1999. β€œWigner Distribution Representation and Analysis of Audio Signals: An Illustrated Tutorial Review.” Journal of the Audio Engineering Society 47 (12): 1043–53.
Qu, Shuhui, Juncheng Li, Wei Dai, and Samarjit Das. 2016a. β€œLearning Filter Banks Using Deep Learning For Acoustic Signals.” arXiv:1611.09526 [Cs], November.
β€”β€”β€”. 2016b. β€œUnderstanding Audio Pattern Using Convolutional Neural Network From Raw Waveforms.” arXiv:1611.09524 [Cs], November.
Rafii, Z. 2018. β€œSliding Discrete Fourier Transform with Kernel Windowing [Lecture Notes].” IEEE Signal Processing Magazine 35 (6): 88–92.
Ragazzini, J. R., and L. A. Zadeh. 1952. β€œThe Analysis of Sampled-Data Systems.” Transactions of the American Institute of Electrical Engineers, Part II: Applications and Industry 71 (5): 225–34.
Rajan, Rajeev, Manaswi Misra, and Hema A. Murthy. 2017. β€œMelody Extraction from Music Using Modified Group Delay Functions.” International Journal of Speech Technology 20 (1): 185–204.
Rall, Louis B. 1981. Automatic Differentiation: Techniques and Applications. Lecture Notes in Computer Science 120. Berlin ; New York: Springer-Verlag.
Ravelli, E, G Richard, and L Daudet. 2008. β€œFast MIR in a Sparse Transform Domain.” In Int. Conf. Music Info. Retrieval. Philadelphia, PA.
Rawat, Waseem, and Zenghui Wang. 2017. β€œDeep Convolutional Neural Networks for Image Classification: A Comprehensive Review.” Neural Computation 29 (9): 2352–2449.
Rebollo-Neira, Laura. 2007. β€œOblique Matching Pursuit.” IEEE Signal Processing Letters 14 (10): 703–6.
Rebollo-Neira, L., and D. Lowe. 2002. β€œOptimized Orthogonal Matching Pursuit Approach.” IEEE Signal Processing Letters 9 (4): 137–40.
Rioul, O., and M. Vetterli. 1991. β€œWavelets and Signal Processing.” IEEE Signal Processing Magazine 8 (4): 14–38.
Robbins, Herbert, and Sutton Monro. 1951. β€œA Stochastic Approximation Method.” The Annals of Mathematical Statistics 22 (3): 400–407.
Robbins, H., and D. Siegmund. 1971. β€œA Convergence Theorem for Non Negative Almost Supermartingales and Some Applications.” In Optimizing Methods in Statistics, edited by Jagdish S. Rustagi, 233–57. Academic Press.
Roberts, Adam, Jesse Engel, and Douglas Eck. 2017. β€œHierarchical Variational Autoencoders for Music.” In NIPS Workshop on Machine Learning for Creativity and Design.
Robertson, Andrew, and Mark Plumbley. 2007. β€œB-Keeper: A Beat-Tracker for Live Performance.” In Proceedings of the 7th International Conference on New Interfaces for Musical Expression, 234–37. NIME ’07. New York, NY, USA: ACM.
Robertson, Andrew, Adam M. Stark, and Mark D. Plumbley. 2011. β€œReal-Time Visual Beat Tracking Using a Comb Filter Matrix.” In Proceedings of the International Computer Music Conference 2011.
Robertson, Andrew, Adam Stark, and Matthew EP Davies. 2013. β€œPercussive Beat Tracking Using Real-Time Median Filtering.” In Proceedings of European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases.
Routtenberg, Tirza, and Joseph Tabrikian. 2010. β€œBlind MIMO-AR System Identification and Source Separation with Finite-Alphabet.” IEEE Transactions on Signal Processing 58 (3): 990–1000.
Rubinstein, Ron, A.M. Bruckstein, and Michael Elad. 2010. β€œDictionaries for Sparse Representation Modeling.” Proceedings of the IEEE 98 (6): 1045–57.
Rumelhart, David E., Geoffrey E. Hinton, and Ronald J. Williams. 1986. β€œLearning Representations by Back-Propagating Errors.” Nature 323 (6088): 533–36.
Sagun, Levent, V. Ugur Guney, Gerard Ben Arous, and Yann LeCun. 2014. β€œExplorations on High Dimensional Landscapes.” arXiv:1412.6615 [Cs, Stat], December.
Sainath, T. N., B. Kingsbury, A. r Mohamed, and B. Ramabhadran. 2013. β€œLearning Filter Banks Within a Deep Neural Network Framework.” In 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 297–302.
Sainath, Tara N., and Bo Li. 2016. β€œModeling Time-Frequency Patterns with LSTM Vs.Β Convolutional Architectures for LVCSR Tasks.” Submitted to Proc. Interspeech.
Sainath, Tara N., Ron J. Weiss, Andrew W. Senior, Kevin W. Wilson, and Oriol Vinyals. 2015. β€œLearning the Speech Front-End with Raw Waveform CLDNNs.” In INTERSPEECH, 1–5.
SΓ€relΓ€, Jaakko, and Harri Valpola. 2005. β€œDenoising Source Separation.” Journal of Machine Learning Research 6 (Mar): 233–72.
Schniter, P., and S. Rangan. 2012. β€œCompressive Phase Retrieval via Generalized Approximate Message Passing.” In 2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 815–22.
Sefati, S., N. J. Cowan, and R. Vidal. 2015. β€œLinear Systems with Sparse Inputs: Observability and Input Recovery.” In 2015 American Control Conference (ACC), 5251–57.
Seuret, Alexandre, and FrΓ©dΓ©ric Gouaisbaut. 2013. β€œWirtinger-Based Integral Inequality: Application to Time-Delay Systems.” Automatica 49 (9): 2860–66.
Shah, Ankit, Anurag Kumar, Alexander G. Hauptmann, and Bhiksha Raj. 2018. β€œA Closer Look at Weak Label Learning for Audio Events.” arXiv:1804.09288 [Cs, Eess], April.
Shannon, C. E. 1949. β€œCommunication in the Presence of Noise.” Proceedings of the IRE 37 (1): 10–21.
Simonyan, Karen, and Andrew Zisserman. 2014. β€œVery Deep Convolutional Networks for Large-Scale Image Recognition.” arXiv:1409.1556 [Cs], September.
SjΓΆberg, Jonas, Qinghua Zhang, Lennart Ljung, Albert Benveniste, Bernard Delyon, Pierre-Yves Glorennec, HΓ₯kan Hjalmarsson, and Anatoli Juditsky. 1995. β€œNonlinear Black-Box Modeling in System Identification: A Unified Overview.” Automatica, Trends in System Identification, 31 (12): 1691–1724.
Smaragdis, Paris. 2004. β€œNon-Negative Matrix Factor Deconvolution; Extraction of Multiple Sound Sources from Monophonic Inputs.” In Independent Component Analysis and Blind Signal Separation, edited by Carlos G. Puntonet and Alberto Prieto, 494–99. Lecture Notes in Computer Science. Granada, Spain: Springer Berlin Heidelberg.
Smaragdis, P., and J. C. Brown. 2003. β€œNon-Negative Matrix Factorization for Polyphonic Music Transcription.” In Applications of Signal Processing to Audio and Acoustics, 2003 IEEE Workshop on., 177–80.
Smith, Evan C., and Michael S. Lewicki. 2004. β€œLearning Efficient Auditory Codes Using Spikes Predicts Cochlear Filters.” In Advances in Neural Information Processing Systems, 1289–96.
β€”β€”β€”. 2006. β€œEfficient Auditory Coding.” Nature 439 (7079): 978–82.
Smith, Julius O. 2007. Introduction to Digital Filters with Audio Applications. http://www.w3k.org/books/: W3K Publishing.
Smith, Leonard A. 2000. β€œDisentangling Uncertainty and Error: On the Predictability of Nonlinear Systems.” In Nonlinear Dynamics and Statistics.
Smith, Leslie N. 2015. β€œCyclical Learning Rates for Training Neural Networks.” In.
Smith, Leslie N., and Nicholay Topin. 2017. β€œExploring Loss Function Topology with Cyclical Learning Rates.” arXiv:1702.04283 [Cs], February.
Smith, Steven W. 1997. The Scientist and Engineer’s Guide to Digital Signal Processing. 1st ed. San Diego, Calif: California Technical Pub.
SΓΆderstrΓΆm, T., and P. Stoica, eds. 1988. System Identification. Upper Saddle River, NJ, USA: Prentice-Hall, Inc.
Soh, Yong Sheng, and Venkat Chandrasekaran. 2017. β€œA Matrix Factorization Approach for Learning Semidefinite-Representable Regularizers.” arXiv:1701.01207 [Cs, Math, Stat], January.
Stepleton, Thomas, Razvan Pascanu, Will Dabney, Siddhant M. Jayakumar, Hubert Soyer, and Remi Munos. 2018. β€œLow-Pass Recurrent Neural Networks - A Memory Architecture for Longer-Term Correlation Discovery.” arXiv:1805.04955 [Cs, Stat], May.
Sutskever, Ilya. 2013. β€œTraining Recurrent Neural Networks.” PhD Thesis, Toronto, Ont., Canada, Canada: University of Toronto.
Sutskever, Ilya, James Martens, George E. Dahl, and Geoffrey E. Hinton. 2013. β€œOn the Importance of Initialization and Momentum in Deep Learning.” In ICML (3), 28:1139–47.
Szegedy, Christian, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. 2015. β€œGoing Deeper with Convolutions.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1–9.
Tallec, Corentin, and Yann Ollivier. 2017. β€œUnbiasing Truncated Backpropagation Through Time.” arXiv:1705.08209 [Cs], May.
Telgarsky, Matus. 2017. β€œNeural Networks and Rational Functions.” In PMLR, 3387–93.
Thickstun, John, Zaid Harchaoui, and Sham Kakade. 2017. β€œLearning Features of Music from Scratch.” In Proceedings of International Conference on Learning Representations (ICLR) 2017.
Tippett, Michael K., Jeffrey L. Anderson, Craig H. Bishop, Thomas M. Hamill, and Jeffrey S. Whitaker. 2003. β€œEnsemble Square Root Filters.” Monthly Weather Review 131 (7): 1485–90.
Tong, Matthew H., Adam D. Bickett, Eric M. Christiansen, and Garrison W. Cottrell. 2007. β€œLearning Grammatical Structure with Echo State Networks.” Neural Networks 20 (3): 424–32.
Tran, Dustin, Matthew D. Hoffman, Rif A. Saurous, Eugene Brevdo, Kevin Murphy, and David M. Blei. 2017. β€œDeep Probabilistic Programming.” In ICLR.
Tran, Dustin, Alp Kucukelbir, Adji B. Dieng, Maja Rudolph, Dawen Liang, and David M. Blei. 2016. β€œEdward: A Library for Probabilistic Modeling, Inference, and Criticism.” arXiv:1610.09787 [Cs, Stat], October.
Triefenbach, F., A. Jalalvand, K. Demuynck, and J. P. Martens. 2013. β€œAcoustic Modeling With Hierarchical Reservoirs.” IEEE Transactions on Audio, Speech, and Language Processing 21 (11): 2439–50.
Tropp, J A, M B Wakin, M F Duarte, D Baron, and R G Baraniuk. 2006. β€œRandom Filters for Compressive Sampling and Reconstruction.” In Proceedings of the IEEE International Conference Acoustics, Speech, and Signal Processing, 3:872–75.
Tsipas, Nikolaos, Lazaros Vrysis, Charalampos Dimoulas, and George Papanikolaou. 2017. β€œEfficient Audio-Driven Multimedia Indexing Through Similarity-Based Speech / Music Discrimination.” Multimedia Tools and Applications, January, 1–19.
Tufts, D. W., and R. Kumaresan. 1982. β€œEstimation of Frequencies of Multiple Sinusoids: Making Linear Prediction Perform Like Maximum Likelihood.” Proceedings of the IEEE 70 (9): 975–89.
Uncini, Aurelio. 2003. β€œAudio Signal Processing by Neural Networks.” Neurocomputing, Evolving Solution with Neural Networks, 55 (3–4): 593–625.
Van Eeghem, Frederik, and Lieven De Lathauwer. 2013. β€œBlind System Identification as a Compressed Sensing Problem.”
Vaz, Colin, Asterios Toutios, and Shrikanth S. Narayanan. 2016. β€œConvex Hull Convolutive Non-Negative Matrix Factorization for Uncovering Temporal Patterns in Multivariate Time-Series Data.” In, 963–67.
Venkataramani, Shrikant, and Paris Smaragdis. 2017. β€œEnd to End Source Separation with Adaptive Front-Ends.” arXiv:1705.02514 [Cs], May.
Venkataramani, Shrikant, Y. Cem Subakan, and Paris Smaragdis. 2017. β€œNeural Network Alternatives to Convolutive Audio Models for Source Separation.” arXiv:1709.07908 [Cs, Eess], September.
Vincent, E., N. Bertin, and R. Badeau. 2008. β€œHarmonic and Inharmonic Nonnegative Matrix Factorization for Polyphonic Pitch Transcription.” In 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, 109–12.
Virtanen, T. 2007. β€œMonaural Sound Source Separation by Nonnegative Matrix Factorization With Temporal Continuity and Sparseness Criteria.” IEEE Transactions on Audio, Speech, and Language Processing 15 (3): 1066–74.
Virtanen, Tuomas. 2006. β€œUnsupervised Learning Methods for Source Separation in Monaural Music Signals.” In Signal Processing Methods for Music Transcription, 267–96. Springer.
Wang, Xinxi, and Ye Wang. 2014. β€œImproving Content-Based and Hybrid Music Recommendation Using Deep Learning.” In Proceedings of the 22Nd ACM International Conference on Multimedia, 627–36. MM ’14. New York, NY, USA: ACM.
Wang, Zhong-Qiu, Jonathan Le Roux, DeLiang Wang, and John R. Hershey. 2018. β€œEnd-to-End Speech Separation with Unfolded Iterative Phase Reconstruction.” arXiv:1804.10204 [Cs, Eess, Stat], April.
Welch, Peter D. 1967. β€œThe Use of Fast Fourier Transform for the Estimation of Power Spectra: A Method Based on Time Averaging over Short, Modified Periodograms.” IEEE Transactions on Audio and Electroacoustics 15 (2): 70–73.
Werbos, P. J. 1990. β€œBackpropagation Through Time: What It Does and How to Do It.” Proceedings of the IEEE 78 (10): 1550–60.
Werbos, Paul J. 1988. β€œGeneralization of Backpropagation with Application to a Recurrent Gas Market Model.” Neural Networks 1 (4): 339–56.
Wiatowski, Thomas, Philipp Grohs, and Helmut BΓΆlcskei. 2018. β€œEnergy Propagation in Deep Convolutional Neural Networks.” IEEE Transactions on Information Theory 64 (7): 1–1.
Williams, Ronald J., and Jing Peng. 1990. β€œAn Efficient Gradient-Based Algorithm for On-Line Training of Recurrent Network Trajectories.” Neural Computation 2 (4): 490–501.
Williams, Ronald J., and David Zipser. 1989. β€œA Learning Algorithm for Continually Running Fully Recurrent Neural Networks.” Neural Computation 1 (2): 270–80.
Wisdom, Scott, Thomas Powers, John Hershey, Jonathan Le Roux, and Les Atlas. 2016. β€œFull-Capacity Unitary Recurrent Neural Networks.” In Advances in Neural Information Processing Systems, 4880–88.
Wisdom, Scott, Thomas Powers, James Pitton, and Les Atlas. 2016. β€œInterpretable Recurrent Neural Networks Using Sequential Sparse Recovery.” In Advances in Neural Information Processing Systems 29.
Wright, Matthew, James Beauchamp, Kelly Fitz, Xavier Rodet, Axel RΓΆbel, Xavier Serra, and Gregory Wakefield. 2001. β€œAnalysis/Synthesis Comparison.” Organised Sound 5 (03): 173–89.
Wu, Xiaoxia, Rachel Ward, and LΓ©on Bottou. 2018. β€œWNGrad: Learn the Learning Rate in Gradient Descent.” arXiv:1803.02865 [Cs, Math, Stat], March.
Wu, Yuhuai, Saizheng Zhang, Ying Zhang, Yoshua Bengio, and Ruslan R Salakhutdinov. 2016. β€œOn Multiplicative Integration with Recurrent Neural Networks.” In Advances in Neural Information Processing Systems 29, edited by D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, 2856–64. Curran Associates, Inc.
Wyse, L. 2017. β€œAudio Spectrogram Representations for Processing with Convolutional Neural Networks.” In Proceedings of the First International Conference on Deep Learning and Music, Anchorage, US, May, 2017 (arXiv:1706.08675v1 [Cs.NE]).
Xie, Bo, Yingyu Liang, and Le Song. 2016. β€œDiversity Leads to Generalization in Neural Networks.” arXiv:1611.03131 [Cs, Stat], November.
Yaghoobi, M., Sangnam Nam, R. Gribonval, and M.E. Davies. 2013. β€œConstrained Overcomplete Analysis Operator Learning for Cosparse Signal Modelling.” IEEE Transactions on Signal Processing 61 (9): 2341–55.
Yin, W, S Osher, D Goldfarb, and J Darbon. 2008. β€œBregman Iterative Algorithms for \(\ell_1\)-Minimization with Applications to Compressed Sensing.” SIAM Journal on Imaging Sciences 1 (1): 143–68.
Yoshii, Kazuyoshi, and Masataka Goto. 2012. β€œInfinite Composite Autoregressive Models for Music Signal Analysis.” In.
Yu, D., and L. Deng. 2011. β€œDeep Learning and Its Applications to Signal and Information Processing [Exploratory DSP].” IEEE Signal Processing Magazine 28 (1): 145–54.
Yu, Dong, and Jinyu Li. 2018. β€œRecent Progresses in Deep Learning Based Acoustic Models (Updated).” arXiv:1804.09298 [Cs, Eess], April.
Yu, Guoshen, and Jean-Jacques Slotine. 2009. β€œAudio Classification from Time-Frequency Texture.” In Acoustics, Speech, and Signal Processing, IEEE International Conference on, 0:1677–80. Los Alamitos, CA, USA: IEEE Computer Society.
Yu, Haizi, and Lav R. Varshney. 2017. β€œTowards Deep Interpretability (MUS-ROVER II): Learning Hierarchical Representations of Tonal Music.” In Proceedings of International Conference on Learning Representations (ICLR) 2017.
Zhang, X., and W. R. Zbigniew. 2007. β€œAnalysis of Sound Features for Music Timbre Recognition.” In International Conference on Multimedia and Ubiquitous Engineering, 2007. MUE ’07, 3–8. Washington, DC.
Zhang, Yuchen, Percy Liang, and Martin J. Wainwright. 2016. β€œConvexified Convolutional Neural Networks.” arXiv:1609.01000 [Cs], September.
Zhu, Zhenyao, Jesse H. Engel, and Awni Hannun. 2016. β€œLearning Multiscale Features Directly from Waveforms.” In Interspeech 2016, 1305–9.
Zils, A, and F Pachet. 2001. β€œMusical Mosaicing.” In Proceedings of DAFx-01, 2:135. Limerick, Ireland.
Zinkevich, Martin. 2003. β€œOnline Convex Programming and Generalized Infinitesimal Gradient Ascent.” In Proceedings of the Twentieth International Conference on International Conference on Machine Learning, 928–35. ICML’03. Washington, DC, USA: AAAI Press.
