Aarabi, Hadrien Foroughmand, and Geoffroy Peeters. 2018. “Music Retiler: Using NMF2D Source Separation for Audio Mosaicing.” In Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion, 27:1–7. AM’18. New York, NY, USA: ACM.
Bertin, N., R. Badeau, and E. Vincent. 2010. “Enforcing Harmonicity and Smoothness in Bayesian Non-Negative Matrix Factorization Applied to Polyphonic Music Transcription.” IEEE Transactions on Audio, Speech, and Language Processing 18 (3): 538–49.
Bitton, Adrien, Philippe Esling, and Axel Chemla-Romeu-Santos. 2018. “Modulated Variational Auto-Encoders for Many-to-Many Musical Timbre Transfer.”
Blaauw, Merlijn, and Jordi Bonada. 2017. “A Neural Parametric Singing Synthesizer.” arXiv:1704.03809 [Cs]
Buch, Michael, Elio Quinton, and Bob L Sturm. 2017. “NichtnegativeMatrixFaktorisierungnutzendesKlangsynthesenSystem (NiMFKS): Extensions of NMF-Based Concatenative Sound Synthesis.” In Proceedings of the 20th International Conference on Digital Audio Effects, 7. Edinburgh.
Caetano, Marcelo, and Xavier Rodet. 2013. “Musical Instrument Sound Morphing Guided by Perceptually Motivated Features.” IEEE Transactions on Audio, Speech, and Language Processing 21 (8): 1666–75.
Carr, C. J., and Zack Zukowski. 2018. “Generating Albums with SampleRNN to Imitate Metal, Rock, and Punk Bands.” arXiv:1811.06633 [Cs, Eess]
Chazan, Dan, and Ron Hoory. 2006. Feature-domain concatenative speech synthesis. United States US7035791B2, filed July 10, 2001, and issued April 25, 2006.
Coleman, Graham, and Jordi Bonada. 2008. “Sound Transformation by Descriptor Using an Analytic Domain.” In Proceedings of the 11th Int. Conference on Digital Audio Effects (DAFx-08), Espoo, Finland, September 1-4, 2008, 7.
Coleman, Graham, Jordi Bonada, and Esteban Maestre. 2011. “Adding Dynamic Smoothing to Mixture Mosaicing Synthesis.”
Coleman, Graham, Esteban Maestre, and Jordi Bonada. 2010. “Augmenting Sound Mosaicing with Descriptor-Driven Transformation.” In Proceedings of DAFx-10, 4.
Collins, Nick, and Bob L. Sturm. 2011. “Sound Cross-Synthesis and Morphing Using Dictionary-Based Methods.” In International Computer Music Conference.
Cont, Arshia, Shlomo Dubnov, and Gerard Assayag. 2007. “GUIDAGE: A Fast Audio Query Guided Assemblage.”
Dieleman, Sander, Aäron van den Oord, and Karen Simonyan. 2018. “The Challenge of Realistic Music Generation: Modelling Raw Audio at Scale.” In Advances In Neural Information Processing Systems.
Donahue, Chris, Julian McAuley, and Miller Puckette. 2019. “Adversarial Audio Synthesis.” In ICLR 2019.
Driedger, Jonathan, and Thomas Pratzlich. 2015. “Let It Bee – Towards NMF-Inspired Audio Mosaicing.” In Proceedings of ISMIR, 7. Malaga.
Dudley, Homer. 1955. “Fundamentals of Speech Synthesis.” Journal of the Audio Engineering Society 3 (4): 170–85.
———. 1964. “Thirty Years of Vocoder Research.” The Journal of the Acoustical Society of America 36 (5): 1021–21.
Elbaz, Dan, and Michael Zibulevsky. 2017. “Perceptual Audio Loss Function for Deep Learning.” In Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR’2017), Suzhou, China.
Engel, Jesse, Cinjon Resnick, Adam Roberts, Sander Dieleman, Douglas Eck, Karen Simonyan, and Mohammad Norouzi. 2017. “Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders.”
Godsill, Simon J, and Ali Taylan Cemgil. 2005. “Probabilistic Phase Vocoder and Its Application to Interpolation of Missing Values in Audio Signals.” In 2005 13th European Signal Processing Conference.
Goodwin, M., and M. Vetterli. 1997. “Atomic Decompositions of Audio Signals.” In 1997 IEEE ASSP Workshop on Applications of Signal Processing to Audio and Acoustics, 1997.
Hazel, Steven. 2001. “Soundmosaic.” Web Page.
———. n.d. “Feature-Based Synthesis: Mapping Acoustic and Perceptual Features onto Synthesis Parameters,” 4.
Hoffman, Matt, and Perry R. Cook. 2007. “Real-Time Feature-Based Synthesis for Live Musical Performance.” In Proceedings of the 7th International Conference on New Interfaces for Musical Expression, 309. New York, New York: ACM Press.
Hoffman, Matthew D, David M Blei, and Perry R Cook. 2010. “Bayesian Nonparametric Matrix Factorization for Recorded Music.” In International Conference on Machine Learning.
Hohmann, V. 2002. “Frequency Analysis and Synthesis Using a Gammatone Filterbank.” Acta Acustica United with Acustica 88 (3): 433–42.
Kersten, S., and P. Purwins. 2012. “Fire Texture Sound Re-Synthesis Using Sparse Decomposition and Noise Modelling.” In International Conference on Digital Audio Effects (DAFx12).
Lazier, Ari, and Perry Cook. 2003. “Mosievius: Feature Driven Interactive Audio Mosaicing,” 6.
Masri, Paul, Andrew Bateman, and Nishan Canagarajah. 1997a. “A Review of Time–Frequency Representations, with Application to Sound/Music Analysis–Resynthesis.” Organised Sound 2 (03): 193–205.
Mehri, Soroush, Kundan Kumar, Ishaan Gulrajani, Rithesh Kumar, Shubham Jain, Jose Sotelo, Aaron Courville, and Yoshua Bengio. 2017. “SampleRNN: An Unconditional End-to-End Neural Audio Generation Model.” In Proceedings of International Conference on Learning Representations (ICLR) 2017.
Mor, Noam, Lior Wolf, Adam Polyak, and Yaniv Taigman. 2018. “A Universal Music Translation Network.” arXiv:1805.07848 [Cs, Stat]
Morise, Masanori, Fumiya Yokomori, and Kenji Ozawa. 2016. “WORLD: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications.” IEICE Transactions on Information and Systems E99.D (7): 1877–84.
Müller, M., D.P.W. Ellis, A. Klapuri, and G. Richard. 2011. “Signal Processing for Music Analysis.” IEEE Journal of Selected Topics in Signal Processing 5 (6): 1088–1110.
O’Leary, Seán, and Axel Röbel. 2016. “A Montage Approach to Sound Texture Synthesis.” IEEE/ACM Trans. Audio, Speech and Lang. Proc. 24 (6): 1094–1105.
Pascual, Santiago, Joan Serrà, and Antonio Bonafonte. 2019. “Towards Generalized Speech Enhancement with Generative Adversarial Networks.” arXiv:1904.03418 [Cs, Eess]
Salamon, Justin, Joan Serrà, and Emilia Gómez. 2013. “Tonal Representations for Music Retrieval: From Version Identification to Query-by-Humming.” International Journal of Multimedia Information Retrieval 2 (1): 45–58.
Sarroff, Andy M., and Michael Casey. 2014. “Musical Audio Synthesis Using Autoencoding Neural Nets.” In. Ann Arbor, MI: Michigan Publishing, University of Michigan Library.
Schimbinschi, Florin, Christian Walder, Sarah Erfani, and James Bailey. 2018. “Synthnet: Learning Synthesizers End-to-End.”
Scholler, S., and H. Purwins. 2011. “Sparse Approximations for Drum Sound Classification.” IEEE Journal of Selected Topics in Signal Processing 5 (5): 933–40.
Schwarz, Diemo. 2005. “Current Research in Concatenative Sound Synthesis.” In International Computer Music Conference (ICMC), 1–1. Barcelona, Spain.
———. 2011. “State of the Art in Sound Texture Synthesis.” In Proceedings of DAFx-11.
Simon, Ian, Sumit Basu, David Salesin, and Maneesh Agrawala. 2005. “Audio Analogies: Creating New Music from an Existing Performance by Concatenative Synthesis.” In Proceedings of the 2005 International Computer Music Conference.
Smaragdis, P., and J. C. Brown. 2003. “Non-Negative Matrix Factorization for Polyphonic Music Transcription.” In Applications of Signal Processing to Audio and Acoustics, 2003 IEEE Workshop on.
Sturm, Bob L., Laurent Daudet, and Curtis Roads. 2006. “Pitch-Shifting Audio Signals Using Sparse Atomic Approximations.” In Proceedings of the 1st ACM Workshop on Audio and Music Computing Multimedia, 45–52. AMCMM ’06. New York, NY, USA: ACM.
Sturm, Bob L., Curtis Roads, Aaron McLeran, and John J. Shynk. 2009. “Analysis, Visualization, and Transformation of Audio Signals Using Dictionary-Based Methods.” Journal of New Music Research 38 (4): 325–41.
Su, Shih-Yang, Cheng-Kai Chiu, Li Su, and Yi-Hsuan Yang. 2017. “Automatic Conversion of Pop Music into Chiptunes for 8-Bit Pixel Art.”
Tenenbaum, J. B., and W. T. Freeman. 2000. “Separating Style and Content with Bilinear Models.” Neural Computation 12 (6): 1247–83.
Turner, Richard E., and Maneesh Sahani. 2014. “Time-Frequency Analysis as Probabilistic Inference.” IEEE Transactions on Signal Processing 62 (23): 6171–83.
Uhrenholt, Anders Kirk, and Bjøern Sand Jensen. 2019. “Efficient Bayesian Optimization for Target Vector Estimation.” In The 22nd International Conference on Artificial Intelligence and Statistics, 2661–70. PMLR.
Vasquez, Sean, and Mike Lewis. 2019. “MelNet: A Generative Model for Audio in the Frequency Domain.” arXiv:1906.01083 [Cs, Eess, Stat]
Verhelst, Werner, and Marc Roelands. 1993. “An Overlap-Add Technique Based on Waveform Similarity (WSOLA) for High Quality Time-Scale Modification of Speech.” In Proceedings of ICASSP, 554–57. ICASSP’93. Washington, DC, USA: IEEE Computer Society.
Verma, Prateek, and Julius O. Smith. 2018. “Neural Style Transfer for Audio Spectograms.” In 31st Conference on Neural Information Processing Systems (NIPS 2017).
Verma, T.S., and T.H.Y. Meng. 1998. “An Analysis/Synthesis Tool for Transient Signals That Allows a Flexible Sines+transients+noise Model for Audio.” In Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP ’98 (Cat. No.98CH36181), 6:3573–76. Seattle, WA, USA: IEEE.
———. 1999. “Sinusoidal Modeling Using Frame-Based Perceptually Weighted Matching Pursuits.” In 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258), 981–984 vol.2. Phoenix, AZ, USA: IEEE.
Vincent, E., N. Bertin, and R. Badeau. 2008. “Harmonic and Inharmonic Nonnegative Matrix Factorization for Polyphonic Pitch Transcription.” In 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
Wager, S., L. Chen, M. Kim, and C. Raphael. 2017. “Towards Expressive Instrument Synthesis Through Smooth Frame-by-Frame Reconstruction: From String to Woodwind.” In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Wyse, L. 2017. “Audio Spectrogram Representations for Processing with Convolutional Neural Networks.” In Proceedings of the First International Conference on Deep Learning and Music, Anchorage, US, May, 2017 (arXiv:1706.08675v1 [Cs.NE]).
Zhou, Cong, Michael Horgan, Vivek Kumar, Cristina Vasco, and Dan Darcy. 2018. “Voice Conversion with Conditional SampleRNN.” arXiv:1808.08311 [Cs, Eess]
Zils, A, and F Pachet. 2001. “Musical Mosaicing.” In Proceedings of DAFx-01, 2:135. Limerick, Ireland.