Seems like it should be easy, until you think about it.
Artificial intelligence (Ai) is an especially disruptive technology, impacting a growing number of domains in ways both beneficial and detrimental. It is even showing surprising impacts in the Arts, provoking questions fundamental to philosophy, law, and engineering, not to mention practices in the Arts themselves. MUSAiC is an interdisciplinary research venture confronting questions and challenges at the frontier of the AI disruption of music.
A tutorial on generating music using Restricted Boltzmann Machines for the conditional random field density and an RNN for the time dependence, following (Boulanger-Lewandowski, Bengio, and Vincent 2012).
Modeling polyphonic music is a particularly challenging task because of the intricate interplay between melody and harmony. A good model should satisfy three requirements: statistical accuracy (capturing faithfully the statistics of correlations at various ranges, horizontally and vertically), flexibility (coping with arbitrary user constraints), and generalization capacity (inventing new material, while staying in the style of the training corpus). Models proposed so far fail on at least one of these requirements. We propose a statistical model of polyphonic music, based on the maximum entropy principle. This model is able to learn and reproduce pairwise statistics between neighboring note events in a given corpus. The model is also able to invent new chords and to harmonize unknown melodies. We evaluate the invention capacity of the model by assessing the amount of cited, re-discovered, and invented chords on a corpus of Bach chorales. We discuss how the model enables the user to specify and enforce user-defined constraints, which makes it useful for style-based, interactive music generation.
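To make "pairwise statistics between neighboring note events" concrete, here is a minimal sketch that only counts such statistics in a toy four-voice piece. It does not fit the maximum entropy model itself. The data layout (one tuple of MIDI pitches per time step), the function names, and the toy piece are all assumptions made for illustration, not the authors' representation.

```python
from collections import Counter
from itertools import combinations


def pairwise_statistics(piece):
    """Count pairwise note statistics in a piece.

    `piece` is a list of time steps; each time step is a tuple of MIDI
    pitches, one per voice (a hypothetical layout for this sketch).
    """
    vertical = Counter()    # pairs of voices sounding at the same time step
    horizontal = Counter()  # the same voice at consecutive time steps
    for chord in piece:
        for (i, a), (j, b) in combinations(enumerate(chord), 2):
            vertical[(i, j, a, b)] += 1
    for prev, curr in zip(piece, piece[1:]):
        for voice, (a, b) in enumerate(zip(prev, curr)):
            horizontal[(voice, a, b)] += 1
    return vertical, horizontal


def normalise(counter):
    """Turn raw counts into empirical frequencies."""
    total = sum(counter.values())
    return {key: count / total for key, count in counter.items()}


# Toy data, voices ordered soprano to bass; not taken from any real chorale.
toy_piece = [
    (72, 67, 64, 48),
    (72, 65, 60, 53),
    (71, 67, 62, 55),
    (72, 67, 64, 48),
]

vertical, horizontal = pairwise_statistics(toy_piece)
vertical_freq = normalise(vertical)
horizontal_freq = normalise(horizontal)
```

A maximum entropy model of the kind described in the abstract would then be fitted so that its own pairwise marginals match frequencies like these, while otherwise staying as unconstrained as possible.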
Evan Chow represents for team non-deep-learning with jazzml:
Computer jazz improvisation powered by machine learning, specifically trigram modeling, K-Means clustering, and chord inference with SVMs.
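For a feel of what trigram modeling means here, the following is a rough sketch, not jazzml's code: it counts which note follows each pair of preceding notes and then samples from those counts. The note names, the toy licks, and the helper names `train_trigrams` and `generate` are hypothetical.

```python
import random
from collections import Counter, defaultdict


def train_trigrams(sequences):
    """Map each pair of consecutive notes to a Counter of the notes that follow."""
    model = defaultdict(Counter)
    for seq in sequences:
        for a, b, c in zip(seq, seq[1:], seq[2:]):
            model[(a, b)][c] += 1
    return model


def generate(model, seed, length, rng):
    """Extend a two-note seed by sampling each next note given the last two."""
    out = list(seed)
    while len(out) < length:
        options = model.get((out[-2], out[-1]))
        if not options:
            break  # this context never occurred in training; stop early
        notes, weights = zip(*options.items())
        out.append(rng.choices(notes, weights=weights)[0])
    return out


# Toy training data, invented for this sketch.
toy_licks = [
    ["C4", "E4", "G4", "Bb4", "A4", "G4", "E4", "C4"],
    ["C4", "E4", "G4", "A4", "G4", "E4", "D4", "C4"],
]

model = train_trigrams(toy_licks)
melody = generate(model, seed=("C4", "E4"), length=8, rng=random.Random(0))
print(melody)
```

Because the model only ever emits a note it has seen after the same two-note context, every three-note window of the output also occurs in the training licks. That keeps the output in style, and it is also the main limit on how inventive a plain trigram model can be.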
Charles Martin’s Creative Prediction:
Creative Prediction is about applying predictive machine learning models to creative data. The focus is on recurrent neural networks (RNNs), deep learning models that can be used to generate sequential and temporal data. RNNs can be applied to many kinds of creative data including text and music. They can learn the long-range structure from a corpus of data and “create” new sequences by predicting one element at a time. When embedded in a creative interface, they can be used for “predictive interaction” where a human collaborates with, influences, and is influenced by a generative neural network.
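The "predicting one element at a time" loop can be sketched in a few lines. The toy network below is an assumption-laden illustration: it is a tiny Elman-style RNN in plain Python with random, untrained weights, so its output is musically meaningless. The vocabulary, hidden size, and function names are hypothetical; the point is only the shape of the sampling loop, where each sampled note is fed back in as the next input.

```python
import math
import random

VOCAB = ["C", "D", "E", "G", "A"]  # hypothetical toy note vocabulary
HIDDEN = 8                          # hypothetical hidden-state size

rng = random.Random(0)


def rand_matrix(rows, cols):
    """Random weights standing in for parameters a real model would learn."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


W_xh = rand_matrix(HIDDEN, len(VOCAB))  # input -> hidden
W_hh = rand_matrix(HIDDEN, HIDDEN)      # hidden -> hidden (the recurrence)
W_hy = rand_matrix(len(VOCAB), HIDDEN)  # hidden -> output logits


def step(token_index, h):
    """Advance the RNN one step; return the new state and next-token probabilities."""
    # A one-hot input simply selects one column of W_xh.
    new_h = [
        math.tanh(W_xh[i][token_index]
                  + sum(W_hh[i][j] * h[j] for j in range(HIDDEN)))
        for i in range(HIDDEN)
    ]
    logits = [sum(W_hy[k][i] * new_h[i] for i in range(HIDDEN))
              for k in range(len(VOCAB))]
    peak = max(logits)  # subtract the max for a numerically stable softmax
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return new_h, [e / total for e in exps]


def sample_sequence(start, length):
    """Generate notes autoregressively, feeding each sample back as input."""
    h = [0.0] * HIDDEN
    token = VOCAB.index(start)
    out = [start]
    for _ in range(length - 1):
        h, probs = step(token, h)
        token = rng.choices(range(len(VOCAB)), weights=probs)[0]
        out.append(VOCAB[token])
    return out


sequence = sample_sequence("C", 16)
print(sequence)
```

In an interactive setting, the human's input could replace the sampled `token` at any step, which is one simple way to read "predictive interaction": the hidden state carries the shared history and the network proposes what comes next.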
Daniel Johnson has a convolutional and recurrent architecture for taking into account multiple types of dependency in music, which he calls a biaxial neural network. See Zhe LI, Composing Music With Recurrent Neural Networks.
Boulanger-Lewandowski has code and data for the recurrent neural network composition of (Boulanger-Lewandowski, Bengio, and Vincent 2012), using Python/Theano. Christian Walder leads a project which shares some roots with that (Walder 2016a, 2016b). Bob Sturm’s FolkRNN does a related thing, but ingeniously redefines the problem by focussing on folk tune notation.