Here’s how I would do art with machine learning if I had to

I’ve a weakness for ideas that give me plausible deniability for making generative art while doing my maths homework.

NB I have recently tidied this page up but the content is not fresh; there is too much happening in this field to document.

Quasimondo: so do you.

This page is more chaotic than the already-chaotic median, sorry. Good luck making sense of it. The problem is that this notebook is in the anti-sweet spot of “stuff I know too much about to need notes but not working on enough to promote”.

Some neural networks are generative, in the sense that if you train ’em to classify things, they can also predict new members of the class. For example, run the model forwards and it recognizes melodies; run it “backwards” and it composes melodies. Or rather, maybe you trained them to generate examples in the course of training them to detect examples. There are many definitional and practical wrinkles, and this ability is not unique to artificial neural networks, but it is a great convenience, and the gods of machine learning have blessed us with much infrastructure to exploit this feature, because it is close to actual profitable algorithms. Upshot: there is now a lot of computation and grad student labour directed at producing neural networks which, as a byproduct, can produce faces, chairs, film dialogue, symphonies and so on.
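As a minimal sketch of what running a classifier “backwards” might mean, assume a toy softmax classifier with random weights standing in for a trained network (the names `W`, `class_probs` and `synthesize` are made up for illustration and come from no particular library). Hold the weights fixed and do gradient ascent on the input until the model is confident it sees the target class. Real systems use deep networks and a pile of regularisation tricks; this shows only the bare idea.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trained classifier: 3 classes, 16-dimensional inputs.
W = rng.normal(size=(3, 16))
b = np.zeros(3)


def class_probs(x):
    """Forward pass: softmax over a single linear layer."""
    logits = W @ x + b
    logits = logits - logits.max()  # subtract the max for numerical stability
    p = np.exp(logits)
    return p / p.sum()


def synthesize(target, steps=500, lr=0.05):
    """'Backward' use: gradient ascent on the input to raise log p(target | x)."""
    x = np.zeros(W.shape[1])
    for _ in range(steps):
        p = class_probs(x)
        # Gradient of log p[target] with respect to x for a linear softmax model:
        # W[target] minus the probability-weighted average of all rows of W.
        grad = W[target] - p @ W
        x = x + lr * grad
    return x


x_star = synthesize(target=2)
probs = class_probs(x_star)
print(probs)  # the entry for class 2 should dominate
```

The weights never change here; only the input does. Deep-dream-style image synthesis follows the same pattern, with an image in place of `x` and a convolutional network in place of the linear layer.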

Perhaps other people will be more across this?

Oh and also Google’s AMI channel, and ml4artists, which publishes sweet machine learning for artists topic guides.

There are NeurIPS streams about this now.

Visual synthesis

There is a lot going on, which I should triage. Important example: Maybe I should also do generative art with neural networks.

AI image editors

See image editing with AI.

Style transfer and deep dreaming

You can do style transfer a number of ways, including NN inversion and GANs.

See those classic images from Google’s tripped-out image recognition systems. Here’s a good explanation of what is going on.

  • deep art:

    Our mission is to provide a novel artistic painting tool that allows everyone to create and share artistic pictures with just a few clicks. All you need to do is upload a photo and choose your favorite style. Our servers will then render your artwork for you.

  • For example, OPEN_NSFW was my favourite (NSFW).

  • Differentiable Image Parameterizations looks at style transfer with respect to different decompositions of the image surface. (There is stuff to follow up about checkerboard artefacts in NNs which I suspect is generally important.)

  • Self-Organising Textures stitches these two together, using a VGG discriminator as a loss function for training textures.

  • Deep dream generator does the classic deep-dreaming style perturbations.

  • fast style transfer.
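
The neural style transfer approach of Gatys, Ecker, and Bethge (2015) rests on a Gram-matrix style loss: “style” is summarised by correlations between feature channels, ignoring where in the image those features occur. A minimal sketch, assuming random arrays as hypothetical stand-ins for convolutional feature maps (in practice these would come from a pretrained network such as VGG), and with a normalisation chosen for convenience rather than copied from any paper:

```python
import numpy as np


def gram_matrix(features):
    """Channel-by-channel correlations of a (channels, height, width) feature map."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (h * w)


def style_loss(features_a, features_b):
    """Mean squared difference between the two Gram matrices."""
    diff = gram_matrix(features_a) - gram_matrix(features_b)
    return float(np.mean(diff ** 2))


rng = np.random.default_rng(0)
style_feats = rng.normal(size=(8, 32, 32))  # placeholder for network activations
other_feats = rng.normal(size=(8, 32, 32))  # a second, unrelated placeholder

# Shuffle the spatial positions of the first feature map (same shuffle per channel).
perm = rng.permutation(32 * 32)
shuffled = style_feats.reshape(8, -1)[:, perm].reshape(8, 32, 32)

print(style_loss(style_feats, other_feats))  # positive: different channel statistics
print(style_loss(style_feats, shuffled))     # essentially zero: layout is ignored
```

Because the Gram matrix sums over spatial positions, shuffling those positions leaves it unchanged up to floating-point rounding, which is why this loss captures texture rather than composition. A style transfer optimiser would minimise this quantity, usually alongside a content term, with respect to the image being generated.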


Clever uses of GANs that are not style transfer.


Text synthesis

At one point, Ross Goodwin’s Adventures in narrated reality was state-of-the-art text generation using RNNs. He even made a movie of a script generated that way. But now, massive transformer models have left it behind technologically, if not humorously.


References

Boulanger-Lewandowski, Nicolas, Yoshua Bengio, and Pascal Vincent. 2012. “Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription.” In 29th International Conference on Machine Learning.
Bown, Oliver, and Sebastian Lexer. 2006. “Continuous-Time Recurrent Neural Networks for Generative and Interactive Musical Performance.” In Applications of Evolutionary Computing, edited by Franz Rothlauf, Jürgen Branke, Stefano Cagnoni, Ernesto Costa, Carlos Cotta, Rolf Drechsler, Evelyne Lutton, et al., 652–63. Lecture Notes in Computer Science 3907. Springer Berlin Heidelberg.
Briot, Jean-Pierre, and François Pachet. 2020. “Deep Learning for Music Generation: Challenges and Directions.” Neural Computing and Applications 32 (4): 981–93.
Carlier, Alexandre, Martin Danelljan, Alexandre Alahi, and Radu Timofte. 2020. “DeepSVG: A Hierarchical Generative Network for Vector Graphics Animation.” In.
Champandard, Alex J. 2016. “Semantic Style Transfer and Turning Two-Bit Doodles into Fine Artworks.” arXiv:1603.01768 [Cs], March.
Denton, Emily, Soumith Chintala, Arthur Szlam, and Rob Fergus. 2015. “Deep Generative Image Models Using a Laplacian Pyramid of Adversarial Networks.” arXiv:1506.05751 [Cs], June.
Dieleman, Sander, and Benjamin Schrauwen. 2014. “End to End Learning for Music Audio.” In 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 6964–68. IEEE.
Dosovitskiy, Alexey, Jost Tobias Springenberg, Maxim Tatarchenko, and Thomas Brox. 2014. “Learning to Generate Chairs, Tables and Cars with Convolutional Networks.” arXiv:1411.5928 [Cs], November.
Dumoulin, Vincent, Jonathon Shlens, and Manjunath Kudlur. 2016. “A Learned Representation For Artistic Style.” arXiv:1610.07629 [Cs], October.
Gatys, Leon A., Alexander S. Ecker, and Matthias Bethge. 2015. “A Neural Algorithm of Artistic Style.” arXiv:1508.06576 [Cs, q-Bio], August.
Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy. 2014. “Explaining and Harnessing Adversarial Examples.” arXiv:1412.6572 [Cs, Stat], December.
Goodfellow, Ian, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. “Generative Adversarial Nets.” In Advances in Neural Information Processing Systems 27, edited by Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, 2672–80. NIPS’14. Cambridge, MA, USA: Curran Associates, Inc.
Gregor, Karol, Ivo Danihelka, Alex Graves, Danilo Jimenez Rezende, and Daan Wierstra. 2015. “DRAW: A Recurrent Neural Network For Image Generation.” arXiv:1502.04623 [Cs], February.
Gregor, Karol, and Yann LeCun. 2010. “Learning Fast Approximations of Sparse Coding.” In Proceedings of the 27th International Conference on Machine Learning (ICML-10), 399–406.
———. 2011. “Efficient Learning of Sparse Invariant Representations.” arXiv:1105.5307 [Cs], May.
Grosse, Roger, Ruslan R. Salakhutdinov, William T. Freeman, and Joshua B. Tenenbaum. 2012. “Exploiting Compositionality to Explore a Large Space of Model Structures.” In Proceedings of the Conference on Uncertainty in Artificial Intelligence.
He, Kun, Yan Wang, and John Hopcroft. 2016. “A Powerful Generative Model Using Random Weights for the Deep Image Representation.” In Advances in Neural Information Processing Systems.
Hinton, Geoffrey E., and Ruslan R. Salakhutdinov. 2006. “Reducing the Dimensionality of Data with Neural Networks.” Science 313 (5786): 504–7.
Jetchev, Nikolay, Urs Bergmann, and Roland Vollgraf. 2016. “Texture Synthesis with Spatial Generative Adversarial Networks.” In Advances in Neural Information Processing Systems 29.
Jing, Yongcheng, Yezhou Yang, Zunlei Feng, Jingwen Ye, and Mingli Song. 2017. “Neural Style Transfer: A Review.” arXiv:1705.04058 [Cs], May.
Johnson, Justin, Alexandre Alahi, and Li Fei-Fei. 2016. “Perceptual Losses for Real-Time Style Transfer and Super-Resolution.” arXiv:1603.08155 [Cs], March.
Karras, Tero, Timo Aila, Samuli Laine, and Jaakko Lehtinen. 2017. “Progressive Growing of GANs for Improved Quality, Stability, and Variation.” In Proceedings of ICLR.
Karras, Tero, Samuli Laine, and Timo Aila. 2018. “A Style-Based Generator Architecture for Generative Adversarial Networks.” arXiv:1812.04948 [Cs, Stat], December.
Larsen, Anders Boesen Lindbo, Søren Kaae Sønderby, Hugo Larochelle, and Ole Winther. 2015. “Autoencoding Beyond Pixels Using a Learned Similarity Metric.” arXiv:1512.09300 [Cs, Stat], December.
Lazaridou, Angeliki, Dat Tien Nguyen, Raffaella Bernardi, and Marco Baroni. 2015. “Unveiling the Dreams of Word Embeddings: Towards Language-Driven Image Generation.” arXiv:1506.03500 [Cs], June.
Li, Yanghao, Naiyan Wang, Jiaying Liu, and Xiaodi Hou. 2017. “Demystifying Neural Style Transfer.” In IJCAI.
Luo, Yi, Zhuo Chen, John R. Hershey, Jonathan Le Roux, and Nima Mesgarani. 2016. “Deep Clustering and Conventional Networks for Music Separation: Stronger Together.” arXiv:1611.06265 [Cs, Stat], November.
Malmi, Eric, Pyry Takala, Hannu Toivonen, Tapani Raiko, and Aristides Gionis. 2016. “DopeLearning: A Computational Approach to Rap Lyrics Generation.” arXiv:1505.04771 [Cs], 195–204.
Mital, Parag K. 2017. “Time Domain Neural Audio Style Transfer.” arXiv:1711.11160 [Cs], November.
Mnih, Andriy, and Karol Gregor. 2014. “Neural Variational Inference and Learning in Belief Networks.” In Proceedings of The 31st International Conference on Machine Learning.
Mordvintsev, Alexander, Nicola Pezzotti, Ludwig Schubert, and Chris Olah. 2018. “Differentiable Image Parameterizations.” Distill 3 (7): e12.
Mordvintsev, Alexander, Ettore Randazzo, Eyvind Niklasson, and Michael Levin. 2020. “Growing Neural Cellular Automata.” Distill 5 (2): e23.
Niklasson, Eyvind, Alexander Mordvintsev, Ettore Randazzo, and Michael Levin. 2021. “Self-Organising Textures.” Distill 6 (2): e00027.003.
Olah, Chris, Alexander Mordvintsev, and Ludwig Schubert. 2017. “Feature Visualization.” Distill 2 (11): e7.
Oord, Aäron van den, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, and Koray Kavukcuoglu. 2016. “WaveNet: A Generative Model for Raw Audio.” In 9th ISCA Speech Synthesis Workshop.
Oord, Aäron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. 2016. “Pixel Recurrent Neural Networks.” arXiv:1601.06759 [Cs], January.
Oord, Aäron van den, Nal Kalchbrenner, Oriol Vinyals, Lasse Espeholt, Alex Graves, and Koray Kavukcuoglu. 2016. “Conditional Image Generation with PixelCNN Decoders.” arXiv:1606.05328 [Cs], June.
Sarroff, Andy M., and Michael Casey. 2014. “Musical Audio Synthesis Using Autoencoding Neural Nets.” In. Ann Arbor, MI: Michigan Publishing, University of Michigan Library.
Sigtia, Siddharth, Emmanouil Benetos, Nicolas Boulanger-Lewandowski, Tillman Weyde, Artur S. d’Avila Garcez, and Simon Dixon. 2015. “A Hybrid Recurrent Neural Network for Music Transcription.” In 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2061–65. IEEE.
Smith, Evan C., and Michael S. Lewicki. 2006. “Efficient Auditory Coding.” Nature 439 (7079): 978–82.
Stanley, Kenneth O. 2007. “Compositional Pattern Producing Networks: A Novel Abstraction of Development.” Genetic Programming and Evolvable Machines 8 (2): 131–62.
Sturm, Bob L., Oded Ben-Tal, Úna Monaghan, Nick Collins, Dorien Herremans, Elaine Chew, Gaëtan Hadjeres, Emmanuel Deruty, and François Pachet. 2018. “Machine Learning Research That Matters for Music Creation: A Case Study.” Journal of New Music Research 0 (0): 1–20.
Sun, Zheng, Jiaqi Liu, Zewang Zhang, Jingwen Chen, Zhao Huo, Ching Hua Lee, and Xiao Zhang. 2016. “Composing Music with Grammar Argumented Neural Networks and Note-Level Encoding.” arXiv:1611.05416 [Cs], November.
Theis, Lucas, and Matthias Bethge. 2015. “Generative Image Modeling Using Spatial LSTMs.” arXiv:1506.03478 [Cs, Stat], June.
Ulyanov, Dmitry, Andrea Vedaldi, and Victor Lempitsky. 2016. “Instance Normalization: The Missing Ingredient for Fast Stylization.” arXiv:1607.08022 [Cs], July.
———. 2017. “Improved Texture Networks: Maximizing Quality and Diversity in Feed-Forward Stylization and Texture Synthesis.” arXiv:1701.02096 [Cs], January.
Walder, Christian. 2016a. “Modelling Symbolic Music: Beyond the Piano Roll.” arXiv:1606.01368 [Cs], June.
———. 2016b. “Symbolic Music Data Version 1.0.” arXiv:1606.02542 [Cs], June.
Wu, Qi, Chunhua Shen, Anton van den Hengel, Lingqiao Liu, and Anthony Dick. 2015. “What Value High Level Concepts in Vision to Language Problems?” arXiv:1506.01144 [Cs], June.
Wyse, L. 2017. “Audio Spectrogram Representations for Processing with Convolutional Neural Networks.” In Proceedings of the First International Conference on Deep Learning and Music, Anchorage, US, May, 2017 (arXiv:1706.08675v1 [Cs.NE]).
Yu, D., and L. Deng. 2011. “Deep Learning and Its Applications to Signal and Information Processing [Exploratory DSP].” IEEE Signal Processing Magazine 28 (1): 145–54.
Yu, Haizi, and Lav R. Varshney. 2017. “Towards Deep Interpretability (MUS-ROVER II): Learning Hierarchical Representations of Tonal Music.” In Proceedings of International Conference on Learning Representations (ICLR) 2017.
Zhu, Jun-Yan, Philipp Krähenbühl, Eli Shechtman, and Alexei A. Efros. 2016. “Generative Visual Manipulation on the Natural Image Manifold.” In Proceedings of European Conference on Computer Vision.
Zukowski, Zack, and Cj Carr. 2017. “Generating Black Metal and Math Rock: Beyond Bach, Beethoven, and Beatles.” In 31st Conference on Neural Information Processing Systems (NIPS 2017).
