Learning To See [WIP] (2017)

Work-in-progress R&D and sketches (in reverse chronological order)


Learning to dream

#LearningToSee #LearningToDream – aka the Art-o-Matic 4000 Turbo XL. A study of human creativity & Art through the eyes of a #DeepNeuralNetwork.
(A deep neural network training on a massive dataset of paintings and sketches, learning to dream. One PC training, this PC dreaming.)


Deep neural network trying to make sense of the world

A deep neural network (a pix2pix generative adversarial network) making predictions on live webcam input, trying to make sense of what it sees. It sees only what it knows. (Source code and model for v1 are on GitHub. P.S. This is not ‘style transfer’!)
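For the technically curious, here is a rough sketch of the setup – an illustration under assumptions, not the released v1 code: webcam frames fed through a pre-trained pix2pix generator in PyTorch. The model file name ‘pix2pix_generator.pt’ and the 256×256 input resolution are hypothetical placeholders.

```python
# A minimal sketch (NOT the released v1 code): running a pre-trained
# pix2pix generator on live webcam frames.
import cv2
import torch

generator = torch.jit.load("pix2pix_generator.pt").eval()  # hypothetical export

cap = cv2.VideoCapture(0)  # default webcam
with torch.no_grad():
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # Resize to the network's input size and map pixels to [-1, 1],
        # the range pix2pix generators are typically trained on.
        # (OpenCV frames are BGR; convert to RGB if the model expects it.)
        frame = cv2.resize(frame, (256, 256))
        x = torch.from_numpy(frame).float().permute(2, 0, 1) / 127.5 - 1.0
        y = generator(x.unsqueeze(0))[0]  # the network's 'prediction'
        out = ((y.permute(1, 2, 0) + 1.0) * 127.5).clamp(0, 255)
        cv2.imshow("learning to see", out.byte().contiguous().numpy())
        if cv2.waitKey(1) & 0xFF == 27:  # Esc to quit
            break
cap.release()
```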

v2

… trained on images from the Hubble Space Telescope:


… trained on a massive dataset of Art:


v1



Deep Neural Network ‘Learning To See’

The process of ‘learning’ visualised. A deep neural network (Deep Convolutional Generative Adversarial Network – DCGAN) ‘Learning To See’. Each frame is the result of the network ‘learning’, and then ‘dreaming’ – re-imagining, re-evaluating, and reconstructing everything that it knows.
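To make the learn-then-dream loop concrete, here is a minimal sketch (an illustration under assumptions, not the actual project code): a tiny 32×32 DCGAN in PyTorch that, every few training steps, decodes the same fixed latent vectors into an image – one frame of the ‘dream’. The dataset folder ‘paintings/’ and all hyperparameters are placeholders.

```python
# A compact 32x32 DCGAN: each saved frame shows the same fixed latents
# decoded by the generator as it learns - 'dreaming' while training.
import torch
import torch.nn as nn
from torchvision import datasets, transforms
from torchvision.utils import save_image

G = nn.Sequential(  # generator: latent vector -> 32x32 image
    nn.ConvTranspose2d(100, 256, 4, 1, 0), nn.BatchNorm2d(256), nn.ReLU(True),
    nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(True),
    nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(True),
    nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh())
D = nn.Sequential(  # discriminator: image -> real/fake logit
    nn.Conv2d(3, 64, 4, 2, 1), nn.LeakyReLU(0.2, True),
    nn.Conv2d(64, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.LeakyReLU(0.2, True),
    nn.Conv2d(128, 256, 4, 2, 1), nn.BatchNorm2d(256), nn.LeakyReLU(0.2, True),
    nn.Conv2d(256, 1, 4, 1, 0))

bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))

tfm = transforms.Compose([
    transforms.Resize(32), transforms.CenterCrop(32), transforms.ToTensor(),
    transforms.Normalize([0.5] * 3, [0.5] * 3)])  # scale images to [-1, 1]
loader = torch.utils.data.DataLoader(
    datasets.ImageFolder("paintings/", tfm),  # hypothetical dataset folder
    batch_size=64, shuffle=True)

z_fixed = torch.randn(16, 100, 1, 1)  # fixed seeds: the same 'dream', evolving

for step, (real, _) in enumerate(loader):
    # learn: the discriminator separates real from generated images ...
    fake = G(torch.randn(real.size(0), 100, 1, 1))
    d_real, d_fake = D(real).flatten(), D(fake.detach()).flatten()
    d_loss = bce(d_real, torch.ones_like(d_real)) + \
             bce(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # ... while the generator learns to fool it
    g_out = D(fake).flatten()
    g_loss = bce(g_out, torch.ones_like(g_out))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    # dream: re-decode the same latents with the freshly updated weights
    if step % 10 == 0:
        with torch.no_grad():
            save_image(G(z_fixed), f"frame_{step:06d}.png", normalize=True)
```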

… training on images from NASA’s Astronomy Picture of the Day:


… training on a massive dataset of Art:


… training on images of Donald Trump, Theresa May, Nigel Farage, Marine Le Pen and Recep Tayyip Erdogan, scraped from the web:

(P.S. this is what happens when you have a dirty dataset).


‘Hallucinations’ from the networks trained above

(after training for a while).

Hubble / NASA’s Astronomy Picture of the Day:


Art:


‘Dirty’ (politics) dataset:



Live ‘Learning To See’

A deep neural network ‘Learning To See’ from live camera input.

(Layer activations as it trains on Flickr images)



Training data I used for a number of the studies, scraped from the Google Art Project – the new purveyor of Art & Culture.

Background

Originally inspired by the neural networks of our own brains, Deep Learning Artificial Intelligence algorithms have been around for decades, but they have recently seen a huge rise in popularity. This is often attributed to recent increases in computing power and the availability of large training datasets. However, progress is undeniably fuelled by the multi-billion-dollar investments from the purveyors of mass surveillance: internet companies whose business models rely on targeted, psychographic advertising; and government organisations and their War on Terror. Their aim is the automation of *Understanding* Big Data: understanding text, images and sounds. But what does it mean to ‘understand’? What does it mean to ‘learn’ or to ‘see’? How do *we* make meaning in the world we see, and communicate our thoughts, our internal state?

“Learning To See” is an ongoing series of works that use state-of-the-art Machine Learning algorithms as a means to reflect on ourselves and how we make sense of the world. Everything that we see and learn is filtered and shaped by our prior knowledge and beliefs.

This work is part of a broader line of inquiry into self-affirming cognitive biases, our unconscious tendency to see only what we would like to see, and the resulting social polarisation.


Presented as a multi-screen installation with a number of machines receiving signals from surveillance cameras, various stages of the ‘learning’ process are visualised and explored. These include a machine training in real-time on the incoming signals from surveillance cameras, trying to ‘understand’ what it is seeing; a machine with an already pre-trained neural network trying to interpret and ‘re-imagine’ the signals with respect to what it already knows; and another machine with a pre-trained neural network ‘dreaming’.

Starting from ‘random white noise’, the machines shape and bend the input based on what they see, or have seen before (during training). They produce abstract images reminiscent of their training data – not replicating the training images, but creating novel images that carry certain characteristics and qualities of the training set: statistically similar, yet new.

And then the viewer completes the picture. Based on their own upbringing, education, knowledge and experience, the viewer recognises some of these characteristics and qualities in the new images, and projects meaning back onto them.

A number of different datasets were used for the pre-training, such as images scraped from the Google Art Project, containing scans from art collections and museums on every continent except Antarctica. These include tens of thousands of paintings, sketches and photographs: landscapes, portraits, religious imagery, pastoral scenes, maritime scenes, scientific illustrations, prehistoric cave paintings, realist paintings, abstract and cubist works, and more.


The work examines the process of learning, and the process of understanding. In many of these explorations, the deep neural network starts off not having been trained on anything. It starts off completely blank*. It is literally ‘opening its eyes’ for the first time and trying to ‘understand’ what it sees. In this case ‘understanding’ means trying to find patterns and regularities in what it’s seeing, with respect to everything that it has seen so far, so that it can efficiently compress and organise incoming information in the context of its past experience. It’s trying to deconstruct the incoming signal, and reconstruct it using features that it has learnt based on what it has already seen – which, at the beginning, is nothing.

When the network receives new information that is unfamiliar, or perhaps just seen from a new angle that it has not yet encountered, it’s unable to make sense of that new information. It’s unable to find an internal representation relating it to past experience; its compressor fails to successfully deconstruct and reconstruct. But the network is training in realtime; it’s constantly learning, updating its ‘filters’ and ‘weights’ to try and *improve its compressor*, to find more efficient internal representations, to build a more ‘universal world-view’ upon which it can hope to reconstruct future experiences.

Unfortunately though, the network also ‘forgets’. When too much new information comes in and it doesn’t re-encounter past experiences, it slowly loses the filters and representations required to reconstruct those past experiences.
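To make the ‘compressor’ framing concrete, here is a small illustrative sketch (mine, not the installation’s code): a tiny convolutional autoencoder training online, frame by frame, on a live camera. Reconstruction error is high for unfamiliar input and falls as the network learns; the error on an early frame that is never re-seen tends to creep back up as the network ‘forgets’. All names, sizes and intervals are assumptions.

```python
# An online 'compressor' sketch: a small convolutional autoencoder trained
# frame by frame on live camera input, with forgetting made measurable.
import cv2
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, 4, 2, 1), nn.ReLU(),            # encoder: deconstruct
    nn.Conv2d(16, 32, 4, 2, 1), nn.ReLU(),
    nn.ConvTranspose2d(32, 16, 4, 2, 1), nn.ReLU(),  # decoder: reconstruct
    nn.ConvTranspose2d(16, 3, 4, 2, 1), nn.Sigmoid())
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

def to_tensor(frame):
    frame = cv2.resize(frame, (128, 128))
    return torch.from_numpy(frame).float().div(255).permute(2, 0, 1).unsqueeze(0)

cap = cv2.VideoCapture(0)
first = None  # an early 'past experience' that is never re-seen
for step in range(10000):
    ok, frame = cap.read()
    if not ok:
        break
    x = to_tensor(frame)
    if first is None:
        first = x
    loss = nn.functional.mse_loss(model(x), x)  # how well does the compressor fit?
    opt.zero_grad(); loss.backward(); opt.step()  # learn in realtime
    if step % 100 == 0:
        with torch.no_grad():  # rising old-frame error = 'forgetting'
            old = nn.functional.mse_loss(model(first), first)
        print(f"step {step}: current err {loss.item():.4f}, "
              f"old-frame err {old.item():.4f}")
cap.release()
```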

These are not behaviours which I have explicitly programmed into the system; they are characteristic properties of deep neural networks that I’m exploiting and exploring.

* One might liken this to a newborn baby’s brain. However, this comparison is not entirely accurate. A newborn baby’s brain has had hundreds of millions of years of evolution shaping its neural wiring, and arguably the baby is born with many synaptic connections already in place. Here, the network ‘starts life’ with its full architecture intact, but all connections initialised randomly. So the comparison may work metaphorically at a high level, but at a lower level the details are a bit different.