First, go visit willrobotstakemyjob.com. Will you lose your job to robots? A lot of articles and think pieces recently have touted the artificial intelligence (AI) revolution as a major job killer. And it probably will be...in a few decades. One of the most commonly studied AI systems is the neural network. In this post I want to demonstrate that, although neural networks are powerful, they are still a long way from replacing people.
Some brief background: all types of neural networks are, wait for it, composed of neurons. Like the neurons in our brains, these mathematical neurons are connected to each other. When we train the network, by showing it data and rating its performance, we teach it how to connect those neurons so that it gives us the output we want. It's like training a dog: the dog does not understand the words we are saying, but eventually it learns that if it rolls over, it gets a treat. This video goes into more depth if you are curious.[1]

Conventional neural networks take a fixed input, like a 128x128-pixel picture, and produce a fixed output, like a 1 if the picture is a dog and a 0 if it is not. A recurrent neural network (RNN) instead works sequentially, so it can analyze inputs of different sizes and produce varied outputs. For instance, an RNN can take a string of text and predict what the next letter should be, given the letters that preceded it. What is important to know is that they work sequentially, and that gives them POWER.

I originally heard about these powerful RNNs from a Computerphile video where they trained a neural network to write YouTube comments (even YouTube trolls will be supplanted by AI). The video directed me to Andrej Karpathy's "The Unreasonable Effectiveness of Recurrent Neural Networks". Karpathy is the director of AI at Tesla and STILL describes RNNs as magical. That is how great they are. His article was so inspiring that I wanted to train my very own RNN. Luckily for me, Karpathy had already published char-rnn, an RNN character-level language model [2]. Essentially, it takes a sequence of text and trains a computer program to predict which character comes next.

With most of the work of setting up the RNN already done, the only thing left was to decide what to train the model on. Karpathy's examples included Shakespeare, War and Peace, and Linux code. Obviously I wanted to try something unique, and because I'm a huge fucking nerd I chose to scrape the Star Trek: Deep Space 9 plot summaries and quotable quotes from the Star Trek wiki [3]. Ideally, the network would train on this corpus of text and generate interesting or funny plot mashups. However, after training the network on the DS9 plot summaries and quotes, I realized that there was not enough text to train the network well, and the output was not very coherent. The only logical thing to do was to gather more Star Trek content, namely the text from The Next Generation and Voyager episode wiki pages. With the new text, the training data set had a more respectable 1,310,922 words (still small by machine learning standards).

[Technical paragraph] The network itself was a Long Short-Term Memory (LSTM) network, a type of RNN. The network had 2 layers, each with 128 hidden neurons (these are all the default settings, by the way). It took ~24 hours to train the RNN. Normally, neural network scientists use specialized high-speed servers. I used my Surface Pro 3. My Surface was not happy about it. A rough sketch of a comparable setup is below.
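For the curious, here is a minimal sketch of what a comparable model could look like, assuming the Keras API that ships with TensorFlow. To be clear, this is not the char-rnn code I actually used [2]; the stand-in corpus, the sequence length, and the embedding layer are illustrative assumptions, not details from my setup.

```python
# A minimal sketch (NOT the char-rnn code from [2]): a 2-layer, 128-unit
# character-level LSTM built with the Keras API in TensorFlow.
import tensorflow as tf

# Stand-in corpus; in practice this would be the scraped wiki text.
text = "Space, the final frontier. These are the voyages of the starship Enterprise. "
chars = sorted(set(text))
char_to_idx = {c: i for i, c in enumerate(chars)}

seq_len = 50  # how many characters of context the model sees per example

def build_model(vocab_size, hidden_units=128):
    # Two stacked LSTM layers with 128 hidden neurons each (the defaults
    # I used), followed by a softmax over the character vocabulary.
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(seq_len,)),
        tf.keras.layers.Embedding(vocab_size, hidden_units),
        tf.keras.layers.LSTM(hidden_units, return_sequences=True),
        tf.keras.layers.LSTM(hidden_units),
        tf.keras.layers.Dense(vocab_size, activation="softmax"),
    ])

model = build_model(len(chars))
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# Training slides a window over the corpus: given seq_len characters,
# predict the single character that comes next.
```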
"Show us the results!" Fine. Here is some of the generated text:

"She says that they are on the station, but Seven asks what she put a protection that they do anything has thought they managen to the Ompjoran and Sisko reports to Janeway that he believes that the attack when a female day and agrees to a starship reason. But Sisko does not care about a planet, and Data are all as bad computer and the captain sounds version is in suspicions. But she sees an office in her advancement by several situation but the enemy realizes he had been redued and then the Borg has to kill him and they will be consoled"

Not exactly Infinite Jest, but almost all of those are real (Star Trek) words. It reads like a Star Trek mashup fever dream. Who are the Ompjoran? Why doesn't Sisko care about a planet? What is a starship reason? It all seems silly, but what is amazing about this output is that the RNN had to learn the English language completely from scratch. It learned commas, periods, capitalization, and that the Borg are murderous space aliens.

One variable that I can control is the "temperature" of the network output. This tells the RNN how much freedom it has in choosing the next character in a sequence. A high temperature allows for more variability in the results, while a temperature close to zero always chooses the most likely next character (there is a sketch of how this sampling works at the end of the post). The low-temperature setting leads to a boring infinite loop:

"the ship is a security officer and the ship is a security officer and the ship is a security officer and the ship is a security officer and the ship is a security officer"

Here is an example of some high-temperature shenanigans. Notice how, like a moody teen, it does whatever it wants:

"It is hoar blagk agable,. Captainck, yeve things he has O'Brien what she could soon be EMH 3 vitall I "Talarias)"

If you want to read more RNN-generated output, I have a 15,000-character document here. At one point it says "I want to die", which is pretty ominous. Seriously, check it out.

For future reference, Star Trek may actually be a bad training set. Many of the words in the show are made up, so the network can be justified in also making up words. Hopefully it is clear to all the Star Trek writers reading this that your jobs are safe from artificial intelligence. For the rest of you, your jobs are probably pretty safe too. For now.

William Riker [a human]: "You're a wise man, my friend."
Data [an android]: "Not yet, sir. But with your help, I am learning."

[1] If you are really curious about neural networks, this free online book is a good resource.
[2] I actually used a TensorFlow Python implementation of Karpathy's char-rnn code, found here.
[3] You can find my code and input files on GitHub here.
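As promised above, here is a minimal sketch of how temperature sampling works, assuming NumPy. The probability values and the sample_char helper are made up for illustration; they are not part of char-rnn.

```python
# A minimal sketch of temperature sampling over a predicted distribution.
# `probs` stands in for the RNN's predicted probabilities for the next character.
import numpy as np

def sample_char(probs, temperature=1.0):
    # Re-weight the predicted probabilities: temperatures below 1 sharpen the
    # distribution toward the most likely character (hence the boring loops),
    # temperatures above 1 flatten it so unlikely characters get picked more often.
    probs = np.asarray(probs, dtype=np.float64)
    logits = np.log(probs + 1e-10) / temperature
    rescaled = np.exp(logits)
    rescaled /= rescaled.sum()
    return np.random.choice(len(rescaled), p=rescaled)

# Toy example: four "characters" with these predicted probabilities.
probs = [0.6, 0.25, 0.1, 0.05]
print(sample_char(probs, temperature=0.1))   # almost always index 0
print(sample_char(probs, temperature=2.0))   # much more variety
```

Dividing the log-probabilities by the temperature before renormalizing is the whole trick: values below 1 make the most likely character even more likely, values above 1 let the network get weird.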