The Triumph of Artificial Intelligence! 16,000 Processors Can Identify a Cat in a YouTube Video Sometimes

In the quest for ever-smarter artificial intelligence, it's easy to let hype get ahead of performance.

What Google's neural network looks for in a cat face or a human body. #newaesthetic

Perhaps this is not precisely what Turing had in mind.

But in what is being hailed as a triumph in machine learning, Google researchers turned 16,000 processors loose on 10 million thumbnails from YouTube videos to see what they could (machine) learn. This is a vast data set that's an order of magnitude larger than what had been attempted before, according to The New York Times.

What they found was that even without humans training the computers to know certain objects ("This is a cat"), the machines were able to teach themselves the features of a cat face, as you can dimly see above, among many other objects. As one of the researchers told The Times, "[The system] basically invented the concept of a cat" by looking at all those photos and looking for patterns.

It's an impressive feat, but this is a field that moves slower than its hype (even though its achievements are very real and significant). If we look at the Google researchers' paper, we find that when their system is shown a random picture from a database of images, its accuracy is about 16 percent. That's a 70 percent relative improvement over the previous state of the art, but it's worth considering what that says about the state of the art.

Basically, there are two important curves at play in artificial intelligence today. One is the falling cost of computing, which the Times and most people note. But the other is the falling value of each additional piece of data you feed into the system. Sure, throwing more data at an algorithm makes it better, just like people know more words as they read more books. But as you go, the amount of data you need to make the algorithm better gets larger. To extend the metaphor: you have to read many more words to learn a new one.

Just look at this study: They increased the amount of data fed into the system by 900 percent and got a 70 percent increase in accuracy. (We need a name for this other curve.)
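To make that arithmetic concrete, here's a minimal back-of-the-envelope sketch. It assumes, purely for illustration (this is our assumption, not anything the paper claims), that accuracy grows with the logarithm of the training-set size, and fits that curve to the two numbers above: 10x the data, 1.7x the accuracy.

```python
import math

# The two data points implied by the article (normalized data scale).
old_data = 1.0          # previous state of the art
new_data = 10.0         # "an order of magnitude larger" = a 900% increase
new_acc = 0.16          # ~16% accuracy reported for the new system
old_acc = new_acc / 1.70  # ~9.4%, implied by "a 70 percent improvement"

# Fit accuracy = a + b * log10(data) through those two points.
# This log-shaped curve is an illustrative assumption, not a fitted law.
b = (new_acc - old_acc) / (math.log10(new_data) - math.log10(old_data))
a = old_acc

def projected_accuracy(data_scale):
    """Accuracy the assumed log trend predicts at a given data scale."""
    return a + b * math.log10(data_scale)

# If the trend held, ANOTHER tenfold jump in data would buy only the
# same absolute gain again: from ~16% to roughly 23%.
print(round(projected_accuracy(100.0), 3))  # ~0.226
```

The point of the sketch is the shape, not the exact numbers: each equal step in accuracy costs a multiplicative, not additive, increase in data.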

I've learned over the years that it's a terrible idea to bet against Moore's Law, but people expecting massive change from artificial intelligence need to be aware that there is a major diminishing-returns problem inherent in our current techniques.