Importantly, competitors don't just get one crack at the problem; they can revise their submissions until a deadline, driving themselves and the community towards better solutions. "The level of accuracy increases, and they all tend to converge on the same solution," explains Anthony Goldbloom, Kaggle's co-founder and CEO.
Companies as varied as MasterCard, Pfizer, Allstate, and Facebook (not to mention NASA) have all created competitions. GE sponsored a contest to give airline pilots tools to make more efficient flight plans en route. Health technology company Practice Fusion funded another challenge to identify patients with Type 2 diabetes based on de-identified medical records.
Prizes for the winning solution have ranged from $3,000 to $250,000. A $3 million prize, offered by the Heritage Provider Network for the best prediction of which patients will be admitted to a hospital within the next year, based on historical claims data, closed last week, and the winner will be announced in June at the Health Datapalooza.
The key to Kaggle is the community: 85,000 data scientists (who knew there were that many data scientists in the world!) have entered competitions, and each is ranked according to their skill and results in competitions. Xavier Conort, a French actuary living in Singapore, holds the Number One spot (he's won 6 prizes and come in the top 10 percent a dozen times). As I'm writing this, Joshua Moskowitz, an American who joined 9 minutes ago, is at the other end of the pecking order. Just wait till Joshua starts competing, though; he could be challenging Xavier in a matter of months.
That everyone-has-a-chance ethos means that any competitor, no matter how isolated they may be, can judge their talents against the top rank of their field. What's more, in the company's forums competitors can swap techniques and hone their skills. Goldbloom says that a good programmer can work their way up the ladder fairly quickly, by scoring well in two or three competitions.
The really disruptive thing about Kaggle, though, comes through the company's new service, Kaggle Connect. Here, Kaggle acts as a matchmaker: customers with a specific problem can hire a data scientist well-suited to it. Candidates are drawn from the top tier of Kaggle participants: the top 1/2 of 1 percent, or about 500 data scientists.
Which means that now you can hire Xavier, or one of the other best data scientists in the world -- if you can afford them. Or, if you'd rather pay less, you can go down the tail to people less highly ranked, but still with the Kaggle seal of approval.
On one level, of course, Kaggle is just another spin on crowdsourcing, tapping the global brain to solve a big problem. That stuff has been around for a decade or more, at least back to Wikipedia (or further back, Linux, etc.). And companies like TaskRabbit and oDesk have thrown jobs to the crowd for several years. But I think Kaggle, and other online labor markets, represent more than that, and I'll offer two arguments. First, Kaggle doesn't incorporate work from all levels of proficiency, professionals to amateurs. Participants are experts, and they aren't working for benevolent reasons alone: they want to win, and they want to get better to improve their chances of winning next time. Second, Kaggle doesn't just create the incidental work product; it creates a new marketplace for work, a deeper disruption in a professional field. Unlike traditional temp labor, these aren't bottom-of-the-totem-pole jobs. Kagglers are on top. And that disruption is what will kill Joy's Law.