How Kaggle Is Changing How We Work

Like it or not, hyper-efficient job markets are on their way.

The job fair of the future (Facebook).

The technology industry loves its laws. There's Moore's Law of processing power, Metcalfe's Law of networks, and Gilder's Law about bandwidth.

And then there's Joy's Law, a more obscure truism named after Sun Microsystems co-founder Bill Joy. "No matter who you are," Joy is said to have said, "most of the smartest people work for somebody else."

For decades this koan about the scarcity of labor and expertise has held true, because of both economics and geography. The problem isn't merely the cost of hiring the smartest people, but the other frictions in the labor market that prevent companies from finding the right people, knowledge- and skills-wise.

But Joy's Law may not be sacred for much longer. A new wave of startups is bringing innovation to several labor markets, making the smartest people in the world available to, and employable by, anybody (for a price). There's no better example of this than Kaggle.

Founded in 2010, Kaggle is an online platform for data-mining and predictive-modeling competitions. A company arranges with Kaggle to post a dump of data with a proposed problem, and the site's community of computer scientists and mathematicians -- known these days as data scientists -- takes on the task, posting proposed solutions.

Importantly, competitors don't just get one crack at the problem; they can revise their submissions until a deadline, driving themselves and the community towards better solutions. "The level of accuracy increases, and they all tend to converge on the same solution," explains Anthony Goldbloom, Kaggle's co-founder and CEO.

Companies as varied as MasterCard, Pfizer, Allstate, and Facebook (not to mention NASA) have all created competitions. GE sponsored a contest to give airline pilots tools to make more efficient flight plans en route. Health technology company Practice Fusion funded another challenge to identify patients with Type 2 diabetes based on de-identified medical records.

Prizes for the winning solution have ranged from $3,000 to $250,000. A $3 million prize, offered by the Heritage Provider Network for the best prediction of which patients will be admitted to a hospital within the next year, based on historical claims data, closed last week, and the winner will be announced in June at the Health Datapalooza.

The key to Kaggle is the community: 85,000 data scientists (who knew there were that many data scientists in the world!) have entered competitions, and each is ranked according to their skill and results in competitions. Xavier Conort, a French actuary living in Singapore, holds the Number One spot (he's won 6 prizes and come in the top 10 percent a dozen times). As I'm writing this, Joshua Moskowitz, an American who joined 9 minutes ago, is at the other end of the pecking order. Just wait till Joshua starts competing, though; he could be challenging Xavier in a matter of months.

That everyone-has-a-chance ethos means that any competitor, no matter how isolated they may be, can judge their talents against the top rank of their field. What's more, in the company's forums competitors can swap techniques and hone their skills. Goldbloom says that a good programmer can work their way up the ladder fairly quickly, by scoring well in two or three competitions.

The really disruptive thing about Kaggle, though, comes through the company's new service, Kaggle Connect. Here, Kaggle acts as a match-maker, where customers with a specific problem can hire a specific data scientist well-suited to their problem; candidates are drawn from the top tier of Kaggle participants: the top 1/2 of 1 percent, or about 500 data scientists.

Which means that now you can hire Xavier, or one of the other best data scientists in the world -- if you can afford them. Or, if you'd rather pay less, you can go down the tail to people less highly ranked, but still with the Kaggle seal of approval.

On one level, of course, Kaggle is just another spin on crowdsourcing, tapping the global brain to solve a big problem. That stuff has been around for a decade or more, at least back to Wikipedia (or farther back, Linux, etc.). And companies like TaskRabbit and oDesk have thrown jobs to the crowd for several years. But I think Kaggle, and other online labor markets, represent more than that, and I'll offer two arguments. First, Kaggle doesn't incorporate work from all levels of proficiency, professionals to amateurs. Participants are experts, and they aren't working for benevolent reasons alone: they want to win, and they want to get better to improve their chances of winning next time. Second, Kaggle doesn't just create the incidental work product; it creates a new marketplace for work, a deeper disruption in a professional field. Unlike traditional temp labor, these aren't bottom-of-the-totem-pole jobs. Kagglers are on top. And that disruption is what will kill Joy's Law.

Because here's the thing: the Kaggle ranking has become an essential metric in the world of data science. Employers like American Express and the New York Times have begun listing a Kaggle rank as an essential qualification in their help wanted ads for data scientists. It's not just a merit badge for the coders; it's a more significant, more valuable indicator of capability than our traditional benchmarks for proficiency or expertise. In other words, your Ivy League diploma and IBM resume don't matter so much as my Kaggle score. It's flipping the resume: your work is measurable and metricized, and your value in the marketplace matters more than the place you work.

"We're solving a market failure," says Goldbloom. "People were using really poor proxies" for skills and credentials. That's the big shift here. Kaggle represents a new sort of labor market, one where skills have been decoupled from credentials. (Tom Friedman, among others, has been beating on this drum of late.)

Obviously, data science and computer code are particularly well suited to such a market. The work is digital, and the product is easily measured in both quality and efficiency. But this doesn't mean that other fields won't follow. It was the same way with open source software and other easily digitized fields -- at first it seems like the model only works with code. But then the model starts to get adopted and adapted by other industries that figure out how to inject the same magic into their fields.

You don't have to look hard to find other new companies that are building similarly disruptive labor marketplaces: 99Designs has created a contest-based community of designers, and has paid $51 million to designers for their winning contributions at a current rate of $1.8 million per month. And HealthTap has created a community of 30,000 doctors who are using their spare cycles to answer patients' healthcare questions, and are scored on the quality of their contributions. Founder Ron Gutman calls HealthTap an "arbitrage market for physicians" that connects a market need for expert healthcare advice with the suddenly available commodity of doctors with a few minutes on their hands. The company has even started a ClubMD for top-ranked doctors that comes with special posting privileges.

Though he was skeptical at first, Goldbloom conceded that even professions that don't seem particularly quantitative might be similarly arbitraged. Take lawyers, for example -- how would you rate them? But after a moment's thought, Goldbloom had cracked it: you could pretty easily rank trial lawyers by their courtroom victories, or personal injury attorneys by their settlement amounts. And soon, it became clear that nearly every profession has some sort of metric for success, not just in terms of outcomes (which can measure success) but also in terms of process (which can measure efficiency).

And suddenly, Joy's Law doesn't seem so sacred any more.