In today’s world, everything’s become zeroes and ones. We've developed the technology to collect data about everything in the world around us and now we're left with the question: What do we do with it?
We recently had a conversation with Mike Olson, the co-founder and chief strategy officer of Cloudera, an analytics software startup that recently raised $900 million in its pursuit of answering that question.
Olson began his career as a student at UC Berkeley, where he developed his first software startup, which was eventually sold to Oracle in 2006. Two years later, he co-founded Cloudera to focus on finding the meaning within the wealth of data insights at our fingertips.
In our conversation, he spoke of the varied promises of big data, from its potential to address global hunger to its ability to help us become happier shoppers, and what underpinned Cloudera's success.
What follows is a lightly edited and condensed version of our conversation.
You hear the words “big data” thrown around a lot these days. Companies like Cloudera and platforms like Apache Hadoop have created a multibillion-dollar industry around it. Can you explain how that's possible and why it's so important?
The first thing I’ll say is there’s a really important thing happening in our world right now. Data is getting created from more places, in more ways, and in much bigger volume than ever before. We've instrumented much of the world. We've got sensors on lots of things that didn't use to have sensors on them. In order to work with that data, you need to be able to catch it all. You need to be able to organize it all. You need to be able to process and analyze it all.
And I’m a database guy. Through my whole career, I've been building and selling database systems. The volumes of data today are just totally out of whack, compared to the volumes that we were creating when I was new in my career in the 80’s and 90’s. And the systems that we built back then were never designed to get this big. They were never designed to capture this huge variety of data. They were designed for the business data of the 80’s and 90’s. So that’s a challenge, right?
All this data is happening in the world and the systems that I spent my career building aren't going to be able to deal with it. So what do you do?
Well, it turns out Google had this genius idea for organizing the information on the Internet, and built a system that could do just that. The way it works is pretty simple. Rather than try to get all your data organized in a single, great, big computer, which is what we were doing in the 80’s and 90’s, you buy a lot of small computers and gang them together.
All those computers have processors of their own so they can all go look at just their own little fragment of the data and think about it. You've got 2,000, or 16,000 processors that you can take advantage of, and that system is able to ingest, store, organize, process, and analyze all that data. That’s what Hadoop is.
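To make the idea Olson describes concrete, here is a minimal sketch in plain Python. It is not Hadoop's actual API: the function names (`split_into_fragments`, `count_words`, `merge_counts`, `distributed_word_count`) are hypothetical, and threads on one machine stand in for the many small computers. It only illustrates the pattern of handing each worker its own fragment of the data and then combining the partial results.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_fragments(records, num_workers):
    """Deal records out round-robin, one fragment per simulated machine."""
    return [records[i::num_workers] for i in range(num_workers)]


def count_words(fragment):
    """Each worker looks only at its own fragment of the data."""
    counts = Counter()
    for line in fragment:
        counts.update(line.lower().split())
    return counts


def merge_counts(partial_counts):
    """Combine the per-worker results into one overall answer."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


def distributed_word_count(records, num_workers=4):
    """Split the data, process fragments in parallel, then merge."""
    fragments = split_into_fragments(records, num_workers)
    # Threads stand in for separate machines in this sketch (an assumption).
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_counts = list(pool.map(count_words, fragments))
    return merge_counts(partial_counts)


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    logs = [
        "sensor a reports heat",
        "sensor b reports noise",
        "sensor a reports noise",
    ]
    print(distributed_word_count(logs, num_workers=2))
```

In a real cluster the fragments would live on different machines and the work would move to the data, but the shape of the computation (split, process locally, merge) is the same.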
And Hadoop's open source, so anybody can add to it or use it. Why does an open source platform make sense for a company as big as Cloudera that could just as easily build its own proprietary one?
We like open source because we get to collaborate, not just with our own engineering team, but with the whole planet. Hadoop has gotten good, fast, because we’re able to gang together all of the programming skill that’s distributed around the world working on big data.
We are the single largest contributor to the ecosystem of Hadoop projects. Nobody writes and gives away more software than we do. We need that open source platform to get better as fast as possible. And we’re putting considerable money where our mouth is on that. For six years, we've built the very largest and most productive engineering team that develops for that whole collection of open source projects.
We build tools for deploying, operating, and managing all of those computers ganged together in your data center. So we're very deeply involved in the open source community, both to drive the platform forward and to make the platform safe for enterprises to consume.
And what are some of the ways people and companies can make use of all this new data?
I can give you lots of examples about clean energy production and distribution and encouraging consumer conservation. There are seven billion people on the planet today, and we’re headed to nine billion. We need more food, and we’re going to figure out, using data analysis of fields and crops, where and how to plant more efficiently. So there are lots of really important problems that I think we’re going to be able to attack with data.
We’re working right now with Children’s Hospital down in Atlanta. Sick babies go to the neo-natal intensive care unit and get round-the-clock care. For a long time, there have been multiple monitoring devices paying attention to those babies. Heart rate, temperature, all that sort of stuff.
So it turns out that the neo-natal ICU is just a sonically very hostile environment. There’s beeping and clicking and whirring and it’s loud, and they wanted to know whether or not that bore at all on the stress that the babies were feeling. In addition, babies only have one reliable way to self-report distress, right? They cry. There was nothing around that was catching when and how babies were crying. So they have now instrumented the neo-natal ICU with, basically, microphones. And they’re capturing the sound environment as well as all of that other data. And they’re now able to correlate that (including, by the way, when babies start crying and when they stop) in order to analyze how all of those sensor readings influence how the baby is actually feeling. They’ve now analyzed information for a fairly long period and they've begun to change the way that they deliver care to babies.
What I like best about this example is that it’s not that the doctors are collecting this information and changing broad care regimes. It’s that the staff nurses are asking for access to the data so they can better understand day by day, hour by hour, how they can change their behavior—what they can do to make things go better. It was never possible to do that before you could capture and analyze that data.
Looking at the kind of role that data plays in the decision-making process for a business executive like yourself, what kind of impact does it make when you’re trying to come up with a strategy for the company? Do you base decisions primarily on your own experience and intuition or on the data coming out of computers?
People listen to this story I tell about big data and they say, ‘well, total change, right?’ In fact, no. Businesses, governments, individuals have been data-driven for a long time.
If you are a credit card company, you have always cared about fraud. It’s just that you haven’t been able to look at more than some weeks, maybe some months, worth of transaction data in order to look for patterns of fraud. All of a sudden, you get to look at a decade’s worth if you want to. You can see patterns in lots of data that are invisible in smaller amounts of data. So more data lets us understand the world, and answer old questions even better than we could before. So it’s not that we've never been data-driven before, it’s that we've never been able to consider all the data before.
And what room does that leave for human thought?
Judgment and intuition are great tools for decision-makers. But the more real data that can be brought to bear, the more guidance there is for judgment and intuition, and the more rational basis there is for choosing a course of action. So look, I think we’re going to get more data-driven. I think that executives and politicians can be guided by advice from machines in ways that they never could be before. And I bet that we’ll make better decisions and run a better world as a result of that. The human will remain in the loop, but the human’s going to be better informed than ever before.
To wrap up, I want to ask more about your experience leading Cloudera because your story is at some level classic Silicon Valley – from college student dorm room entrepreneur to the C-Suite of a major tech company. If you had one piece of advice to offer to the countless people trying to follow that same path, what would it be?
In terms of our strategy, the quality of the people we have hired since we began has been the single largest determinant of our success. We've been able to attract very bright executives, developers, and salespeople who have been able to recognize opportunities and challenges and come up with ways to address them. If we've hired well, it’s meant that we've been able to continue hiring well.
Who wants to go be the smartest guy on a mediocre team? But being someone who is constantly challenged by a world-class team, everybody wants that job, right?