During Apple's keynote at its Worldwide Developers Conference, it was all Apple all the time, except for one quick demo near the beginning of the event. The CEO of a relatively obscure company (as obscure as a company with $50 million in funding can be) came on stage with a plastic track and a bunch of little cars. The cars proceeded to race each other, controlled by AI running on an iPhone. There was a brief hiccup when one of the cars wouldn't run, but it got fixed and the demo finished well.
It was cool, but it was also a bit confounding. What was Apple trying to tell us about its future plans by showing us this particular company? Right after the keynote, one of the company's PR people emailed me to ask if I wanted to meet with Anki's CEO Boris Sofman, a Carnegie Mellon robotics guy (as are his two cofounders Hanns Tappeiner and Mark Palatucci). I accepted, mostly so I could find out what got Apple so excited about this little toy startup.
Of course, they'd hate to be called a toy startup. For Sofman, entertainment, toys, are merely the quickest way to get robotics into consumers' lives. He argues that their product is doing a lot of the same fundamental things that autonomous vehicles and other types of near-future consumer robots do. And that they're merely taking the bottom-up approach to building out these futuristic capabilities.
What follows is a lightly edited and condensed record of our conversation, which took place in a building on 4th and Market in San Francisco, on a floor high above the Ross department store at ground level.
We don't go into a lot of depth about the demo itself, but if you'd like to see it, go here and skip to about 11 minutes into the presentation.
So, I saw the demo.
It was probably the longest 20 seconds of my life.
You guys were the only outside company at the keynote.
Only outside company this year. This might be the last time they try a wireless demo at WWDC.
I think it was Marco Arment who was tweeting that it was really interesting and kind of strange that they put you guys in there, given that it was such a packed 2 hours of Apple announcements. What do you think it was that got you onto the stage?
I think a big part of it was that there is a lot of overlap in what we're doing and Apple's motivations. We're a robotics and artificial intelligence company. Our focus is to bring these technologies to consumer products. For Anki Drive, mobile devices play a huge part of that. The phone becomes the brain for everything that's happening. We have a video game happening inside a phone that matches the physical world. For us, the phone is a huge advantage.
From Apple's standpoint, it wasn't just a neat product, but we're using their devices in a way that nobody ever has. What you saw was one phone connected to four simultaneous cars. When we do it on our end, you can have 6 cars and more devices. You'll have a phone that's juggling 5, 6, 7, 8 Bluetooth low-energy connections and no one has ever done that before.
It caught their attention because it highlights what you can do with their product ecosystem in a way that no one's ever done before.
When the super car wouldn't get going during the demo, what was happening there?
What happened was, when I was holding the car up, the light was green, which means it had disconnected already. That room had so much wireless interference and signal noise. We found out afterwards, it was four times anything they had tested or expected. That had never happened before. Once a Bluetooth low-energy connection is made, it's incredibly robust. It doesn't get dropped.
We really quickly restarted the app and reconnected and it held the second time through.
So when you held the car up, did you know that the connection had been dropped already?
No, because I was holding it up facing away from me. I would have gasped and not been able to finish my monologue. When I pushed it and it didn't snap on, it was because the connection had been dropped. When I saw the green light, I knew. You can hear me say, "Restart" and then I'm like tapdancing for 10 seconds waiting for it to pick up again. A little drama is perfectly fine as long as it works out. It took me like half the rest of the show to stop shaking. I mentally snapped back into it sometime during the iOS stuff.
Talk to me about the funding.
We closed our Series-A back with Andreessen Horowitz last March. We were 4 people back then. We walked in and Marc Andreessen was like a little kid, flicking the cars on the floor. He fell in love with it. He actually joined our board back then, which was humbling. From then, it's been an insanely crazy year, it's like strapping into a rollercoaster. We're now 35 people.
How'd you guys really get on stage?
Marc was the one who introduced us to Apple early on because that was a retail channel that was a good fit for what we wanted to do. We were just super happy to get the amazing response all the way through the [Apple] organization up to the executive team. I think they saw the great synergy between what we're doing and what they're doing.
What do you need all those people for?
It was three of us for a long time. It wasn't glamorous. We were sitting around a kitchen and hacking on nights and weekends. In the first three years, we were able to get really advanced prototypes. It was the furthest you could go without some serious investment. Once you want to take it from an advanced prototype to an advanced product, it takes a lot of people.
For us, this is the first step of bringing this kind of robotics into people's lives, in this case entertainment. When you look at Anki Drive, what goes into this is: industrial design, mechanical engineering, electrical engineering, embedded systems, low-level firmware development, control algorithms, dealing with sensors, wireless communications, core robotics, artificial intelligence, mobile development, game development. Just getting the product together is a huge chain. Even with 30 other people, we're really thin. There's only one person in each of those categories. Recently, we brought on a manufacturing team, which is working on sourcing all these parts.
As an artificial intelligence guy, what attracted you to this project?
When we were in grad school, we all worked on really cool projects. I'll give you an example. My project was a huge autonomous vehicle, like wheels up to my shoulders. Completely off-road. We'd go to different parts of the country, plop it down in a forest and give it a destination 10 kilometers away. He [the vehicle] would be using aerial data, GPS sensors on-board, pathfinding algorithms. He'd have 8 quadcore CPUs in his little hull. He'd decide where to go, which bushes to trample, and how to get around ditches and trees. He'd be moving pretty quickly. We had a chase Hummer and we couldn't keep up with him sometimes.
That was incredible technology, but it's indicative of a lot of robotics. It's focused on space applications, on DARPA, on industrial, on agricultural, on health care. But nothing has penetrated consumer markets.
When we say robotics, it's not just the mechanical part of it. It's the artificial-intelligence side of it where we're using software to program physical things to be intelligent. And it's not just a remote-controlled object. It's something that understands where it is and reacts to its surroundings and has a purpose to it.
For us, there's a huge gap in consumer applications of these technologies. The problem is that everybody focuses on performance, but it doesn't matter to them if you use a $50,000 sensor to do it. So for us, entertainment was a really great place to start. It's familiar, it's friendly, it's fun. And in the case of cars, there's a cross-generational appeal. Two-year-olds and 92-year-olds like cars. It was a chance to showcase these technologies and bring them to life in a way that is familiar in the form of a racing game, but an entirely different entertainment experience, doing things that have never been possible in the physical world.
This becomes a first step in using these technologies. Internally, these are building blocks to robotics in the more general case. The core problems in robotics -- positioning, knowing where you are, reasoning, using that information to make intelligent decisions, planning, searching, deciding what you need to do, and the execution where you need to move precisely in the real world -- those carry over into any application in the real world. So when we build modules for wireless communication or planning, we will reuse those in every product we make.
Were you influenced by the "situated robotics" guys like [MIT professor and iRobot co-founder, now CEO of Heartland Robotics] Rodney Brooks? Because at some level, this is just a racing game. The computer's ability to race you is the least interesting part because we've been able to do that since ExciteBike. The interesting thing seems to be what changes when you take the racing game out of the virtual world into the physical world.
Rodney Brooks did some incredible things and I interned at iRobot before graduate school. But for us, we have a lot of influences. Here's how we're approaching it. You could take the top-down approach to robotics or the bottom-up approach... The bottom-up approach is taking the building blocks available today -- the most advanced technologies, components, and technological landscapes -- and making an incredible product and then using that to make products two, three, four, five, and six. And every single time, you're using what you built before, making product three only marginally more difficult than product one. Where, if you started with product three, it'd be an insurmountable challenge.
But what does change when you pull the racing game out of the purely digital world? What's really different going on watching Anki Drive versus watching ExciteBike?
It's a completely different experience. You think about the toy industry. It's been pretty stagnant. In the '60s and '70s, the toys back then are like 90 percent of the toys today. But the real appeal that's made it a lasting element is that there is this appeal to the physical world. There's a built-in desire for people to connect with things they can touch. It's more social. It's not as natural to look at something on the screen. You'll never replicate the connection you can make with something you touch. The reason people are so attached to video games and they are so entertaining is that they take advantage of the fact that there are adaptive rules and structures. There are characters and those characters evolve. The world expands. The game changes over time and gets more challenging. But the biggest thing is that there are many characters and the interactions between characters keep things fun. Nobody has been able to bring that into the physical world in the right way. So almost all physical entertainment is static or remote-controlled with only one-way feedback. When you close the loop, you can bring physical characters to life. You can give them a purpose, evolve them over time, make them more challenging, and give them more capabilities because it's software driving the whole thing.
Early Nintendo games took a big jump in the sort of entertainment that was possible. To us, this feels like a huge leap forward in what you can do in the physical world and it's only the beginning. It goes way beyond racing games. We're giving physical characters the ability to know where they're located in an environment and what's around them and to be able to come to life and execute a purpose, an intention, a personality. That's a platform in every sense of the word. We can bring characters to life in any context and the racing game happens to be a great place to start.
And the reason you're starting with a racing game is that you've got a track that you can control. This is an easier environment to perceive than an arbitrary environment.
In an arbitrary environment, everything changes. For us, the enabler behind this that typical physical products don't have is awareness of your position. The three fundamental challenges of robotics are positioning, reasoning, and execution. It doesn't matter what robotics problem you have, these are the problems you have to solve. You have to understand your position, think about what you want to do, and you have to do it. And that's really difficult because if you want to make something that is a mass producible product like this, you can't throw a $50,000 sensor on it.
And the reasoning part is the only part that racing games have always done.
If you look at a videogame, positioning is trivial because you know where everything is, execution is trivial because you have full control of the environment, and all that's left is the reasoning part. By solving the real world challenges to a really deep degree with artificial intelligence and unique combinations of components and computation, we are able to turn the physical world into a virtual world. We can take all these physical characters and abstract away everything physical about them and treat them as if they were virtual characters in a videogame on the phone. We have a virtual state in the phone that matches the physical world. If we want this one character to be more aggressive or intelligent, physically nothing changes in him. It's the software.
So what hardware goes into these cars?
There is no component in here that costs more than $1.20. We have cheap motors. A battery. A microcontroller, a 50 MHz computer, and an optical sensor. Ironically [that sensor] is the front-facing camera of an iPhone.
The selfie-cam is how this car senses where it is!
What makes this all possible is that the commoditization of all these components has driven the cost down to where you can get more capable components at really low cost. And access to mobile devices. The iPhone wasn't in the picture when we started working on this. You couldn't make an app because there was no such thing as an app. The original idea was to have a little box with a computer. But when the phone started gaining traction, it became obvious this was the way to go because for us, software defines the entire interaction. What we're doing with these cars is that they unlock very robust but basic capabilities. It can go 1.5 meters per second. It can sense its position. It can execute a trajectory. But fundamentally all the gameplay is defined in the app in the software on the phone. That means when we ship Anki Drive, that's just the first step.
Can you change the software on the car?
Yes we can. The phone can flash the software on the car. If you look at physical entertainment, it's always been defined by the physical side. Cheap plastics. Maybe sometimes there's some motion or remote control, but we're bringing software into physical entertainment.
I couldn't see the track well enough on the WWDC livestream, but if you've just got one downward-facing camera, the track itself must have some kind of Kiva-like navigation tracking system embedded.
The track is very specifically designed to work with the cars. There's a really intricate system between the cars, the track, and the phone. What the cars are doing is sensing down on the track, and there is information embedded that gives them knowledge about where they are.
Just X, Y?
Well, and also which environment, because it's possible to have different types of environments. It tells them where they are, but also how well they're executing a [driving] trajectory. You've seen line-following robots? Robots that follow a line to go wherever they want. It's doing the same thing except there is no line -- there's a virtualized line where we have sophisticated software that creates any maneuver we want and turns it into a virtual line that the car can follow as if it were a physical line.
There's a lot of logic on the car. Five hundred times a second, it oscillates the motors to do sophisticated control algorithms. If we drive too fast, the car will drift and then recover and go back to following the line. We didn't do any drifting on stage, though. Five hundred times a second we're sensing our position and a subset of those times, we're communicating back to the device hosting the game. And we're doing that with components that cost a handful of dollars.
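The virtual-line idea Sofman describes can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Anki's firmware: the PD controller, the gains `kp` and `kd`, the 3 cm wheel spacing (`TRACK_WIDTH`), and the `virtual_line_offset` lane-change function are all hypothetical. Only the 500-updates-per-second rate and the 1.5 m/s speed come from the interview.

```python
import math

CONTROL_HZ = 500           # from the interview: 500 sensing/control updates per second
DT = 1.0 / CONTROL_HZ
BASE_SPEED = 1.5           # m/s, the speed quoted in the interview
TRACK_WIDTH = 0.03         # m between the rear wheels (hypothetical value)


def virtual_line_offset(x):
    """Hypothetical virtual line: target lateral position (m) at distance x.

    The software asks for a 2 cm lane change once the car passes 0.5 m.
    """
    return 0.02 if x > 0.5 else 0.0


def wheel_speeds(error, heading, kp=600.0, kd=60.0):
    """PD steering (assumed controller): return (left, right) wheel speeds in m/s.

    error   -- lateral distance from the car to the virtual line (m)
    heading -- car heading relative to the track direction (rad)
    """
    turn_rate = kp * error - kd * heading   # rad/s, positive turns left
    diff = turn_rate * TRACK_WIDTH / 2
    return BASE_SPEED - diff, BASE_SPEED + diff


def simulate(duration_s=1.0):
    """Run the control loop on a simple differential-drive model."""
    x = y = heading = 0.0
    for _ in range(int(duration_s * CONTROL_HZ)):
        error = virtual_line_offset(x) - y          # "sense" position on the track
        left, right = wheel_speeds(error, heading)  # adjust the two rear motors
        v = (left + right) / 2
        omega = (right - left) / TRACK_WIDTH
        x += v * math.cos(heading) * DT
        y += v * math.sin(heading) * DT
        heading += omega * DT
    return x, y, heading


if __name__ == "__main__":
    x, y, heading = simulate(1.0)
    print(f"x={x:.3f} m, lateral={y * 100:.2f} cm, heading={heading:.4f} rad")
```

Nothing is painted on the track in this model: the "line" is just a function the software can change at any moment, and the car treats it exactly as a line-following robot would treat a physical stripe.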
How do the cars come up with their racing strategies?
Inside the phone, we're doing a really deep search, like a chess game, thinking about what the car is going to do, and what the other cars are going to do forward into the future so that we can analyze thousands of these potential actions and come up with a plan that is more sophisticated than anything you'd come up with if you did an instantaneous gut reaction. In fact, [what we do] is a more rigorous way to think about AI than almost any videogame does. I was the one who worked on the early AI on this and I spent a lot of time talking to friends in the videogame industry and asking how people did AI in videogames and racing games. Surprisingly, most of it is relatively simple. If this, then this. It's a basic logic. If you have very basic logic, you'll never come up with an interesting solution to, say, being boxed in, where the best thing is to actually slow down and then come around, or having to do something sneaky.
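The contrast between "if this, then this" logic and a chess-style lookahead can be shown with a toy planner. This is a minimal sketch, not Anki's planner: the two-lane grid, the integer positions and speeds, the five actions, the crash rule, and the assumption that rivals hold their lane and speed are all invented for illustration.

```python
from dataclasses import dataclass

ACTIONS = ("keep", "left", "right", "brake", "boost")
NUM_LANES = 2                  # lane 0 is the leftmost lane (hypothetical track)
MIN_SPEED, MAX_SPEED = 1, 4
CRASH_PENALTY = 1000


@dataclass(frozen=True)
class Car:
    lane: int
    pos: int
    speed: int


def step(me, rivals, action):
    """Advance one tick. Returns (new_me, new_rivals, crashed)."""
    lane, speed = me.lane, me.speed
    if action == "left":
        lane -= 1
    elif action == "right":
        lane += 1
    elif action == "brake":
        speed = max(MIN_SPEED, speed - 1)
    elif action == "boost":
        speed = min(MAX_SPEED, speed + 1)
    if not 0 <= lane < NUM_LANES:
        return me, rivals, True          # leaving the track counts as a crash
    new_me = Car(lane, me.pos + speed, speed)
    # Assumption: rivals keep their lane and speed.
    new_rivals = tuple(Car(r.lane, r.pos + r.speed, r.speed) for r in rivals)
    # Crash if we share a lane with a rival and touch or swap order this tick.
    crashed = any(
        r.lane == lane and (me.pos - r.pos) * (new_me.pos - nr.pos) <= 0
        for r, nr in zip(rivals, new_rivals)
    )
    return new_me, new_rivals, crashed


def plan(me, rivals, depth):
    """Exhaustive depth-limited search. Returns (score, action_sequence)."""
    if depth == 0:
        return me.pos, ()
    best = (float("-inf"), ())
    for action in ACTIONS:
        new_me, new_rivals, crashed = step(me, rivals, action)
        if crashed:
            candidate = (me.pos - CRASH_PENALTY, (action,))
        else:
            score, rest = plan(new_me, new_rivals, depth - 1)
            candidate = (score, (action,) + rest)
        if candidate[0] > best[0]:
            best = candidate
    return best


if __name__ == "__main__":
    # Boxed in: a slower car directly ahead, another car alongside.
    me = Car(lane=0, pos=0, speed=3)
    rivals = (Car(lane=0, pos=2, speed=2), Car(lane=1, pos=0, speed=3))
    print(plan(me, rivals, depth=4))
```

In this boxed-in scenario, accelerating immediately runs into the slower car ahead, and changing lanes immediately hits the car alongside. Looking four moves ahead, the search's first move is to brake, which opens a gap to slip into the other lane and then speed up. That is the kind of slow-down-then-come-around plan a single reactive rule would not find.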
So, for us, it's a huge advantage to have a physical videogame because all videogames end up piping a lot of their computation into the graphics. And they have to because that's the differentiator. And for us, 90% of our computation goes to the planning side. We can do a much more rigorous approach that's driven by a robotics background. We can come up with really sophisticated actions, thinking forward into the future about what these characters are going to do.
How hard is getting the cars to actually do what you want in the physical world?
Execution should not be underestimated. That is really hard because we have to deal with the real world. There's drift, there's physics, there's high-speed driving, there's dust that settles on the tires, and what we're using is two cheap motors that are less than a dollar each and they all vary slightly and change over time. The tires change over time. You can't control something like this precisely without a lot of intelligence and computation. Five hundred times a second, we're oscillating the speeds of the rear wheels to stick like glue to the virtual line.
It was really complicated development, but we've gotten very precise. We're geeks and we actually did the math to see how precise we are now. Extrapolated out to real-world size, it'd be the equivalent of you taking your car and driving down 101 at 250 miles per hour with a concrete wall on either side within a tenth of an inch of your mirrors and being able to stay inside those boundaries. So, even when you are driving a car, the software is still running and doing the same things for you, so it's able to help you drive well beyond your means. And it makes you feel like you are driving with ridiculous precision and ability, which is a core part of the game. That's what levels the playing field. The entire time you're controlling the car, you're getting assistance. All of this robotics and AI and dealing with uncertainty. All of that is such that at some point we started to forget that it is a physical game and we are really programming a videogame that takes place in the real world.