The New Math of Subways

New software can take simple data and infer exactly what's going on across a transportation system.

An urban transportation system is a messy, hard-to-track thing.

Despite planning surveys and increasing data aggregation, it's hard for the people who run city transit to know precisely what is happening within their systems, quantitatively, and in formats that their employees can understand.

It was that realization that led former Googler Shiva Shivakumar and Balaji Prabhakar, a Stanford computer science professor, to found Urban Engines, a company that uses math to infer the real-time state of a transit system merely from the people entering and exiting the system.

They suck in where and when people board transit, and they spit out what they're calling a digital replica of the city's transportation network.

If you grew up playing SimCity like I did, and you remember all the buses and trains moving to and fro: that's actually what this system outputs. And it does so with existing data that transportation agencies are already collecting.

The company launched this week with the announcement of its first three "partner cities": São Paulo, Singapore, and Washington, D.C.

I met up with the co-founders in San Francisco last week to learn how the system works. (It was actually my second encounter with Prabhakar, who has been interested in using computer science to understand societal, not social, networks.)

"You tap it at the station at which when you arrive and you tap out when you exit," Prabhakar told me. "Could that data be used—if I get the tap-in-tap-out data for all commuters—could we piece it together, and in fact, infer everything there is to know about the transportation system?"

It turns out that they can.

"An individual's commuting history has limited information," he said. "But if you have the trips of everybody, you can now reconstruct." It just takes some math: algorithms that can compare all those trips and figure out how long it took for the buses to run or the trains to arrive. It is no surprise that Prabhakar's background is in traffic routing for computer networks.

"I'm going to give the algorithm category a name: we call it crowdsensing," he said.

The mathematical basis, interestingly enough, is rooted in tomography and the reconstruction of a complete image from limited observations. In some thicket of software, mathematics, and ideas, this Urban Engines system is connected to MRI scanning. "But it is very domain specific," Prabhakar noted. "You can't take an off-the-shelf algorithm and plug it in."
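To make the tomography idea concrete, here is a minimal sketch of the general technique, not Urban Engines' actual algorithm, which the company describes as far more domain specific. It assumes a single hypothetical line with four stations, made-up trip durations, and no waiting time or transfers. Each trip only reveals its total tap-in-to-tap-out time, but stacking many overlapping trips lets a least-squares fit recover the time spent on each individual segment. The station names, the data, and the function names are all illustrative assumptions.

```python
# Toy "transit tomography": infer per-segment travel times from trip totals.
# Everything here (stations, trips, durations) is hypothetical example data.

STATIONS = ["A", "B", "C", "D"]  # one line; segments are A-B, B-C, C-D

# (tap-in station, tap-out station, observed minutes)
TRIPS = [
    ("A", "B", 4.0),
    ("B", "D", 9.0),
    ("A", "D", 13.0),
    ("C", "D", 6.0),
    ("A", "C", 7.0),
]


def segment_row(origin, dest):
    """Return a 0/1 row marking which segments a trip passes over."""
    i, j = sorted((STATIONS.index(origin), STATIONS.index(dest)))
    return [1.0 if i <= k < j else 0.0 for k in range(len(STATIONS) - 1)]


def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[r]] for r, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[r][n] / m[r][r] for r in range(n)]


def estimate_segment_minutes(trips):
    """Least-squares estimate of minutes per segment from trip totals."""
    rows = [segment_row(o, d) for o, d, _ in trips]
    times = [t for _, _, t in trips]
    n = len(rows[0])
    # Normal equations: (R^T R) x = R^T t
    rtr = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    rtt = [sum(r[i] * t for r, t in zip(rows, times)) for i in range(n)]
    return solve(rtr, rtt)


if __name__ == "__main__":
    for name, minutes in zip(["A-B", "B-C", "C-D"], estimate_segment_minutes(TRIPS)):
        print(f"{name}: {minutes:.1f} min")
```

With this made-up data, the fit should recover roughly 4, 3, and 6 minutes for the three segments, even though no single rider's trip measured the B-C segment on its own. A real network would add waiting times, transfers, multiple routes, and noisy data, which is presumably where the domain-specific work comes in.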

The elegance of their system is that they don't need complex data—they just need to know where people tapped in and out. And if they pile enough of those trips together, they can piece what's happening in the system into a real-time visualization with the same accuracy, they say, as pre-existing methods for measuring a transportation system's performance.

"Urban Engines' work offers potentially revolutionary solutions for addressing the complex issue of commuter congestion through incentives and data­-driven insights,” said Shomik Mehndiratta, the World Bank’s Lead Transport Specialist for Latin America.

And some investors agree, too: they've got backing from Google Ventures, Andreessen Horowitz, and several other name-brand money sources.

Right now, the company is focused on signing up more partner cities, going down the list of the largest cities in the world. But down the line, they have some intriguing monetization possibilities. Prabhakar's previous work in societal networks focused on creating commuter loyalty programs and using behavioral economics to try to shift people's behavior. And their current project in Singapore incorporates some of these elements.

Perhaps, at some point—and this is my speculation—Urban Engines will make money by offering congestion relief as a service.

At the very least, it is a fascinating way to solve a complex transportation problem with simple, but still big, data.