The Wi-Fi data Google collected in over 30 countries could be more revealing than initially thought.
French regulators are saying that the vans Google sent out to snap photos for its Street View feature may have gathered passwords and other data covered by banking and medical privacy regulations. And on Monday, Connecticut Attorney General Richard Blumenthal announced that roughly 30 states were considering a probe into the data collection, an investigation he would lead. Google has said the information was mostly useless, but some bloggers argue that a company of Google's size could use the data to create detailed and valuable profiles of Internet users, information Google could use to better target its services.
Google's CEO Eric Schmidt has said the information was hardly useful and that the company had done nothing with it. The search giant has also been ordered to destroy the data, or has sought to do so. According to its own blog post, Google logged three things from wireless networks within range of its vans: snippets of unencrypted data; the names of available wireless networks; and a unique identifier associated with devices like wireless routers. Google blamed the collection on a rogue bit of code that was never removed after it had been inserted by an engineer during testing.
Each of the three types of data Google recorded has its uses, but it's that last one, the unique identifier, that could be valuable to a company of Google's scale. That ID is known as the media access control (MAC) address and it is included -- unencrypted, by design -- in any transfer, blogger Joe Mansfield explains.
Google says it only downloaded unencrypted data packets, which could contain information about the sites users visited. Those packets also include the MAC address of both the sending and receiving devices -- the laptop and router, for example.
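To make that concrete, here is a minimal Python sketch of how the addresses can be read straight out of the fixed header of an 802.11 data frame. It assumes the standard layout (2 bytes of frame control, 2 bytes of duration, then three 6-byte addresses, with address 1 the receiver and address 2 the transmitter). The sample frame, the two MAC addresses, and the helper names `format_mac` and `parse_addresses` are invented for illustration and say nothing about the software Google's vans actually ran.

```python
import struct


def format_mac(raw: bytes) -> str:
    """Render six raw bytes as a colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)


def parse_addresses(frame: bytes) -> dict:
    """Extract the three address fields from an 802.11 data frame header.

    Assumed layout: frame control (2 bytes), duration (2 bytes),
    then three 6-byte addresses. The payload is never touched.
    """
    if len(frame) < 22:
        raise ValueError("frame too short for an 802.11 header")
    _control, _duration, addr1, addr2, addr3 = struct.unpack_from(
        "<HH6s6s6s", frame
    )
    return {
        "receiver": format_mac(addr1),
        "transmitter": format_mac(addr2),
        "address3": format_mac(addr3),
    }


# Hypothetical frame: a laptop sending data to its router.
laptop = bytes.fromhex("00163e112233")
router = bytes.fromhex("001a2b445566")
sample_frame = (
    struct.pack("<HH", 0x0108, 0)  # data frame with the "to AP" flag set
    + router                       # address 1: receiver
    + laptop                       # address 2: transmitter
    + router                       # address 3
    + b"\x00\x00"                  # sequence control
    + b"payload bytes"             # body, ignored by the parser
)
addresses = parse_addresses(sample_frame)
```

The parser reads only the header, which is why the addresses would be available to a listener whether or not the body of the frame is readable.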
A company as large as Google could develop profiles of individuals based on their mobile device MAC addresses, argues Mansfield:
Get enough data points over a couple of months or years and the database will certainly contain many repeat detections of mobile MAC addresses at many different locations, with a decent chance of being able to identify a home or work address to go with it.
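The kind of aggregation Mansfield describes can be sketched in a few lines of Python. This is an illustration only, not anything Google is known to have built: the function name `likely_home_locations`, the `min_sightings` threshold, and the sample sightings are all hypothetical, and "most frequently seen location" is a deliberately crude stand-in for a home or work address.

```python
from collections import Counter, defaultdict


def likely_home_locations(sightings, min_sightings=3):
    """Map each MAC address to the location where it was seen most often.

    sightings: iterable of (mac, location) pairs, where location is any
    hashable value such as a street name or a rounded coordinate pair.
    MAC addresses seen fewer than min_sightings times are skipped.
    """
    by_mac = defaultdict(Counter)
    for mac, location in sightings:
        by_mac[mac][location] += 1

    profiles = {}
    for mac, counts in by_mac.items():
        if sum(counts.values()) < min_sightings:
            continue  # too few data points to guess anything
        location, _count = counts.most_common(1)[0]
        profiles[mac] = location
    return profiles


# Hypothetical log of detections gathered over time.
sample_sightings = [
    ("00:16:3e:11:22:33", "Elm Street"),
    ("00:16:3e:11:22:33", "Main Street cafe"),
    ("00:16:3e:11:22:33", "Elm Street"),
    ("00:16:3e:11:22:33", "Elm Street"),
    ("00:1c:b3:aa:bb:cc", "Main Street cafe"),
]
profiles = likely_home_locations(sample_sightings)
```

A device seen only once, like the second one in the sample, yields no guess at all, which is why the argument depends on repeat detections over months or years.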
Now, to be fair, we don't know whether Google actually mined the packets it collected for MAC addresses, and the company's statements indicate it did not. The search giant even said it "cannot identify an individual from the location data Google collects via its Street View cars." Add a step, however, and Google could deduce an individual from the location data, argues Avi Bar-Zeev, an employee of Microsoft, a Google competitor.
[Google] could (opposite of cannot) yield your identity if you've used Google's services or otherwise revealed it to them in association with your IP address (which would be the public IP of your router in most cases, visible to web servers during routine queries like HTTP GET). If Google remembered that connection (and why not, if they remember your search history?), they now have your likely home address and identity at the same time. Whether they actually do this or not is unclear to me, since they say they can't do A but surely they could do B if they wanted to.
Theoretically, Google could use the MAC address for a mobile device -- an iPod, a laptop, etc. -- to build profiles of an individual's activity. (It's unclear whether they did and Google has indicated that they have not.) But there's also value in the MAC addresses of wireless routers.
Once a router has been associated with a real-world location, it becomes useful as a reference point. The Boston company Skyhook Wireless, for example, has long maintained a database of MAC addresses, collected in a (slightly) less-intrusive way. Skyhook is the primary wireless positioning system used by Apple's iPhone and iPod Touch. When your iPod Touch wants to retrieve the current location, it shares the MAC addresses of nearby routers with Skyhook, which queries its database to figure out where you are.
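As a rough illustration of how such a lookup could work, the Python sketch below averages the known positions of the routers a device can see, weighting each by a signal-strength figure. This is an assumed technique (a weighted centroid), not Skyhook's actual method, which the article does not describe. The function name `estimate_position`, the weights, and the coordinates are all made up.

```python
def estimate_position(visible, router_db):
    """Estimate a device's position from the routers it can see.

    visible:   dict mapping router MAC -> positive weight (for example,
               a signal-strength figure; stronger means closer).
    router_db: dict mapping router MAC -> (latitude, longitude).
    Returns a (latitude, longitude) tuple, or None if no visible
    router is in the database.
    """
    total_weight = 0.0
    lat_sum = 0.0
    lon_sum = 0.0
    for mac, weight in visible.items():
        if mac not in router_db:
            continue  # unknown router, contributes nothing
        lat, lon = router_db[mac]
        lat_sum += weight * lat
        lon_sum += weight * lon
        total_weight += weight
    if total_weight == 0:
        return None
    return (lat_sum / total_weight, lon_sum / total_weight)


# Hypothetical database of router locations and one device's scan.
sample_db = {
    "00:1a:2b:44:55:66": (42.0, -71.0),
    "00:1a:2b:77:88:99": (42.4, -71.4),
}
sample_scan = {
    "00:1a:2b:44:55:66": 3.0,
    "00:1a:2b:77:88:99": 1.0,
    "00:de:ad:be:ef:00": 5.0,  # not in the database
}
position = estimate_position(sample_scan, sample_db)
```

The sketch also shows why the database matters more than the device: without a router's MAC address already tied to a location, a scan of nearby networks reveals nothing about where the device is.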
Google Latitude, which lets users share their current location, has at least 3 million active users and works in a similar way. When a user decides to share his location with any Google service on a non-GPS device, he sends all visible MAC addresses in the vicinity to the search giant, according to the company's own description of how its location services work.
[Update: Google's own "refresher FAQ" states that a user of its geo-location services, such as Latitude, sends all MAC addresses "currently visible to the device" to Google, but a spokesman said the service only collects the MAC addresses of routers. That FAQ statement is the basis of the following argument.] This is disturbing, argues blogger Kim Cameron (also a Microsoft employee), because it could mean the company is getting not only router addresses, but also the MAC addresses of devices such as laptops and iPods. If you are sitting next to a Google Latitude user who shares his location, Google could know the address and location of your device even though you didn't opt in. That could then be compared with all other logged instances of your MAC address to develop a profile of where the device is and has been.
Google denies using the information it collected and, if the company is telling the truth, then only data from unencrypted networks was intercepted anyway, so you have less to worry about if your home wireless network is password-protected. (It's still not totally clear whether only router MAC addresses were collected. Google said it collected the information for devices "like a WiFi router.") Whether it did or did not collect or use this information isn't clear, but Google, like many of its competitors, has a strong incentive to get this kind of location data.
[Update: According to the third-party report Google commissioned, the Street View vans collected "all available MAC addresses" regardless of whether or not the wireless network they were on was encrypted. Google did, however, discard the "user-created content, such as e-mails or file transfers, or evidence of user activity, such as Internet browsing" transferred over encrypted networks. In a response to this post, Cameron clarified some issues and pointed out what he says are contradictions in what Google has said.]