Why Amazon's Data Centers Are Hidden in Spy Country
The company powers much of the Internet, but its cloud facilities are difficult to find.
Once in a while, not quite often enough to be a crisis but just often enough to be a trope, people in the United States will freak out because a huge number of highly popular websites and services have suddenly gone down. For an interminable period of torture (usually about 1-3 hours, tops) there is no Instagram to browse, no Tinder to swipe, no GitHub to push to, no Netflix to And Chill.
When this happens, it usually means that Amazon Web Services is having a technical problem, most likely in their US-East region. What that actually means is that something is broken in northern Virginia. Of all the places where Amazon operates data centers, northern Virginia is one of the most significant, in part because it’s where AWS first set up shop in 2006. It seemed appropriate that this vision quest to see The Cloud across America, which began at the ostensible birthplace of the Internet, should end at the place that’s often to blame when large parts of the U.S. Internet die.
Northern Virginia is a pretty convenient place to start a cloud-services business: for reasons we’ll get into later, it’s a central region for Internet backbone. For the notoriously economical and utilitarian Amazon, this meant that it could quickly set up shop with minimal overhead in the area, leasing or buying older data centers rather than building new ones from scratch.
The ease with which AWS was able to get off the ground by leasing colocation space in northern Virginia in 2006 is the same reason that US-East is the most fragile molecule of the AWS cloud: it’s old, and it’s running on old equipment in old buildings.
Or, that’s what one might conclude from spending a day driving around looking for and at these data centers. When I contacted AWS to ask specific questions about the data-center region, how they ended up there, and the process of deciding between building data centers from scratch versus leasing existing ones, they declined to comment.
Unlike Google and Facebook, AWS doesn’t aggressively brand or call attention to their data centers. They absolutely don’t give tours, and their website offers only rough approximations of the locations of their data centers, which are divided into “regions.” Within a region lie at minimum two “availability zones,” and within each availability zone there are a handful of data centers.
I knew I wasn’t going to be able to find the entirety of AWS’ northern Virginia footprint, but I could probably find bits and pieces of it. My itinerary was a slightly haphazard one, based on looking for anything tied to Vadata, Inc., Amazon’s subsidiary company for all things data-center-oriented.
Google’s web crawlers don’t particularly care about AWS’ preference for staying below the radar, and searching for Vadata, Inc. sometimes pulls up addresses that probably first appeared on some deeply buried municipal paperwork and were added to Google Maps by a robot. It’s also not too hard to go straight to those original municipal documents with addresses and other cool information, like fines from utility companies and documentation of tax arrangements made specifically for AWS. (Pro tip for the rookie data-center mapper: if you’re looking for the data centers of other major companies, Foursquare check-ins are also a surprisingly rich resource.) My weird hack research methods returned a handful of Vadata addresses scattered throughout the area: Ashburn, Sterling, Haymarket, Manassas, Chantilly.
Before I knew northern Virginia as the heart of the Internet, I knew it as spook country: home to a constellation of intelligence agencies and defense contractors. While I didn’t plan my itinerary around the military-industrial complex, its many outposts remained in the back of my mind and frequently on the horizon, and at least once during the drive I just stumbled upon one. After missing an exit in McLean, I made a U-turn in a generically designed but improbably well-guarded office-park entrance that I later found out was the headquarters of the Office of the Director of National Intelligence.
The fact that northern Virginia is home to major intelligence operations and to major nodes of network infrastructure isn’t exactly a sign of government conspiracy so much as a confluence of histories (best documented by Paul Ceruzzi in his criminally under-read history Internet Alley: High Technology In Tysons Corner, 1945-2005). To explain why a region surrounded mostly by farmland and a scattering of American Civil War monuments is a central point of Internet infrastructure, we have to go back to where a lot of significant moments in Internet history take place: the Cold War.
Postwar suburbanization and the expansion of transportation networks are occasionally overlooked but weirdly crucial facets of the military-industrial complex. While suburbs were largely marketed to the public via barely concealed racism and the appeal of manicured “natural” landscapes, suburban sprawl’s dispersal of populations also meant increased likelihood of survival in the case of nuclear attack. Highways both facilitated suburbs and supported the movement of ground troops across the continental United States, should they need to defend it (lest we forget that the legislation that funded much of the U.S. highway system was called the National Interstate and Defense Highways Act of 1956).
Both of these factors were at play in the unincorporated area of northern Virginia known as Tysons Corner, an area just far enough away from Washington to be relatively safe from nuclear attack but close enough to remain accessible. One of the region’s earliest military outposts was actually a piece of communications infrastructure: a microwave tower built in 1952 that was the first among several relays connecting Washington to the “Federal Relocation Arc” of secret underground bunkers created in case of nuclear attack.
The particular alignments of highways that eventually connected Dulles International Airport in Virginia to the Capital Beltway basically made this pocket of northern Virginia the first and last place for any commercial activities between the airport and D.C. This led to an outcropping of office parks that housed not only defense contractors, but also government IT and time-sharing services and, later, companies like MCI, AOL, and UUNet. Thanks to that concentration of network companies and a whole lot of support from the National Science Foundation, Tysons Corner became home to MAE-East, one of the earliest Internet exchanges and home to the foundation of what would become the Internet backbone. Networks build atop networks, and the presence of this backbone in Tysons Corner led to more backbone, more tech companies, and more data centers. Today, up to 70 percent of Internet traffic worldwide travels through this region, as the Loudoun County economic-development board cheerfully notes in its marketing materials.
An unfathomable amount of that traffic is from AWS. Amazon doesn’t release exact numbers as to just how much of the global Internet currently sits atop its infrastructure. In 2012, a now-lost blog post by network-intelligence startup DeepField estimated that on average, one-third of all daily Internet usage accesses a site running on AWS. Over the last three years, that percentage has most likely only increased. Finding more recent numbers is tricky, although it seems agreed upon that Amazon is the largest hosting company operating today, with cloud revenue projected to exceed $8 billion this year alone. While the exact calculation of how much of the Internet sits within Amazon’s cloud is uncertain, the calculation of how much revenue Amazon generates from that cloud is crystal clear, and massive.
According to Amazon lore, few within the company really anticipated the scale and impact of the service when it launched—they were mostly trying to solve an internal company problem, namely speeding up deployment of new applications and services. The solution was to make that build-up of the basic development infrastructure something that could be rapidly deployed and scaled as needed. AWS emerged out of the recognition that the services Amazon needed also met a growing industry need for web-scale application infrastructure.
Reporting on AWS history rarely spends much time on the data centers themselves. The actual infrastructure at the heart of AWS’ infrastructure-as-a-service isn’t the thing that makes it important to developers; it’s the services and APIs built on top of that infrastructure. So it made sense that when I stood outside some of the Vadata buildings from my hackish itinerary, I mostly gazed upon warehouses and vacuous-looking colocation buildings.
I had a soft spot for one of the data centers on my itinerary: a building in Sterling, Virginia, that I’d visited two years ago on a similar vision quest. The data center is next door to a pet resort (which always fills me with a weird glee imagining the secretly very stressful life of pets implied in the need for a resort) and down the road from a strip mall. The sheer unremarkability of the building, despite its conspicuous water tanks, its generators, its high fences, and its surveillance cameras, serves as a reminder of why it’s easy to overlook how important AWS is to the public experience and perception of The Cloud.
Amazon didn’t invent the principles behind cloud computing, but they made the infrastructure of cloud computing into a dirt-cheap commodity. In Brad Stone’s The Everything Store, Jeff Bezos is quoted comparing AWS to power utilities: “You go back in time a hundred years, if you wanted to have electricity, you had to build your own little electric power plant, and a lot of factories did this. As soon as the electric-power grid came online, they dumped their electric-power generator, and they started buying power off the grid. It just makes more sense. And that’s what is starting to happen with infrastructure computing.”
In practice, this meant that pricing for services was entirely contingent on actual use, an approach that allowed developers to rapidly scale small startups into massive companies by paying for infrastructure support on an as-needed basis and scaffolding as needs grew. Thanks to AWS, the initial overhead for starting a service like Airbnb or Slack (both AWS customers) is so low that those companies can afford to expand quickly. A fenced-off data center next to a pet resort doesn't exactly scream “one of the chief enablers of the current tech bubble,” but at the end of the day, that's what AWS is.
We found one AWS data center surprisingly easily, through a news story rather than an obscure government document. Earlier this year, an under-construction AWS data center in Ashburn caught fire. Conveniently, the exact address was included in the story.
This apparently not-yet-operational AWS data center is across the street from Ashby Ponds, a retirement community, and adjoins an area dominated by office parks, other data centers, and construction sites. It shared some familiar tropes of the region’s defense-contractor office parks: topography and landscaping that obscured just enough of the building to be inconvenient, high black gates, carefully curated suggestions of “nature.” An outcropping of conduit ducts poured out from the sidewalk alongside Ashby Ponds, like seaweed abandoned on a beach or a lesser descendant of Cthulhu reaching up from the suburban depths.
The site may have been a project of the real-estate investment trust Corporate Office Properties Trust (COPT), a company that’s been a personal pet interest of mine for a while. Due to SEC regulations, COPT hasn’t publicly disclosed the actual tenant or their particular role in building new AWS data centers in northern Virginia, but reports of COPT building an AWS data center in 2013 surfaced a few months before the CIA announced they’d awarded a $600 million contract to Amazon to build cloud services for the U.S. intelligence community. COPT only got into the data-center business a few years ago after focusing most of their efforts on building and managing office parks for defense contractors next to military outposts.
While the CIA contract and the COPT deal are probably unrelated, given COPT's history in facilitating the construction of facilities secured to DoD specs, the data center might be part of an expansion of AWS’ GovCloud, a cloud-services platform specifically designed for use by government agencies.
In contrast to the manicured landscapes of Ashburn and the pet resorts of Sterling, I was quite pleased to see that one of the data centers I found was adjacent to what looked like an abandoned freight-rail line. Its neighbors included a wholesale brick supplier, a gutter-supply company, and a Virginia Department of Transportation vehicle-maintenance outpost. The data center’s presence among sites of far more tangible, industrial exchange was a resonant reminder of the fact that Amazon has always been more logistics company than retail company. This is why, in Amazon’s early years, Bezos aggressively poached new hires from America’s original logistics-disguised-as-retail business, Walmart. (I’ve heard rumors that AWS frequently recruits hires from the FBI, but beyond anecdata and a scattering of LinkedIn profiles I have nothing to confirm it.) In part, the success of Amazon Web Services, and arguably the success of The Cloud itself at this point, lies in its ability to abstract infrastructural problems into logistics problems. And for a long time, Amazon has been able to abstract away a lot of the more discomforting or difficult facets of its infrastructure.
In the case of Amazon’s retail divisions, the realities obscured might be the labor conditions for employees in their distribution centers. In the case of AWS, it might be the energy use of its data centers. When Bezos compares cloud infrastructure to the power grid, he obscures the fact that data centers aren’t exactly analogous to electricity so much as they’re dependent on it, to the point where Amazon has to build new power substations for its cloud infrastructure. Although the company pledged earlier this year to move to entirely renewable energy, Greenpeace has previously referred to AWS as the least transparent of all major tech companies on carbon footprint and energy use.
In some ways this is why, despite the fact I know that I’ll probably only see a building or some cable markings if I’m lucky, I keep going on these pilgrimages to the physical remnants of the network, and I’ll probably be doing them for a long time. At this point, it is easier to use the data points that slip through the cracks to find an Amazon data center in the heart of spook country than it is to actually understand in any sort of granular detail how much of the Internet currently lives on Amazon Web Services or how serious an environmental impact Amazon Web Services has.
The incoherent banality of northern Virginia also felt like a fitting aesthetic conclusion to this journey to see the cloud. If driving across America in search of the Internet has taught me anything, it’s that the suburban sprawl of northern Virginia (and Silicon Valley, north Utah, eastern Kansas, and central Iowa) looks exactly like the Internet as we live with it today: it looks like a landscape in equal measure blandly sinister and weirdly poetic, a place whose significance is not really born of grand ambitions but of conniving and coincidence, of political machinations hitting against material reality, of easily discarded histories that only achieve coherence after sifting through sediment.
And maybe my desire to submerge myself in that sediment, to weave The Cloud into the timelines of railroad robber-barons and military R&D, emerges from the same anxiety that makes me go try to find these buildings in the first place: that maybe we have mistaken The Cloud's fiction of infinite storage capacity for history itself. It is a misunderstanding that hinges on a weird, sad, very human hope that history might actually end, or at least reach some kind of perfect equipoise in which nothing terrible could ever happen again. As though if we could only collate and collect and process and store enough data points, the world’s infinite vaporware of real-time data dashboards would align into some kind of ultimate sand mandala of total world knowledge, a proprietary data nirvana without terror or heartbreak or bankruptcy or death, heretofore only gestured towards in terrifying wall-to-wall Accenture and IBM advertisements at airports.
But databases alone are not archives any more than data centers are libraries, and the rhetorical promise of The Cloud is as fragile as the strands of fiber-optic cable upon which its physical infrastructure rests. The Internet is a beautiful, terrible, fraught project of human civilization. While I make light of language like “pilgrim” to describe this cross-country journey, at the end of the day it has been an affirmation of a kind of faith: faith in the humanity of that beautiful, terrible, fraught project, and in the possibility of being able to see ourselves in all that beautiful, terrible, fraught truth.