Seventeen years ago, just as the periodical cicadas were getting ready to arrive in droves in the eastern United States, Google announced Gmail, an exciting new email service. It had three key features: search, making it easy to find emails; storage, with what was then a mind-blowing 1 gigabyte; and speed, with emails threaded into conversations that ostensibly eliminated the need for cumbersome folders. Today, as the cicadas have seemingly taken over parts of the eastern U.S. once more, Gmail and Google’s G Suite, now used by more than 2 billion monthly active users around the world, still largely operate on these same basic principles.
Google figures only briefly in The Filing Cabinet: A Vertical History of Information, a new book by Craig Robertson, an associate media-studies professor at Northeastern University, but it’s impossible not to think about the little search bars we live with every day while reading it. Like the once-ubiquitous 8-track tape, the filing cabinet was an essential marker of modernization that’s now considered clunky and outdated, with none of the mystique that some objects, such as vinyl records and windup clocks, have acquired over time. But Robertson’s captivating history makes the case that, when the filing cabinet was invented in the 1890s, it represented a new mode of efficient work. And today, its legacy informs some of our most innovative technologies, including search, Siri, and the way we organize the files on our computers.
Consider the rationale that one of Google’s co-founders, Larry Page, offered when Gmail was released. He cited the experience of one user who had asked whether there was a way to fix email: “She kvetched about spending all her time filing messages or trying to find them,” he is quoted as saying in the company’s original press release. “And when she’s not doing that, she has to delete email like crazy to stay under the obligatory four megabyte limit.” Page could have been describing the problems of the 20th-century office, which had found itself inundated with paper, as managers yearned for a simple way to file documents and find them again quickly.
A vertical filing cabinet, Robertson writes, “allowed a user to find papers ‘at a moment’s notice’ or ‘almost instantaneously.’” It has a couple of origin stories, and at least two inventors: the Library Bureau, a library-supply company that built a prototype based on an idea from a secretary in Buffalo, New York; and Edwin Seibels, an insurance sales agent who tried, and failed, to get a patent for his version. (The patent office considered it an idea, not a device.) With the right tabs and folders, a filing cabinet made the process of sorting and collating documents intuitive. Strategically, it was sometimes advertised as a machine of its own, alongside the then-nascent telephone and calculator: The turn of the century saw a flourishing of office innovations, and marketers made sure to use words such as equipment and appliances, rather than furniture. This naming convention plugged the humble filing cabinet into the world of modern technology, granting it the same sort of pizzazz that software developers today aim for in describing, say, a computational process as “artificial intelligence” rather than as a “computer program.”
The 20th century also saw an emergent information paradigm shaped by corporate capitalism, which emphasized maximizing profit and minimizing the time workers spent on tasks. Offices once kept their information in books—think Ebenezer Scrooge with his quill pen, updating his thick ledger on Christmas. The filing cabinet changed all that, encouraging what Robertson calls “granular certainty,” or “the drive to break more and more of life and its everyday routines into discrete, observable, and manageable parts.” This represented an important conceptualization: Information became a practical unit of knowledge that could be standardized, classified, and effortlessly stored and retrieved.
Take medical records, which require multiple layers of organization to support routine hospital business. “At the Bryn Mawr Hospital,” Robertson writes, “six different card files provided access to patient information: an alphabetical file of admission cards for discharged patients, an alphabetical file for the accident ward, a file to record all operations, a disease file, a diagnostic file, and a doctors’ file that recorded the number of patients each physician referred to the hospital.” The underlying logic of this system was that the storage of medical records didn’t just keep them safe; it made sure that those records could be accessed easily.
Robertson’s deep focus on the filing cabinet grounds the book in history and not historical analogy. He touches very little on Big Data and indexing and instead dives into the materiality of the filing cabinet and the principles of information management that guided its evolution. But students of technology and information studies will immediately see this history shaping our world today. Curious about the chemical composition of quartz? Google will show you the results in a series of stacked cards and provide tags and bits of structured data to help you sift through the results. Need to know the capital of Bhutan? Ask Siri, and a friendly female voice—most filing-cabinet clerks were women, the book reminds us—will do a quick scan of digital records and provide a helpful answer: Thimphu, situated in the western part of the country. (Full disclosure: The technology non-profit Meedan, where I oversee operations, receives funds from the Google News Initiative to support its work on information trust and quality, including a recent COVID-19 collaboration with the Australian Science Media Centre.)
But if the filing cabinet, as a tool of business and capital, guides how we access digital information today, its legacy of certainty overshadows the messiness intrinsic to acquiring knowledge—the sort that requires reflection, contextualization, and good-faith debate. Ask the internet difficult questions with complex answers—questions of philosophy, political science, aesthetics, perception—and you’ll get responses using the same neat little index cards with summaries of findings. What makes for an ethical way of life? What is the best English-language translation of the poetry of Borges? What are the long-term effects of social inequalities, and how do we resolve them? Is it Yanny or Laurel?
Among the many charming aspects of The Filing Cabinet are the vintage advertisements peppered across its pages. One Shaw-Walker campaign promised a filing cabinet “built like a skyscraper” (another modern marvel of the time) but simple enough for a child to operate. A Yawman and Erbe ad called files the “treasure chests of business” and showed a woman peering lovingly at a tidy set of tabulated folders. Then there are the peripherals. Remington Rand’s Multisort device, for example, claimed to help clerks sort hundreds of papers quickly before placing them in filing cabinets. The physical design of the filing cabinet itself, in other words, helped tell a story about modernization and ease of use.
These advertisements show most clearly how the architecture of the filing cabinet informs modern information organizing. We riffle through the rectangular boxes of search results and email with a swipe of our fingertips, the way we would with file folders. Browser tabs, the blessing and curse of web surfing today, help us sort content; the tabs display a brief summary of what’s inside. Modern search relies on a process called indexing, which aims to store and parse data to maximize information retrieval. Contemporary machine-learning systems tend to rely on data classification and cleaning done by a host of invisible workers, moving quickly through data like the office clerks of yesteryear. But the sheer scale of the effort makes these classification systems and algorithms difficult to comprehend.
The opacity of our information systems comes with a cost. It would have been absurd, in the early 20th century, to ask a file-cabinet worker for answers about matters that lie outside the scope of daily business affairs. If you wanted to pull up client correspondence, the file clerk had a file for you. If you wanted to understand John Locke’s theory of mind, they’d probably point you to the library. There, you would also encounter filing cabinets, with index cards organized by author, subject, or title, following the Dewey Decimal system; such a system didn’t replace books but rather made them easier and more efficient to locate. And books are just one part of a system of learning that might include other forms of media, discussion with peers and mentors, and life experiences. Unlike the granular certainty of information, the acquisition of complex knowledge is a multimodal, evolving experience, something modern information systems can only begin to support.
The coronavirus pandemic is just one potent example of this tension between information and knowledge. In the early days, it became clear that very little was clear about what was going on. Questions about COVID-19’s transmission, its prevention, and its range of symptoms were subject to further inquiry by scientists and researchers. Our information systems were rife not simply with misinformation—false or misleading information—but with what I call “midinformation” (note the d), or informational ambiguity based on scant or conflicting evidence, in many cases about emerging scientific knowledge. An information environment designed for certainty butted up against a hazier reality. Last spring, scientists were learning how, exactly, the novel coronavirus spreads. As a consequence of a reasonable level of concern, many of us regularly wiped down food packaging, mail, and countertops. By last May, the CDC had updated its guidelines, confirming that surface transmission is possible, but likely not the main way the virus travels. As the global-health scholar Emily LaRose has written, sometimes online information is partially true but misunderstood because it is incomplete or inadequately contextualized. Sometimes, science just takes time.
Can emergent knowledge coexist with an internet that privileges certainty? The mystery of periodical cicadas is an example of midinformation at work; knowledge about them evolves in fits and starts as different broods appear every 13 or 17 years. We still don’t know why their life cycle follows such an odd pattern, or how they calibrate their internal clock. But we at least have new tools that could help collect and classify more data about them, and bring more people along in the process of learning more about these creatures. In the spring of 2019, Gene Kritsky, the dean of behavioral and natural sciences at Mount St. Joseph University, in Cincinnati, released the Cicada Safari app, to encourage citizen scientists to document cicadas around them. Key metadata—time, date, and coordinates—are captured, and the images are filed and tagged after being verified. Already, the app has yielded surprises for Kritsky and his team, such as the first-ever documentation of off-cycle emergence in certain broods. As he reminded me, even the large data set enabled by the app is simply establishing a baseline for 2038. Changes in geographic spread, declines in the population, the effects of deforestation—these are all potential observations we’ll have to wait another 17 years to study further.
As data points about cicadas improve over time, I imagine new mysteries will emerge. Such is the nature of knowledge—today, we live with more information at our fingertips than in all of human history, and yet our lives are filled with questions. And besides, entomological data may never quite explain the pathos of these creatures, which have symbolized death and rebirth for the ancient Chinese and ancient Greeks alike; the reason we find their red eyes so haunting; or the fact that some people find their song captivating and others find it annoying. Information collection and distribution today tends to follow the rigidity of cabinet logic to its natural extreme, but that bias leaves unattended more complex puzzles. The human condition inherently demands a degree of comfort with uncertainty and ambiguity, as we carefully balance incomplete and conflicting data points, competing value systems, and intricate frameworks to arrive at some form of knowing. In that sense, the filing cabinet, despite its deep roots in our contemporary information architecture, is just one step in our epistemological journey, not its end.