There is no doubt that the coronavirus was spreading in the United States in January. We can at least start with that. Recently, California’s Santa Clara County reported that bodily tissues from a woman who died on February 6 tested positive for the coronavirus. She had not traveled outside the country, and based on what is known about the virus, she must have picked it up by January 31; in all likelihood, she was infected a week or two before that. The virus, it turned out, had been spreading in the United States well before we suspected, and weeks earlier than previous official estimates of community transmission had accounted for.
Of all the things we still don’t know about SARS-CoV-2—How far can it travel through the air? What treatments can tame it? How many people will it kill?—the number of people who might have been infected with the virus in January has held a special allure. A reliable estimate could help determine just how bad the United States’ botched early response to the pandemic was. We already know that the government failed to detect as many as 28,000 infections by March 1, so just how late to the game were we? Knowing more about January infections could also offer clues to the true number of Americans who have now been infected—thanks to a shortage of tests, the official count of 1.2 million is almost certainly too low. A firm number could inform our strategies for preventing subsequent waves of COVID-19 from becoming even more disastrous than the first.
Curiosity about January isn’t driven only by the collective good. Many Americans are eager to know whether that lingering cough or punishing fever they had back in the winter was in fact COVID-19. It’s tempting to believe that a sizable chunk of the population secretly carried the virus in January, because then there would be hope that anyone who came down with a suspicious illness in the past few months has some degree of immunity now. At the very least, if the U.S. dramatically increased its estimate of January infections, Americans could have some scientific backing for mourning a death in the family as coronavirus-related.
Unfortunately, experts generally agree that the actual number of Americans carrying the virus by the end of January was nowhere near high enough to support speculation about herd immunity and star-studded superspreader events. Beyond that, the estimates vary widely. Lauren Gardner, an associate professor of engineering at Johns Hopkins University, who created the school’s popular dashboard for tracking coronavirus cases, told me that “there could have been hundreds of cases in the U.S. in January and thousands by the end of February.” Trevor Bedford, a biologist at the Fred Hutchinson Cancer Research Center who has been at the forefront of the genetic study of SARS-CoV-2, says that “more than 10, less than 100 would be my guess.” Caitlin Rivers, a senior scholar at the Johns Hopkins Center for Health Security, told me that she’s “confident it is not zero” and that “it seems like it’s not millions.”
It is scientifically possible for a country months into an infectious-disease outbreak to determine with some certainty how many of its residents were infected in the first few weeks. The challenge is that doing so would require data about the United States and the rest of the world that are currently a mystery. Many Americans’ most pressing questions—Did I have the disease without knowing it? When?—will remain unanswerable forever. But with time, we’re likely to gain some limited clarity about what exactly happened at the beginning of this year. And we’re probably not going to like what we find.
One crucial factor for estimating how many Americans were infected in January is understanding how many sick people traveled here in the first place. The more people who carried the virus into the country—whether they were visitors or Americans returning home—the more chances it would have had to cause large, undetected outbreaks.
By mid-February, 12 COVID-19 cases related to travel from China had been detected in the U.S. There’s good reason to believe the actual number was at least marginally higher. In February, epidemiologists at both Harvard and Imperial College London estimated that the world’s disease-surveillance systems caught only about one of every three infections exported from Wuhan, China, where the virus was first identified. According to Bedford, this ratio suggests that about 20 to 50 infected people arrived in the U.S. from China or other countries in January.
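The arithmetic behind a range like that can be sketched in a few lines of Python. This is only an illustration, not Bedford's actual method: the function name and the low and high detection fractions are assumptions chosen for the example, while the 12 detected cases and the one-in-three figure come from the estimates above.

```python
def estimate_true_imports(detected: int, detection_fraction: float) -> float:
    """Scale detected cases up by the assumed share of cases that were caught."""
    if not 0 < detection_fraction <= 1:
        raise ValueError("detection_fraction must be in (0, 1]")
    return detected / detection_fraction


DETECTED_CASES = 12  # travel-related cases detected by mid-February (from the text)

# One-in-three is the published estimate; 0.25 and 0.60 are illustrative
# bounds assumed here to show how sensitive the result is.
SCENARIOS = [("low detection", 0.25), ("one in three", 1 / 3), ("high detection", 0.60)]

for label, fraction in SCENARIOS:
    estimate = estimate_true_imports(DETECTED_CASES, fraction)
    print(f"{label}: about {estimate:.0f} imported infections")
```

Under these assumed detection fractions, the estimates fall between roughly 20 and 50, which is why a small change in how many cases surveillance caught moves the answer so much.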
SARS-CoV-2 is highly contagious, but a few dozen imported cases would probably not be enough to spark many major undetected outbreaks. Based on related diseases such as SARS and MERS, epidemiologists suspect that the coronavirus’s spreading potential is irregular. In all likelihood, some sick people infect many others, but most infect just a handful. Alessandro Vespignani, a network scientist and public-health professor at Northeastern University, estimates that in each American city that later became a hot spot for COVID-19, perhaps 10 to 20 “local transmission events” occurred in January. Aside from the one or two infections that did seed major outbreaks in places such as Seattle and New York, most infections that arrived from outside the country in January would have been transmitted to at most a few people, then quickly “fizzled out,” Bedford told me.
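One way to see why most introductions would fizzle out is a toy branching-process simulation in Python. Everything numeric here is an assumption made for illustration, not a figure from the experts quoted: an average of 2.5 secondary infections per case, and a dispersion value of 0.1 of the sort used to describe SARS-like "superspreading." The function names are hypothetical as well.

```python
import math
import random


def poisson(lam: float, rng: random.Random) -> int:
    """Draw a Poisson count with Knuth's method (fine for modest means)."""
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def secondary_cases(r0: float, k: float, rng: random.Random) -> int:
    """Gamma-Poisson mixture: most cases infect few people, a few infect many."""
    individual_rate = rng.gammavariate(k, r0 / k)  # mean r0, high variance when k is small
    return poisson(individual_rate, rng)


def introduction_fizzles(r0: float, k: float, rng: random.Random,
                         outbreak_size: int = 200) -> bool:
    """Follow one imported case; True if the chain dies before reaching outbreak_size."""
    active, total = 1, 1
    while active > 0:
        if total >= outbreak_size:
            return False  # treated as an established outbreak
        new_cases = sum(secondary_cases(r0, k, rng) for _ in range(active))
        total += new_cases
        active = new_cases
    return True


def fizzle_fraction(r0: float, k: float, trials: int = 2000, seed: int = 0) -> float:
    """Share of simulated introductions that die out on their own."""
    rng = random.Random(seed)
    fizzled = sum(introduction_fizzles(r0, k, rng) for _ in range(trials))
    return fizzled / trials


# ASSUMED parameters, chosen only to illustrate irregular spreading.
print(f"Share of introductions that fizzle: {fizzle_fraction(r0=2.5, k=0.1):.0%}")
```

With spreading this uneven, a large majority of simulated introductions end without a sustained outbreak, even though the average case infects more than one person. The exact share depends entirely on the assumed parameters.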
Establishing a more precise count of how many sick people carried SARS-CoV-2 to the U.S. early this year would require data that can be difficult or impossible to collect, especially during a major global-health crisis. For one, researchers would need to know how many people were actually sick with COVID-19 around the world in January (or earlier). The official data out of Wuhan have been unreliable from the start. And countries that have since ramped up their coronavirus-detection efforts were not looking as carefully for cases at the beginning of the year. The World Health Organization did not declare a global-health emergency until January 30.
Researchers would also need to know where people traveled around the world in the early weeks of the pandemic. “One of the big challenges of looking at actual global spread of this disease is that, from January on, travel patterns have been massively disrupted,” Gardner said. When researchers don’t have perfect travel data for a time and place they’re studying, she explained, they often substitute or extrapolate from data in the recent past. “Sometimes you can say, ‘Well, I don’t have 2016 data, but I’m using 2015 data. That’s representative.’ That does not apply anymore.” The Chinese government shut down Wuhan on January 23; even before then, individual people’s movement patterns might have begun to shift in ways that are difficult to track.
When the living cannot be fully accounted for, one way to move forward is to tally the dead. Testing shortages mean that some COVID-19 deaths have gone undetected, but researchers can get a better handle on just how many people the virus killed during a given time period by looking at the excess mortality: how many more people died than would have been expected to under normal circumstances.
Last week, the National Center for Health Statistics published preliminary data on weekly excess deaths since January 2017, which will be updated as the pandemic wears on. Bob Anderson, the chief of the NCHS’s mortality-statistics branch, told me that it’s “the first time we’ve done something like this before the data were final.” The hope is that researchers can use the gross numbers to estimate how many Americans died of COVID-19 over a particular period, and from there estimate how many Americans were infected. But picking out excess deaths in the first few weeks of this year will be difficult. Compared with the hundreds of thousands of deaths the country experiences in a typical month, a handful of COVID-19 deaths would hardly be a blip. Indeed, by the NCHS’s count, the United States did not exceed the expected number of deaths by a significant margin until the week of March 22.
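A simplified Python sketch shows why a handful of deaths disappears into the baseline. It is not the NCHS's method: the rule used here (flag a week only if it exceeds the prior-year average by two standard deviations) is an assumption, and the weekly death counts are made-up placeholders, not real data.

```python
from statistics import mean, stdev


def excess_deaths(observed: int, prior_years: list[int]) -> tuple[float, bool]:
    """Return (observed minus expected, whether the gap clears a 2-sd threshold)."""
    expected = mean(prior_years)
    threshold = expected + 2 * stdev(prior_years)
    return observed - expected, observed > threshold


# PLACEHOLDER counts for the same calendar week in three earlier years.
PRIOR_YEARS = [57_200, 58_000, 59_500]

# A typical week plus 20 extra deaths, versus a week with a large surge.
for label, observed in [("handful of extra deaths", 58_253), ("large surge", 70_000)]:
    excess, flagged = excess_deaths(observed, PRIOR_YEARS)
    print(f"{label}: excess of about {excess:.0f}, flagged = {flagged}")
```

Because ordinary year-to-year variation in this toy baseline runs to more than a thousand deaths a week, 20 additional deaths do not register, while a surge of thousands does.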
Even if the U.S. is able to more precisely nail down excess winter deaths a few months from now, using those numbers to calculate COVID-19 infections would depend on very accurate estimates of the virus’s fatality rate across age groups, according to Kaiyuan Sun, a postdoc at the National Institutes of Health’s Multinational Influenza Seasonal Mortality Study. Those estimates are just not there yet. Gardner pointed out that the case fatality rate of COVID-19 has been inconsistently reported, and that making reliable calculations with it will be hard until numbers from around the world start to balance out, which “could easily take through the year.”
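The dependence on the fatality rate is easy to see in a short Python sketch. The death count and the candidate fatality rates below are placeholders assumed for illustration, not estimates from Sun, Gardner, or any study, and the calculation ignores the lag between infection and death.

```python
def infections_from_deaths(deaths: int, fatality_rate: float) -> float:
    """Back out infections from deaths, given an assumed infection fatality rate."""
    if not 0 < fatality_rate <= 1:
        raise ValueError("fatality_rate must be in (0, 1]")
    return deaths / fatality_rate


HYPOTHETICAL_DEATHS = 50  # placeholder, not a real count

# ASSUMED fatality rates spanning a plausible-looking range.
for rate in (0.002, 0.005, 0.01):
    implied = infections_from_deaths(HYPOTHETICAL_DEATHS, rate)
    print(f"fatality rate {rate:.1%}: about {implied:,.0f} infections")
```

The same number of deaths implies five times as many infections at the lowest assumed rate as at the highest, which is why an imprecise fatality rate makes the back-calculation unreliable.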
Cities and states around the country could consider following Santa Clara’s lead and combing through autopsy data of people who died earlier this year. “A routine part of the autopsy is to collect various tissue samples,” James Gill, Connecticut’s chief medical examiner, told me. Those tissues—pieces of the deceased’s lungs, heart, and so on—might be kept for six months to a year, or preserved in paraffin wax and stored for several years. They could be tested for the presence of SARS-CoV-2 whenever anyone decided to look at them.
California Governor Gavin Newsom recently ordered the state’s coroners to review autopsies back to December. But other states have been hesitant to make similar commitments. Gill estimates that fewer than 10 percent of patients who die in hospitals get autopsies. Combined with the small number of coronavirus deaths that likely occurred in January and February, this limited pool of autopsy data means that few COVID-19 infections are likely to be captured. Reviewing tissues from January and February autopsies could produce a few needles in the haystack, and perhaps push back the timeline of community spread in the U.S. even further. But medical examiners are unlikely to find enough infected samples to draw any meaningful conclusions about how many deaths in the first few weeks of the year were actually COVID-19–related.
A third possible approach involves evidence that the virus leaves behind in the blood. After the human body fights off a viral infection, the immune system continues to produce antibodies—proteins that can recognize the invader if it returns. Right now, blood tests can tell you if coronavirus antibodies exist in your body at the time of testing, with limited accuracy. But they can’t tell you when you were exposed to the virus. If scientists can pin down how long coronavirus antibodies stick around in the body, accurate serology tests would tell a patient the earliest possible date they were exposed.
During the course of any viral infection, the human body produces different proportions of different kinds of antibodies. Gigi Gronvall, an immunologist at the Johns Hopkins Center for Health Security, told me that with enough data, researchers could theoretically determine the antibody ratios likely to be found in the blood at different numbers of days or weeks postinfection. But, Gronvall said, individual variation in antibody ratios would be common. So even if a study like this were attempted, the best it could do is predict what proportion of many, many blood samples indicate an infection around January; such a method probably couldn’t determine when exactly a specific person’s infection occurred.
Given that scientists have yet to settle on how long the body produces any SARS-CoV-2 antibodies, no one should be optimistic that a system like the one Gronvall imagined will be functional soon, if ever. Just determining the immunity window of the virus could take “months and years,” according to Rivers, the senior scholar at the Johns Hopkins Center for Health Security. “The virus has only really existed in the world, as far as we know, for a few months. And so we just haven’t been able to study it over time.”
Bedford, the biologist who has studied genetic changes in SARS-CoV-2, suggested a simpler way to get around the time-blindness of serological tests: Use blood from January. He explained that often, when patients have their blood drawn for testing in a hospital, the hospital keeps the sample and makes it available to researchers, who will know only the date the blood was drawn and the patient’s age. Testing a critical mass of blood drawn in January to see what percentage has antibodies to SARS-CoV-2 could help modelers infer what percentage of the entire U.S. population was infected.
Such a plan, though, would work only with serological tests that are much more accurate than the ones currently available. As Bedford explained, about 1 percent of results from the current tests are likely to be false positives. There’s some math involved, but the upshot is that until more than 1 percent of the population is infected—about 3 million people nationwide—you can’t actually draw any conclusions from the results. The odds that 3 million Americans had COVID-19 in January are slim to none.
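The math in question is the standard positive-predictive-value calculation, sketched here in Python. The 1 percent false-positive rate comes from Bedford's description; the 90 percent sensitivity and the function name are assumptions made for this example.

```python
def positive_predictive_value(prevalence: float,
                              sensitivity: float = 0.90,
                              false_positive_rate: float = 0.01) -> float:
    """Share of positive test results that reflect a real past infection."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Prevalence values to compare; sensitivity of 0.90 is an ASSUMPTION.
for prevalence in (0.0001, 0.001, 0.01, 0.05):
    share_real = positive_predictive_value(prevalence)
    print(f"{prevalence:.2%} truly infected: {share_real:.0%} of positives are real")
```

When 1 percent of people have truly been infected, roughly half of all positive results are false alarms. At the far lower prevalence plausible for January, nearly every positive would be a false one, so the results could not distinguish a small number of infections from none.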
THE BOTTOM LINE
After Bedford explained to me how retrospective blood tests would work, he clarified that they “probably shouldn’t have strong research priority.” Other experts I spoke with agreed that chasing after precise numbers from January is unlikely to bring the closure Americans might be looking for, or make a tremendous difference in what we as a country choose to do next. Even if we could wave a magic wand and drop the exact number of infections for the month into every epidemiological model we have, “I don’t know that there’s any immediate changes that we would make to our current response based on that information,” Rivers said.
Still, there are some benefits to determining more precisely just how many infections escaped America’s detection in the first weeks of the year. Peter Hotez, a dean at the National School of Tropical Medicine at Baylor College of Medicine, mentioned that many countries in the Global South are still in the earliest stages of their own outbreaks. The more information that public-health experts in those places can get about how COVID-19 began to spread through the U.S., along with Europe and East Asia, the better they can address local risks.
It’s not just other countries: In all likelihood, the United States will face more waves of COVID-19 before the disease is brought to heel by a vaccine or herd immunity, both of which are months if not years off. We will probably, in a matter of weeks or months, find ourselves at a point along the epidemiological curve that is more similar to late January than mid-March. Repeating the same mistakes would be a tragedy. “We don’t want to reopen and find ourselves [sitting] on an invisible epidemic, so that then, at the end of [another] four weeks or six weeks, we have to shut down everything again,” Vespignani, the Northeastern professor, told me.
After weeks of our political leaders offering revisionist histories and lies about the severity of the outbreak in the United States, providing a close-to-exact portrait of the beginning of the outbreak might also help quell rumors that COVID-19 has been secretly spreading through pockets of the country since the fall. These theories can lead people to think that immunity is far more widespread than it actually is, and therefore encourage them to flout social-distancing guidelines. This is especially concerning as dozens of states, against the advice of public-health experts, have eased or made plans to ease legal restrictions around nonessential businesses and travel.
To those of us desperate for any sort of certainty right now, January can seem like a good contender for a solvable mystery. But even if we did settle on a more precise tally of infections, it would not change much of what we already know. The data are already clear enough to show that, unless you traveled to Wuhan in December or January or were in close contact with someone who did, any flu-like symptoms you experienced in the first month of the year were most likely not caused by SARS-CoV-2. More precise numbers could marginally change our understanding of who might be immune, but they will not change the reality of the nightmare we’re living through. This is a hard truth to swallow. We still have a long way to go before we can return to work and pass others on the sidewalk and hug without fear.
Even so, the more knowledge we have about how this crisis began, the more power we might have to shape how it ends. Despite all the barriers to deducing the pandemic’s history, Vespignani said that modeling the future is still more difficult. Modelers have to consider human behavior, policy, and testing resources, which are shaped not only by the speed of science but by political and economic will. “We need to prepare a different approach to the future,” Vespignani said. “We don’t want to repeat history.”