Microbiology has always been about recognizing the scale of what is unknown. In the beginning, the unknown was that microbes existed at all.

The invention of the microscope proved that these tiny, single-celled organisms live all around us. Then the invention of DNA sequencing helped reveal the full diversity of microbes out in nature, 99 percent of which cannot be grown and studied in a lab.

Well, not the full diversity, it turns out. The three domains that make up all life on Earth are bacteria, archaea, and complex life (which includes us humans). And a new study of gut microbiomes finds that a common DNA sequencing technique overlooks 90 percent of the diversity in archaea, the single-celled organisms that comprise the oft-forgotten third domain. Archaea and bacteria are both microbes, microscopic and single-celled, but do not make the mistake, as scientists once did, of thinking they are basically alike.

Archaea are at once alien and intimately familiar. Many of the most well-known ones are extremophiles, which live in harsh environments like hot springs. Yet archaea may be evolutionarily more closely related to us multicellular humans than bacteria are. “When I tell people I work on archaea, most people don’t even know what it is,” said Kasie Raymann, a microbiologist at the University of Texas at Austin and an author of the study.

Raymann and her colleagues wanted to study archaea living in the guts of humans as well as other great ape species like bonobos, gorillas, and orangutans. So they used the typical method: First, find some ape poo. And second, extract DNA from it. Scientists looking for microbial diversity usually sequence regions from one particular gene, called 16S rRNA. To find it, they use primers, two short pieces of DNA that match the beginning and end of the region they want to sequence. The primers stick to the two ends, and enzymes swoop in to copy the flanked region thousands or millions of times over. It’s these copies that get sequenced.

As is still typical, the team used “universal primers,” which are supposed to pick up both bacteria and archaea. But they also sequenced their samples a second time using archaea-specific primers. The difference was profound. In humans, the universal primers found just one type of archaea; the specific primers found 37. The other apes showed the same gap, with the specific primers’ tally listed first. Orangutans: 161 to 7. Gorillas: 135 to 7. Bonobos: 71 to 6. Chimpanzees: 69 to 7. The most common group of archaea the team found were methanogens, which produce methane.

Why the discrepancy? First of all, archaea tend to be less abundant than bacteria in the gut, so they are trickier to find in a needle-in-the-haystack kind of way. Second, the primers themselves may be flawed. If the DNA sequence on the primers does not perfectly match the sequence of the gene, the primers sometimes don’t stick. “Universal primers” tend to be designed with well-known bacteria in mind. Obscure groups of bacteria are susceptible to primer bias too, but archaea are especially so.
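
For readers who think in code, here is a toy sketch of that primer problem. Everything in it is made up for illustration: the sequences are invented, both primers are written as they would read along a single strand (real reverse primers bind the opposite strand), and binding is treated as all-or-nothing, whereas real primers only sometimes fail on a mismatch.

```python
def find_site(template, primer, max_mismatches=0, begin=0):
    """Return the first index at or after `begin` where the primer binds, else None."""
    for i in range(begin, len(template) - len(primer) + 1):
        window = template[i:i + len(primer)]
        mismatches = sum(a != b for a, b in zip(window, primer))
        if mismatches <= max_mismatches:
            return i
    return None


def amplify(template, forward, reverse, max_mismatches=0):
    """Return the region flanked by the two primers, or None if either fails to bind."""
    start = find_site(template, forward, max_mismatches)
    if start is None:
        return None
    inner_start = start + len(forward)
    end = find_site(template, reverse, max_mismatches, inner_start)
    if end is None:
        return None
    return template[inner_start:end]


# Hypothetical toy sequences, not real 16S rRNA data.
UNIVERSAL_FORWARD = "ACGTAC"
ARCHAEAL_FORWARD = "ACGAAC"   # differs from the universal primer by one base
REVERSE = "GGATCC"

bacterium = "TT" + "ACGTAC" + "CCCAAATTT" + "GGATCC" + "AA"
archaeon = "TT" + "ACGAAC" + "CCGAGATTT" + "GGATCC" + "AA"

print(amplify(bacterium, UNIVERSAL_FORWARD, REVERSE))  # flanked region is copied
print(amplify(archaeon, UNIVERSAL_FORWARD, REVERSE))   # None: primer never binds
print(amplify(archaeon, ARCHAEAL_FORWARD, REVERSE))    # flanked region is copied
```

A single mismatched base is enough, in this simplified model, to make the archaeon invisible to the universal primer while the archaea-specific primer picks it up.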

“It’s a nice advance and confirmation,” said Jack Gilbert, a microbial ecologist at the University of Chicago who was not involved in the study. Gilbert said he would like to see scientists use these archaea-specific primers in different environments—like the ocean or soil—to see if there really is more archaeal diversity lurking out there than previously thought.

Archaea do have a history of being overlooked. In the early days of microbiology, scientists relied on culturing microbes. Archaea that live in extreme environments like hot springs don’t really thrive on petri dishes in labs. “They’re very hard to culture. It’s super difficult,” says Raymann.

You get a cycle: Archaea are difficult to study, so scientists don’t study them. Because they don’t study them, they don’t know very much about them. Because they don’t know very much about them, they don’t know how best to study them through culturing or sequencing. And so on. “All of that contributes to bias in the knowledge base about archaea,” says Tanja Woyke, a microbiologist at the Department of Energy’s Joint Genome Institute.

In recent years, microbiology has been moving to a more advanced sequencing technique called metagenomics, which sequences all of the genetic material in a sample, rather than just part of the 16S rRNA gene. Metagenomics is still too expensive if you want to compare thousands of samples, but smaller microbiome studies often rely on it now. Even this fancy new technique can harbor biases, though.

Metagenomics requires chopping up all of the genetic material in a sample and piecing those short snippets back together using genomes of known organisms for reference. So if your sample contains a novel microbe that has never been sequenced before, you’re not going to find it by comparing it to known genomes. You can’t look for what you don’t know exists. (There are ways to do metagenomics without reference genomes, but they take a lot of computing power.)
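
The blind spot can be sketched in a few lines of code. This is a deliberately crude stand-in, not how real metagenomics software works: it assumes a snippet counts as a match only if it appears exactly inside a reference genome, and the genome, the snippets, and the name `assign_reads` are all invented for the example.

```python
def assign_reads(reads, references):
    """Sort snippets into those found in a known genome and those left unexplained."""
    assigned = {}
    unassigned = []
    for read in reads:
        # Take the first reference genome that contains this snippet exactly.
        hit = next(
            (name for name, genome in references.items() if read in genome),
            None,
        )
        if hit is None:
            unassigned.append(read)
        else:
            assigned.setdefault(hit, []).append(read)
    return assigned, unassigned


# Hypothetical reference database holding a single known organism.
references = {"known_bacterium": "ATGGCGTACGTTAGC"}

# Two snippets come from the known organism; the third comes from a novel microbe.
reads = ["GCGTAC", "GTTAGC", "TTCCGGAA"]

assigned, unassigned = assign_reads(reads, references)
print(assigned)
print(unassigned)
```

The snippet from the novel microbe lands in the unexplained pile no matter how many times it is sequenced, because nothing in the database resembles it.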

The microbes most likely to lack a reference genome are, of course, archaea. For example, the Joint Genome Institute’s genome database has nearly 50,000 bacterial genomes and just 1,000 archaeal ones. “If you only have a 1,000 references for archaea, then the likelihood you’ll find a match will be much lower,” says Woyke.

That’s why Woyke is working on the Microbial Dark Matter project, which generates reference genomes for obscure microbes. The “dark matter” here comprises both bacteria and archaea (but especially archaea) that microbiologists have been unable to culture and have never fully sequenced. Woyke’s lab takes samples from remote environments, such as ocean-floor vents or gold mines, and isolates microbial cells one by one. The team sequences DNA from these single microbes, one cell at a time, slowly filling in the unknowns in the tree of life.