When you do this, you ought to get more or less the same number of reads for every gene in an animal’s genome. But when the Edinburgh team looked at their own data, they saw that many reads were incredibly rare while others were up to 10 times as common. “There is no way, biologically, these can be part of the same genome,” says Mark Blaxter, who led the team. Instead, the rare reads probably came from stowaway bacteria. The team carefully cleaned up their data to remove these contaminating sequences.
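The reasoning behind that cleanup can be sketched in a few lines of Python. This is only an illustration of the idea that stretches of sequence with far fewer reads than the rest are suspect. It is not the Edinburgh team's actual pipeline, which the article does not describe. The function name, the contig names, the coverage numbers, and the one-tenth cutoff are all made up for the example.

```python
from statistics import median


def flag_low_coverage(coverage_by_contig, fraction=0.1):
    """Return names of contigs whose read coverage is below
    `fraction` of the median coverage (hypothetical cutoff)."""
    typical = median(coverage_by_contig.values())
    cutoff = typical * fraction
    return sorted(
        name for name, cov in coverage_by_contig.items() if cov < cutoff
    )


# Invented coverage values: three stretches sequenced about 100 times
# over, and two that turn up far more rarely.
coverage = {
    "contig_a": 98.0,
    "contig_b": 105.0,
    "contig_c": 110.0,
    "contig_d": 6.0,
    "contig_e": 4.5,
}

suspects = flag_low_coverage(coverage)
print(suspects)  # ['contig_d', 'contig_e']
```

Low coverage alone does not prove that a sequence is bacterial, so a real analysis would need to weigh it against other evidence before discarding anything.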
They ended up with around 500 genes that potentially came from microbes, and they still think that most of these are from contaminants that they haven’t been able to sort through yet. They only have strong evidence for 36 genes being horizontally transferred from bacteria, which is within the common range for animal genomes.
The Edinburgh team were still polishing their data when the rival paper came out last Monday, claiming 6,600 horizontally transferred genes. They were shocked, but rushed to analyze the data that Bob Goldstein from UNC quickly uploaded onto a server.
Worryingly, they found that the UNC data included many reads that they hadn’t seen at all—even though both groups sequenced the same animals! And most of these phantom reads were rare. Based on this, the Edinburgh team concluded that around 30 percent of the UNC genome probably came from contaminating microbes.
“If this is true, it is damning,” says John McCutcheon from the University of Montana. “But it’s also surprising, because much of what [the UNC team] did was pretty careful, so I would not have expected them to miss this.”
“We thought seriously about the possibility of contamination—it was of course the most likely initial explanation for the large amount of foreign DNA found in our assembly—and much of the analysis in our paper was designed specifically to address this issue,” says Thomas Boothby from the UNC team, in a comment.
For example, the UNC team focused on 107 parts of their assembled genome where a gene of bacterial origin seemed to sit next to one of animal origin. They used a technique that started with DNA sequences at either end of these pairs, and tried to duplicate all the DNA in between. If the technique worked—and it largely did—it would mean that both genes really are connected on the same stretch of DNA, and that the bacterial one couldn’t have come from contaminants. What’s more, 54 of these 107 genes were among the 500 that the Edinburgh team had singled out as being potential horizontal transfers.
Much depends on whether these 107 pairs are representative of the 6,600 genes that were supposedly transferred from bacteria. The UNC team is treating them as such. The Edinburgh team thinks they can’t possibly be. David Baltrus from the University of Arizona, who wasn’t involved with either group, says, “My gut is telling me that the UNC authors somehow, probably unintentionally, biased picking this random subset.”