Date: 2018-09-18

“I find that very depressing,” says Jay Shendure from the University of Washington. “It is stunning that we sit here 15 years after the Human Genome Project, and still know little to nothing about so many genes. In a world of finite resources, it does not make sense to invest equal effort in every gene. But it’s clear that something is amiss in the status quo of research allocation.”

In what Amaral describes as a “heroic effort,” Stoeger spent years collating information from dozens of databases about every known gene. Using machine-learning tools, he then showed that he could accurately predict how many papers have been published about a given gene using just 15 traits.

Some of these telltale traits—how often the gene is mutated, or the negative consequences of losing it entirely—certainly reflect the gene’s importance and its relevance to human disease. They’re the kind of characteristics scientists should be paying attention to.

But other traits—how big the gene is, how active it is, how many tissues it is active in, whether it produces proteins that are secreted from cells, whether those proteins are soluble in water, and more—reflect how amenable the genes are to experiments. Highly active genes, for example, are easier to detect using older methods. “That definitely had a substantial impact on whether you were even able to study a gene in [the 1980s and 1990s],” says Sharon Plon from Baylor College of Medicine. And those historical quirks are better at predicting how the National Institutes of Health currently allocates its money than thousands of other features that more directly reflect what we now know about the role of genes in disease.

It’s possible, of course, that scientists have already identified all the really important genes, and are allocating their attention appropriately. There are good reasons, for example, why p53 is the most popular human gene: It protects our cells from cancer, and is itself mutated in half of all tumors. More broadly, Stoeger found that compared to the least popular genes, the most popular ones are three to five times more likely to have been linked to diseases in large studies, or to wreak havoc when they accrue incapacitating mutations. The problem is that those celebrity genes get 13 times more attention than their neglected counterparts. Scientists do tend to study important genes, Stoeger says, but even then, they do so disproportionately.

That’s partly because there are substantial barriers to studying something that no one else has studied before. A researcher might spend years trying to, for example, engineer a line of laboratory rodents that lack the gene in question. They might create bespoke antibodies or other chemical reagents that can help track or visualize the gene. This all takes time, money, and effort. “Many investigators identify an important gene and then spend their whole career studying it,” says Plon.

To do otherwise is risky. Stoeger showed that over the past two decades, junior researchers who focused their attention on the least studied genes were 50 percent less likely to eventually run their own lab. “Those people get pushed out of the biomedical workforce, and then don’t get a chance to set up a lab that explores some of the previously unknown biology,” he says.

Stoeger and Amaral “have done a remarkable job of comprehensively analyzing the reasons why many important genes are ignored,” adds Purvesh Khatri from Stanford University. “Their results underscore the need to change how we study human biology.”

Amaral blames the research imbalance on the erosion of funding from the National Institutes of Health, which forces scientists to compete for a dwindling number of grants and pushes them toward safer research. “When resources stop growing, the entire system is telling people not to take chances,” he says. The NIH does have grants that are meant to promote innovative, exploratory, high-risk research, but even these end up augmenting the same imbalances: Half of the papers that emerge from them still focus on the same 5 percent of well-studied genes. Even supposedly game-changing techniques like CRISPR have altered the landscape of gene popularity very little. “You get all these new tools but you end up using them on the same set of genes that you were using them on before,” says Amaral.