Katie Carey / Spectrum

Fresh out of her graduate program, where she studied sex differences in the rodent brain, Jill Silverman joined a top mouse-behavior lab at the National Institute of Mental Health in Bethesda, Maryland. She was especially excited to study autism. It was 2007, and researchers had just begun identifying strong candidate genes for the condition. They were churning out mice with mutations in these genes and turning to mouse behaviorists to show that the mice could serve as furry proxies for people with autism. Silverman tested dozens of the mice in the lab.

The ideal “autism mouse,” researchers thought at the time, should show all the same traits that characterize autism in people: language and social problems, and restricted and repetitive behaviors. Some mutant mice make fewer ultrasonic vocalizations than controls do, which many behaviorists took to be an analog of language problems. Other models groom, jump, or bury marbles to an excessive degree—actions the researchers interpreted as repetitive behaviors reminiscent of autism. But researchers were most intent on sniffing out social deficits, a hallmark feature of autism. If they could pin down a murine model of this trait, the thinking went, perhaps they could design drugs to address it—or could at least better understand the brain pathways involved.

In 2011, one particular mouse model made headlines in Nature for ticking that last box. The mice lacked part of SHANK3, a gene mutated in roughly 1 percent of people with autism. The study in which the mice debuted announced triumphantly that they “display autistic-like behaviors,” and the evidence seemed straightforward: The mice showed little inclination to seek out another mouse in the three-chambered assay, a landmark test developed by Silverman’s mentor, Jacqueline Crawley.

In this test, researchers place a mouse in a Plexiglas box containing three rooms laid out in a row, and the rodent faces a simple choice: It can spend time in the room at one end, where another mouse sits imprisoned in a small wire cage, or it can loiter in the room at the other end, which holds an identical but empty cage. Most mice gravitate toward the playmate. But the SHANK3 mice moved slowly, pausing frequently to groom. (They groom so compulsively, in fact, that they have bald spots where they’ve licked the fur away.) They sometimes sniffed the empty cage and sometimes the other mouse, showing no real preference for the mouse.

Or at least that’s what the 2011 study found. Last year, Silverman and her colleagues tested two male and two female groups of the same mice in the three-chambered assay and found something different. This time, the mice showed behavior that was “mostly normal,” the team wrote, with only one group of male mice displaying social deficits. Around the same time, another team also announced that it was unable to replicate the original results. The upshot: Any social deficits in the SHANK3 mice are, at best, significantly milder than previously reported.

The news confirmed what many scientists had started to suspect. “I felt like I was very naïve,” says Silverman, who now runs her own lab at UC Davis. “The models were coming out and we were like, ‘Oh, we have models now, and we’ve developed these assays, and now we’re really going to start to clear up and have answers,’” she says. “And then we realized of course it’s not so simple.”

The excitement over mouse models has since fizzled. Many of the original reports of social deficits in mice have not held up when tested by independent labs, including Silverman’s, or in different strains of mice. Inconsistent findings have plagued studies not just of SHANK3 mice, but also of mice with mutations in the risk genes CHD8, NLGN3, NLGN4 and CNTNAP2, among others. That has left many scientists wondering whether mice can ever recapitulate something as complex and human as autism.

“I think that defining an autism mouse is folly,” says Valerie Bolivar, the director of the Mouse Behavioral Phenotype Analysis Core at the Wadsworth Center in Albany, New York. “To get all those things in a nice, neat package with a bow—we’re just not going to get that.”

A more productive approach may be to focus only on behaviors that are reproducible in mice, such as quantitative measures of how the animals learn. If researchers do want to get at social deficits, they may need to go back to basics and methodically catalog mouse behavior.


Much of the inconsistency in reports of social behavior in autism mice may stem from fundamental misunderstandings about the three-chambered assay. Crawley developed the test in 2003, a few years after a National Institute of Mental Health meeting that brought together experts on mouse and human behavior to design screens for autism features in mice.

Crawley says the three-chambered assay was intended to deliver a binary readout: Does the mouse prefer another animal over an object (much as a child might choose a playmate over a toy)? The test is not sensitive enough to quantify the amount of time a mutant spends with the other mouse and compare it with a control. Still, many teams mistakenly use that metric anyway and report that their mice have autism-like social behavior—only to be contradicted by later observations from other labs.

Silverman says her biggest concern with the assay is that unrelated features common in many autism models can make the test practically impossible to interpret correctly. For example, SHANK3 mutants are slower than controls and, as a result, may move between rooms so rarely that any comparison of the time spent on each side is meaningless. Likewise, a mutant mouse with impaired olfaction just won’t recognize a peer, and mice that are hyperactive might move too rapidly between chambers to interact with the other mouse.

Some of the confounds also interfere with other behavioral tests, Silverman says. For example, researchers may tune in to ultrasonic vocalizations for clues to mice’s levels of social interest: the inaudible squeaks of a male mouse near a female in estrus, or those of pups separated from their mother. In these scenarios, she says, motor problems, a common feature in people with autism and in these mice, may impair the mouse’s ability to vocalize.

A mouse’s background strain can also affect its behavior. The genetic background of the SHANK3 strain published in 2011 changed substantially over time, which may help explain the differences between the original report and later publications. Conflicting results also came from mice missing other parts of SHANK3. And mice are notoriously finicky: They behave differently depending on whether they live with other mutants or controls, whether they are anxious from a prior test, whether they are male or female, whether their handlers are male or female, and whether the scientists observing them stay in the room or set up a camera and sneak out.

Perhaps the biggest problem with using mice to study social skills is that mice aren’t especially social to begin with. The region of the mammalian brain that dictates most social behaviors, the prefrontal cortex, is significantly smaller in mice than in people. In the wild, male mice are highly solitary and may come into contact only with a mate and its pups, says Mu Yang, who directs a mouse behavioral facility at Columbia University. “‘Let me sit next to you; what’s going on?’ That’s not necessarily what mice do.”

Caroline Blanchard, a rodent behaviorist at the University of Sao Paulo in Brazil, has spent nearly 40 years scrutinizing mice as they go about their business in large enclosures that closely mimic their natural habitat. “They are not solitary in the sense that they kill each other if they come too close or something,” says Blanchard. But, she says, “they’re not nearly as social as most rats.”

Rats and monkeys are often hailed as better models for autism research, but support for mice remains strong despite the drawbacks. “Whether the brains of mice and the behavioral repertoires of mice are sophisticated enough to sufficiently monitor autism and its primary symptoms—that, I think, is a very logical question to ask,” Crawley says. “I think about this all the time, and I don’t have an answer.”


The fact that many behaviorists are rethinking whether mice can mimic social deficits at all is far from obvious from the literature. Top journals continue to publish studies detailing social deficits in autism mice, often based on inexpert interpretations of the three-chambered assay. “It is annoying when you see the words ‘autism model’ in the titles of five Nature papers,” Silverman says.

Researchers often feel pressure from journal reviewers to demonstrate that an autism mouse has social deficits that seem to recapitulate autism, says Guoping Feng, a professor of brain and cognitive sciences at MIT. Feng led the 2011 study that first showed social deficits in SHANK3 mice. He was mainly interested in looking at the effects of a mutation in SHANK3 on the brain. But he understood that for these experiments to be considered meaningful, the mice needed to show autism-like behaviors.

In 2014, that same pressure drove at least five other teams to look at social behavior in mice with mutations in CHD8, which was emerging as a top candidate gene for autism. By 2016, one team had announced in Nature that it had a CHD8 model showing “autistic-like phenotypes,” even though those mice preferred a playmate to an object in the three-chambered assay.

At the time, Silverman and her collaborators were characterizing a mouse with another mutation in CHD8. Although the findings from both teams were similar, Silverman’s team concluded that its mice did not have any social problems. However, the journals the team approached were hesitant to publish results that seemingly contradicted the other group’s paper. To convince the reviewers, Silverman’s team needed to replicate its behavioral experiments (six months’ worth of work) before it was able to publish.

“This is an example of what’s been happening in the literature with mouse models of autism,” Crawley says. “The first paper that comes out has incorrectly interpreted their findings or done the wrong experiment sometimes, and their findings get popularized in the community and sometimes in the press—and then the next papers that come out that have done things right get lost in the shuffle.” To break this cycle, researchers should work with mouse behaviorists to help interpret their studies, Crawley says.

An even more effective solution might be to move away from the pressure to show autism features altogether, some researchers say. “We need to loosen the standard of framing mouse models using the mouse version of [a diagnostic manual]. When that happens, people can stop trying too hard to shape their data to look like the cardinal symptoms of autism,” Yang says.


Researchers are also thinking about assays that may better capture the nuances of autism—such as tests of social behavior that borrow from studies of reward. For example, an autism mouse might not be inclined to revisit a corner of a cage where it once had a rewarding interaction with a peer, says Ted Abel, the director of the Iowa Neuroscience Institute at the University of Iowa. But he cautions against inferring too much about what the mouse might be thinking. Instead of looking for a mouse example of a particular trait, he says, researchers should document all behavior in the animal with no bias. “We should study the mice and see what they tell us—try to discover something we don’t know, as opposed to assuming we do know,” he says.

A few teams are following this best-practice guideline, including the two that reassessed the SHANK3 model last year. One group of researchers, the Preclinical Autism Consortium for Therapeutics, is focused on finding the best possible mouse model for screening treatments. The SHANK3 model, the group says, is still the best option for this purpose. The other team is carefully assessing features in a range of mice, including those with mutations in CNTNAP2.

Animal studies should mimic clinical trials in people, says Patricia Kabitzke, a researcher on the second project and the senior scientific program manager at Cohen Veterans Bioscience, a nonprofit research organization. In clinical trials, researchers are required to state the parameters of their experiments in advance and then publish their findings, whether positive or negative, rather than deciding after the fact how to interpret the results they get. If mouse studies don’t follow this practice, she says, some teams may repeat their experiments multiple ways, with multiple assays, until they get the result that best fits their paradigm.

None of the autism mice Kabitzke’s team has looked at so far, including two SHANK3 mutants, show reliable social deficits. But that doesn’t mean the mice aren’t useful, she says, because they have other traits that may prove consistent. “At the end of the day, I am not sure how necessary it is going to be to try to recapitulate exactly the same symptoms in the animal as you see in the human,” she says. “It’s kind of an absurd proposition to think that we have a mouse model, or rodent model, or really any kind of model of anything. We really don’t even have a human model of a disease, because there’s so much variation in how these differences present themselves.”

The key to truly understanding behavior may be to slow down and chart it manually. Bolivar trained with a scientist who learned animal behavior from the zoologist Robert Hinde, who mentored the legendary primatologists Jane Goodall and Dian Fossey. This makes Bolivar a “dinosaur,” she says, but also well equipped to study the behavior of a species that is different from people.

Even then, scientists are inferring intent when they decide, for example, that sniffing a nose is a social act. “I have probably scored more social interactions than anybody on earth,” says Yang, who got her start in science 12 years ago testing the first autism mouse models for social deficits. “I scored this nose-to-butt sniffing for 10 years, but do I really understand [it]?”

Watching animals behave for dozens of hours is the only way for researchers to make sure they don’t misinterpret anything, Bolivar and others say. Scuffling mice, for example, might be playing or fighting, and grooming can be a show of either affection or dominance, depending on the degree of pressure involved. In Bolivar’s lab, students sit and watch videos in which pairs of mice sniff, scurry, and tumble together in small Plexiglas cages. The students take note each time the mice do one of 11 things, such as touching noses, sniffing rears, or grooming themselves. It is slow, painstaking work.

In one video, a mouse watches as the gloved hand of a researcher descends, holding another mouse. The first mouse immediately runs over, and the two mice touch noses several times, then sniff each other’s heads and backsides. By one minute in, the first mouse is following the second rapidly across the cage. By three minutes, they are rolling across the cage in a blur of fur and whiskers. What this really means, only the mice know for now.

But Silverman hopes to find out. She, too, is carefully observing one-on-one interactions as a way to look for social deficits in mice, although she has yet to see an animal that fits the bill. “I haven’t totally given up,” she says.


This post appears courtesy of Spectrum.
