In 2007, only after historian Alexandra Minna Stern had spent years researching eugenics in the American West, work that culminated in a published book, did she find the mother lode.

During the height of the eugenics movement, California sterilized 20,000 patients deemed feeble-minded or insane. Stern, who is a professor at the University of Michigan, wrote about the sterilization program in her book, but she had only a patchwork of records to work with.

One day in 2007, a secretary pointed her toward a neglected filing cabinet at the state department of mental health’s office in Sacramento. Inside were 19 reels of coiled microfilm, containing sterilization recommendation forms with the names, ages, family histories, and diagnoses of nearly 20,000 patients. These forgotten records covered patients recommended for sterilization at California state hospitals from 1919 to 1952. “The microfilm was in very good shape,” says Stern. “I don’t think anyone had looked at it since the 70s.”

Recognizing the value of such a complete record, Stern had the microfilm duplicated. (A good thing, because the original microfilm was later lost when the department reorganized following California’s budget cuts.) From there, the project morphed into something resembling contemporary data science more than traditional historical research. Stern hired a team of students to turn the microfilm into a searchable database of health records, powered by HIPAA-compliant clinical trial software. And along the way, the team encountered the promises and pitfalls of big data as applied to history.

Recently, Stern co-authored a paper estimating that as many as 831 of the patients sterilized are still alive. The paper, she hopes, will spark conversations about compensating those forcibly sterilized, as North Carolina and Virginia have done. Until now, such a conversation had not happened in California. Because to compensate people, you have to first know who they are.

In 1909, California became the third state to pass a sterilization law. The law allowed superintendents at state psychiatric institutions to sterilize patients to improve their “physical, mental, or moral condition.” The wording was vague, and it could be applied to patients considered mentally ill, handicapped, sexually deviant, or criminal: anyone considered a misfit, really. Sterilization would prevent these supposedly undesirable people from having children.

Stern, who grew up in California, recalls these state institutions being invoked in schoolyard taunts, like “Oh you’re weird, you’ll end up in Napa,” referring to Napa State Hospital. The institutions performed vasectomies on men and salpingectomies, or the removal of Fallopian tubes, on women. The 20,000 people sterilized in California account for one-third of all such sterilizations in the U.S.

Here are some of their stories, as previously written by Stern:

In 1943, a 15-year-old Mexican-American boy we will call Roberto was committed to the Sonoma State Home, an institution for the “feebleminded” in Northern California. Roberto’s journey to Sonoma began the previous year when he was picked up by the Santa Barbara Police for a string of infractions that included intoxication, a knife fight, and involvement with a “local gang of marauding Mexicans.” Citing his record of delinquency and “borderline” IQ score of 75, the officials at Sonoma recommended that Roberto be sterilized. Roberto’s father adamantly, and unsuccessfully, opposed his son’s sterilization, and went so far as to secure a priest to protest the operation.

Or the story of a young Mexican-American woman:

Silvia, a Mexican-American mother of a toddler, was 20 years old when she was placed in Pacific Colony in 1950. She was assessed with an “imbecile” IQ of 35 and reportedly had been raised in a violent home. Silvia’s mother ostensibly could not control her daughter and approved her sterilization.

Or that of a young, possibly gay, man:

Fifteen years earlier, Timothy, a white 25-year old placed in Stockton because of same-sex encounters since boyhood and a psychiatric diagnosis of “dementia praecox, hebephrenic type,” consented to his own reproductive surgery, perhaps because he knew that it was a potential ticket out of the facility or because he felt it would help him control his pathologized sexual desires.

These are individual stories, but Stern wanted to construct the database to find larger patterns. And one of the foremost questions she wanted to ask was whether Hispanic patients were treated any differently in these institutions.

The first step, just building a database, was no trivial task. The 20,000 records each contained 212 individual variables. “Imagine you put all that in an Excel spreadsheet. It’s going to be a mind-boggling mess,” says Stern. Since they were working with health records, they also needed to be in compliance with HIPAA.

Stern’s then graduate student Natalie Lira, now a professor of Latino/Latina studies at the University of Illinois Urbana-Champaign, headed up that effort. She ended up collaborating with non-historians from the University of Michigan’s School of Public Health to use REDCap, a data capture tool typical in clinical trials. Over three years, a team of undergraduate and graduate students entered the data from the microfilm into a searchable database.

Sterilization recommendation forms for 1935 and 1940.
Image used in accordance with the California Committee for the Protection of Human Subjects Protocol ID 13-08-1310 and the University of Michigan Biomedical IRB HUM00084931.

Asking the question of who is Hispanic turned out to be complicated too. The forms did not have a line for ethnicity; instead, they asked for “nativity,” or where someone was born. A man of Mexican descent could be born in California, so nativity would not correlate with what we currently think of as Hispanic. When Nicole Novak, an epidemiologist who also worked on the project, went to look for census data on Hispanics living in California, she encountered more confusion. Mexicans, for example, were considered their own race in 1930 and white in 1940 and after, though states in the Southwest had their own category for “white person of Spanish surname.” Looking at historic records, says Novak, “has shed a lot of light on how constructed a lot of categories we use in public health are.” The team ended up using Spanish surnames as a proxy for Hispanic ethnicity, despite the imperfect correlation. (Filipinos, for example, also have Spanish surnames.) Eventually, they found that patients with Spanish surnames were indeed two-and-a-half times as likely to be sterilized as those without.

The “nativity” classification posed another question. Should the team use the original outdated terminology on the forms or attempt to update it to modern language? “Do you use ‘dementia praecox,’ which is roughly equivalent to schizophrenia today? Do you use the word ‘Negro’?” says Stern. In the end, they used the original terms. As a historian, Stern is very much aware of the pitfalls of interpreting the past through a modern lens. “One can be seduced by big data,” she says. “You have to proceed with caution.” To use this database is to interpret variables set a century ago.

These issues are on Stern’s mind now because the team is working to make the database available to other researchers. That work also requires delicately balancing access against privacy. Many of the records are now old enough that they are publicly available in California’s state archives in Sacramento. (The microfilm in the state archives is actually the duplicate set Stern made because, remember, the originals were lost.) But is that the same thing as putting them online for anyone to search? Especially if hundreds of these patients, as the recent paper suggests, are still alive?

Stern’s team, too, has had to balance their desire to learn more with the wishes of the people whose lives they are studying. The records are a wealth of intimate information. And despite analyzing the data on a macro level (the figure of 831 survivors is a statistical estimate based on age), they haven’t gone looking for the survivors individually. As a researcher, says Lira, “Ethically, it feels wrong to look at someone’s medical records and try to reach out and contact them.”

Stern says she has heard from family members of sterilized patients, who reached out after learning about her research. In one case, she also managed to track down the sister of a woman who had been sterilized and later filed a lawsuit. The sister called Stern back but said she didn’t want to talk anymore because the past was too painful. Ideally, says Stern, the state of California would take on the task of finding these patients and compensating them.

“It’s more present in my mind now, that there’s a story for every single data point,” says Novak, who as an epidemiologist usually works with statistics. That filing cabinet in Sacramento held 20,000 forgotten stories, now with 20,000 names to go with them.