When the telephone rang in the kitchen of his townhouse in Lambertville, New Jersey, one morning last April, Roy O. Freedle had just taken his blood-pressure medicine. That was good, he thought later, because it turned out to be a stressful conversation. The caller was Drew Gitomer, the senior vice-president for research and development at the Educational Testing Service, where Freedle had worked for more than thirty years as a research psychologist, before retiring in 1998. The two men were of different generations—Gitomer was forty-six and Freedle was sixty-nine. They had not known each other well when Freedle was still at ETS, but the older man had a good idea why the younger man was calling.
Just a few days before, a long article by Freedle had appeared in the Harvard Educational Review arguing that the most important test in America, the SAT, was racially biased. Previous work on bias in the SAT, he wrote, had failed to point out that African-Americans were doing better on harder questions of the test than non-Hispanic whites with the same SAT scores. Minority students, along with culturally deprived whites with similarly hidden abilities, he argued, should have an assessment of their previously undiscovered talents shown to colleges so that they could get fairer decisions from admissions committees. Because ETS and its New York-based client the College Board had given birth to the SAT, still depended on it for much of their income, and spent considerable time and energy trying to keep biased questions out of the exam, Gitomer was none too pleased by Freedle's argument.
The SAT I, called the Reasoning Test by the College Board, is a three-hour, mostly multiple-choice test of verbal and mathematical knowledge and skills. In its three quarters of a century as a college-entrance examination it has become a giant; more than 2.2 million students took the test in the 2001-2002 school year, some more than once. Nearly as many people last year took the ACT, the SAT's Iowa-based rival, but the SAT gets more attention because it is the prevalent college-admissions test in the major government, financial, and media centers of the East and West Coasts. A revised version of the SAT, to be introduced in March of 2005, will add grammar questions and a written essay, replace quantitative comparisons with second-year algebra questions, and replace analogies with more reading questions.
Plenty of Americans, particularly those familiar with the subtlest forms of ethnic prejudice, think there is something wrong with the SAT, and with other standardized tests. For the high school class of 2002 the average score for a non-Hispanic white student on the 1600-point test was 1060. The average score for a black student was 857, or 203 points lower. (For Asians the average was 1070, and for Hispanics it was slightly over 900.) The gap between blacks and whites on the test is sixteen points greater today than it was in 1992.
If minority students are at a disadvantage in taking the SAT, their choice of colleges will be significantly limited, with important implications for their financial, professional, and social futures. In other words, the SAT is interfering with the pursuit of happiness—a problem that has long absorbed the efforts of education researchers and civil-rights lawyers, with not nearly as much progress as anyone would like.
Freedle's accusation of racial bias in the SAT is striking because it is one of the few ever to come from an experienced ETS professional. Perhaps more important, it has caught the attention of the University of California (a powerful malcontent in the College Board family), which has ordered its own detailed analysis of the issue, due to be completed in 2004. Even if Freedle is ultimately proved wrong, his success at raising doubts about the SAT shows how loose a grip the test has on the political and scientific handholds that keep it upright.
In his book The Big Test: The Secret History of the American Meritocracy (1999), Nicholas Lemann described how the existence of the racial test-score gap and the difficulty of closing it began to dawn on American policymakers in the mid twentieth century, just as the SAT was becoming the arbiter of which young people would get ahead, at least academically, and which would not. One of the first warnings had to do with socioeconomic bias rather than specifically racial bias. Lemann found an item in the diary of Henry Chauncey, the first president of ETS, showing that he had read an article in the April 1948 issue of The Scientific Monthly arguing that tests like the SAT could be biased against low-income students. But Chauncey dismissed it as a "radical point of view."
The first prominent case involving the issue of racial bias in testing arose later, and dealt not with college-entrance tests but with employment exams. In 1963 the Illinois Fair Employment Practices Commission ordered a Motorola television factory in Chicago to hire a young African-American who had been denied a job because of his score on an IQ test. ETS and others had been promoting tests in hiring decisions, Lemann wrote, but it never occurred to them that the results would be used to systematically exclude members of minorities. The Motorola case had a profound effect, especially on the debate over the Civil Rights Act of 1964. Senator John Tower, of Texas, inserted an amendment into the act specifically permitting the use of ability tests in employment. "Thus did standardized testing become a part of a landmark law in American history," Lemann wrote.
At about the same time, civil-rights activists, legislators, judges, and educators began using the term "affirmative action" to justify offering opportunities to minority members who otherwise would not have seemed qualified. In the 1960s and 1970s colleges and universities that had more applicants than spaces began to give preference to some minority students who had lower test scores than whites but whose high school grades and personal qualities suggested that they would benefit from a demanding academic environment.
This form of affirmative action was buttressed by the 1978 Supreme Court case Regents of the University of California v. Bakke. The Court ruled 5-4 against the quota system used by the university's medical school, whereby sixteen places out of a hundred in each entering class were reserved for minority students. However, Justice Lewis F. Powell wrote in his opinion that race could be considered in admissions decisions.
Shortly thereafter the College Board set up a fairness-review process that subjected every potential SAT question to close examination for racial stereotypes, loaded words, inappropriate assumptions, or anything else that might put minority students at a disadvantage. Questions dealing with subjects beyond the experience of a typical inner-city student, such as yachting or debutante balls, were thrown out. The College Board, together with ETS, also produced studies showing that the SAT was doing its job and did not make minority academic skills look any worse than they were. The data demonstrated that the SAT predicted about as well as high school grades how a student would do in his or her first year of college. And since selective schools did not want to admit applicants who could not meet their standards, and since they had to have some defensible way to sort applicants whose relative academic merits were sometimes hard to quantify, the SAT—and the ACT—not only survived the occasional assaults on their methodology but continued to grow.