Technology: Lie Detectors

The polygraph industry is thriving, but questions remain about the validity of test results

EVEN ALLOWING FOR the fact that Americans love gadgets, it is hard to explain the popularity of the lie detector. The roots of this invention, more formally known as the polygraph, go back to the turn of the century, when infatuation with the newly discovered powers of electricity more than once overcame common sense. But whereas electric hair-restorers and high-voltage cancer cures have all but vanished, the polygraph persists, and even flourishes. According to the best estimates, one million polygraph examinations are administered each year in the United States; they are used in criminal investigations, during government security checks, and, increasingly, by nervous employers—particularly banks and stores.

Those of us who are not among the one million examined each year have a number of opportunities to experience the polygraph vicariously. Celebrities accused of wrongdoing (John DeLorean comes to mind) seem invariably to offer to submit to polygraph examinations. Then there was the television program Lie Detector, a 1983 show in which F. Lee Bailey, the polygraphist turned lawyer, added the dimension of home entertainment to polygraph tests.

Organized opposition to the increasing use of the polygraph has scored a modest success with the argument that the machine represents an invasion of privacy, especially when the coercive power of the government or an employer is behind it. It is very hard for a job applicant to say no when a prospective employer asks him or her to take a polygraph test; once hooked up to the machine, the applicant typically faces questions not only about past criminal activity but also about matters that an employer may have no business intruding upon, such as sexual practices or gambling—questions asked ostensibly to assess the applicant’s “character.” Retailers sometimes use the polygraph on current employees to ask questions like “Are you relatively satisfied with this job now?” and “Do you intend to stay with this employer?” As a result of such abuses, nineteen states and the District of Columbia have made it illegal for an organization to ask its employees to take polygraph examinations.

Much the same issue is at the heart of the protracted wrangle between the Reagan Administration and Congress over plans for expanded government use of the polygraph. An executive order issued on March 11, 1983, known as National Security Decision Directive 84, would have sanctioned for the first time “adverse consequences” for a federal employee who refuses to take a test when asked. The directive authorized tests to investigate candidates for certain security clearances and to ask any federal employee about leaks of classified information. (It was issued shortly after Reagan’s comment about being “up to my keister” in press leaks.) Almost simultaneously the Department of Defense (DOD) released a draft regulation that authorized use of the polygraph to screen employees who take on certain sensitive intelligence assignments; it, too, prescribed adverse consequences for refusal.

Both proposals are now in abeyance. Although a congressional ban on changes in DOD polygraph policies expired on April 15 of this year, the DOD and Robert McFarlane, the national security adviser, sent assurances to Capitol Hill that they would not proceed with any proposed changes “during the remainder of this session of Congress.” Both left little doubt, however, that the proposals are very much alive. Democratic Representative Jack Brooks, of Texas, has introduced a bill that would kill the proposals once and for all, restrict use of the polygraph to criminal investigations, and require consent from the examinee. In a bow to political reality the bill exempts the Central Intelligence Agency and the National Security Agency from these conditions.

A QUESTION MORE BASIC than whether the polygraph test is an unacceptable invasion of privacy is, of course, whether it works. Seeking an answer in the scientific literature can be a bewildering experience. A report by the Office of Technology Assessment (OTA), commissioned last year by Brooks’s Committee on Government Operations, summed up the problem by citing twenty-four studies that found correct-detection-of-guilt rates ranging from 35 to 100 percent.

The flourishing polygraph industry (it is estimated that there are at least 5,000 professional polygraphists in the country) naturally looks on the bright side of these studies. Sergeant Kenneth Shiflet, of the Colorado State Patrol, a past president of the American Association of Police Polygraphists, says that polygraph results are “at least ninety-five percent accurate.” Joseph Buckley, the president of John E. Reid & Associates, one of the oldest polygraph firms in America, puts accuracy at 85 to 95 percent.

Skepticism comes mostly from psychologists, whose studies report substantially lower accuracy rates, generally around 70 percent. Picking one’s way through the scientific debate is not made any easier by the obvious social and intellectual friction between the two camps. “You have to bear in mind that you have academics coming into this field,” Buckley says. “Many of them are non-trained, non-examiner people.” David Lykken, a professor of psychiatry at the University of Minnesota and a major critic of the validity of polygraph testing, counters that the average training program for a professional polygraph examiner lasts six to eight weeks—“about one-sixth the study time required by the average barber college.”

To evaluate the rival claims, a person needs little more than common sense and a basic knowledge of statistics. The first step is to find out where the numbers that support the accuracy of the polygraph come from. Significantly, the most optimistic of the optimists is Norman Ansley, the chief of the polygraph division of the National Security Agency’s Office of Security. The NSA leads the roster of federal polygraph users; both it and the CIA rely heavily on polygraph testing for pre-employment and routine security screening. The NSA reported giving nearly 10,000 tests in 1982. (CIA numbers are classified.)

A review article written by Ansley last year reported an average of 97.6 percent accuracy in studies in which a polygraphist’s diagnosis of “truthful” or “deceptive” was compared with some independent measure of the subject’s guilt or innocence, such as a subsequent confession or an evaluation of guilt by an expert judicial panel that had reviewed other material evidence in the case. The polygraphists’ diagnoses were correct at least 90 percent of the time in all of these instances.

Yet the research Ansley chose to include in his review is problematic. Even the OTA report, which at times seemed to bend over backward to accommodate all points of view, disqualified from its consideration all but one of the eight field studies that Ansley cited, finding statistical problems with some and dismissing others as “anecdotal.” The OTA report did not elaborate. But one of these “anecdotal” studies was in fact nothing more than a survey of polygraphists, who were asked how many tests they had administered, how many had been verified, and how many of those they got right. Lykken observed in congressional testimony last year that a similar questionnaire sent to astrologers would produce similar results.

Ansley took these already sanguine findings and added an inflation factor. Whenever the polygraphist disagreed with the judicial panel (or with another of the independent verification criteria), Ansley assumed that the polygraph was right and the criterion was wrong, pushing the reported accuracy figures higher still.

Anecdotal studies are almost never acceptable from a statistical point of view, because of the problem of data selection—that is, the tendency of the person collecting the information to seek evidence that confirms his or her preconceived notions. The same statistical sin is more subtly apparent in another series of studies often cited to support polygraphic accuracy. These studies used cases drawn from the files of John E. Reid & Associates that were subsequently verified by confessions. To eliminate one obvious source of bias—that the original examiner may have used information other than the polygraph results in reaching a diagnosis—the charts were rescored by a second Reid examiner on a “blind” basis. The re-examiners’ accuracy rate was 85 to 95 percent.

The cases selected for rescoring did not include any that had stumped the original examiner, however, but only those in which the original examiner’s diagnosis had been confirmed by a subsequent confession. Thus all that the studies really show is that Reid’s examiners score charts the same way; the original examiners might have been right in only one percent of all Reid cases, yet the studies could still show 100 percent “accuracy” if the re-examiners agreed with them on that one percent. This statistic reveals what is known in the business as “reliability”; it tells nothing about the predictive power of the polygraph.
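The selection effect is easy to demonstrate. The simulation below is a minimal sketch with invented rates; its only point is that rescoring a confession-confirmed subset measures examiner agreement (reliability), not the polygraph’s predictive power (validity).

```python
import random

random.seed(0)

N_CASES = 10_000
TRUE_ACCURACY = 0.70    # invented hit rate for the original examiners
AGREEMENT = 0.90        # invented rate at which a second examiner scores a chart the same way
CONFESSION_RATE = 0.30  # invented share of correct calls later confirmed by confession

# Simulate cases; in this toy model only correct diagnoses can be confirmed.
cases = [random.random() < TRUE_ACCURACY for _ in range(N_CASES)]
confirmed = [ok for ok in cases if ok and random.random() < CONFESSION_RATE]

# Blind rescoring of the confirmed subset: every original call in it was
# correct by construction, so agreement masquerades as accuracy.
rescored_correct = sum(random.random() < AGREEMENT for _ in confirmed)

print(f"True accuracy over all cases:  {sum(cases) / N_CASES:.0%}")
print(f"'Accuracy' on rescored subset: {rescored_correct / len(confirmed):.0%}")
```

Set TRUE_ACCURACY to 0.01 and the second figure barely moves, which is the one-percent example above in miniature.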

THE PROLIFERATION OF statistical measures of accuracy is itself a source of overstatement of polygraphic accuracy. Proponents of the polygraph will, for example, sometimes cite “correct guilty detections”: the percentage of guilty subjects who are caught by the polygraph. This figure can be very impressive; in one study that does not suffer from the failings already mentioned, it was 98 percent. But the same study found that 55 percent of innocent subjects were also diagnosed as “deceptive.” The handful of studies that used a truly random selection of cases and rescored them blind produced similar results: overall, 83 percent of guilty subjects were diagnosed as “deceptive,” as were 43 percent of innocent subjects. It’s no trick to push the correct-guilty-detection rate to 100 percent—just call everyone “deceptive.” You don’t even need a machine to do that.
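A back-of-the-envelope calculation shows how far a headline detection rate can drift from overall accuracy. The sketch below uses the rates quoted above; the even split of guilty and innocent subjects is an assumption made purely for illustration.

```python
def overall_accuracy(hit_rate: float, fp_rate: float,
                     n_guilty: int = 100, n_innocent: int = 100) -> None:
    """Print overall accuracy and false alarms given per-class rates."""
    correct = hit_rate * n_guilty + (1 - fp_rate) * n_innocent
    flagged_innocent = fp_rate * n_innocent
    print(f"hit rate {hit_rate:.0%}, false positives {fp_rate:.0%} -> "
          f"{correct / (n_guilty + n_innocent):.1%} correct overall, "
          f"{flagged_innocent:.0f} innocent flagged")

overall_accuracy(0.98, 0.55)   # the impressive-sounding study: 71.5% overall
overall_accuracy(0.83, 0.43)   # the blind-rescored studies: 70.0% overall
overall_accuracy(1.00, 1.00)   # call everyone "deceptive": 50.0% overall
```

Note that the middle line comes out at 70 percent, which is just the accuracy the psychologists’ studies report.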

These numbers suggest that the polygraph test is biased against innocent people. The problem is accentuated when the test is used in the so-called screening situations envisioned in the Reagan Administration proposals (and already established at the NSA and the CIA). Everyone is tested, but presumably only a very small proportion has done anything wrong. If we assume that one in a hundred employees is a spy (probably a gross overestimate), and if we use the 83 percent correct-guilty-detection rate and the 43 percent false-positive rate, we find that fifty-one innocent persons will flunk the polygraph test for every real spy who flunks. Any test, whether it is for truth or for cancer, has to be extremely accurate to detect a rare phenomenon without setting off a lot of false alarms in the process. Even if the test were 99 percent accurate for both guilty and innocent detections, one innocent person would be falsely branded for each spy caught. Because of this “base rate” problem, the Federal Bureau of Investigation forbids the use of polygraph dragnets; the tests can be used only after an initial investigation has narrowed the field of suspects.
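The fifty-one-to-one figure is easy to verify. A minimal sketch, assuming the quoted rates apply uniformly across a screened population:

```python
def false_alarms_per_spy(spy_rate: float, hit_rate: float, fp_rate: float) -> float:
    """Innocent people flagged as 'deceptive' for each real spy caught."""
    spies_caught = spy_rate * hit_rate
    innocents_flagged = (1 - spy_rate) * fp_rate
    return innocents_flagged / spies_caught

print(false_alarms_per_spy(0.01, 0.83, 0.43))  # ~51.3, the fifty-one-to-one figure
print(false_alarms_per_spy(0.01, 0.99, 0.01))  # 1.0, even with 99 percent accuracy
```

The ratio is driven almost entirely by the base rate: as the share of actual spies shrinks, no plausible accuracy keeps the false alarms from swamping the genuine catches.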

There are technical problems in addition to the statistical problems of using the polygraph for mass screenings. Even the most ardent advocates of polygraphy do not claim that the machine detects some unique “lying” response; rather, the machine records physiological reactions—blood pressure, respiration rate, skin moisture—that are generally indicative of anxiety or arousal. The standard technique for criminal investigations is to pair relevant questions (“Did you take $50 from the cash register last Tuesday?”) with control questions (“Have you ever stolen anything in your life?”). The idea is that the guilty person will react more strongly to the first question, whereas the innocent person will react more strongly to the second, since (investigators assume) everyone has stolen something in his or her life and is afraid to admit it.
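In code, the comparison logic amounts to something like the toy sketch below. The numeric “reaction” values are an invented stand-in for the physiological measurements, and real chart scoring is considerably more elaborate than a single comparison.

```python
def score_question_pair(relevant_reaction: float, control_reaction: float) -> str:
    """Classify one relevant/control pair by which question provoked
    the stronger physiological reaction (a toy version of the logic)."""
    if relevant_reaction > control_reaction:
        return "deceptive"     # stronger response to the relevant question
    if control_reaction > relevant_reaction:
        return "truthful"      # stronger response to the control question
    return "inconclusive"

print(score_question_pair(relevant_reaction=8.0, control_reaction=3.0))  # deceptive
print(score_question_pair(relevant_reaction=2.0, control_reaction=6.0))  # truthful
```

The whole technique hinges on the control question provoking a comparable level of anxiety in an innocent subject, and that is exactly what breaks down in the screening situations described next.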

In security or pre-employment screening the relevant questions are reasonable enough (“Have you ever given or sold classified materials to unauthorized persons?”); the problem is what to use for control questions. A list provided by the NSA to congressional investigators includes “Have you ever repeated gossip about a friend?”, “Have you ever cheated on a test?”, and “Do you remember my last name?” As David Raskin, a professor of psychology at the University of Utah and a licensed polygraph examiner, told the Senate Armed Services Committee earlier this year, “One does not have to be a psychologist to recognize the major problem” with such tests. “Since it is readily apparent to the subject that the only important questions in the test are the relevant questions, relatively large reactions would be expected on that basis alone.”

The OTA report found no evidence to support the validity of such screening tests. In fact, some researchers even argue that an examinee can use simple countermeasures, such as biting his tongue or stepping on a nail concealed in his shoe, to fake a strong reaction to the control questions, thus “beating” the test. According to Lykken, one prison inmate, who became the jailhouse polygraph expert after studying the literature, trained twenty-seven fellow inmates in these techniques; twenty-three beat the polygraph tests used to investigate violations of prison rules.

Given all the doubts about validity, why does the government persist in using polygraph tests? Some clues are found in the Defense Department’s 1983 report on polygraph testing—and even in its title, “The Accuracy and Utility of Polygraph Testing,” which suggests that accuracy and utility are two different things. The most that the report concludes about accuracy is that it is “significantly above chance.” Utility, however, is quite another matter: “Where examinees are found deceptive during testing, the confession rate is consistently high.” The report gives example after example of examinees who confessed to criminal or espionage activity. The star is an NSA job applicant who “admitted that his engineering degree was phony (he bought it through mail order from London for $100),” that he “shot and wounded his second wife,” and that his present wife “is missing under unusual circumstances that he would not explain.”

One DOD official who lodged an internal protest against the department’s new polygraph policy received a similar explanation of the polygraph’s utility from General Richard Stilwell, the undersecretary of defense and the architect of the policy. In response to a memo setting out scientific objections to the polygraph in some detail, Dr. John Beary, the acting assistant secretary of defense for health affairs, received an icy note from Stilwell that skipped over the scientific issues (except to mention Ansley’s review article and how “well received” Ansley had been at congressional hearings) and pointedly asked Beary how he viewed “the utility of the polygraph process in light of its demonstrated value at CIA and NSA.” (Beary’s memo, curiously, was classified “confidential” for three months.)

Among all defense officials who were asked to comment on the new policy, only Beary refused to give it his approval. Soon afterward he left the Defense Department, and late last year Stilwell reported to Secretary of Defense Caspar Weinberger that “following Dr. Beary’s departure, a concurrence has been obtained from OASD (Health Affairs).”

Not long before Beary left, obviously frustrated at getting nowhere with his scientific arguments, he sent another memo to Stilwell. “One could reasonably ask at this juncture,” he wrote, “why has Washington been bamboozled by the polygraph community for over 20 years? I can only speculate that it may have something to do with the poor state of science education in the United States.”

—Stephen Budiansky