The National Data Center and Personal Privacy

The computer age is not to be stayed, as anyone knows who has been billed for another citizen’s charge account or has wondered what has happened to his paid-up magazine subscription. The computer science is already so advanced that experts envisage a huge National Data Center to speed and simplify the collection of pertinent information about Americans. Properly run, it could be a boon. But any person who has seen an FBI file or been party to a U.S. government “security check” has reason to know how the abuse or misuse of dossiers of unevaluated information can threaten an individual’s rights. A professor of law at the University of Michigan here discusses the precautions necessary to protect citizens from “governmental snooping and bureaucratic spinelessness or perfidy.” Professor Miller has testified on the subject before the Senate Subcommittee on Administrative Practice and Procedure. On page 58, Bob and Ray show what can happen if the safeguards fail.



THE modern computer is more than a sophisticated indexing or adding machine, or a miniaturized library; it is the keystone for a new communications medium whose capacities and implications we are only beginning to realize. In the foreseeable future, computer systems will be tied together by television, satellites, and lasers, and we will move large quantities of information over vast distances in imperceptible units of time.

The benefits to be derived from the new technology are many. In one medical center, doctors are already using computers to monitor heart patients in an attempt to isolate the changes in body chemistry that precede a heart attack. The search is for an “early warning system” so that treatment is not delayed until after the heart attack has struck. Elsewhere, plans are being made to establish a data bank in which vast amounts of medical information will be accessible through remote terminals to doctors thousands of miles away. A doctor will then be able to determine the antidote for various poisons or get the latest literature on a disease by dialing a telephone or typing an inquiry on a computer console.

A committee of the Bureau of the Budget has proposed that the federal government set up a National Data Center to compile statistical information on various facets of our society. Certainly the computer can help us simplify record-keeping by assigning everyone a “birth” number that will identify him for tax returns, banking, education, social security, the draft, and other purposes. This number could also serve as a telephone number, which, when used on modern communication mechanisms, would make it possible to reach its holder directly no matter where he might be.

But such a Data Center poses a grave threat to individual freedom and privacy. With its insatiable appetite for information, its inability to forget anything that has been put into it, a central computer might become the heart of a government surveillance system that would lay bare our finances, our associations, or our mental and physical health to government inquisitors or even to casual observers. Computer technology is moving so rapidly that a sharp line between statistical and intelligence systems is bound to be obliterated. Even the most innocuous of centers could provide the “foot in the door” for the development of an individualized computer-based federal snooping system.

Since a National Data Center would be augmented by numerous subsystems or satellites operated by state and local governments or by private organizations, comprehensive national regulation of computer communications, whether of federal or nonfederal origin, ultimately will become imperative.

Moreover, deliberations should not be conducted in terms of computer capability as it exists today. New computer hardware is constantly being spawned, machine storage capacity and speed are increasing geometrically, and costs are declining. Thus at present we cannot imagine what the dimensions, the sophistication, or the snooping ability of the National Data Center will turn out to be ten or twenty years from now. Nor can we predict what new techniques will be developed to pierce any safeguards that Congress may set up in order to protect people against those who manipulate or falsify information they extract from or put into the center.

Of course, it would be foolish to prohibit the use of data-processing technology to carry out important governmental operations simply because it might be abused. However, it is necessary to fashion an adequate legal structure to protect the public against misuse of information handling.

IN THE past, privacy has been relatively easy to protect for a number of reasons. Large quantities of information about individuals have not been available. Generally decentralized, uncollected, and uncollated, the available information has been relatively superficial, access to it has been difficult to secure, and most people have been unable to interpret it. During the hearings held recently by two of the congressional subcommittees investigating invasions of privacy, however, revelations concerning the widespread use of modern electronic and optical snooping devices shocked us.

In testimony before the House Subcommittee on Invasion of Privacy, Edgar S. Dunn, Jr., a research analyst for Resources for the Future, Incorporated, pointed out that information in the center would not be intelligible to the snooper as are the contents of a manila folder. Computerized data require a machine, a code book, a set of instructions, and a technician in order to be comprehended. Presumably Mr. Dunn’s thesis is that if it is difficult or expensive to gain access to and interpret the data in the center, there is little likelihood of anyone’s trying to pry; if the snooper’s cost for unearthing a unit of dirt increases sufficiently, it will become too expensive for him to try to violate the center’s integrity.

Mr. Dunn’s logic fails to take into account other factors. First, if all the information gathered about an individual is in one place, the payoff for snooping is sharply enhanced. Thus, although the cost or difficulty of gaining access may be great, the amount of dirt available once access is gained is also great. Second, there is every reason to believe that the art of electronic surveillance will continue to become more efficient and economical. Third, governmental snooping is rarely deterred by cost.

Mr. Dunn also ignores a number of special dangers posed by a computerized National Data Center. Ever since the federal government’s entry into the taxation and social welfare spheres, increasing quantities of information have been recorded. Moreover, as recording processes have become mechanized and less cumbersome, there also has been centralization and collation of information. In something akin to Parkinson’s Law, the increase in information-handling capacity has created a tendency toward more extensive manipulation and analysis of recorded data, which, in turn, has required the collection of more and more data. The creation of the Data Center with electronic storage and retrieval capacity will accelerate this pattern.

Any increase in the amount of recorded information is certain to increase the risk of errors in reporting, recording, and indexing. Information distortion also will be caused by machine malfunctioning. Moreover, people working with the data in Washington or at a distance through remote terminals can misuse the information. As information accumulates, the contents of an individual’s computerized dossier will appear more and more impressive and will impart a heightened sense of reliability to the user, which, coupled with the myth of computer infallibility, will make it less likely that the user will try to verify the recorded data. This will be true despite the “softness” or “imprecision” of much of the data. Our success or failure in life ultimately may turn on what other people decide to put into our files and on the programmer’s ability, or inability, to evaluate, process, and interrelate information. The great bulk of the information likely to find its way into the center will be gathered and processed by relatively unskilled and unimaginative people who lack discrimination and sensitivity. Furthermore, a computerized file has a certain indelible quality: adversities cannot be overcome simply by the passage of time.

There are further dangers. The very existence of a National Data Center may encourage certain federal officials to engage in questionable surveillance tactics. For example, optical scanners, devices with the capacity to read a variety of type fonts or handwriting at fantastic rates of speed, could be used to monitor our mail. If scanners were linked with a computer system, the information drawn in by the scanner would be converted into machine-readable form and transferred into the subject’s file in the National Data Center.

Then, with sophisticated programming, the dossiers of all of the surveillance subject’s correspondents could be produced at the touch of a button, and an appropriate entry — perhaps “associates with known criminals” — could be added to all of them. As a result, someone who simply exchanges Christmas cards with a person whose mail is being monitored might find himself under surveillance or might be turned down when he applies for a job with the government or requests a government grant or applies for some other governmental benefit. An untested, impersonal, and erroneous computer entry such as “associates with known criminals” has marked him, and he is helpless to rectify the situation. Indeed, it is likely that he would not even be aware that the entry existed.

These tactics, as well as the possibility of coupling wiretapping and computer processing, undoubtedly will be extremely attractive to overzealous law-enforcement officers. Similarly, the ability to transfer into the National Data Center quantities of information maintained in nonfederal files (credit ratings, educational information from schools and universities, local and state tax information, and medical records) will enable governmental snoopers to obtain data that they have no authority to secure on their own.

The compilation of information by unskilled personnel also creates serious problems of accuracy. It is not simply a matter of the truth or falsity of what is recorded. Information can be entirely accurate and sufficient in one context and wholly incomplete and misleading in another. For example, the bare statement of an individual’s marital status has entirely different connotations to the selective service, a credit bureau, the Internal Revenue Service, and the social security administration. Consider a computer entry of “divorced” and the different embellishment that would be necessary in each of those contexts to portray an accurate picture of an individual’s situation.

The question of context is most graphically illustrated by the unexplained and incomplete arrest record. It is unlikely that a citizen whose file contains an entry “arrested, 6/1/42; convicted felony, 1/6/43; three years, federal penitentiary” would be given federal employment or be accorded the governmental courtesies accorded other citizens. Yet the subject may simply have been a conscientious objector. And what about the entry “arrested, disorderly conduct; sentenced six months Gotham City jail”? Without further explanation, who would know that the person involved was a civil rights demonstrator whose conviction was reversed on appeal?

Finally, the risks to privacy created by a National Data Center lie not only in the misuse of the system by those who desire to injure others or who can obtain some personal advantage by doing so. There also is a legitimate concern that government employees in routine clerical positions will have the capacity to inflict damage through negligence, sloppiness, thoughtlessness, or sheer stupidity, by unintentionally rendering a record inaccurate, or losing it, or disseminating its contents to people not authorized to see it.

TO ENSURE freedom from governmental intrusion, Congress must legislate reasonably precise standards regarding the information that can be recorded in the National Data Center. Certain types of information should not be recorded even if it is technically feasible to do so and a legitimate administrative objective exists. For example, it has long been “feasible,” and from some vantage points “desirable,” to require citizens to carry and display passports when traveling in this country, or to require universal fingerprinting. But we have not done so because these encroachments on our liberties are deemed inconsistent with the philosophical fiber of our society. Likewise, highly personal information, especially medical and psychiatric information, should not be permitted in the center unless human life depends upon recording it.

Legislation sharply limiting the information which federal agencies and officials can extract from private citizens is absolutely essential. To reinforce these limitations, the statute creating the Data Center should prohibit recording any information collected without specific congressional authorization. Until the quality of the center’s operations and the nature of its impact on individual privacy can be better perceived, the center’s activities should be restricted to the preservation of factual data.

The necessary procedural and technical safeguards seem to fall into two categories: those needed to guarantee the accuracy and integrity of the stored information, and those needed to control its dissemination.

To ensure the accuracy of the center’s files, an individual should have an opportunity to correct errors in information concerning him. Perhaps a print-out of his computer file should be sent to him once a year. Admittedly, this process would be expensive; some agencies will argue that the value of certain information will be lost if it is known that the government has it; and there might be squabbles between citizens and the Data Center concerning the accuracy of the file that would entail costly administrative proceedings. Nonetheless, the right of a citizen to be protected against governmental dissemination of misinformation is so important that we must be willing to pay some price to preserve it. Instead of an annual mailing, citizens could be given access to their files on request, perhaps through a network of remote computer terminals situated in government buildings throughout the country. What is necessary is a procedure for periodically determining when data are outmoded or should be removed from the file.

Turning to the question of access, the center’s computer hardware and software must be designed to limit access to the information. A medical history given to a government doctor in connection with an application for veteran’s benefits should not be available to federal employees not legitimately involved in processing the application. One solution may be to store information according to its sensitivity or its accessibility, or both. Then, governmental officials can be assigned access keys that will let them reach only those portions of the center’s files that are relevant to their particular governmental function.
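The idea of access keys can be made concrete with a small sketch. What follows is only an illustration under stated assumptions: the sensitivity tiers, the category names, and the `AccessKey` and `readable` names are all hypothetical, and nothing here describes how an actual center would be built.

```python
from dataclasses import dataclass
from enum import IntEnum


class Sensitivity(IntEnum):
    # Hypothetical tiers; the article does not name any.
    STATISTICAL = 1
    FINANCIAL = 2
    MEDICAL = 3


@dataclass(frozen=True)
class Record:
    subject_id: str
    category: str            # e.g. "medical", "tax" (illustrative labels)
    sensitivity: Sensitivity
    body: str


@dataclass(frozen=True)
class AccessKey:
    holder: str
    categories: frozenset    # portions of the files relevant to the holder's function
    max_sensitivity: Sensitivity


def readable(key, records):
    """Return only the records this key is allowed to reach."""
    return [
        r for r in records
        if r.category in key.categories and r.sensitivity <= key.max_sensitivity
    ]
```

Under this scheme an official processing a veteran's application would hold a key covering the medical category at the highest tier, while a key issued for tax work would never surface the medical history, whatever inquiry was typed.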

Everyone directing an inquiry to the center or seeking to deposit information in it should be required to identify himself. Finger- or voice-prints ultimately may be the best form of identification. As snooping techniques become more sophisticated, systems may even be needed to counter the possibility of forgery or duplication; perhaps an answerback system or a combination of finger- and voice-prints will be necessary. In addition, the center should be equipped with protector files to record the identity of inquirers, and these files should be audited to unearth misuse of the system. It probably will also be necessary to audit the programs controlling the manipulation of the files and access to the system to make sure that no one has inserted a secret “door” or a password permitting entry to the data by unauthorized personnel. It is frightening to realize that at present there apparently is no foolproof way to prevent occasional “monitor intrusion” in large data-processing systems. Additional protection against these risks can be achieved by exercising great care in selecting programming personnel.
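A protector file of the kind just described might look something like the sketch below. It is a minimal illustration, not a design: the class name, the notion of an authorized "purpose" for each inquirer, and the volume threshold used in the audit are all assumptions introduced here.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Inquiry:
    inquirer: str      # identity established before the inquiry is accepted
    subject_id: str    # whose file was consulted
    purpose: str       # stated governmental function (hypothetical field)


class ProtectorFile:
    """Append-only record of who asked about whom, kept for later audit."""

    def __init__(self):
        self._entries = []

    def record(self, inquirer, subject_id, purpose):
        # Refuse anyone who has not identified himself.
        if not inquirer:
            raise ValueError("anonymous inquiries are refused")
        self._entries.append(Inquiry(inquirer, subject_id, purpose))

    def audit(self, authorized, max_per_inquirer):
        """Return, sorted, the inquirers who stated a purpose outside their
        authorized set or who made more inquiries than the limit allows."""
        counts = Counter(e.inquirer for e in self._entries)
        flagged = {
            e.inquirer for e in self._entries
            if e.purpose not in authorized.get(e.inquirer, set())
        }
        flagged |= {name for name, n in counts.items() if n > max_per_inquirer}
        return sorted(flagged)
```

The audit here catches only the crudest misuse. It does nothing about a secret "door" planted in the programs themselves, which is why the programs would need auditing too.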

In the future, sophisticated connections between the center and federal offices throughout the country and between the federal center and numerous state, local, and private centers probably will exist. As a result, information will move into and out of the center over substantial distances by telephone lines or microwave relays. The center’s “network” character will require information to be protected against wiretapping and other forms of electronic eavesdropping. Transmission in the clear undoubtedly will have to be proscribed, and data in machine-readable form will have to be scrambled or further encoded so that they can be rendered intelligible only by a decoding process built into the system’s authorized terminals. Although it may not be worth the effort or expense to develop completely breakproof codes, sufficient scrambling or coding to make it expensive for an eavesdropper to intercept the center’s transmission will be necessary. If information in the center is arranged according to sensitivity or accessibility, the most efficient procedure may be to use codes of different degrees of complexity.
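One way to picture codes of different degrees of complexity is to give more sensitive categories longer keys. The sketch below is a toy under loud assumptions: the tier names and key lengths are invented, and the scrambling routine (a keystream built from SHA-256 and combined with the data by exclusive-or) is an illustration only, not a vetted cipher. A real system would rely on an established, reviewed encryption scheme.

```python
import hashlib
import secrets

# Hypothetical tiers: the more sensitive the data, the longer the key.
KEY_BYTES = {"statistical": 16, "financial": 24, "medical": 32}


def new_key(tier):
    """Draw a random key whose length depends on the sensitivity tier."""
    return secrets.token_bytes(KEY_BYTES[tier])


def _keystream(key, nonce, length):
    # Toy keystream: hash key + nonce + counter until enough bytes exist.
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])


def scramble(key, nonce, data):
    """Exclusive-or the data with the keystream. Applying the same call
    again, with the same key and nonce, restores the original."""
    stream = _keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))
```

Only a terminal holding the right key and nonce can render the transmission intelligible, which is the property the text asks for. The nonce must never be reused with the same key, or the scrambling can be stripped away by comparing two intercepted messages.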

AT A minimum, congressional action is necessary to establish the appropriate balance between the needs of the national government in accumulating, processing, and disseminating information and the right of individual privacy. This legislation must be reinforced by statutory civil remedies and penal sanctions.

Testimony before Congress concerning the intrusive activities of the Post Office, the Internal Revenue Service, and the Immigration and Naturalization Service gives us cause to balk at delegating authority over the Data Center to any of the agencies that have a stake in the content of data collected by the government. Some federal personnel are already involved in mail-cover operations, electronic bugging, wiretapping, and other invasions of privacy, and undoubtedly they would try to crack the security of any Data Center that maintains information on an individual basis. Thus it would be folly to leave the center in the hands of any agency whose employees are known to engage in antiprivacy activities. Similarly, the center must be kept away from government officials who are likely to become so entranced with operating sophisticated machinery and manipulating large masses of data that they will not respect an individual’s right to privacy.

The conclusion seems inescapable: control over the center must be lodged outside existing channels. A new, completely independent agency, bureau, or office should be established — perhaps as an adjunct to the Census Bureau or the National Archives — to formulate policy under whatever legislative guidelines are enacted to ensure the privacy of all citizens. The organization would operate the center, regulate the nature of the information that can be recorded and stored, ensure its accuracy, and protect the center against breaches of security.

The new agency’s ability to avoid becoming a captive of the governmental units using the center would be crucial. Perhaps with proper staffing and well-delineated lines of authority to Congress or the President, the center could achieve the degree of independence needed to protect individuals against governmental or private misuse of information in the center. At the other end of the spectrum, the center cannot become an island unto itself, populated by technocrats whose conduct is shielded by the alleged omniscience of the machines they manage and who are neither responsive nor responsible to anyone.

The proposed agency should be established before the center is planned. To date, there has been virtually no meaningful exchange among scientists, technicians, legal experts, and government people on the implications of the center. The center also might consider supporting some of the planned nonfederal computer networks, such as the Interuniversity Communications Council’s (EDUCOM) plan to link the major universities together, using them as models or operating laboratories to test procedures and hardware for the federal center.

To satisfy those who argue for the early establishment of a purely statistical Data Center, it might be possible for the proposed agency to set up a modest center in which information which does not invade privacy could be made available to government officials, educators, and private researchers. Other federal agencies might establish satellite centers that would contain information too sensitive to be recorded in the statistical center during that institution’s formative period, although the data in satellites ultimately might be transferred to the national center.

THE threat to individual privacy posed by the computer comes from the private sector as well as the proposed federal Data Center. Each year state and local governments, educational institutions, trade associations, and industrial firms establish data centers that collect and store quantities of information about individuals. Because the high cost of computer installation forces many organizations to operate on a time-share basis, the nonfederal centers pose a special danger to privacy. Without effective screening and built-in security devices, one participant, accidentally or deliberately, may invade and extract or alter the computer files of another participant. Moreover, because many time-share systems operate over large geographic areas, their transmissions will be vulnerable to tapping or malicious destruction unless they are scrambled or encoded. Right now, a mailing list containing 150 to 170 million names, accompanied by addresses and financial data, is being compiled. The list is so structured that it yields sublists of people in various vocational and avocational categories. Where the necessary information to produce this monster came from and how one gets off the list are mysteries.

Currently there are more than two thousand independent credit bureaus in the United States, many of whose files are being computerized. Eventually, these bureaus will make a network of their computers, creating a ready source of detailed information about an individual’s finances. The accuracy of these records will become increasingly crucial; an honest dispute between a consumer and a retailer over a bill may produce an unexplained and unexpungeable “no pay” evaluation in the computer and result in considerable damage to the buyer’s credit rating.

In testimony before the House subcommittee, the director of the New York State Identification and Intelligence System described a data bank containing files on “known” criminals that ultimately will contain millions of entries. He expressed a willingness to exchange information with police officials in other states as soon as the state systems could be meshed. If this system is tied into the National Data Center or New York’s Bureau of Motor Vehicles or welfare agencies, it would permit someone to direct an inquiry to the computer file of “known” criminals, find an entry under the name of his subject, and rely on that entry to the subject’s detriment without attempting to verify its accuracy.

Congress should consider the need for legislation setting standards to be met by nonfederal computer organizations in providing information about private persons and restraining federal officers from access to certain types of information from nonfederal data centers. Nonfederal systems should be required to install some protective devices and procedures. This is not to suggest that Congress should necessarily impose the same controls on nonfederal systems that it may choose to impose on the federal center. But a protector file to record the source of inquiries and modest encoding would probably prevent wide-scale abuse, although security needs vary from system to system. Since security may be facilitated by installing protective devices in the computer hardware itself, the possible need for regulation of certain aspects of computer manufacturing also should be taken into account.

The possibility of regulating transmission between federal and nonfederal centers and the interaction among nonfederal centers also should be considered. The specter of a federal agency, such as the Veterans’ Administration, reaching into a citizen’s medical file in a data center operated by a network of hospitals to augment the federal center’s file is a disturbing one. Regulating the security of the transmissions and imposing sanctions for noncompliance and eavesdropping would preserve individual privacy against governmental snooping and bureaucratic spinelessness or perfidy.