NOT LONG AFTER DAWN ONE MORNING IN THE MIDDLE of May, Ken Holberger turned his brown Saab down the road toward the red-brick, fortress-like Data General building. At the beginning of the debugging, it had been dark when he drove in and dark when he left. The slow daily ascent of the morning sun across his windshield was one of the principal means by which he kept track of outside, planetary time. For the past few months, he had found himself incapable of reading newspapers. "We're deep in the debug. Yeah, underground," said Holberger. "Burn-out city."
Holberger was short and very handsome. He wore a neatly trimmed black beard. Rasala called him "a sharp cat." He was considered to be a first-rate circuit designer, and at this point the only member of the group whose understanding of the details of the hardware approached completeness. Though only in his mid-twenties, he had chief responsibility under Rasala for the creation of Eagle's hardware, and he shared Rasala's general attitude toward an engineering project. "It doesn't matter how hard you work on something," Holberger once said. "What counts is finishing and having it work."
Holberger had devised significant parts of Eagle, including some of the IP. He said he was not worried about that piece of hardware. The machine was like a crossword puzzle; he and the other designers had invented it, surely they could solve it. "I'm getting quite good at it," Holberger said. "I can track a problem back into the twilight zone quite well."
Holberger went right to the lab, to see how the machines had performed the night before.
Some weeks previously, the Hardy Boys had run a diagnostic program called "Eclipse 21," and the prototypes had failed occasionally. The debuggers had hypothesized that the problem was likely of the flaky sort, the result of a loose connection or noise, and they had moved on to other diagnostic programs. Now, however, the prototypes had successfully completed all of the basic diagnostics except for Eclipse 21. So Holberger had decreed that it was time to return and clean up that bug. Accordingly, the previous night he had asked that when they departed the lab, the second shift of debuggers leave both prototypes running Eclipse 21.
One characteristic of the diagnostic programs was extreme repetitiousness. They directed the computer to perform many sequences of operations and to do so over and over again. When the machine completed the entire program, it was said to have performed one "pass," and if left running all night, it would perform many passes. If the computer failed to execute an instruction properly, the program would tell the machine to send an error message to the system console, a large device with a typewriter-like keyboard. Then the program would tell the computer to continue its exercise. In the morning, the debuggers could find out right away, by placing an order through the console, how many passes a machine had completed during the previous night and how many times it had failed.
When Holberger came into the lab, a tall and fairly husky young man named Jim Veres was sitting at the console of the prototype called Gollum. (The Hardy Boys had christened one of their prototypes "Coke," and Holberger had named the other "Gollum," after the sinister spidery creature in Tolkien's Lord of the Rings.) Veres had already examined the results of the nighttime exercise. Turning to Holberger, he said that Gollum had run 921 passes of Eclipse 21 and had committed only thirty failures. Coke, Veres added, had performed in identical fashion.
Holberger made a dour face. In 921 passes, thirty was a very small number of failures. Noise or a loose connection could produce such a result, but flaky problems usually created failures in no discernible pattern. If the problem was noise, then why had Coke and Gollum performed identically? Veres was thinking along the same lines. To himself, Veres said, "Either that noise is remarkably consistent, or we've got a real problem in the logic somewhere."
Holberger and Veres swapped a few theories about noise, but they were really just taking deep breaths. "Okay," said Holberger at length. "Time to fix it."
Between Holberger and Veres there existed a technical understanding of these machines that speech could not encompass. Most of the debuggers shared this specialist's ESP, but not all debugged well together. Holberger and Veres did. Like many of the Hardy Boys, Veres was just a year out of engineering school; Holberger called him "one of the stars." Veres had designed a large part of the IP, with Holberger's assistance, and he had quickly developed his own technique of debugging.
In talking to Veres, one found him listening intently, sometimes with a stern look on his face. He had his own small computing system. Sometimes, after a long day in the lab, he would go home and play with it. "I like to tinker," he said once. "I like to build things." In his senior year at Georgia Tech, he became interested in digital clocks. "I built four or five. Then it was computer terminals. I built one. Then I decided I ought to have a computer to hook it up to. So I got a microprocessor and then I figured it was not worth much without an operating system, so I wrote a small operating system. I did a number of all-nighters building this computer junk." Veres was enjoying the Eagle project; his only complaint was that lately his managers had been scheduling work in the lab in such a way that he could not always get his hands on Gollum for as much time as he desired.
Holberger and Veres hooked the probes of two logic analyzers to various parts of Gollum. They "put on a trace." The analyzers were the Hardy Boys' windows into the computers. Each analyzer had a little screen; in essence, the analyzer served as a camera. Used artfully, it could take pictures of what was going on inside and between the circuit boards.
Holberger and Veres backed up the diagnostic program just a short distance from the point of failure. They ran the program. Gollum did not fail.
This was, as Veres put it, "another clue." In a machine with accelerators, history is important. Often it is some complex combination of previous operations that leads to a later failure. So they set Gollum running the diagnostic program from the beginning, and went to the cafeteria for coffee. About fifteen minutes later, when they were sitting in front of Gollum again, they saw flashes on the screens of their analyzers. Gollum had failed. They pulled up their chairs and began to study pictures.
What instruction did Gollum fail to execute? That was the first question.
"Okay. It's doing a JSR and Return."
The diagnostic program was telling Gollum to take a short detour off the main road of the program. Gollum was supposed to "Jump" out of the stream of instructions that it was executing, while remembering how to get back ("Save Return"), and get a new instruction. This new instruction should have directed Gollum to return to the mainstream of the program. This small series of operations was a spot quiz, as it were. Further study of the pictures on the analyzers' screens and of the various documents scattered across the top of the table told them that Gollum had in fact jumped to the right instruction and had returned to the right place. But when it returned, it executed the wrong instruction.
"Is it hitting the I-cache?" asked Holberger. Was the instruction that Gollum was supposed to execute, after making its jump, residing in the IP's storage? They examined more pictures and decided that Gollum was indeed hitting the I-cache. When they examined the contents of the I-cache, they discovered that it contained a wrong instruction where the right one should have been.
HOLBERGER AND VERES SAT BEFORE GOLLUM. THEY had completed their preliminary investigation of the bug in Eclipse 21, and now the main question lay before them: Why was there a wrong instruction at the right place? The morning and half the afternoon had slipped away. It was two o'clock, and Jim Guyer had just come into the lab. He put down his motorcycle helmet, pulled up a chair, and started asking questions.
Guyer, at twenty-six, was a relatively old hand among Hardy Boys. His brown beard made an oval frame around his face. As a rule, he was very cheerful, much given to laughter, and he had become one of Rasala's favorites, largely because he took an interest in every part of the machine. Although he had had nothing to do with designing it, Guyer had been studying the IP assiduously of late. Indeed, he had been blaming the IP, sometimes incorrectly, for practically everything that went wrong inside the prototypes.