The Java Theory
The Internet might someday replace the personal computer -- but for now only conventional software can get you where you need to go
THE personal-computer era is about twenty years old, dating from some point between the founding of Microsoft, in 1975, and the introduction of the IBM Personal Computer, in 1981. People have continually marveled at how fast everything is changing. But never before have the fundamentals of the business changed as quickly or as dramatically as they have in the past year and a half, because of the sudden popularity of the Internet.
Less than two years ago an auditorium's worth of high-powered computing executives watched in fascination as one of their colleagues demonstrated something called the World Wide Web. (A report on the meeting appeared in the July, 1994, issue of this magazine. For those joining us late: The Internet is the supernetwork that links computer networks around the world. "The Web" is a graphically oriented system that makes it easy for someone using one computer connected to the Internet to inspect and collect information from another computer connected to the Internet, no matter where those computers might be. Such transactions have been possible for years, but the Web allows users to perform them merely by clicking on little symbols, which are connected to files or other computers in a series of hypertext links.) Now children in elementary school have their own Web pages, and beer and car companies put their Web-site addresses in their advertisements.
The popularity of the Web has enriched certain companies, particularly Netscape, which makes the most popular "Web browsing" software, and Sun Microsystems, the leading supplier of the "server" computers on which the Internet runs. It has also raised questions about the future of the industry's dominant firm, Microsoft -- a debate that has important implications for all computer users.
The debate concerns whether the growth of the Internet will strengthen or dilute Microsoft's ability to set standards for the software industry. For more than a decade Microsoft's great achievement has been to have its operating systems -- first MS-DOS and then Windows -- accepted as the nearly universal standard for IBM-style computers. This guaranteed an ever-increasing cash flow, since the sale of nearly every computer meant the sale of a Microsoft operating system. It also made Microsoft's word-processing programs, spreadsheets, and other software more attractive than their competitors from, say, Borland International or Lotus Development Corporation, since customers knew they would always be compatible with Microsoft operating systems. Last summer, as the company prepared to release its ballyhooed Windows 95 software, which comes with a built-in Internet connection, those most fearful of Microsoft's influence worried that the company was about to project its standard-making power onto the Internet as a whole. Within a few years, they said, the Microsoft Network might crush CompuServe, America Online, and other commercial networks, and some Microsoft-created browser might supplant Netscape as the industry standard.
Shortly after the release of Windows 95 the members of another camp made their surprising, and opposite, case. They said that far from strengthening Microsoft, the rise of the Internet might impose a limit on the company's expansion and influence. According to this line of reasoning, the question was not whether Microsoft's software was better or worse than anyone else's but whether most kinds of software people now buy were about to become less important.
THIS argument involves the programming language Java, which was developed by Sun Microsystems and which suddenly became famous at the end of last year. The idea behind Java, to oversimplify, is that it could make a computer work like a telephone. The nation's telephone network is an enormously complex software-and-hardware combination, but the average user does not need to think about its complexities for a minute (except for the horror of selecting among long-distance plans). You buy whatever phone you want; you plug it in; it works. When the companies offer some new service -- call forwarding, voice mail -- you don't have to buy extra memory chips for your phone or get an upgrade for your telephone software, as you would in the computer world. You keep using your old phone.
And so it could be in the world of computers, according to the Java theory. Java is a way to let your computer borrow and use programs that exist on the central network -- like the switching and call-forwarding programs on the telephone network -- even though you have never bought or installed them. If you were working on a financial problem, you might locate an Internet site that had data you wanted. Using the Java protocol, that site would instantaneously ship you the most up-to-date version of the particular spreadsheet or statistical routines you needed to analyze the data. When these "applets," or small program components, had solved your problem, they would disappear from your machine. Presumably, in some way not yet specified, you would pay a small fee somewhere, as you do for use of the different services available by telephone. You wouldn't need to worry whether you had found the right files for the latest release of your favorite program. Your computer wouldn't need to be a huge battleship, with more raw power than ran the Apollo project and with hard drives capable of storing gigabytes of complex programs. It would simply need to be able to connect to the Internet and receive and run programs sent by Java (which is compatible with nearly any kind of computer and nearly any operating system). Conceivably Java could lead to the production of stripped-down "Internet terminals," costing $500 or so, which could turn up where pay phones do today. Rather than take portable computers with them, people could send messages from an Internet terminal in one airport as they departed and collect them from another after they arrived.
This vision of telephone-like computers, busily touted by Sun representatives since late last year, might never come true. In the short term it faces the ugly reality that existing Internet phone connections are just too slow and overtaxed to handle the high-capacity traffic a Javaed world would require. When I visited Sun's headquarters last fall and watched a Java demonstration by Eric Schmidt, the chief technology officer, an embarrassing slowdown kept the demo from loading at all. Even Sun's internal lines, it seemed, were congested, because end-of-quarter financial data were being passed around.
Last December, Microsoft seemed to acknowledge the importance of Java when it said that it would license the technology for use with its software. Immediately another of the religious wars that always surround Microsoft's actions broke out in the computer world. Some people said this was a sign that Microsoft, ever realistic, had abandoned hope of bending the Internet to its own standards. Others said that Microsoft, ever resourceful, would figure out a way to put its own proprietary stamp on Java -- much as the company had done a decade before when it made the Macintosh-like graphic interface part of its own Windows.
This argument will play out over the next few years, while we wait for phone lines to become fast enough to give Java a serious trial. In the meantime, the idea behind Java -- that the Internet, which we cannot see or really imagine, will take over functions now performed on the computer -- applies to the part of computing that has always been the most interesting and will become increasingly valuable. This is the ability to find information that we actually want.
SOME computer functions are so deeply ingrained in modern life that they are no longer interesting: word processing, E-mail and data transfer, spreadsheets and financial analysis. The programs that handle these functions are mature and complete enough to be merely tools. At the other extreme are tasks that will never be successfully computerized. I am convinced that I will never see a computer that can take a recording of normal conversation and convert it into text, or one that can translate a document from Japanese to English without numerous howlers. In between are the interesting challenges -- the tasks that are hard but not impossible for computers and potentially very valuable for users.
Information retrieval is at the top of the list. If our own computers give us information by the ton, the Internet can provide it in Krakatoan quantities. It is sometimes hard to find anything whatsoever of value on the Internet, with so many sites listing such things as the personal TV-show preferences of people you have never met or showing "live cam" shots of office workers in Germany. Finding the particular facts you are looking for can seem impossible. Yet in the long run the Internet, which links together many of the world's data resources, should be the ideal research vehicle.
Tools are available for finding information in the limited universe of one's own desktop computer, and for now they are indispensable, if space-hogging and sometimes cumbersome. Back in the primitive old DOS days, I relied on two now-orphaned programs from Lotus -- Magellan and Agenda -- to search for a fact or name in some long-forgotten disk file, and to organize the facts thereby retrieved. (Sorry: once again you'll get no Macintosh references from me. If I had had perfect foresight, I would have started down the Mac road a dozen years ago, because Macintosh systems, while more expensive, are much less harassing to use than IBM PC-style models. But I took the road more traveled.) I still find Agenda indispensable when I want to organize data for a writing project. Like the 1964 Mustang, it has been replaced but not improved upon. For bulk storage of research material -- interview notes, material scanned in from newspaper or magazine clippings, citations from online sources, and so on -- I have come to love and rely upon a program named askSam.
When it made its debut, about ten years ago, askSam was a nerd's delight, from its peculiar name to its arcane command structure to its ugly on-screen appearance. Its compensating virtue was its ability to search large quantities of data quickly. Its newest release, version 3.0, is attractive and easy to use and yet is faster and more powerful than ever. I dump all my research data into a big askSam file (once, my file totaled nearly fifteen megabytes) and within a second or two I can call back any particular document or clipping, using askSam's index of keywords. (The askSam company's phone number is 800-800-1997. A stripped-down but perfectly adequate version of the program costs $149.95. The full-scale "professional" version, which can handle larger files and has other features, costs $395.) As one who still believes that the IBM operating system OS/2 Warp is faster, stronger, and more reliable than Windows 95, I also use a phenomenally speedy Warp retrieval program called Search Manager/2, which can find names or phrases not just in the askSam research files but anywhere on my hard disks. (It costs $79, from Indelible-Blue, 800-776-8284.) While I'm at it, I should mention that the new release of Xerox's optical character-recognition software, TextBridge Professional version 3.0, does the best job I've yet seen of converting printed material into text for a computer. The basic version of TextBridge sells for about $75 from mail-order houses. The "professional" version, with more-advanced editing tools, costs about $250.
But with Java in mind I ask myself, Why should I have these huge research files on my computer at all? Only a tiny fraction of their bulk represents material that can be found nowhere else -- my own interview notes, E-mail, and so on. Most of it is information I wouldn't need to keep on the premises if I were sure I could find and use it when I wanted to. Java proponents say they can spare us from building up stockpiles of programs that we rarely use. Why shouldn't they spare us needless data buildup too?
For one limited set of people this prospect is already a reality. A decade ago journalists following a theme or story kept big yellowing stacks of clippings on their desks, usually until the desks themselves disappeared. Now few of them bother with clippings, because they know they can get a citation when they want it from Lexis-Nexis, the nearly all-inclusive compilation of newspaper and magazine stories, plus TV and radio broadcasts, which is owned by Reed Elsevier. The most obvious drawback of Lexis-Nexis is its very steep cost. Fees are customized for each client, but they range from several hundred to several thousand dollars a month and essentially rule out the service for people not associated with some large institution that can afford it. Even if more people could afford it, Lexis-Nexis would still be significantly limited as a guide to the information network of the future.
The problem concerns its "search engine." Nexis, like most of today's searching programs and databases, works on the principles of Boolean logic that most of us learned in junior high school. The most important words in the Boolean dialect are "and," "or," and "not." If you wanted to use Nexis to find an article that compares President Bill Clinton or Jimmy Carter with Harry Truman but doesn't mention Lyndon Johnson or Dwight Eisenhower, your query would start out "Clinton OR Carter AND Truman AND NOT Johnson OR Eisenhower." Endless refinements go on from there, such as specifying when the article appeared ("Date is 1995") or that the names occur close enough to each other that they're likely to be logically related in the article. For instance, "(Clinton OR Carter) W/20 Truman" would find appearances of the names within twenty words of each other.
Boolean searches on Nexis are extremely fast and accurate -- as long as you know exactly what you are looking for. But if you have a more general query, or if you do not pose your question in precisely the right form, Boolean searches can lead to random, surprising results. If the article you were thinking of appeared one day before or after your target period, or if the names you were looking for were twenty-one rather than twenty words apart, or if owing to a newspaper typo "Clinton" became "Clnton," you would not find those articles, and you would never know how close you came.
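To make that brittleness concrete, here is a minimal sketch of a Boolean query with a proximity test. It is not Nexis's actual engine: the function names are hypothetical, and "within twenty words" is assumed to mean a difference in word positions of twenty or less.

```python
import re

def tokens(text):
    # Lowercase the text and split it into plain words, dropping punctuation.
    return re.findall(r"[a-z']+", text.lower())

def within(words, group_a, term_b, distance):
    # True if any word from group_a sits within `distance` positions of term_b.
    pos_a = [i for i, w in enumerate(words) if w in group_a]
    pos_b = [i for i, w in enumerate(words) if w == term_b]
    return any(abs(i - j) <= distance for i in pos_a for j in pos_b)

def matches(text, distance=20):
    # Hypothetical stand-in for the query
    # "(Clinton OR Carter) W/20 Truman AND NOT (Johnson OR Eisenhower)".
    words = tokens(text)
    near = within(words, {"clinton", "carter"}, "truman", distance)
    excluded = bool(set(words) & {"johnson", "eisenhower"})
    return near and not excluded

hit = "Clinton, like Truman before him, faced a hostile Congress."
typo = "Clnton, like Truman before him, faced a hostile Congress."
too_far = "Clinton " + "word " * 20 + "Truman"  # names 21 positions apart

print(matches(hit))      # exact spelling, names close together
print(matches(typo))     # one dropped letter and the article is missed
print(matches(too_far))  # one word past the limit and the article is missed
```

The query has no notion of "almost": an article is either in the result set or invisible, which is why a near miss leaves no trace.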
This is much better than having no search system -- modern journalism could hardly exist without Nexis -- but it is still an imperfect tool. Nexis also offers a non-Boolean search system, called "freestyle," which allows you to launch more-general queries -- "Show me information about IBM's profits." In my experience freestyle searches are even less effective than the Boolean process in helping locate the data I am looking for.
Some of the most intriguing research now under way in computerdom, which I will describe in another article, involves efforts to impose a "find what I mean" index-and-searching structure on the now-uncharted tracts of the Internet. Mitchell Kapor, the founder of Lotus, has argued that this effort won't go anywhere until some equivalent of the Dewey Decimal System is applied to databases around the world. Only with an idea of where an article or research paper fits into the general structure of knowledge, he says, can the limits of hit-or-miss Boolean searching be overcome.
IN one little corner of the Internet something like Kapor's plan has already been applied. Along with the Dewey Decimal System, one of the most ambitious attempts to categorize information is the "Propaedia," or "Outline of Knowledge," long a part of the Encyclopaedia Britannica. The existence of the Propaedia may in turn explain why the best glimpse of the possibilities of a usable information network may be the one offered by the stodgy-seeming old Britannica.
After nearly a decade of preparation the company introduced two electronic versions of the Encyclopaedia Britannica in the past year and a half. The Britannica is available on a CD-ROM now selling for $495, but anyone who has a modem and an Internet connection should instead consider the Web-based version. It is less expensive ($150 a year for family use); its database is updated regularly; and, as in the Java vision, it doesn't care what kind of machine you use. The Encyclopaedia Britannica's telephone number is 800-323-1229; its Web address, where you can sign up for a seven-day free trial of the program, is http://www.eb.com.
The Britannica's designers used the Propaedia, ingeniously, as an answer sheet against which to judge the success of their retrieval system. Their existing Propaedia structure told them which articles, from various parts of the encyclopedia, should be pulled up by a general query. ("Why did the League of Nations fail?" "Which animals have the longest life spans?") Thus they could adjust the weight of their numerous "search algorithms" to come as close as possible to pulling up all the relevant articles. These algorithms, or formulas, measure the frequency of a word's appearance in an article, whether it shows up in a headline, its proximity to related terms, and, among other factors, whether it takes on a different meaning when used in a compound -- for example, "social security," which has a meaning distinct from that of its component words.
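The tuning idea can be sketched in a few lines. This is an illustration under stated assumptions, not the Britannica's actual method: the three sample articles, the three features (word frequency, headline match, compound-phrase match), and the grid of candidate weights are all invented for the example, and the "answer sheet" stands in for the Propaedia.

```python
from itertools import product

# Hypothetical miniature encyclopedia.
ARTICLES = {
    "alarm": {"title": "Burglar Alarms",
              "body": "home security systems improve security and social clubs buy security too"},
    "insects": {"title": "Social Insects",
                "body": "ants and bees are social insects"},
    "welfare": {"title": "Social Security",
                "body": "social security programs pay pensions to retired workers"},
}

def features(query, article):
    q = query.lower()
    q_words = q.split()
    title = article["title"].lower()
    body = article["body"].lower()
    body_words = body.split()
    # Share of body words that are query words.
    freq = sum(body_words.count(w) for w in q_words) / max(len(body_words), 1)
    # Share of query words that appear in the headline.
    headline = sum(1 for w in q_words if w in title.split()) / len(q_words)
    # Whether the query appears intact, as a compound such as "social security".
    phrase = 1.0 if q in body or q in title else 0.0
    return (freq, headline, phrase)

def score(query, article, weights):
    return sum(w * f for w, f in zip(weights, features(query, article)))

def rank(query, weights):
    return sorted(ARTICLES, key=lambda k: score(query, ARTICLES[k], weights), reverse=True)

def tune(answer_sheet, grid=(0.0, 0.5, 1.0)):
    # Try every weight combination and keep the first one that puts the most
    # expected articles at the top of their queries' rankings.
    best, best_hits = None, -1
    for weights in product(grid, repeat=3):
        hits = sum(rank(q, weights)[0] == want for q, want in answer_sheet.items())
        if hits > best_hits:
            best, best_hits = weights, hits
    return best, best_hits

answer_sheet = {"social security": "welfare"}
print(rank("social security", (1.0, 0.0, 0.0)))  # frequency alone
weights, hits = tune(answer_sheet)
print(weights, rank("social security", weights))
```

Counting words alone favors the burglar-alarm article, which repeats "security" most often. Only when the compound phrase carries weight does the article the answer sheet expects rise to the top.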
The power and nuance of the resulting system are remarkable. I vaguely remembered that Sir Christopher Wren had been commemorated with a significant inscription in London. I typed in "What is Wren's epitaph?" Within two seconds, and after two clicks of the mouse, the screen displayed several paragraphs about his tomb in St. Paul's Cathedral, which is marked with the phrase Lector, si monumentum requiris, circumspice -- "Reader, if you seek a monument, look around." I could have found this in the paper encyclopedia, if I had bothered. But next I asked "What was the history of slavery in Latin America?" -- and I would never have bothered to track down the dozen or so related articles, from several different volumes, that this inquiry produced.
The Britannica's searching system is not perfect, but it has always given me something useful. It even reminded me of something I used to know: that Boolean logic is named for the English mathematician George Boole, 1815-1864. I have not yet seen it make an embarrassing mistake. That move it leaves to the user.
The Atlantic Monthly; March, 1996; The Java Theory; Volume 277, No. 3; pages 113-117.