Total (Onscreen) Immersion
A guide to some electronic gadgets that make navigating and learning a foreign language easier.
By James Fallows
Here is a truth I have finally accepted about foreign languages: Some are just impossibly hard. Before you reply “Duh,” in any language, let me explain an underappreciated aspect of the varying difficulties in learning languages, which in turn highlights the importance of some new computerized language tools.
Why are major Asian languages—Chinese, Japanese, Korean—so much harder for English speakers to cope with than, say, Spanish? By the U.S. State Department’s classification system, the “Category Four” languages, “super-hard” for English speakers, also include Arabic, but I know firsthand only about the East Asian ones and will use them as examples.
Chinese and Japanese are hard because of their writing systems, which take their own native speakers many years to master. (Not Korean, whose alphabet can be learned in a day.) Japanese is particularly hard because of its grammar, which has many inflections and distinctions that don’t exist in English. (For their part, Chinese speakers have trouble with some distinctions made in English. In spoken Chinese, the same word is used for both he and she. Even very accomplished Chinese speakers of English often mix the words up: “Your son is getting big; how old is she now?”) And in general, hard languages are hard because they offer so few shortcuts or entry points. I have never studied Portuguese, but I can guess what aeroporto means. The counterpart in Chinese is written 机场 and pronounced jichang. In Japanese, it is written 空港 and pronounced kuko. No one who hasn’t studied them would have a clue.
The problem with hard languages, then, is that so much about them must be learned rather than recognized. That is where the latest versions of Internet-based translation tools come in. Two well-known products are Babel Fish, which is now owned by Yahoo, and the translation system offered as part of Google Language Tools. A Web search for the phrase online translation will turn up many other sites, like imTranslator.net and Worldlingo.com. Most of the sites are free; they generally are fast; and they all have similar limitations. But the best ones do something important, indeed revolutionary, to help make hard languages easier.
What the programs can’t do is obvious. They are strictly for written rather than spoken language. In practice they’re useful to English speakers only for bringing material from another language into English, not for going the other way. (I’m satisfied with a rough-and-ready English version of a Chinese page I’ve been looking at, but unless I’m desperate, I don’t want my words translated into Chinese in a similarly crude form.) As soon as you get tricky with them—having them translate a poem or do a round-trip conversion of a passage from English to Japanese and then back again—they make ridiculous errors. They are no substitute for real human mastery of language, and unless your expectations are realistic, you will find them frustrating rather than helpful.
Realistically, then, what they can do is create the shortcuts and entry points that hard languages normally lack. To return to the aeroporto example: Anyone who knows French, Italian, or Spanish could scan newspapers, product labels, ads, or Web pages in Portuguese and know what they’re about—not every nuance and detail, and with some big gaps and misunderstandings, but the main idea. Someone who can read Chinese can do something similar in Japan. The ability to tell, at a glance, what information is about—this is important, that can be ignored—is a crucial part of operating in any language. If you can do that, you can make progress; if you can’t, you’re lost. At their best, the Babel Fish and Google tools (the two I’ve tried extensively) give you this scanning ability with material that would otherwise be impenetrable.
Most of the companies offer a standalone site where you can enter a passage from another language, like a selection copied from an e-mail or other document, and have it converted into English. For Babel Fish, it is tinyurl.com/qgtov; for Google, tinyurl.com/az86s. From those same addresses, you can also follow simple instructions to install buttons in your browser toolbar, which you can then click for an on-the-fly translation of whatever you are looking at online. With Google’s system, if you have a selection marked on a Web page, only that portion will be translated; if you don’t, the entire page is processed. After a delay of one to 10 (or more) seconds, depending on how much text is being handled and how fast your connection is, the material reappears in English. Babel Fish automatically registers what language the page is in and converts it to English; each of Google’s browser buttons is dedicated to translations from a specific language, so you need to install a button for each one. With Google, if you run your mouse over any sentence in the English version, the passage in the original language appears in a bubble to the side. This lets you track what the program has done and become a better translator yourself, if you care to. Babel Fish and Google each handle a range of hard and easy languages, though Babel Fish does not work with Arabic.
The programs are best with predictable, structured information, like official biographies or corporate news releases. Google and Babel Fish are nearly perfect with online-commerce sites: I now confidently buy airline tickets, order supplies, and make hotel reservations on sites that are all in Chinese. They often work well with newspaper stories but have trouble with the slangy and allusive language of blogs. These two programs have subtle differences of emphasis and effectiveness. Just now, on the Chinese government’s main Web site (www.gov.cn), Google rendered the heading above a list of links as “Emergency Management,” and Babel Fish gave it as “anxious management.” In some other cases, Babel Fish has worked better. You can install buttons for both in your browser and use them interchangeably.
Over the past five years the U.S. government has financed an enormous effort to improve “machine translation” systems, so as to process more information from non-English sources. The government’s National Institute of Standards and Technology has conducted annual evaluations of such systems, many from universities or other research groups. NIST avoids using terms like contest and winner. But in 2005, Google’s system was the highest-ranked in all evaluations involving the two hard languages NIST used in its assessment: Arabic and Chinese. Last year, Google ranked highest in six out of eight tests. (The other free online sites, including Babel Fish, did not participate either time.)
Today’s most promising machine-translation systems are fascinating because they are not designed by people who are necessarily expert in Chinese or Arabic—or, for that matter, in English. Instead they reflect the triumph of brute-force statistical analysis. The system’s designers search constantly for documents already translated by skillful human linguists: United Nations documents; scientific papers; commercial books; treaties; European Union material that must be published simultaneously in several languages; multilingual corporate sites. They feed the original and the translation into their system, which amasses an ever-growing record of which sequence of letters and words in one language is most likely to match words in another language. The more data that is fed in, the more precise the correlations become.
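For readers who want to see the counting idea in miniature, here is a rough sketch in Python. It is a toy under stated assumptions, not a description of how Google's or anyone else's system is actually built: the four Portuguese-English sentence pairs are invented for illustration, the function names train and best_match are hypothetical, and the scoring rule (the Dice coefficient, a simple measure of how often two words show up together) is a stand-in for the far more elaborate statistics real systems use. What it does show is that nothing in the code knows any Portuguese; the pairing of aeroporto with airport falls out of the counts alone.

```python
from collections import Counter, defaultdict

# Invented toy "parallel corpus": each pair is (source sentence, human translation).
PAIRS = [
    ("o aeroporto", "the airport"),
    ("o hotel", "the hotel"),
    ("aeroporto grande", "big airport"),
    ("hotel grande", "big hotel"),
]


def train(pairs):
    """Count how often each source word shares a sentence pair with each target word."""
    cooc = defaultdict(Counter)   # cooc[source_word][target_word] -> shared pairs
    src_count = Counter()         # sentence pairs containing each source word
    tgt_count = Counter()         # sentence pairs containing each target word
    for src, tgt in pairs:
        src_words = set(src.lower().split())
        tgt_words = set(tgt.lower().split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        for s in src_words:
            for t in tgt_words:
                cooc[s][t] += 1
    return cooc, src_count, tgt_count


def best_match(word, model):
    """Return the target word most strongly associated with `word`, or None if unseen."""
    cooc, src_count, tgt_count = model
    word = word.lower()
    if word not in cooc:
        return None

    def dice(t):
        # Dice coefficient: 2 * (times seen together) / (times each was seen at all).
        return 2 * cooc[word][t] / (src_count[word] + tgt_count[t])

    return max(cooc[word], key=dice)


model = train(PAIRS)
print(best_match("aeroporto", model))
print(best_match("grande", model))
```

With only four sentence pairs the associations are crude, and a word the system has never seen simply gets no answer. Feeding in more translated text sharpens the counts, which is the point the designers make about data.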
“This approach has revolutionized machine translation,” I was told by Franz Josef Och, a computer scientist from Germany who heads Google’s machine-translation team. “The amount of language-specific knowledge you need in order to create the system is astonishingly small.” He said that of the three hard languages Google is concentrating on—Chinese, Russian, and Arabic—the Arabic translating system now produces the most natural-sounding results in English. To see what he means, you can try Google’s translated version of the Arabic page aljazeera.net.
I don’t think it’s too dramatic to say that tools like these can eventually be as important as Web searching. Ever wonder how newspapers in Seoul are playing the latest North Korean threat? Now you can get an idea for yourself. I wondered what the People’s Liberation Army journals were saying about a coming showdown with the United States. I quickly identified the articles that looked most interesting and then worked with a real, human translator to understand the subtle parts.
If that doesn’t sound like much, you’ve never operated through the murk of a hard language. Technology doesn’t always bring us together, but in this case it helps.