The first and bluntest is the “DNS block.” The DNS, or Domain Name System, is in effect the telephone directory of Internet sites. Each time you enter a Web address, or URL—www.yahoo.com, let’s say—the DNS looks up the IP address where the site can be found. IP addresses are numbers separated by dots—for example, TheAtlantic.com’s is 18.104.22.168. If the DNS is instructed to give back no address, or a bad address, the user can’t reach the site in question—as a phone user could not make a call if given a bad number. Typing in the URL for the BBC’s main news site often gets the no-address treatment: if you try news.bbc.co.uk, you may get a “Site not found” message on the screen. For two months in 2002, Google’s Chinese site, Google.cn, got a different kind of bad-address treatment, which shunted users to its main competitor, the dominant Chinese search engine, Baidu. Chinese academics complained that this was hampering their work. The government, which does not have to stand for reelection but still tries not to antagonize important groups needlessly, let Google.cn back online. During politically sensitive times, like last fall’s 17th Communist Party Congress, many foreign sites have been temporarily shut down this way.
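To make the directory analogy concrete, here is a minimal sketch of a tampered lookup. It is a toy model, not a description of how any real resolver or the GFW is built: the `DIRECTORY` and `TAMPER_RULES` tables and the `resolve` function are hypothetical names, and the addresses come from ranges reserved for documentation rather than from the real sites.

```python
from typing import Optional

# Toy directory: hostname -> IP address. These are placeholder addresses from
# documentation-only ranges, not the sites' real addresses.
DIRECTORY = {
    "news.bbc.co.uk": "192.0.2.10",
    "google.cn": "192.0.2.20",
    "baidu.com": "198.51.100.30",
}

# Hypothetical tampering rules. None means "give back no address";
# a hostname means "answer with that other site's address instead".
TAMPER_RULES = {
    "news.bbc.co.uk": None,      # the no-address treatment
    "google.cn": "baidu.com",    # the bad-address treatment
}


def resolve(hostname: str) -> Optional[str]:
    """Look up a hostname, applying any tampering rule before the directory."""
    name = hostname.lower()
    if name in TAMPER_RULES:
        target = TAMPER_RULES[name]
        return None if target is None else DIRECTORY.get(target)
    return DIRECTORY.get(name)


if __name__ == "__main__":
    for site in ("news.bbc.co.uk", "google.cn", "baidu.com"):
        print(site, "->", resolve(site))
```

In this model the user's browser never learns that anything was changed. It simply receives no number, or the wrong number, and behaves accordingly.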
Next is the perilous “connect” phase. If the DNS has looked up and provided the right IP address, your computer sends a signal requesting a connection with that remote site. While your signal is going out, and as the other system is sending a reply, the surveillance computers within China are looking over your request, which has been mirrored to them. They quickly check a list of forbidden IP sites. If you’re trying to reach one on that blacklist, the Chinese international-gateway servers will interrupt the transmission by sending an Internet “Reset” command both to your computer and to the one you’re trying to reach. Reset is a perfectly routine Internet function, which is used to repair connections that have become unsynchronized. But in this case it’s equivalent to forcing the phones on each end of a conversation to hang up. Instead of the site you want, you usually see an onscreen message beginning “The connection has been reset”; sometimes instead you get “Site not found.” Annoyingly, blogs hosted by the popular system Blogspot are on this IP blacklist. For a typical Google-type search, many of the links shown on the results page are from Wikipedia or one of these main blog sites. You will see these links when you search from inside China, but if you click on them, you won’t get what you want.
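The connect-phase check can be sketched as a small simulation. This is an illustration under stated assumptions, not real network code: the `Packet` type, the `BLACKLIST` contents, and the `inspect` function are hypothetical, and the idea that each Reset is made to look as if it came from the other end of the conversation is an assumption added here for the sake of the model.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical blacklist holding one placeholder address from a
# documentation-only range.
BLACKLIST = {"203.0.113.7"}


@dataclass(frozen=True)
class Packet:
    src: str   # sender's IP address
    dst: str   # receiver's IP address
    flag: str  # "SYN" for a connection request, "RST" for a reset


def inspect(mirrored: Packet) -> List[Packet]:
    """Return the Reset packets to inject for one mirrored connection request."""
    if mirrored.flag != "SYN" or mirrored.dst not in BLACKLIST:
        return []  # not a request to a blacklisted address: inject nothing
    return [
        # Sent to the user, made to look as if the remote site hung up.
        Packet(src=mirrored.dst, dst=mirrored.src, flag="RST"),
        # Sent to the remote site, made to look as if the user hung up.
        Packet(src=mirrored.src, dst=mirrored.dst, flag="RST"),
    ]


if __name__ == "__main__":
    request = Packet(src="10.0.0.1", dst="203.0.113.7", flag="SYN")
    for forged in inspect(request):
        print(forged)
```

Because the inspector works on a mirrored copy, the original request still travels onward. The block succeeds only because both ends obey the Reset they receive.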
The third barrier comes with what Lih calls “URL keyword block.” The numerical Internet address you are trying to reach might not be on the blacklist. But if the words in its URL include forbidden terms, the connection will also be blocked. (The Uniform Resource Locator is a site’s address in plain English, say www.microsoft.com, rather than its all-numeric IP address.) The site FalunGong.com appears to have no active content, but even if it did, Internet users in China would not be able to see it. The forbidden list contains words in English, Chinese, and other languages, and is frequently revised: “like, with the name of the latest town with a coal mine disaster,” as Lih put it. Here the GFW’s programming technique is not a reset command but a “black-hole loop,” in which a request for a page is trapped in a sequence of delaying commands. These are the programming equivalent of the old saw about how to keep an idiot busy: you take a piece of paper and write “Please turn over” on each side. When the Firefox browser detects that it is in this kind of loop, it gives an error message saying: “The server is redirecting the request for this address in a way that will never complete.”
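A keyword check on the URL itself can be sketched in a few lines. This is a hedged illustration, not the GFW's actual matching logic: the `FORBIDDEN_TERMS` list and the `url_is_blocked` function are hypothetical, and decoding percent-escapes before matching is an assumption about how a careful filter might avoid being dodged by simple encoding tricks.

```python
from typing import Iterable
from urllib.parse import unquote

# Hypothetical list. The real one is said to span several languages
# and to change frequently.
FORBIDDEN_TERMS = ["falungong", "example-banned-term"]


def url_is_blocked(url: str, terms: Iterable[str] = FORBIDDEN_TERMS) -> bool:
    """Return True if any forbidden term appears anywhere in the URL."""
    # Undo percent-escapes such as %67 and ignore letter case before matching.
    decoded = unquote(url).casefold()
    return any(term.casefold() in decoded for term in terms)


if __name__ == "__main__":
    for candidate in (
        "http://www.FalunGong.com/",
        "http://www.microsoft.com/",
        "http://example.org/search?q=falun%67ong",
    ):
        print(candidate, "->", "blocked" if url_is_blocked(candidate) else "allowed")
```

Note that the check never looks at the numerical address. A site that passes the IP blacklist can still be stopped by the words in its name or in the query a user types.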
The final step involves the newest and most sophisticated part of the GFW: scanning the actual contents of each page—which stories The New York Times is featuring, what a China-related blog carries in its latest update—to judge its page-by-page acceptability. This again is done with mirrors. When you reach a favorite blog or news site and ask to see particular items, the requested pages come to you—and to the surveillance system at the same time. The GFW scanner checks the content of each item against its list of forbidden terms. If it finds something it doesn’t like, it breaks the connection to the offending site and won’t let you download anything further from it. The GFW then imposes a temporary blackout on further “IP1 to IP2” attempts—that is, efforts to establish communications between the user and the offending site. Usually the first time-out is for two minutes. If the user tries to reach the site during that time, a five-minute time-out might begin. On a third try, the time-out might be 30 minutes or an hour—and so on through an escalating sequence of punishments.
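The escalating time-outs can be modeled as a small bookkeeping table keyed on the user-and-site pair. This is a sketch under assumptions, not a reconstruction of the real system: the `PairBlocker` class and its method names are hypothetical, the durations use the two-minute, five-minute, and 30-minute figures mentioned above, and the choice to wipe the slate clean once a time-out has been served is a guess.

```python
from typing import Dict, Tuple

# Time-outs in seconds: 2 minutes, then 5 minutes, then 30 minutes.
# Later strikes reuse the last value (an assumption).
PENALTIES = [120, 300, 1800]


class PairBlocker:
    """Tracks temporary blackouts for each (user IP, site IP) pair."""

    def __init__(self) -> None:
        # (user_ip, site_ip) -> (number of strikes so far, blocked-until time)
        self._state: Dict[Tuple[str, str], Tuple[int, float]] = {}

    def flag(self, user_ip: str, site_ip: str, now: float) -> float:
        """Record that the scanner found forbidden content; start a time-out."""
        return self._penalize((user_ip, site_ip), now)

    def allow(self, user_ip: str, site_ip: str, now: float) -> bool:
        """Decide whether a new connection attempt may go through."""
        key = (user_ip, site_ip)
        if key not in self._state:
            return True
        _, blocked_until = self._state[key]
        if now >= blocked_until:
            del self._state[key]  # time-out served; assumed to reset the count
            return True
        self._penalize(key, now)  # retrying too early starts a longer time-out
        return False

    def _penalize(self, key: Tuple[str, str], now: float) -> float:
        strikes = self._state.get(key, (0, 0.0))[0]
        duration = PENALTIES[min(strikes, len(PENALTIES) - 1)]
        blocked_until = now + duration
        self._state[key] = (strikes + 1, blocked_until)
        return blocked_until


if __name__ == "__main__":
    blocker = PairBlocker()
    blocker.flag("10.0.0.1", "203.0.113.7", now=0.0)
    print(blocker.allow("10.0.0.1", "203.0.113.7", now=60.0))   # retry too early
    print(blocker.allow("10.0.0.2", "203.0.113.7", now=60.0))   # a different user
```

Times are passed in explicitly rather than read from the system clock, which keeps the model easy to step through by hand. The key point is that the penalty attaches to the pair, so one user's blackout does not affect anyone else reaching the same site.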
Users who try hard enough or often enough to reach the wrong sites might attract the attention of the authorities. At least in principle, Chinese Internet users must sign in with their real names whenever they go online, even in Internet cafés. When the surveillance system flags an IP address from which a lot of “bad” searches originate, the authorities have a good chance of knowing who is sitting at that machine.
All of this adds a note of unpredictability to each attempt to get news from outside China. One day you go to the NPR site and cruise around with no problem. The next time, NPR happens to have done a feature on Tibet. The GFW immobilizes the site. If you try to refresh the page or click through to a new story, you’ll get nothing—and the time-out clock will start.
This approach is considered a subtler and more refined form of censorship, since big foreign sites no longer need be blocked wholesale. In principle they’re in trouble only when they cover the wrong things. Xiao Qiang, an expert on Chinese media at the University of California at Berkeley journalism school, told me that the authorities have recently begun applying this kind of filtering in reverse. As Chinese-speaking people outside the country, perhaps academics or exiled dissidents, look for data on Chinese sites—say, public-health figures or news about a local protest—the GFW computers can monitor what they’re asking for and censor what they find.
Taken together, the components of the control system share several traits. They’re constantly evolving and changing in their emphasis, as new surveillance techniques become practical and as words go on and off the sensitive list. They leave the Chinese Internet public unsure about where the off-limits line will be drawn on any given day. Andrew Lih points out that other countries that also censor Internet content—Singapore, for instance, or the United Arab Emirates—provide explanations whenever they do so. Someone who clicks on a pornographic or “anti-Islamic” site in the U.A.E. gets the following message, in Arabic and English: “We apologize the site you are attempting to visit has been blocked due to its content being inconsistent with the religious, cultural, political, and moral values of the United Arab Emirates.” In China, the connection just times out. Is it your computer’s problem? The firewall? Or maybe your local Internet provider, which has decided to do some filtering on its own? You don’t know. “The unpredictability of the firewall actually makes it more effective,” another Chinese software engineer told me. “It becomes much harder to know what the system is looking for, and you always have to be on guard.”