The mirroring routers were first designed and supplied to the Chinese authorities by the U.S. tech firm Cisco, which is why Cisco took such heat from human-rights organizations. Cisco has always denied that it tailored its equipment to the authorities’ surveillance needs, and said it merely sold them what it would sell anyone else. The issue is now moot, since similar routers are made by companies around the world, notably including China’s own electronics giant, Huawei. The ongoing refinements are mainly in surveillance software, which the Chinese are developing themselves. Many of the surveillance engineers are thought to come from the military’s own technology institutions. Their work is good and getting better, I was told by Chinese and foreign engineers who do “oppo research” on the evolving GFW so as to design better ways to get around it.
Andrew Lih, a former journalism professor and software engineer now based in Beijing (and author of the forthcoming book The Wikipedia Story), laid out for me the ways in which the GFW can keep a Chinese Internet user from finding desired material on a foreign site. In the few seconds after a user enters a request at the browser, and before something new shows up on the screen, at least four things can go wrong—or be made to go wrong.
The first and bluntest is the “DNS block.” The DNS, or Domain Name System, is in effect the telephone directory of Internet sites. Each time you enter a Web address, or URL—www.yahoo.com, let’s say—the DNS looks up the IP address where the site can be found. IP addresses are numbers separated by dots—for example, TheAtlantic.com’s is 220.127.116.11. If the DNS is instructed to give back no address, or a bad address, the user can’t reach the site in question—as a phone user could not make a call if given a bad number. Typing in the URL for the BBC’s main news site often gets the no-address treatment: if you try news.bbc.co.uk, you may get a “Site not found” message on the screen. For two months in 2002, Google’s Chinese site, Google.cn, got a different kind of bad-address treatment, which shunted users to its main competitor, the dominant Chinese search engine, Baidu. Chinese academics complained that this was hampering their work. The government, which does not have to stand for reelection but still tries not to antagonize important groups needlessly, let Google.cn back online. During politically sensitive times, like last fall’s 17th Communist Party Congress, many foreign sites have been temporarily shut down this way.
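To make the telephone-directory analogy concrete, here is a minimal Python sketch of a directory that has been instructed to give back no address, or a bad one. Everything in it is an assumption made for illustration: the site names are placeholders, the addresses come from ranges reserved for documentation, and the function name `resolve` is hypothetical. It is a toy model of the idea, not a description of how the GFW's DNS servers are actually built.

```python
from typing import Optional

# Hypothetical directory: placeholder names, documentation-only addresses.
DIRECTORY = {
    "www.example.com": "192.0.2.10",
    "news.example.org": "198.51.100.20",
    "search.example.net": "203.0.113.30",
}

# Names that get the "no address" treatment.
NO_ADDRESS = {"news.example.org"}

# Names that get a "bad address", shunting the user somewhere else.
BAD_ADDRESS = {"search.example.net": "192.0.2.99"}


def resolve(name: str) -> Optional[str]:
    """Look up a name the way a tampered directory would."""
    key = name.strip().lower()
    if key in NO_ADDRESS:
        return None                 # the user sees "Site not found"
    if key in BAD_ADDRESS:
        return BAD_ADDRESS[key]     # the user lands on a different site
    return DIRECTORY.get(key)       # the honest answer, if there is one
```

Calling `resolve("news.example.org")` gives back nothing at all, while `resolve("search.example.net")` quietly returns an address belonging to someone else, which is the kind of treatment Google.cn received.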
Next is the perilous “connect” phase. If the DNS has looked up and provided the right IP address, your computer sends a signal requesting a connection with that remote site. While your signal is going out, and as the other system is sending a reply, the surveillance computers within China are looking over your request, which has been mirrored to them. They quickly check a list of forbidden IP sites. If you’re trying to reach one on that blacklist, the Chinese international-gateway servers will interrupt the transmission by sending an Internet “Reset” command both to your computer and to the one you’re trying to reach. Reset is a perfectly routine Internet function, which is used to repair connections that have become unsynchronized. But in this case it’s equivalent to forcing the phones on each end of a conversation to hang up. Instead of the site you want, you usually see an onscreen message beginning “The connection has been reset”; sometimes instead you get “Site not found.” Annoyingly, blogs hosted by the popular system Blogspot are on this IP blacklist. For a typical Google-type search, many of the links shown on the results page are from Wikipedia or one of these main blog sites. You will see these links when you search from inside China, but if you click on them, you won’t get what you want.
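The blacklist check in the connect phase can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the blacklist entry is a made-up documentation address, `inspect_connection` is a hypothetical name, and a real gateway would forge actual TCP packets rather than return a list of labels. The point is only that a match produces two resets, one aimed at each end of the conversation.

```python
import ipaddress
from typing import List, Tuple

# Hypothetical blacklist; the address is from a documentation-only range.
BLACKLIST = {ipaddress.ip_address("198.51.100.20")}


def inspect_connection(client_ip: str, server_ip: str) -> List[Tuple[str, str]]:
    """Return the (command, recipient) pairs a gateway would inject.

    The request has already been mirrored to the surveillance system,
    so this check runs alongside the real traffic rather than in its path.
    """
    if ipaddress.ip_address(server_ip) in BLACKLIST:
        # Tell both ends to hang up.
        return [("RESET", client_ip), ("RESET", server_ip)]
    return []  # not on the list: the connection proceeds untouched
```

For a destination on the list, `inspect_connection("192.0.2.1", "198.51.100.20")` returns a reset for each party; for any other destination it returns an empty list and nothing happens.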
The third barrier comes with what Lih calls “URL keyword block.” The numerical Internet address you are trying to reach might not be on the blacklist. But if the words in its URL include forbidden terms, the connection will also be reset. (The Uniform Resource Locator is a site’s address in plain English, such as www.microsoft.com, rather than its all-numeric IP address.) The site FalunGong.com appears to have no active content, but even if it did, Internet users in China would not be able to see it. The forbidden list contains words in English, Chinese, and other languages, and is frequently revised, “like, with the name of the latest town with a coal mine disaster,” as Lih put it. Here the GFW’s programming technique is not a reset command but a “black-hole loop,” in which a request for a page is trapped in a sequence of delaying commands. These are the programming equivalent of the old saw about how to keep an idiot busy: you take a piece of paper and write “Please turn over” on each side. When the Firefox browser detects that it is in this kind of loop, it gives an error message saying: “The server is redirecting the request for this address in a way that will never complete.”
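Both halves of this step, the keyword match and the “Please turn over” loop, are easy to illustrate. The Python sketch below is a toy under stated assumptions: the forbidden term, the page names, and the function names are all invented, and the loop detector shows one simple way a browser might notice it is going in circles, not the method Firefox actually uses.

```python
from typing import Dict, Iterable


def contains_forbidden_term(url: str, forbidden_terms: Iterable[str]) -> bool:
    """Case-insensitive substring match of each term against the URL."""
    lowered = url.lower()
    return any(term.lower() in lowered for term in forbidden_terms)


class RedirectLoopError(Exception):
    """Raised when a chain of redirects will never complete."""


def follow_redirects(url: str, redirect_table: Dict[str, str],
                     max_hops: int = 20) -> str:
    """Follow redirects until a final page is reached or a loop is detected.

    redirect_table is a hypothetical stand-in for the network: it maps a
    page to the page it redirects to. Pages absent from it are final.
    """
    seen = set()
    for _ in range(max_hops):
        if url not in redirect_table:
            return url                      # a real page at last
        if url in seen:
            raise RedirectLoopError(url)    # we have been here before
        seen.add(url)
        url = redirect_table[url]
    raise RedirectLoopError(url)            # too many hops: give up


# The piece of paper with "Please turn over" on each side.
PAPER = {"side-a": "side-b", "side-b": "side-a"}
```

Calling `follow_redirects("side-a", PAPER)` raises `RedirectLoopError` after visiting each side once, which corresponds to the moment the browser gives up and shows its error message.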
The final step involves the newest and most sophisticated part of the GFW: scanning the actual contents of each page—which stories The New York Times is featuring, what a China-related blog carries in its latest update—to judge its page-by-page acceptability. This again is done with mirrors. When you reach a favorite blog or news site and ask to see particular items, the requested pages come to you—and to the surveillance system at the same time. The GFW scanner checks the content of each item against its list of forbidden terms. If it finds something it doesn’t like, it breaks the connection to the offending site and won’t let you download anything further from it. The GFW then imposes a temporary blackout on further “IP1 to IP2” attempts—that is, efforts to establish communications between the user and the offending site. Usually the first time-out is for two minutes. If the user tries to reach the site during that time, a five-minute time-out might begin. On a third try, the time-out might be 30 minutes or an hour—and so on through an escalating sequence of punishments.
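The escalating sequence of punishments amounts to a small piece of bookkeeping per “IP1 to IP2” pair. The Python sketch below models it under loudly stated assumptions: the class and method names are hypothetical, the durations (two, five, and 30 minutes) are taken from the description above with 30 minutes chosen over an hour for the third step, and the rule that every attempt during a blackout bumps the pair to the next level is one reading of that description, not a confirmed account of the GFW's logic. Time is passed in explicitly, in seconds, so the behavior is easy to follow.

```python
from typing import Dict, Tuple


class BlackoutTable:
    """Tracks temporary blackouts between a user (IP1) and a site (IP2)."""

    def __init__(self, durations=(120, 300, 1800)):
        self.durations = durations  # seconds: 2 min, 5 min, 30 min
        # (user_ip, site_ip) -> (escalation level, time the blackout ends)
        self.state: Dict[Tuple[str, str], Tuple[int, float]] = {}

    def flag(self, user_ip: str, site_ip: str, now: float) -> None:
        """The scanner found a forbidden term: start the first time-out."""
        self.state[(user_ip, site_ip)] = (0, now + self.durations[0])

    def blocked_until(self, user_ip: str, site_ip: str) -> float:
        """End time of the current blackout, or 0.0 if there is none."""
        return self.state.get((user_ip, site_ip), (0, 0.0))[1]

    def attempt(self, user_ip: str, site_ip: str, now: float) -> bool:
        """Return True if the connection is allowed at time `now`."""
        entry = self.state.get((user_ip, site_ip))
        if entry is None:
            return True
        level, until = entry
        if now >= until:
            del self.state[(user_ip, site_ip)]  # time-out served
            return True
        # Trying again too soon: escalate to the next, longer time-out.
        level = min(level + 1, len(self.durations) - 1)
        self.state[(user_ip, site_ip)] = (level, now + self.durations[level])
        return False
```

In this model, a user flagged at time 0 who tries again at 60 seconds is refused and locked out until 360 seconds; trying once more at 100 seconds extends the blackout to 1,900 seconds. Waiting it out is the only way the pair is cleared.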