
China Internet Censorship Explained

15 January 2009

Since I started posting about censorship I’ve noticed with some surprise that the basics of the system are not clearly understood by many readers outside China. This post classifies and explains the system in the simplest way possible. It is largely drawn from my own experience as a user in China and from the studies by Rebecca MacKinnon.

Internet censorship in China is a complex system in constant evolution, both technologically and in terms of the content censored. It is managed by the State Council Information Office – Internet Management Division. Until recently foreigners mostly referred to it as the Great FireWall of China (GFW), but today the name Net Nanny is more common, especially since studies like this one exposed the limitations of the GFW metaphor.

In fact, both names can be used, as they refer to different mechanisms of the censorship system, and they help non China-dwellers visualize the basics. Man gave names to all the animals, so let’s give clear names to these ones too and avoid further confusion. China’s censorship system is composed of the Net Nanny, the Great Firewall (GFW), and Search Engine Manipulation (SEM). Note the important differences between the three, which can be summarized as follows:

  • The Nanny eliminates content, by forcing self-censorship.
  • The GFW blocks content from access in mainland China.
  • The SEM hides content, making sites unsearchable/invisible.

These three elements or any combination of them are currently used to censor content on the Chinese internet.

1- The Net Nanny

Like a nanny does with naughty kids, the government scolds rebellious citizens who publish content of a “vulgar” or political nature. The Net Nanny is the mechanism that controls content by putting pressure on publishers to self-censor. Of course, Net Nanny methods can only be applied when publishers are in some way subject to the power of the Chinese government, normally because they are Chinese, do business in China, or have their websites hosted in China.

The Nanny’s power comes from the threat of closing down a page, taking away the business license or directly imposing “stern punishment” on offenders. The Nanny monitors compliance using a large human workforce aided by sophisticated devices that sweep or sniff the data moving about the Chinese internet. She regularly warns publishers, either privately or in public inquisitorial lists that make the headlines in Western media.

End users experience the Nanny in one of the following two ways:

  1. The site where they read/publish content is found non-compliant and closed down, as recently happened to bullog.
  2. The site where they read/publish content is self-censoring, erasing individual user’s content or refusing to publish it.

In all cases, content censored or “harmonized” by the Nanny is not accessible from anywhere, even over encrypted connections. This content is not blocked, but simply eliminated from the internet.

2- The Great Firewall of China (GFW)

The Great Firewall is a different creature altogether, although closely related. It is another tool that the Information Office uses to control access to content. As opposed to the Nanny, the GFW is not directly based on human interaction, but rather on a series of technological devices that are able to detect sensitive content entering the Chinese internet and block it, whether the original site is in China or not. Depending on the devices used, the GFW can come in different flavours, such as “Reset Connection” or “Time Out”, but the result is always the same: the page cannot load in mainland China.
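The two flavours named above surface to client software as different errors. The following is only a minimal sketch in Python of how a user-side script might tell them apart. The function names `classify_failure` and `probe` are hypothetical, and a reset or a timeout on its own does not prove a block, since ordinary network faults produce exactly the same errors.

```python
import socket


def classify_failure(exc: BaseException) -> str:
    """Map a connection error to one of the flavours named in the post."""
    if isinstance(exc, ConnectionResetError):
        # The connection was torn down with a TCP reset.
        return "Reset Connection"
    if isinstance(exc, (socket.timeout, TimeoutError)):
        # No answer arrived within the allowed time.
        return "Time Out"
    return "Other"


def probe(host: str, port: int = 80, timeout: float = 10.0) -> str:
    """Try a plain HTTP request and report how it ended (hypothetical helper)."""
    request = b"HEAD / HTTP/1.0\r\nHost: " + host.encode("ascii") + b"\r\n\r\n"
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(request)
            conn.recv(1024)
        return "Loaded"
    except OSError as exc:
        # ConnectionResetError and socket.timeout are both OSError subclasses.
        return classify_failure(exc)
```

Calling `probe("example.com")` from different networks and comparing the answers is one way to use it. A single run says very little, because the blocks described here are erratic.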

The blocks applied by the Great Firewall of China are often very quick, automated, and made without prior notice to the publishers. In fact, the owners of a site can go for a long time without noticing, especially if China is not an important part of their business.

Other characteristics of the GFW are:

  • It is only visible to users in mainland China.
  • It is erratic and unpredictable: a block can last hours or years.
  • It is easy to bypass using encrypted connections, like a VPN or web proxies.
  • It can affect a single post, a website or a whole host/subnet.
  • It often tries to disguise itself as a technical problem of the Chinese network.

The GFW is the most annoying part of Chinese censorship. One might think that it is not very effective, since it can be bypassed with widely available free proxies. In fact it is extremely effective, due to a mixture of laziness and lack of information among the public. Using myself as an example, there are some excellent blogs I had not visited for months just to avoid the (minor) hassle of connecting through a proxy. How many Chinese would go out of their way to access political documents like Charter 08 that they’ve never heard of and cannot locate in their Search Engines anyway? (see SEM below)

But the worst aspect of the GFW is that it embodies the censors’ complete lack of respect for the individual rights of users. Indeed, to prevent access to a few pages, the GFW regularly blocks whole domains without prior notice, affecting thousands of users who had nothing to do with the non-compliance in the first place. There are many examples of this; one of them is the major blog hosting service “Blogger”, which has been blocked in China for years.

3- The Search Engine Manipulation (SEM)

This is the part of the censorship system specifically dedicated to Search Engines. Technically it is not a new mechanism, but a combination of the previous two. The main difference lies in the essential role of the Search Engines in directing internet traffic, and the enormous potential for manipulation that Search Result lists provide. Note that SEM refers only to the List of Search Results itself, and not to the possible blocks happening when clicking on one of the individual results, which would belong to point (2) above.

When an internet user looks for a term in a Search Engine, he is trusting this Engine to bring him the most relevant results for that Search. A List of Search Results that is manipulated to show only what the government wants to show is one of the most powerful tools of deception, and one that is less obvious to the end user than the plain blocking of websites. The websites that don’t appear on the list are not perceived as “censored”; they are simply “nonexistent”.

Like any other website, the Search Engines can suffer the two kinds of censorship described above.

1- They “harmonize” their Result Lists, following the Nanny.
2- They get some Search Strings blocked by the GFW.

Note that, while (1) is a flagrant case of Search Engine collaboration with the system, in (2) the role of the Search Engine in facilitating the work of the GFW is unclear (*). This makes it difficult to ascertain to what extent companies like Google are collaborating with Chinese censorship.

I have already done a little study of SEM in a recent post, covering the two most used Search Engines in China and showing that Baidu is by far the champion censor of the lot.

(*) UPDATE: Following a suggestion by international expert Nart Villeneuve, I have introduced a few changes of my own in my SEM post. It is very important to understand the role of Search Engines in GFW censorship: to get the details of this complex question you should read proper research papers like this one, or this one.

The same author also suggests what could be the 4th and newest animal in the Censor’s farm: application-specific censorship, such as the censoring of instant messages by QQ and Skype.

A little Study of the Internet Censorship in China

Last Sunday I did a post on internet censorship in China where I mixed in various different ideas and I’m afraid the final result regarding Search Engine Censorship didn’t come out as clear as I would have liked. I think it is an important subject, so here are the complete results:

We will be looking at Google.cn, Google.com and Baidu.com, and in each of them we will try three different kinds of search terms.

A- Charter 08: In all its combinations, which are 08宪章 and 零八宪章
B- Political Terms: Tiananmen incidents (天安门六四事件), FLG.
C- Vulgar words: Sex. I will employ the “blog job” and the “chicken bar”.

It is understood that in all cases the search terms are in Simplified Chinese. The browser is Firefox 3.0.5 and the connection is a normal home DSL line from China Telecom. The possible results are:

  1. Free Search – Results look consistent and realistic, like the ones obtained in the West.
  2. Reset Connection (RC) – This can only be seen in Mainland China. The result is an image like the one below, and the search engine cannot be opened again for a while (I estimate 30 seconds). The RC is not directly done by the Search Engine. Wikipedia’s internal search also gives RCs for B Terms.
  3. Forbidden Message (FM) – This is the message that, with slight variations, is the same as shown below. It says something along the lines of: “Some results are not displayed according to the local laws, regulations and policies”.
  4. Manipulated Results (MR) – This is the case where the results are obviously manipulated, for example in the search for 天安门六四事件 (Tiananmen incident) on Baidu, where all the results are from official newspapers such as People’s Daily, etc. Sometimes it can also carry an FM at the top of the page.

Google.com
A- Free Search (but clicking on some individual results gives RC).
B- Reset Connection.
C- Manipulated Results.

Google.cn
A- Forbidden Message and (sometimes *) Manipulated Results.
B- Reset Connection.
C- Forbidden Message. When the term is put in quotation marks (“”), it gives Manipulated Results.

Baidu.com
A- Manipulated Results. When the term is put in quotation marks (“”), it gives a Forbidden Message.
B- Forbidden Message and Manipulated Results.
C- Forbidden Message and Manipulated Results.
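To make the comparison easier to scan, the observations above can be written down as a small lookup table. This is only a sketch: the names `Outcome`, `RESULTS` and `engines_with` are made up for illustration, the quoted-search variants are folded into the same cell as the plain search, and the "sometimes" on Google.cn term A is recorded as if it always happened.

```python
from enum import Flag, auto


class Outcome(Flag):
    """The four possible results listed in the post; a cell may combine several."""
    FREE = auto()  # Free Search
    RC = auto()    # Reset Connection
    FM = auto()    # Forbidden Message
    MR = auto()    # Manipulated Results


# Engine -> term group (A, B, C) -> outcomes observed in this small test.
RESULTS = {
    "Google.com": {"A": Outcome.FREE, "B": Outcome.RC, "C": Outcome.MR},
    "Google.cn": {
        "A": Outcome.FM | Outcome.MR,
        "B": Outcome.RC,
        "C": Outcome.FM | Outcome.MR,
    },
    "Baidu.com": {
        "A": Outcome.MR | Outcome.FM,
        "B": Outcome.FM | Outcome.MR,
        "C": Outcome.FM | Outcome.MR,
    },
}


def engines_with(outcome: Outcome) -> list:
    """Return the engines where the given outcome was seen for any term group."""
    return sorted(
        engine
        for engine, row in RESULTS.items()
        if any(outcome & cell for cell in row.values())
    )
```

For instance, `engines_with(Outcome.RC)` lists only the two Google sites, and `engines_with(Outcome.FREE)` lists only Google.com.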

Conclusions

1- The results are somewhat erratic and it is difficult to see a pattern: it all looks like a series of patches on top of each other rather than a systematic implementation. Also, things change over time, as in *, where the Manipulated Result I saw on Sunday can no longer be seen.

2- Baidu has a different system from Google: it has no Reset Connections. This is very advantageous for Baidu and, as I understand it, amounts to unfair competition, as an RC is one of the worst experiences while surfing.

3- This might be due to server location: the involvement of the Search Engines in the RC is unclear (even Wikipedia has RCs!!), whereas Manipulated Results obviously require their action, and can more easily attract attention from Advocacy Groups. Of course, in the case of sexual terms (C), this is not a problem, as the Manipulated Results can just be called “Safe Search”.

4- Charter 08 is treated differently from other political terms, but it might just be because it was banned urgently and suddenly, so it is only a quick fix added to the existing structure. It does not provoke an RC in any case. It looks like they have decided to leave it alone on Google.com to avoid attention from Western advocacy groups, but in exchange Google has had to give up Google.cn and apply the infamous “porn block” to it, which is active censorship by the Search Engine. Why the FM and not the RC? Who knows; I am guessing that perhaps the RC is more complicated to implement.

5- In any case, and however negative, I think it is always better to show an FM than Manipulated Results, because the former openly admits censorship, whereas the latter is a lie and a distortion of reality. The Forbidden Message does increase transparency, yet it does not justify involvement in political censorship. From this perspective, Google is closer to the truth than Baidu. Baidu indeed seems a more active participant in the government’s information control schemes, and Chinese users of Baidu are clearly the most exposed to Search Engine brainwashing.

UPDATE: Following corrections by international expert Nart Villeneuve below: I have introduced a few changes of my own (in blue). In any case, this post is just a very basic review of the SE Censorship system from the perspective of a normal user. If you really want to understand how the GFW works, you should read proper research papers like this one, or this one.


IMAGES:

1- FORBIDDEN MESSAGE (FM)

2- RESET CONNECTION (RC)

Source: ChinaYouRen (January 14, 2009)
