What Is a Chinese "Dialect/Topolect"? Reflections on Some Key Sino-English Linguistic Terms

Victor H. Mair
Department of Oriental Studies
University of Pennsylvania


Words like fangyan, putonghua, Hanyu, Guoyu, and Zhongwen have been the source of considerable perplexity and dissension among students of Chinese language(s) in recent years. The controversies they engender are compounded enormously when attempts are made to render these terms into English and other Western languages. Unfortunate arguments have erupted, for example, over whether Taiwanese is a Chinese language or a Chinese dialect. In an attempt to bring some degree of clarity and harmony to the demonstrably international fields of Sino-Tibetan and Chinese linguistics, this article examines these and related terms from both historical and semantic perspectives. By being careful to understand precisely what these words have meant to whom and during which period of time, needlessly explosive situations may be defused and, an added benefit, perhaps the beginnings of a new classification scheme for Chinese language(s) may be achieved. As an initial step in the right direction, the author proposes the adoption of "topolect" as an exact, neutral translation of fangyan.

This article is a much expanded and revised version of a paper entitled "Problems in Sino-English Nomenclature and Typology of Chinese Languages" that was originally presented before the Twentieth International Conference on Sino-Tibetan Linguistics and Languages / 21-23 August 1987 / Vancouver, B.C., Canada. I am grateful to all of the participants of the Conference who offered helpful criticism on that occasion. I would also like to acknowledge the useful comments of Søren Egerod, John DeFrancis, S. Robert Ramsey, and Nicholas C. Bodman who read subsequent drafts. Any errors of fact or opinion that remain are entirely my own.


The number of different living languages (Modern Standard Mandarin [MSM] yuyan) is variously estimated to be between about 2,000 and 6,000.*1 If we take as a conservative approximation the arithmetic mean of these two figures, we may say that there are roughly 4,000 languages still being spoken in the world today. Of these, over a thousand are North, Central, and South American Indian languages whose speakers number but a few thousand or even just a few hundred. Another five hundred or so languages are spoken by African tribes and nearly five hundred more by the natives of Australia, New Guinea, and the islands of the Pacific. Several hundred others are the by and large poorly studied tongues of scattered groups in Asia (e.g. Siberia, the Himalayan region, etc.).

This plethora of tongues can be broken down, first, into major "families" (MSM yuxi) that are presumed to have derived from the same "parent" language. Thus we have the Indo-European, Semito-Hamitic, Ural-Altaic, Sino-Tibetan, Dravidian, Malayo-Polynesian, African, American Indian, and other families.

The next level below the language family is the "group" (yuzu). When classifying members of the Indo-European family of modern languages, for example, one usually thinks in terms of its main groups (Indic, Iranian, Hellenic, Romance, Celtic, Germanic, Slavic, Baltic) and the individual languages belonging to them. Thus, in the Indic group there are Hindi-Urdu, Bengali, Punjabi, Marathi, Gujarati, Singhalese, Assamese, and others. In the Iranian group, there are Pashto, Farsi (Persian), Kurdish, and Baluchi. In the Celtic group, there are Irish, Scots Gaelic, Breton, and Welsh. In the Romance group, there are Rumanian, Rhaeto-Romanic (Romansch), Italian, French, Provençal, Catalan, Spanish, and Portuguese. In the Germanic group, there are (High) German, Low German, Dutch, Frisian, English, Danish, Swedish, Norwegian, and Icelandic. In the Slavic group, there are Russian, Ukrainian, Bulgarian, Macedonian, Serbo-Croatian, Slovene, Czech, Slovak, and Polish.

A similar classification scheme may be applied to the still somewhat hypothetical Sino-Tibetan language family. Among its groups are Sinitic (also called Han), Tibeto-Burmese, Tai (or Dai), Miao-Yao, and so on. If we consider Sinitic languages as a group of the great Sino-Tibetan family, we may further divide them into at least the following mutually unintelligible tongues: Mandarin, Wu, Cantonese (Yue), Hunan (Xiang), Hakka, Gan, Southern Min, and Northern Min.*2 These are roughly parallel to English, Dutch, Swedish, and so on among the Germanic group of the Indo-European language family. If we pursue the analogy further, we may refer to various supposedly more or less mutually intelligible*3 dialects of Mandarin such as Peking, Nanking, Shantung, Szechwan*4, Shensi*5, Dungan*6 and so on just as English may be subdivided into its Cockney, Boston, Toronto, Texas, Cambridge, Melbourne, and other varieties. The same holds true for the other languages in the Sinitic and Germanic groups. Where Dutch has its Flemish and Afrikaans dialects, Wu has its Shanghai and Soochow forms. Likewise, Yue has its Canton, Taishan, and other dialects; Xiang has its Changsha, Shuangfeng, and other dialects; Hakka has its Meishan, Wuhua, and other dialects; Gan has its Nanchang, Jiayu, and other dialects; Southern Min has its Amoy, Taiwan, and other dialects; and Northern Min has its Foochow, Shouning, and other dialects. For the purposes of this article, we do not need to enter into the matter of sub-dialects.

Another level of classification is the "branch" (yuzhi) which embraces several closely related languages of a group. Germanic, for instance, has two surviving branches -- West (German, Dutch, Frisian, English) and North or Scandinavian (Icelandic, Faeroese, Norwegian, Swedish). The Altaic group has a Turkish branch (Uighur, Kazakh, Uzbek, Tatar, Kirghiz, etc.), a Mongol branch (Kalmuk, Buryat, etc.), and a Tungusic branch (Manchu, Sibo, etc.). Determination of the branches of the Sinitic group of languages has not yet been achieved.

Thus far in our investigation, we have determined that all the many natural tongues of the world are commonly classified (in descending order of size) into the following categories: family, group, branch, language, dialect, sub-dialect. Is "Chinese" (it remains to be seen exactly what this means) so utterly unique that it cannot fit within this scheme, but requires a separate system of classification?
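The descending scheme just enumerated can be pictured as a simple ordered structure. The following is only a minimal illustrative sketch in Python, not part of the original argument: the names RANKS, EXAMPLES, lineage, and is_broader are hypothetical, and the two sample entries are assembled from examples given earlier in this article (English with its Cockney variety in the West branch of Germanic; Wu with its Shanghai form in the Sinitic group, whose branches have not yet been determined).

```python
# Ranks in descending order of size, as enumerated in the text.
RANKS = ["family", "group", "branch", "language", "dialect", "sub-dialect"]

# Illustrative placements taken from examples in the article.
# None marks a level that has not been determined
# (the branches of the Sinitic group are not yet established).
EXAMPLES = {
    "Cockney": {
        "family": "Indo-European",
        "group": "Germanic",
        "branch": "West",
        "language": "English",
        "dialect": "Cockney",
    },
    "Shanghai": {
        "family": "Sino-Tibetan",
        "group": "Sinitic",
        "branch": None,
        "language": "Wu",
        "dialect": "Shanghai",
    },
}


def lineage(name):
    """Return (rank, value) pairs from the family down to the lowest recorded rank."""
    entry = EXAMPLES[name]
    return [(rank, entry[rank]) for rank in RANKS if rank in entry]


def is_broader(rank_a, rank_b):
    """True if rank_a sits above rank_b in the descending scheme."""
    return RANKS.index(rank_a) < RANKS.index(rank_b)


if __name__ == "__main__":
    for rank, value in lineage("Shanghai"):
        print(f"{rank}: {value if value is not None else '(undetermined)'}")
```

Under this sketch the Shanghai entry fills the same slots as the Cockney entry, with only the branch level left open. That mirrors the question posed above: whether "Chinese" can be accommodated within the common scheme rather than requiring a separate one.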


There are three main problems that I wish to address in this article. First, what is the proper translation of fangyan in English? Second, how shall we refer to the national language of China? Third, what do we mean when we speak of "the Chinese language"? All three of these questions, as we shall see, are intimately related, but the latter two are dependent on the first. If we can arrive at a precise understanding of the word fangyan, the solution to the other two questions becomes relatively easy. Although the translation of the word fangyan superficially seems to be a simple, innocent matter, it is actually quite the opposite. How we translate (hence explicate and comprehend) the word fangyan has a direct bearing on the typology of Chinese language(s). If we do not establish clearly the meaning of this key term fangyan, it is quite possible that our entire analysis of Sino-Tibetan languages will be flawed.

Pre-modern definitions of fangyan usually stress the crudity or non-standardness of its exemplars. Li Tiaoyan, for example, states that the syllable fang (literally "place" or "locale") of the word fangyan refers to its vulgarity (fang zhe bisu zhi wei).*7 This sort of explanation is obviously incompatible with such linguistic expressions as "prestige dialect". Therefore, we must search further in an attempt to understand the real meaning of fangyan and its proper relationship to "dialect".

The first thing we have to recognize is that fangyan (unlike yuxi ["language family"], yuzu ["language group"], and yuzhi ["language branch"] which are neologisms based on the corresponding Western terms) is a word of hoary antiquity.*8 It goes back at least to Yang Xiong's (53 BCE-18 CE) famous work of the same name. It has long been fashionable to translate fangyan invariably, unreflectively, and automatically as "dialect". In my estimation, this is wrong, because the Chinese word simply does not mean what we normally imply by "dialect", viz. one of two or more mutually intelligible varieties of a given language distinguished by vocabulary, idiom, and pronunciation. This is not, of course, to assert that fangyan and "dialect" never coincide, only that their semantic range is markedly different.

As proof of the great disparity between fangyan and "dialect", we need only take note of the fact that, during the last dynasty, the former was applied by Chinese officials and scholars who drew up bilingual glossaries to such patently non-Sinitic languages as Korean, Mongolian, Manchu, Vietnamese, and Japanese.*9 Here it is obvious that fangyan should not simplistically be equated with "dialect". There are even late Qing period (1644-1911) texts that consider Western languages to be fangyan. We find, for instance, the following passage in Sun Yirang's Zhouli zhengyao [Essentials of Government from the Zhou Ritual], "Tongyi [Translation]":

Now it is appropriate to establish fangyan bureaus on a broad scale in each province in order to ensure that Oriental and Occidental languages will be known to all. In spite of the vast distance across the oceans which separates us, we will be as though of one household. When we have achieved this, there will be no conflict with the exercise of Western governance and the acquisition of Western arts. This, then, is the prosperous path to a commonality of language between Chinese and the rest of the world.*10

It is clear that what we are dealing with here are not dialects at all and not even different Sinitic languages, but wholly non-Chinese and non-Sino-Tibetan languages. Consequently, it is manifestly inappropriate in such cases to render fangyan as "dialect".

There are dozens of modern definitions for fangyan, most of them colored by notions of "dialect" to one degree or another, but never completely equivalent to the Western concept. As a typical recent example, we may cite Xing Gongwan's three conditions for declaring the speech of the members of two communities to be fangyan rather than separate languages:

  1. they share a common standard language.
  2. they share the same script.
  3. they can converse (jiaotan) directly or can converse with a bit of effort.*11

Item three is merely the standard mutual intelligibility test. This would seem to be a sensible measure for determining whether the two speech patterns in question are separate languages or are fangyan of a single language. Xing, however, quickly negates it by declaring that we "cannot merely take the ability to converse (tanhua) as the standard for differentiating between fangyan and language." This means that Xing relies more heavily on the first two conditions as yardsticks for deciding what is or is not a fangyan. They have, however, nothing whatsoever to do with determining what is or is not a dialect. According to the first condition, all the African languages that are found in areas where Swahili serves as a lingua franca would have to be considered fangyan, presumably of this highly Arabicized Bantu language. Numerous other examples could be imagined where this condition, while suitable perhaps for determining what a fangyan is, would be utterly ludicrous in attempting to establish a given dialect.

Xing's second condition is even less viable as a measure for determining whether or not a given speech pattern should be considered a dialect or a language. In truth, since most varieties of Chinese speech have never existed in a written form and certainly cannot be accurately recorded in conventional tetragraphs (fangkuaizi ["square graphs"] or hanzi ["sinographs"], the native terms for our quaint "characters"), I fail to see how this condition can serve as a norm for determining what is or is not a fangyan. Furthermore, if we are to accept it, English, Dutch, Turkish, Vietnamese, Hawaiian, Navaho, Zhuang, Croatian, Hausa, Hokkien, and hundreds of other tongues from virtually every language family on earth would have to be considered as fangyan of each other. If such be the case, then fangyan is reduced to meaning no more than yuyan and the possibility of rational linguistic classification is terminated.

Xing goes on to add that a fangyan may also be considered as a descendant (houdaiyu) of a parent language. Thus, he stipulates the fourth condition for determining a fangyan. "If the speech [patterns] of two different areas are both the descendants of a single relatively primitive (yuanshi) language, then the difference between them is that of fangyan [not of language]." If we apply this rule in the determination of the affiliation of certain Iranian languages, then it transpires that Khotanese, Sogdian, Persian, and Pashto are all fangyan of Avestan. This may be so, but by no stretch of the imagination can they be said to be dialects of Avestan.

Let us examine one more modern Chinese textbook definition of what a fangyan is. This is from the widely quoted Essentials of Chinese Fangyan by Yuan Jiahua, et al.: "Fangyan are the heirs (jicheng) or offspring (zhiyi) of a common language (gongtongyu). A fangyan possesses certain linguistic features that are different from those of the fangyan to which it is related. In the historical period, they are invariably subordinate to the unified standard of a nation (minzu de tongyi biaozhun)."*12 This is a prescriptive ad hoc definition that applies to the Chinese case alone, and then only in a forced way. The putative common speech of China today is Modern Standard Mandarin, but it is not historically accurate to state that the fangyan derived from it or are its offspring. In fact, it would be easier to make the opposite case. Indeed, the authors themselves express dissatisfaction with and reservations about the ability of their definition to cope with actual linguistic situations (i.e. it is only a hypothetical construct). Regardless of its applicability to fangyan, however, this definition most assuredly cannot be used to determine what is or is not a dialect.

We have seen repeatedly that fangyan and dialect signify quite different phenomena. It is no wonder that massive confusion results when one is used as a translational equivalent of the other. The abuse of the word fangyan in its incorrect English translation as "dialect" has led to extensive misinformation concerning Chinese language(s) in the West. For example, a recent dictionary of linguistic terms has the following entry:

The distinction between 'dialect' and 'language' seems obvious: dialects are subdivisions of languages. What linguistics (and especially SOCIO-LINGUISTICS) has done is to point to the complexity of the relationship between these notions. It is usually said that people speak different languages when they do not understand each other. But many of the so-called dialects of Chinese (Mandarin, Cantonese, Pekingese) are mutually unintelligible in their spoken form. (They do, however, share the same written language, which is the main reason why one talks of them as 'dialects of Chinese'.)*13

In spite of the healthy skepticism evidenced by the qualifier "so-called", the author succumbs to the pernicious myth that Chinese is "different" and therefore not susceptible to the same rules as other languages.*14 First of all, the vast majority of Chinese languages have never received a written form. Mandarin, Fuchow, Cantonese, Shanghai, Suchow, and the other major fangyan do not share the same written language. I have seen scattered materials written in these different Chinese fangyan, both in tetragraphs and in romanized transcription, and it is safe to say that they barely resemble each other at all. Certainly they are no closer to each other than Dutch is to English or Italian to Spanish. The discrepancies between the major Chinese fangyan in phonology, lexicon, orthography, and grammar are so great that it is impossible for a reader of one of them to make much sense of materials written in another of them. This is an entirely different matter from that of classical (wenyan), vernacular (baihua), and mixed classical-vernacular (banwen-banbai) prestige, non-local or national written styles that are read by literate individuals throughout China according to the pronunciation of their local fangyan. Written Sinitic, with exceedingly few exceptions, has been restricted to some type of Classical Chinese or Mandarin, since the other languages of the group have never developed orthographical conventions that were recognized by a substantial segment of their speakers. In sum, regardless of the fact that such statements are almost universally accepted among Western treatments of Chinese language(s), it is false (or at least dangerously misleading) to claim that all the Chinese "dialects" share the same written language. It is also frequently asserted that, while there may be enormous differences in vocabulary, pronunciation, and idiom among the major spoken Chinese fangyan, they basically share the same grammar. This assumption, too, remains to be proven, both in absolute and in comparative terms.

One solution to the dilemma might be to create a new English word that intentionally has all the ambiguities of fangyan. This John DeFrancis has attempted to do in his important recent book, The Chinese Language: Fact and Fantasy, where he ingeniously proposes "regionalect".*15 As an alternative, I would suggest "topolect", which -- aside from being fully Greek in its derivation -- has the added advantage of being neutral with regard to the size of the place that is designated whereas "region" refers only to a rather large area. Both of these words patently represent efforts to render the literal semantic content of fangyan. Their drawback is that they do not fit into established Western schemes for the categorization of languages.

Another solution would be for linguists writing in Chinese to devise a more etymologically precise translation for "dialect" (e.g., xiangyan ["mutual-speech"]), but it is highly unlikely that such an unnatural coinage would make any sense to all but an extremely small circle of cognoscenti. Or fangyan could be rigorously redefined so as to match "dialect" more exactly, but it might be difficult to detach the word from its long heritage of a much broader signification. In general, it would seem prudent to use a word like "regionalect" or "topolect" to translate fangyan in Chinese passages that are not written according to the standards and practices of modern linguistic science. On the other hand, in analyses that are originally composed in English, we should eschew such usages and the concepts of language taxonomy that they imply. We must do so in order to avoid mixing two different, and partially incompatible, systems of classification.

The authoritative, new Language volume*16 of the Chinese Encyclopedia [Zhongguo da baike quanshu] divides the Sino-Tibetan language family as follows:

  1. Sinitic (Hanyu)
  2. Tibeto-Burman group (Zang-Mian yuzu)
    1. Tibetan branch (Zang yuzhi)
      1. Tibetan
      2. Jiarong
      3. Monba
    2. Jinghpaw branch (Jingpo yuzhi)
    3. Yi branch (Yi yuzhi)
      1. Yi
      2. Lisu
      3. Hani
      4. Lahu
      5. Naxi
      6. Jino
    4. Burmese branch (Mian yuzhi)
      1. Atsi/Zaiwa
      2. Achang
    5. Branch undetermined
      1. Lhoba
      2. Deng
      3. Drung
      4. Nu
      5. Tujia
      6. Bai
      7. Qiang
      8. Primmi
  3. Miao-Yao group (Miao-Yao yuzu)
    1. Miao branch (Miao yuzhi)
      1. Miao
      2. Bunnu
    2. Mienic/Yao branch (Yao yuzhi)
      1. Yao
      2. Mian
    3. Branch undetermined
      1. She
  4. Zhuang-Dong group (Zhuang-Dong yuzu)
    1. Zhuang-Dai branch (Zhuang-Dai yuzhi)
      1. Zhuang
      2. Bouyei
      3. Dai
    2. Kam-Sui / Dong-Shui branch (Dong-Shui yuzhi)
      1. Kam
      2. Sui
      3. Mulam
      4. Maonam
      5. Lakkja
    3. Li branch (Li yuzhi)
      1. Li
    4. Gelao branch
      1. Gelao

It is most curious that the Tibeto-Burmese, Miao-Yao, and Zhuang-Dong groups are subdivided so extensively into various branches and individual languages within branches, whereas Sinitic -- which has by far the largest number of speakers of the four groups -- is presented as a monolithic whole. It is not even explicitly identified as a group, although its position in the taxonomical chart makes clear that it is. Still more intriguing is the fact that in the Nationalities volume of the Chinese Encyclopedia, Sinitic (Hanyu) is characterized as "analogous / comparable / equivalent to a language group" (xiangdang yu yige yuzu).*17 This is now a common formulation among linguistic circles in China for designating Hanyu. Well, is Hanyu a yuzu or is it not? Apparently it is almost a yuzu yet not quite. Yuzu seems to be perfectly synonymous with "language group" when it is applied to Tibeto-Burman, Miao-Yao, and Zhuang-Dong but not when it is applied to Sinitic. What makes Chinese or Sinitic (i.e. Hanyu in the broadest sense) so unusual? Is it something in the structure or nature of the language(s)? Or are there other, non-linguistic constraints operative? These are very important questions because they indicate that there is a powerful hidden agenda behind this superficially innocuous dictum. When queried about the reasons for this intentionally vague -- yet highly significant -- wording ("analogous / comparable / equivalent to a language group"), Chinese scholars have repeatedly told me in confidence that Hanyu -- on purely linguistic grounds alone -- really ought to be considered as a group (yuzu), but that there are "traditional", "political", "nationalistic" and other factors that prevent them from declaring this publicly. These concerns may be temporarily unavoidable inside China, but it is regrettable that they are also still being purveyed in purportedly authoritative treatments of language intended for external consumption. Li and Thompson unabashedly confess that they subscribe to the aberrant classification of Sinitic languages as dialects, "even though it is based on political and social considerations rather than linguistic ones."*18 I would suggest that linguistic science outside of China need not be governed by these factors and, as such, we should feel free to refer directly to Sinitic as a language group comparable to Romance, Germanic, Tibeto-Burmese, Turkic, and so forth. The fact that Sinitic (Hanyu) is ranked parallel to Tibeto-Burman, Miao-Yao, and Zhuang-Dong in the above classification scheme of the Sino-Tibetan language family implies that -- except for non-linguistic criteria -- Chinese scholars themselves virtually accept it as a language group. As with much else in contemporary China, one may think that something is true of language, but one may not necessarily be willing to say that it is so.

We must now turn to the proper designation of the current national language of China. I say "current", because the official elevation of Northeastern Mandarin (with Peking pronunciation as the basis for the standard) to that status is a fairly recent phenomenon.*19 If we accept, as I have tried to demonstrate above, that Sinitic or Chinese is a language group rather than being merely a single language, then we must choose another name for the current national language which is one member of that group. The usual designation in English is "(Modern Standard) Mandarin" but it is just as often loosely called "Chinese". Let us examine what justifications, if any, there are for these usages.

Zhang Gonggui and Wang Weizhou*20 list the following synonyms and near-synonyms for "(Standard) Mandarin": putonghua ("ordinary speech"), hanyu ("Han language"), hanwen ("Han writing"), zhongwen ("Central [Kingdom] writing"), zhongguohua ("Central Kingdom speech"), zhongguoyu ("Central Kingdom language"), zhongguo yuwen ("Central Kingdom spoken and written language"), huayu ("[culturally] florescent speech"), huawen ("[culturally] florescent writing"), guanhua ("officials' speech"), baihua ("plain speech"), baihuawen ("plain speech writing"), xiandaiwen ("modern writing"), dazhongyu ("language of the masses"*21), fangjiyu ("intertopical language"), qujiyu ("interregional language"), gongtonghua ("common speech"), hanzu gongtongyu ("common language of the Han people"), hanzu biaozhunyu ("standard language of the Han people"), hanyu biaozhunyu ("standard language [! -- i.e. form] of the Han language"), and the older terms tongyu ("general language"), fanyu ("ordinary language"), yayan ("elegant parlance"), xiayan ("estival parlance"), and so forth. There is no point in my reviewing here the intricacies of each of these designations individually. Suffice it to say that there must be some good reason(s) for this wild proliferation of names for what is ostensibly the same phenomenon. At the present moment, I am not willing to speculate on what the reason(s) might be. While there is still such a vigorous debate going on among the Chinese themselves about what to call their national language, however, I believe it would be premature, if not impolitic, to assign a new one ourselves. Indeed, we may eventually have to end up calling the national language of China "ordinary speech" (putonghua), if that is what the Chinese people finally decide is the proper designation for the official language of their country. The Indians call their national language Hindi and we follow suit. Its position among the other languages of the subcontinent is comparable to that of Mandarin among the languages of China. Just as it would be strange for us to insist upon calling Hindi "Indian", so is it presumptuous for us to call Mandarin (i.e., putonghua, guoyu, huayu, etc.) "Chinese". (In terms of typology, as I have shown above, it is also imprecise to do so.) For the present, it is best to wait until the Chinese themselves achieve a greater degree of unanimity on this subject before we abandon a word that has served us well since at least 1604 (OED, q.v.).

Historically, the name "Mandarin" is an accurate designation for guanhua, which it was intended to represent. This word entered the English language through Portuguese mandarim. Though influenced in form by Portuguese mandar ("command, order"), it actually goes back to Malay mantri which, in turn, was borrowed from Hindi-Urdu. The word ultimately derives from Sanskrit mantrin ("counsellor"), in other words, "official". Hence "Mandarin" means "the language of the officials," ergo "official speech", which is precisely what guanhua was. Incidentally, the latter word is still in active use, as in the expressions xinan guanhua ("officials' speech of the southwest," i.e. Mandarin with a Szechwanese or Yunnanese accent), Shaoxing guanhua (Mandarin à la Lu Xun), Lanqing guanhua (Mandarin à la Chiang Kai-shek), nanfang guanhua ("officials' speech of the south") which latter, ironically, refers to Cantonese, and so forth.

Although (Modern Standard) Mandarin (Guanhua, Putonghua, Guoyu, Huayu, or what have you) is the official national language of China, it is not the only Chinese language.*22 Therefore, we must be extremely cautious when using such an expression as "the Chinese language".*23

What do we actually mean when we do so? Do we mean Mandarin? Fukienese? The language of Changsha during the seventh and eighth centuries? The language of Anyang during the Shang period? All the Chinese topolects from all periods of history? What would we mean if we were to say the Germanic language, the Slavic language, or the Indic language? In my estimation, unless we establish a unique system of classification for the Sinitic group, it is inaccurate to speak of there being but a single Chinese language.

The claim is frequently made that there are "a billion speakers of Chinese". Are there really? Even supposing that all the people within the presently constituted borders of China speak a Sinitic language (which is far from true), it makes no more sense to refer to "a billion speakers of Chinese" than it would to claim that there are "a billion speakers of Indic", "six hundred million speakers of Germanic", "four hundred million speakers of Romance", "three hundred and fifty million speakers of Slavic", and so forth. Only when we recognize that Chinese (more properly Sinitic), Indic, Germanic, Romance, and Slavic are groups of different languages are these large numbers justified.


Let us assume that the following propositions are true:

  1. fangyan is not equal to "dialect".
  2. Chinese or Sinitic is a group and not a language.
  3. Mandarin is not a dialect but a full-fledged language or, ultimately perhaps more accurately, branch of languages within the Sinitic group.
  4. Modern Standard Mandarin (MSM) is the national language of China.
  5. Cantonese, Amoy, Hakka, Hunanese, Hainanese, Taiwanese, Dungan, etc. are distinct languages within the Chinese or Sinitic group.
  6. the branches of the Chinese or Sinitic group remain to be established.
  7. "Mandarin" is not synonymous with "the Chinese language".

I am fully cognizant of the fact that the proposals set forth in this article have potential political implications. It is for this reason that I wish to state most emphatically that my suggestions apply only to English usage. I am making no claim about how the Chinese government or Chinese scholars should classify the many languages and dialects of their country. My only plea is for consistency in English linguistic usage. If we call Swedish and German or Marathi and Bengali separate languages, then I believe that we have no choice but to refer to Mandarin and Cantonese as two different languages. At the very least, if diplomatic or other considerations prevent us from making such an overt statement, we should refer to the major fangyan as "forms" or "varieties" of Chinese instead of as "dialects". If Chinese scholars wish to classify them as fangyan ("topolects"), that is their prerogative, and Western linguists should not interfere. So long as fangyan and "dialect" are decoupled, there is no reason that the proposed English usage should cause any disturbance among speakers of Chinese language(s).

Unless the notion of dialect is somehow separated from politics, ethnicity, culture, and other non-linguistic factors, the classification of the languages and peoples of China can never be made fully compatible with work that is done for other parts of the world. Take the language of the Hui Muslims, for example. They are considered to be one of China's major nationalities, but it is very difficult to determine what language(s) they speak. Is it a dialect of northwest Mandarin with an overlay of Arabic, Persian, Turkish, and perchance a smattering of Russian and other borrowings? That may be true for the Hui who live in Sinkiang or Ninghsia, but what about those who are located in Yunnan, Canton, Fukien, Kiangsu, Shantung, Honan, Hopei, and so forth?

China's linguistic richness is justly celebrated. Aside from the many Sino-Tibetan languages we examined earlier in this article, there are Turkic languages (Kazakh, Kirghiz, Salar, Tatar, Uighur, Uzbek, Yugur), Mongol languages (Bonan, Daur, Dongxiang, Mongol, Tu), Tungus-Manchu (Ewenki, Hezhen, Manchu, Oroqen, Sibo), and Korean -- all from the Altaic family. There are also Malayo-Polynesian languages such as Kaoshan, Austroasiatic languages such as Benglong, Blang, and Va of the Mon-Khmer group and Gin of the Vietnamese group as well as Indo-European languages including Tajik of the Iranian group and Russian of the Slavic group. As reflections of a historically shifting political entity called China, these languages too are "Chinese", but no one would claim that they are Sinitic.

While there may still be differences of opinion about the classification of these dozens of non-Sinitic Chinese languages, their existence militates strongly against the use of expressions like "the Chinese language". It is hard for me to think of any situations in which it would be proper to translate Zhongguo (de) yuyan in the singular as "Chinese language" except in an abstract, diachronic sense. It is even harder for me to imagine conditions under which Zhongguo (de) yuyan should be rendered as "the Chinese language". Once we obviate the fangyan / "dialect" problem, however, the issue of how to handle Zhongguo (de) yuyan essentially solves itself. The plural English form then becomes virtually obligatory.

A century ago, Uighur would have been thought of by Chinese scholars as a fangyan (of what we are unsure). Now it has been elevated to the status of an independent yu[yan]. Perhaps, in the future, the speech of Wenchow, Foochow, and Kaohsiung will similarly cease to be thought of as fangyan. Perhaps not. The real question for us now is whether they are dialects or languages. If they are dialects, then we must ask what language(s) they are dialects of and, if they are languages, then we are obliged to find out to which branch and group they belong. Simply to throw up our hands and say that "Chinese is different" is, to my mind, the height of irresponsibility. If we are going to rely on the "Chinese is different" ploy, then we should at least say precisely how it differs from the other language groups of the world. It is also irresponsible to seek refuge in the old canard that "written Chinese is the same for speakers of all Chinese 'dialects'", ergo Wenchow, Foochow, and Kaohsiung speech are "dialects" of "Chinese" because the elites of all three places could write mutually intelligible literary styles. Here we come smack up against the question of the relationship between language and script, between speech and writing. That, however, is the subject for another article.

In conclusion, when writing original linguistic works in English and when translating into English, we must decide whether to adopt terminology that is commensurate with generally accepted linguistic usage or to create an entirely new set of rules that are applicable only to Chinese languages. Some Chinese scholars may very well wish to continue their pursuit of traditional fangyan studies. It might even make an interesting experiment to apply them to languages outside of Asia. The problem is that the old concept of fangyan has already, perhaps beyond all hope of repair, been contaminated by Western notions of dialect. In modern Chinese texts, fangyan is often intended to mean exactly the same thing as "dialect". Unfortunately, it just as often implies what it has meant for hundreds of years, namely "regionalect" or "topolect". Or it may be a confused jumble of the old and the new. Whether we are writing in Chinese or in English or in some other language, it is our duty to be scrupulously precise when using such fundamental and sensitive terms as fangyan and "dialect".

The subject discussed in this article is admittedly an extraordinarily sensitive one, but it is an issue that sooner or later must be squarely faced if Sino-Tibetan linguistics is ever to take its place on an equal footing with Indo-European and other areas of linguistic research. So long as special rules and exceptions are set up solely for the Sinitic language group, general linguists will unavoidably look upon the object of our studies as somehow bizarre or exotic.*24 This is most unfortunate and should be avoided at all costs. The early publication of a complete and reliable linguistic atlas for all of China is a desideratum and might help to overcome some of the "strangeness" factor in Chinese language studies, but for that we shall probably have to wait a good many years.*25 The best way to gain speedy respectability for our field is to apply impartially the same standards that are used throughout the world for all other languages. The first step in that direction is to recognize that fangyan and "dialect" represent radically different concepts.*26


  1. Pei, pp. 15-16, and Berlitz, p. 1, both cite the figure 2,796. Although one would have expected some attrition since it was arrived at more than half a century ago, Ruhlen (pp. 1 and 3) has recently referred to roughly 5,000 languages in the world today. The source of this discrepancy probably lies in Ruhlen's greater coverage and more meticulous standards of classification.

  2. Chinese linguists usually speak of ba da fangyan qu ("eight major fangyan areas"), but there are constant pressures to revise that figure. Government bureaucrats wish to reduce the number to as few as five major fangyan so that it appears Sinitic languages are converging. Fieldworkers, on the other hand, know from their firsthand contact with individual speakers of various localities that the number is in reality much larger (see notes 4, 5, and 6 below). One of China's most open-minded linguists, Lyu Shuxiang (pp. 85-86), speaks of the existence of as many as one to two thousand Chinese fangyan. Most refreshingly, he also suggests that the term fangyan be reserved for specific forms of local speech, such as those of Tientsin, Hankow, Wusi, and Canton. In a private communication of August 9, 1987, Jerry Norman, an eminent specialist of Chinese fangyan, expressed the opinion that the number of mutually unintelligible varieties of Chinese (i.e. Hanyu or modern Sinitic) is probably somewhere between 300 and 400.

    Since 1985, a series of exciting revisions of the traditional classification of the major Sinitic fangyan has appeared in the pages of the journal Fangyan [Topolect]. According to this new breakdown, there were 662,240,000 speakers of Guanhua (Mandarin), 45,700,000 speakers of Jinyu (eastern Shansi), 69,920,000 of Wuyu (Shanghai, Chekiang), 3,120,000 of Huiyu (southern Anhwei), 31,270,000 of Ganyu (Kiangsi), 30,850,000 of Xiangyu (Hunan), 55,070,000 of Minyu (Fukien), 40,210,000 of Yueyu (Cantonese), 2,000,000 of Pinghua (in Kwangsi), and 35,000,000 of Kejiayu (Hakka) for a total of 977,440,000 speakers of Han (i.e., Sinitic) languages. I am grateful to my colleague Yongquan Liu who reported this information in a lecture given at the University of Pennsylvania.

    There are several interesting features to note about this new division. First is that seven of these major topolects are designated as yu ("languages") while three--including the largest and the smallest --are referred to as hua ("[patterns of] speech"). Of the three newly recognized topolects, Jinyu represents a large splitting off from Mandarin which suggests the possibility that many other comparable units (e.g. Szechwan) may one day do likewise, Huiyu is a breakaway from Wuyu, and Pinghua is a hitherto unknown Sinitic topolect that has been carved out of the Zhuang Autonomous Region. The latter two topolects, being small and poorly defined (in linguistic terms), evince special pleading of the sort that led to a proliferation of ethnic "minorities" during recent decades.

  3. On the matter of mutual intelligibility, I follow the most reliable authorities on language taxonomy (e.g. Ruhlen, p. 6). There are, of course, a few well-known exceptions to the mutual intelligibility rule (e.g. the Scandinavian languages, Spanish and Portuguese, Russian and Ukrainian, etc.) where, for political reasons, patterns of speech that are partially mutually understandable are referred to as languages rather than dialects. There are also instances where what are essentially the same patterns of speech, when recorded in two different scripts, may sometimes be considered as two languages (e.g. Hindi-Urdu and Serbo-Croatian, but note that the hyphenated expressions recognize the basic identity of the constituent members). In both of these types of exception to the mutual intelligibility rule, it is a matter of overspecification by language rather than gross underspecification by dialect as in the Chinese case. There is no comparable situation elsewhere in the world where so many hundreds of millions of speakers of mutually unintelligible languages are exceptionally said to be speakers of dialects of a single language.

    Laymen often use the word "dialect" in imprecise ways (argot, style of expression, and so on), but that is true of many technical terms that have broad currency outside of a particular scholarly discipline. In this article, to avoid confusion, I shall employ the word "dialect" only in its technical sense as defined in linguistics handbooks and monographs on that topic. Although there are other factors to consider, mutual intelligibility is the most common criterion for distinguishing a dialect from a language. Furthermore, mutual intelligibility is an easy test to administer. Monolingual members of two different speech communities are requested to communicate to each other certain specific information. Each subject is then asked by the administrator of the test or his assistant in the subject's own language about the content of the other subject's communication. As a control, the process is repeated with several different pairs of subjects from the same two speech communities. If less than 50% of the content has been transmitted, the two speech communities must be considered to be two languages. If more than 50% has been communicated, they must be considered to be two dialects of the same language. The 50% figure is actually overly generous. The smooth and uninterrupted flow of ideas and information would require a substantially higher percentage. In a more sophisticated analysis, we would also have to take into account various degrees of unilateral or partially unilateral (un)intelligibility (i.e., where one speaker understands the other speaker better than the reverse).
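
    The decision rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the representation of each pair of subjects as two fractions of content transmitted, and the use of simple averaging across pairs and directions are hypothetical choices made here, not part of any established testing protocol. The 50% threshold is the one given above, and the handling of a score of exactly 50%, which the rule leaves open, is likewise an assumption.

```python
from statistics import mean

# The 50% cut-off comes from the note above; everything else is illustrative.
THRESHOLD = 0.5


def classify_speech_communities(pair_scores, threshold=THRESHOLD):
    """Classify two speech communities from intelligibility test results.

    pair_scores: one (a_to_b, b_to_a) tuple per pair of monolingual
    subjects, each value being the fraction (0.0 to 1.0) of the content
    that was successfully transmitted in that direction.

    Returns (verdict, overall, asymmetry), where asymmetry is the mean
    A-to-B score minus the mean B-to-A score, a rough indicator of
    unilateral intelligibility.
    """
    if not pair_scores:
        raise ValueError("at least one pair of subjects is required")

    # Averaging across the repeated pairs is the assumed way of using
    # the control repetitions.
    a_to_b = mean(score[0] for score in pair_scores)
    b_to_a = mean(score[1] for score in pair_scores)
    overall = mean([a_to_b, b_to_a])

    if overall < threshold:
        verdict = "two languages"
    elif overall > threshold:
        verdict = "two dialects of the same language"
    else:
        # The rule as stated does not cover a score of exactly 50%.
        verdict = "undetermined"
    return verdict, overall, a_to_b - b_to_a


# Hypothetical scores for three pairs of subjects.
verdict, overall, asymmetry = classify_speech_communities(
    [(0.20, 0.10), (0.30, 0.20), (0.25, 0.15)]
)
print(verdict)
```

    Reporting the asymmetry alongside the verdict is one simple way of keeping unilateral intelligibility in view; a fuller analysis would presumably treat the two directions separately rather than averaging them.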

    Regardless of the imprecision of lay usage, we should strive for a consistent means of distinguishing between language and dialect. Otherwise we might as well use the two terms interchangeably. That way lies chaos and the collapse of rational discourse. Mutual intelligibility is normally accepted by most linguists as the only plausible criterion for making the distinction between language and dialect in the vast majority of cases. Put differently, no more suitable, workable device for distinguishing these two levels of speech has yet been proposed. If there are to be exceptions to the useful principle of mutual intelligibility, there should be compelling reasons for them. Above all, exceptions should not be made the rule.

  4. Liang Deman of Sichuan University, an expert on Szechwanese dialects, pointed out to me (private communication of July, 1987), that fifty per cent or more of the vocabulary of the major Szechwan fangyan is different from Modern Standard Mandarin. This includes many of the most basic verbs. Professor Liang emphasized the differences between Szechwan Putonghua and genuine Szechwan fangyan. The former is basically MSM spoken with a Szechwanese accent or pronunciation and a small admixture of Szechwanese lexical items, whereas the latter represent a wide variety of unadulterated tuhua ("patois"), many of them unintelligible to speakers of MSM. My wife, Li-ching Chang, grew up in Chengtu, the capital of Szechwan province, speaking Mandarin with a Szechwanese accent. Although she also speaks MSM, she still is most comfortable when speaking Mandarin with a Szechwanese accent. In the summer of 1987 when we climbed Mt. Emei, however, she was perplexed to find that she could not understand one word of the speech of the hundreds of pilgrims (mostly women in their fifties and sixties) who had come to the mountain from various parts of the province. Making inquiries of temple officials, shopkeepers, and others along the pilgrimage routes who did speak some version of MSM, we learned to our dismay that the women were ethnically Han, that most of them came from within one hundred miles of the mountain, and that they were indeed speaking Sinitic languages. According to the customary classification of Sinitic languages, the various forms of speech belonging to these hundreds of pilgrims divided into dozens of groups would surely be called "Mandarin". Hence we see that even Mandarin includes within it an unspecified number of languages, very few of which have ever been reduced to writing, that are mutually unintelligible.

    These conclusions are borne out by the observations of Paul Serruys, a linguist who was a former missionary among peasants in China:

    In determining what is standard common language and what is not, one must compare the idea of a standard language with the dialects on one hand and the written literary language on the other....
    The masses of the people do not know any characters, nor any kind of common Standard Language, since such a language requires a certain amount of reading and some contact with wider circles of culture than the immediate local unit of the village or the country area where the ordinary illiterate spends his life. From this viewpoint, it is clear that in the vast regions where so-called Mandarin dialects are spoken the differences of the speech which exist among the masses are considerably more marked, not only in sound, but in vocabulary and structure, than is usually admitted. In the dialects that do not belong in the wide group of Mandarin dialects, the case is even more severe. To learn the Standard Language is for a great number of illiterates not merely to acquire a new set of phonetic habits, but almost to learn a new language, and this in the degree as the vocabulary and grammar of their dialect are different from the modern standard norms. It is true that every Chinese might be acquainted with a certain amount of bureaucratic terminology, in as far as these terms touch his practical life, for example, taxes, police. We may expect he will adopt docilely and quickly the slogan language of Communist organizations to the extent such is necessary for his own good. But these elements represent only a thin layer of his linguistic equipment. When his language is seen in the deeper levels, his family relations, his tools, his work in the fields, daily life at home and in the village, differences in vocabulary become very striking, to the point of mutual unintelligibility from region to region.

    The Standard Language must be acquired through the learning of the characters; since alphabetization for the time being gives only the pronunciation of the characters. But in many cases these characters do not stand for a word in the dialects, but only for one in the standard written language. There is often no appropriate character to be found to represent the dialect word. If historical and philological studies can discover the proper character, it may be one that is already obsolete, or a character that no longer has the requisite meaning, or usage in the Standard Language, or a reading comparable to the Pekinese pronunciation.

    While Serruys' conventional Sinological use of the word "dialect" is confusing, his remarks are of great importance, both for spoken Sinitic languages and for their relationship to the Chinese script.

  5. Yang Chunlin of Northwestern (Xibei) University, an expert on Shensi dialects, claims (in discussions with the author held in July, 1987) that there should be at least nine major fangyan areas (cf. note 2 above). His grounds for making this claim include the fact that local varieties of speech in northern Shensi retain the entering tone (rusheng) and are partially incomprehensible to speakers of MSM. By these standards, scores of additional languages would have to be established within the current Mandarin-speaking areas of China alone. There are numerous local speech forms in the north that preserve the entering tone in part or in whole. Other places, like Yentai (on the northern Shantung coast), have not experienced the palatalization of the velars and apical sibilants before high vowels that is supposedly common to all Mandarin "dialects". And so forth.

  6. Now considered by Soviet authorities and its own speakers to be a separate language, Dungan is written in the Cyrillic alphabet and includes a large number of direct borrowings from Russian. Although still formally classified as a dialect of northwest Mandarin, the independent status of Dungan is attested by the lack of comprehension which a group of Chinese and American linguists who are also fluent speakers of MSM experienced upon hearing a tape recording of this language. This event took place at the Ninth Workshop on Chinese Linguistics held at the Project on Linguistic Analysis (Berkeley) from February 15-17, 1990. Even though the auditors had available a text (presumably written in Cyrillic letters) of the story being told, they could "get only a rough understanding." See pp. 343-344 (English) and p. 84 (MSM) in the reports of Lien Chin-fa. I myself experience the same difficulties when travelling in Soviet Central Asia, when entertaining Dungan friends in this country, and when listening to Dungan tapes and records. For a description of Soviet Dungan and references to scholarly articles on the subject, see Mair.

  7. Fangyan zao, preface.

  8. MSM yuyan ("language") is also an older word but, perhaps because it is more nearly synonymous to its Western translations, it does not cause nearly so much confusion in linguistic discussions as does fangyan.

  9. Ramsey, The Languages of China, p. 32.

  10. Op. cit., 2.20a.

  11. Hanyu fangyan diaocha jichu zhishi, p. 4.

  12. Op. cit., pp. 6-7.

  13. Crystal, Dictionary, p. 92.

  14. Still more recently, however, the same author has shown that he is no longer swayed by such non-linguistic factors. In a remarkably straightforward and long overdue reappraisal, Crystal (Encyclopedia, p. 312a) cuts through centuries of obfuscation by declaring that the eight major varieties of Han speech "are as different from each other (mainly in pronunciation and vocabulary) as French or Spanish is from Italian, the dialects [sic] of the south-east being linguistically the furthest apart. The mutual unintelligibility of the varieties is the main ground for referring to them as separate languages. However, it must also be recognized that each variety consists of a large number of dialects, many of which may themselves be referred to as languages." Likewise, the most recent, complete, and authoritative study on language taxonomy properly refers to the eight major Sinitic speech forms of China as being "really separate languages." See Ruhlen, pp. 142-143. I have also seen and heard similar remarks by Noam Chomsky. Perhaps Crystal's, Ruhlen's, and Chomsky's no-nonsense approach presages a new rigor that will bring the study of Chinese languages in line with linguistic usage for other areas of the world.

  15. Op. cit., pp. 53-67.

  16. Op. cit., p. 192b. Nearly identical charts appear on pp. 164b and 554b of the Nationalities volume of the encyclopedia.

  17. The exact statement, penned by the noted specialist on so-called minority nationality languages in China, Fu Maoji, appears on p. 554b of the Nationalities volume of the Chinese Encyclopedia and reads as follows: "Hanyu occupies a position in linguistic classification that is equivalent to a language group." (Hanyu zai yuyan xishu fenlei zhong xiangdang yu yige yuzu de diwei.) This is a formulation of the utmost significance, one that seems to foretell a still more candid approach to the problem in the not-too-distant future. Once Hanyu is recognized to be a language group (which it is) instead of a single language, it will not be long thereafter that the issue of Chinese dialects receives more forthright treatment.

  18. In Comrie, ed., The World's Major Languages, p. 813. It is ironic that the complexity of the Sinitic group (and its even more perplexing relationship to the Chinese script) tends to be confronted more directly in less well publicized studies. For example, Siew-Yue Killingley (p. 31) is willing to conclude a study of Cantonese with the following series of questions: "Finally, can the character-based analysis of tones in Chinese, based on a former monosyllabic state of the Chinese languages, have affected the phonological analyses of other Chinese languages besides Cantonese which have become increasingly polysyllabic? Has our attitude to such analyses as received knowledge prevented us from questioning them too deeply? ... Could this conclusion [viz., that Mandarin tones 2 and 3 are one phonological tone in an environment where tone 3 is immediately followed by another tone 3] be taken any further, beyond the restrictions of this environment? And could parallel discoveries be made for other Chinese languages?" Compare Rosaline Kwan-wai Chiu's suggestion (p. 3) that "We should perhaps better describe the situation of the internal composition of Chinese and of the mutual relationship between Chinese dialects if we compared Chinese to a group of related European languages."

    It would appear that, in their initial encounters with spoken Sinitic languages, Western scholars relied more on direct observation than on cultural myths. In a survey of Chinese languages written nearly two centuries ago, J. Leyden ("On the Languages and Literature of the Indo-Chinese Nations," pp. 266-267) remarked:

    It must be observed, however, that when the term Chinese is applied to the spoken languages of China, it is used in a very wide signification, unless some particular province be specified. The Chinese colloquial languages appear to be more numerous than the Indo-Chinese tongues, and equally unconnected with each other. BARROW himself declares, that scarcely two provinces in China have the same oral language. (Travels in China, p. 244.) While the nature of the Chinese character is still so imperfectly understood, it is not surprizing that the investigation of the spoken languages of China has been totally neglected. In the course of some enquiries that I made among the Chinese of Penang, I found that four or five languages were current among them, which were totally distinct from each other, and the names of several others were mentioned. I was informed that the principal Chinese languages were ten in number; but I have found that considerable variety occurred in the enumeration of their names, and suspect that they are considerably more numerous, in reality.

    Perhaps because they have fallen under the sway of views on language and script that were traditionally espoused by literati and bureaucrats in China, only rarely have modern Sinologists publicly admitted that there exists more than a single Chinese tongue, as did N. G. D. Malmqvist during a 1962 lecture:

    It should be noted that if the criterion of mutual intelligibility were applied, we would have to classify many of the Chinese dialects as languages, and not as dialects.

    We know from literary sources that mutually unintelligible dialects existed in China in pre-Christian times. We also know that a given dialect may spread at the expense of other dialects as the result of the political dominance or economic or cultural supremacy of the speakers of that dialect. This is what happened to the Attic dialect which grew in influence, and eventually, in the Hellenistic period, became the standard speech of all Greece. The same process is under way in China today, where the Common Language --the Northern Mandarin --is being propagated all over the country. The spread of the knowledge of this dialect is indeed a prerequisite to the introduction of a romanized script, and this process is therefore being accelerated by the Peking government.

    In a lecture delivered about a decade later (May 11, 1971), M. A. French (pp. 101-102) addressed the matter even more straightforwardly:

    First, one should realise that the term Chinese language may refer to more than one linguistic system. Within present-day China there are spoken a number of genetically related but mutually unintelligible linguistic systems, including Cantonese and Mandarin .... Another Chinese linguistic system is Wenyan. Wenyan takes as its model the language of the Chinese classics. It has long been exclusively a written medium and until the beginning of the present century it was the medium in which almost all Chinese literature was written. Since in popular English usage the word Chinese may refer to any or all of the above varieties it is evident that, without elaboration, statements such as 'Chinese has no grammar' or 'Chinese is a monosyllabic language' or 'Chinese written with an ideographic script' are unsatisfactory, irrespective of whether they are true or not, in that they may suggest that there exists only one Chinese language.

    Most recently, in his long and authoritative article on "Sino-Tibetan Languages" in the New Encyclopaedia Britannica, Søren Egerod has accurately described the linguistic situation in China as follows:

    Chinese as the name of a language is a misnomer. It has been applied to numerous dialects, styles, and languages from the middle of the 2nd millennium BC. Sinitic is a more satisfactory designation for covering all these entities and setting them off from the Tibeto-Karen group of Sino-Tibetan languages .... The present-day spoken languages are not mutually intelligible (some are further apart than Portuguese and Italian), and neither are the major subdivisions within each group.

  19. Paul F. M. Yang has shown that the Mandarin (guanhua) of the late Ming period (1368-1643), for example, may well have been based on the Nanking dialect (here I use the term advisedly). For most of the Tang period, the standard was the "dialect" of Chang'an. And so forth, depending on political circumstances and scholarly preferences. There was still intense disagreement on this subject as recently as the period of the founding of the Republic of China during the first quarter of this century when there were proponents of Cantonese, Shanghainese, Pekingese, and other forms of Chinese as the national language.

  20. "'Putonghua', haishi 'Guoyu'?" p. 10.

  21. This expression ultimately derives from pre-Marxian Sanskrit mahasangha ("the great assembly, everybody").

  22. In fact, as Robert Sanders has recently shown in a brilliantly argued paper, there are actually at least four different categories of Mandarin languages:

    1. Idealized Mandarin which, by definition, has no native speakers.
    2. Imperial Mandarin, an artificial language spoken by the scholar-official class (drawn from throughout China).
    3. Geographical Mandarin, an abstraction that embraces numerous speech patterns of low mutual intelligibility.
    4. Local Mandarin, represented by hundreds of independent speech communities.

  23. Only when we are careful to signify that one of the Sinitic tongues which has received political sanction as the official national language (at the present moment it happens to be MSM as based upon the Peking topolect) is it proper to refer to "Chinese" as a single language in contrast to the other nonsanctioned languages. Here the usage is comparable to calling the northern Langue d'Oil "French" so as to set it apart from Occitan or Langue d'Oc in the south and Franco-Provençal in the east-center of France. Likewise, we may refer to "Spanish" as the national language of Spain which was originally prevalent in the western part of the country as distinguished from Catalan which still flourishes in the east of the Iberian peninsula.

  24. The burden of proof rests with those who insist that Sinitic languages are not subject to the same universal laws of phonology, morphology, grammar, and syntax that govern all other human languages.

  25. Long after the initial drafts of this paper were written, the first two parts of the Language Atlas of China became available. So far, it is not entirely clear from the Atlas what attitude it will take toward the overall problem of the nomenclature for dialects and languages in China. It does, however, introduce the very interesting concept that Mandarin, Jin, Wu, Gan, etc. are groups and that most of them may be readily divided into subgroups. The Atlas also speaks of a Min Supergroup (the Sinitic languages of Fukien, Taiwan, Eastern Kwangtung, and Hainan island). If these readjustments come to be accepted, it will require a new understanding of their position vis-à-vis Hanyu (i.e., Sinitic) and, indeed, of Sinitic vis-à-vis Tibeto-Burman (or Tibeto-Karen), not to mention Sino-Tibetan and still less Austro-Asiatic, Austronesian (Malayo-Polynesian) and Dene-Caucasian. It is obvious that the classification of Sinitic languages is presently in a tremendous state of flux.

  26. Just as I was completing the final revisions of this article, I received a copy of Li Jingzhong's epochal paper on the independent status of Cantonese within the Sinitic group. Li begins with an historical discussion of the relationship between Cantonese (Yuèyǔ/Yuet6yue5) and the "Hundred Yue" (Bǎiyuè/Baak3yuet6) of the Spring and Autumn and the Warring States periods. In the process, Li also points out the probable origins of the designation Mányì/Maan4yi4 which has long been applied to the indigenous peoples of Kwangtung and Kwangsi. In the next section of his/her article, Li draws illuminating comparisons between Cantonese and MSM, and between Cantonese and Zhuang in terms of phonology, lexicon, and grammar. (S)He also shows that there are telling similarities with Yáo/Yiu4 (i.e., Myen). The conclusion of this section reads as follows:

    To sum up the above, we can see clearly that the origins of Cantonese lie in Old Chinese (i.e., Old Sinitic). Therefore it has quite close genetic connections with MSM. However, during the process of its formation and development, Cantonese experienced intense contact with and mutual influence upon the languages of the "Hundred Yue / Yuet6" and others, greatly influencing its phonology, grammar, and lexicon. Consequently, Cantonese gradually lost many special features of Old Chinese. At the same time, through absorption of influences from the languages of the "Hundred Yue / Yuet6," Cantonese gradually and continuously acquired new features and new structural patterns until, at last, it became an independent language that, while sharing an organic relationship with MSM, is totally different from it.

    The opening of the next section of Li's article is equally important:

    During the past several decades, many linguists, both in China and abroad, have considered Cantonese to be a "fangyan" of Modern Sinitic (Xiandai Hanyu). This is especially true of linguists within China who are in virtually unanimous agreement on this point.

    In actuality, no matter with regard to phonology, grammar, or lexicon, the differences between Cantonese and Mandarin are enormous. Speakers of Mandarin are quite incapable of understanding Cantonese and vice versa. This is a fact of which everyone is fully aware. Nonetheless, although it is obvious that speakers of Mandarin and Cantonese cannot converse with each other, why is there this insistence that Cantonese is a "fangyan" of Modern Sinitic? To my mind, there are but two reasons: 1. the influence of Stalin's discussions on "language" and "dialect"; 2. the imperceptible psychological pressure of "politicolinguistics".

    Paying heed neither to Stalin nor the heavy hand of politics, Li forges ahead to provide clear statistical proof of the tremendous gap between Cantonese and Mandarin. (S)He even puts forth his/her own classification scheme for the Sinitic group of languages, which I reproduce here:


    1. Sinitic
      1. Northern fangyan
      2. Xiang (Hunan) fangyan
      3. Gan (Kiangsi) fangyan
      4. Hakka
    2. Wu
      1. Soochow fangyan
      2. Southern Chekiang fangyan
    3. Min
      1. Northern Min fangyan
      2. Southern Min fangyan
    4. Cantonese
      1. Canton fangyan
      2. Pinghua fangyan
      3. Northwest Hainanese fangyan

    There are, of course, many difficulties and anomalies in this scheme (e.g., Sinitic is both the group name and the name of one of what Li presumably views as the functional equivalent of branches, the Cantonese branch appears to be more finely analyzed than the other branches, fangyan is used both to signify languages and dialects, and so forth), but it represents the beginning of a classification scheme for Sinitic that is potentially compatible with linguistic usage universally employed in the study of other language groups.

    Li closes with some predictions for the future of Cantonese based on current trends which indicate that, over a course of centuries, it will continue to absorb elements from a variety of sources (including English in a rather substantial way) while maintaining its basic structural integrity and identity.

    Almost as important as the content of Li Jingzhong's article is the fact that (s)he is Associate Professor at the Kwangtung Nationalities Institute (Guangdong Minzu Xueyuan). It is evident that it has now become possible even for a scholar from China to discuss the problem of the classification of the Sinitic group of languages candidly and scientifically. Li's article fully deserves a speedy and complete translation into English for it is one of the most vital statements on Chinese linguistics to have been published within memory.


Berlitz, Charles. Native Tongues. New York: Grosset and Dunlap, 1982.

Chao, Yuen Ren. "Languages and Dialects in China." In Aspects of Chinese Sociolinguistics, ed. Anwar S. Dil. Stanford: Stanford University Press, 1976. Pp. 21-25.

Chen Zhongyu. "'Huayu' --Huaren de gongtongyu ['Florescent Language' --The Common Language of the Florescent (i.e. Chinese) People]." Yuwen jianshe tongxun (Chinese Language Advancement Bulletin), Hong Kong, 21 (November, 1986), 7-9.

Chiu, Rome Kwan-wai. "The Three-fold Objective of the Language Reform in Mainland China in the Last Two Decades." Paper prepared for the Symposium organised under the auspices of the Comité québécois des études de la Chine et des autres pays d'Asie, Université Laval, Québec, April 14, 1969.

Comrie, Bernard, ed. The World's Major Languages. New York: Oxford University Press, 1987. Article on "Chinese" by Charles N. Li and Sandra A. Thompson.

Crystal, David. A Dictionary of Linguistics and Phonetics. Oxford: Basil Blackwell in association with André Deutsch, 1985.

__________________. The Cambridge Encyclopedia of Language. Cambridge: Cambridge University Press, 1987; rpt. 1988.

DeFrancis, John. The Chinese Language: Fact and Fantasy. Honolulu: University of Hawaii Press, 1984.

Donner, Frederick W., Jr., comp. A Preliminary Glossary of Chinese Linguistic Terminology. Chinese Materials and Research Aids Service Center, Occasional Series 38. San Francisco: Chinese Materials Center, 1977.

Egerod, Søren. "Sino-Tibetan Languages." The New Encyclopædia Britannica. Chicago: Encyclopædia Britannica, 1988. Vol. 22, pp. 721-731.

French, M. A. "Observations on the Chinese script and the classification of writing-systems." In W. Haas, ed., Writing without letters. Mont Follick series, 4. Manchester: Manchester University Press, Rowman and Littlefield, 1976. 4.101-129.

Killingley, Siew-Yue. A New Look at Cantonese Tones: Five or Six? Newcastle upon Tyne: S. Y. Killingley, 1985.

Leyden, J. "On the Languages and Literature of the Indo-Chinese Nations." Asiatic Researches, comprising History and Antiquities, the Arts, Sciences, and Literature of Asia, 22 vols. New Delhi: Cosmo, 1979 rpt. Vol. 10 (1812), pp. 158-289.

Li Jingzhong. "Yueyu shi Hanyu zuqun zhong de duli yuyan [Cantonese Is An Independent Language within the Sinitic Group]." Yuwen jianshe tongxun (Chinese Language Advancement Bulletin), Hong Kong, 27 (March, 1990), 28-48.

Li Tiaoyuan (1734-1803). Fangyan zao [Elegant Topolecticisms]. In Congshu jicheng jianbian [Abridged Compilation of Collectanea]. Taipei: Commercial Press, 1965.

Li Yehong. "Ba 'Putonghua' zhengming wei 'Guoyu', shi shihou le! [It's High Time to Rectify the Name of 'Ordinary Speech' As 'National Language'!]" Yuwen jianshe tongxun (Chinese Language Advancement Bulletin), Hong Kong, 19 (February, 1986), 25-26.

Lian Jinfa. "Dijiu jie guoji Zhongguo yuyanxue yanjiuhui jiyao [Summary of the Ninth International Research Meeting on Chinese Linguistics]." Hanxue yanjiu tongxun (Newsletter for Research in Chinese Studies), 9.2, cumulative 34 (June, 1990), 84-89.

Lien, Chinfa. "The Ninth Workshop on Chinese Linguistics." Journal of Chinese Linguistics, 18.2 (June, 1990), 343-356.

Liu, Yongquan. "The Linguistic Situation in China." Lecture presented before the Department of Oriental Studies, University of Pennsylvania, April 17, 1990.

Liu Yongquan and Zhao Shikai, comp. Ying-Han yuyanxue cihui [English-Chinese Glossary of Linguistics]. Peking: Zhongguo shehui kexue chubanshe, 1979.

Lyu Shuxiang. Yuwen changtan [Common Talk about Language]. Peking: Sanlian shudian, 1980.

Mair, Victor H. "Implications of the Soviet Dungan Script for Chinese Language Reform." Sino-Platonic Papers, 18 (May, 1990), 19 pages. Paper originally prepared for the Conference on "The Legacy of Islam in China: An International Symposium in Memory of Joseph F. Fletcher" held at Harvard University, April 14-16, 1989.

Malmqvist, N. G. D. "Problems and Methods in Chinese Linguistics." The Twenty-fourth George Ernest Morrison lecture in ethnology, 1962. Canberra: The Australian National University, 1964.

Pei, Mario A. The World's Chief Languages. New York: S. F. Vanni, 1946, third ed.

Ramsey, S. Robert. The Languages of China. Princeton: Princeton University Press, 1987.

Ruhlen, Merritt. A Guide to the World's Languages. Vol. I: Classification. Stanford: Stanford University Press, 1987.

Sanders, Robert M. "The Four Languages of 'Mandarin'." Sino-Platonic Papers, 4 (November, 1987).

Serruys, Paul L.-M. "Survey of the Chinese Language Reform and the Anti-Illiteracy Movement in Communist China." Studies in Chinese Communist Terminology, 8. Berkeley: University of California, Institute of International Studies, Center for Chinese Studies, February, 1962.

Sun Yirang (1848-1908). Zhouli zhengyao [Essentials of Government from the Zhou Ritual]. In Guanzhong congshu [Shensi Collectanea], first series. Taipei (1970?), facsimile reproduction of 1934 typeset edition.

Wang, William S-Y. "Theoretical Issues in Studying Chinese Dialects." Journal of the Chinese Language Teachers Association, 25.1 (February, 1990), 1-34.

Wurm, S. A., B. Ts'ou, D. Bradley, et al., eds. Language Atlas of China. Parts I and II. Pacific Linguistics, Series C, No. 102. Hong Kong: Longman, 1987. English and Chinese editions published simultaneously.

Xing Gongwan. Hanyu fangyan diaocha jichu zhishi [Fundamental Knowledge for the Investigation of Sinitic Topolects]. Wuchang, Hupeh: Huazhong gongxue yuan chubanshe, 1982.

Yang, Paul Fu-mien. "The Portuguese-Chinese Dictionary of Matteo Ricci: A Historical and Linguistic Introduction." Paper presented at the Second International Conference on Sinology. Academia Sinica, Taipei, December 29-31, 1986.

Yuan Jiahua, et al. Hanyu fangyan gaiyao [Essentials of Chinese Fangyan]. Second ed. Peking: Wenzi gaige chubanshe, 1983.

Zhang Gonggui and Wang Weizhou. "'Putonghua', haishi 'Guoyu'? ['Ordinary Speech' or 'National Language'?]." Yuwen jianshe tongxun (Chinese Language Advancement Bulletin), Hong Kong, 21 (November, 1986), 14.

Zhongguo da baike quanshu [Chinese Encyclopedia]. Minzu [Nationalities] (1986) and Yuyan [Languages] (1988) volumes.

This essay was originally published in September 1991 as issue no. 29 of Sino-Platonic Papers and is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License.
