One-fifth of the -o forms in the 白水村 Baishuicun (BSC) 'White Water Village' dialect cannot be explained by the sound laws I proposed in parts 2 and 3. I will try to explain these forms, which fall into eight categories. I could have covered the first two categories in part 3, but they are more complicated than the majority of the *-at class.

1. 佛 fo < *fat (< *but) 'Buddha'

This appears to be a loan from a language in which *-ut became *-at as in Cantonese 佛 fat (though Cantonese is not spoken in Hunan and therefore is not the source of fo). An earlier loan is pu which is close to prestigious Early Middle Chinese *but, an abbreviation of 佛陀 *but da 'Buddha'.

Could the word simply be a very recent borrowing from Mandarin fo (whose rhyme is irregular)?

I am reluctant to guess glosses for BSC forms since the 小學堂 Xiaoxuetang database does not provide any, but in this case I'm pretty sure 佛 is 'Buddha' (though 佛 may have other meanings in BSC), and it would be odd to mention 佛陀 *but da without also mentioning its Indic source.

2. 蝨 so < *sat (< *ʂɨt < *ʂit < *srit < *srik)

This might be a loan from a language in which *-it became *-at as in Cantonese 蝨 sat (though I must note again that Cantonese cannot be the source of this particular word).

3. -o < *-ai

In part 2, I dealt with BSC -o-forms corresponding to -ai-type rhymes in other Chinese languages. The following words may have had pre-BSC *-ai even though they may not have -ai-like rhymes in other Chinese languages:

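| Sinograph | Old Chinese | Early Middle Chinese | Late Middle Chinese | Pre-BSC | BSC |
|---|---|---|---|---|---|
| | (*srəj > *ʂɨj?) | *ʂi | *ʂi | *sai | so |
| | *Cɯ-baj or *Nɯ-paj | *bɨe | *bɨi | *pai | po |
| | *ʔəj | *ʔɨj | *ʔi | *ai | o |
| | *pəts > *pɨjh | *pujʰ | *fi | *pai | po |
| | *məjʔ > *mɨjʔ | *mujˀ | *vi | *mai | mo |
| | *pajʔ | *paˀ | *pa | *pai | po |
| | *naj | *na | *na | *nai | no |
| | *lats | *da(j)ʰ | *da(j) | *tai | to |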

Pre-BSC *-ai may have been a reflex of Old Chinese *-aj/-əj/-ɨj-type rhymes; it cannot have been borrowed from a prestigious EMC or LMC dialect. (BSC 衣 i is a loan from a prestige LMC-like form.)

篩 is not attested in Old Chinese. Pre-BSC 篩 *sai is reminiscent of Cantonese 篩 sai, though it cannot be from Cantonese. The word could be borrowed from a form like Mandarin 篩 shai (whose reading is from 籭; the hypothetical regular Mandarin reading would be *shi).

4. 麥 mo < *mai

This has an unexpected yang departing tone. Is this a very late loan from a form like Mandarin 麥 mai which lost its final stop and entering tone? If it were a native word or an early loan, it should have an entering tone as a trace of its original *-k.

5. -o < *-raw

ko is ultimately from Old Chinese *kraw but it could be a loan from *kaw or *ko in some later language. I can't look further into this (or anything else BSC-related) because the Xiaoxuetang site is down.

6. -o < (*-wa? <) *-ra

There are three forms in this category: 灑下拏. 咬 from category 5 may also belong to this category, as pre-BSC or a source language may have been like Taiwanese which has -a from both *-ra and *-raw:

咬 T ka (kau is a borrowing)

灑 T sa (borrowed?; se may be native; cf. 下 below)

下 T literary (i.e., borrowed) ha; the colloquial (i.e., native) form is e

拏 T na (borrowed?; displaced a native *ne?)

"Like" does not entail a close relationship. Taiwanese is both genealogically and geographically distant from BSC.

7. -o < *-ə(ŋ)

Is 扔 no from Old Chinese *nəŋ or an open-syllable variant *nə (cf. Japanese no < *nə for 乃, the phonetic of 扔)?

8. Unrelated synonyms

I initially thought -o of 久 no might be a reflex of Old Chinese *-ə in 久 *kʷəʔ, but I doubt n- is from *kʷ-. Moreover, the yang departing tone of no is not what I would expect for a descendant of 久 *kʷəʔ. I conclude that no is an unrelated synonym.

萎 mo < *mat? has an initial and an entering tone (from an earlier final stop) that rule out any relationship with Old Chinese *Cɯ-ʔoj. Even if *C- were *m-, the tone would still be irregular.

no < *nat? has the same problems as 萎 mo; it cannot be related to EMC 拋 *phræw (the word is not attested in Old Chinese).

ŋo < *ŋat? is not related to Old Chinese 鷹 ʔəŋ.

ko superficially resembles Sino-Japanese but may be from *kat which cannot be related to EMC 硬 *ŋɤeŋʰ (the word is not attested in Old Chinese, and the SJ initial is irregular).

A DIP INTO WHITE WATERS (PART 3): FINAL ST-*AP-S

The chain shift I proposed in part 2 explains three-fifths of -o syllables in the 白水村 Baishuicun (BSC) 'White Water Village' dialect:

*-ai > *-oi > -o

Another fifth requires further sound laws:

*-ap merged with *-at (cf. the *-p > -t shift in Nanchang which is 500 km to the east and not closely related)

*-at > *-ait > *-oi(t) > -o

I propose that *-t had a fronting effect on *a similar to the fronting effect of -d in Tibetan:

'eight': Proto-Sino-Tibetan ?*prjat >

Old Chinese *pret > pre-BSC *pat > *pait > *poi(t) > BSC po

Earlier Tibetan brgyad > Lhasa cɛʔ

Here are more examples to show the merger of multiple *-t and *-p rhymes. The Early and Late Middle Chinese forms here are composites based on prestige dialects and are not directly ancestral to BSC.

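| Sinograph | Early Middle Chinese | Late Middle Chinese | Pre-BSC | BSC |
|---|---|---|---|---|
| | *ləp | *lap | *lap > *lat | lo |
| | *ɣɤap | *ɣæp | *xap > *xat | xo |
| | *kɤep | *kæp | *kap > *kat | ko |
| | *ɣwet | *xwiet | *xwat | fo |
| | *xɤat | *xæt | *xat | xo |
| | *pɤet | *pæt | *pat | po |
| | *puot | *fat | *fat | fo |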
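
The two sound laws above can be sketched as an ordered list of rewrites applied to pre-BSC forms. This is only a toy model under stated assumptions: the toneless romanized inputs, the `STAGES` list, and the `derive` helper are illustrative names that are not in the original, and *-oi(t) > -o is collapsed into a single step. The separate change *xw- > f- (as in the *xwat row) is not modeled.

```python
# Ordered rewrites for pre-BSC rhymes ending in a stop (hypothetical sketch).
STAGES = [
    ("ap", "at"),    # *-ap merges with *-at
    ("at", "ait"),   # *-t fronts the preceding *a
    ("ait", "oit"),  # *ai > *oi
    ("oit", "o"),    # loss of the final stop and offglide (collapsed step)
]


def derive(pre_bsc: str) -> list[str]:
    """Return every intermediate form from a pre-BSC syllable to BSC."""
    forms = [pre_bsc]
    for old, new in STAGES:
        current = forms[-1]
        if current.endswith(old):
            forms.append(current[: -len(old)] + new)
    return forms


if __name__ == "__main__":
    for syllable in ("lap", "xap", "kap", "xat", "pat", "fat"):
        print(" > ".join(derive(syllable)))
```

Because the stages are applied in sequence, a *-ap form passes through the same intermediate steps as an original *-at form, which is what the merger implies.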

Although BSC no longer has any final stops, they have conditioned tones absent from words without original final stops: high level if the initial consonant was voiceless and mid level if the initial consonant was voiced. I don't know when the final consonants were lost after *a shifted to *ai before *-t.

發 must be a loanword because it has *f- instead of the expected *p- (see my discussion of 煩 xoi and 佛 pu ~ fo in part 1). Foreign *f- might have been borrowed as *f- after BSC developed its own *f- from *xw- in words such as 穴. Conversely, it is also possible that *xw- became *f- after loanwords introduced *f- into the BSC phonemic inventory.

A DIP INTO WHITE WATERS (PART 2): A CH-OI-N SHIFT?

In part 1, I proposed the following sound change for the 白水村 Baishuicun (BSC) 'White Water Village' dialect:

*-an > *-oi

I now propose a larger chain shift:

*-an > *-ai > *-oi > -o

BSC -o often corresponds to *-e/*-aj-type rhymes in prestigious Early and Late Middle Chinese dialects which were not its ancestors (but might have been sources of loans into BSC). I do not include tones in pre-BSC and BSC forms. I have not heard BSC, but I assume that its -i is [j] after vowels, so there is no real difference between MC *-j and (pre-)BSC (*)-i.

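| Sinograph | Early Middle Chinese | Late Middle Chinese | Pre-BSC | BSC |
|---|---|---|---|---|
| | *najʰ | *nàj | *nai > *noi | no |
| | *nəjʰ | *nəj > *naj | *nai > *noi | no |
| | *buojʰ | *fàj | *pai > *poi | po |
| | *bɤajʰ | *bàj | *pai > *poi | po |
| | *kɤe | *kæj | *kai > *koi | ko |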
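
One way to picture the chain shift is as a single step that moves every rhyme one link along the chain at the same time. The sketch below is a hypothetical model, not a claim about the actual chronology: `CHAIN`, `STEP`, `split_rhyme`, and `advance` are illustrative names, and the assumption that all links moved in lockstep is not stated in the text. Under that assumption, two steps take pre-BSC *-ai to -o and *-an to -oi without merging them.

```python
# *-an > *-ai > *-oi > -o, modeled as a simultaneous one-link advance.
CHAIN = ["an", "ai", "oi", "o"]
STEP = dict(zip(CHAIN, CHAIN[1:]))  # {"an": "ai", "ai": "oi", "oi": "o"}


def split_rhyme(syllable: str) -> tuple[str, str]:
    """Split a toneless syllable into (onset, rhyme) if the rhyme is in the chain."""
    for rhyme in STEP:
        if syllable.endswith(rhyme):
            return syllable[: -len(rhyme)], rhyme
    return syllable, ""


def advance(syllable: str, steps: int = 1) -> str:
    """Move the rhyme `steps` links along the chain; other rhymes are untouched."""
    for _ in range(steps):
        onset, rhyme = split_rhyme(syllable)
        if rhyme in STEP:
            syllable = onset + STEP[rhyme]
    return syllable


if __name__ == "__main__":
    for pre_bsc in ("nai", "pai", "kai", "xan"):
        print(pre_bsc, ">", advance(pre_bsc, 1), ">", advance(pre_bsc, 2))
```

Each rhyme is looked up once per step rather than rewritten repeatedly, so *-an cannot run all the way down the chain in a single pass.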

My proposal accounts for 59% (58/99) of the -o forms in the 小學堂 Xiaoxuetang database. I will deal with the others in part 3.

BSC p- corresponding to LMC *f- (e.g., in 吠) indicates a native word. A hypothetical early loan of 吠 would be *xo and a hypothetical late loan would be *fo (see my discussion of 煩 xoi and 佛 pu ~ fo in part 1).

A DIP INTO WHITE WATERS (PART 1)

Over the past couple of days I have been intrigued by the dialect of 湘南土話 Xiangnan Tuhua 'local speech of southern Hunan' spoken in 白水村 Baishuicun 'White Water Village' in 江永縣 Jiangyong County. In " 'More' Evidence", I found that 更 Old Chinese (OC) *kraŋ(s) / Middle Chinese (MC) *kɤaŋ(ʰ) 'watch of the night'/'more' corresponded to Baishuicun (BSC) koi. I hypothesized that -oi was from an *-aɲ like that of Sino-Vietnamese canh/cánh [kaɲ]. Looking at other BSC -oi forms, I can make a more general statement: *A/O-type vowels followed by nonback nasals (*ɲ, *n, *m) became -oi. The nasals probably merged into *-n before becoming -i.

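| Sinograph | Early Middle Chinese | Late Middle Chinese | Pre-BSC | BSC |
|---|---|---|---|---|
| | *buan | *van | *xan | xoi |
| | *kən | *kən | *kon or *kan? | koi |
| | *tan | *tan | *CV-tan | loi |
| | *kwɤan | *kwæn | *kwan | koi |
| | *ʂɤen | *ʂæn | *san | soi |
| | *kɤaŋ | *kæŋ | *kaɲ | koi |
| | *ŋɤem | *ŋæm | *ŋam | ŋoi |
| | *ʂɤam | *ʂæm | *sam | soi |
| | *bon | *bon | *pon > *pan? | poi |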
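
The generalization (A/O-type vowel plus nonback nasal becomes -oi) can be checked mechanically against the pre-BSC column. The following is a minimal sketch, assuming toneless romanized input; `RULE` and `to_bsc` are hypothetical names, and the optional `w?` (dropping medial *-w- as in *kwan > koi) is an assumption read off the table rather than a stated law. The lenition in *CV-tan > loi is left out.

```python
import re

NONBACK_NASALS = "ɲnm"  # *ŋ is deliberately excluded
A_O_VOWELS = "ao"

# Optional medial w, then an A/O-type vowel, then a nonback nasal at syllable end.
RULE = re.compile(rf"w?[{A_O_VOWELS}][{NONBACK_NASALS}]$")


def to_bsc(pre_bsc: str) -> str:
    """Apply the hypothesized shift to one pre-BSC syllable (leading * is ignored)."""
    return RULE.sub("oi", pre_bsc.lstrip("*"))


if __name__ == "__main__":
    for form in ("*xan", "*kon", "*kwan", "*san", "*kaɲ", "*ŋam", "*sam", "*pan", "*kaŋ"):
        print(form, ">", to_bsc(form))
```

A form ending in the velar nasal, such as *kaŋ, passes through unchanged, since only the three nonback nasals trigger the rule.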

Pre-BSC is a very rough guess bridging Late Middle Chinese (LMC)* and BSC. It looks more like a typical Chinese language than BSC does.

xoi is probably a loanword. I think the native BSC reflex of EMC *b- is p-: e.g., 佛 pu 'Buddha' (a later borrowed form is fo). BSC is not descended from generic LMC dialects which lenited labial stops to fricatives before *u. The borrowing of 煩 must predate the shift of *-an to *-oi and the borrowing of *f-. *x- was the closest pre-BSC equivalent of foreign *f-. I conclude there are at least two layers of borrowings that can be distinguished by their treatment of foreign labiodental fricatives: an older x-layer and a newer f-layer.

I think the l- of 單 is due to BSC-internal lenition.

The above scheme accounts for most but not all instances of -oi in BSC. Two requiring further investigation are

吾 OC *ŋa > EMC, LMC *ŋo : BSC ŋoi

崖 OC *ŋre > EMC *ŋɤe > LMC *ŋæj : BSC ŋoi

Neither had nasals in earlier Chinese. The normal BSC reflexes of OC *-a and *-re are -u and -o. 崖 ŋoi could be a borrowing from a dialect that had broken *-ɤe to *-æj or the like. But the -i in 吾 ŋoi remains a mystery.

The above scheme cannot account for cases in which -oi did not develop from *-an: e.g., 肝 BSC kaŋ (not *koi!) < OC/MC *kan. It seems that velars somehow blocked the *-an to -oi shift.

Next: A ch-oi-n shift?

*9.11.23:03: The LMC reconstruction here is a composite of the prestige dialects underlying borrowed forms in Chinese and non-Chinese languages. It is not a direct ancestor of pre-BSC, though such an ancestor may have been similar and may have borrowed from an LMC prestige dialect.

'SECONDAR-Y' ROUNDING IN CANTONESE

The unexpected labiovelar /kʷ/ in Cantonese 梗 /kʷaːŋ˧˥/ 'stem' (among many other meanings) from my last post brought to mind a Cantonese form that has puzzled me for many years: 乙 /jyːt˧/ 'second Heavenly Stem'. I used to think its rounded vowel /yː/ was unique, as it wasn't in any reconstruction or actual form that I had ever seen: e.g.,

Old Chinese *ʔi̯ɛt (Karlgren), *qrig (Zhengzhang), my *ʔrət or *ʔrit (I cannot find any rhyming evidence favoring one vowel over the other*)

Middle Chinese: *ʔi̯ĕt (Karlgren), *ʔɣiɪt (Zhengzhang), my *ʔɨit

Mandarin yi

Taiwanese it

Sino-Vietnamese ất [ʔət]

Sino-Korean ŭl [ɯl] < idealized ʔɯ́rʔ

Sino-Japanese otsu < *ət

However, I now see that rounded vowels are not only in Yue varieties like Cantonese but also a few southern non-Yue varieties:

Yue: too many /y/-varieties to list; other rounded vowels (or glides?) are in

Xintian Fantian, Dapu Taiheng jɵk

Kaiping (Chikan) zuat

Taishan (Taicheng) zᵘɔt (? - there is no syllable like this in Stephen Li's Taishan syllabary)

Mengshan iut

Huaiji wut

Dongguan (Guancheng) zøt

Bao'an (Shajing) (j)iɔʔ

Ping: Nanning yt (loan from Cantonese?)

Hakka: Huizhou yat


Zhongshan (Gong'an) iuə

Shaozhou Tuhua:

Xingzi ɵy

西岸 Xi'an oi

Bao'an u

All the non-Yue varieties are within the Yue area, so their rounding may be due to Yue influence.

If rounding is a Yue innovation, why did it happen? Both 梗 and 乙 had medial *-r- in Old Chinese. Did that *-r- sporadically become *-w-? (Cf. Elmer Fudd's "wascally wabbit".) Are there other Old Chinese *-r- words with modern labial reflexes?

I would expect kw-reflexes of Old Chinese 甲 *qrap 'first Heavenly Stem', but the only remotely similar forms are

Yixian kɔɐ̆ʔ

Guilin (Chaoyang) kuo

Jiangyong Chengguan (Baishuicun) kuə

whose diphthongs might be breakings of an *o from *a (cf. o-forms like Lingchuan (Tanxia) ko). None of those three varieties have rounded vowels in 乙, though Baishuicun does have a rounded vowel in 梗.

*My Old Chinese *ʔrət and *ʔrit could both become Middle Chinese *ʔɨit:

OC *r-vocalization -breaking monophthongization
*-ət *-ɨət *-ɨt (> *-ut after labials)
*-rət *-ɨət *-ɨit
*-rit *-ɨit

I have included two other rhymes for comparison.

There was a chain shift:

*-ət > *-ɨət > *-ɨit

That could be interpreted as a push or pull chain:

Push: When *-ət broke to *-ɨət, it 'pushed' original *-ɨət into the 'space' of *-ɨit.

Pull: When original *-ɨət merged with *-ɨit, it left a gap to be filled by *-ət after it broke to *-ɨət.

I generally prefer pull chains, but mixed reflexes of *-ət and *-rət might point to a push chain.

'MORE' EVIDENCE FOR THE LIMITS OF THE MIDDLE CHINESE LEXICOGRAPHICAL TRADITION

Last night I saw this passage in the Wikipedia article on Cantonese phonology:

There are about 630 sounds [i.e., syllables disregarding tones?] in the Cantonese syllabary. Some of these, such as /ɛː˨/ and /ei˨/ (欸), /pʊŋ˨/ (埲), /kʷɪŋ˥/ (扃) are not common any more; some such as /kʷɪk˥/ and /kʷʰɪk˥/ (隙), or /kʷaːŋ˧˥/ and /kɐŋ˧˥/ (梗) which has traditionally had two equally correct pronunciations are beginning to be pronounced with only one particular way uniformly by its speakers (and this usually happens because the unused pronunciation is almost unique to that word alone) thus making the unused sounds effectively disappear from the language [...]

At first I was puzzled by 梗 /kʷaːŋ˧˥/ 'stem' (among many other meanings) which has a labiovelar initial even though it is written with a velar phonetic 更 'watch of the night'/'more'. I have never seen an Old Chinese reconstruction of 梗 with a labiovelar or labial. 梗 had no *-w- according to the Middle Chinese lexicographical tradition based on prestige varieties. But not all modern forms arise from those varieties. Forms in multiple branches of Chinese (written here without tones) may point to *-w-:

Yunhe kuɛ (see here for more Wu forms with -u-)

Nanchang ku (see here for more Gan forms with -u-; is Leping mu a typo for kuaŋ?)

Fuzhou literary (!) ku ~ colloquial keiŋ (see here for more Min forms with -u-)

Lechang kuɐn (see here for more Yue forms with -u-)

Lingui yɛn (the only Ping form with a labial; the aspiration is irregular and can be sporadically found in other Ping varieties and in Yue, Hakka and even Mandarin)

Meixian literary (!) ku ~ colloquial kɛn (see here for more Hakka forms with -u-)

Fengyang kua (see here for more Shaozhou Tuhua forms with -u-)

The -u- and -y- of those forms cannot be derived from Middle Chinese reconstructions for 梗 such as my *kɤaŋˀ or Old Chinese reconstructions derived in turn from those reconstructions: e.g., my *kraŋʔ. (I reconstruct its phonetic 更 as Middle Chinese *kɤaŋ(ʰ) from Old Chinese *kraŋ(s).)

梗 has no labials in Mandarin, Jin, or Xiang. Was labiality lost in the north, or is it a common retention of southern languages that do not form a subgroup? Fuzhou, Meixian, and perhaps other varieties may have borrowed from one or more southern literary Middle Chinese dialects with a labial absent from other prestige dialects.

For comparison, 更 does not have a labial with a few exceptions:

Shaxian and Sanming kɔ̃ (< *kaŋ?; Sanming also has kɛ̃; other Min forms here)

Yangshuo kyɛ̃ (but Lingui kəŋ; other Ping forms here)

Hezhou kɔ (< *kaŋ?)

Jiangyong Chengguan (Baishuicun) koi (< *kaɲ?; cf. Sino-Vietnamese canh 'watch of the night' ~ cánh 'more' [kaɲ])

The labials of most of these forms do not necessarily point to *-w-. The shifts I propose for Shaxian, Sanming, and Hezhou have parallels in northwestern Middle Chinese (in which *-aŋ became *-o; a similar shift occurred in neighboring Tangut and its relative Japhug rGyalrong).

*PI̵K A CODA

Here's something I don't see every day: a Chinese character (逼) whose readings have three different types of codas:

Velar/glottal: Cantonese bik [pɪk], Suzhou ʔ (source)

Alveolar/dental: Sino-Japanese hitsu < *pit

Labial: Sino-Korean phip

Its Middle Chinese reading was *pɨk. I can't explain this diversity.

9.8.0:48: There are Chinese readings with -t as well, but they are regular reflexes of *-k after front vowels: e.g.,

Toisanese pet < *pek (source)

Meixian Hakka pit < *pik (source)

That is not the case with Sino-Japanese hitsu with a -tsu instead of the expected -ki. There was no such fronting rule in Japanese or in the Chinese source dialects of Sino-Japanese.

Conversely, 匹 Middle Chinese *phit has two Sino-Japanese readings: hiki as well as the expected hitsu < *pit.

Sino-Korean is full of irregularly aspirated labial initials. In fact there is no *pa in Sino-Korean; all syllables that should be *pa are pha: e.g., 波 pha < Middle Chinese *pa. I have long wondered if this was the product of hypercorrection. (Korean never had f, so Chinese *f- was Koreanized as the stops p- and ph-, and words without *f- in Chinese such as 波 might have been read as if they had *f-.)

The idealized Sino-Korean readings of Tongguk chŏngun (1448) lack this excess aspiration of labials. (I would expect the Tongguk chŏngun reading of 逼 to be *pík, but I can't find 逼 in that dictionary. Although Martin 1992: 126 listed pík in a table of Tongguk chŏngun readings, I don't see it in the book itself.)

On the other hand, Sino-Korean is almost completely lacking in kh-readings, though Tongguk chŏngun has them where they are expected: e.g.,

可 Sino-Korean ka, Tongguk chŏngun khǎ < Middle Chinese *khaˀ

This may tell us something about the chronology of the development of Korean aspirates which are either borrowed or secondary.

THE M-ISSING COMMISSAR

Today I read about Genrikh Lyushkov (1900-1945?), whose title brought to mind a question that I've had for a long time: why was German Kommissar borrowed into Russian as комиссар komissar? Why is an m 'missing' from that word and команда komanda 'team' (< French commande)? Was there a rule to simplify sequences of identical consonants at prefix-root boundaries in spellings of loans?

Latin com-missarius > R komissar

Latin com-mendare > R komanda

(The Latin forms are for root identification only and do not necessarily match the later forms' parts of speech, etc.)

But what about

Latin com-mercium > F commerçant > R коммерсант kommersant (not *комерсант komersant) 'merchant'

and Latin com-mutator > R коммутатор kommutator (not *комутатор komutator) 'switchboard'?

That rule obviously does not apply to native words with secondary sequences resulting from syncope: e.g., введение vvedenie 'introduction' < въведеніе vŭ-vedenie 'in-leading' ≠ ведение vedenie 'leadership'.

Sequences of identical consonants within a root word remained intact: e.g., the -ss- of missarius and the -mm- of communis (hence R коммунизм kommunizm 'Communism').

Elsewhere in East Slavic, although both Belarusian and Ukrainian have phonemic gemination, all of the above loanwords (presumably borrowed from Russian) lack geminates:

Gloss Russian Belarusian Ukrainian
commissar комиссар
team команда
merchant коммерсант
switchboard коммутатор
Communism коммунизм

R комитет komitet / B камітэт kamitet / U комітет komitet does not fall into this category since its French source comité (< English committee) already lacked the double consonants of Latin committere.

Finnish komissaari 'commissioner' looks like a borrowing from Russian komissar.

Why does Serbo-Croatian комесар komesar have an -e- instead of an -i-?

9.6.20:48: And why did Old Latin comoine(m) become Latin communis 'common' with -mm-?

LIANMA, LINGMO, LINYIN

Most Chinese character spellings of foreign place names in Japanese can be explained in terms of Chinese and/or Japanese readings.

One baffling exception is 布哇 Hawai 'Hawaii' which makes no obvious sense in either Chinese or Japanese. I have written about it thrice (2008, 2010, 2012). I can't think of a better explanation than what I proposed in 2012.

I discovered what initially appeared to be another exception tonight in Yamamoto (2009: 81): 嗹馬 Denmāku 'Denmark' which would be read as Lianma in Mandarin and as *Renba in normal Sino-Japanese. I didn't think it could have been created by a Japanese speaker because Japanese not only has [d] but also has characters pronounced [den]. Mandarin, on the other hand, has no [d] (what is romanized d is actually an unaspirated [t]). So was voiced l intended to be a substitute for voiced d? Apparently it was, as it and similar Chinese names for Denmark turn up in the 1852 edition of the 海國圖志 Illustrated Treatise on the Maritime Kingdoms by 魏源 Wei Yuan:

嗹國 Lianguo (guo is 'country')

領墨 Lingmo (-ng would be an acceptable substitute for -n to a speaker of a Chinese variety like Shanghai without an -n : -ng distinction)

吝因 Linyin (I have no idea what -yin is doing)

9.6.0:36: The Japanese may have taken the spelling 嗹馬 from the Treatise given its influence in Japan:

Wei's work was also to have a later impact on Japanese foreign policy. In 1862, samurai Takasugi Shinsaku, from the ruling Japanese Tokugawa shogunate, visited Shanghai on board the trade ship Senzaimaru. Japan had been forced open by US Commodore Matthew C. Perry less than a decade earlier and the purpose of the mission was to establish how China had fared following the country's defeat in the Second Opium War (1856–1860). Takasugi was aware of the forward thinking exhibited by those such as Wei on the new threats posed by Western "barbarians" [...] Sinologist Joshua Fogel concludes that when Takasugi found out "that the writings of Wei Yuan were out of print in China and that the Chinese were not forcefully preparing to drive the foreigners out of their country, rather than derive from this a long analysis of the failures of the Chinese people, he extracted lessons for the future of Japan". Similarly, after reading the Treatise, scholar and political reformer Yokoi Shōnan became convinced that Japan should embark on a "cautious, gradual and realistic opening of its borders to the Western world" and thereby avoid the mistake China had made in engaging in the First Opium War. Takasugi would later emerge as a leader of the 1868 Meiji Restoration which presaged the emergence of Japan as a modernised nation at the beginning of the 20th century. Yoshida Shōin, influential Japanese intellectual and Meiji reformer, said Wei's Treatise had "made a big impact in our country".

*BAKUSHIKO AND *BAKUSHIK(W)A

I almost added this to my last post, but I ran out of time, and the topic is somewhat different, n  ...

I have long been puzzled by Chinese character spellings of foreign place names in Japanese. Some seem to be hybrids of Chinese and Japanese readings.

For instance, when I Googled for モスコウ Mosukou and 1939 last night, I found 北京より 莫斯古へ Pekin yori Mosukou e, 高山洋吉 Takayama Yōkichi's 1939 translation of Sven Hedin's Von Peking nach Moskau. Although the Kobe University City Library has it catalogued as Pekin yori Mosukuwa e (with the currently dominant Japanese name of the city - the most likely term to be searched), I assume 莫斯古 was meant to be read as Mosukou since the name appears as モスコウ Mosukou in the title of chapter 11. 莫斯古 Mosukou would be read as *Mosigu [mwɔ sz̩ ku] in Mandarin and *Bakushiko in normal Sino-Japanese. Whoever created that spelling seemed to be thinking of Mandarin 莫斯 [mwɔ sz̩] followed by Sino-Japanese 古 ko. (There is no [mɔ] or [mo] in standard Mandarin.)

I used to think another Japanese spelling 莫斯科 was a direct loan from Mandarin Mosike [mwɔ sz̩ kʰɤ] (ke was once [kʰɔ] and is still [kʰɔ] or the like in many other varieties of Chinese today). But could it be a blend of Mandarin 莫斯 [mwɔ sz̩] followed by Sino-Japanese 科 kwa (pronounced [ka])? Was 莫斯科 first attested in Chinese or Japanese? In any case, it cannot be based on its hypothetical normal Sino-Japanese reading *Bakushikwa.

Tonight I discovered a third Japanese spelling 莫斯哥 in Yamamoto (2009: 78). This looks like a direct loan from the less common Mandarin name Mosige [mwɔ sz̩ kɤ] (ge was once [kɔ] and is still [kɔ] or the like in many other varieties of Chinese today). In normal Sino-Japanese, it would be read *Bakushika which sounds nothing like Moskva or Moscow.

In the Kobe University Library Newspaper Clippings Collection, 莫斯科 appears 815 times between 1912 and 1941, 莫斯哥 appears only once in 1916, and 莫斯古 does not appear at all. The three katakana spellings combined outnumber the kanji spellings by nearly two to one (1564 : 816).

MOSUKUWA VS. MOSUKŌ (AND MOSUKOU)

While looking up ワルシャワ Warushawa and ワルソー Warusō in various editions of Kenkyusha's New Japanese-English Dictionary, I noticed that their distribution paralleled that of モスクワ Mosukuwa (< Russian Moskva) and モスコー Mosukō (< English Moscow). Was there a shift toward more Slavic-flavored Japanizations by 1974?

Here is the distribution of both terms and a third term in the Kobe University Library Newspaper Clippings Collection:

Mosukuwa: 866 results, 1912-1942

Mosukō: 568 results, 1912-1942

モスコウ Mosukou: 130 results, 1915-1936

I was expecting Mosukuwa to be less common than Mosukō by analogy with Warushawa and Warusō, but the reverse is true. I wish I had postwar statistics. Here are current Google statistics showing the gaps between the three have widened considerably:

Mosukuwa: 1.36 million

Mosukō: 119,000 results including non-Russian Moscows (cf. the use of Warusō for non-Polish Warsaws)

Mosukou: 10,600 results including モスコウイッツ Mosukowittsu 'Moskowitz'
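
To make the widening gap concrete, the counts above can be turned into percentage shares. This is a small illustrative calculation: the dictionary names and the `shares` helper are hypothetical, and the Google figures are only the rounded hit counts quoted above, not exact data.

```python
# Counts quoted in this post (newspaper clippings vs. Google hits).
clippings = {"Mosukuwa": 866, "Mosukō": 568, "Mosukou": 130}
google = {"Mosukuwa": 1_360_000, "Mosukō": 119_000, "Mosukou": 10_600}


def shares(counts: dict[str, int]) -> dict[str, float]:
    """Return each spelling's share of the total, as a percentage."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


if __name__ == "__main__":
    print("clippings total:", sum(clippings.values()))
    print("clippings shares:", shares(clippings))
    print("google shares:", shares(google))
```

The clippings total is the 1564 katakana attestations compared against the kanji spellings elsewhere on this page.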

I've never heard Moscow rhyme with Mexico. Is that pronunciation still current, and if so, where?

WARUSHAWA VS. WARUSŌ

The influence of English on Japanese has only grown over time, while the influence of other European languages has waned: e.g., German-based dēdētē 'DDT' (in this dictionary of extinct Japanese words) has been replaced by English-based dīdītī. (What was the last major German or French loanword in Japanese?) So when I see a continental European loanword, I assume it is pre-1945: e.g., ワルシャワ Warushawa 'Warsaw' which sounds like Polish Warszawa (though Polish w is [v]).

That was why I was surprised to see an English-like ワルソー Warusō for 'Warsaw' in the September 2, 1939, Asahi shinbun. (Yes, the 75th anniversary of the beginning of WWII is still on my mind.) How far back do Warushawa and Warusō go? I wish Google Ngram Viewer worked with Japanese.

I quickly found various attestations of Warusō from the period:

- the September 18, 1939, entry of the diary of 馬淵良三 Mabuchi Ryōzō

- the October 6, 1939, 大陸日報 Continental Daily News published in Vancouver, BC

- 宮本百合子 Miyamoto Yuriko, "The Flames of the Life of Mrs. Curie" (December 1939)

- the Privy Council's "Abolishing an Imperial Embassy in Poland" (October 1, 1941)

Was Warusō the standard Japanese name for Warsaw at the time? Judging from Wikipedia, today it seems to linger only in a few contexts such as ワルソー条約 Warusō jōyaku 'Warsaw Convention' (1929; cf. ワルシャワ条約 Warushawa jōyaku 'Warsaw Pact' with the same jōyaku) and ワルソー・コンチェルト Warusō koncheruto 'Warsaw Concerto' (1941). The Japanese Wikipedia entry for Warsaw doesn't mention Warusō as an alternative of Warushawa. Looking in various editions of Kenkyusha's New Japanese-English Dictionary, I discovered that editions prior to 1974 only listed Warusō. The 1974 edition listed both Warusō and Warushawa for the first time.

The 1975 edition of Sanseido's New Concise Japanese-English Dictionary that I have been using for over thirty years lists only Warushawa in its appendix of place names.

It would be interesting to see when, say, Asahi shinbun shifted from Warusō to Warushawa. A couple more data points: I just found Warusō in the December 20, 1919 Ōsaka asahi shinbun (image / HTML) and Richard Austin Freeman's The Case of Oscar Brodski, translated into Japanese by 妹尾韶夫 Seno Akio in 1957.

Okay, a few more: Warushawa first appears in the Kobe University Library Newspaper Clippings Collection in the May 30, 1913, Jiji shinpō (image / HTML), and appears in papers up through 1939 (image / HTML). Warusō first appears in that collection in 1915 (image / HTML), and last appears in 1941 (image / HTML). Warusō outnumbers Warushawa by a ratio of roughly seven to one (138 : 21). Today in Google, Warusō is vastly outnumbered by Warushawa (20,300 : 515,000). Warusō is the only Japanization of Warsaws outside Poland, but I presume those other Warsaws aren't mentioned enough to give Warushawa serious competition.

GYDDANYZC

I have long been interested in Slavic partly because it underwent massive vowel loss paralleling the massive vowel losses I reconstruct for Chinese and Tangut.

My favorite example is monosyllabic Gdańsk from *Gŭdanĭskŭ* with four syllables (Comrie 1987: 326). That city has been on my mind lately because today is the 75th anniversary of the German invasion of Poland.

On Saturday I learned that Gdańsk is first attested as Gyddanyzc sometime after 997 AD. Is that spelling evidence for

- the retention of medial and

- the loss of final

circa 1000 AD?

Why were the vowels that were later lost both spelled y? Had they merged into [ɨ] in the version of the name that was transcribed? They could not have merged in the ancestor of the modern name Gdańsk, as ń is from *nĭ and still reflects the palatal quality of the lost vowel *ĭ. Or is ny a transcription of [ɲ]?

Is the doubling of d significant?

Why was the sibilant before c [k] written as z instead of s? The z also appears in the later spellings Kdanzk (1148), Gdanzc (1188), and Danzc (1263). I assume the z of the spellings Danczk (1311), Danczik (1399), and Danczig (1414) is half of a digraph cz [tʂ] and is not evidence for [z].

*This matches Old Church Slavonic Гъданьскъ <Gŭdanĭskŭ>. Is that form of the name attested in ancient texts, or is it a retroactive creation?

I assume the ISO 639-1 code cu for OCS is from c(h)u(rch). cu makes me think of Cuman which has no ISO 639-1 code; its three-letter ISO 639-3 code is qwm. qum and cum were already taken for Sipakapense in Guatemala and Cumeral in Colombia.

EAT-YMOLOGY 3: *NZ- > *NDZ-?

I just realized that 'eat' from my last post wasn't the best example of a word that had undergone brightening without lenition. What if lenition were followed by fortition (in bold) after a nasal?

stem 1: *NI-dza > *NI-dzja > *NI-z- > *Nz- > *ndz- > dzi 1.11

stem 2: *NI-dza-w > *NI-dzjaw > *NI-z- > *Nz- > *ndz- > dzio 1.51

Perhaps these are better examples of brightening without lenition:

Tangut 0749 phi 1.11 'to order' (stem 1), 4568 phio 2.44 'to order' (stem 2) : Japhug kɤ-ɣɤ-xpra 'to order'

also cf. Somang ka-wa-kprá 'to order' preserving Proto-rGyalrong *kpr-

Pre-Tangut *CI-Kpra(-w-H) > *CI-Kprja(w-H) > *Kpr- > phi(o) 1.11/2.44

or Pre-Tangut *KI-pra(-w-H) > *KI-prja(w-H) > *Kpr- > phi(o) 1.11/2.44

Tangut 5449 1tị 'to put' 1.67 (stem 1), 5633 1tiọ 'to put' 1.72 (stem 2) : Japhug kɤ-ta 'to put'

Pre-Tangut *CI-S-ta(-w) > *CI-Stja(w) > *tt- > tị ~ tiọ 1.67/1.72

or Pre-Tangut *SI-ta(-w) > *SI-tja(w) > *tt- > tị ~ tiọ 1.67/1.72

If lenition had preceded brightening, ph- and t- would have lenited to *v- and *l- in those words.

I do not know if the *I of the brightening presyllable followed *K- that conditioned the aspiration of ph- and/or the *S- that conditioned the tension of rhymes 1.67 and 1.72 (indicated by a subscript dot). *I could have been in a presyllable preceding one or both of those consonants.

Pre-Tangut *K(I-)pr- nicely matches Proto-rGyalrong *kpr-, but pre-Tangut *S(I)-t- does not match Proto-rGyalrong *t-. Perhaps the aspirated th- of Written Burmese thāḥ 'to put' is from *St-.

Old Chinese 置 *trək-s 'to place' may be an unrelated lookalike even if it is from *r-tək-s, as it has a *-k absent in the other languages.

I cannot explain why stem 2 of 'to order' has a second ('rising') tone from an *-H absent in stem 1.

EAT-YMOLOGY 2: BRIGHTENING BEFORE LENITION?

The second item in Guillaume Jacques' 2006 list of Tangut-Japhug rGyalrong comparisons is

Tangut 5113 1wji 'to do' (stem 1), 3621 1wjo 'to do' (stem 2) : Japhug kɤ-pa 'to close'

also cf. other rGyalrong forms: e.g., Somang ka-pa 'to do'

(more in #1133-1135 at Nagano and Prins' database)

See my first "Eat-ymology" post for an explanation of stems 1 and 2.

That post was about a parallel pair of stems:

Tangut 4517 1dzji 'to eat' (stem 1), 4547 1dzjo 'to eat' (stem 2) : Japhug kɤ-ndza 'to eat'

The parallelism is not as apparent if one looks at rhyme numbers and/or my reconstructions:

Stem 1 2
'to do' vɨi 1.10 vɨo 1.51
'to eat' dzi 1.11 dzio 1.51

Guillaume uses Gong Hwang-cherng's reconstruction in which rhymes 1.10 and 1.11 are both -ji. I reconstruct them differently. I also reconstruct different allophones of 1.51 after different initials. The class II initial v- is followed by Grade III -ɨ- but not Grade IV -i-. It was somehow antipalatal in a way that medial -w- is not. Gong reconstructed w in both initial and medial position and did not reconstruct a Grade IV distinct from Grade III.

I can see why Gong reconstructed rhymes in those two grades identically. There are a few minimal pairs involving them: e.g.,


0932 ʔɨi 1.10 'many' (only with that meaning in dictionaries?) : 3119 ʔi 1.11 'many'

Those pairs were probably the reason why the Tangut split rhymes (e.g., into 1.10 -ɨi and 1.11 -i). When no such pairs were present, the Grade III/IV distinction was subphonemic, and there was no split. Hence there was only one rhyme 1.51 -ɨo/-io. The frontness of the first half of the diphthong was predictable.

I think the Grade III/IV distinction was absent from pre-Tangut. I reconstruct the pre-Tangut sources of 'to do' as

stem 1: *CI-pa > *CI-pja > *CI-β- > *vi > vɨi 1.10

stem 2: *CI-pa-w > *CI-pjaw > *CI-βj- > *vjo > vɨo 1.51

I do not know if *-ja and *-jaw became *-i and *-jo before or after lenition. My guess is that such shifts predated the loss of stop codas that might have blocked the raising of *-ja to *-i:

Brightening stage 1 *-ja *-jaw *-jaC
Brightening stage 2 *-i *-jo *-jaC
Final coda loss *-i *-jo *-ja
GV > VV diphthong reanalysis -i -io -ia

Some of the changes above could be viewed in terms of a drag chain:

*-jaC > *-ja > *-i
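The relative chronology in the table above can be sketched as ordered rewrite rules. This is only an illustration under stated assumptions: the ASCII notation (with "C" for any stop coda), the `STAGES` list, and the `derive` function are hypothetical names of this sketch, not part of any published formalism.

```python
import re

# Ordered rhyme changes, following the rows of the table above.
# Each stage holds (pattern, replacement) pairs; the first pair that
# matches is applied, and "C" stands for any stop coda.
STAGES = [
    ("brightening stage 2", [(r"jaw$", "jo"), (r"ja$", "i")]),
    ("final coda loss", [(r"jaC$", "ja")]),
    ("GV > VV reanalysis", [(r"j(?=[ao])", "i")]),
]


def derive(rhyme):
    """Return the forms a stage-1 rhyme passes through, one per table row."""
    forms = [rhyme]
    for _name, rules in STAGES:
        current = forms[-1]
        for pattern, replacement in rules:
            changed = re.sub(pattern, replacement, current)
            if changed != current:
                current = changed
                break
        forms.append(current)
    return forms


for start in ("ja", "jaw", "jaC"):
    print(" > ".join(derive(start)))
```

Because coda loss is ordered after brightening stage 2, the *-ja that results from *-jaC is no longer raised to *-i, which is the point of the guess about stop codas blocking the raising.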

Tangut syllables with lenited initials must have once had presyllables conditioning intervocalic lenition:

*presyllable + labial > *β- > v-
*presyllable + dental > l-

*presyllable + alveolar > *z- > ɮ-

*presyllable + palatal > - > ʐ-

*presyllable + velar > ɣ-

Brightening must have preceded lenition because there are words such as 'to eat' with brightening but without lenition. 'To eat' must have lost its brightening presyllable before 'to do':

Gloss 'to eat' 'to do'
Pre-Tangut *NI-dza *CI-pa
Brightening *NI-dzja *CI-pja
Presyllable to prenasalization *Ndzja *CI-pja
Lenition *Ndzja *CI-β-
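The ordering in the table can be checked mechanically. The following is a minimal sketch, not a definitive model: words are represented as (presyllable, initial, rhyme) triples, and the function names, the two-entry `LENITED` map, and the `run` helper are assumptions introduced here for illustration. Unlike the table, the sketch keeps the rhyme after the lenited initial.

```python
# Only the two initials discussed in this post are covered.
LENITED = {"p": "β", "dz": "ɮ"}


def brighten(word):
    pre, initial, rhyme = word
    # *a > *ja only while a high-vowel presyllable (written with I) survives.
    if pre.endswith("I") and rhyme.startswith("a"):
        rhyme = "j" + rhyme
    return (pre, initial, rhyme)


def prenasalize(word):
    pre, initial, rhyme = word
    # *NI- fuses with the initial as prenasalization; other presyllables stay.
    if pre == "NI":
        return ("", "N" + initial, rhyme)
    return word


def lenite(word):
    pre, initial, rhyme = word
    # Lenition is intervocalic, so it needs a surviving presyllable.
    if pre and initial in LENITED:
        initial = LENITED[initial]
    return (pre, initial, rhyme)


def run(word, steps):
    for step in steps:
        word = step(word)
    return word


EAT = ("NI", "dz", "a")
DO = ("CI", "p", "a")

table_order = [brighten, prenasalize, lenite]
lenition_first = [lenite, brighten]  # counterfactual; later steps omitted

print(run(EAT, table_order))     # initial stays an unlenited affricate
print(run(DO, table_order))      # initial lenites to β
print(run(EAT, lenition_first))  # initial would have lenited to ɮ
```

With the table's order, 'to eat' escapes lenition because its presyllable is gone by the time lenition applies; with lenition first, its initial would lenite.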

If brightening had followed lenition, 'to eat' should have been *ɮi 1.11 (stem 1) / *ɮio 1.51 (stem 2) with *ɮ- from a lenited *-dz-.

EAT-YMOLOGY

Guillaume Jacques' 2006 list of Tangut-Japhug rGyalrong comparisons begins with

Tangut 4517 1dzji 'to eat' (stem 1), 4547 1dzjo 'to eat' (stem 2)  : Japhug kɤ-ndza 'to eat'

Guillaume used Gong's 1997 reconstruction of Tangut.

Those Tangut words in my reconstruction are 1dzi (without -j-) and 1dzio (which could be rewritten as dzjo).

Stem 2 (in bold below) was used before the first and second person singular suffixes when the object is in the third person. Otherwise stem 1 was used:

Subject \ object of 'eat' ... me ... us ... thee ... you ... him/her/it/them
I ... (no 'I eat me', etc.)
We ...
Thou ... (no 'You eat you', etc.)
You ...
He/she/it/they ...

The seventeen slots in that table have only six forms. I list my reconstructions when they differ from Gong's.

1. bare stem 1 (3rd person subject and object)

2. stem 1 + 2ŋa (first person singular object)

3. stem 1 + 2nja (= my 2nia; second person singular object)

4. stem 1 + 2nji (= my 2ni; nonthird person plural subject and/or object)

5. stem 2 + 2ŋa (first person singular subject + 3rd person object)

6. stem 2 + 2nja (= my 2nia; second person singular subject + 3rd person object)
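The six forms and seventeen slots can be enumerated with a short sketch. The person labels and the `is_slot` and `form` functions are hypothetical names of this sketch. The list above does not say which suffix wins when a plural subject meets a singular nonthird object (e.g., 'we eat thee'); the sketch assumes the singular object suffix wins, and that is purely an assumption made here, not a claim from the list.

```python
PERSONS = ["1sg", "1pl", "2sg", "2pl", "3"]


def is_slot(subject, obj):
    # Exclude 'I eat me/us', 'you eat you', etc.: same nonthird person twice.
    return not (subject != "3" and obj != "3" and subject[0] == obj[0])


def form(subject, obj):
    """Return the verb form for a slot, or None for an excluded combination."""
    if not is_slot(subject, obj):
        return None
    if subject == "3" and obj == "3":
        return "stem 1"
    # ASSUMPTION: a singular nonthird object outranks a plural subject.
    if obj == "1sg":
        return "stem 1 + 2ŋa"
    if obj == "2sg":
        return "stem 1 + 2nja"
    if obj == "3" and subject == "1sg":
        return "stem 2 + 2ŋa"
    if obj == "3" and subject == "2sg":
        return "stem 2 + 2nja"
    # What remains has a nonthird plural subject and/or object.
    return "stem 1 + 2nji"


SLOTS = [(s, o) for s in PERSONS for o in PERSONS if is_slot(s, o)]
FORMS = {form(s, o) for s, o in SLOTS}
print(len(SLOTS), len(FORMS))
```

Under that assumption the sketch yields seventeen slots filled by six distinct forms, matching the counts given above.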

Reconstructing the history of the Tangut and Japhug words for 'to eat' involves dealing with issues 2 and 3 from my last post.

Cognates of the Tangut word such as Japhug kɤ-ndza, Written Tibetan za-ba, and Written Burmese cā generally have a. The high front vowel of Tangut -ji is assumed to be the product of 'brightening' (Matisoff 2004).

In 2009, Guillaume derived -ji in 'eat' from *-ja. I don't know whether he still does in his new book. This is phonetically plausible. However, it raises the question of where the *-j- in *-ja came from. How far back can it be projected? Did languages such as Japhug, Tibetan, and Burmese lose it? Or is it a Tangut-internal innovation?

Gong (1994: 42) thought Old Chinese and Tangut retained Proto-Sino-Tibetan *-j- whereas Tibetan and Burmese generally lost it. Perhaps he would have said Japhug had lost it in this word. (There are no *affricate-j clusters in Guillaume's (2004: 331-332) Proto-rGyalrong reconstruction. Did *ndzj- simplify to *ndz-?)

On the other hand, I did not reconstruct -j- in Tangut. I proposed that the brightening of *a to -i was due to a high-vowel presyllable:

*CI-dza > *CI-dzja > 1dzi

*CI-dza-w > *CI-dzjaw > 1dzio

The problem with this hypothesis is the absence of external evidence for *CI-. Could the Japhug reflex of *CI- be n-; i.e., was *CI- something like *[ni]? If so, perhaps the presyllable was absorbed into the initial in both Japhug and Tangut:

Pre-Proto-rGyalrong *ni-dza > Proto-rGyalrong *ndza >

Japhug -ndza

Somang -zá

Zbu -ndzeʔ, -ndziʔ (with brightening conditioned by the front vowel of *ni-?), -ndzʌʔ

Tangut: *ni-dza > *ni-dzja > *ndzi > 1dzi

I have followed Gong and Arakawa who reconstructed Tangut voiced obstruent initials without prenasalization, but others such as Nishida (1964) and Sofronov (1968) would disagree. The most recent scholar in favor of complex voiced obstruent initials is Tai (2008:

[...] there are regular use of prescripts in front of voiced obstruents [in the Tibetan transcription of Tangut], suggesting that there should be a pre-initial consonant [in Tangut], which is probably a weak nasal or glottal sound.

I followed Guillaume who reconstructed *ndz- at the Proto-Tangut (= my pre-Tangut) level in 2009.

GUILLAUME JACQUES' ESQUISSE DE PHONOLOGIE ET DE MORPHOLOGIE HISTORIQUE DU TANGOUTE NOW IN PRINT

That was the best news I'm likely to hear all week. This month too. Maybe even this year.

Unfortunately I haven't seen the book yet. Google Books has no preview for it. Nonetheless I am confident that I will be impressed. I have seen Guillaume's previous work on Tangut and rGyalrong and look forward to seeing how he has built upon it. I am particularly interested to see his treatment of the following topics:

1. What shared innovations distinguish his proposed Macro-rGyalrongic group from the rest of Qiangic or - if  Macro-rGyalrongic is his term for Qiangic - the rest of Sino-Tibetan?

Tonight I found the 2011 dissertation of Marielle Prins (whose rGyalrongic database I constantly use) which states  on p. 21 that there is "an absence of common innovations" in Qiangic. Prins proposed that

the similarities between the Qiangic languages may be caused by diffusion rather than be genetic in nature. [...] It is more likely that the shared features of these languages are the result of contact induced structural convergence, and that the Qiangic group should be considered an areal language group rather than a group of genetically related languages. (p. 22)

I wonder what Guillaume would say about that.

I am not sure whether Prins is denying that the Qiangic languages are related at all, or if she is just rejecting Qiangic as a subgroup. The latter position need not entail a complete absence of a genetic relationship: e.g., Qiangic could consist of languages from multiple Sino-Tibetan branches which have converged. Is Prins' Qiangic like my Altaic (completely unrelated languages that have converged) or like the Balkan languages (which are from different branches of Indo-European)?

2. I assume Guillaume is still using Gong's 1997 reconstruction of Tangut which has three grades of rhymes (his III corresponds to my III and IV):

Grade Gong Gong's source Arakawa This site
I -Ø- *-Ø- -Ø- -Ø- + lowering of high vowel
II -i- *-r- -j- -ɤ-
III -j- *-j- long vowel -ɯ-
IV -i-

Gong's Tangut -j- and Old Chinese *-j- were retentions from his Proto-Sino-Tibetan *-j-. On the other hand, Guillaume does not reconstruct Old Chinese *-j-. How does he account for Gong's Tangut *-j-?

3. Pre-Tangut *a was raised and fronted ('brightened') to various degrees. I have tried to explain the multiple reflexes of *a by reconstructing presyllables with front vowels:

*CI-Ca > Ci

*CE-Ca > Cie

More recently I have hypothesized that some 'brightening' might have been conditioned by a suffix *-j: e.g.,

1749 *kwa-j > 1kwe 'hoof'

cf. Ersu nkhuɑ⁵⁵ 'id.'; more cognates at STEDT

How does Guillaume explain 'brightening' in Tangut?

4. Gong reconstructed long vowels that do not correspond to long vowels in Tangut transcriptions of Sanskrit. I am now agnostic about those vowels and reconstruct them with an abstract symbol ' to differentiate them from their much more common '-less counterparts (short vowels in Gong's reconstruction). The zero-' distinction does not seem to correspond to anything in rGyalrong; both types of Tangut vowels correspond to the same Japhug rGyalrong vowels (Jacques 2006): e.g.,

Tangut rhyme Gong This site Japhug
37 -jij -ie -i, -e, -o
40 -jiij -ie'

What does Guillaume think is the source of vowel length in Gong's reconstruction? Does that length reflect a distinction lost in Japhug?

5. I reconstructed *-H as the source of the Tangut second ('rising') tone; syllables without *-H developed the Tangut first ('level') tone. This type of tonogenesis has parallels in Chinese, Tibetan, and Burmese, but not in Southern Qiang (Evans 2007). What is Guillaume's account of the origin of Tangut tones?

A *E(YE)-GRADE ROOT?

If Tangut 1new 'breast', 2niu 'to drink milk', and 2niụ 'to give milk' are from the root *√n-w that I proposed in my last post, there ideally should be other sets of -ew ~ -iu words. Unfortunately, I still haven't gotten around to looking for them, but today it did occur to me that Tangut

4684 1me 'eye'

and Old Chinese 目 *muk 'eye' might share a root *m-kʷ:

*m-e-kʷ (e-grade) > *mew > 1me

(See this series of posts on Tangut *labial-w syllables: 12.7.23 / 12.7.28 / 12.7.29).

*m-kʷ (zero-grade) > 目 *muk

This word is widespread in Sino-Tibetan (STEDT roots #33, 681, 682). It often has an -i-: e.g., Tibetan mig 'eye'. Was there an i-grade? (I have borrowed the terms e-grade and zero-grade from Indo-European studies. There is no such thing as an i-grade in Indo-European, but perhaps it existed in Sino-Tibetan.) Or was the root *mʲ-kʷ with an initial palatalized consonant that was vocalized as -i- in the zero grade in many languages?

One might want to resurrect an old-fashioned reconstruction *mjuk for Old Chinese 目 *muk and view its *mj- as a reflex of *mʲ-, but I have never seen any evidence for *-j- in modern Chinese languages, and there is no trace of *-j- in Sinoxenic:

Taiwanese bak (colloq.), bok (lit.)

Cantonese muk

Mandarin mu (in earlier reconstructions, *mj- > w-, not m-)

Sino-Vietnamese mục (not *dục < *mjuk or *mʲuk)

Sino-Korean mok

Sino-Japanese moku, boku

One might also be tempted to regard 覓 Middle Chinese *mek 'to seek' as being from an e-grade *m(ʲ)ekʷ like Tangut 1me 'eye', but the earliest attestation of the word that I can find is in Yupian (c. 543 AD), so I do not know if it should be reconstructed at the Old Chinese level. It could be a later unrelated innovation that has nothing to do with m-words for 'eye'.

A *N-W WORD FAMILY?

The Tangut word

4834 2niụ < *S-nuH 'to give milk'

from my last two posts is a causative derivative of

4614 2niu < *nuH 'to drink milk'

and I think those two words are related to


2123 1new 'breast' =

left of 3588 1new 'radish' (phonetic) +

left and center of 5275 2nɪʳ 'breast' (semantic).

Tangut *-w can be from *-k or *-w. If 2123 1new 'breast' is from *new and not *nek*, then it and the niu-words may share a root *√n-w:

pre-Tangut prefix root consonant 1 vowel root consonant 2 suffix gloss
e-grade Ø n- -e- -w Ø breast
zero-grade Ø -H to drink milk
S- to give milk

The *-w of the zero-grade root would have been pronounced as a vowel [u].
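The grade table above can be restated as a small sketch. This is a toy model under stated assumptions: the `grade` and `affix` functions are hypothetical names of this sketch, and the only vocalization rule included is the one just mentioned (zero-grade *-w pronounced as [u]).

```python
def grade(root, vowel=None):
    """Build a stem from a two-consonant root; vowel=None is the zero grade."""
    c1, c2 = root
    if vowel is None:
        # Zero grade: a root-final *-w is pronounced as the vowel [u].
        return c1 + ("u" if c2 == "w" else c2)
    return c1 + vowel + c2


def affix(stem, prefix="", suffix=""):
    """Attach an optional pre-Tangut prefix (with hyphen) and suffix."""
    return (prefix + "-" if prefix else "") + stem + suffix


ROOT = ("n", "w")

breast = affix(grade(ROOT, "e"))                       # e-grade, no affixes
drink_milk = affix(grade(ROOT), suffix="H")            # zero grade + *-H
give_milk = affix(grade(ROOT), prefix="S", suffix="H") # *S- + zero grade + *-H

print(breast, drink_milk, give_milk)
```

The same function with `vowel="o"` gives the o-grade shape considered for the Old Chinese comparison.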

Could the grade hypothesis account for the vocalic diversity of these cognates?

Old Chinese 乳 *Cɯ-noʔ 'nipple, milk' could be from an o-grade *n-o-w. (The prefix could be *pɯ- if 孚 *phu is phonetic.)

If the above scenario is correct, are there other cases of *Cew ~ *Cu alternations in Tangut, and what is the significance of the different grades?

Li (2008: 832) regarded 5275 2nɪʳ 'breast' as a loan from Chinese. The only similar Chinese word I know of is 奶 'breast, milk'. But I cannot find any attestations of 奶 before the Qing Dynasty. If 奶 had existed in northwestern Middle Chinese, it would have been pronounced *nəjˀ, and if Tangut speakers added a *T-prefix, the resulting *T-nəjˀ would have developed into 2nɪʳ. The Old Chinese source of *nəjˀ would be *Cʌ-nəʔ which might have come from an even earlier *Cʌ-nəw-ʔ: i.e., a schwa-grade form of *√n-w. Perhaps the pre-Tangut prefix directly reflects the Old Chinese prefix if the latter had survived in the colloquial speech of the northwest during the Middle Chinese period:

OC *Tʌ-nəw-ʔ > *Tʌ-nəʔ > MC *T(ʌ)-nəjˀ > pre-Tangut *T(ʌ)-nəjˀ > Tangut 2nɪʳ

However, all that is highly speculative.

2nɪʳ could be an unrelated lookalike from a pre-Tangut source such as *Cʌ-nirH.

In any case, I cannot think of a way to derive 2nɪʳ from *√n-w within Tangut. If it is ultimately from that root, it would have to be a Chinese loanword.

*One might be tempted to reconstruct *-k since Maru has nuk⁵⁵ 'breast, milk', but Maru -k is an innovation.

A BOVINE DYNASTY? (PART 2)

Guillaume Jacques (2010) equated the second half of ngo.snuHi, the Tibetan transcription of the name of the mythical first Tangut emperor, with Tangut

2niụ < *S-nuH 'to give milk'

In 2008, Guillaume rejected the temptation to go further and equate Tibetan s- with pre-Tangut *S- (his *s-):

Une hypothèse plus audacieuse pourrait être de voir dans ce s- une notation du préfixe causatif *s- qui doit se reconstruire pour ce verbe. En tangoute, nju.² [= 2niụ]  est dérivé de nju² [= 2niu]  (#4614) 'boire du lait'; le préfixe causatif *s- a disparu, laissant comme seule trace la 'voix tendue' notée par un point en dessous de la voyelle (Gong 1999). Cette hypothèse, toutefois, est très improbable dans la mesure où elle supposerait que soit conservée dans la graphie tibétaine une prononciation du tangoute plus ancienne que le système reconstruit à partir des dictionnaires du XIIème siècle, et donc antérieure d’au moins quatre cent ans aux textes tibétains eux-mêmes.

He regarded the Tibetan s- as "un simple artifice orthographique" since

dans le tibétain central du XIVème siècle, les consonnes préinitiales étaient déjà probablement confondues, voire amuies

but I wonder if ngo.snuHi reflects a nonstandard 14th century Tangut dialect which preserved pre-Tangut *S-. Tangut may have been internally diverse, and this dialect may have been to 12th century standard Tangut what modern Cantonese (which preserves final stops) is to Tangut period northwestern Chinese (which lost final stops) or what Ladakhi (which preserves some s-clusters: examples here and here) is to 14th century central Tibetan.

The -Hi may reflect a -j which was another trait of this 14th century Tangut dialect. Summing up the differences between the two types of Tangut and their common parent pre-Tangut:

Word cow to give milk
Pre-Tangut *ŋwə(-j)-H *S-nu(-j)-H
Standard Tangut 2ŋwɪ 2niụ
Later nonstandard Tangut ŋwə or ŋ(w)o snuj
Tibetan transcription ngo snuHi

The standard dialect had a *-j suffix in 'cow' absent from the nonstandard dialect. Conversely, the nonstandard dialect had a *-j suffix in 'to give milk' absent from the standard dialect.

It is remotely possible that the -iụ of standard 2niụ could be a metathesis of *-u-j rather than a breaking of *u. But even if that were true - and I don't think it is - many or even most -iu could not be from *-u-j, as -iu regularly corresponds to Japhug rGyalrong < Proto-rGyalrong *-u (Jacques 2004: 143, 2006: 16-17). Moreover, if such a metathesis had occurred in standard Tangut, I would expect Chinese *-uj or perhaps even *-wi to correspond to Tangut -iu in very early loans. No such loans have been identified.

In any case, *-j cannot go back very far because probable cognates lack it, and nothing else leads me to believe that Tangut preserved a *-j lost elsewhere.

Next: A *n-w word family?

A BOVINE DYNASTY? (PART 1)

Guillaume Jacques (2010) equated 2339, the first syllable of the Tangut imperial surname 2ŋwɪ 1mi, with its homophone (and near-homograph) 0395 2ŋwɪ 'cow':


The shared center and right components are phonetic. The surname tangraph has 'sage' on the left, whereas 'cow' has the center of 'bear' according to Precious Rhymes of the Tangraphic Sea:


0395 2ŋwɪ 'cow' = 5605 2riẽ 'bear' + 2139 2ŋwɪ 'a kind of bird'

Without looking outside Tangut, I could reconstruct the pre-Tangut source of 2ŋwɪ 'cow' as

*Cʌ-ŋwiH (if the -w- is original) or

*Pʌ-ŋiH (if the -w- is from a presyllable)

with a low presyllabic vowel to condition the lowering of *i. However, it is unlikely that the root vowel was once *i.* Probable external cognates such as Old Chinese 牛 *ŋʷə 'cow' and Written Burmese nvāḥ (< *ŋwaH?*; many more here) point to a nonfront vowel. This word was borrowed into southwestern Tai as *ŋuaA 'ox'**.

I used to reconstruct the rhyme of 2ŋwɪ 'cow' as -əi. Could 2ŋwɪ or 2ŋwəi be from *ŋʷə-i-H?

The name of the first Tangut emperor was transcribed as ngo.snuHi in Tibetan. Guillaume Jacques identified that as Tangut

0395 4834 2ŋwɪ 2niụ 'the cow gives milk /  [someone] fed milk by the cow'

whose meaning was rendered in Tibetan as

ba-la Hthung-ba

cow-DAT milk drink-NMLZ

'he who drinks milk from the cow'

ngo might be a transcription of a nonstandard Tangut *ŋwə without my proposed suffix *-i. There is no character for schwa in the Tibetan script, so Tibetan o might represent a schwa. It is also possible that a pre-Tangut *ŋwə could have become *ŋo in that dialect (whereas *-wə did not fuse into -o in standard Tangut).

*Was *ŋw- > *nw- a regular change in Proto-Lolo-Burmese? I can't remember if my unpublished reconstruction from twenty years ago had either cluster. Matisoff's (1972) reconstruction has only one word with *ŋ(w)- which has a variant initial *mw- (not *nw-!).

**Do variants with w/v- and h- reflect different sources of borrowing? See Gedney's list of forms in Hudak 2008: 95. Unfortunately I could not find the word in Pittayaporn's 2009 dissertation on Proto-Tai. It may have been excluded because it could not be reconstructed at the Proto-Tai level.

WAS THE TANGUT IMPERIAL FAMILY THE MI OF WEI?

Two years ago I saw Guillaume Jacques' derivation of the Tangut imperial surname

2339 1903 2ŋwɪ 1mi

from a hypothetical homophonous phrase

0395 4542 2ŋwɪ 1mi 'the cow feeds [someone]' / 'fed by the cow'

At the end of last month, I saw another derivation but couldn't remember what it was. I found it last night in Nishida (2010: 233):

It is very probable that the second syllable, miɦ (level 11), of ŋʷwɪ-miɦ (level 11) meaning "imperial family" was one of the corresponding cases of the [Tangut autonym] Mi. Its meaning might have been the Mi of Wei 魏.

The Tangut imperial family claimed descent from the Tuoba clan of the Northern Wei. Although this etymology is initially appealing, it has phonological problems.

First, the Tangut called the Wei

4962 2vɪ or 5574 2vɨi

rather than 2ŋwɪ. v- in those transcriptions reflects the loss of *ŋ- in the Tangut period northwestern Chinese pronunciation of 魏. Perhaps the imperial surname contains an earlier borrowing of 魏 preserving its nasal initial.

Second, the Tangut autonym

2344 2mi < *miH

has a 'rising' tone, whereas the second syllable of the surname has a 'level' tone. This tonal difference does not necessarily rule out a connection between the two names. The 'rising' tone of the autonym may be a reflex of a final glottal suffix *-H absent from the 1mi < *mi of the surname. Both 2mi and 1mi may be cognate to Tibetan mi 'person'.

A SILKEN SOURCE FOR THE RED RADICAL?

I'm surprised I was able to account for all uses of the 'red' radical

(Boxenhorn code: qie; Nishida radical 226)

in a straightforward manner in my last post. It means 'red' and/or is phonetic in all but one case (E):

A. n-phonetic B. 'red'
E. 1tʂhɨĩ 'Chen' (a family on the land of the 2nie family?) C. xŨ-phonetic in < B. 1xʊ̃ 'red' D. -iã-phonetic in < B. 2ʔiã (1st syl of 'rouge')

I am normally at a loss to explain the function of a component in one or more tangraphs containing it. For instance, I have no idea what

the right side of

1671 1nie 'red'

is doing. It is in 65 other tangraphs. I think it is phonetic in

1674 2nie (second syllable of 2mi 2nie 'younger sister')

1809 2nie (second syllable of 1ɣɤə 2nie 'few')

which are near-homophones of 1nie 'red'. But what is it doing in, say,

3528 2tho' 'to harm, endanger'

whose analysis is unknown? Did red signify danger?

Going back to the other half of 1671 'red', I think

might be derived from the seal form of the top half (幺) of the Chinese 'silk' radical 糸 on the left side of Chinese 紅 'red'. The vertical line at the top of the Chinese 'silk' radical corresponds to the horizontal line of the Tangut 'red' radical, and the two circles correspond to the two

of the Tangut 'red' radical. If the admittedly vague similarity between the two radicals is just pareidolia on my part, did the Tangut simply draw a random line pattern and declare it to be 'red' and/or nie?

A RED RADICAL

Half of the nie-tangraphs (Tangut characters) from my previous entry contained the element

(Boxenhorn code: qie; Nishida radical 226)

that appears in twenty other tangraphs. Here are all 31 qie-tangraphs. Asterisks indicate words which are only in dictionaries to the best of my knowledge.

Class Tangraphs Li Fanwen 2008 # Reading Gloss
A1 0529
1nie 2nd syl of 'be stifled to death'*
2nd syl of 'servant'
to try
2nie surname syl
2nd syl of 'kind of insect'*
2nd syl of 'kind of grass'
2nd syl of 'chin'
2nd syl of 'colored silk'*
2nd syl of 'to hide'*, 'to turn around'
bird name syl
A3 =+ 0363 1nʊ transcription
A4 =+ 1235 1nĩ red

red jade necklace*
red sand
red (Chn)
1st syl of 'rouge' (Chn)
red soil*
red wood*
2nd syl of 'rouge' (Chn)
C =+ 1741 1xõ transcription
D =+ 2049 2siã
E =+ 0298 1tʂhɨĩ surname 陳 Chen

The 31 fall into five categories:

A. qie as n(ie)-phonetic: 13 tangraphs

A1. qie as 1nie-phonetic: 4 tangraphs

A2. qie as 2nie-phonetic: 7 tangraphs

A3. qie as n-phonetic: 1 fanqie tangraph (2nie + 1tʊ = 1nʊ)

A4. qie as n-phonetic / semantic for 'red' (see category B below): 1 fanqie tangraph (1nie 'red' + 1ʔĩ = 1nĩ 'red')

B: qie as semantic abbreviation of 'red': 15 tangraphs

C. qie as xŨ-phonetic: 1 tangraph

Cf. 1402 1xʊ̃ 'red' (Chinese loanword) in category B.

The Tangraphic Sea derived 1741 1xõ from 1671 1nie 'red' (whose Chinese translation was 紅 *xʊ̃) and 3682, first syllable of 2mə 1ʔɤõ 'merit' (which could be translated into Chinese as 勳 *xiũ)

D. qie as -iã-phonetic: 1 fanqie tangraph (1si + 2ʔiã = 2siã)

E. qie as abbreviation of a surname Ne or a surname containing the syllable ne: 1 tangraph

Was a Chen family in the Tangut Empire on the land of a Tangut family whose name contained Ne? The Tangraphic Sea derived the right half of 0298 from 2107 1tsɪʳ 'earth'.

PROXIMATE PRONUNCIATION

Yesterday I wrote about the transcription evidence and potential cognates of Tangut

1nie 'relative'

which originally may have meant 'near' (relatives being the people nearest to oneself).

Fanqie spellings expressing the pronunciation of tangraphs (Tangut characters) in terms of the initials and rhymes of other tangraphs are only available for a little over half of the 6,000+ known tangraphs. Unfortunately, no fanqie are known for either 1nie or its second ('rising') tone counterpart 2nie.
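How a fanqie spelling combines an initial and a rhyme can be sketched with the fanqie tangraphs from the 'red' radical post (e.g., 2nie + 1tʊ = 1nʊ). This is a minimal illustration under stated assumptions: the `INITIALS` tuple covers only the initials needed for those examples, the `split` and `fanqie` functions are hypothetical names of this sketch, and the tone is assumed to follow the second speller because it does so in those examples.

```python
# Only the initials needed for the examples; a fuller inventory is not modeled.
INITIALS = ("n", "t", "s", "ʔ")


def split(syllable):
    """Split a reading such as '2nie' into (tone, initial, rhyme)."""
    tone, body = syllable[0], syllable[1:]
    # Try longer initials first so multi-letter initials would be preferred.
    for initial in sorted(INITIALS, key=len, reverse=True):
        if body.startswith(initial):
            return tone, initial, body[len(initial):]
    raise ValueError("unknown initial in " + syllable)


def fanqie(upper, lower):
    """Initial of the first speller + tone and rhyme of the second speller."""
    _, initial, _ = split(upper)
    tone, _, rhyme = split(lower)  # ASSUMPTION: tone follows the second speller
    return tone + initial + rhyme


print(fanqie("2nie", "1tʊ"))
print(fanqie("1si", "2ʔiã"))
```

The tone and rhyme of the first speller and the initial of the second are simply discarded, which is why a second-tone speller such as 2nie can spell a first-tone syllable.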

Usually first ('level') tone tangraphs have fanqie in the surviving first volume of the Tangraphic Sea, but that volume is missing some pages including those which probably contained tangraphs for 1nie and other syllables with the 36th rhyme of the first tone.

The fanqie for most second tone tangraphs is probably in the lost second volume of the Tangraphic Sea. (Some second tone fanqie are in the surviving third volume Mixed Categories.)

Homophones lists 22 characters in a homophone group mixing 1nie and 2nie. All but one (0548) can also be found in Precious Rhymes of the Tangraphic Sea which has no fanqie for any of them.

Homophones Tangraph Li Fanwen number Reading Gloss Tangraphic Sea Precious Rhymes
13B31 1723 2nie second syllable of 2ŋwəʳ 2nie 'colored silk' (only in dictionaries?) in missing second volume?
13B32 1671 1nie red in missing pages of first volume?
13B33 0547 2nie the surname Ne (occurs as a first or second syllable in disyllabic surnames but unclear if it can occur by itself); transcription character in missing second volume?
13B34 1858 2nie second syllable of 1lɨa 2nie 'to hide' (only in dictionaries?; the first half can mean 'to hide' by itself) and 1gie 2nie 'to turn oneself around, look around; the other way around'
13B35 0593 2nie second syllable of 2khwa 2nie 'a kind of grass'
13B36 0548 2nie second syllable of 1lhə 2nie 'a kind of insect' (only in dictionaries?) not in either book
13B37 1678 2nie second syllable of 2miə 2nie 'chin' in missing second volume?
13B38 1774 1nie second syllable of 2nieʳ 1nie 'servant' in missing pages of first volume?
13B41 0529 1nie second syllable of 1nie' 1nie 'to be stifled to death' (only in dictionaries?)
13B42 1732 1nie first syllable of the surname 1nie 1xɤu
13B43 0806 2nie second syllable of 2mɪ 2nie 'wind' (only in dictionaries?; the first half by itself is the name of the 'wind' trigram ☴) in missing second volume?
13B44 1674 2nie second syllable of 2mi 2nie 'younger sister' ('ritual language'? only in dictionaries?)
13B45 1809 2nie second syllable of 1ɣɤə 2nie 'few'
13B46 1926 2nie in the past
13B47 0213 1nie relative in missing pages of first volume?
13B48 2231 1nie to try, second half of 1lɨe 1nie 'emissary' (the first syllable is 'to serve' by itself), first half of 1nie 2ʔwiəʳ 'writing on silk [cf. 1723 above with a different tone], written correspondence' (the second syllable is 'writing' by itself)
13B51 2239 2nie second half of 2biu 2nie 'nightingale', first half of 2nie 2no 'cuckoo, oriole' in missing second volume?
13B52 3671 1nie first syllable of 1nie 2riaʳ 'father' (only in dictionaries?; the second half needs to be combined with either 1nie- for 'father' or -2si for 'mother') in missing pages of first volume?
13B53 5147 2nie first syllable of 2nie 1ɣa 'dog' (only in dictionaries?) in missing second volume?
13B54 3846 2nie optative prefix (< 'downward'), you (is this pronoun only in dictionaries?)
13B55 3817 2nie to present a gift (only in dictionaries?)
13B56 0638 2nie to compel, drive

Since Tangut dictionaries - both ancient and modern - are character-based, one might think 22 characters stood for 22 monosyllabic words pronounced nie, but in fact there are only three monosyllabic 1nie words and only three or four monosyllabic 2nie words:

1nie: 1. 'red', 2. 'relative', 3. 'to try'

2nie: 1. 'the surname Ne' (? - unsure if it can occur by itself), 2. 'in the past', 3. 'to present a gift', 4. 'to compel, drive'

8.21.2:42: It is tempting to try to derive the polysyllabic words from the monosyllabic nie-words, particularly since some of them are combinations of nie with monosyllabic words: e.g.,

1995 0806 2mɪ 2nie 'wind'

whose first syllable is also the Tangut name of the 'wind' trigram ☴. Could that word literally be 'red wind'? The trouble with that case and others is that 'red' is 1nie, not 2nie. One could try to salvage the etymology by proposing that 2nie in compounds is from 'red' plus an *-H suffix conditioning the second tone. But it is dangerous to build speculations atop speculations. Moreover, in this particular case, perhaps 2mɪ is an abbreviation of a monomorphemic, disyllabic 2mɪ 2nie 'wind'.

PROXIMATE PEOPLE

Today I saw Tibetan nye 'near' which brought to mind a possible Tangut cognate:

0213 1nie 'relative' (i.e., one's near relations; I covered related characters here)

This word was transcribed in Tangut period northwestern Chinese as 你 *ni. No Tibetan transcription is known, but its near-homophone

3830 2nie 'king'

with a different tone was transcribed in Tibetan as nye(H) and ne(H). (Tibetan ཉ ny- [ɲ] and ན n- are different letters.)

I normally derive Tangut rhyme 37 -ie from pre-Tangut *Cɯ-e:

*Cɯ-ne > *Cɯ-nie > nie

The -i- is a trace of the lost presyllabic high vowel *ɯ.

However, Tibetan nye makes me wonder if Tangut -i- in 'relative' is primary rather than secondary:

*ɲe > nie = [ɲe]? [nje]?

Similar Qiangic and rGyalrongic words for 'near' (see sections 3.2 and 3.3 of this list) have palatal ȵ- (= ɲ) or dental n-. (See items #1757-1758 here for very different rGyalrong words.)

Possible Old Chinese cognates have n-:

*Cɯ-ne(j)ʔ (< *n-e-j + -ʔ?) 'near'

*Tnik (< *T- + √n-j + -k?) 'near, be familiar with'

Could those words contain e-grade and zero-grade forms of a root *n-j? Could the root-initial consonant have been *ɲ-? Were *Cɯ- and *T- the same prefix with and without a presyllabic vowel? Were *-ʔ and *-k variants of the same suffix?

SREDNJI KITAJSKI JĘZYK

I felt uncomfortable about mentioning Middle Chinese reconstructions in my last post because they may give the false impression that Chinese in the past was more homogeneous than it actually was.

It occurred to me last night that Middle Chinese is about as real as Interslavic, my favorite constructed language. If future linguists knew nothing about Russian, Polish, Serbo-Croatian, etc. - i.e., specific actual languages - Interslavic would have to do for comparisons with other European languages. The title of this post is Interslavic for 'Middle Chinese language'.

I suspect that diversity within Middle Chinese was like that between Slavic languages today. So my *kon for 昆 in this table is to real Middle Chinese forms what Interslavic koń 'horse' is to these modern Slavic words: similar but not necessarily identical. Interslavic koń happens to match the actual Polish word for horse, but its vowel is very different from that of Ukrainian кінь [kinʲ] 'horse', and it is completely different from Russian лошадь [loʂətʲ] 'horse', a loan from Turkic. If a language borrowed a word kin 'horse' from Ukrainian, it would be strange to say that kin is from a 'Slavic' koń. Yet how many would blink if I wrote that Sino-Vietnamese mã is a loan from 'Middle Chinese' 馬 *mɤaˀ? (The actual source of mã was more like *ma with a 'rising' tone in a southern late Tang variety of Middle Chinese.)

'Middle Chinese' may sound specific, but it's actually a generic term like 'Middle Indic' which could refer to Pali, Gandhari, Ardhamagadhi, etc.

Unfortunately there is no analogous established terminology for specific varieties of Middle Chinese. It is easier to type a simple name like Pali than a phrase like 'Tangut period northwestern Chinese' (TPNWC), the dialect in the Timely Pearl in the Palm that is also the source of Chinese loans in Tangut.

Tonight I momentarily considered renaming TPNWC 'Zaric' after Tangut

1ɮar 'Chinese'

but that term would make no sense to those who didn't know the Tangut word. Although my older term is more tedious, it is also more transparent.

One could think of Middle Chinese reconstructions as being as open to interpretation as Interslavic pronunciation: e.g., the ę of język 'language' in the post title could be [ʲa] ~ [ʲɛ] ~ [ʲɛ̃] ~ [ʲɔ̃] ~ [ɛ]; [ʲæ] is a suggested average.

That description of Interslavic states that "[a]ccentuation is free." Hence there is no way that one could figure out Serbo-Croatian tones from Interslavic: e.g., the falling tone of konj 'horse'. (Most of Slavic lacks tones, so Interslavic also lacks them.)

The situation is a bit different with Middle Chinese tones. The Old Chinese sources of Middle Chinese tones are known (e.g., *-ʔ in 'horse'), but their phonetic realizations are not. It is likely that *-ʔ left a trace as glottalization which disappeared at different times in different places (and is still present in today's Xiaoyi), and pitches once associated with glottalization became phonemic.

Although 'rising', the traditional name of the tone category for 'horse', suggests the tone was rising, that may not have been true in all Middle Chinese varieties, and it is certainly not true today: e.g., in Taiwanese, 馬 'horse' has a high falling tone (indicated with an acute accent in romanization!). See Sagart (1998) for more on Chinese tonal history.

I have similarly used an acute accent to indicate the 'rising' tone in Middle Chinese varieties after glottalization was lost, but that accent may imply a rising or even high tone though I am actually agnostic about its contour, so I am reluctant to use it now. Maybe it's time to dust off my tone codes.

All of the above also applies to Old Chinese except for the part about tones since Old Chinese didn't have any. Old Chinese was not uniform before the mid-first millennium AD. In fact, 揚雄 Yang Xiong (53 BC-18 AD) wrote the first Chinese dialect dictionary, 方言 Fangyan 'Areal Speech', toward the end of the Old Chinese period. I think that oddities in Chinese loans in Vietnamese and Tai may in part reflect Old Chinese diversity that has been lost. Proto-Indo-European must also have been diverse.

Speaking of Proto-Indo-European, I don't understand how Proto-Indo-European *ḱem- 'hornless' became Proto-Slavic *konjь; why didn't PIE *ḱ- become PS *s- (cf. Sanskrit śama- 'domestic'), and why didn't PIE *-m- become PS *-m-?

WHEN B IS SPELLED G

If Vietnamese mắm [mam] < *ɓamʔ 'salting' could be written with the velar-initial phonetic 禁 cấm [kəm] (see my last two entries), could labial-initial syllables be written with velar-initial phonetics in sawndip, the traditional Zhuang script, as well? I looked through Sawndip sawdenj [Traditional Zhuang script dictionary] which I admit is a problematic source* and found the following characters with velar-initial phonetics for Zhuang [p]-initial syllables:

Standard Zhuang reading | IPA | Semantic component | Phonetic (?) component and Middle Chinese reading | Zhuang reading of phonetic (?) component | Meaning
boenq | pon³⁵ | 土 'earth' | *kon (> some northern Pinghua readings with khw-; the aspiration is irregular) | goen [kon³⁵] | dust
bomx | poːm⁴² | 足 'foot' | *kuŋ 'bow' (archery) | no reading for 弓 in isolation; 弓 is a phonetic in goem [kom³⁵] | to crouch
byaij | pjaːj⁵⁵ | | *ŋwajʰ 'outside' (> three northern Pinghua dialects have m-!) | vaih [waːj³³] | to walk
byangj | pjaːŋ⁵⁵ | 強 'strong' | *kɔŋ (> early Mandarin *khjaŋ) | gangj [kaːŋ⁵⁵] | hot pain?
byoq | pjo³⁵ | 火 'fire' | *khɨak (> early Mandarin *khjaw, northern Pinghua readings khio, khyo) | cog [ɕoːk³³] | to bake
byouz | pjow³¹ | | *gu | caeuz [ɕaw³¹] | to boil
byuk | pjuk³⁵ | 虫 'bug' | *kok | goek [kok³⁵] | white ant
byuz | pju³¹ | 瓜 'melon' | *ɣo (> some northern Pinghua readings with f-: e.g., Guilin [Yanshan zhuyuan dialect] fu) | no reading for 乎 in isolation; 乎 is a phonetic for fouj [fow⁵⁵], fuj [fu⁵⁵], hued [hut³³], huz [hu³¹], ruz [ɣu³¹], and youq [jow³⁵] | gourd
bywngj | pjɯŋ⁵⁵ | 足 'foot'; 扌 'hand' | *khəŋˀ | haengj [haŋ⁵⁵] | verb suffix

(8.18.0:54: Added Zhuang reading of phonetic component column. The title of this post should make more sense now. I was referring to how Zhuang b-syllables were written with g-phonetic components.)

At least two phonetic components may actually be semantic: e.g.,

*kuŋ 'bow' (archery) could refer to bending down in 足+弓 bomx 'to crouch'

*ŋwajʰ 'outside' could refer to going outside in 足+外 byaij 'to walk'

or it could have been chosen for a labial or labiodental initial ([w]? [v]? [m]?) close to by- [pj]

*ɣo may be a reference to its homophone, the first syllable of 葫蘆 'gourd'.

The other phonetic components are baffling. If they are really phonetics, were they chosen only for their rhymes? Or did they have labial-initial readings in local varieties of Chinese?
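One rough way to probe the "only for their rhymes" possibility is to compare the rhyme of each Zhuang reading with the rhyme of its phonetic component's Zhuang reading. The following is a minimal sketch, not a serious phonological analysis: the IPA strings are copied from the table above, the two rows whose phonetics have no reading in isolation (bomx, byuz) are left out, and the function names and the segmentation rule (rhyme = everything from the first vowel on, so medial -j- counts as part of the onset) are my own assumptions.

```python
import unicodedata

# (Standard Zhuang reading, IPA of reading, IPA of the phonetic's Zhuang reading)
# Values are taken from the table; bomx and byuz are omitted because their
# phonetics (弓, 乎) have no Zhuang reading in isolation.
ROWS = [
    ("boenq", "pon³⁵", "kon³⁵"),
    ("byaij", "pjaːj⁵⁵", "waːj³³"),
    ("byangj", "pjaːŋ⁵⁵", "kaːŋ⁵⁵"),
    ("byoq", "pjo³⁵", "ɕoːk³³"),
    ("byouz", "pjow³¹", "ɕaw³¹"),
    ("byuk", "pjuk³⁵", "kok³⁵"),
    ("bywngj", "pjɯŋ⁵⁵", "haŋ⁵⁵"),
]

# Vowel symbols that occur in the rows above (assumption: this list is
# only meant to cover this small data set).
VOWELS = "aeiouɯ"


def rhyme(ipa):
    """Return the syllable from its first vowel on, with tone digits removed."""
    # Superscript tone digits all have Unicode category 'No'.
    toneless = "".join(ch for ch in ipa if unicodedata.category(ch) != "No")
    for i, ch in enumerate(toneless):
        if ch in VOWELS:
            return toneless[i:]
    return toneless


def compare(rows):
    """Pair each reading's rhyme with its phonetic's rhyme and flag exact matches."""
    return [(zhuang, rhyme(a), rhyme(b), rhyme(a) == rhyme(b))
            for zhuang, a, b in rows]


for zhuang, r1, r2, same in compare(ROWS):
    print(f"{zhuang:8} {r1:5} {r2:5} {'match' if same else 'differ'}")
```

On the table's own IPA, this exact-match test flags only boenq, byaij, and byangj as sharing a rhyme with their phonetic's Zhuang reading. It ignores tone and near matches such as vowel length, and it says nothing about readings in local varieties of Chinese, so it is only a first pass at the question.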

*Holm (2011: 2) pointed out that

The Sawndip sawdenj is a useful compendium, but it provides no information about where the dialect forms come from, so it is impossible to see any patterns in geographic variation from this source.

Moreover, all the readings in Sawndip sawdenj are in standard Zhuang, even though the characters could be from all over the Zhuang-speaking world. Hence I presume many actual readings have been converted into hypothetical standard Zhuang equivalents. Such readings are strictly speaking not readings at all, since no literate native speaker would have ever used those hypothetical readings. Nonetheless I hope those hypothetical readings are close enough to the originals for my purposes here: e.g., b-[p] readings are most likely from nonstandard [p]-readings.

WHEN B IS SPELLED C

In my last entry, I wrote about three types of nom characters for Vietnamese mắm 'salting':

1. m-phonetic characters: e.g., 𩻐 = 魚 'fish' + right of 鎫 m- 'head ornament for a horse' (Sino-Vietnamese reading unknown but presumably similar to its nom reading mâm)

2. c-phonetic characters: e.g., 鹵 'salt' + 禁 cấm 'to forbid'

3. b-phonetic characters: e.g., 酉 'liquor' + 稟 bẩm 'to receive from above'

The third type of mắm-character must have been devised at a stage when 'salting' had an initial closer to the initial of 稟 (i.e., stage 1 or 2 below):

Stage 'salting'
1 *p *m
2 *ʔm
3 b [ɓ] m [m]

The first type of mắm-character must date from stage 2 or 3.

The second type of mắm-character continues to baffle me. If I didn't know anything about Vietnamese or Chinese, I might propose a solution involving a labiovelar, but labiovelars did not exist in earlier Vietnamese, and 禁 never had a labiovelar or a velar-labial cluster *kw- in Chinese. Did 'salting' once have a cluster *kɓ- in Vietnamese? There is no support for *k- in other Vietic languages.

I looked for other cases of c-phonetics for syllables with *ɓ- and other labial initials in the Nom Foundation's Kiều index and only found a single example: biếng khuây 'unforgettable' was written as 更亏 in line 246 of the 1872 version of Kiều. 更 is normally read as canh 'watch of the night' and cánh 'more'. Khuây is 'forget', and I doubt 更 has semantic relevance in 更亏: why write 'unforgettable' as 'watch forget' or 'more forget'? 更更 canh cánh 'obsessed' appears earlier in the line, so I wonder if 更 for biếng later in the line is an accidental substitute for the b-phonetic character that appeared in earlier editions.

FORBIDDEN SALT

Last night I mentioned two examples of phonetics representing Vietnamese syllables with different onsets in the nom script. Here's a third.

As Vietnamese cuisine becomes more popular, more Americans are becoming familiar with nước mắm 'fish sauce'. Nước is literally 'water' and mắm is 'salting'. I do not know of any Sino-Vietnamese reading like mắm. The only similar Middle Chinese syllable was 鋄/鎫 *muamˀ 'head ornament for a horse'. I cannot find a Sino-Vietnamese (SV) reading for that rare character; in theory it should have been *vãm or, if it was borrowed earlier, *muộm. 鎫 was used as a phonetic symbol for the native Vietnamese word mâm 'tray', so its SV reading must have contained the consonant sequence m-m. Variations of its right side were used as a phonetic in nom characters for mâm 'tray' and mắm 'salting': e.g., 𩻐 mắm (with 魚 'fish' instead of 金 'metal' on the left side). ( also lists a similar character with the codepoint U+29DE0 which may be a typo for U+29ED0, the codepoint for 𩻐. U+29DE0 is for a different character 𩷠 from a source in Taiwan. I cannot find the other 𩻐-like nom character in Unicode.)
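The suspected typo can be checked mechanically. The sketch below only tests whether the two codepoint strings quoted above differ by a swap of two adjacent hex digits and prints the characters they encode; the helper name adjacent_transposition is my own, and whether the glyphs display depends on font support.

```python
import unicodedata


def adjacent_transposition(a, b):
    """True if b equals a with exactly one pair of adjacent characters swapped."""
    if len(a) != len(b) or a == b:
        return False
    diffs = [i for i in range(len(a)) if a[i] != b[i]]
    return (len(diffs) == 2
            and diffs[1] == diffs[0] + 1
            and a[diffs[0]] == b[diffs[1]]
            and a[diffs[1]] == b[diffs[0]])


listed = "29DE0"    # codepoint given in the listing
expected = "29ED0"  # codepoint of the 魚 'fish' character for mắm

# Show what each codepoint actually encodes.
for cp in (listed, expected):
    ch = chr(int(cp, 16))
    print(f"U+{cp}", ch, unicodedata.name(ch, "<unnamed>"))

print(adjacent_transposition(listed, expected))
```

The two strings differ only in the order of D and E, which is consistent with a simple transposition, though it does not prove that one occurred.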

There are two other types of characters for mắm which aren't in Unicode yet, so I have to describe them in terms of their semantic and phonetic components:

variations of 鹵 'salt' + 禁 cấm 'to forbid' (the latter is also a phonetic loan for the native Vietnamese word bấm 'to press')

酉 'liquor' + variations of 稟 bẩm 'to receive from above' (more on 稟 here)

Why was mắm written with a b-phonetic? Were the latter two types of characters devised when mắm still had an initial implosive *ɓ-? (Many other Vietic languages still have b- in this word: e.g. Sơn La Muong bam³. Is their b- implosive?) And why was bấm 'to press' written with a c-phonetic 禁?

8.16.1:32: I suspect Proto-Vietic *ɓamʔ 'salting' (as reconstructed in the SEAlang Mon-Khmer Languages Project database) is a Vietic innovation. I have not found any potential true cognates in other languages in that database. Halang măm 'salt fish' and Mnong măm 'salted fish' are probably Vietnamese loans in those Bahnaric languages, and Bolyu mjaːm¹³ 'salt' may be a lookalike; its -j- matches nothing in Vietic.

Tangut fonts by
Tangut radical and Khitan fonts by Andrew West
Jurchen font by Jason Glavy
All other content copyright © 2002-2014 Amritavision