Archives

14.9.1.22:09: GYDDANYZC

I have long been interested in Slavic partly because it underwent massive vowel loss paralleling the massive vowel losses I reconstruct for Chinese and Tangut.

My favorite example is monosyllabic Gdańsk from *Gŭdanĭskŭ* with four syllables (Comrie 1987: 326). That city has been on my mind lately because today is the 75th anniversary of the German invasion of Poland.

On Saturday I learned that Gdańsk is first attested as Gyddanyzc sometime after 997 AD. Is that spelling evidence for

- the retention of the medial vowels and

- the loss of the final vowel

circa 1000 AD?

Why were the vowels that were later lost both spelled y? Had they merged into [ɨ] in the version of the name that was transcribed? They could not have merged in the ancestor of the modern name Gdańsk, as ń is from *nĭ and still reflects the palatal quality of the lost vowel *ĭ. Or is ny a transcription of [ɲ]?

Is the doubling of d significant?

Why was the sibilant before c [k] written as z instead of s? The z also appears in the later spellings Kdanzk (1148), Gdanzc (1188), and Danzc (1263). I assume the z of the spellings Danczk (1311), Danczik (1399), and Danczig (1414) is half of a digraph cz [tʂ] and is not evidence for [z].

*This matches Old Church Slavonic Гъданьскъ <Gŭdanĭskŭ>. Is that form of the name attested in ancient texts, or is it a retroactive creation?

I assume the ISO 639-1 code cu for OCS is from c(h)u(rch). cu makes me think of Cuman which has no ISO 639-1 code; its three-letter ISO 639-3 code is qwm. qum and cum were already taken for Sipakapense in Guatemala and Cumeral in Colombia.


14.8.31.1:31: EAT-YMOLOGY 3: *NZ- > *NDZ-?

I just realized that 'eat' from my last post wasn't the best example of a word that had undergone brightening without lenition. What if lenition were followed by fortition (in bold) after a nasal?

stem 1: *NI-dza > *NI-dzja > *NI-z- > *Nz- > *ndz- > dzi 1.11

stem 2: *NI-dza-w > *NI-dzjaw > *NI-z- > *Nz- > *ndz- > dzio 1.51

Perhaps these are better examples of brightening without lenition:

Tangut 0749 phi 1.11 'to order' (stem 1), 4568 phio 2.44  'to order' (stem 2)  : Japhug kɤ-ɣɤ-xpra 'to order'

also cf. Somang ka-wa-kprá 'to order' preserving Proto-rGyalrong *kpr-

Pre-Tangut *CI-Kpra(-w-H) > *CI-Kprja(w-H) > *Kpr- > phi(o) 1.11/2.44

or Pre-Tangut *KI-pra(-w-H) > *KI-prja(w-H) > *Kpr- > phi(o) 1.11/2.44

Tangut 5449 1tị 'to put' 1.67 (stem 1), 5633 1tiọ 'to put' 1.72 (stem 2)  : Japhug kɤ-ta 'to put'

Pre-Tangut *CI-S-ta(-w) > *CI-Stja(w) > *tt- > tị ~ tiọ 1.67/1.72

or Pre-Tangut *SI-ta(-w) > *SI-tja(w) > *tt- > tị ~ tiọ 1.67/1.72

If lenition had preceded brightening, ph- and t- would have lenited to *v- and *l- in those words.

I do not know if the *I of the brightening presyllable followed *K- that conditioned the aspiration of ph- and/or the *S- that conditioned the tension of rhymes 1.67 and 1.72 (indicated by a subscript dot). *I could have been in a presyllable preceding one or both of those consonants.

Pre-Tangut *K(I-)pr- nicely matches Proto-rGyalrong *kpr-, but pre-Tangut *S(I)-t- does not match Proto-rGyalrong *t-. Perhaps the aspirated th- of Written Burmese thāḥ 'to put' is from *St-.

Old Chinese 置 *trək-s 'to place' may be an unrelated lookalike even if it is from *r-tək-s, as it has a *-k absent in  the other languages.

I cannot explain why stem 2 of 'to order' has a second ('rising') tone from an *-H absent in stem 1.


14.8.30.23:54: EAT-YMOLOGY 2: BRIGHTENING BEFORE LENITION?

The second item in Guillaume Jacques' 2006 list of Tangut-Japhug rGyalrong comparisons is

Tangut 5113 1wji 'to do' (stem 1), 3621 1wjo 'to do' (stem 2)  : Japhug kɤ-pa 'to close'

also cf. other rGyalrong forms: e.g., Somang ka-pa 'to do'

(more in #1133-1135 at Nagano and Prins' database)

See my first "Eat-ymology" post for an explanation of stems 1 and 2.

That post was about a parallel pair of stems:

Tangut 4517 1dzji 'to eat' (stem 1), 4547 1dzjo 'to eat' (stem 2)  : Japhug kɤ-ndza 'to eat'

The parallelism is not as apparent if one looks at rhyme numbers and/or my reconstructions:

Stem 1 2
'to do' vɨi 1.10 vɨo 1.51
'to eat' dzi 1.11 dzio 1.51

Guillaume uses Gong Hwang-cherng's reconstruction in which rhymes 1.10 and 1.11 are both -ji. I reconstruct them differently. I also reconstruct different allophones of 1.51 after different initials. The class II initial v- is followed by Grade III -ɨ- but not Grade IV -i-. It was somehow antipalatal in a way that medial -w- is not. Gong reconstructed w in both initial and medial position and did not reconstruct a Grade IV distinct from Grade III.

I can see why Gong reconstructed rhymes in those two grades identically. There are a few minimal pairs involving them: e.g.,

0932 ʔɨi 1.10 'many' (only with that meaning in dictionaries?) : 3119 ʔi 1.11 'many'

Those pairs were probably the reason why the Tangut split rhymes (e.g., into 1.10 -ɨi and 1.11 -i). When no such pairs were present, the Grade III/IV distinction was subphonemic, and there was no split. Hence there was only one rhyme 1.51 -ɨo/-io. The frontness of the first half of the diphthong was predictable.

I think the Grade III/IV distinction was absent from pre-Tangut. I reconstruct the pre-Tangut sources of 'to do' as

stem 1: *CI-pa > *CI-pja > *CI-β- > *vi > vɨi 1.10

stem 2: *CI-pa-w > *CI-pjaw > *CI-βj- > *vjo > vɨo 1.51

I do not know if *-ja and *-jaw became *-i and *-jo before or after lenition. My guess is that such shifts predated the loss of stop codas that might have blocked the raising of *-ja to *-i:

Brightening stage 1 *-ja *-jaw *-jaC
Brightening stage 2 *-i *-jo *-jaC
Final coda loss *-i *-jo *-ja
GV > VV diphthong reanalysis -i -io -ia

Some of the changes above could be viewed in terms of a drag chain:

*-jaC > *-ja > *-i
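
The ordering in the table above can be checked mechanically. What follows is only a minimal Python sketch under stated assumptions: the three function names are hypothetical labels for the last three rows of the table, C stands for any stop coda, and the rules cover only the three rhymes discussed here.

```python
# Illustrative sketch only: the rule names are hypothetical labels for the
# table rows, and "C" stands for any stop coda, as in *-jaC.

def brightening_stage_2(rhyme):
    # *-ja > *-i and *-jaw > *-jo; a stop coda blocks the raising of *-jaC.
    return {"ja": "i", "jaw": "jo"}.get(rhyme, rhyme)

def final_coda_loss(rhyme):
    # *-jaC > *-ja: this new *-ja arises too late to be raised to *-i.
    return rhyme[:-1] if rhyme.endswith("C") else rhyme

def glide_to_vowel(rhyme):
    # GV > VV reanalysis: *-jo > -io, *-ja > -ia.
    return "i" + rhyme[1:] if rhyme.startswith("j") else rhyme

STAGES = [brightening_stage_2, final_coda_loss, glide_to_vowel]

def derive(rhyme, stages=STAGES):
    """Return the rhyme at every stage, starting with brightening stage 1."""
    history = [rhyme]
    for stage in stages:
        history.append(stage(history[-1]))
    return history

for start in ("ja", "jaw", "jaC"):
    print(" > ".join("-" + r for r in derive(start)))
```

If coda loss is moved before brightening stage 2 in the list, *-jaC merges with *-ja and both end up as -i, which is why I guess that the raising came first.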

Tangut syllables with lenited initials must have once had presyllables conditioning intervocalic lenition:

*presyllable + labial > *β- > v-
*presyllable + dental > l-

*presyllable + alveolar > *z- > ɮ-

*presyllable + palatal > - > ʐ-

*presyllable + velar > ɣ-

Brightening must have preceded lenition because there are words such as 'to eat' with brightening but without lenition. 'To eat' must have lost its brightening presyllable before 'to do' did, so that its initial was no longer intervocalic when lenition applied:

Gloss 'to eat' 'to do'
Pre-Tangut *NI-dza *CI-pa
Brightening *NI-dzja *CI-pja
Presyllable to prenasalization *Ndzja *CI-pja
Lenition *Ndzja *CI-β-

If brightening had followed lenition, 'to eat' should have been *ɮi 1.11 (stem 1) / *ɮio 1.51 (stem 2) with *ɮ- from a lenited *-dz-.
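
The two orderings can be compared with a small Python sketch. It is a toy model, not a full account of the changes: the Form class, the rule functions, and the two-entry lenition map are all hypothetical names introduced for illustration, and the assumption that only a nasal presyllable *NI- reduces to prenasalization is taken from the table above.

```python
from dataclasses import dataclass, replace

# Hypothetical toy model: only the two initials from the table are covered.
LENITED = {"p": "β", "dz": "z"}  # labial > *β-, alveolar > *z- (later ɮ-)

@dataclass(frozen=True)
class Form:
    pre: str      # presyllable, e.g. "NI" or "CI"
    initial: str  # main-syllable initial
    rhyme: str    # main-syllable rhyme

    def __str__(self):
        sep = "-" if self.pre.endswith("I") else ""
        return f"*{self.pre}{sep}{self.initial}{self.rhyme}"

def brighten(f):
    # A high-vowel presyllable inserts *-j- before *a.
    if f.pre.endswith("I") and f.rhyme.startswith("a"):
        return replace(f, rhyme="j" + f.rhyme)
    return f

def prenasalize(f):
    # Assumption: only the nasal presyllable *NI- loses its vowel this early.
    return replace(f, pre="N") if f.pre == "NI" else f

def lenite(f):
    # Lenition needs a presyllabic vowel to make the initial intervocalic.
    if f.pre.endswith("I"):
        return replace(f, initial=LENITED.get(f.initial, f.initial))
    return f

def derive(f, rules):
    stages = [str(f)]
    for rule in rules:
        f = rule(f)
        stages.append(str(f))
    return stages

EAT, DO = Form("NI", "dz", "a"), Form("CI", "p", "a")
print(derive(EAT, [brighten, prenasalize, lenite]))  # order in the table
print(derive(DO, [brighten, prenasalize, lenite]))
print(derive(EAT, [lenite, brighten, prenasalize]))  # counterfactual order
```

In the counterfactual run the initial of 'to eat' lenites to *z-, the source of the unattested ɮ-.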


14.8.29.23:45: EAT-YMOLOGY

Guillaume Jacques' 2006 list of Tangut-Japhug rGyalrong comparisons begins with

Tangut 4517 1dzji 'to eat' (stem 1), 4547 1dzjo 'to eat' (stem 2)  : Japhug kɤ-ndza 'to eat'

Guillaume used Gong's 1997 reconstruction of Tangut.

Those Tangut words in my reconstruction are 1dzi (without -j-) and 1dzio (which could be rewritten as dzjo).

Stem 2 (1dzjo below) was used before the first and second person singular suffixes when the object is in the third person. Otherwise stem 1 was used:

Subject \ object of 'eat' | ... me | ... us | ... thee | ... you | ... him/her/it/them
I ... | (no 'I eat me', etc.) | | 1dzji-2nja | 1dzji-2nji | 1dzjo-2ŋa
We ... | | | 1dzji-2nja | 1dzji-2nji | 1dzji-2nji
Thou ... | 1dzji-2ŋa | 1dzji-2nji | (no 'You eat you', etc.) | | 1dzjo-2nja
You ... | 1dzji-2ŋa | 1dzji-2nji | | | 1dzji-2nji
He/she/it/they ... | 1dzji-2ŋa | 1dzji-2nji | 1dzji-2nja | 1dzji-2nji | 1dzji

The seventeen slots in that table have only six forms. I list my reconstructions when they differ from Gong's.

1. bare stem 1 (3rd person subject and object)

2. stem 1 + 2ŋa (first person singular object)

3. stem 1 + 2nja (= my 2nia; second person singular object)

4. stem 1 + 2nji (= my 2ni; nonthird person plural subject and/or object)

5. stem 2 + 2ŋa (first person singular subject + 3rd person object)

6. stem 2 + 2nja (= my 2nia; second person singular subject + 3rd person object)
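
The distribution of the six forms can be restated as a rule: the suffix agrees with the object unless the object is third person, in which case it agrees with the subject, and stem 2 appears only with a singular first or second person subject acting on a third person object. The Python sketch below is my own restatement of the table in Gong's notation, not a claim about how Tangut grammar should be formalized; the person labels and the function name are hypothetical.

```python
# Hypothetical labels for person/number; forms are in Gong's reconstruction.
PERSONS = ["1sg", "1pl", "2sg", "2pl", "3"]
SUFFIX = {"1sg": "2ŋa", "2sg": "2nja", "1pl": "2nji", "2pl": "2nji", "3": None}
STEM_1, STEM_2 = "1dzji", "1dzjo"

def eat_form(subject, obj):
    """Return the verb form for a subject/object pair, or None for an excluded slot."""
    # Slots such as 'I eat me' or 'you eat you' are excluded from the table.
    if subject[0] == obj[0] and subject != "3":
        return None
    # The suffix follows the object unless the object is third person.
    controller = obj if obj != "3" else subject
    # Stem 2 only with a 1sg or 2sg subject and a third person object.
    stem = STEM_2 if obj == "3" and subject in ("1sg", "2sg") else STEM_1
    suffix = SUFFIX[controller]
    return stem if suffix is None else f"{stem}-{suffix}"

slots = [eat_form(s, o) for s in PERSONS for o in PERSONS]
filled = [form for form in slots if form is not None]
print(len(filled), sorted(set(filled)))
```

The rule fills seventeen slots with six distinct forms, matching the counts given above.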

Reconstructing the history of the Tangut and Japhug words for 'to eat' involves dealing with issues 2 and 3 from my last post.

Cognates of the Tangut word such as Japhug kɤ-ndza, Written Tibetan za-ba, and Written Burmese cā generally have a. The high front vowel of Tangut -ji is assumed to be the product of 'brightening' (Matisoff 2004).

In 2009, Guillaume derived -ji in 'eat' from *-ja. I don't know whether he still does in his new book. This is phonetically plausible. However, it raises the question of where the *-j- in *-ja came from. How far back can it be projected? Did languages such as Japhug, Tibetan, and Burmese lose it? Or is it a Tangut-internal innovation?

Gong (1994: 42) thought Old Chinese and Tangut retained Proto-Sino-Tibetan *-j- whereas Tibetan and Burmese generally lost it. Perhaps he would have said Japhug had lost it in this word. (There are no *affricate-j clusters in Guillaume's (2004: 331-332) Proto-rGyalrong reconstruction. Did *ndzj- simplify to *ndz-?)

On the other hand, I did not reconstruct -j- in Tangut. I proposed that the brightening of *a to -i was due to a high-vowel presyllable:

*CI-dza > *CI-dzja > 1dzi

*CI-dza-w > *CI-dzjaw > 1dzio

The problem with this hypothesis is the absence of external evidence for *CI-. Could the Japhug reflex of *CI- be n-; i.e., was *CI- something like *[ni]? If so, perhaps the presyllable was absorbed into the initial in both Japhug and Tangut:

Pre-Proto-rGyalrong *ni-dza > Proto-rGyalrong *ndza >

Japhug -ndza

Somang -zá

Zbu -ndzeʔ, -ndziʔ (with brightening conditioned by the front vowel of *ni-?), -ndzʌʔ

Tangut: *ni-dza > *ni-dzja > *ndzi > 1dzi

I have followed Gong and Arakawa who reconstructed Tangut voiced obstruent initials without prenasalization, but others such as Nishida (1964) and Sofronov (1968) would disagree. The most recent scholar in favor of complex voiced obstruent initials is Tai (2008:

[...] there are regular use of prescripts in front of voiced obstruents [in the Tibetan transcription of Tangut], suggesting that there should be a pre-initial consonant [in Tangut], which is probably a weak nasal or glottal sound.

I followed Guillaume who reconstructed *ndz- at the Proto-Tangut (= my pre-Tangut) level in 2009.


14.8.28.23:40: GUILLAUME JACQUES' ESQUISSE DE PHONOLOGIE ET DE MORPHOLOGIE HISTORIQUE DU TANGOUTE NOW IN PRINT

That was the best news I'm likely to hear all week. This month too. Maybe even this year.

Unfortunately I haven't seen the book yet. Google Books has no preview for it. Nonetheless I am confident that I will be impressed. I have seen Guillaume's previous work on Tangut and rGyalrong and look forward to seeing how he has built upon it. I am particularly interested to see his treatment of the following topics:

1. What shared innovations distinguish his proposed Macro-rGyalrongic group from the rest of Qiangic or - if Macro-rGyalrongic is his term for Qiangic - the rest of Sino-Tibetan?

Tonight I found the 2011 dissertation of Marielle Prins (whose rGyalrongic database I constantly use) which states on p. 21 that there is "an absence of common innovations" in Qiangic. Prins proposed that

the similarities between the Qiangic languages may be caused by diffusion rather than be genetic in nature. [...] It is more likely that the shared features of these languages are the result of contact induced structural convergence, and that the Qiangic group should be considered an areal language group rather than a group of genetically related languages. (p. 22)

I wonder what Guillaume would say about that.

I am not sure whether Prins is denying that the Qiangic languages are related at all, or if she is just rejecting Qiangic as a subgroup. The latter position need not entail a complete absence of a genetic relationship: e.g., Qiangic could consist of languages from multiple Sino-Tibetan branches which have converged. Is Prins' Qiangic like my Altaic (completely unrelated languages that have converged) or like the Balkan languages (which are from different branches of Indo-European)?

2. I assume Guillaume is still using Gong's 1997 reconstruction of Tangut which has three grades of rhymes (his III corresponds to my III and IV):

Grade | Gong | Gong's source | Arakawa | This site
I | -Ø- | *-Ø- | -Ø- | -Ø- + lowering of high vowel
II | -i- | *-r- | -j- | -ɤ-
III | -j- | *-j- | long vowel | -ɯ-
IV | | | | -i-

Gong's Tangut -j- and Old Chinese *-j- were retentions from his Proto-Sino-Tibetan *-j-. On the other hand, Guillaume does not reconstruct Old Chinese *-j-. How does he account for Gong's Tangut *-j-?

3. Pre-Tangut *a was raised and fronted ('brightened') to various degrees. I have tried to explain the multiple reflexes of *a by reconstructing presyllables with front vowels:

*CI-Ca > Ci

*CE-Ca > Cie

More recently I have hypothesized that some 'brightening' might have been conditioned by a suffix *-j: e.g.,

1749 *kwa-j > 1kwe 'hoof'

cf. Ersu nkhuɑ⁵⁵ 'id.'; more cognates at STEDT

How does Guillaume explain 'brightening' in Tangut?

4. Gong reconstructed long vowels that do not correspond to long vowels in Tangut transcriptions of Sanskrit. I am now agnostic about those vowels and reconstruct them with an abstract symbol ' to differentiate them from their much more common '-less counterparts (short vowels in Gong's reconstruction). The zero-' distinction does not seem to correspond to anything in rGyalrong; both types of Tangut vowels correspond to the same Japhug rGyalrong vowels (Jacques 2006): e.g.,

Tangut rhyme | Gong | This site | Japhug
37 | -jij | -ie | -i, -e, -o
40 | -jiij | -ie' | -i, -e, -o

What does Guillaume think is the source of vowel length in Gong's reconstruction? Does that length reflect a distinction lost in Japhug?

5. I reconstructed *-H as the source of the Tangut second ('rising') tone; syllables without *-H developed the Tangut first ('level') tone. This type of tonogenesis has parallels in Chinese, Tibetan, and Burmese, but not Southern Qiang (Evans 2007). What is Guillaume's account of the origin of Tangut tones?


14.8.27.23:30: A *E(YE)-GRADE ROOT?

If Tangut 1new 'breast', 2niu 'to drink milk', and 2niụ 'to give milk' are from the root *√n-w that I proposed in my last post, there ideally should be other sets of -ew ~ -iu words. Unfortunately, I still haven't gotten around to looking for them, but today it did occur to me that Tangut

4684 1me 'eye'

and Old Chinese 目 *muk 'eye' might share a root *m-kʷ:

*m-e-kʷ (e-grade) > *mew > 1me

(See this series of posts on Tangut *labial-w syllables: 12.7.23 / 12.7.28 / 12.7.29).

*m-kʷ (zero-grade) > 目 *muk

This word is widespread in Sino-Tibetan (STEDT roots #33, 681, 682). It often has an -i-: e.g., Tibetan mig 'eye'. Was there an i-grade? (I have borrowed the terms e-grade and zero-grade from Indo-European studies. There is no such thing as an i-grade in Indo-European, but perhaps it existed in Sino-Tibetan.) Or was the root *mʲ-kʷ with an initial palatalized consonant that was vocalized as -i- in the zero grade in many languages?

One might want to resurrect an old-fashioned reconstruction *mjuk for Old Chinese 目 *muk and view its *mj- as a reflex of *mʲ-, but I have never seen any evidence for *-j- in modern Chinese languages, and there is no trace of *-j- in Sinoxenic:

Taiwanese bak (colloq.), bok (lit.)

Cantonese muk

Mandarin mu (in earlier reconstructions, *mj- > w-, not m-)

Sino-Vietnamese mục (not *dục < *mjuk or *mʲuk)

Sino-Korean mok

Sino-Japanese moku, boku

One might also be tempted to regard 覓 Middle Chinese *mek 'to seek' as being from an e-grade *m(ʲ)ekʷ like Tangut 1me 'eye', but the earliest attestation of the word that I can find is in Yupian (c. 543 AD), so I do not know if it should be reconstructed at the Old Chinese level. It could be a later unrelated innovation that has nothing to do with m-words for 'eye'.


14.8.26.23:52: A *N-W WORD FAMILY?

The Tangut word

4834 2niụ < *S-nuH 'to give milk'

from my last two posts is a causative derivative of

4614 2niu < *nuH 'to drink milk'

and I think those two words are related to

2123 1new 'breast' =

left of 3588 1new 'radish' (phonetic) +

left and center of 5275 2nɪʳ 'breast' (semantic).

Tangut *-w can be from *-k or *-w. If 2123 1new 'breast' is from *new and not *nek*, then it and the niu-words may share a root *√n-w:

pre-Tangut | prefix | root consonant 1 | vowel | root consonant 2 | suffix | gloss
e-grade | Ø | n- | -e- | -w | Ø | breast
zero-grade | Ø | n- | | -w | -H | to drink milk
zero-grade | S- | n- | | -w | -H | to give milk

The *-w of the zero-grade root would have been pronounced as a vowel [u].
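
The template behind that table can be written out as a short Python sketch. It is only an illustration of the grade hypothesis: the function name and its parameters are hypothetical, and the only vocalization rule it knows is the one just stated (zero-grade *-w pronounced as [u]).

```python
def build(root, grade="zero", prefix="", suffix=""):
    """Assemble a pre-Tangut form from a two-consonant root and its affixes.

    root   -- pair of root consonants, e.g. ("n", "w")
    grade  -- "zero" for no vowel, otherwise the vowel itself ("e", "o", ...)
    """
    c1, c2 = root
    if grade == "zero":
        # With no vowel between the consonants, *-w is vocalized as [u].
        core = c1 + ("u" if c2 == "w" else c2)
    else:
        core = c1 + grade + c2
    pre = prefix + "-" if prefix else ""
    return "*" + pre + core + suffix

N_W = ("n", "w")
print(build(N_W, grade="e"))               # 'breast'
print(build(N_W, suffix="H"))              # 'to drink milk'
print(build(N_W, prefix="S", suffix="H"))  # 'to give milk'
```

The same function with grade="o" gives the shape I would need for an o-grade form.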

Could the grade hypothesis account for the vocalic diversity of these cognates?

Old Chinese 乳 *Cɯ-noʔ 'nipple, milk' could be from an o-grade *n-o-w. (The prefix could be *pɯ- if 孚 *phu is phonetic.)

If the above scenario is correct, are there other cases of *Cew ~ *Cu alternations in Tangut, and what is the significance of the different grades?

Li (2008: 832) regarded 5275 2nɪʳ 'breast' as a loan from Chinese. The only similar Chinese word I know of is 奶 'breast, milk'. But I cannot find any attestations of 奶 before the Qing Dynasty. If 奶 had existed in northwestern Middle Chinese, it would have been pronounced *nəjˀ, and if Tangut speakers added a *T-prefix, the resulting *T-nəjˀ would have developed into 2nɪʳ. The Old Chinese source of *nəjˀ would be *Cʌ-nəʔ which might have come from an even earlier *Cʌ-nəw-ʔ: i.e., a schwa-grade form of *√n-w. Perhaps the pre-Tangut prefix directly reflects the Old Chinese prefix if the latter had survived in the colloquial speech of the northwest during the Middle Chinese period:

OC *Tʌ-nəw-ʔ > *Tʌ-nəʔ > MC *T(ʌ)-nəjˀ > pre-Tangut *T(ʌ)-nəjˀ > Tangut 2nɪʳ

However, all that is highly speculative.

2nɪʳ could be an unrelated lookalike from a pre-Tangut source such as *Cʌ-nirH.

In any case, I cannot think of a way to derive 2nɪʳ from *√n-w within Tangut. If it is ultimately from that root, it would have to be a Chinese loanword.

*One might be tempted to reconstruct *-k since Maru has nuk⁵⁵ 'breast, milk', but Maru -k is an innovation.


14.8.25.23:45: A BOVINE DYNASTY? (PART 2)

Guillaume Jacques (2010) equated the second half of ngo.snuHi, the Tibetan transcription of the name of the mythical first Tangut emperor, with Tangut

2niụ < *S-nuH 'to give milk'

In 2008, Guillaume rejected the temptation to go further and equate Tibetan s- with pre-Tangut *S- (his *s-):

A bolder hypothesis would be to see in this s- a notation of the causative prefix *s- that must be reconstructed for this verb. In Tangut, nju.² [= 2niụ]  is derived from nju² [= 2niu]  (#4614) 'to drink milk'; the causative prefix *s- disappeared, leaving as its only trace the 'tense voice' noted by a dot below the vowel (Gong 1999). This hypothesis, however, is very improbable insofar as it would presuppose that the Tibetan spelling preserves a pronunciation of Tangut older than the system reconstructed from the twelfth-century dictionaries, and thus at least four hundred years earlier than the Tibetan texts themselves.

He regarded the Tibetan s- as "a simple orthographic artifice" since

in fourteenth-century Central Tibetan, the preinitial consonants had probably already merged or even fallen silent

but I wonder if ngo.snuHi reflects a nonstandard 14th century Tangut dialect which preserved pre-Tangut *S-. Tangut may have been internally diverse, and this dialect may have been to 12th century standard Tangut what modern Cantonese (which preserves final stops) is to Tangut period northwestern Chinese (which lost final stops) or what Ladakhi (which preserves some s-clusters: examples here and here) is to 14th century central Tibetan.

The -Hi may reflect a -j which was another trait of this 14th century Tangut dialect. Summing up the differences between the two types of Tangut and their common parent pre-Tangut:

Word | cow | to give milk
Pre-Tangut | *ŋwə(-j)-H | *S-nu(-j)-H
Standard Tangut | 2ŋwɪ | 2niụ
Later nonstandard Tangut | ŋwə or ŋ(w)o | snuj
Tibetan transcription | ngo | snuHi

The standard dialect had a *-j suffix in 'cow' absent from the nonstandard dialect. Conversely, the nonstandard dialect had a *-j suffix in 'to give milk' absent from the standard dialect.

It is remotely possible that the -iụ of standard 2niụ could be a metathesis of *-u-j rather than a breaking of *u. But even if that were true - and I don't think it is - many or even most -iu could not be from *-u-j, as -iu regularly corresponds to Japhug rGyalrong < Proto-rGyalrong *-u (Jacques 2004: 143, 2006: 16-17). Moreover, if such a metathesis had occurred in standard Tangut, I would expect Chinese *-uj or perhaps even *-wi to correspond to Tangut -iu in very early loans. No such loans have been identified.

In any case, *-j cannot go back very far because probable cognates lack it, and nothing else leads me to believe that Tangut preserved a *-j lost elsewhere.

Next: A *n-w word family?


14.8.24.23:42: A BOVINE DYNASTY? (PART 1)

Guillaume Jacques (2010) equated 2339, the first syllable of the Tangut imperial surname 2ŋwɪ 1mi, with its homophone (and near-homograph) 0395 2ŋwɪ 'cow':

The shared center and right components are phonetic. The surname tangraph has 'sage' on the left, whereas 'cow' has the center of 'bear' according to Precious Rhymes of the Tangraphic Sea:

0395 2ŋwɪ 'cow' = 5605 2riẽ 'bear' + 2139 2ŋwɪ 'a kind of bird'

Without looking outside Tangut, I could reconstruct the pre-Tangut source of 2ŋwɪ 'cow' as

*Cʌ-ŋwiH (if the -w- is original) or

*Pʌ-ŋiH (if the -w- is from a presyllable)

with a low presyllabic vowel to condition the lowering of *i. However, it is unlikely that the root vowel was once *i.* Probable external cognates such as Old Chinese 牛 *ŋʷə 'cow' and Written Burmese nvāḥ (< *ŋwaH?*; many more here) point to a nonfront vowel. This word was borrowed into southwestern Tai as *ŋuaA 'ox'**.

I used to reconstruct the rhyme of 2ŋwɪ 'cow' as -əi. Could 2ŋwɪ or 2ŋwəi be from *ŋʷə-i-H?

The name of the first Tangut emperor was transcribed as ngo.snuHi in Tibetan. Guillaume Jacques identified that as Tangut

0395 4834 2ŋwɪ 2niụ 'the cow gives milk /  [someone] fed milk by the cow'

whose meaning was rendered in Tibetan as

ba-la Ho.ma Hthung-ba

cow-DAT milk drink-NMLZ

'he who drinks milk from the cow'

ngo might be a transcription of a nonstandard Tangut *ŋwə without my proposed suffix *-i. There is no character for schwa in the Tibetan script, so Tibetan o might represent a schwa. It is also possible that a pre-Tangut *ŋwə could have become *ŋo in that dialect (whereas *-wə did not fuse into -o in standard Tangut).

*Was *ŋw- > *nw- a regular change in Proto-Lolo-Burmese? I can't remember if my unpublished reconstruction from twenty years ago had either cluster. Matisoff's (1972) reconstruction has only one word with *ŋ(w)- which has a variant initial *mw- (not *nw-!).

**Do variants with w/v- and h- reflect different sources of borrowing? See Gedney's list of forms in Hudak 2008: 95. Unfortunately I could not find the word in Pittayaporn's  2009 dissertation on Proto-Tai. It may have been excluded because it could not be reconstructed at the Proto-Tai level.


14.8.23.23:48: WAS THE TANGUT IMPERIAL FAMILY THE MI OF WEI?

Two years ago I saw Guillaume Jacques' derivation of the Tangut imperial surname

2339 1903 2ŋwɪ 1mi

from a hypothetical homophonous phrase

0395 4542 2ŋwɪ 1mi 'the cow feeds [someone]' / 'fed by the cow'

At the end of last month, I saw another derivation but couldn't remember what it was. I found it last night in Nishida (2010: 233):

It is very probable that the second syllable, miɦ (level 11), of ŋʷwɪ-miɦ (level 11) meaning "imperial family" was one of the corresponding cases of the [Tangut autonym] Mi. Its meaning might have been the Mi of Wei 魏.

The Tangut imperial family claimed descent from the Tuoba clan of the Northern Wei. Although this etymology is initially appealing, it has phonological problems.

First, the Tangut called the Wei

4962 2vɪ or 5574 2vɨi

rather than 2ŋwɪ. v- in those transcriptions reflects the loss of *ŋ- in the Tangut period northwestern Chinese pronunciation of 魏. Perhaps the imperial surname contains an earlier borrowing of 魏 preserving its nasal initial.

Second, the Tangut autonym


2344 2mi < *miH

has a 'rising' tone, whereas the second syllable of the surname has a 'level' tone. This tonal difference does not necessarily rule out a connection between the two names. The 'rising' tone of the autonym may be a reflex of a final glottal suffix *-H absent from the 1mi < *mi of the surname. Both 2mi and 1mi may be cognate to Tibetan mi 'person'.
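
The two suprasegmental correspondences I rely on in these posts, *-H for the second tone and *S- for tense vowels (the subscript dot), can be summarized in a short Python sketch. It is deliberately minimal and makes assumptions of its own: the function name is hypothetical, pre-Tangut forms are plain strings in my notation, and only a word-initial *S- written with a hyphen is recognized.

```python
def tone_and_tension(pre_tangut):
    """Return (tone, tense) predicted for a pre-Tangut form such as '*S-nuH'.

    tone  -- 2 ('rising') if the form ends in *-H, otherwise 1 ('level')
    tense -- True if the form begins with the prefix *S-
    """
    form = pre_tangut.lstrip("*")
    tone = 2 if form.endswith("H") else 1
    tense = form.startswith("S-")
    return tone, tense

for source in ("*miH", "*mi", "*S-nuH", "*nuH"):
    print(source, tone_and_tension(source))
```

On this view the autonym 2mi and the surname syllable 1mi differ only in the presence of *-H.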


14.8.22.23:44: A SILKEN SOURCE FOR THE RED RADICAL?

I'm surprised I was able to account for all uses of the 'red' radical

(Boxenhorn code: qie; Nishida radical 226)

in a straightforward manner in my last post. It means 'red' and/or is phonetic in all but one case (E):

A. n-phonetic
B. 'red'
C. xŨ-phonetic in 1741 1xõ < B. 1xʊ̃ 'red'
D. -iã-phonetic in 2049 2siã < B. 2ʔiã (1st syl of 'rouge')
E. 1tʂhɨĩ 'Chen' (a family on the land of the 2nie family?)

I am normally at a loss to explain the function of a component in one or more tangraphs containing it. For instance, I have no idea what the right side of 1671 1nie 'red' is doing. It is in 65 other tangraphs. I think it is phonetic in

1674 2nie (second syllable of 2mi 2nie 'younger sister')

1809 2nie (second syllable of 1ɣɤə 2nie 'few')

which are near-homophones of 1nie 'red'. But what is it doing in, say,

3528 2tho' 'to harm, endanger'

whose analysis is unknown? Did red signify danger?

Going back to the other half of 1671 'red', I think it might be derived from the seal form of the top half (幺) of the Chinese 'silk' radical 糸 on the left side of Chinese 紅 'red'. The vertical line at the top of the Chinese 'silk' radical corresponds to the horizontal line of the Tangut 'red' radical, and the two circles correspond to the two

of the Tangut 'red' radical. If the admittedly vague similarity between the two radicals is just pareidolia on my part, did the Tangut simply draw a random line pattern and declare it to be 'red' and/or nie?


14.8.21.23:49: A RED RADICAL

Half of the nie-tangraphs (Tangut characters) from my previous entry contained the element

(Boxenhorn code: qie; Nishida radical 226)

that appears in twenty other tangraphs. Here are all 31 qie-tangraphs. Asterisks indicate words which are only in dictionaries to the best of my knowledge.

Class | Li Fanwen 2008 # | Reading | Gloss
A1 | 0529 | 1nie | 2nd syl of 'be stifled to death'*
A1 | 1671 | 1nie | red
A1 | 1774 | 1nie | 2nd syl of 'servant'
A1 | 2231 | 1nie | to try
A2 | 0547 | 2nie | surname syl
A2 | 0548 | 2nie | 2nd syl of 'kind of insect'*
A2 | 0593 | 2nie | 2nd syl of 'kind of grass'
A2 | 1678 | 2nie | 2nd syl of 'chin'
A2 | 1723 | 2nie | 2nd syl of 'colored silk'*
A2 | 1858 | 2nie | 2nd syl of 'to hide'*, 'to turn around'
A2 | 2239 | 2nie | bird name syl
A3 | 0363 | 1nʊ | transcription
A4 | 1235 | 1nĩ | red
B | 0250 | 1bɪ̣ | sand
B | 0260 | 1ŋa | red*
B | 0696 | 1tʂwɤi | red*
B | 0820 | 2dʐɨə̣ | red
B | 1076 | 1ʔie | red jade necklace*
B | 1164 | 1tʂhɨõ | red sand
B | 1236 | 2dʐɨə̣ | red
B | 1402 | 1xʊ̃ | red (Chn)
B | 1692 | 2ʔiã | 1st syl of 'rouge' (Chn)
B | 2237 | 1dɪ̣ | red soil*
B | 4220 | 2giẹ | red wood*
B | 4880 | 2rəʳ | copper
B | 4959 | 1tʂɨi | 2nd syl of 'rouge' (Chn)
B | 5102 | 1giuʳ | kidney
B | 5827 | 2sọ | millet
C | 1741 | 1xõ | transcription
D | 2049 | 2siã |
E | 0298 | 1tʂhɨĩ | surname 陳 Chen

The 31 fall into five categories:

A. qie as n(ie)-phonetic: 13 tangraphs

A1. qie as 1nie-phonetic: 4 tangraphs

A2. qie as 2nie-phonetic: 7 tangraphs

A3. qie as n-phonetic: 1 fanqie tangraph (2nie + 1tʊ = 1nʊ)

A4. qie as n-phonetic / semantic for 'red' (see category B below): 1 fanqie tangraph (1nie 'red' + 1ʔĩ = 1nĩ 'red')

B: qie as semantic abbreviation of 'red': 15 tangraphs

C. qie as xŨ-phonetic: 1 tangraph

Cf. 1402 1xʊ̃ 'red' (Chinese loanword) in category B.

The Tangraphic Sea derived 1741 1xõ from 1671 1nie 'red' (whose Chinese translation was 紅 *xʊ̃) and 3682, first syllable of 2mə 1ʔɤõ 'merit' (which could be translated into Chinese as 勳 *xiũ)

D. qie as -iã-phonetic: 1 fanqie tangraph (1si + 2ʔiã = 2siã)

E. qie as abbreviation of a surname Ne or a surname containing the syllable ne: 1 tangraph

Was a Chen family in the Tangut Empire on the land of a Tangut family whose name contained Ne? The Tangraphic Sea derived the right half of 0298 from 2107 1tsɪʳ 'earth'.
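
The fanqie analyses in categories A3, A4, and D all work the same way: the initial comes from the first tangraph, and the tone and rhyme come from the second. The Python sketch below restates that with my readings. The segmentation of each reading into tone, initial, and rhyme is supplied by hand, and the function name is hypothetical.

```python
import unicodedata

def fanqie(first, second):
    """Combine two (tone, initial, rhyme) triples into one reading.

    The initial is taken from the first triple; the tone and rhyme
    are taken from the second.
    """
    _, initial, _ = first
    tone, _, rhyme = second
    # Normalize so that precomposed and combining diacritics compare equal.
    return unicodedata.normalize("NFC", f"{tone}{initial}{rhyme}")

# (first speller, second speller, expected reading) from A3, A4, and D
EXAMPLES = [
    ((2, "n", "ie"), (1, "t", "ʊ"), "1nʊ"),
    ((1, "n", "ie"), (1, "ʔ", "ĩ"), "1nĩ"),
    ((1, "s", "i"), (2, "ʔ", "iã"), "2siã"),
]

for first, second, expected in EXAMPLES:
    assert fanqie(first, second) == unicodedata.normalize("NFC", expected)
    print(fanqie(first, second))
```

In all three cases the qie element comes from the tangraph that supplies either the initial n- or the rhyme -iã, which is why I count it as phonetic there.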


14.8.20.23:59: PROXIMATE PRONUNCIATION

Yesterday I wrote about the transcription evidence and potential cognates of Tangut

1nie 'relative'

which originally may have meant 'near' (relatives being the people nearest to oneself).

Fanqie spellings expressing the pronunciation of tangraphs (Tangut characters) in terms of the initials and rhymes of other tangraphs are only available for a little over half of the 6,000+ known tangraphs. Unfortunately, no fanqie are known for either 1nie or its second ('rising') tone counterpart 2nie.

Usually first ('level') tone tangraphs have fanqie in the surviving first volume of the Tangraphic Sea, but that volume is missing some pages including those which probably contained tangraphs for 1nie and other syllables with the 36th rhyme of the first tone.

The fanqie for most second tone tangraphs is probably in the lost second volume of the Tangraphic Sea. (Some second tone fanqie are in the surviving third volume Mixed Categories.)

Homophones lists 22 characters in a homophone group mixing 1nie and 2nie. All but one (0548) can also be found in Precious Rhymes of the Tangraphic Sea which has no fanqie for any of them.

Homophones | Li Fanwen number | Reading | Gloss | Tangraphic Sea / Precious Rhymes
13B31 | 1723 | 2nie | second syllable of 2ŋwəʳ 2nie 'colored silk' (only in dictionaries?) | in missing second volume?
13B32 | 1671 | 1nie | red | in missing pages of first volume?
13B33 | 0547 | 2nie | the surname Ne (occurs as a first or second syllable in disyllabic surnames but unclear if it can occur by itself); transcription character | in missing second volume?
13B34 | 1858 | 2nie | second syllable of 1lɨa 2nie 'to hide' (only in dictionaries?; the first half can mean 'to hide' by itself) and 1gie 2nie 'to turn oneself around, look around; the other way around' |
13B35 | 0593 | 2nie | second syllable of 2khwa 2nie 'a kind of grass' |
13B36 | 0548 | 2nie | second syllable of 1lhə 2nie 'a kind of insect' (only in dictionaries?) | not in either book
13B37 | 1678 | 2nie | second syllable of 2miə 2nie 'chin' | in missing second volume?
13B38 | 1774 | 1nie | second syllable of 2nieʳ 1nie 'servant' | in missing pages of first volume?
13B41 | 0529 | 1nie | second syllable of 1nie' 1nie 'to be stifled to death' (only in dictionaries?) |
13B42 | 1732 | 1nie | first syllable of the surname 1nie 1xɤu |
13B43 | 0806 | 2nie | second syllable of 2mɪ 2nie 'wind' (only in dictionaries?; the first half by itself is the name of the 'wind' trigram ☴) | in missing second volume?
13B44 | 1674 | 2nie | second syllable of 2mi 2nie 'younger sister' ('ritual language'? only in dictionaries?) |
13B45 | 1809 | 2nie | second syllable of 1ɣɤə 2nie 'few' |
13B46 | 1926 | 2nie | in the past |
13B47 | 0213 | 1nie | relative | in missing pages of first volume?
13B48 | 2231 | 1nie | to try, second half of 1lɨe 1nie 'emissary' (the first syllable is 'to serve' by itself), first half of 1nie 2ʔwiəʳ 'writing on silk [cf. 1723 above with a different tone], written correspondence' (the second syllable is 'writing' by itself) |
13B51 | 2239 | 2nie | second half of 2biu 2nie 'nightingale', first half of 2nie 2no 'cuckoo, oriole' | in missing second volume?
13B52 | 3671 | 1nie | first syllable of 1nie 2riaʳ 'father' (only in dictionaries?; the second half needs to be combined with either 1nie- for 'father' or -2si for 'mother') | in missing pages of first volume?
13B53 | 5147 | 2nie | first syllable of 2nie 1ɣa 'dog' (only in dictionaries?) | in missing second volume?
13B54 | 3846 | 2nie | optative prefix (< 'downward'), you (is this pronoun only in dictionaries?) |
13B55 | 3817 | 2nie | to present a gift (only in dictionaries?) |
13B56 | 0638 | 2nie | to compel, drive |
Since Tangut dictionaries - both ancient and modern - are character-based, one might think 22 characters stood for 22 monosyllabic words pronounced nie, but in fact there are only three monosyllabic 1nie words and only three or four monosyllabic 2nie words:

1nie: 1. 'red', 2. 'relative', 3. 'to try'

2nie: 1. 'the surname Ne' (? - unsure if it can occur by itself), 2. 'in the past', 3. 'to present a gift', 4. 'to compel, drive'

8.21.2:42: It is tempting to try to derive the polysyllabic words from the monosyllabic nie-words, particularly since some of them are combinations of nie with monosyllabic words: e.g.,

1995 0806 2mɪ 2nie 'wind'

whose first syllable is also the Tangut name of the 'wind' trigram ☴. Could that word literally be 'red wind'? The trouble with that case and others is that 'red' is 1nie, not 2nie. One could try to salvage the etymology by proposing that 2nie in compounds is from 'red' plus an *-H suffix conditioning the second tone. But it is dangerous to build speculations atop speculations. Moreover, in this particular case, perhaps 2mɪ is an abbreviation of a monomorphemic, disyllabic 2mɪ 2nie 'wind'.


14.8.19.23:39: PROXIMATE PEOPLE

Today I saw Tibetan nye 'near' which brought to mind a possible Tangut cognate:

0213 1nie 'relative' (i.e., one's near relations; I covered related characters here)

This word was transcribed in Tangut period northwestern Chinese as 你 *ni. No Tibetan transcription is known, but its near-homophone

3830 2nie 'king'

with a different tone was transcribed in Tibetan as nye(H) and ne(H). (Tibetan ཉ ny- [ɲ] and ན n- are different letters.)

I normally derive Tangut rhyme 37 -ie from pre-Tangut *Cɯ-e:

*Cɯ-ne > *Cɯ-nie > nie

The -i- is a trace of the lost presyllabic high vowel *ɯ.
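
The two steps of that derivation can be written out as ordered string rewrites. This is only a rough sketch in Python: the function name `derive` and the convention of splitting a form at the hyphen are my own, and the rule is hard-coded for the *Cɯ-e case above rather than being a general model of pre-Tangut sound change.

```python
def derive(form: str) -> list[str]:
    """Return the stages of the hypothetical *Cɯ-e > -ie derivation.

    `form` is a pre-Tangut form written as 'presyllable-main syllable'
    without the asterisk, e.g. 'Cɯ-ne'. Forms that lack a presyllable
    containing ɯ are returned unchanged.
    """
    stages = [form]
    presyllable, _, main = form.partition("-")
    if main and "ɯ" in presyllable:
        # Step 1: the presyllabic high vowel conditions an -i- before e.
        main = main.replace("e", "ie", 1)
        stages.append(f"{presyllable}-{main}")
        # Step 2: the presyllable is lost, leaving -i- as its only trace.
        stages.append(main)
    return stages


print(" > ".join(derive("Cɯ-ne")))
```

A form with no presyllable, such as the alternative *ɲe considered below, passes through this function untouched, which is the point of the contrast: in that scenario the -i- cannot be blamed on *ɯ.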

However, Tibetan nye makes me wonder if Tangut -i- in 'relative' is primary rather than secondary:

*ɲe > nie = [ɲe]? [nje]?

Similar Qiangic and rGyalrongic words for 'near' (see sections 3.2 and 3.3 of this list) have palatal ȵ- (= ɲ) or dental n-. (See items #1757-1758 here for very different rGyalrong words.)

Possible Old Chinese cognates have n-:

*Cɯ-ne(j)ʔ (< *n-e-j + -ʔ?) 'near'

*Tnik (< *T- + √n-j + -k?) 'near, be familiar with'

Could those words contain e-grade and zero-grade forms of a root *n-j? Could the root-initial consonant have been *ɲ-? Were *Cɯ- and *T- the same prefix with and without a presyllabic vowel? Were *-ʔ and *-k variants of the same suffix?


14.8.18.23:36: SREDNJI KITAJSKI JĘZYK

I felt uncomfortable about mentioning Middle Chinese reconstructions in my last post because they may give the false impression that Chinese in the past was more homogeneous than it actually was.

It occurred to me last night that Middle Chinese is about as real as Interslavic, my favorite constructed language. If future linguists knew nothing about Russian, Polish, Serbo-Croatian, etc. - i.e., specific actual languages - Interslavic would have to do for comparisons with other European languages. The title of this post is Interslavic for 'Middle Chinese language'.

I suspect that diversity within Middle Chinese was like that between Slavic languages today. So my *kon for 昆 in this table is to real Middle Chinese forms what Interslavic koń 'horse' is to these modern Slavic words: similar but not necessarily identical. Interslavic koń happens to match the actual Polish word for horse, but its vowel is very different from that of Ukrainian кінь [kinʲ] 'horse', and it is completely different from Russian лошадь [loʂətʲ] 'horse', a loan from Turkic. If a language borrowed a word kin 'horse' from Ukrainian, it would be strange to say that kin is from a 'Slavic' koń. Yet how many would blink if I wrote that Sino-Vietnamese mã is a loan from 'Middle Chinese' 馬 *mɤaˀ? (The actual source of mã was more like *ma with a 'rising' tone in a southern late Tang variety of Middle Chinese.)

'Middle Chinese' may sound specific, but it's actually a generic term like 'Middle Indic' which could refer to Pali, Gandhari, Ardhamagadhi, etc.

Unfortunately there is no analogous established terminology for specific varieties of Middle Chinese. It is easier to type a simple name like Pali than a phrase like 'Tangut period northwestern Chinese' (TPNWC), the dialect in the Timely Pearl in the Palm that is also the source of Chinese loans in Tangut.

Tonight I momentarily considered renaming TPNWC 'Zaric' after Tangut


1ɮar 'Chinese'

but that term would make no sense to those who didn't know the Tangut word. Although my older term is more tedious, it is also more transparent.

One could think of Middle Chinese reconstructions as being as open to interpretation as Interslavic pronunciation: e.g., the ę of język 'language' in the post title could be [ʲa] ~ [ʲɛ] ~ [ʲɛ̃] ~ [ʲɔ̃] ~ [ɛ]; [ʲæ] is a suggested average.

That description of Interslavic states that "[a]ccentuation is free." Hence there is no way that one could figure out Serbo-Croatian tones from Interslavic: e.g., the falling tone of konj 'horse'. (Most of Slavic lacks tones, so Interslavic also lacks them.)

The situation is a bit different with Middle Chinese tones. The Old Chinese sources of Middle Chinese tones are known (e.g., *-ʔ in 'horse'), but their phonetic realizations are not. It is likely that *-ʔ left a trace as glottalization which disappeared at different times in different places (and is still present in today's Xiaoyi), and pitches once associated with glottalization became phonemic.

Although 'rising', the traditional name of the tone category for 'horse', suggests the tone was rising, that may not have been true in all Middle Chinese varieties, and it is certainly not true today: e.g., in Taiwanese, 馬 'horse' has a high falling tone (indicated with an acute accent in romanization!). See Sagart (1998) for more on Chinese tonal history.

I have similarly used an acute accent to indicate the 'rising' tone in Middle Chinese varieties after glottalization was lost, but that accent may imply a rising or even high tone though I am actually agnostic about its contour, so I am reluctant to use it now. Maybe it's time to dust off my tone codes.

All of the above also applies to Old Chinese except for the part about tones since Old Chinese didn't have any. Old Chinese was not uniform before the mid-first millennium AD. In fact, 揚雄 Yang Xiong (53 BC-18 AD) wrote the first Chinese dialect dictionary, 方言 Fangyan 'Areal Speech', toward the end of the Old Chinese period. I think that oddities in Chinese loans in Vietnamese and Tai may in part reflect Old Chinese diversity that has been lost. Proto-Indo-European must also have been diverse.

Speaking of Proto-Indo-European, I don't understand how Proto-Indo-European *ḱem- 'hornless' became Proto-Slavic *konjь; why didn't PIE *ḱ- become PS *s- (cf. Sanskrit śama- 'domestic'), and why didn't PIE *-m- become PS *-m-?


14.8.17.23:51: WHEN B IS SPELLED G

If Vietnamese mắm [mam] < *ɓamʔ 'salting' could be written with the velar-initial phonetic 禁 cấm [kəm] (see my last two entries), could labial-initial syllables be written with velar-initial phonetics in sawndip, the traditional Zhuang script, as well? I looked through Sawndip sawdenj [Traditional Zhuang script dictionary] which I admit is a problematic source* and found the following characters with velar-initial phonetics for Zhuang [p]-initial syllables:

Standard Zhuang reading IPA Semantic component Phonetic (?) component and Middle Chinese reading Zhuang reading of phonetic (?) component Meaning
boenq pon³⁵ 土 'earth' *kon (> some northern Pinghua readings with khw-; the aspiration is irregular) goen [kon³⁵] dust
bomx poːm⁴² 足 'foot' *kuŋ 'bow' (archery) no reading for 弓 in isolation; 弓 is a phonetic in goem [kom³⁵] to crouch
byaij pjaːj⁵⁵ *ŋwajʰ 'outside' (> three northern Pinghua dialects have m-!) vaih [waːj³³] to walk
byangj pjaːŋ⁵⁵ 強 'strong' *kɔŋ (> early Mandarin *khjaŋ) gangj [kaːŋ⁵⁵] hot pain?
byoq pjo³⁵ 火 'fire' *khɨak (> early Mandarin *khjaw, northern Pinghua readings khio, khyo) cog [ɕoːk³³] to bake
byouz pjow³¹ *gu caeuz [ɕaw³¹] to boil
byuk pjuk³⁵ 虫 'bug' *kok goek [kok³⁵] white ant
byuz pju³¹ 瓜 'melon' *ɣo (> some northern Pinghua readings with f-: e.g., Guilin [Yanshan zhuyuan dialect] fu) no reading for 乎 in isolation; 乎 is a phonetic for fouj [fow⁵⁵], fuj [fu⁵⁵], hued [hut³³], huz [hu³¹], ruz [ɣu³¹], and youq [jow³⁵] gourd
bywngj pjɯŋ⁵⁵ 足 'foot' *khəŋˀ haengj [haŋ⁵⁵] verb suffix
扌 'hand'

(8.18.0:54: Added Zhuang reading of phonetic component column. The title of this post should make more sense now. I was referring to how Zhuang b-syllables were written with g-phonetic components.)

At least two phonetic components may actually be semantic: e.g.,

*kuŋ 'bow' (archery) could refer to bending down in 足+弓 bomx 'to crouch'

*ŋwajʰ 'outside' could refer to going outside in 足+外 byaij 'to walk'

or it could have been chosen for a labial or labiodental initial ([w]? [v]? [m]?) close to by- [pj]

*ɣo may be a reference to its homophone, the first syllable of 葫蘆 'gourd'.

The other phonetic components are baffling. If they are really phonetics, were they chosen only for their rhymes? Or did they have labial-initial readings in local varieties of Chinese?

*Holm (2011: 2) pointed out that

The Sawndip sawdenj is a useful compendium, but it provides no information about where the dialect forms come from, so it is impossible to see any patterns in geographic variation from this source.

Moreover, all the readings in Sawndip sawdenj are in standard Zhuang, even though the characters could be from all over the Zhuang-speaking world. Hence I presume many actual readings have been converted into hypothetical standard Zhuang equivalents. Such readings are strictly speaking not readings at all, since no literate native speaker would have ever used those hypothetical readings. Nonetheless I hope those hypothetical readings are close enough to the originals for my purposes here: e.g., b-[p] readings are most likely from nonstandard [p]-readings.


14.8.16.23:36: WHEN B IS SPELLED C

In my last entry, I wrote about three types of nom characters for Vietnamese mắm 'salting':

1. m-phonetic characters: e.g., 𩻐 = 魚 'fish' + right of 鎫 m- 'head ornament for a horse' (Sino-Vietnamese reading unknown but presumably similar to its nom reading mâm)

2. c-phonetic characters: e.g., 鹵 'salt' + 禁 cấm 'to forbid'

3. b-phonetic characters: e.g., 酉 'liquor' + 稟 bẩm 'to receive from above'

The third type of mắm-character must have been devised at a stage when 'salting' had an initial closer to the initial of 稟 (i.e., stage 1 or 2 below):

Stage
'salting'
1
*p

*m

2
*ʔm
3
b [ɓ] m [m]

The first type of mắm-character must date from stage 2 or 3.

The second type of mắm-character continues to baffle me. If I didn't know anything about Vietnamese or Chinese, I might propose a solution involving a labiovelar, but labiovelars did not exist in earlier Vietnamese, and 禁 never had a labiovelar or a velar-labial cluster *kw- in Chinese. Did 'salting' once have a cluster *kɓ- in Vietnamese? There is no support for *k- in other Vietic languages.

I looked for other cases of c-phonetics for syllables with *ɓ- and other labial initials in the Nom Foundation's Kiều index and only found a single example: biếng khuây 'unforgettable' was written as 更亏 in line 246 of the 1872 version of Kiều. 更 is normally read as canh 'watch of the night' and cánh 'more'. Khuây is 'forget', and I doubt 更 has semantic relevance in 更亏: why write 'unforgettable' as 'watch forget' or 'more forget'? 更更 canh cánh 'obsessed' appears earlier in the line, so I wonder if 更 for biếng later in the line is an accidental substitute for the b-phonetic character that appeared in earlier editions.


14.8.15.23:51: FORBIDDEN SALT

Last night I mentioned two examples of phonetics representing Vietnamese syllables with different onsets in the nom script. Here's a third.

As Vietnamese cuisine becomes more popular, more Americans are becoming familiar with nước mắm 'fish sauce'. Nước is literally 'water' and mắm is 'salting'. I do not know of any Sino-Vietnamese reading like mắm. The only similar Middle Chinese syllable was 鋄/鎫 *muamˀ 'head ornament for a horse'. I cannot find a Sino-Vietnamese (SV) reading for that rare character; in theory it should have been *vãm or, if it was borrowed earlier, *muộm. 鎫 was used as a phonetic symbol for the native Vietnamese word mâm 'tray', so its SV reading must have contained the consonant sequence m-m. Variations of its right side were used as a phonetic in nom characters for mâm 'tray' and mắm 'salting': e.g., 𩻐 mắm (with 魚 'fish' instead of 金 'metal' on the left side). (nomfoundation.org also lists a similar character with the codepoint U+29DE0 which may be a typo for U+29ED0, the codepoint for 𩻐. U+29DE0 is for a different character 𩷠 from a source in Taiwan. I cannot find the other 𩻐-like nom character in Unicode.)

There are two other types of characters for mắm which aren't in Unicode yet, so I have to describe them in terms of their semantic and phonetic components:

variations of 鹵 'salt' + 禁 cấm 'to forbid' (the latter is also a phonetic loan for the native Vietnamese word bấm 'to press')

酉 'liquor' + variations of 稟 bẩm 'to receive from above' (more on 稟 here)

Why was mắm written with a b-phonetic? Were the latter two types of characters devised when mắm still had an initial implosive *ɓ-? (Many other Vietic languages still have b- in this word: e.g. Sơn La Muong bam³. Is their b- implosive?) And why was bấm 'to press' written with a c-phonetic 禁?

8.16.1:32: I suspect Proto-Vietic *ɓamʔ 'salting' (as reconstructed in the SEAlang Mon-Khmer Languages Project database) is a Vietic innovation. I have not found any potential true cognates in other languages in that database. Halang măm 'salt fish' and Mnong măm 'salted fish' are probably Vietnamese loans in those Bahnaric languages, and Bolyu mjaːm¹³ 'salt' may be a lookalike; its -j- matches nothing in Vietic.


14.8.14.23:56: BIT-TẢI-R ROOF

In my last entry, I couldn't explain why 宰 was read as tể instead of tải in Vietnamese. Today I checked various nom dictionaries and found the reading tải in Vũ Văn Kính's Bảng tra chữ nôm sau thế kỷ XVII (18, 19, 20) (Table for Finding Nom Characters after the 17th Century (18, 19, 20)). Unfortunately the book did not provide a context for tải, so I don't know if that syllable was a now-extinct Sino-Vietnamese reading or (part of) a native word. I also don't know if that reading predates the 18th century. My guess is that the taboo substitution occurred in the 18th century (hence the inclusion of the original reading tải in Vũ's book), and that most works only include the later altered reading tể and its spinoffs tẻ and tỉa. (It would be unusual for an -ai character to be used to write -e and -ia syllables, so I assume the latter two readings postdate tể.)

The Nom Foundation's Kiều index lists yet another reading in line 2873 of the 1870 version: tề. However, page 206 of its romanized text of that version has the usual reading tể.

I just realized that although characters could be used as nom phonetic symbols and components without regard for tone (e.g., 宰 tể for tề in Kiều?), all taboo deformations I have seen retained tones along with onsets*. Final consonants could be slightly changed: e.g., hoàng [hwaːŋ] became huỳnh [hwiɲ]. The hierarchies of 'loyalty' for nom phonetics and taboo deformation were slightly different:

nom phonetics: vowel quality > onsets, codas, tones**

taboo deformation: onsets, tones > codas > vowel quality

Nom phonetics were generally used for syllables with similar vowels: e.g., 宰 tể could not represent a syllable like tổ even though it had the same onset, tone, and zero coda. However, native Vietnamese words had more onsets and onset-coda sequences than Sino-Vietnamese, so there was more freedom to use phonetics to represent syllables with different onsets or codas: e.g.,

羅 la as a phonetic with semantic 出 xuất 'to go out' in 𠚢 ra 'to go out' (there are no r-syllables in Sino-Vietnamese)

n as a phonetic with semantic 口 khẩu 'mouth' in 𠵘 mồm 'mouth' (there is no syllable môm in Sino-Vietnamese)

I'll look at another example tomorrow. Note how tone is disregarded in the latter case. 羅 la can represent là 'to be' with a different tone.

*The spellings of initial onsets could change because of quốc ngữ spelling conventions: e.g., kiểu [kiəw] became cảo [kaːw].

**8.15.1:30: Vietnamese tones historically had 3 x 2 categories. Each tone name exemplifies its tone.

 voice quality *plain *creaky *breathy
*voiceless initial > *upper register ngang sắc hỏi
*voiced initial > *lower register huyền nặng ngã

The reconstructed category names no longer necessarily describe the modern tones: e.g., huyền is breathy, ngã is not breathy and is higher than hỏi, etc.

There seems to be a hierarchy of tonal 'loyalty' in nom:

1. Retention of original tone in phonetic symbol/component.

2. Use of phonetic for syllable with opposite-*register tone: e.g., 羅 la for là above.

3. Use of nonplain tone phonetic for syllable with any other *nonplain tone: e.g., 禮 lễ for lấy, lạy, and rẻ in Kiều. (Lễ 'ceremony' and lạy 'to bow' are in fact the same Chinese word borrowed into Vietnamese during two different periods.)

4. Use of phonetic for syllable with any tone: e.g., 永 vĩnh for the *plain tone syllable vành as well as *nonplain vắng and vạnh in Kiều. (Is there an example of a phonetic used for syllables with all six tones?)

The most 'loyal' phonetics have readings ending in stops. Stop-final syllables in Vietnamese can only have *creaky tones: e.g., 越 việt also represented the *creaky-tone syllables vượt, vết, and vớt, but could not represent *noncreaky tone syllables like *vVt, *vV̀t, *vV̉t, or *vṼt which were impossible in Vietnamese.
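
The 3 x 2 table and the four-step hierarchy above can be put into a small Python sketch. The function name `loyalty_level` and the dictionary layout are my own inventions, and I am assuming that 'opposite-*register' in step 2 means the same voice quality in the other register. The sketch takes tone names as input and does not try to read tones off quốc ngữ spellings.

```python
import unicodedata


def _nfc(text: str) -> str:
    # Normalize so precomposed and decomposed spellings of a tone name match.
    return unicodedata.normalize("NFC", text)


# Tone name -> (*register, *voice quality), following the table above.
TONES = {
    _nfc(name): category
    for name, category in {
        "ngang": ("upper", "plain"),
        "sắc": ("upper", "creaky"),
        "hỏi": ("upper", "breathy"),
        "huyền": ("lower", "plain"),
        "nặng": ("lower", "creaky"),
        "ngã": ("lower", "breathy"),
    }.items()
}


def loyalty_level(phonetic_tone: str, target_tone: str) -> int:
    """Return 1-4: the first step of the hierarchy that covers this pairing."""
    phonetic_tone, target_tone = _nfc(phonetic_tone), _nfc(target_tone)
    _, p_quality = TONES[phonetic_tone]
    _, t_quality = TONES[target_tone]
    if phonetic_tone == target_tone:
        return 1  # original tone retained
    if p_quality == t_quality:
        return 2  # same voice quality, so only the register differs
    if p_quality != "plain" and t_quality != "plain":
        return 3  # both nonplain, but different voice qualities
    return 4      # plain vs. nonplain: any tone goes


print(loyalty_level("ngang", "huyền"))  # 羅 la for là
```

Under this reading, 羅 la for là comes out as level 2, 禮 lễ for lấy as level 3, and 永 vĩnh for vành as level 4.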


14.8.13.23:59: BIT-TỂ-R ROOF

I regret not paying attention to Vietnamese until I started reading Bernhard Karlgren's books after my first semester of graduate school over twenty years ago. I remember flipping through a Vietnamese-English dictionary and being astounded by all the words I could recognize because they were Chinese borrowings. (Of course the native words were totally alien to me, as I had never studied an Austroasiatic language before, much less one that was closely related to Vietnamese like a variety of Muong. I didn't even know what Muong was!) I soon learned the sound correspondences between Sino-Vietnamese and what I was more familiar with (Mandarin, Cantonese, Sino-Japanese, and Sino-Korean). Since then I've committed many Sino-Vietnamese readings to memory and can guess still others using those correspondences. When I look at Vietnamese, I can usually 'see' the characters for Chinese loans. However, there are 'blind spots': i.e., exceptional readings.

One such reading that I can't explain is tể instead of the regular reading *tải for 宰 'minister'. (The title refers to the components of the character: 宀 miên 'roof' and 辛 tân 'bitter'. Why those add up to 'minister' is a topic for another time.) 宰 belongs to the Middle Chinese 海 *-əj (> later *-aj) rhyme category which usually corresponds to three rhymes in Sino-Vietnamese:

Old borrowings: -ơi [əːj]

Later borrowings with nonlabial initials: -ai [aj]

Later borrowings with labial initials: -ôi [oj]

tể is the only instance of [e] in this category. It is probably not an archaism from Old Chinese since Middle Chinese *-əj goes back to *-ə, not *-e. I do not know of any cases of *-ai becoming -ê in Vietnamese.
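
To make the irregularity concrete, here is a minimal Python sketch of the three correspondences listed above. The names `expected_rhyme` and `is_regular` and the layer labels 'old' and 'later' are hypothetical conveniences of mine; rhymes are given in toneless quốc ngữ.

```python
def expected_rhyme(layer: str, labial_initial: bool) -> str:
    """Expected Sino-Vietnamese rhyme for the Middle Chinese 海 *-əj category.

    layer: 'old' for old borrowings, 'later' for later borrowings.
    labial_initial: whether the syllable has a labial initial.
    """
    if layer == "old":
        return "ơi"
    return "ôi" if labial_initial else "ai"


def is_regular(rhyme: str, labial_initial: bool) -> bool:
    """True if the toneless rhyme matches either borrowing layer."""
    return rhyme in {
        expected_rhyme("old", labial_initial),
        expected_rhyme("later", labial_initial),
    }


# 宰 has a nonlabial initial t-.
print(is_regular("ai", labial_initial=False))  # the regular reading *tải
print(is_regular("ê", labial_initial=False))   # the actual reading tể
```

The first call prints True and the second False: no layer in the list predicts -ê for this category.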

宰 must have been read with a front vowel when it was used as a nom phonetic symbol to write the unrelated native Vietnamese words lẻ tẻ 'scattered' and tỉa 'to trim'.

Although there are modern Chinese languages in which this rhyme has become e-like, they are geographically distant from Vietnamese with the sole exception of one variety of Pinghua (Guilin Yanshan Zhuyuan, which has tse with an irregular tone). I doubt tể is a borrowing from Guilin, which is 400 miles from Hanoi.

Is tể the last survivor of a long-dead trend of monophthongization in earlier Vietnamese or the source dialect of Sino-Vietnamese? I doubt it.

8.14.1:42: Could the sui generis reading tể be the product of taboo deformation? But would 宰 have been used in a name? Tể is not in this long list of deformed readings that I just found. (The original readings are in the "Âm chính" 'main sound' columns; the altered readings are in the "Âm trại" 'mispronounced sound' columns.)

I tried looking for tể in de Rhodes' dictionary to see if tể existed in the 17th century. However, I couldn't find it or my theoretical regular *tải with any of the meanings of tể.


14.8.12.23:40:  THOUGHT-BEARING HAPPY PROGRESS?

Thai names contain many Indic elements, so they should be transparent to me. However, they often contain surprises. For instance, last night I encountered the name

จินตหรา สุขพัฒน์ <cinthrā sukhbaḍhn˟> [tɕintaraː sukʰapʰat] (?) 'Chintara Sukapatana'

which looks like it should be from an Indic *cinta-harā sukha-baḍhana-. However, only [sukʰa] < Sanskrit/Pali sukha- 'happy' is straightforward. The remaining three components puzzle me:

- I would expect the final long vowel of Sanskrit/Pali cintā 'thought' to remain intact in compounds; this same shortening is also in regular Indic loanwords in both Thai and Khmer (so perhaps the shortening is of Khmer origin)

- apparently <hrā> is pronounced as if it were a monosyllabic native Thai word [raː] rather than the expected [haraː] from Sanskrit/Pali harā 'bearing' (f.). There is a Thai word หรา <hrā> [raː], but I don't think it is part of this name because it is an adverb 'boldly', not an adjective which should follow [tɕinta] 'thought'.

- although I'm accustomed to Pali vaḍḍhana- 'increase' (< Sanskrit √vṛdh) becoming the regular Thai word พัฒนา <baḍhnā> [pʰattʰanaː] 'progress', I didn't expect it to be clipped to [pʰat]. (์ <˟> indicates silent characters.) Was *[sukʰapʰattʰan] too long? Is the feminine ending [aː] always absent from Thai surnames?

Chintara's birth name is

จิตติมาฆ์ <cittimāgh˟> [tɕittimaː]

which has mysteries of its own. [tɕitti] is from Sanskrit/Pali citti-, a variant of cintā 'thought', but what is [maː] from Sanskrit māgha- 'name of a constellation' (> 'third lunar month' in Thai and Khmer) doing, and why was its final consonant dropped? Compare มาฆ์ <māgh˟> [maː] with เมฆ <megh> [mek] < Sanskrit/Pali megha- 'cloud' whose final <gh> [k] is not silent.

Has the phenomenon of dropping perfectly pronounceable segments in Indic loans in Thai been studied? (Some dropping is required to make Indic loans fit the constraints of Thai phonology: e.g., จันทร์ <candr˟> 'moon' is [tɕan] because Thai does not permit final consonant clusters.)


14.8.11.23:59: C-RUTSUBO

Li (2008: 721)  listed the first syllable of

4538 5544 2ko' 1riuʳ 'crucible'

as a borrowing from the second syllable of Chinese 坩堝 *kã ko 'crucible'.

I am skeptical of a connection between the two words for the following reasons.

First, I do not know of any cases of the rhyme of 堝 *ko borrowed as Tangut rhyme 54 -o'. I use the symbol ' to indicate that rhyme 54 was similar to rhyme 51 -o yet different in some unknown way. I only know of a single case of rhyme 54 transcribing Chinese *-o:

5388 2bo' for Chinese 摩 *mbo (Gong 2002: 436)

Chinese *-o was normally transcribed with rhyme 51 -o (see Gong 2002: 456 for many examples).

I originally was going to write that I thought it was unlikely that 坩堝 *kã ko would be cut in half by the Tangut, but in fact 堝 *ko is attested as an independent word in the Song Dynasty. I suspect it is a specialized use of 鍋 *ko 'cooking pot' written with a radical 土 to match 坩 which is attested as an independent word in the Tang Dynasty.

So my third objection is now my second: Even if 4538 is a borrowing from Chinese, what is 5544? No homophone of 5544 is an adjective that would make sense as a modifier of 4538. Here are Li's (2008) glosses for the other tangraphs pronounced 1riuʳ:

0968 'all'

1403 'complain'

2147 'sweep'

2324 'sigh'

2542 'gadfly'

2543 'hate'

2812 'cherish'

3491 'bright star'

3493 'firefly'

3737 'frivolous'

4364 'wooden framework'

4437 'auspicious'

4713 'world'

5130 'subdue'

I think Tangut 2ko' 1riuʳ 'crucible' is an indivisible disyllabic word that is a coincidental soundalike of 堝 *ko.

As for the title, Nishida (1986: 43) translated Tangut 2ko' 1riuʳ 'crucible' as Japanese rutsubo. No native Japanese word can begin with r-, so the word must be of at least partly foreign origin. I think it might be a compound of Middle Chinese 爐 *lo (borrowed into Japanese as *ro which then became ru after raising) 'stove' and the native word tsubo 'pot'.


14.8.10.23:45: FROZEN WHITE WATER IN THE BLACKSMITH'S CRUCIBLE

Andrew West pointed out that 4053 1ʔwọ 'ice' from my recent entries occurs three times in The Ode on Monthly Pleasures: e.g., in the 'common language' line 3B of the section on the second month. The 'ritual language' line 3A is slightly longer:

2.3A 1 2 3 4 5 6 7
Tangraph

Li Fanwen number 0804 4051 3052 1659 5441 4538 5544
Reading 2diə 1kiʳw 1nioʳ' 2lew 2swi 2ko' 1riuʳ
Gloss PERF cold water white blacksmith crucible

The 'common language' line only has a single direct match:

2.3B 1 2 3 4 5 6 7
Tangraph

Li Fanwen number 1490 4053 1572 0185 1452 3956 -
Reading 1tsʊʳ 1ʔwọ 1phɤõ 2nwiə 1nia 1dʐi -
Gloss winter ice white spring PERF melt -

Nishida (1986: 43) translated those lines as topic-comment sequences:

A. 'The cold, white water - a crucible of materials'

B. 'Spring melts the white water of winter'

He proposed four parallel pairs:

A1-2 'cold' : B1 'winter'

A3-4 'white water' : B2-3 'white ice'

A5 'materials' : B4 'spring'

A6-7 'crucible' : B5-6 'melts' (my 'melted')

A1 0804 2diə is a perfective prefix originally indicating motion toward the speaker. Perhaps its combination with A2 4051 1kiʳw 'cold' could be translated as 'frozen' (i.e., finished becoming cold). A 'common language' perfective prefix is not what I would expect in the 'ritual language' if the latter was an unrelated substratum language.

As far as I know, A2 4051 is only in dictionaries and this ode. If it is a 'ritual language' word, it demonstrates that 'common language' affixes can be attached to 'ritual language' vocabulary.

A3 3052 1nioʳ' is a 'common language' word for 'water' that is not very common. It is the Tangut name of the Chinese trigram ☵ for water. If the 'ritual language' were a low-prestige substratum language, I would not expect its words to be used to refer to concepts from a high-status culture. I should look into the names of the seven other trigrams; none match the common words for their concepts.

Similarly, A4 1659 2lew is a 'common language' word for 'white' that is not very common.

I do not know why Nishida translated A5 5441 2swi as 'materials'. A note in Homophones text D equates 2swi with 'iron artisan' (i.e., 'blacksmith'). 5441 is also a verb 'to (s)melt' (Kychanov and Arakawa 2006: 542), so a blacksmith was a 'smelter'. 5441 can also mean 'mother-in-law' (Li 2008: 858), but I assume the character is used for two different unrelated words. Unfortunately, I do not see this term for 'mother-in-law' in Jacques 2012 which mentions the term

3986 4893 1niə 1vɨə

A6-7 4538 5544 2ko' 1riuʳ 'crucible' is an indivisible disyllabic word which is not in the 'common language' to the best of my knowledge. Neither half can stand alone. Li (2008: 721) regarded the first syllable as a loan from the second syllable of 坩堝, but this is problematic for reasons I'll go into in my next entry.

A5-7 'blacksmith's crucible' makes little sense as a gloss for 'spring melted'. It is an odd metaphor for spring, as a crucible is much hotter. Nishida's 'crucible of materials' is even more puzzling.

I would translate the B line as a topic-comment sequence:

'As for the white ice of winter, spring melted it.'

B5 1452 1nia is a perfective prefix originally indicating downward motion, so B5-6 1452 3956 1nia 1dʐi 'down-melt' (i.e., 'melted') is reminiscent of English melt down, though they are not equivalents: the former could be translated as 'melted down' but not 'melt(s) down' and the latter has an extended meaning '(emotionally) collapse'.


14.8.9.23:59: MINING THE ODE ON MONTHLY PLEASURES

Andrew West pointed out that 5952 'ore, mine' from my last entry does occur outside dictionaries: e.g., in the 'ritual language' line 4A of the section on the tenth month in The Ode on Monthly Pleasures:

10.4A 1 2 3 4 5 6 7
Tangraph

Li Fanwen number 2992 0026 5429 2431 5952 5072 1420
Reading 2bia 2ŋwʊ 1po 2khwa 2nɤa' 1ɣiəʳ 2tʂɨụ
Gloss the Ba clan territory the Po clan Chinese (? - see below) make wing

The corresponding 'common language' line does not seem to match it at first glance:

10.4B 1 2 3 4 5 6 7
Tangraph

Li Fanwen number 1518 3437 2344 5882 0795 3497 4999
Reading 1tʂɨẹ 2ʔõ 2mi 1ɮaʳ 2riəʳ 2lʊ̣ 1giẹ
Gloss the Che clan the On clan Tangut Chinese PERF obstruct scissors

Nishida (1986: 62) translated those lines as topic-comment sequences:

4A. 'As for the Po and Chinese of the Pa [= my Ba*] territory - wings made of iron'

4B. 'As for the Tangut and Chinese of the Che Hon [= my Che On**] - scissors that obstruct'

He proposed four parallel pairs:

4A1-2 'Pa [= Ba] territory' : 4B1-2 'Che Hon [= Che On]'

4A3 'Po' : 4B3 'Tangut'

4A4 'Chinese' : 4B4 'Chinese'

4A5-7 'wings made of iron' : 4B5-7 'scissors that obstruct'

Only the third pair is obvious.

I have suspected that the 'ritual language' is a non-Sino-Tibetan substratum language glossing the superstratum Sino-Tibetan 'common language'. For other interpretations of these 'languages', see Andrew's "The Myth of the Tangut Ritual Language".

But my hypothesis is problematic even for the third pair. If 4A4 2431 2khwa is a substratum word for 'Chinese' glossing the superstratum word 4B4 5882 1ɮaʳ 'Chinese', why do both words appear in the foreword to the Timely Pearl in the Palm, a text otherwise in the 'common language'? Is that a case of a substratum word borrowed by the superstratum language? Neither word has any connection to Chinese autonyms. (The Chinese autonym 漢 *xã 'Han' was borrowed as

5916 1xã

which like 2431 and 5882 is written with the unflattering components 'little' and 'insect'.)

The fourth pair only makes sense if 4A5-7 2nɤa' 1ɣiəʳ 2tʂɨụ 'wings made out of 5952' was the substratum term for 4B7 4999 1giẹ 'scissors'. 4999 can also be a verb 'cut' - is 'scissors' a derived noun 'cutter'? - but Kychanov and Arakawa 2006: 327 do not list any compound verb 3497 4999 'obstruct-cut', and a verb sequence 'obstructed and cut' is even harder to relate to the noun phrase 'wings made of 5952'.

Li (2008: 938) and Kychanov and Arakawa (2006: 296) agree that 5952 means 'ore'. (As Andrew pointed out, none of the textual examples in Li 2008 support 'mine'. Li's English gloss for 5952 seems to be a translation of his Chinese gloss 礦 which means both 'mine' and 'ore', but the Tangut word may have had a narrower meaning.) However, 'ore' might be odd in this context. Nishida translated 5952 as 'iron', even though 'iron' is a distinct word that combines with 5952 to form the phrase

5952 4995 2nɤa' 1ʂɨõ 'iron ore'

in Homophones 14B33. Could 5952 also refer to some other metal? English ore is "partly from Old English ār brass". Also cf. the ambiguity of Japanese kane 'metal, gold'.

4B5-6 0795 3497 2riəʳ 2lʊ̣ 'obstructed' has no substratum equivalent in line 4A. Was the superstratum verb understood even by speakers of the substratum language, or was it just omitted to make line 4A the same length as 4B?

Why would glosses have to match the lengths of the lines they glossed? Were the glosses meant to be poetry in their own right? And why place the glosses before the lines they gloss?

The '-ed' of my translation 'obstructed' corresponds to the perfective prefix 0795. Nishida (1966: 579) regarded 0795 as the Tangut equivalent of Classical Chinese 所, so I would have expected his translation to be 妨げるところ  'that which is obstructed' (cf. his translations of 0795 in 1966: 279), but that would make no sense before 'scissors'. His actual translation 妨げる 'obstructs' only corresponds to 3497.

4A6-7 5072 1420 1ɣiəʳ 2tʂɨụ 'wing made of ...' consists of words also in 'common language' texts. I don't know of any Sino-Tibetan cognates for those words. Could they be substratum loans in the superstratum language?

The first two pairs are the most baffling. When I see Tangut names, I feel as if I'm reading about characters in a TV show I've never seen. Who were the Ba, Po, Che, and On? I don't know, but I'm certain that Po is not a synonym of 2mi 'Tangut'. The names Che and On are together in Homophones 35B37, so they may have been a common collocation ('Che [and] On') or a disyllabic name ('Che'on'). Kychanov and Arakawa (2006: 315) favor the latter interpretation. Was the territory of the Ba also the land of Che and On (or Che'on)? If so, then the Po were a Tangut clan in that land, and 'the Po of the Ba territory' and 'the Tangut of Che (and?) On' refer to the same group of people. This hypothesis is impossible to explore further without learning more about these clan names.

4A2 0026 2ŋwʊ 'territory' in the 'ritual language' line is also in 'common language' texts. Is it a superstratum loan in the substratum, or vice versa? Like 'make' and 'wing' later in that line, it too has no known cognates.

*8.10.3:23: I follow Gong (1997) and Li Fanwen (1986: 218) who reconstructed the initial of 2992 as b-. Nishida (1986: 62) reconstructed it as p-; in 1966 he reconstructed it as m- (p. 396). Sofronov (1968 II: 307) reconstructed it as mb-. The character is in the labial chapter of Homophones, so there is no doubt that its reading had a labial initial of some kind. The only transcription for a member of its homophone group that I know of is *mba for

2bia 'belly'

in Timely Pearl in the Palm 19.1. I think the diacritic might indicate that the initial was b- (absent from Tangut period northwestern Chinese) rather than mb-. In any case, the Chinese transcription rules out Nishida's later reconstruction p-; it tells us that the initial was voiced (though whether it was nasal or prenasalized may be debated).

**8.10.4:04: I follow Gong (1997), Li Fanwen (1986: 425), and Sofronov (1968 II: ) who reconstructed the initial of 3437 as ʔ-. Nishida (1986: 62) reconstructed it as x-; in 1964 he reconstructed it as ɣ- (p. 134). (Nishida 1964 does not explicitly mention 3437, but it does list the reconstruction ɣõ for a rising tone rhyme 47 syllable without any homophones in chapter VIII of Homophones. 3437 is the only syllable that matches that description.) I do not know on what basis those scholars reconstructed the initial of 3437, as its character has no transcriptions or homophones. Given that there are three basic chapter VIII initials (x-, ɣ-, ʔ-) and that the following rising tone rhyme 47 syllables can be reconstructed in Homophones chapter VIII -

xõ, ɣõ

- 3437 is likely to be ʔõ by a process of elimination. Tangut did not seem to permit xw- and ɣw- before o in native words, so xwõ and ɣwõ are unlikely. However, ʔw- was possible before o: e.g., in

4053 1ʔwọ 'ice'

which led to my interest in

2040 2nɤa' 'ice' (with 'water' on the left instead of the mystery element ヒ on the right)

and its homophones like 5952 'ore'. So perhaps 3437 was ʔwõ. Maybe it would be safest to write its reconstruction as Xõ with X representing an unknown back consonant.
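The elimination argument in this note can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of any published reconstruction: the names `CHAPTER_VIII_INITIALS`, `ATTESTED`, `W_BEFORE_O`, and `candidates_for_rhyme` are hypothetical, and the rule that only ʔ- may take -w- before o is simply taken over from the text.

```python
# Hypothetical sketch of the process of elimination for 3437.
CHAPTER_VIII_INITIALS = ["x", "ɣ", "ʔ"]  # the three basic chapter VIII initials
RHYME = "õ"                              # rising tone rhyme 47, as written in this note
ATTESTED = ["x" + RHYME, "ɣ" + RHYME]    # syllables already reconstructed in chapter VIII
W_BEFORE_O = {"ʔ"}                       # assumption: only ʔw- occurs before o in native words


def candidates_for_rhyme(initials, attested, rhyme, w_allowed):
    """Return initial (+ medial -w-) + rhyme combinations that are still unassigned."""
    result = []
    for initial in initials:
        for medial in ("", "w"):
            if medial == "w" and initial not in w_allowed:
                continue  # xw- and ɣw- are ruled out before o
            syllable = initial + medial + rhyme
            if syllable not in attested:
                result.append(syllable)
    return result


remaining = candidates_for_rhyme(CHAPTER_VIII_INITIALS, ATTESTED, RHYME, W_BEFORE_O)
print(remaining)
```

Under these assumptions only two candidates survive, one with and one without medial -w-, which is why writing the initial as an unspecified X may be the safest choice.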


14.8.8.23:54: AN OCTET OF ICY HOMOPHONES

The last of the Tangut words for 'ice' that I have been writing about belongs to a set of eight characters in Homophones:

Homophones location Li 2008 number Tangraph Reading Gloss
14B24 2189 2nɤa' < *nraXH second half of 1937 2189 2kha 2nɤa' 'to stutter; sad' (both only in dictionaries)
14B25 2249 wrist (only in Homophones; synonym of the more common word 0682 1khwiə?; not sure if it can stand alone)
14B26 3556 to apply, smear
14B27 2726 colored glaze
14B28 2040 ice (only in dictionaries; not sure if it can stand alone)
14B31 2582 mud (only in Homophones; not sure if it can stand alone)
14B32 4765 yarn (only in Homophones and Miscellaneous Characters; not sure if it can stand alone)
14B33 5952 ore; mine (only in dictionaries)

Out of these eight characters,

- six may only be attested in dictionaries, judging from the absence of nondictionary examples in Li (2008)

- at least three are freestanding words; the others are only found in dictionaries next to other characters

I am still surprised there can be so many 2nɤa'. 2726 'colored glaze' may be an extended usage of 2040 'ice', and 3556 'smear' and 2582 'mud' may be the same root, judging from the analysis of the former which implies 'muddy' semantics:

3556 2nɤa' 'to apply, smear' =

'water' and 'earth' = left and center of 2005 1tʂɤoʳ 'mud' +

bottom left of 4737 2ma 'apply'

But the others do not appear to be related to each other. I doubt pre-Tangut had six homophonous roots. Were those roots nonhomophonous in pre-Tangut: i.e., is 2nɤa' a merger of *nrakH, *nratH, *nrapH, etc. (if *X was a consonant)?


14.8.7.23:56: DISTRIBUTIO-NA-L ODDITIES IN TANGUT

I was surprised to find that

2040 2nɤa' < *nraXH

had seven homophones in Homophones. If my pre-Tangut reconstruction is correct, I would expect simple syllables to be more common than complex syllables. *na was more common than other *na-type syllables, but it's surprising that there were no *naH, *nra, *nraH, *naX, or *nraX while there were multiple *naXH and *nraXH. Nonexistent syllables are in parentheses.

Rhyme Grade Tangut syllable Pre-Tangut Number of characters per syllable
17.1.17 I 1na *na 12
17.2.14 I (2na) (*naH) 0*
18.1.18 II (1nɤa) (*nra) 0
18.2.15 II (2nɤa) (*nraH) 0
(Grade III rhymes like 19.1.9/2.16 do not normally occur after dentals like n-. But see rhyme 21 below.)
20.1.20 IV 1nia *Cɯ-na 1
20.2.17 IV 2nia *Cɯ-naH 3
22.1.22 I (1na') (*naX) 0**
22.2.19 I 2na' *naXH 5
(23/1.X) II (1nɤa') (*nraX) 0
23.2.20 II 2nɤa' *nraXH 8
21.1.21 III 1nɨa' *Cə-naX? 2
21.2.18 III 2nɨa' *Cə-naXH? 3***
24.1.23 IV (1nia') (*Cɯ-naX) 0
24.2.21 IV (2nia') (*Cɯ-naXH) 0
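
The gaps in the table can be tallied mechanically. The sketch below is a hypothetical illustration: the counts are copied from the table, the question marks on the rhyme 21 sources are dropped, and the `complexity` measure (a plain count of segments in the pre-Tangut form) is an assumption made for this example rather than a claim about pre-Tangut phonotactics.

```python
# (Tangut syllable, pre-Tangut source, number of characters), copied from the table.
ROWS = [
    ("1na", "*na", 12),
    ("2na", "*naH", 0),
    ("1nɤa", "*nra", 0),
    ("2nɤa", "*nraH", 0),
    ("1nia", "*Cɯ-na", 1),
    ("2nia", "*Cɯ-naH", 3),
    ("1na'", "*naX", 0),
    ("2na'", "*naXH", 5),
    ("1nɤa'", "*nraX", 0),
    ("2nɤa'", "*nraXH", 8),
    ("1nɨa'", "*Cə-naX", 2),
    ("2nɨa'", "*Cə-naXH", 3),
    ("1nia'", "*Cɯ-naX", 0),
    ("2nia'", "*Cɯ-naXH", 0),
]


def complexity(source):
    """Crude, hypothetical measure: number of segments, ignoring '*' and hyphens."""
    return len(source.replace("*", "").replace("-", ""))


def simpler_gaps(source, rows=ROWS):
    """List unattested pre-Tangut forms that are simpler than the given form."""
    limit = complexity(source)
    return [src for _, src, count in rows if count == 0 and complexity(src) < limit]


print(simpler_gaps("*nraXH"))
```

With this measure, the unattested forms simpler than *nraXH are exactly the five missing syllables named before the table.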

Although it's not impossible for a language to have a complex syllable while lacking simpler, similar syllables (e.g., English has strength but not streng, trength, treng, rength, etc.), I wouldn't have predicted multiple instances of a complex syllable. Having eight 2nɤa' < *nraXH is like having eight unrelated strengths in English. I'll look at the eight 2nɤa' next time.

For now, I'll close by noting two peculiarities involving Grade III rhyme 21. First, it was placed before the Grade I and II rhymes in the Precious Rhymes of the Tangraphic Sea, disrupting the usual I-II-III-IV pattern. Second, normally Grade III rhymes do not combine with dental initials, yet there are five Grade III nɨa' (but no nɨa!). I am not happy with my pre-Tangut reconstructions for their sources; they are placeholders. See how *Cə-naXH became 2nɨa' here.

*5 according to Arakawa (1997: 30). Gong reconstructed those five as 2da.

**1 according to Arakawa (1997: 30). Gong reconstructed that syllable as 1daa (= my 1da').

***0 according to Arakawa (1997: 30), who used Nishida's reconstruction in which these three syllables had t- instead of n-.


14.8.6.22:39: PRE-TANGUT *NRAXH 'ICE'

The last of the three words for 'ice' in Li (2008) is

2040 2nɤa' < *nraXH

In the past I would have reconstructed it as 2nææ with a low front long vowel. It is not clear how its Grade II (i.e., -ɤ-medial < *-r- + low vowel) a-type rhyme differs from the more common Grade II a-type rhyme -ɤa. Since I no longer think Tangut had long vowels, I write the less common rhyme with a ' reminiscent of a prime symbol to represent its unknown distinctive feature(s). Arakawa also uses ' for this rhyme (-ya' in his system), but I do not know if he regards it as a phonetic symbol or as a notational device.

I used to think that '-vowels (my former long vowels) came from vowel-consonant sequences, as the Tangut autonym

3752 3296 2miə 2nɨa' < *mə-naXH

corresponds to Tibetan mi-nyag which must have been borrowed into Tibetan before the loss of a final obstruent *X (probably *k; more details on the development of this word here*).

If that was the case, then ideally all Tangut -V' should correspond to -VC in related languages that preserve obstruent codas. But that is not the case: e.g.,


5700 2ni' < *Ci-naXH 'nose' (not *2ni!)

corresponds to Japhug rGyalrong tɯ-ɕna and Tibetan sna 'id.' which lack final obstruent codas. Could *-X in such cases be a pre-Tangut suffix absent from other languages?

Conversely, there are cases in which non-Tangut obstruent codas correspond to zero instead of -': e.g.,

5700 1sia < *Cɯ-sa 'to kill' (not *1sia'!)

corresponds to Japhug rGyalrong kɤ-sat, Tibetan gsod-pa, and Old Chinese 殺 *ksat 'id.' Were some codas lost in pre-Tangut under certain conditions before they could condition -' in Tangut?

Both types of cases require explanation.

The problems with 2040 2nɤa' go beyond the mystery of its rhyme. I'm not even completely sure it means 'ice'. I'm surprised Kychanov and Arakawa (2006: 296) also glossed it as 'ice'; they often disagree with Li (2008). I have not seen any attestations of 2040 outside dictionaries. That is a lexicographical red flag; it means any glosses cannot be confirmed in context. Moreover, its Tangraphic Sea definition is presumably in the lost second volume. Here are the only two instances of 2040 known to me:

Homophones: 2040 4053 2nɤa' 1ʔwọ

Homophones text D note: 2040 - 3058 2765 2ɮiəʳ' 1nwie

Li (2008: 339) regarded 2040 4053 as a pair of synonymous nouns: 'ice ice', whereas Kychanov and Arakawa (2006: 296) translated it as a disyllabic verb 'turn to ice'. Given that 4053 can also mean 'frozen', another possibility is a noun-adjective phrase 'ice frozen'; each entry in Homophones has one or two clarifier characters, and the clarifier 'frozen' would distinguish 2nɤa' from its homophones (more on them later). Have Kychanov and/or Arakawa seen 2040 4053 as a verb in a text?

3058 is 'water' without a doubt, but 2765 only occurs in dictionaries. Its Tangraphic Sea entry says,

'[The character 2765 is from] the left of earth (2627) and all of blood (2734). 2765 is 0975. It is what blood uniting (2734 3591) is called.'

Unfortunately 0975 is only known from dictionaries, and the Tangraphic Sea defines it as ... 2765 and 'blood gathering (2734 0269)'.

Li (2008: 454) translated 2765 as a verb 'to swell, coagulate', but Kychanov and Arakawa (2006: 296) translated it as a noun 'coagulated blood'. I favor 'coagulate' as 3058 2765 would make no sense as a note for 'ice' if it meant 'water [and] coagulated blood'.

The verbs

3591 2ni' 'to unite' and 0269 1khiə' 'to gather'

may only have those meanings in dictionaries; their characters are attested with different meanings in nondictionary texts:

3591: 'to attack; a shield; to cover; to die'

0269: 'second half of 4059 0269 investigate; hide; rigid'

I presume these sets are unrelated homonyms apart from the noun 'shield' and the verb 'to cover'.

I could try to force three of the above words into a single word family:

2040 2nɤa' < *nra-X-H 'ice'

2765 1nwie < *Pe-nra 'to coagulate'

3591 2ni' < *Ci-nra-X-H 'to unite'

However, I would then need to account for the functions of the various affixes. Moreover, it is not possible to determine for sure whether Grade IV words like 2765 and 3591 originally had *-r-. If I did not link them to 2040, I would have reconstructed them in pre-Tangut without *-r- or front-vowel prefixes to condition *a-raising: *Pɯ-ne and *niXH.
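
To make the affix problem concrete, here is a minimal sketch that splits the three hyphenated reconstructions into morphs and lists what is left over once the shared root is removed. The helper names `morphs` and `affixes` are hypothetical, and the code only manipulates the strings as written; it says nothing about what the affixes meant.

```python
# The three forced reconstructions from the word family above.
FAMILY = {
    "2040 'ice'": "*nra-X-H",
    "2765 'to coagulate'": "*Pe-nra",
    "3591 'to unite'": "*Ci-nra-X-H",
}


def morphs(form):
    """Split a hyphenated reconstruction into its morphs, dropping the asterisk."""
    return form.lstrip("*").split("-")


def affixes(form, root):
    """Return (prefixes, suffixes) around the first occurrence of the root."""
    parts = morphs(form)
    i = parts.index(root)
    return parts[:i], parts[i + 1:]


# The root is whatever morph all three forms share.
shared = set.intersection(*(set(morphs(f)) for f in FAMILY.values()))
root = shared.pop()

for gloss, form in FAMILY.items():
    print(gloss, affixes(form, root))
```

Each line of output is one more affix combination whose function would have to be explained if the three words really belonged together.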

*3752 3296 2miə 2nɨa' has a strange second syllable. Normally n- does not combine with Grade III (-ɨ-medial) rhymes. Perhaps the word developed like this:

Pre-Tangut *mə-naXH

Breaking of *a before nonlow vowel *ə: *mə-nɨaXH

Breaking of *ə: *mɨə-nɨaXH

Tonogenesis: *mɨə-2nɨaX

Tone spread: *2mɨə-2nɨaX

The timing of tonogenesis relative to the vocalic changes is unknown.
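
The stages above can be modeled as an ordered list of rewrite rules. This is a minimal sketch, assuming that each stage is a simple string operation on a two-syllable word; the function names are hypothetical, and the rules are written only for this one word, not as general sound laws.

```python
def break_a(syls):
    """*a > *ɨa in the second syllable when the first syllable has nonlow *ə."""
    first, second = syls
    if "ə" in first:
        second = second.replace("a", "ɨa", 1)
    return [first, second]


def break_schwa(syls):
    """*ə > *ɨə in the first syllable."""
    first, second = syls
    return [first.replace("ə", "ɨə", 1), second]


def tonogenesis(syls):
    """Final *-H is lost and leaves tone 2 on the second syllable."""
    first, second = syls
    if second.endswith("H"):
        second = "2" + second[:-1]
    return [first, second]


def tone_spread(syls):
    """The tone of the second syllable spreads to a toneless first syllable."""
    first, second = syls
    if second[0].isdigit() and not first[0].isdigit():
        first = second[0] + first
    return [first, second]


def derive(syls, rules):
    """Apply the rules in order and return the final form as a hyphenated string."""
    for rule in rules:
        syls = rule(syls)
    return "-".join(syls)


start = ["mə", "naXH"]
listed_order = derive(start, [break_a, break_schwa, tonogenesis, tone_spread])
early_tone = derive(start, [tonogenesis, break_a, break_schwa, tone_spread])
print(listed_order, early_tone)
```

In this toy model, moving tonogenesis ahead of the two breakings gives the same output, which fits the point that this word alone cannot fix the relative timing.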

The first syllable may be an unstressed form of pre-Tangut *mi 'Tangut' which became

2344 1mi (cf. Tibetan mi 'person')

The second syllable may be cognate to


0176 1nɨa' < *Cə-naX (cf. Tibetan nag-po 'black')

which also has an anomalous n-Grade III rhyme combination. Was *mə-naXH originally *mi-naX-H 'black people'? 'Black' brings to mind the term

2750 0176 1ɣɤu 1nɨa'

'black-headed'

which is a term for one subgroup of the Tangut people.


14.8.5.22:10: PRE-TANGUT *TɅ-KU-H 'ICE'

The second of the three words for 'ice' in Li (2008) is

3177 2kʊʳ (also 'frozen')

which may be cognate to

4034 1kiụ < *S-ku 'cold' (adj.?; see below)

if it is from *Tʌ-ku-H with a root *ku instead of *Cʌ-kur-H with a root *kur.

The prefix *Tʌ- conditioned the lowering and the retroflexion of the root vowel:

*Tʌ-ku > *Tʌ-kʊ > *T-kʊ > *r-kʊ > *r-kʊʳ > kʊʳ (ignoring *-H; see below)

The suffix *-H conditioned the second ('rising') tone.

The semantic difference, if any, between 3177 2kʊʳ 'ice' and

4053 1ʔwọ 'ice'

is unknown (apart from the fact that 3177 can also mean 'frozen').

Unfortunately I do not know of any other pairs of the type

*Tʌ-√-H (noun) : *S-√ (adjective)

*S- is normally a verbalizing prefix but not in the adjective 4034 *S-ku 'cold' or in the pre-Tangut source of the noun 4053 1ʔwọ 'ice' which could have been *S-ʔʌ-pam (as reconstructed last week) or *S-P-ʔo (as reconstructed last night).

Perhaps I should not call 4034 *S-ku an adjective, as Li (2008: 652) does not list any instances of it as an independent word (or in any text outside a dictionary). It may occur only in the disyllabic words

4034 4051 1kiụ 1kiʳw < *S-ku T-kuk 'cold' (?)

4034 4077 1kiụ 1miẹ < *S-ku Sɯ-me 'cold' (?)

which might have originated as synonym compounds.

I am not certain about these glosses. Li (2008) does not list any attestations of 4034 4051 and 4034 4077 outside dictionaries. Both 4051 and 4077 appear as independent words for 'cold', but the latter is only in the Tangraphic Sea. I have no doubt that the meanings of 4034 4051 and 4034 4077 have something to do with cold, but without textual examples, I cannot be sure of their parts of speech.

Unlike Li (2008: 652) who regarded 4034 as an adjective 'cold', Kychanov and Arakawa (2006: 489) defined 4034 as nouns 'frost' and 'cold' (i.e., 'coldness'?). Perhaps they have seen it in contexts where those glosses are appropriate.

The Tangraphic Sea equates 4034 with

1. 4034 4051 (see above)

2. 4089 0143, lit. 'cold (?) cold (adj.)' (compound attested only in dictionaries; only second half confirmed by nondictionary textual examples)

3. 2720 'cold' (adj.; confirmed in nondictionary textual example)

4. 4077 (see above)

5. 1918 0115 'not hot' (adj.; confirmed in nondictionary textual example)

On the other hand, the meanings of 3177 can be confirmed in the nondictionary textual examples in Li (2008: 518).


14.8.4.23:44: AN ICY *P-REFIX?

While writing about a possible *p-prefix in the Chinese word for 'ice' in my last entry, I forgot to mention that *P- could also be a source of the -w- in Tangut 

4053 1ʔwọ < *S-P-ʔo? 'ice'

I reconstructed *P- to account for pairs of semantically and phonetically similar words such as

1829 1tsha < *Kɯ-tsa 'hot' : 1825 1tshwia < *P-Kɯ-tsa 'to heat'

in which one member has -w- and the other does not.

Ideally I would like to pair 1ʔwọ 'ice' with a word like ʔo 'ice, freeze, cold, etc.' But none of the ten words glossed as 'cold' in Li (2008) sound anything like 1ʔwọ 'ice' or ʔo. Nor do words with similar meanings like 'frigid' or 'snow'. I am hesitant to reconstruct *P- if I cannot find a -w-less potential relative, though there is no guarantee that a language must have a bare form alongside each affixed form.

Moreover, Gong (2002: 46) found that most zero ~ -w-pairs "clearly show a morphological process of forming verbs from adjectives or nouns". Hence *P-, the source of -w-, was often a verbalizing prefix. Obviously 1ʔwọ 'ice' was not a verb. Gong found only one zero ~ -w-pair whose -w-member was a noun:

3354 1ɣɤi < *Cʌ-Kri 'power' : 5307 1ɣwɤi < *Pʌ-Kri 'power'

I would add

3596 1ɣwɤi < *Pʌ-Kri 'power' (homophonous with 5307!)

to this set.
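
A zero ~ -w- set like this one can be found with a very naive check: strip the first w from a reading and see whether the result matches a w-less reading. The sketch below is hypothetical (the names `READINGS`, `without_w`, and `zero_w_pairs` are mine) and works on the transcriptions as plain strings.

```python
# Li 2008 numbers and readings of the three 'power' words above.
READINGS = {
    "3354": "1ɣɤi",
    "5307": "1ɣwɤi",
    "3596": "1ɣwɤi",
}


def without_w(reading):
    """Remove the first w from a reading."""
    return reading.replace("w", "", 1)


def zero_w_pairs(readings):
    """Pair each w-less reading with every reading that differs only by -w-."""
    pairs = []
    for plain_id, plain in readings.items():
        if "w" in plain:
            continue
        for w_id, with_w in readings.items():
            if "w" in with_w and without_w(with_w) == plain:
                pairs.append((plain_id, w_id))
    return pairs


pairs = zero_w_pairs(READINGS)
print(pairs)
```

This check is deliberately strict. It would miss the 'hot' : 'to heat' pair above, since 1tshwia differs from 1tsha in more than the -w-.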

By analogy with this pair, a hypothetical ʔo that was the root of 1ʔwọ 'ice' would also mean 'ice'.

I wonder if the 'power' set actually consists of two reflexes of *Pʌ-Kri rather than *Kri with two different prefixes. I reconstruct prefixes with low vowels to account for ɣ- which is (sometimes? often? always?) from a lenited velar obstruent and -ɤ-, the reflex of *-r- after a low presyllabic vowel (see the second table in "G-*r-adation in Tangut (Part 2)"). Incidentally there is a tangraph

5309 1ʔo

that Li (2008) glossed as ... 'power'! Alas, not the 'ice' I was hoping for.

If the -w- in 1ʔwọ 'ice' is not a lenition of a prefix *P- or a medial (root-initial?) *-P-, then I wonder if Cw-clusters such as ʔw- come from original clusters or unit phonemes such as the Old Chinese *ʔʷ- reconstructed by Baxter and Sagart (2011).


14.8.3.23:48: S-PRƏNG FROM SOME COMMON SOURCE? (PART 2)

In my last entry, I forgot to address the fact that Old Chinese 冰 *prəŋ 'ice' had an *-r- absent from pam-words for 'ice' or 'snow' in non-Chinese Sino-Tibetan languages. I could try to explain away the *-r- as an infix or as a prefix that metathesized: *T-p- > *pr-. (Medial *-r- is so common in Old Chinese that I suspect it came from a variety of sources - *t- and *l- as well as *r- - that I symbolize as *T-.) The dissimilation of *-m to *-ŋ after *p- could have occurred before *T- moved into medial position as an *-r- that would have blocked the shift. However, this scenario would still require a pre-Shijing dissimilation, long before the dissimilation is evident in poetry.

Here's a very different scenario. In Guangyun, 冰 'ice' has two Middle Chinese readings, *pɨŋ and *ŋɨŋ. I know of no other case in which a character has both *p- and *ŋ-readings. The *ŋ-reading is homophonous with 凝 *ŋɨŋ 'to freeze'.

Was 冰 used to write two unrelated words for 'ice' which happened to have identical rhymes?

Or were the two words related? Zhengzhang reconstructed them as *pŋrɯŋ and *ŋrɯŋ. (His *ɯ is equivalent to *ə in other reconstructions.)

This internal etymology has no phonological problems if one accepts the simplification of *pŋ- to *p-, but it does raise the question of what *p- was. In this case it could be a nominalizer or even a participial prefix (a fossil of an earlier system of conjugation?): 'ice' < 'frozen'. Are there other pairs of the type

*X 'verb' : *p-X 'nominalized verb' / 'verb-ed'?

Next: More Tangut words for 'ice'.


Tangut fonts by Mojikyo.org
Tangut radical and Khitan fonts by Andrew West
Jurchen font by Jason Glavy
All other content copyright © 2002-2014 Amritavision