You may have noticed in my last entry that I started quoting Tangraphic Sea four-character character analyses in full instead of converting them into A + B (+ C) formulae which are easier for me to type.  Perhaps paying attention to the exact wording of the analyses might help me formulate hypotheses about the script (which remains enigmatic to me after two decades).

David Boxenhorn saw the analysis of 'six' (which I will rewrite here in full) in "Tired Chapters of Accuracy"


3200 1chhiw4 = 3849 3130 1012 2705 1zhiw3 2mer4 1zeq4 2beq4

'(first syllable of 'sixth (month)') + palace + how-much right'

and my assessment of it as "implausible". He wrote that the implausibility

supports the theory that the analyses are mnemonic.

You know what else supports the mnemonic theory (I just thought of it!)? The analyses are all four characters. I can easily imagine a group of students reciting them. Even singing them.

And isn't there a Chinese tradition of four-character sayings?

[... I]s there any continuity between adjacent analyses, either semantic or phonetic? Something to suggest that they are not meant to be read in isolation?

I just realized that The Golden Guide consists of 5-character lines. Characters plus their 4-character analyses are also 5 characters long. They could both be recited in the same way, e.g. with the same tune!

You know, in English the most common meter is iambic pentameter, which would be great for every two lines of these works.

I should look at analyses in adjacent Tangraphic Sea entries in the near future.

I had been thinking that if the Golden Guide had a tune, it couldn't apply to the Tangraphic Sea analyses, but I had overlooked the possibility of including the analyzed character's reading in the tune.

As for meter, I have never seen a comparative study of the structures of poetry in (South)east Asian 'monosyllabic' languages. Does such a study exist? Or even, say, a study of poetic structures in what Guillaume Jacques might call the Macro-rGyalrongic world? Is there a Qiangic language today with poetry characterized by five-syllable lines? Is there a tonal pattern within and/or between lines of the Golden Guide and/or the Tangraphic Sea analyses?

In any case, I don't think the Tangraphic Sea analyses are truly etymological. In other words, I don't think they necessarily reflect the reasoning of the creator(s) of the script. Who would devise graphs for 'sixth (month)', 'palace', and 'how much', and then fuse them into 'six'?

'EIGHT' FROM 'SEVEN', 'SIXTH' FROM 'FIVE'?

I have been haunted by the Tangraphic Sea analysis of 4602 1ar4 'eight' for twenty years:


4602 1ar4 'eight' = 4778 1139 2750 1868 1shaq4 1e4 1ghu2 1teq4

'seven GEN head remove' = 'remove the top of seven'

If 'seven' and 'eight' are related characters, I would expect the more complex character ('seven') to be built up from a less complex character ('eight') rather than the other way around. But that scenario also has issues. Why would the character for 'eight' be designed before the character for 'seven'? And what does the top half of 'seven' do? I used to think it might be a phonetic symbol for sh-, so 'seven' was written as sh-yar = 1shaq4. However, that configuration of strokes (Boxenhorn code: biozoxzox) appears nowhere else in the Tangut script. Why did 'seven' merit a unique 'head'?

The Tangraphic Sea analysis of 'seven' does not answer that question:


4778 1shaq4 'seven' = 4751 3916 4602 1602 1se1 2si4 1yar4 2ngorn1 

'clean (nominalizer) eight all' = '(the top of?) cleanliness all of eight'

If 'eight' is from 'seven' - at least in the Tangraphic Sea - it is not entirely surprising that the second syllable of

3849 5081 1zhiw3-1vi1 'sixth (month)'

is said to be from 'five':


5081 1vi1 = 5286 3936 1999 2705 1vi1 1pha1 1ngwy1 2beq4

'(second syllable of 1ten4-1vi1 'clever') left + five right'

5286 is phonetic. 1999 is not optimally semantic, though. Why not simply extract part of

3200 1chhiw4

the character for the regular word for 'six'?

8.27.22:39: Cf. how the character for 'fourth (son)' is derived from the regular character for 'four' (though the words are unrelated!):


4934 1ngwyr 'fourth (son)' = 4971 2750 2205 1602 1shwi3 1ghu2 1lyr'4 2ngorn1 

'age head (i.e., top) + four all'

The only other son-counting character with a similar structure is


1257 1ar4 'eighth (son)' = 0384 3936 4602 1602 1leq4 1pha1 1ar4 2ngorn1 

'son left + eight all'

which is simply a different spelling of 4602 1ar4, the regular word for 'eight'. Why did 'fourth (son)' incorporate 'four' unlike other characters for counting words unrelated to regular numerals?

DID COMMON AND 'RITUAL' TANGUT SHARE A ROOT FOR 'SIX'?

So far the last word on the subject of the so-called 'ritual language' of Tangut is Andrew West's 2011 article. I have yet to write a full response after over five years. My view in short is a blend of Nishida Tatsuo's and Andrew's; the 'ritual' language is a subset of substratal vocabulary used in glosses. I think these words were borrowed from a language of unknown affiliation - possibly an isolate spoken in Tangut territory. 

But if I am right, I would not expect the first syllable of 'sixth (month)'

3849 5081 1zhiw3-1vi1

to sound like the regular word for 'six':

3200 1chhiw4*K-truk.

Last night it occurred to me that 3849 might be 3200 with a root initial that lenited after the vowel of a lost presyllable:

*KV-truk > *KV-ch- > *KV-zh- > zh- (the relative timing of *-uk > -iw is unknown)

cf. 3200 in which *KV- conditioned aspiration: *KV-truk > *K-truk > 1chhiw4

But if that were the case, then what is 5081? It is not attested by itself except as a dictionary entry, and it does not occur as a suffix after other numerals.

I used to think that 3849 5081 1zhiw3-1vi1 was an unanalyzable disyllabic word, but 1vi1 is homophonous with

3649 1vi1 'sixth' (son).

Is 1zhiw3-1vi1 a redundant compound combining a variant of the basic Tangut word for 'six' with a substratal word? Or is 'sixth' before 'son' an abbreviation of a disyllabic substratal word 1zhiw3-1vi1?

8.26.14:31: Perhaps the resemblance of substratal 1zhiw3-1vi1 to 1chhiw4 is no more meaningful than that between Malay dua < *duSa 'two' and Sanskrit dva- ~ dvi- 'id.' (borrowed into Malay as dwi-). The zh- of 1zhiw3 does not have to be from a lenited consonant; it could be a direct borrowing of a voiced fricative from a substratal language. Similarly, the -iw of 1zhiw3 need not be from *-uk.

TIRED CHAPTERS OF ACCURACY

In my last post I mentioned the unusual -aq1 ~-iq1 alternation in a disyllabic Tangut verb 'not know':


5077 1817 1my4-1daq1 ~ 5077 5283 1my4-1diq1. 

The first syllable does not seem to occur by itself. Was it originally the same morpheme as

5643 1my4 'not'

which negates auxiliary verbs? Was there originally a d-verb for 'know' that was an auxiliary in a V + 1my4 + d-verb 'not know how to V' construction?

The second syllables occur as glosses for


3020 1ja'3 'accuracy' = left of 3543 1dzwy1 'chapter' + left of 1817 1daq1 'know'

in Mixed Categories. That implies they might also be standalone verbs, though I have not yet seen them used by themselves in texts.

3020 shares its right side 'speech' (derived from Chinese 言 'id.'?) with 1817. (The bottom half of 'speech' has different shapes depending on whether it is on the left or right. Generally the final stroke of a Tangut character cannot point in a northeast direction.)

Its left side

Boxenhorn code: jil

is only in three other characters. It is semantic in 3543 (see above) with 'hand' on the right (why?), semantic and phonetic in

3021 1ja'3 'chapter' (cf. 3543 'chapter' above)

and phonetic in 3523, half of the mirror-image disyllabic words


3523 1688 2ja'3-1gu1 ~ 1gu1-2ja'3 'tired' (cf. 3020 and 3021, both 1ja'3)

I have not seen 3021, 3523, or 1688 outside dictionaries.

(8.25.23:20: Li Fanwen 2008 and Kychanov and Arakawa have glosses for each half of those words, but I don't know how they determined which side meant what:

3523:  L 'skinny, wan and sallow'; K&A 'toil, exhausting labor'

1688: L 'toil', K&A 'toil'

K&A defined 2ja'3-1gu1 as 'toil'; they have no entry for 1gu1-2ja'3.)

I first encountered the right side of 3021 (Boxenhorn code: dim) in 3200 1chhiw4 'six' whose Tangraphic Sea analysis is implausible:


3200 1chhiw4 = 3849 1zhiw3 + 3130 2mer4 'palace' + 1012 1zeq4 'how much'

I doubt that the character for the regular word for 'six' was derived from 3849,  the character for the first half of 'sixth (month)' in the so-called 'ritual language':

3849 5081 1zhiw3 1vi1

And why write 'six' with part of 'palace'?

'How much' is not the worst source of a numeral character component, but its right half doesn't appear in any other numeral characters.

The right side of 3523 and 1688 is semantic:

4675 2rer4 'toil, hard work'.

Work makes one tired.

8.25.20:20: The vague similarity of 4675 to the right side of Chinese 作 may be coincidental.

The left side of 1688 from

0678 1gu1 'arise, build'

is phonetic and vaguely similar to the left of Tangut period northwestern Chinese 孤 *1ku1.

WHERE DID *-U GO IN TANGUT?

My last series of posts was about the Tangut verb 'eat' which has two stems:


4517 1dzi3 < *Nɯ-dza and 4547 1dzo4- < *Nɯ-dza-u

The latter was derived from the former via the addition of a third person object suffix *-u. Here is my version of Guillaume Jacques' (2014: 232) table of pre-Tangut *-Ø ~ *-u verb stem alternations and their Tangut reflexes. I have added the first two rows and reorganized the others.

Alternation type   Pre-Tangut            Tangut
u                  *Cʌ...-ə              -u1
                   *-ə-u                 -u3
i                  *Cɯ-...-ej            -e3/4
                   *Cɯ-...-ej-u          -i3/4
y                  *Cɯ-...-aC (Cŋ)       -a3/4
                   *Cɯ-...-aC-u          -y3/4
o2                 *(Cʌ)-...-ra          -i2
                   *(Cʌ)-...-ra-u        -o2
o3/4-i             *Cɯ-...-a             -i3/4
                   *Cɯ-...-a-u           -o3/4
o3/4-u             *Cɯ-...-o             -u3/4
                   *Cɯ-...-o-u           -o3/4
o3/4-e             *Cɯ-...-aŋ            -e3/4
                   *Cɯ-...-aŋ-u          -o3/4
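
One way to make the mergers in the table easier to see is to store it as data and query it by Tangut reflex. The following is only a rough sketch: the dictionary layout, the key names `bare` and `with_u`, and the function `types_with_reflex` are my own illustrative assumptions, and the entries are simply transcribed from the table above (with the parenthetical condition on *-aC left out).

```python
# Each alternation type maps to (pre-Tangut, Tangut) pairs for the bare stem
# and for the stem with the suffix *-u, transcribed from the table above.
# The layout and key names are illustrative assumptions, not an established format.
ALTERNATIONS = {
    "u":      {"bare": ("*Cʌ...-ə", "-u1"),       "with_u": ("*-ə-u", "-u3")},
    "i":      {"bare": ("*Cɯ-...-ej", "-e3/4"),   "with_u": ("*Cɯ-...-ej-u", "-i3/4")},
    "y":      {"bare": ("*Cɯ-...-aC", "-a3/4"),   "with_u": ("*Cɯ-...-aC-u", "-y3/4")},
    "o2":     {"bare": ("*(Cʌ)-...-ra", "-i2"),   "with_u": ("*(Cʌ)-...-ra-u", "-o2")},
    "o3/4-i": {"bare": ("*Cɯ-...-a", "-i3/4"),    "with_u": ("*Cɯ-...-a-u", "-o3/4")},
    "o3/4-u": {"bare": ("*Cɯ-...-o", "-u3/4"),    "with_u": ("*Cɯ-...-o-u", "-o3/4")},
    "o3/4-e": {"bare": ("*Cɯ-...-aŋ", "-e3/4"),   "with_u": ("*Cɯ-...-aŋ-u", "-o3/4")},
}


def types_with_reflex(rhyme):
    """Return the alternation types whose *-u stem surfaces with the given Tangut rhyme."""
    return sorted(
        name for name, rows in ALTERNATIONS.items() if rows["with_u"][1] == rhyme
    )


# Three different pre-Tangut rhymes share one Tangut *-u stem reflex.
print(types_with_reflex("-o3/4"))
```

Querying `-o3/4` returns the three o3/4 types, which shows at a glance that *-a-u, *-o-u, and *-aŋ-u all merged in the suffixed stem while their bare stems stayed distinct.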

Two oddities that jump out at me:

1. Why are there so few alternating verbs with Grade I rhymes? I only know of two such verbs, and the second cannot be reconstructed with *-u:


1338 1dzu1 ~ 4973 1dzu3 'love'


5077 1817 1my4-1daq1 ~ 5077 5283 1my4-1diq1 'not know' (but the first syllable is Grade IV)

2. Why are there no pre-Tangut verbs ending in *-i ~ *-i-u?

I presume pre-Tangut *-u ~ *-u-u stems merged into Tangut -u. Did *-u vanish after higher series vowels (*u, *i, *ə)? Or is the set above an incomplete remnant of what was once a much larger system involving more (or even all) verb stems that was increasingly regularized (i.e., lost its alternations) over time?

8.24.23:13: The table above does not include non-*u alternations such as the one in 'not know'.

I wonder if the i-type and y-type alternations involve ablaut rather than *-u. I can imagine how *-ej-u fused into -i

*-eju > *-iju > *-ju > *-y > -i

though I'm surprised it didn't merge with -ew.

However, I have more difficulty with deriving -y (phonetically some sort of nonlabial central vowel or diphthong) from *-aC-u. *-C- must have lenited to nothing, and *a and *u might have fused into a vowel that was nonlabial and central like *a but nonlow like *u.

THE PAST AND PRESENT SOUND OF EATING IN TANGUT (PART 5)

In parts 3 and 4, I reconstructed the ancestors of the stems of the Tangut verb 'eat'

4517 1dzi3 and 4547 1dzo4-

with a presyllable *Nɯ- whose high vowel conditioned Grades III and IV.

But looking at the word's cognates in the rGyalrongic Languages Database (e.g., Brag bar ka-nə'-dza) made me wonder if the word had two presyllables - and if the high vowel conditioning Grades III and IV belonged to a presyllable preceding a nasal. Unfortunately, almost none of the presyllables preceding (n(V)-)dzV in rGyalrongic languages have high vowels:


a'- ka'-

kə- kwə- tə- nə-




The exceptions are Da tshang towu'nza and Japhug (variety A) tu'-nza; Guillaume Jacques' (2016: 143) Japhug simply lists ndza (the form given for variety B in the database) with the directional prefixes tɤ- and thɯ-. (How many of the presyllables above are directional prefixes?)

I doubt that a single presyllable before *NV- can be reconstructed at the Proto-rGyalrongic level; different varieties apparently added different prefixes to *NV-dza. In some cases, there were two prefixes:

*ka-tV- > Tsho bdun (variety A) kat˺'- (B has kə-)

*kV-pV- or *o-kə- > Brag steng 'Khyung-ri kwə-

*to-pV- > Da tshang towu'-

My hypothetical *pV- is the source of xxx ᴾ-. I don't know how ᴾ- differs from p-. I also don't know what the function of the apostrophe is. Is it a breve? I assume it is not a glottal stop which appears as ʔ in the site's transcription.

Brag bar ka-nə'-dza preserves the syllabicity of the nasal element of 'eat'. Without knowing anything about Brag bar phonology, I do not know whether *nə' can be derived from *Nɯ- with a high vowel.

bZhi lung kaˈ-ᵐtsok˺ points to *m as the specific value of *N. I am guessing that the small ᵐ indicates that ᵐts- is a unit phoneme, and that the presyllable is ka'- rather than *ka'm-.

8.23.23:08: Do I have to posit a high-vowel presyllable to account for the grades of the Tangut forms? Guillaume Jacques derives his Grade III (= my Grades III and IV) from *-j-, and such a medial is in some forms of 'eat' in the rGyalrongic database: e.g., dPa' dbang ndzja (identical to Guillaume's pre-Tangut *ndzja!) and Kha ra kyo ka'-zje. Is this attested -j- primary or secondary? In other words, did languages like Japhug lose *-j- (*ndzja > ndza), or did *a become ja (with or without further fronting and raising: ja > je > i) in some languages? I favor the latter scenario, but I am not absolutely certain, as I do think *-j- did exist in the ancestor of these languages. I just don't know if it existed in this particular word at an early stage.

THE PAST AND PRESENT SOUND OF EATING IN TANGUT (PART 4)

As I revised what was supposed to be the last part of my trilogy on

4517 1dzi3 < *Nɯ-dza 'eat'

I realized I had overlooked something obvious. 4517 belongs to a small class of verbs with two stems. The other stem is

4547 1dzo4-

with a Grade IV rhyme. Why don't the rhymes of both stems have the same grade? Did 4517 have some additional affix that conditioned Grade III instead of the usual Grade IV after dz-? The hyphen of 1dzo4- indicates that it never appears by itself; it is combined with person suffixes in 'I eat him/her/it/them' (1SG > 3), 'you (sg.) eat yourself/him/her/it/them' (2SG > 2SG, 2SG > 3), and perhaps 'I eat myself' (1SG > 1SG; forms for unattested subject/object combinations are in parentheses).

subject\object   1SG   2SG   1/2PL   3

[Table of forms by subject and object; the only surviving cells are the unattested forms (*?) and (*1dzi3-2ni4?), each given in three rows.]


The o-stem is derived from the i-stem plus a third person object suffix *-u prior to 'brightening' (raising and fronting: *a > i):

*Nɯ-dza-u > 1dzo4

If *-u had been added after 'brightening', the o-stem might have been an iw-stem:

*Nɯ-dzi-u > *1dziw4
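
The ordering argument can be sketched as two toy rules applied in different orders. This is only an illustration under simplifying assumptions: the function names `brighten` and `add_u` are hypothetical, the presyllable, tone, and grade are left out, and the fusions *a-u > o and *i-u > iw are taken directly from the two derivations above rather than from any fuller rule set.

```python
def brighten(stem):
    """'Brightening': raise and front a stem-final *a to i; other finals are unchanged."""
    return stem[:-1] + "i" if stem.endswith("a") else stem


def add_u(stem):
    """Attach the third person object suffix *-u and fuse it with the stem vowel."""
    if stem.endswith("a"):
        return stem[:-1] + "o"  # *a-u > o
    if stem.endswith("i"):
        return stem + "w"       # *i-u > iw
    return stem + "u"           # fallback for finals not discussed here


ROOT = "dza"  # root of *Nɯ-dza, with presyllable, tone, and grade omitted

bare_stem = brighten(ROOT)                     # no suffix: only brightening applies
suffix_first = brighten(add_u(ROOT))           # *-u added before brightening
brightening_first = add_u(brighten(ROOT))      # *-u added after brightening

print(bare_stem, suffix_first, brightening_first)
```

Only the suffix-first order yields an o-stem beside the i-stem; the reverse order yields the hypothetical iw-stem, which is why the suffix must predate 'brightening'.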

8.22.23:32: Note that *-u is not present in all verb forms for third person objects. In that respect Tangut differs from northern Qiang, in which a cognate suffix -w is consistently present in all verb forms for third person objects.

