At a glance, Tangut and Tangut period northwestern Chinese (hereafter simply 'Chinese') phonology appear to be similar: 

- They had largely overlapping consonant inventories with a three-way distinction between voiceless unaspirated, voiceless aspirated, and prenasalized voiced: e.g., p- : ph- : b- [mb].

Tangut, however, had more consonants: gh-, lh-, ld-, r-, z- [ɮ].

And Chinese had an f- absent in most Tangut reconstructions (the exceptions being Nishida's and Arakawa's).

- They had six basic vowel types: u, i, a, y, e, o.

- These vowels had four types of variations ('grades').

Tangut, however, had further variations absent from Chinese: tension, retroflexion, and the mysterious quality that I write with -' and call 'prime'.

- They contrasted oral and nasal vowels.

- Their syllables had the structure C(w)V(G); they only permitted -w and perhaps -j in coda position.

Despite many common features, it would be an exaggeration to say that the two languages share a common phonology. Notice that I have not mentioned tones. There does not seem to be any correlation between the two 'tones' of Tangut and Chinese tonal categories: e.g., Chinese 龍 *1lon3 'dragon' was borrowed twice with both tones:

4897 1lon3 and 4203 2lon3

This could imply that Tangut and Chinese tones sounded very different, making one-to-one mapping between them impossible.

Or perhaps Tangut had phonations (plain vs. breathy?) instead of tones despite the use of 'tone' in the Tangut phonological tradition. The Tangut couldn't hear tones because they didn't have any. (I am now skeptical of the phonation hypothesis that I came up with in the late 90s. If Tangut had phonation and Chinese didn't, why didn't the Tangut simply borrow and transcribe all Chinese tones with Tangut clear phonation?)

One last possibility - as yet unexplored - is that the Tangut were sensitive to sandhi variants of tones. Suppose, for instance, that Tangut and Chinese tones 1 and 2 were similar, and that Chinese tone 1 became tone 2 before tone 4: 龍栢 */1lon3 4pe2/ > [2lon3 4pe2] 'dragon cypress'. Then it would make sense to borrow that disyllabic word as

4203 4119 2lon3 1pi2

with the second tone while borrowing monosyllabic 龍 /1lon3/ = [1lun3] 'dragon' as

4897 1lon3

with the first tone. But why, then, was Chinese 龍栢 */1lon3 4pe2/ 'dragon cypress' transcribed (as opposed to borrowed) in the Timely Pearl as

4897 5970 1lon3 1pi2

with the first tone rather than the second? Here are five explanations:

1. The most boring, namely, that this is a random error.

2. Hypercorrection: the transcriber knew that the Chinese word for 'dragon' had tone 1 and might have assumed that tone 2 in the Tangut loan deviated from the Chinese (when in fact it reflected Chinese tone sandhi).

3. The transcription reflects a careful Chinese reading pronunciation "1lon3 ... 4pe2" without tone sandhi.

4. The transcription reflects a variant Chinese pronunciation without tone sandhi - perhaps from a dialect slightly different from the source of the Tangut borrowing.

5. The borrowing reflects a slightly earlier stage of Chinese with tone sandhi and the transcription reflects a slightly later stage without tone sandhi (and with the original first tone restored by analogy with 'dragon' in isolation?).
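The sandhi hypothesis above can be sketched in a few lines of Python. This is only an illustration of the supposed rule (Chinese tone 1 becomes tone 2 before tone 4); the function name `apply_sandhi` and the representation of a word as a list of tone numbers are my own assumptions, not anything attested.

```python
def apply_sandhi(tones):
    """Return surface tones under the hypothetical rule: 1 -> 2 before 4."""
    surface = list(tones)
    for i in range(len(tones) - 1):
        # The rule looks at the underlying tone of the following syllable.
        if tones[i] == 1 and tones[i + 1] == 4:
            surface[i] = 2
    return surface


print(apply_sandhi([1, 4]))  # 龍栢 */1lon3 4pe2/ 'dragon cypress'
print(apply_sandhi([1]))     # 龍 /1lon3/ 'dragon' in isolation
```

Under this sketch the disyllable surfaces with tone 2 on its first syllable while the monosyllable keeps tone 1, which is the pattern the two borrowings 4203 and 4897 would reflect.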

The tones are not the only differences between Tangut 2lon3 1pi2 'dragon cypress' and its Chinese source lon3 4pe2. I'll explore the others in part 2.

DISSECTING A TANGUT MARRIAGE (PART 5)

If 5051 (second half of 1y4 1naq4 'marriage'; Boxenhorn code: biogeodex) could be abbreviated to resemble 2544 'sage' (Boxenhorn code: geo) in 0532 2ge4 'to marry' (Boxenhorn code: hosgeo), why wasn't it abbreviated that way in other derivatives?

3657 1y4 (first half of 1y4 1naq4 'marriage'; Boxenhorn code: giibiogeo)

1625 2tuq4 'to mate, marry' (Boxenhorn code: fosbiogeo)

5975 1naq4 'parallel, weft' (Boxenhorn code: palbiogeo)

In other words, why do those three characters have a 'hat' (bio) absent in 0532?

I think 3657 needed a 'hat' (bio) to distinguish it from an existing character without it:

2449 2bi1 'sun' (Boxenhorn code: giigeo)

2449 must precede 3657 in the chronology of tangraphic creation.

But there are no characters with the structures that 1625 and 5975 would have without their 'hats', so in theory the 'hats' (bio) are redundant, though their presence does make the connection of 1625 and 5975 to 5051 more transparent.
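The redundancy argument can be sketched as a lookup: strip the 'hat' (bio) from a Boxenhorn code and see whether the result collides with another character. The inventory below is a toy one containing only the codes quoted in this post, and the assumption that a code splits into three-letter component names is mine, based on those examples alone.

```python
# Toy inventory: only the codes quoted in this post, not the full script.
CODES = {
    "5051": "biogeodex",
    "2544": "geo",
    "0532": "hosgeo",
    "3657": "giibiogeo",
    "1625": "fosbiogeo",
    "5975": "palbiogeo",
    "2449": "giigeo",
}


def components(code):
    """Split a code into three-letter component names (an assumption)."""
    return [code[i:i + 3] for i in range(0, len(code), 3)]


def hat_is_distinctive(number, codes=CODES, hat="bio"):
    """True if removing the hat yields the code of another listed character."""
    code = codes[number]
    hatless = "".join(c for c in components(code) if c != hat)
    return hatless != code and hatless in codes.values()


for number in ("3657", "1625", "5975"):
    print(number, hat_is_distinctive(number))
```

With this inventory only 3657 comes out as needing its hat, because its hatless form would coincide with 2449; a real check would of course need the codes of all tangraphs.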

I am reminded of the inconsistency of simplification in the postwar Japanese script:

- 獨 'alone' was simplified to 独 (with the phonetic 蜀 'the state of Shu' reduced to 虫 'bug')

- but 濁 'muddy' was not simplified to 浊 even though no such character already exists (and years later, 濁 was simplified to 浊 in the PRC).

There is no deep meaning behind the inconsistency of 独 and 濁. Perhaps there is none behind the inconsistency of

0532 without a 'hat' (bio)

on the one hand and

1625 and 5975 with 'hats' (bio)

on the other.

DISSECTING A TANGUT MARRIAGE (PART 4)

Many Tangut marital characters from the previous parts contain

2544 2shen4 'sage' < Chinese *3shen3

and if one had never known about 2544, one might guess that it was a semantic component 'marry'. But it acquired that secondary function as an abbreviation of 5051:


5051 1naq4 = 3657 2705 2546 2705 1y4 2ber'4 1naq4 2ber'4

(first half of 1y4 1naq4 'marriage') right + 'god' right
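For readers who want to process these analyses mechanically, here is a minimal sketch of how one such line could be parsed. It assumes the layout visible in the line above: the analyzed character and its reading before the equals sign, then the source character numbers, then their readings in the same order. The function name `parse_analysis` is hypothetical, and the sketch does not handle glosses or asterisks on the line.

```python
def parse_analysis(line):
    """Split 'NUM READING = n1 n2 ... r1 r2 ...' into head and (num, reading) pairs."""
    lhs, rhs = line.split("=", 1)
    head_number, head_reading = lhs.split()[:2]
    tokens = rhs.split()
    half = len(tokens) // 2
    # First half: character numbers; second half: readings in the same order.
    pairs = list(zip(tokens[:half], tokens[half:]))
    return (head_number, head_reading), pairs


head, sources = parse_analysis(
    "5051 1naq4 = 3657 2705 2546 2705 1y4 2ber'4 1naq4 2ber'4"
)
print(head)
for number, reading in sources:
    print(number, reading)
```

Pairing the numbers with the readings this way makes it easy to spot a reading that does not belong to its character.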

2544 'sage' is semantic in 2546 'god', the phonetic of 5051. I have no doubt about the first half of the Tangraphic Sea analysis of 2546:


2546 1naq4 = 2544 1602 0149 0737 2shen4 2ngorn1 2wer1 1chhen3

'sage' all + 'protect' bottom

But I have doubts about the second half. 0149 must be derived from 2546 rather than the other way around. The 'person' on the right of 2546 is either simply 'person' (but why would 'god' have 'person'?) or an abbreviation of one of the 1,186 (!) tangraphs containing 'person'.

Someone (I?) should try to reconstruct a chronology of the derivation of tangraphs based on the Tangraphic Sea derivations plus common sense. Here's a sliver of that chronology:

In words: 2544 begat 2546, which in turn begat 5051 and 0149.

5051 begat 3657, 1625, 5975, and 0532 (but why does 0532 lack the 'horned hat' of the others?).

5138 begat 5138 1gu'1, first syllable of 1gu'1 1chhiw4, the name of a Tangut god (1chhiw4 is 'six').
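A chronology of this kind is a partial order, so it can be sketched as a topological sort. The snippet below uses only the 'begat' relations for 2544, 2546 and 5051 stated above and the standard-library `graphlib` module (Python 3.9+); the names `BEGAT` and `chronology` are mine. It is a sketch of the idea, not a reconstruction method.

```python
from graphlib import CycleError, TopologicalSorter

# parent -> children ('begat'), as stated in the text above
BEGAT = {
    "2544": ["2546"],
    "2546": ["5051", "0149"],
    "5051": ["3657", "1625", "5975", "0532"],
}


def chronology(begat):
    """Return one creation order in which every parent precedes its children."""
    sorter = TopologicalSorter()
    for parent, children in begat.items():
        for child in children:
            sorter.add(child, parent)  # child depends on parent
    return list(sorter.static_order())


print(chronology(BEGAT))

# Circular derivations cannot be ordered, so they have to be resolved by hand.
try:
    chronology({"3657": ["5051"], "5051": ["3657"]})
except CycleError:
    print("circular derivation: no consistent order")
```

The second call shows why 'common sense' is needed on top of the Tangraphic Sea: a pair of characters each derived from the other yields no order until one of the two derivations is rejected.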

Next: Why don't all married sages wear hats?

DISSECTING A TANGUT MARRIAGE (PART 3)

As I wrote in part 2, I thought that 5051 1naq4

was simply phonetic in its homophone 5975:


5975 1naq4 'parallel, weft' = 5938 3936 5051 3936 2ge4 1pha1 1naq4 1pha1

'classical text, warp' left + (second half of 1y4 1naq4 'marriage') left

But then I discovered that 5938, listed as the source of the left side of 5975, had a homophone

0532 2ge4 'to marry'

which the Tangraphic Sea lists as a definition for the first half of

3657 5051 1y4 1naq4 'marriage'.

Is 0532 'to marry' a metaphorical extension of 5938 'warp' (in the sense of weaving)? Li (1997: 104) defined 0532 as 'weave, marry' - to which STEDT added '(join in marriage)' - but the revision of the entry for 0532 in Li (2008) has the definition 'to marry, to unite in marriage' without any reference to weaving.

If 0532 is originally a weaving term, then could 5051 1naq4 of 3657 5051 1y4 1naq4 'marriage' also originally be a weaving term - specifically, an extended usage of 5975 1naq4 'parallel, weft'?

3657 1y4 is attested as an independent word 'marriage, matchmaker, relatives by marriage'. 3657 5051 1y4 1naq4 'marriage' is thus originally 'marriage weft' with the first half clarifying the metaphorical use of the second half which does not occur on its own in the sense of 'marriage'.

Do 5938 2ge4 < *Nɯ-Kan/ŋ ~ *Cɯ-ŋgan/ŋ 'warp'* and 5975 1naq4 < *Sɯ-naC 'weft' have cognates outside Tangut? Unfortunately, neither 'warp' nor 'weft' is in the rGyalrongic Languages Database. Both are at STEDT, but I can't find any cognates there or in Guillaume Jacques' Japhug dictionary which lists tɤ-ʁjar 'warp' and tɯ-jlɤβ 'weft'.

*I reconstruct a presyllable with *ɯ to condition Grade IV after a velar. However, I do not know whether that presyllable had a nasal initial *N- or preceded a nasal. I also do not know if the velar stop after the nasal was originally voiced or not. In any case, Tangut g- is from *ŋg- which may in turn have more complex origins.

DISSECTING A TANGUT MARRIAGE (PART 2)

The character for the second half of

3657 5051 1y4 1naq4 'marriage'

has two probable derivatives besides the first character:


1625 2tuq4 'to mate, marry' = *0482 3936 5051 3936 2dzen4 1pha1 1naq4 1pha1

*'to copulate' left + (second half of 1y4 1naq4 'marriage') left?


5975 1naq4 'parallel, weft' = 5938 3936 5051 3936 2ge4 1pha1 1naq4 1pha1

'classical text, warp' left + (second half of 1y4 1naq4 'marriage') left

The analysis of 1625 is my guess since it is one of the many characters whose analysis was in the lost second tone volume of the Tangraphic Sea.

0482 is the clarifier of 1625 in Homophones, so it is certain that the Tangut considered the two to be semantically related even if 0482 was not actually in the analysis of 1625.

1625 2tuq4 should go back to pre-Tangut *Sɯ-to-H:

*S- conditioned the tension of the vowel transcribed as -q.

*-ɯ- conditioned Grade IV in lower vowels (*a, *e, *o) after dentals.

I am assuming that the raising of *o to *u postdated the conditioning of Grade IV.

I could be wrong. Maybe Grade IV was conditioned by a raised *o after a dental:

*S(ɯ)-to-H > *S(ɯ)-tu-H > 2tuq4

If so, maybe there was no *ɯ after *S-.

*-o raised to -u (Jacques 2014: 206); whether this occurred before or after Grade IV is uncertain.

*-H conditioned tone 2; it may ultimately be from *-ʔ or *-h (< *-s).
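The conditioning factors listed above can be put together in a small sketch. Everything here is a simplification under stated assumptions: the function `derive`, the set `DENTALS`, and the default of tone 1 in the absence of *-H are mine; Grade IV is checked on the unraised vowel, which is only one of the two orderings considered above; any other pre-Tangut coda is ignored; and the grade is left as "?" wherever the rules in this post say nothing.

```python
DENTALS = {"t", "th", "d", "n"}   # assumption: which initials count as dental
LOWER_VOWELS = {"a", "e", "o"}
RAISING = {"o": "u"}              # *-o raised to -u


def derive(prefix, presyllable_vowel, initial, vowel, final_h=False):
    """Derive a Tangut syllable from a pre-Tangut *PREFIX+V-INITIAL+VOWEL(-H)."""
    tone = "2" if final_h else "1"             # *-H conditioned tone 2
    tension = "q" if prefix == "S" else ""     # *S- conditioned tension (-q)
    if presyllable_vowel == "ɯ" and initial in DENTALS and vowel in LOWER_VOWELS:
        grade = "4"                            # *-ɯ- conditioned Grade IV
    else:
        grade = "?"                            # not covered by these rules
    vowel = RAISING.get(vowel, vowel)
    return f"{tone}{initial}{vowel}{tension}{grade}"


print(derive("S", "ɯ", "t", "o", final_h=True))  # *Sɯ-to-H
print(derive("S", "ɯ", "n", "a"))                # *Sɯ-naC, coda ignored
```

The two calls correspond to 1625 2tuq4 and 5975 1naq4 respectively.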

*Sɯ-to-H might go back to an even earlier *Sɯ-ton-H if it is cognate to forms for 'to marry' like

Somang rGyalrong ston muŋ ka-pa

Daofu sto lmo və  (is v- a lenited *p preserved in Somang?)

Xinlong Queyu ste⁵⁵ rmu⁵⁵ vi¹³ (did *o front before *-n?; v- < *p-?)

and if *Cɯ-...-on merged with *Cɯ-...-o into -u3/4. That would be parallel with the merger of *Cɯ-...-en and *Cɯ-...-e into -i3/4, and one could propose a general rule:

*Cɯ-...mid vowel + -n > Grade III/IV high vowel
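As a quick check that the proposed rule really produces the two mergers, here is a minimal sketch. The function name `merged_vowel` and the encoding of a rime as a plain string (vowel plus optional -n) are assumptions made for illustration; the sketch returns only the vowel and leaves the choice between Grade III and Grade IV aside.

```python
HIGH = {"e": "i", "o": "u"}  # mid vowel -> corresponding high vowel


def merged_vowel(rime, has_presyllable=True):
    """Apply the proposed rule to a pre-Tangut rime such as 'on', 'o', 'en', 'e'."""
    vowel, coda = rime[0], rime[1:]
    if has_presyllable and vowel in HIGH and coda in ("", "n"):
        return HIGH[vowel]
    return rime  # the rule says nothing about other rimes


for rime in ("on", "o", "en", "e"):
    print(rime, "->", merged_vowel(rime))
```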

5051 must be semantic in 1625 since the two sound nothing alike. Conversely, 5051 must be phonetic in 5975. But could it be something more? I didn't think so at first. I'll explain why I changed my mind next time.

DISSECTING A TANGUT MARRIAGE (PART 1)

The two halves of

3657 5051 1y4 1naq4 'marriage'

are written similarly, so it's not surprising that they have circular derivations in the Tangraphic Sea:


3657 1y4 = 3436 2705 5051 3936 1sa'1 2ber'4 1naq4 1pha1

(second half of 1ne4 1sa'1 'close relative') right + (second half of 1y4 1naq4 'marriage') left


5051 1naq4 = 3657 2705 2546 2705 1y4 2ber'4 1naq4 2ber'4

(first half of 1y4 1naq4 'marriage') right + 'god' right

2546 is clearly phonetic in its homophone 5051. So I think the sequence of character creation was

2546 > 5051 > 3657

though I am surprised the character for a second syllable was devised before the character for a first syllable.

Why was 5051 abbreviated in 3657? Because it was no longer phonetic, so there was no longer any need to keep all of 2546 under the 'horned hat'? Because the right-hand 'person' component (Boxenhorn code: dex) is so common (it appears in one out of five Tangut characters) that it is almost expendable? In any case, 5051 doesn't appear in its entirety as a component of any character.

Next: Other instances of 'depersonalized' 5051.

EATING BEGINS WITH LOVE, NOT MARRIAGE

Thanks to Guillaume Jacques for catching my mistake. The correct fanqie for 'eat' from "The Past and Present Sound of Eating in Tangut" (Part 1 / Part 2) is


4517 1dzi3 'eat' = 4973 1dzu4 'love' + 0932 1i3 'many, more, much'

The correct initial speller 4973 is visually very similar to 5051, the erroneous initial speller that I posted which represents the second half of the disyllabic word

3657 5051 1y4 1naq4 'marriage'

I will take a closer look at this word starting tomorrow.

Alas, the enigma of the final speller remains. Why is it Grade III instead of Grade IV after dz-?
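The fanqie for 'eat' can be checked mechanically. The sketch below assumes the romanization used in these posts (tone digit, initial, rime, grade digit, with the six basic vowels u, i, a, y, e, o) and assumes, as in Chinese fanqie, that the initial comes from the first speller and the tone, rime and grade from the second. The names `split_syllable` and `fanqie` are hypothetical.

```python
import re

# tone digit, initial (anything before the first vowel), rime, grade digit
SYLLABLE = re.compile(r"([12])([^uiayeo]*)([uiayeo].*?)([1-4])")


def split_syllable(syllable):
    """Split e.g. '1dzu4' into ('1', 'dz', 'u', '4')."""
    match = SYLLABLE.fullmatch(syllable)
    if match is None:
        raise ValueError(f"cannot parse {syllable!r}")
    return match.groups()


def fanqie(initial_speller, final_speller):
    """Combine the initial of one speller with the tone, rime and grade of the other."""
    _, initial, _, _ = split_syllable(initial_speller)
    tone, _, rime, grade = split_syllable(final_speller)
    return f"{tone}{initial}{rime}{grade}"


print(fanqie("1dzu4", "1i3"))  # correct initial speller 4973
print(fanqie("1naq4", "1i3"))  # erroneous initial speller 5051
```

The first call spells 1dzi3 'eat'; the second shows what the mistaken speller would have implied. Note that the grade is simply copied from the final speller, which is exactly where the Grade III puzzle lies.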

