Archives

12.1.23.2:57: DATING THE DATIVE

On Friday night, I looked for every instance of the Old Korean dative-locative particles in the hyangga corpus. I've listed these attestations in chronological order below:

Spelling Poem Author King at time of composition
(前)乃 *(tsyən)-ʌy or *(tsen)-ɯy 願往生歌 廣德 文武王 (Shilla; 661-681)
?衣希 *ɯyhɯy
*hɯy 獻花歌 佚名老人 聖德王 (Shilla; 702-737)
*la 祭亡妹歌 月明師 景德王 (Shilla; 742-765)
也中 *laKɯy, 惡希 *axɯy 讚耆婆郎歌 忠談師
良中 *laKɯy,  羅 *la 禱千手觀音歌 希明
*la 處容歌 處容 憲康王 (Shilla; 876-886)
*ɯy 禮敬諸佛歌 均如大師 (923-973) (Koryo Dynasty; exact dates of composition unknown)
良衣 *laɯy, 惡中 *akɯy 稱讚如來歌
*la 廣修供養歌
*la 隨喜功德歌
惡之 *akɯy, 阿希 *ahɯy 請轉法輪歌
*aKɯy, (夜)未 *(pam)-ɯy 請佛住世歌
惡中 *akɯy 普皆迴向歌

All of the pre-均如 Kyunyŏ attestations are in the 三國遺事 Samguk yusa that was compiled centuries later. The 均如傳 Kyunyŏjŏn (1075) containing all ten of Kyunyŏ's hyangga appeared over a century after his death. Hence the extant spellings may or may not match the originals.

I have revised my reconstructions of the Old Korean particles. My reasoning for my new reconstructions follows.

(前)乃 *(tsyən)-ʌy: cf. Sino-Korean 前 tsyən 'front', 乃 nʌy. The phonogram 乃 represents the final consonant of 'front' and the locative particle. *ə and *ʌ are in different Middle Korean vowel classes but coexist here, implying that Old Korean did not yet have vowel harmony.

But perhaps 前 'front' was *tsen with a nonhigh vowel in the same harmonic class as *ʌ.

Sino-Korean readings generally date from the 8th century, so perhaps an earlier Late Old Chinese reading *nəyʔ influenced the choice of 乃. LOC 乃 *nəyʔ was a phonogram for Old Japanese *nə (since there was no LOC *nə) and may be a phonogram for OK *n.ɯy (since there was no LOC *nɨy or *nɯy). Maybe there is no need to reconstruct an *-ʌy locative. All other evidence below points to an *-ɯy locative.

衣希 *ɯyhɯy: *ɯy 'genitive' + *hɯy 'locative'? Kim Wan-jin (1980: 116) rejects the locative interpretation of these graphs.

衣 *ɯy: cf. Sino-Korean ɯy, Late Old Chinese *ʔɨəy; is this *ɯy < *ɣɯy < *xɯy < *kɯy, or is it the genitive reused as a locative? If my earlier reconstructions with *-ʌy were correct, this version of the postposition would have been transcribed with a graph read as ʌy in Sino-Korean and *ʔəy in Late Old Chinese: e.g., 愛.

希 *hɯy: cf. Sino-Korean hɯy, Late Old Chinese *xɨəy. If my earlier reconstructions with *-ʌy were correct, this version of the postposition would have been transcribed with a graph read as hʌy in Sino-Korean and *xəyʔ in Late Old Chinese: e.g., 海.

良 *la: alternates with 也 *la; cf. Sino-Korean rang, the use of this graph for Old Japanese ra (an idiosyncratic usage that spread from the Korean peninsula?), and its idu reading a (< *la?). An example of an OK *l that later disappeared? 良 ~ 惡 alternations suggest synchronic allomorphy (*l after some nouns, zero initial after others), or 良 could be an archaism reflecting a lost *l- whereas 惡 could be an innovation reflecting the zero initial after *l- was lost.

也 *la: alternates with 良 *la; cf. Old Chinese *ljajʔ

中 *Kɯy: initial consonant could be *k or *h; *-ɯy based on 希, 衣; cf. idu reading aɯy for 中

惡希 *axɯy: cf. Sino-Korean ak, hɯy; is -k + h- an attempt to write *x? Cf. the romanization of Cyrillic х as <kh>. Maybe OK h was simply *[x], the voiceless counterpart of *ɣ, and *[x] didn't back to h until ɣ was lost by the end of the 16th century. It's also possible that ɣ was *[ɦ], the voiced counterpart of h.

Before end of 16th century    *x (or *h?; hangul ㅎ)    *ɣ (or *ɦ?; hangul ㅇ)
End of 16th century           h                         zero

The exact place of articulation may not matter because at no point was there a phonemic distinction between velar and glottal fricatives in Old or Middle Korean. By convention, the Middle Korean back fricatives are romanized as h and ɣ (G in Yale romanization).

羅 *la: cf. 良 *la ~ 也 *la; Sino-Korean ra

良衣 *laɯy: could be *laɣɯy? No Sino-Korean readings with ɣ-.

惡中 *akɯy: cf. Sino-Korean ak

惡之 *akɯy: 之 is 'genitive'; cf. Middle Korean ɯy 'genitive', 希, 衣 ending in *-ɯy

I wrote,

The spelling 惡之 <ak.GENITIVE> suggests that OK *akʌy may in turn be a compound of *ak 'locative' (?) plus *ʌy 'genitive'.

Now I reject this analysis in favor of *(l)a + *kɯy. If the morpheme boundary were between *k and *ɯ (my earlier *ʌ), I would expect to see the monosyllabic *l-variant spelled as *lak, not *la.

阿希 *ahɯy: cf. Sino-Korean a

(夜)未 *(pam)-ɯy: 夜 is a logogram for 'night', a word ending in *-m and presumably cognate to Middle Korean pam 'id.'; 未 Late Old Chinese *mɨyh represents the final *-m of 'night' and the locative suffix *ɯy

One might expect to see different stages of lenition of the medial consonant across the centuries: e.g.,

*akɯy > *axɯy > *aɣɯy > *aɯy

But in fact the spellings, taken at face value, indicate no such sequence. The earliest spelling with a consonant, 希, has *h, whereas the latest spellings, which might all be from the same man, mix *k, *h, and even *ɣ or what may have been zero: 惡中 ~ 惡之, 阿希, 良衣.

Here are two possible explanations:

1. The 惡-spellings are archaic and reflect a pronunciation with *k that was already extinct by the 8th century. The 希-spellings reflect the *h that was current. By Kyunyŏ's time, even *h might have lenited to *ɣ or zero, resulting in 良衣 *laɯy.

2. The spellings reflect synchronic variation: a spectrum ranging from *akɯy to *aɯy, and even just *ɯy, existed as late as Kyunyŏ's time.
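
The expected chain and the spellings actually found can be set side by side in a short Python sketch. The stage numbers, the function names, and the decision to count *h as the same stage as *x are illustrative assumptions of this sketch; the forms themselves are the reconstructions given above.

```python
import re

# Expected lenition stages of the medial consonant, earliest first.
# Assumption: *h counts as the same stage as *x, since there was no
# phonemic velar/glottal fricative contrast.
MEDIAL_STAGE = {"k": 0, "x": 1, "h": 1, "ɣ": 2, "": 3}


def expected_chain(stem="a", ending="ɯy"):
    """Return the hypothetical chain *akɯy > *axɯy > *aɣɯy > *aɯy."""
    return [stem + medial + ending for medial in ("k", "x", "ɣ", "")]


def stage_of(form):
    """Map a reconstructed *(l)a(C)ɯy locative to its lenition stage."""
    match = re.fullmatch(r"l?a([kxhɣ]?)ɯy", form)
    if match is None:
        raise ValueError(f"not an *(l)a(C)ɯy form: {form}")
    return MEDIAL_STAGE[match.group(1)]


# Kyunyŏ's spellings paired with the reconstructions listed above.
KYUNYO = {"惡中": "akɯy", "惡之": "akɯy", "阿希": "ahɯy", "良衣": "laɯy"}

if __name__ == "__main__":
    print(expected_chain())
    for spelling, form in KYUNYO.items():
        print(spelling, form, stage_of(form))
```

Kyunyŏ's four spellings land on stages 0, 0, 1, and 3, which is the mixture that a strictly chronological reading of the chain would not predict.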

My reconstructions conflict with some post-OK idu readings attested in hangul (discrepancies in bold):

良中: ahʌy, ahay, aəy, əɯy as well as aɯy

The -h- is more archaic and dates before Kyunyŏ's time.

The vowel variation reflects different harmonizations of OK *ahɯy:

ahʌy, ahay, aəy: lower second vowel to harmonize with the first

aəy is not harmonic according to Middle Korean rules, but ə is still more like a than *ɯ is

əɯy: raise first vowel to harmonize with the second
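
As a rough check on the list above, here is a minimal Python sketch that tests each idu reading for Middle Korean vowel harmony. It assumes the usual class membership ('yin' ɯ, ə, u; 'yang' ʌ, a, o; neutral i) and treats the offglide y as irrelevant to harmony; the function names are invented for illustration.

```python
# Middle Korean vowel classes (assumed): 'yin' (higher) and 'yang' (lower).
# The neutral vowel i and the offglide y are simply ignored.
YIN = set("ɯəu")
YANG = set("ʌao")


def harmony_classes(form):
    """Return the set of non-neutral vowel classes found in a form."""
    classes = set()
    for ch in form:
        if ch in YIN:
            classes.add("yin")
        elif ch in YANG:
            classes.add("yang")
    return classes


def is_harmonic(form):
    """A form is harmonic if it does not mix yin and yang vowels."""
    return len(harmony_classes(form)) <= 1


IDU_READINGS = ["ahʌy", "ahay", "aəy", "əɯy", "aɯy"]

if __name__ == "__main__":
    for reading in IDU_READINGS:
        print(reading, is_harmonic(reading))
```

Under these assumptions ahʌy, ahay, and əɯy count as harmonic, while aəy and aɯy do not.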

I think *lakɯy is the most archaic form since

apheresis (*l-loss) is more likely than prothesis (*l-addition)

to dissimilation goes against a trend toward vowel harmony between OK and Middle Korean.


12.1.20.5:40: OLD KOREAN *LA(NG)-CATIVES

*La-st, er, last night I rediscovered Alexander Vovin's (2000) article "Pre-Hankul Materials, Koreo-Japonic, and Altaic" which dealt with Old Korean locatives in section 11. He did not include the locative 惡中 that I've been writing about but did cover some others. I've rethought my reconstruction and have revised it to incorporate some of Vovin's ideas:

Old Korean spelling Old Chinese readings Middle Chinese readings Premodern Sino-Korean readings Idu reading (Yu 1964) Vovin's (2000) Old Korean My Old Korean until now My latest Old Korean
良中 *raŋ truŋ *lɨaŋ ʈuŋ ryang tyung ahʌy, ahay, aəy, aɯy, əɯy *lang *(l)akʌy *(l)akʌy
也中 *ljajʔ truŋ *jæʔ ʈuŋ ya tyung (none) (none) *lakʌy
*raŋ *lɨaŋ ryang (r)a *(l)a *la
*raj *la ra (none) *la (none)
惡中 *ʔak truŋ *ʔak ak tyung (none) *akʌy *akʌy
*truŋ * ʈuŋ tyung aɯy
惡之 *ʔak *ʔak tɕɨ ak ci (none) (none)
惡希 *ʔak xəj *ʔak xɯj ak hɯi *axʌy
阿希 *ʔaj xəj *ʔa xɯj a hɯi *ahʌy

Vovin's *lang has nothing to do with the later idu readings, whereas I and others assume that the idu V(h)Vy readings are somehow related to the OK readings.

I agree with Vovin that 也 might have retained an archaic *l- in a Chinese dialect known to early Koreanic speakers. The original *l- of Old Chinese 舌 *m-let 'tongue' is still intact in Cantonese 脷 lei < *let-s today, roughly two millennia after it was lost in mainstream Late Old Chinese.

The spellings 阿希 and 惡希 indicate that intervocalic *-k- might have already begun to lenite in Old Korean. 希 may have been chosen for an archaic OC-like reading *xəj that was closer to OK *xʌy ~ *hʌy than the newer Sino-Korean reading hɯi. *akʌy, *axʌy, and *ahʌy exemplify three stages of lenition. Kyunyŏ (10 c. AD), author of nearly half the known OK corpus, used the spellings 惡中 and

The 惡-spellings are absent from idu, which may indicate that *-k-lenition was complete in the dialect(s) underlying idu, so writing the locative with 惡 *ak no longer made sense.

The Late Middle Korean locative -əy/-ay may be a contraction of OK *ahʌy < *axʌy < *akʌy. LMK -əy should come from an OK *-əkɯy, but the absence of such an allomorph suggests that -əy, the higher vowel counterpart of -ay, and the LMK vowel harmony system as a whole may not be old: i.e., it may postdate OK poetry.

The spelling 惡之 <ak.GENITIVE> suggests that OK *akʌy may in turn be a compound of *ak 'locative' (?) plus *ʌy 'genitive'.

請轉法輪歌 3.1-5 has the interesting phrase

法界惡之叱

<LAW.WORLD ak.GENITIVE s> *pəpkyəy akʌy s

which combines the locative akʌy with the other OK genitive morpheme s.


12.1.19.2:31: IN THE MIDDLE OF GOOD AND EVIL

I just realized that two spellings for the Old Korean locative postposition are

良中 <a.MIDDLE> with 良 'good'

惡中 <ak.MIDDLE> with 惡 'evil'

I didn't notice this earlier because 'good and evil' is 善惡 with a different morpheme 善 for 'good', not 良惡.

*ak 'evil' was borrowed from Chinese, though it's purely phonetic in OK 惡中 *akʌy.

I've been assuming that 良 *a was once read *la, a simplification of its Middle Chinese reading *lɨaŋ. But perhaps a native OK word a- 'good' underlay the use of 良 'good' as a phonogram for MK *a. However, there are no known descendants of such a word. The current native Korean gloss for 良 is ŏjil 'benevolent, virtuous'. Neither it nor Middle Korean tyoh- 'good' sounds like a-.

Unlike 惡, 良 could stand by itself as a locative postposition: e.g., after 'branch' in

枝良 (祭亡妹歌 7.4-5)

<BRANCH ?> 'on a branch'

Kim Wan-jin (1980: 124) read this as *kaci ra, but Lee and Ramsey (2011: 70) imply that the OK reading of 良 was cognate to the Middle Korean locative postposition ay/əy. If 良中/惡中 was *akʌy, then perhaps 良 alone was *a. And Kim in fact interpreted locative 良 as *a just two lines later:

彌陁剎良 (祭亡妹歌 9.3-6)

<mi.tha.char a> *mithachar a 'in the land of Amitabha'

(陁 = 陀; 彌陁剎 < 阿彌陀刹 Amitaabha-kṣetra 'Amitabha land')

Was the locative postposition *a after consonants (e.g., *mithachar) but *ra after vowels (e.g., *kaci)? Could this alternation be extended to 良中: *akʌy ~ *rakʌy? If OK 'sea' ended in a consonant like the later MK words parʌr and patah, then it must have been followed by *akʌy:

海惡中 (普皆廻向歌 3.4-6)

<SEA ak.MIDDLE> *pa(t)tVC akʌy 'in the sea'

The spelling 惡中 <ak.MIDDLE> was chosen to unambiguously indicate the *a-variant of the postposition after the consonant-final word *pa(t)tVC 'sea'.


12.1.18.23:59: A C-L-AS-H OF CODAS (PART 4: A MOMENTARY MISTAKE)

In part 3, I analyzed Old Korean

海惡中 (普皆廻向歌 3.4-6)

as 海惡 *pa(t)tak 'sea' + 中 akʌy. I was excited to see a spelling that seemed to reflect a final *-k corresponding to the -h of Middle Korean patah 'sea'. However, I overlooked another instance of 惡中 in 稱讚如來歌 right after an instance of the other OK spelling of 'sea' that I mentioned:

無盡辯才叱海等

<NO.LIMIT.DISCUSS.ABILITY s SEA.tʌrh>

一念惡中湧出去良

<ONE.MOMENT ak.MIDDLE GUSH.OUT.kə.ra>

一念惡中 is 'at (that) one moment'.

一念 'one moment' is probably *irnyəm, a borrowing from Late Middle Chinese *ʔir niem, a translation of Sanskrit kṣaṇa 'moment'. There is no later native Korean word ending in -ak meaning 'one moment', so it's unlikely that 一念惡 should be interpreted as a phonogram-final sequence <ONE.MOMENT.ak> *...ak. Hence I regard 惡中 as a locative postposition *akʌy in both 海惡中 <SEA ak.MIDDLE> and 一念惡中 <ONE.MOMENT ak.MIDDLE>.


12.1.18.8:20: A C-L-AS-H OF CODAS (PART 3: SINISTER SEAS)

I should have mentioned these Old Korean spellings of 'sea' much earlier in this series:

海等 (稱讚如來歌 3.6-7, 普皆廻向歌 5.4-5)

海惡 (! - 普皆廻向歌 3.4-5; stay tuned if this surprises you)

海 is a logogram for 'sea' that tells us nothing about the OK word(s) it represents. It's the following characters that are interesting.

Many Old Korean words are written with logogram-phonogram sequences: e.g.,

道尸 <ROAD.hli> *kil 'road'

夜音 <NIGHT.ɯm> *pam 'night'

二肹 <TWO.hɯr> *tuɣɯr 'two'

海等 is also such a sequence. 等 is a phonogram, but was it read as

Sino-Korean *tɯŋ 'class, grade'

or the OK ancestor of its loose Middle Korean translation equivalent tʌrh 'group, plural suffix'?

Either reading began with a *t that corresponds to the r of MK parʌr 'sea' and the t of MK patah 'sea'. So 海等 may have represented *pat ... But what followed the t? I'm guessing that 等 was read as something like MK tʌrh. Did

海等 <SEA.?tʌrh>

represent OK *pa(t)tʌrh, with a final cluster -rh that was simplified differently in two MK dialects?

Proto-Koreanic?    Old Korean                      Middle Korean
*pattarh           dialect A: 海等 *pa(t)tʌrh      parʌr (with *-tt- > *-t- > *-r-)
                   (dialect B form unattested)     patah (with *-tt- > *-t-)

Or was 'sea' simply OK *pa(t)tʌr? MK also had -rh words: e.g., tʌrh itself and hanʌr(h) 'heaven'. Some of these -rh words had variants ending in -r, but patah would be the only -h variant of a -rh word.

海惡 <SEA.ak>

with the phonogram 惡 *ak 'evil' looked like a spelling of OK *pa(t)tak with a final -k corresponding to the -h of MK patah. However, 惡 may not be part of the noun, as it's followed by 中:

海惡中 <SEA ak.MIDDLE> *pa(t) ...? akʌy

惡中 may be an alternate spelling of the OK locative postposition 良中 *akʌy (later read as ahʌy, ahay, aəy, aɯy, əɯy with two degrees of -k- lenition in idu texts).

Then again, Yu (1964: 782) lists aɯy as an idu reading of 中 sans 良*, so

海惡中 <SEA.ak MIDDLE> *pa(t)tak akʌy

could also be possible if the one-character spelling 中 for the locative can be projected back into OK. Lee and Ramsey (2011: 70) glossed OK 良, 中, 良中 as 'locative'. Some early Korean peninsular texts predating OK poetry have this un-Chinese usage of 中 as a case marker: e.g., 三月中 <THREE MONTH MIDDLE> = 'in the third month' in a box "believed to have been crafted in Koguryŏ in 451" (Lee and Ramsey 2011: 55).

This usage of 中 even spread to Japan. The Inariyama burial mound sword inscription from 471 or 531 has 七月中 <SEVEN MONTH MIDDLE> = 'in the seventh month'.

*The use of 良 Middle Chinese *lɨaŋ 'good' as a phonogram for early Korean a is difficult to explain. Perhaps this a was originally *la with a liquid that was later lost.


12.1.16.23:59: A C-L-AS-H OF CODAS (PART 2)

I am not at all convinced that *r ever became Korean h as I proposed in part 1. Two examples in two different positions (onset and coda) are insufficient evidence. Moreover, all other evidence links h to k and ng: e.g.,

Late Middle Korean səyh : Old Japanese saki- 'three', Late Middle Korean sək 'three' (allomorph before t-, c-: sək cah 'three feet')

Late Middle Korean cah : Early Middle Chinese 尺 *tɕhɨak 'foot'

Late Middle Korean tyəh : Late Middle Chinese 笛 *tɦiek or Early Middle Chinese *dek 'flute'

Late Middle Korean syoh : Late Middle Chinese 俗 *sɦyok 'vulgar'

Late Middle Korean zyoh : Late Middle Chinese 褥 *ɲʑyok 'mattress'

(1.17.00:24: These four borrowings may reflect a Chinese *-k that had lenited to *-ɣ just as Chinese *-t had lenited to *-r. See Lee and Ramsey 2011: 86.)

Early Middle Korean 亇支 *maki : Late Middle Korean mah 'yam' (Lee and Ramsey 2011: 87, 93; I can't find this word in Yu Chhang-don's 1964 dictionary)

Late Middle Korean stah : Modern Korean ttang 'earth' (Matisoff's rhinoglottophilia comes to mind.)

How can I reconcile these correspondences with my derivation of h from *r? Perhaps the Vietnamese word for 'carbon' may provide a clue. French carbone [kaʀbɔn] was borrowed into Vietnamese as các-bon [kaakɓɔn]. Did Korean

*r > > > > *x > *k

harden in certain environments: e.g., before *t and *c (and sporadically intervocalically*)? If so, why weren't all *r in those positions affected? Even Old Korean had words with final *r transcribed with 乙 Late Middle Chinese *ʔɨt.

*1.17.1:13: Old Japanese saki- 'three' may be borrowed from an early Koreanic *sahi with an intervocalic fricative. Perhaps

*səri > *səhi > səyh

in the Koreanic branch leading to Late Middle Korean. (The OJ loan may reflect a different branch with *a as the first vowel of 'three'.) Modern Korean sŏrŭn 'thirty' still has an -r- before -ŭn 'ten'.

LMK 'river' may have a similar derivation

*nari > *nahi > nayh

whose earliest proto-form resembles the Paekche word transcribed as

那利 Early Middle Chinese *na lih

katakana glosses nari ~ nare (Bentley 2000: 425)

There seem to have been at least two kinds of liquids in earlier Koreanic. One may have remained a liquid r/l in later Korean, whereas the other may have become h or even k depending on circumstances.

1.17.1:37: Two liquids contrasted in final position in Old Korean (Lee and Ramsey 2011: 66):

乙 Late Middle Chinese *ʔɨt < Early Middle Chinese *ʔɨət

尸 (Late/Early Middle Chinese *ɕi)

The interpretation of the latter is uncertain. I've assumed that it was chosen for a reading like Old Chinese *l˳i that survived as an archaism in the peninsula, but it could be an abbreviation of 卢 < 盧 Late/Early Middle Chinese *lo. Could 尸 have represented a voiceless liquid or a voiceless lateral fricative similar to Old Chinese *l˳i? An early Koreanic *l˳ or *ɬ could have sometimes lost its lateral quality and become *h.

The Old Korean prospective modifier suffix -尸 corresponds to Late Middle Korean -rʔ. Could the glottal stop be a remnant of an earlier *h?

*-l˳ > -lh (reanalyzed as a cluster?) > -rh > -rʔ

Old Korean is the language of Shilla. The prestige dialect of Late Middle Korean was from the capital region which was in former Paekche and Koguryo territory. Did non-Shilla Koreanic speakers pronounce Shilla *l˳ as *lh or *rh or even *h when they shifted to the Shilla language?

If 尸 represented a lateral, Old Korean 日尸 <DAY.l> 'day' might have been *nal (cf. Late Middle Korean nar). If early Koreanic *r and *l correspond to *r and *l in other Altaic languages, I would expect *nal to correspond to a non-Koreanic *-l word, but the matches that come to mind

Khitan *neir (transcribed in Chinese as 捏咿兒 *nie i r) 'day'

Written Mongolian nara(n) 'sun'

have r, not l!


12.1.15.23:57: A C-L-AS-H OF CODAS (PART 1)

So far, I've tried to explain the t : r correspondence in two Middle Korean words for 'sea' by reconstructing a geminate *tt as their common source:

patah < *-tt- (from a dialect in which geminates simplified after lenition)

parʌr < *-d- < *-t- < *-tt- (from a dialect in which geminates simplified before lenition)

MK ʌ could be a reduction of *a, so the ʌ : a correspondence is not a problem. However, I couldn't initially think of any other cases of an h : r correspondence. Then I remembered the Old Korean accusative particles

乙 ~ 肹 (for *ɯr ~ *hɯr?; cf. their Middle Chinese readings *ʔɨt and *xɨt)

corresponding to Middle Korean (r)ʌr ~ (r)ɯr*. Could OK *h- correspond to MK r-?

OK *hɯr does not occur exactly where I'd expect it to: i.e., only after vowel-final nouns:

吾肹 'I-ACC' (獻花歌 3.1-2); cf. MK na rʌr

but 花肹 'FLOWER-ACC' (獻花歌 4.1-2); cf. MK koc ʌr (not *koc-rʌr)

Here's an instance in which the *h of 肹 may actually be at the end of the preceding noun:

地肹 'EARTH-ACC' (安民歌 7.2-3); cf. MK stah ʌr (not *stah-rʌr)

The problem is that we cannot be sure how the logograms 吾, 花, and 地 were read in OK. They could have represented

words lacking final consonants which were unrelated to MK na, koc, stah

or the vowel-final ancestors of those MK words: e.g., OK *na, *kocV, *stahV

But let's suppose *hɯr is cognate to MK rVr. Could the h of OK *hɯr and MK patah reflect dialects in which */r/ was pronounced *[ʁ], a fricative that was both h- and r-like? Brazilian Portuguese has similar pronunciations of *ʁ:

a voiceless velar fricative [x], voiceless uvular fricative [χ], or a voiceless glottal fricative [h].

MK as a whole is not descended from such a dialect. Do any dialects exist today that have h where MK has r and standard Korean has r/l?

Next: The Các-Bon Conundrum

*1.16.2:20: The height of the vowel of the MK particle depended on the vowel of the preceding noun:

'yin' (higher) vowels: ɯ, ə, u:  (r)ɯr

'yang' (lower) vowels: ʌ, a, o: (r)ʌr

neutral vowel: i: either (r)ɯr or (r)ʌr

In the 朝鮮館譯語 (c. 1400), the Korean phoneme soon to be written in hangul as -ㄹ was transcribed in Chinese as 二 *r, so I interpret MK -ㄹ as [r], not [l]. In modern Korean, -ㄹ is [l].
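
The allomorph choice described in this note can be sketched in Python. This is only an illustration under two assumptions that the note does not spell out: that the last non-neutral vowel of the noun decides the class, and that the linking r- appears exactly after vowel-final nouns (as in na rʌr versus koc ʌr). The function name is made up.

```python
# Vowel classes as listed in the note above.
YIN = set("ɯəu")     # 'yin' (higher) vowels
YANG = set("ʌao")    # 'yang' (lower) vowels
VOWELS = YIN | YANG | {"i"}  # i is neutral


def accusative(noun):
    """Return the possible MK accusative allomorphs for a romanized noun.

    Assumption: the last non-neutral vowel of the noun decides the class.
    A noun with only neutral i gets both options, as the note allows.
    """
    linker = "r" if noun[-1] in VOWELS else ""
    for ch in reversed(noun):
        if ch in YIN:
            return [linker + "ɯr"]
        if ch in YANG:
            return [linker + "ʌr"]
    return [linker + "ɯr", linker + "ʌr"]


if __name__ == "__main__":
    for noun in ("na", "koc", "stah"):
        print(noun, accusative(noun))
```

With the three MK nouns cited in the main text, the sketch yields rʌr after na and ʌr after koc and stah.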


12.1.12.23:59: *B-UT *WATA-BOUT THE ONSETS? (PART 3)

In part 2, I reinterpreted the Psara evidence for Proto-Japonic *b- as presented in Vovin (2010: 37-38) as evidence for PJ *wu.

Vovin (2010: 38-40) also found mainland evidence for PJ *b- in the cities of 氷見 Himi and 魚津 Uozu which are on opposite sides of Toyama Bay. Here are the correspondences between Shimao (a town in Himi) and Tokyo based on data that Vovin found in Kawamoto (1973). V represents any vowel other than a.

Vovin-style Proto-Japonic? Shimao Tokyo
*ba ba-, -wa- wa-, -wa-
*Npa ba-, -ba-
*bV V V
*NpV bV bV

(PJ forms are my guesses based on my understanding of Vovin 2010.)

Uozu has the same pattern as Shimao.

Vovin (2010: 39) wrote,

Kawamoto suggests that initial /w-/ underwent a fortition to /b-/ (1973: 75) [in Himi and Uozu]. Structurally, this would be reasonable, but the geographic distribution makes parallel innovation in Himi and Uozu unlikely. From the viewpoint of linguistic geography, initial /ba-/ in Himi and Uozu looks instead like a retention.

In other words, it's unlikely that /w-/ hardened to /b/ independently in Himi and Uozu, so /b/ must be a retention. However, Himi and Uozu share what seems like a parallel innovation: the merger of *ba and *Npa as wa.

I think Kawamoto was right. I propose the following scenario. (Tables added 1.13.00:13.)

1. PJ had *w. (I also suspect that PJ had a *b which was not the source of b or w in Himi, Uozu, or Tokyo. However, my PJ *b plays no role in the following changes, so I will no longer mention it.)

*wa *wV
*Npa *pV

(*V = a non-a vowel.)

2. Pre-Toyama, the common ancestor of Himi and Uozu, lost w- before vowels other than a:

we, wi > ɥe, ɥi > ye, yi > e, i (glides assimilated to following vowels)

wo, wu > o, u

*Np became *b.

These changes also occurred in the ancestor of Tokyo:

wa V (no more w-)
ba bV

3. Pre-Toyama w became v.

va V
ba bV

4. Initial v- hardened to b- as in Spanish: va > ba. (I have added a new column for medial -va- and -ba-.)

ba < ba, va -va- V
-ba- bV

5. Medial ba (< PJ *Npa) and va were confused. Eventually the variant with a consonant more like the surrounding vowels (i.e., the fricative v, which had less constriction than the stop b) dominated.

ba -va- < -va-, -ba- V
bV

Cf. how Spanish intervocalic -b- and -v- merged as [β].

6. Speakers of the dialects between Himi and Uozu relearned the older pattern from step 2 still maintained in mainstream dialects like Tokyo, losing the un-Tokyo traits acquired during steps 3-5 above.

7. v shifted back to w under mainstream influence in Himi and Uozu.

ba -wa- V
bV

In short, Himi and Uozu share innovations that they independently retained after the dialects between them assimilated to dialects that lacked those innovations.
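
The ordered changes in steps 2-5 and 7 can be run as a small Python pipeline. This is a sketch over schematic shapes, not real words: the inputs ("wa", "aNpa", and so on) stand for the table cells, "Np" is written as two ASCII letters, and the function names are invented. Step 6 concerns the dialects between Himi and Uozu and is not modeled.

```python
import re

VOWELS = "aeiou"


def step2(form):
    """Shared with Tokyo: w lost before non-a vowels; *Np > b."""
    form = re.sub(r"w(?=[eiou])", "", form)
    return form.replace("Np", "b")


def step3(form):
    """Pre-Toyama w > v."""
    return form.replace("w", "v")


def step4(form):
    """Initial v- hardens to b-."""
    return re.sub(r"^v", "b", form)


def step5(form):
    """Medial -ba- merges with -va- in favor of the fricative."""
    return re.sub(rf"(?<=[{VOWELS}])ba", "va", form)


def step7(form):
    """v shifts back to w under mainstream influence."""
    return form.replace("v", "w")


def tokyo(form):
    return step2(form)


def toyama(form):
    for step in (step2, step3, step4, step5, step7):
        form = step(form)
    return form


if __name__ == "__main__":
    for shape in ("wa", "awa", "Npa", "aNpa", "wi", "Npi", "aNpi"):
        print(shape, toyama(shape), tokyo(shape))
```

On these schematic inputs the pipeline gives initial ba- and medial -wa- for the Toyama type against wa-, -wa-, ba-, -ba- for Tokyo, with the non-a shapes identical in both.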

Next: A C-l-as-h of Codas


12.1.11.23:59: *B-UT *WATA-BOUT THE ONSETS? (PART 2)

In part 1, I assumed that a Koreanic word for 'sea' was borrowed into early western Japonic as *bata with a voiced stop that later lenited to Old Japanese w-. Those who reconstruct *b as the source of OJ *w generally also reconstruct *d as the source of OJ *y. Some also reconstruct *z and *g that respectively became OJ Ø- ~ -s- (voiceless!) and Ø. These voiced proto-obstruents are not the sources of the OJ voiced obstruents which were prenasalized:

Proto-Japonic *b *d *z *g *Np *Nt *Ns *Nk
Old Japanese w y Ø- ~ -s- Ø b [mb] d [nd] z [nz] g [ŋg]
Modern Japanese Ø- ~ -s- ~ -sh- [ɕ] b [b] d [d] z [z] ~ [dz] ~ j [dʑ] g [g] ~ [ŋ]

Although I think Proto-Japonic (PJ) or strictly speaking, pre-PJ, had voiced obstruents, I don't think they lenited. I'll present my views on that elsewhere and confine my remarks here to the Japonic lenition theory.

Vovin (2010) recently proposed that PJ had *b- but not *d-:

*p *t *s *k
*Np *Nt *Ns *Nk
*b (no nonlabial voiced obstruents)
*m *n (no other nasals)
(no *w!) *r *y (no velar glide)

Is such a system plausible? I don't know of any current language that has only one voiced stop b.* The b and d sans backer voiced stops pattern is common in Southeast Asia: e.g., Khmer, Thai, Lao, Vietnamese, and standard Zhuang.

Vovin (2010) regarded Psara v- corresponding to OJ u- as the reflex of a lenited PJ *b-:

Vovin-style Proto-Japonic? Psara OJ
*u- u- u-
*bu- v(ụ)-, vu u-, u
*pu- vụ- pu-
*mu- v(ụ)- mu-

(PJ forms are my guesses based on my understanding of Vovin 2010. Do the subscript dots of Psara stand for devoicing which is represented with subscript circles in IPA?)

However, I would rather reconstruct PJ *w- instead of *b-: e.g.,

PJ *wu > Psara vu, OJ u 'hare' (calendrical)

12.1.12.1:59: If the monosyllabic term for 'hare' is cognate to Middle Japanese usagi (no earlier attestations?), then perhaps it is also cognate to the Koguryo word for 'hare' transcribed as 烏斯含 Middle Chinese *ʔosieɣəm. MC or Late Old Chinese may have lacked *wo, so I cannot tell if the Koguryo word had initial o- or wo-. MC *ie does not match MJ a. If MJ -i is from PJ *-əi, then PJ *ə may correspond to the ə of 含 MC *ɣəm.

Next: Sowa-t Happened in Shimao?

12.1.12.2:36: I can no longer doubt Vovin's PJ *b on typological grounds. I found a few languages in UPSID which have a single voiced stop b:

Alabaman (North America) and Paya (South America): unlike Vovin's PJ, both have w as well as b

Roro (Papua New Guinea): like Vovin's PJ, has b but not w

There may be others. I wish there were an easy way to search for combinations of sounds in UPSID.
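
If the inventories were available as plain segment lists, such a search would take only a few lines of Python. The sketch below is hypothetical: it does not read UPSID's actual file format, the three toy inventories are invented placeholders rather than real languages, and the set of voiced stops is a deliberately short list.

```python
# Plain voiced stops to look for (a short, non-exhaustive list).
# Both ASCII "g" and IPA "ɡ" are included in case a source mixes them.
VOICED_STOPS = {"b", "d", "ɖ", "ɟ", "g", "ɡ", "ɢ"}


def lone_voiced_stop(inventory, stop="b"):
    """True if `stop` is the only plain voiced stop in the inventory."""
    return set(inventory) & VOICED_STOPS == {stop}


def search(inventories, stop="b", with_segment=None, without_segment=None):
    """Return names of inventories whose only voiced stop is `stop`.

    Optionally require (or exclude) one further segment such as "w".
    """
    hits = []
    for name, segments in inventories.items():
        segments = set(segments)
        if not lone_voiced_stop(segments, stop):
            continue
        if with_segment is not None and with_segment not in segments:
            continue
        if without_segment is not None and without_segment in segments:
            continue
        hits.append(name)
    return sorted(hits)


# Invented toy inventories, NOT real UPSID data.
TOY = {
    "toy_A": "p t k b m n w".split(),
    "toy_B": "p t k b m n r y".split(),
    "toy_C": "p t k b d g m n w".split(),
}

if __name__ == "__main__":
    print(search(TOY))
    print(search(TOY, without_segment="w"))
```

The `without_segment="w"` query corresponds to the Roro type (b but no w), and `with_segment="w"` to the type with both b and w.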


12.1.10.23:59: *B-UT *WATA-BOUT THE ONSETS? (PART 1)

My previous entry only dealt with the medial consonant of Old Japanese wata and Middle Korean patah ~ parʌr 'sea'. The initial consonants of those words don't match. (The final consonants are also problematic, but I'll deal with them later.) If those words are indeed related, it's unlikely that Japonic speakers borrowed Koreanic *p- as w-. Therefore one or both initials must be innovations.

Some derive OJ w from Proto-Japonic *b. PJ *b- is closer than PJ *w- to MK p- but is still not a perfect match. Given that MK p- normally corresponds to PJ *p- in words such as

MK pər < *pətɯ* : PJ *pati 'bee'

why would early Koreanic *p- sometimes be borrowed as early western Japonic (EWJ)** *b-?

Solution 1: Early Koreanic *p- was unaspirated, whereas EWJ *p- was aspirated [ph]. Speakers of languages like English and, in this scenario, EWJ, with initial [ph] and [b] but no initial [p] may perceive foreign [p] as being like their [ph] or [b]. Hence early Koreanic *p- was borrowed at random as either EWJ *p- or EWJ *b-.

Solution 1a: If Koreanic and Japonic are related (which I doubt), EWJ *p- and EWJ *b- corresponding to Koreanic *p- are from two different strata of vocabulary: one inherited from Proto-Koreo-Japonic and another borrowed from Koreanic.

Solution 1b: If Koreanic and Japonic are not related, EWJ *p- and EWJ *b- corresponding to Koreanic *p- are from two different strata of borrowing from Koreanic.

(1a and 1b added 1.11.2:18.)

Solution 2: Early Koreanic had a *p-/*b-distinction and the Early Koreanic word for 'sea' had initial *b- which was borrowed as EWJ *b-. There are two problems with this scenario.

First, none of the attested transcriptions of 'sea' on the Korean peninsula had initial *b-:

Koguryo*** 波且: Middle Chinese *patshjaʔ (regarded by Ryu 1983: 520 as an error for 波旦 *patanh)
Koguryo 波利: Middle Chinese *palih

Shilla 波珍: Middle Chinese *paʈin

Shilla 波澄: Middle Chinese *paɖɨŋ

This does not rule out the possibility that EWJ speakers happened to borrow the word from a Koreanic language (Paekche?) that had not yet shifted *b- to *p-. Unfortunately, there is no known Paekche cognate of this word.

Second, interchangeable initial transcription characters in other words on the Korean peninsula such as

Koguryo: 夫 *p ~ *b : 伏 *buk (are there better examples in initial position?)

Paekche: 沸 *pujh : 避 *bieh

Paekche: 富 *puh : 伐 *buat

Shilla: 發 *puat : 伐 *buat

imply that early Koreanic did not have an initial *p-/*b-distinction and that early Koreanic speakers pronounced Chinese *p- and *b- as *p-.

Next: *D-u-*b-ious Voiced Stops in Proto-Japonic

*1.11.1:55: MK pər could be from Proto-Koreanic *pərɯ or *pətɯ with a LH pitch accent but must be from the latter if it is related to PJ *pati.

MK patʌri HHR ~ HHH 'wasp' is vaguely similar, though its vowels belong to the low class whereas *pəCɯ LH had high class vowels with a very different pitch accent pattern. Nonetheless, one could try to relate patʌri to *pəCɯ by positing a common root *pVttV with a geminate *-tt- that was simplified to *-t- and lenited in one dialect but not another.

In my last post, I wrote (emphasis mine),

In Hebrew, intervocalic single stops lenited, whereas intervocalic geminates were simplified [...] The same thing happened in one dialect of Koreanic

I did not intend to imply that Hebrew and Koreanic underwent exactly the same sound changes. In fact, none of the modern Hebrew and Koreanic reflexes of lenited stops are the same:

Lenited intervocalic stop          VpV            VtV            VkV
Modern Hebrew                      VfV            VtV < VθV      VxV
Modern Korean (< Middle Korean)    VwV < VβV      VrV            VV < VɣV

The Middle and Modern Korean reflexes are similar to the lenited stops of Tangut and Vietnamese:

Tangut vV, lV, ɣV < *VPV, *VTV, *VKV

Vietnamese [vV zV zV ɣV] < *VPV, *VTV, *VCV, *VKV (*C = palatal stop)

Moreover, I should have noticed that lenition also occurred in final position after vowels in Hebrew, whereas lenition was purely intervocalic in Korean and Tangut.

1.11.2:22: Finally, Korean and Tangut had lenited fricatives and affricates:

Modern Korean VV < Middle Korean VzV < Proto-Korean *V(t)sV

Tangut zV, ʒV < *V(T)SV, *V(T)ŠV

No fricatives lenited in Hebrew which originally had no affricates. (Modern Hebrew ts is from earlier emphatic s.)

**1.11.0:44: I use the term 'early western Japonic' here instead of PJ because Western OJ wata has no Ryukyuan or Eastern OJ cognates that would allow me to reconstruct it at the PJ level.

***1.11.1:59: I use terms like 'Koguryo', 'Paekche', and 'Shilla' to represent any languages or dialects spoken in those kingdoms. I am agnostic about the number of dialects of languages on the Korean peninsula prior to unification. I tentatively assume that the peninsular languages were all Koreanic with remnants of a Japonic substratum.


12.1.9.23:59: A HEBREW HINT FOR A MARITIME MYSTERY?

Japanese has two words for 'sea', one shared with Okinawan (umi < Proto-Japonic *omi) and another shared with Korean (Old Japanese wata). According to Vovin (2010: 12-32), Korean intervocalic *-t- became -r- at some point after Japanese borrowed wata from Korean, so one would expect the later Korean word to have -r-. But there are two Middle Korean words for 'sea', and only one has *-r-!

patah

parʌr (ʌ may be a reduction of *a)

Vovin derived MK intervocalic -t- from earlier *-nt-. But if the earlier Korean word were *panta, it should correspond to Old Japanese wada [wanda], not wata.

Here's what I think happened. In Hebrew, intervocalic single stops lenited, whereas intervocalic geminates were simplified: e.g.,

saapar > safar 'he counted'

sappaar > sapar 'barber'

(Examples from Hetzron 1993: 695.)

The same thing happened in one dialect of Koreanic:

*kətan > MK *kəran (unattested?) > modern kŏran 'Khitan'

*pattak > MK patah > modern pada 'sea'

However, in another Koreanic dialect, simplified intervocalic geminates also lenited:

*pattar > *patar > MK parʌr > (no modern descendant)

Thus Old Japanese wata corresponds to an early Koreanic *pat(t)a with or without a geminate prior to lenition.
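
The difference between the two dialects comes down to the order of two rules, which a minimal Python sketch can make explicit. It assumes only intervocalic *t > r and *tt > t; the later changes of final *-k to -h and of *a to ʌ are not modeled, and the function names are invented for illustration.

```python
import re

VOWELS = "aəʌɯiou"


def lenite(form):
    """Intervocalic single *t > r; a geminate *tt is left untouched."""
    return re.sub(rf"(?<=[{VOWELS}])t(?=[{VOWELS}])", "r", form)


def degeminate(form):
    """Simplify geminate *tt to t."""
    return form.replace("tt", "t")


def dialect_a(form):
    """Lenition first, then geminate simplification (as in Hebrew)."""
    return degeminate(lenite(form))


def dialect_b(form):
    """Geminate simplification first, so the new single t also lenites."""
    return lenite(degeminate(form))


if __name__ == "__main__":
    print(dialect_a("kətan"))   # single *t lenites
    print(dialect_a("pattak"))  # geminate escapes lenition
    print(dialect_b("pattar"))  # simplified geminate lenites
```

In dialect A the geminate escapes lenition (pattak > patak), whereas in dialect B the simplified geminate is lenited as well (pattar > patar > parar), matching the medial consonants of MK patah and parʌr.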

Next: *B-ut *W-hat about the Onsets?

Then: A C-l-as-h of Codas

1.10.1:40: Unfortunately, the earliest Chinese character transcriptions of Koreanic words for 'sea' do not point to Vovin's *-nt- or my *-tt-:

Koguryo 波且: Middle Chinese *patshjaʔ (regarded by Ryu 1983: 520 as an error for 波旦 *patanh; *-n could transcribe foreign *-r)
Koguryo 波利: Middle Chinese *palih (for *parih?; there was no MC *r)

Shilla 波珍: Middle Chinese *paʈin < Old Chinese *tər

Shilla 波澄: Middle Chinese *paɖɨŋ

I would expect OJ wata to be a borrowing from Paekche, the peninsular state that was the source of literacy and Buddhism in Japan, but the Paekche word for 'sea' was transcribed as 内米 MC *nəjmejʔ which vaguely resembles Japanese nami 'wave'. 内米 could also refer to ponds, so it may have meant 'body of water'. If Paekche had a word meaning only 'sea', it might have been cognate to MK patah and parʌr.

The earliest Chinese character transcriptions of names and titles from Japan had clusters that might represent geminates or tense consonants: e.g.,

邪馬臺 Late Old Chinese *jæmæʔdə 'name of the state of Yamatai' (for  *yamaddə?; Yamatai is the modern Sino-Japanese reading of the transcription; even the alternate spelling 邪馬壹 *jæmæʔʔit may have represented *yamaʔʔit(V) with a geminate)

彌馬獲支 Late Old Chinese *miemæʔwɛkkie 'a title of Yamatai' (for *mema(w)wekke?; cf. Proto-Ryukyuan *weke 'male' [Thorpe 1983: 304])

己百支 Late Old Chinese *kɨəʔpakkie 'name of a state' (for *kəppakke?)

好古都 Late Old Chinese *xouʔkɔʔtɔ 'name of a state' (for *hokkotto?; *h may have merged with zero in early Japonic; modern Japanese h- is from proto-Japonic *p-, not PJ *h-)

對蘇 Late Old Chinese *tuəssɔ 'name of a state' (for *tusso?)

Although the linguistic affiliation of these names is unknown, perhaps early Japonic also had geminates that were later reduced to Old Japanese single consonants and the Koreanic word for 'sea' could have been borrowed as *watta with a geminate. The geminates of modern Japanese would be unrelated to these early geminates.


12.1.8.23:59: JURCHEN POLYPHONY 3: THE WE BACK TO THE CAPITAL

I began this series with a Jurchen character

transcribed in Chinese as 苦 *ku 'bitter' and as 都蠻 *duman 'capital-southern barbarian', and I am ending it with another 'urban' Jurchen character which has a record number of different readings:

~~

Kiyose 70 (hereafter K70): <her> (J: <hele>) 'city'

phonogram for <hu>, <u>, <we>, (J: <huwe>), <e>, (Y: <o>), <du> (transcribed as 都 *du 'capital'), (J: <ke>)

The readings are from Kiyose (1977: 65, 127) except for those marked with 'J' from Jin (1984: 35) and 'Y' from Yamaji Hiroaki.

If a word spelled with <huwe> came to be spelled with <we>

>

<huwe> > <huwe.we> = huwe

then <huwe> could have been reinterpreted as a phonogram <hu>.

And just as <lha> might have once been <ilha>, <we> might have once been <huwe>:

>

<huwe> > <hu.huwe> reinterpreted as <hu.we> = huwe

If those derivations are correct, the number of readings of K70 can be reduced to seven: <her>/<hele>, <huwe>, <u>, <e>, <o>, <du>, <ke>. Perhaps each of the characters now regarded as variants originally had only one or two of these readings. Were there originally up to seven distinct characters?

1.9.3:15: Are we seeing the Jurchen equivalent of merging the mostly unrelated though similar-looking Chinese characters (all readings are in Cantonese)

tin

yau

jaat (derived from inversion of 曱 below; Ct 曱甴 gaatjaat is a disyllabic word 'cockroach')

san < Old Chinese *hlin

din < Old Chinese *lins (derived from 申 above)

gaap

gaat (derived from near-homophone 甲?)

into a single 'character'?

Next: A Hebrew Hint for a Maritime Mystery?

1.9.1:15: Jin (1984: 35) noted that K70

~~

resembled Chinese 左 *tso 'left' which was Jurchenized as

<dzo>

since Chinese unaspirated obstruents were borrowed as Jurchen voiced obstruents.

The Jurchen word for 'left' was

<hai.su> (cf. Manchu has'hu; <su> is derived from the right side of Chn 穌 *su)

<dzo>, <hai.su>, and their graphs bear no resemblance to K70 and its readings. So is the resemblance between K70 and Chn 左 'left' coincidental? The reading <o> of K70 is vaguely like Middle Korean 왼 oyn 'left'. Was K70 based on a Parhae modification of 左 representing a Koreanic word for 'left'?

1.9.1:29: <her> [xər]/<hele> [xələ] 'city' must be related to the Koguryo word for 'fortress' transcribed as 忽 Late Old Chinese *xwət ~ Middle Chinese *xot. Chinese *-t might correspond to a Koguryo *-r or *-l. LOC and MC did not have liquid codas.

The reading <du> for K70 could be a Jurchenization of Chn 都 *tu 'capital' which in turn might have been a loose translation of <her>/<hele> 'city'.

I don't know why Kiyose (1977: 65) reconstructed <her> with <r>. The Chinese transcription was 黑勒 *xəj-ləj, not 黑兒 *xəj-r̩ which would correspond to <her>.


12.1.7.17:09: JURCHEN POLYPHONY 2: SCIENTIA ALBA

Jurchen has a single character with variants
~~

for both shang 'white' (cf. Manchu shanggiyan 'id.') and sa- 'to know' (cf. Manchu sa- 'id.'*). (Jin 1984: 98 only listed sa as a reading for the third variant.) Did one or more variants originally represent shang while the other(s) represented sa-? Kiyose (1977: 68) wrote, "It is impossible to say whether similar characters with different pronunciations were erroneously written the same way."

Jin (1984: ) derived this graph from the Khitan large script graph

~

<sha>

in the Khitan title

<sha.ri>

transcribed in Chinese as 沙里 *shali and translated as 郎君.

There is also a Khitan large script character

<?> '?'

resembling one of the Jurchen variants.

'White' was

<?>

in the Khitan large script. Janhunen (2003: 397) regarded Manchu shanggiyan 'white' as a loan from a Para-Mongolic cognate of Proto-Mongolian *cagaxan with a Para-Mongolic innovation *c- > sh-. One might think that the Manchu and Jurchen words for 'white' were borrowed from Khitan, but Khitan had a c- in addition to an sh- even in native words (implying that the c- > sh- shift had not taken place) and the Khitan word for 'white' is unknown, so it is doubtful that Khitan had a sh-word for 'white'. In any case, the KLS graph for 'white' only very vaguely resembles the Jurchen characters and is probably not related to them.

If the Jurchen characters were derived from the KLS:

Khitan large script

Early Jurchen

Later Jurchen (and/or mistakes in the Sino-Jurchen Vocabulary)

<sha>

~

<shang>? (but not <sha>!)

~~

<shang> ~ <sa> (but not <sha>!)

<?> (<sa> like its Jurchen derivative?)

<sa>

If the Jurchen and KLS characters were independently derived from common (Parhae?) prototypes:

Parhae

Derivatives

?

Khitan large script:

~

<sha>

Jurchen:

 ~

<shang> ~ <sa> (but not <sha>!)

?

Khitan large script: 

<?> (<sa> like its Jurchen counterpart?)

Jurchen:

<sa>

Next: The We Back to the City

*A shaman is a sa-man 'knower' in Manchu. The same suffix was in Jurchen

<sori.duman>

soridu-man 'fighting'

from soridu- 'to fight' in part 1.


12.1.6.23:56: JURCHEN POLYPHONY 1: BITTER URBAN BARBARIANS

Kiyose (1977: 80) listed two Jurchen characters in a row with identical shapes:

399: <ku>; transcribed in Chinese as 苦 *ku 'bitter'; phonogram for the second syllable of takura- 'send':

<ta.ku.ra> (cf. Manchu takūra- 'id.')

400: <duman>; transcribed in Chinese as 都蠻 *duman 'capital-southern barbarian'; phonogram for the second half of soridu-man 'fighting, melee':

<sori.duman> (cf. Manchu soridu- 'to fight' [Kiyose 1977: 122; I can't find this word in any Manchu lexicon at hand])

One might wonder if Jurchen characters often had multiple readings like the Chinese characters used in Japanese. However, I could only find three Jurchen characters in Kiyose (1977) with multiple readings. I'll look at the other two in parts 2 and 3 of this series.

Kiyose (1977: 80) wrote,

This character [400] seems to be exactly the same as character 399 as far as appearance goes. These characters were, however, perhaps different from each other, and one or the other is presumably a scribal error.

Either possibility raises questions I cannot answer:

If 399 and 400 are a single polyphonous character (i.e., a character with multiple readings), which reading came first? Or was the character designed with two readings in mind?

If 399 and 400 were originally distinct characters, what did the lost other character look like?

1.7.00:59: Jin (1984: 160) listed two variants of 399/400:

The second was read <ku>. I do not know which reading(s) belonged to the first. Could one or both of these variants actually be distinct characters? For example, perhaps

~ or

~

were <ku> and <duman> or vice versa.

In either case,

what is the relationship between the shape(s) of 399/400 and its readings?

why was a phonogram <duman> created even though there was no suffix -duman? Kiyose (1977: 122) analyzed soriduman as soridu-man with a nominal suffix -man, and Kiyose (1977: 80) speculated that -du- "could be the cooperative verbal suffix; cf. Ma. -ndu- id." I know of no Manchu word duman, so there may not have been a Jurchen word duman.

1.7.1:28: why not write the infrequent syllable sequence duman with phonograms as, say,

<du.man>

(The character <man> is in Jin's (1984: 231) entry for the place name 滿涇站 <man.ging.jan> but does not have an entry of its own in that dictionary.)

or

<du.ma.an>?

Next: Scientia alba


12.1.5.18:20: FLORA DIVINA

Kiyose's (1977) A Study of the Jurchen Language and Script: Reconstruction and Decipherment has a list of 728 Jurchen (large script) characters arranged by number of strokes:

Number of strokes 1 2 3 4 5 6 7 8 9 10
Number of characters 1 4 14 73 165 197 155 88 22 9
Percentage of total 0.1% 0.5% 2% 10% 23% 27% 21% 12% 3% 1%

The characters in Jin Qizong's (1984) Jurchen dictionary have a similar distribution.
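The percentages in the stroke-count table can be rechecked from the raw counts. This is a small sketch, with the counts copied from the table above and the variable names my own:

```python
# Number of Jurchen characters per stroke count, from Kiyose's (1977) list.
counts = {1: 1, 2: 4, 3: 14, 4: 73, 5: 165, 6: 197, 7: 155, 8: 88, 9: 22, 10: 9}
total = sum(counts.values())

def share(n):
    """Percentage of the total represented by n characters."""
    return 100 * n / total

print("total characters:", total)
for strokes, n in counts.items():
    print(f"{strokes:>2} strokes: {n:>3} characters, {share(n):.1f}%")
```

The counts sum to 728, and characters with five to seven strokes make up roughly 70% of the list.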

I'd like to make a similar table for the number of syllables in a Jurchen character. There is no correlation between graphic complexity and the complexity of a Jurchen reading: e.g.,

~

<r> (phonogram; zero syllables: eight strokes)

~

<uyewunju> 'ninety' (logogram; four syllables: three strokes)

Kiyose lists only one Jurchen character whose reading begins with a consonant cluster:

~

<lha> [lχɑ] (the variant is from Jin 1984)

(Written Tibetan lha [l̥a] is 'god'.)

This character only appears in the spelling

<i.lha>

for ilha 'flower'. No Jurchen word could begin with <lh>, and no other Jurchen character had a reading beginning with a nonhomorganic cluster. Why wasn't 'flower' spelled

<il.ha>

just as ilhahong 'shallow' was spelled

<il.ha.hong>?

The above two readings are based on those in Jin (1984). However, Kiyose (1977: 135, #694) read 'shallow' as <ir.ha.hun> with <r> instead of <l> on the basis of the Chinese transcription 一兒哈洪 *i ri xa xuŋ. Kiyose did not reconstruct <il> as the reading of any Jurchen character. Hence if Kiyose is correct, a spelling <il.ha> would not be possible. There was no Jurchen character read <l>, so <i.l.ha> would also not be possible.

Why was irhahun (possibly irhahũ?) written as <ir.ha ...> whereas flower was written as <i.lha>? Why not write Ch-clusters consistently either as <C.h> or as <.Ch>?

I was surprised a <lhV> graph exists at all because I assumed that

- there were no <lh>-initial (or <Ch>-initial) words in Jurchen just as there were no such words in Manchu. The absence of initial consonant clusters is a trait of 'Altaic' languages, though there are exceptions: e.g., Middle Korean had ᄣ pst- and Monguor developed st- under Tibetan influence (Ramsey 1987: 202).

- Jurchen character readings would be pronounceable in isolation.

Then again, Jurchen did not have syllabic r, so

~

<r>

would also not be pronounceable in isolation.

Literate Jurchen knew of the Khitan scripts, which also contained characters with subsyllabic readings, though such characters might have been read with inherent vowels in isolation.

I wonder if

was originally a disyllabic logogram <ilha> 'flower' (could it even be a drawing of a flower?) and

<i.lha>

was added later, so the later two-character spelling for ilha

should be transliterated as <i.ilha> and Jurchen referred to the second character in isolation as <ilha>, not <lha> with an initial <lh> they couldn't pronounce.

There are many Jurchen words written as <logogram.phonogram(.phonogram)>: e.g.,

<guru.un> 'country' (the first character is related to Chn and Khitan large script 囯 'country').

I presume these words appeared as <logogram> sans <phonogram> in 女眞字書 The Book of Jurchen Characters in which "[a]lmost all the individual characters [...] represent complete words" (Kane 1989: 8-9): e.g.,

>?

Kane (1989: 29) lists an extreme case of a logogram (a drawing of a saddle?) being replaced by a logogram-phonogram-phonogram sequence:

>

<engemer> (BJC spelling) > <engemer.ge.mer> (later spelling) engemer 'saddle'

Is <i.ilha> the only case of a logogram later preceded by a phonogram, or are there others?

Next: The Bitter City of the Southern Barbarians

Maybe Someday: Initial Consonant Clusters in Jin's Jurchen Reconstruction


12.1.4.16:09: *K-OUNTING IN TANGUT

Last Friday, I proposed that Tangut 'six' and 'seven' shared a *k-prefix. The following day, I wondered if I could reconstruct a *k-prefix in other Tangut numerals. Let's see how far I can go with that. But first, let me predict the effects of *k-prefixes on different pre-Tangut initial classes:

1. *k- + nonfricative obstruent = aspirated nonfricative obstruent

2. *k- + fricative = fricative + tense vowel

3. *k- + nasal = nasal

4. *k- + glide = ʔ- + glide

5. *k-r- > *k- + grade II rhyme

6. *k-l- > lh-

7. *kV- may condition

lenition of the following consonant

bending of the root vowel

upward if *V is

downward if *V is
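Predictions 1-6 can be written as a lookup from the class of the root initial to the expected reflex. This is only a sketch: the segment inventories are simplified assumptions, the function name is hypothetical, and prediction 7 (*kV-) is not modeled.

```python
# Simplified, hypothetical inventories; the real pre-Tangut system is larger.
STOPS_AFFRICATES = {"p", "t", "k", "b", "d", "g", "ts", "dz", "tʃ", "dʒ"}
FRICATIVES = {"s", "ʃ", "z", "ʒ", "x", "ɣ"}
NASALS = {"m", "n", "ŋ"}
GLIDES = {"j", "w"}

def predict_k_reflex(initial):
    """Return the predicted outcome of *k- plus the given root initial."""
    if initial in STOPS_AFFRICATES:
        return "aspirated nonfricative obstruent"   # prediction 1
    if initial in FRICATIVES:
        return "fricative + tense vowel"            # prediction 2
    if initial in NASALS:
        return "nasal"                              # prediction 3
    if initial in GLIDES:
        return "ʔ- + glide"                         # prediction 4
    if initial == "r":
        return "k- + grade II rhyme"                # prediction 5
    if initial == "l":
        return "lh-"                                # prediction 6
    raise ValueError(f"no prediction for initial {initial!r}")

for initial in ("g", "s", "n", "j", "r", "l"):
    print(f"*k-{initial}- > {predict_k_reflex(initial)}")
```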

How many of those predicted reflexes can be found among Tangut numerals?

Gloss | Reading | Pre-Tangut | Tibetan transcription | Also cf. | Notes
one | 1lew | *Cʌ-tek | gliH, gli, kli | OC 隻 *Cɯ-tek, WT gcig | Pre-Tangut and OC prefixes could have had initial *k-
two | 1niəə | *(k-)niəə | gniH, gni, nyi | WT gnyis | *k-prefix possible, but no internal evidence for it: *kn- > *hn- > n-
three | 1sọ | *k/s-so | gsoH, gso, bso, so | WT gsum | Prefix could be *k- or *s-
four | 1lɨəəʳ | *r-ləə | ldiH, ldi, zlaH | Mawo Qiang gʐə | Initial l- instead of lh- < *k-l- rules out *k-prefix; I could claim the prefix dropped before it left a trace, but I'd rather not do so
five | 1ŋwə | *(k-)pʌ-ŋə | bngiH, rngwa | WT lnga | *k-prefix possible, but no internal evidence for it: *(k-)pʌ-ŋ- > *kpŋ- > *kŋw- > *hŋw- > ŋw-
six | 1tʃhɨiw | *k-trik | chiH, chi | WT drug | Aspirate initial points to *k-prefix
seven | 1ʃɨạ | *k/s-ʃa | sha, gshaH | Mawo Qiang stə, Taoping Qiang ɕiŋ | Prefix could be *k- or *s-
eight | 1ʔjaʳ | *k-rja | rye, na (sic!) | WT brgyad < *bryad | Unsure if Tangut had ʔj- instead of j-
nine | 1giəə | *gəə | HgiH, dgiH | Mawo Qiang rguə | Initial g- instead of kh- < *k-g- (cf. 'ten thousand' below) rules out *k-prefix; I could claim the prefix dropped before it left a trace, but I'd rather not do so
ten | 2ɣạ | *(k-)sʌ-KaH | Hga, k.ha, dgaH | Daofu zʁa | No need for a *k-prefix, but if one existed, it could have fused with *s-: *k-s- > *kʃ- > *ʃ- > *s-.
hundred | 1ʔjiʳ | *k-rji | (none) | Mawo Qiang khiʴ, WT brgya < *brya | Unsure if Tangut had ʔj- instead of j-
thousand | 1təụ | *(k-)sʌ-tu | tu (?) | Taoping Qiang χto < *st-?, WT stong | No need for a *k-prefix, but if one existed, it could have fused with *s-: *k-s- > *kʃ- > *ʃ- > *s-.
ten thousand | 2khiə | *k-gəH | (none) | Taoping Qiang χgya < *s-g-?, Daofu khʂə < *s-g-?, WT khri | Not possible to reconstruct *k-prefix using internal evidence due to lack of kh- ~ g- alternation; root *g- reconstructed on the basis of Taoping Qiang

In a 'strong-k' scenario in which I reconstruct as many *k-prefixes as possible, the only numbers without them are 'four' and 'nine' (unless I resort to the 'prefix dropped without a trace' trick - ugh).

In a 'weak-k' scenario in which I reconstruct as few *k-prefixes as possible, the only number with a prefix is 'six', and even its *k- is debatable because there is no Tangut-internal alternation tʃ- ~ tʃh- suggesting a prefix.

I suspect several, though not all, of the numerals above had *k-. I am tempted to interpret the k- and g- in the Tibetan transcriptions as a prefix rather than as a tone letter, so 'one', 'two', and 'three' and perhaps 'ten' may have had k- in the transcribed dialect.

I don't understand why prefixation isn't consistent even in the 'strong-k' scenario or in Written Tibetan:

WT prefix | WT numerals
g- | gcig 'one', gnyis 'two', gsum 'three', perhaps dgu 'nine' via Sa-skya Pandita's law (named by Hill 2011): *g- > d- before grave consonants
b- | bzhi 'four', bdun 'seven', brgyad 'eight', bcu 'ten', brgya 'hundred'
l- | lnga 'five'
d- | drug 'six'; the d- of dgu 'nine' could be from *g- (see the g- row)
s- | stong 'thousand'
none? | khri 'ten thousand'

Janhunen (1994) pointed out that there is no consistent numerical radical in the Tangut characters for numerals. The component

(alphacode: dex)

appears in 'one', 'four', 'six', and 'nine' and at least 1183 other characters. One out of five characters contains dex, the most frequent of 825 different character components. Determining the function(s) of dex could be a key to the Tangut script.

Next: Flora divina


12.1.3.2:31: THE *REK-ONING

On Friday I stumbled on Schuessler's (2009: 132) entry for the homophones 歷 'to count, to experience, calendar' and 曆 'calendar', both Old Chinese *rek. Schuessler compared those words to

Written Burmese ရေ re 'to count'

Kanauri ri (no definition given)

Written Tibetan rtsi-ba < *rhji < *rhi 'to count', rtsis-pa 'astronomer'

This comparison also appeared on p. 73 of his 2007 dictionary.

I was surprised by his derivation of rtsi from *rhi. I couldn't find any sound change like -ts- < *-h- before *i in Nathan Hill's recent (2011) compilation of Tibetan sound laws. Such a change looked odd to me until I inserted some extra steps (in bold):

0. *rhi

1. *rhyi (palatalization of *rh before *i)

2. *rhji (fortition of *y)

3. *rhci (devoicing of *j after *h)

4. *rci (loss of *-h-)

5. rtsi (shift of *c from palatal to alveolar - why?)
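The chain from step 0 to step 5 can be replayed as an ordered list of rewrite rules. This is a minimal sketch in which the regular expressions are my own encoding of the steps, not anything from Schuessler or Hill:

```python
import re

# (pattern, replacement, description), applied in order.
STEPS = [
    (r"rh(?=i)", "rhy", "palatalization of *rh before *i"),
    (r"y", "j", "fortition of *y"),
    (r"(?<=h)j", "c", "devoicing of *j next to *h"),
    (r"(?<=r)h", "", "loss of *-h-"),
    (r"c", "ts", "shift of *c from palatal to alveolar"),
]

def derive(form):
    """Return every intermediate stage, starting with the input form."""
    chain = [form]
    for pattern, replacement, _description in STEPS:
        form = re.sub(pattern, replacement, form)
        chain.append(form)
    return chain

print(" > ".join(derive("rhi")))
```

Each rule changes exactly one segment, which makes it easier to see that only the last step (palatal to alveolar) lacks an obvious motivation.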

Written Tibetan has no rc-, though other similar clusters are possible (derivations from Jacques 2004):

lc- < *hly-, *lt-y- | (no rc-) | rts-
lj- < *n-ly- | rj- < *r-ly- | rdz-

Is the absence of rc a chance gap (was there no pre-Tibetan *r-hly-?) or is rts- partly derived from *rc-?

Benedict derived Tibetan rtsi- from Proto-Tibeto-Burman *r-tsiy, later revised to *r-tśrəy (Matisoff 2003: 79; STEDT etymon 2738), and linked it to his Old Chinese 數 *śri̭u = my *sroʔ 'to count'. However, the vowels of the PTB and Chinese forms do not match. OC 數 *sroʔ might share a *s-r-ʔ root with OC 算 *sonʔ < ?*sorʔ 'to calculate' (cf. Jpn soroban 'abacus').

Neither Schuessler's *rhi nor Benedict's PTB forms have codas corresponding to the *-k of OC 歷/曆 *rek. Would Schuessler regard that *-k as a "k-extension"?

I think there might be a relationship between OC *rek and WB re, but am hesitant to relate them to WT rtsi-.

There are no TSr-clusters in WT, so rts- might be partly from *tsr-. However, I do not know of any Tibetan prefix ts- and hence cannot derive rtsi- from *ts-ri- with a root *ri cognate to the OC and WB r-words.

Schuessler's *rhi avoids the problem of explaining what *ts was, but are there any other examples of WT rts- from *rh-, and why did Tibetan have *rh- instead of simple *r-?

My *r- [rˁ] was an allophone of */r/ before and after nonhigh vowels that became phonemic after presyllabic loss and reduction:

*/re ra ro/ [rˁeˁ rˁaˁ rˁoˁ] > */rˁe rˁa rˁo/ *re *ra *ro

*/ri rə ru/ [ri rə ru] > */ri rə ru/ *ri *rə *ru

*/Cʌ-rV/ [CˁʌˁrˁVˁ] > */(C)rˁV/ *(C)rV

*/Cɯ-rV/ [CɯrV] > */(C)rV/ *(C)rV

The phonemicization of pharyngealization (a.k.a. 'emphasis' on this site) in Old Chinese is similar to the phonemicization of palatalized consonants in Slavic: e.g.,

Russian тьма /tʲma/ (one syllable) < *tĭma (two syllables) 'darkness'

In both cases, vowels that conditioned allophony were lost and the allophones became phonemes.


12.1.2.2:02: DISSECTING DRAGONS

(I originally intended to only dissect the Tangut character for 'dragon', but why stop there?)

There are two words for 'dragon' in Chinese languages, the noncalendrical 龍 (with at least 51 variants!) and the calendrical 辰 (with at least 19 variants!).

The right side of 龍 looks like a drawing of a dragon, but the left side initially seems to defy explanation: 立 is 'to stand' and 月 is either 'moon' or 'flesh'. One might wonder if the Chinese think of dragons as standing on the moon. In Shuowen (100 AD), Xu Shen analyzed 龍 *luoŋ as an abbreviation of a phonetic 童 *doŋ plus 月 'flesh' and the shape (of a dragon, presumably) in flight. However, *d-phonetics are otherwise unknown in *l-graphs and as far as I know,  龍 originated as a drawing of a dragon that was later split into three components. Two resemble unrelated components 立 'to stand' and 月 'moon/flesh' while the third is only found in 龍 and its variants and compounds.

According to Richard S. Cook (1995), 辰

is in fact a representation of a scorpion in striking position as seen in profile. It is shown that this representation bears directly upon the once vigorous traditions relating to the ancient equinoctial position of the star Antares in the Breast of the Celestial Scorpion. And though certain stellar concepts betray the likelihood of an early (pre-OBI [oracle bone inscription]) Sino-Mesopotamian relation (stimulus diffusion), these concepts nevertheless took peculiar Chinese form, such that it is possible to demonstrate the cognacy of Chinese 辰 chén and ‘scorpion’ words in Sino-Tibetan.

I have not yet read this monograph, so I don't know how 辰 'scorpion' came to mean 'dragon'. I would reconstruct  辰 as Old Chinese *dər which only shares a *d with Matisoff's Proto-Tibeto-Burman *s-diik 'scorpion' and doesn't have any strong matches in the STEDT database or in Tangut.

One might expect the Tangut, Jurchen, and Khitan graphs for 'dragon' to resemble some of the 70+ Chinese graphs for 'dragon', but none have any obvious Chinese origin:

Khitan large script | Jurchen (large) script | Khitan small script | Tangut
<lu> | <mudu.r> = mudur | <lu> | 1vəi

The Khitan large script character and Jurchen <mudu> are obviously related, though it is not certain whether the Jurchen character was derived from its KLS equivalent or if both were derived from a common Parhae prototype.

The second Jurchen character <r> may have been added later if the first character originally stood for <mudur>. <r> has nine strokes and is surprisingly complex for a graph representing a single consonant. Then again, its Chinese equivalent 兒 -r has eight strokes. To Chinese eyes, <r> looks like two 人 people standing atop a 羊 sheep minus one horizontal stroke. The rationale for the structure of <r> is unknown. It also has a variant with Xs instead of 'people':

No other Jurchen characters have 人x 2 or X x 2 as top elements.

The Khitan small script character <lu> may or may not be derived from its large script equivalent.

Khitan <lu> was borrowed from Chinese 龍 *liuŋ, though their graphs are completely different.

Jurchen mudur is from Proto-Tungusic *muduri. It is vaguely similar to Middle Korean mirɯ < ?*mitɯ 'dragon', but the vowels do not match. Japanese mi 'snake (calendrical)' might be related to the Korean word. But if it's from *mi rather than *məi, *moi, or *mui, it could just be an abbreviation of Old Japanese pəymi, itself probably a loan from a relative of Middle Korean pʌyam 'snake'.

Tangut 1vəi may be from *Cʌ-Pi. The presyllabic vowel conditioned the lenition of the following labial consonant and the partial lowering of *i.

The Tangut character for 'dragon' has four parts:

=+++

The Tangraphic Sea analysis of 'dragon' is

=+

0083 1vəi 'dragon' =

top of 0111 (first half of

1lɨə 1lwɨụ 'to crawl' - a reduplicated root?) +

bottom of 4234 (first half of

1vəi 1məuʳ 'dragon tree' (lit. 'dragon dark' = 'dark dragon'))

I doubt that the character for the first half of 'dragon tree' was devised before the much more frequent character 'dragon'. 'Dragon tree' looks like 'dragon' and 1məuʳ 'dark' plus the 'wood' radical. The Tangraphic Sea analyses confirm that:

=+

4234 1vəi = top of 4250 1si 'wood' + bottom of 0083 1vəi 'dragon'

=+

4117 1məuʳ = top of 4250 1si 'wood' + all of 1məuʳ 'dark'

The second half of 'to crawl' is derived from 'dragon' and the first half of 'to crawl':

=+++

0047 1lwɨụ (2nd half of 'to crawl') =

top of 0083 1vəi 'dragon' +

bottom right of 0111 1lɨə (1st half of 'to crawl') +

bottom left of 4169 1tshõ 'desolate' (why?) +

bottom left of 0080 2phɔ 'snake'

The first half of 'to crawl' is not derived from the second:

=+

0111 1lɨə (1st half of 'to crawl') =

0054 1tswa 'hair worn in a bun' (why?)

0338 1lɨə 'to lock up' (phonetic)

The top component of 0054 may mean 'top'. 'Dragon' is a top animal and hair worn in a bun is near or at the top of the body, but things crawl on the bottom, not the top.

The analysis of 0054 implies that the top element does mean 'top':

=+

0054 1tswa 'hair worn in a bun' =

0055 2tʃɨw 'top of the head' +

2061 2pɛ̃ 'hair'

Unfortunately, no analysis of 0055 is known, so the chain of characters with 'top' ends there.

If the top element of 'dragon'

is 'top', what is the bottom? There is only one other tangraph with the same bottom elements

++

as 'dragon' and the first half of 'dragon tree':

1188 2ŋa 'egg' (analysis unknown)

The function of the top element ユ is unknown. Were dragons 'top eggs'? 1188 in turn had a derivative

=+

1210 2dʒæ̃ 'egg' =

frame of 1188 2ŋa 'egg' +

? of 0088 1tew 'egg' (defined as 1210 in Tangraphic Sea)

No part of 0088 matches the bottom center of 1188. Is this Precious Rhymes of the Tangraphic Sea analysis really a list of synonyms? Why did Tangut have three words for 'egg'? How did these words differ?

What if 'dragon' had nothing to do with eggs? Three of the four parts of 'dragon' vaguely resemble the components of 龍:

: 立

:月

: right of

But what about the fourth part 干? Is it the horizontal lines 二 from 月 plus an additional vertical line?

:月?

If 'dragon' is not a heavily disguised 龍, what is it?

Next: The *Rek-oning


12.1.1.12:21: DISSECTING THE DATE 2012

Here's the solution to the problem I posted last night:

The five Tangut characters under 'dragon'

say '2012 year'. Can you identify the characters for

1. 'two'

2. 'thousand'

3. 'ten'

4. 'year'

And can you figure out whether the line at the bottom is read from left to right or right to left?

The first clue is '2012 year'. The un-English order of these words is absolute. '2012' comes first, followed by 'year'. If the line was meant to be read from left to right, 'year' should be the character on the right:

Conversely, if the line was meant to be read from right to left, 'year' should be the character on the left:

Since

appears twice and '2012 year' only contains one 'year', that character cannot mean 'year'. So by process of elimination, the line must read '2012 year' from right to left:

4. year ? ? ? ?
2012
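The elimination step can be checked mechanically. In this sketch the five glyphs are replaced by arbitrary placeholder letters (the labels and the function name are mine); the only facts used are that one glyph is repeated and that 'year' is read last:

```python
from collections import Counter

# Placeholder labels for the five glyphs as displayed from left to right.
# "B" stands for the glyph that appears twice; the labels are arbitrary.
DISPLAYED = ["A", "B", "C", "D", "B"]

def consistent_directions(glyphs):
    """Return (direction, 'year' glyph) pairs that fit the gloss '2012 year'."""
    counts = Counter(glyphs)
    candidates = {
        "left to right": glyphs,
        "right to left": glyphs[::-1],
    }
    fits = []
    for direction, reading in candidates.items():
        year_glyph = reading[-1]      # 'year' is the last word read
        if counts[year_glyph] == 1:   # 'year' occurs only once in the gloss
            fits.append((direction, year_glyph))
    return fits

print(consistent_directions(DISPLAYED))
```

Only the right-to-left reading survives, with the leftmost glyph as 'year'.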

My next clues were in the questions. I asked if you could identify characters for 'two', 'thousand', 'ten', and 'year'. We already know what 'year' is, so 'two', 'thousand', and 'ten' must be among the remaining four characters.

The gloss 'ten' hints that 'twelve' must contain 'ten' in it: 'ten two' or 'two ten' (cf. Sanskrit dvaa-daśa 'two-ten' = 'twelve').

The character

appears twice, so it must be the 'two' in 'two thousand' and 'twelve' (= 10 + 2 or 2 + 10):

4. year 1. two ? ? 1. two
2012

Often the key to solving my puzzles lies in finding a character that appears more than once correlated with something that appears more than once in the gloss. Once this character is identified, the rest of the pieces fall into place.

In theory, the line could be either

'two ten thousand two year' ((2+10) + (1000 x 2))

or

'two thousand ten two year' ((2 x 1000) + (10 + 2))

read from right to left, but I was hoping the reader would assume that the order I listed the glosses in

1. 'two'

2. 'thousand'

3. 'ten'

4. 'year'

was the order the characters were read in:

4. year 1. two (again) 3. ten 2. thousand 1. two
2012

I was also hoping that the reader would have the English phrase two thousand (and) twelve in mind. The Tangut equivalent 'two thousand ten two' is close.
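The reading 'two thousand ten two' can be evaluated with a toy rule: a digit multiplies the multiplier word that follows it, a bare multiplier counts once, and everything is summed. This is a sketch under that assumption only, not a general account of Tangut numerals; the names are hypothetical:

```python
VALUES = {"two": 2, "ten": 10, "thousand": 1000}
MULTIPLIERS = {10, 1000}

def evaluate(words):
    """Sum a numeral in which digits precede the multipliers they scale."""
    total = 0
    pending = 0  # a digit waiting for a following multiplier
    for word in words:
        value = VALUES[word]
        if value in MULTIPLIERS:
            total += (pending or 1) * value
            pending = 0
        else:
            pending = value
    return total + pending

# Glosses in reading order (right to left on the original line).
print(evaluate(["two", "thousand", "ten", "two"]))  # (2 x 1000) + (10 + 2)
```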

If Tangut had the more exotic word order

'two ten thousand two year' ((2+10) + (1000 x 2))

I would have listed the glosses in that order as a hint: 'two', 'ten', 'thousand'. Or I might not have asked the question. I'd be reluctant to ask someone to figure out that French

quatre-vingt-quatre

lit. 'four-twenty-four' ((4 x 20) + 4)

is 'eighty-four' unless I gave the hints 'four' and 'twenty'. And even then, one would not know if the structure of that numeral were

((4 x 20) + 4) = 84

or

(4 + (4 x 20)) = 84

without another example like

quatre-vingts

(4 x 20s) = 80

Moreover, one might even think that 'four-twenty-four' could be 'ninety-six' (4 x 24). Sanskrit has numerals like tri-nava 'three-nine' (3 x 9) for 'twenty-seven', though I haven't seen any Sanskrit numeral as complex as catuś-catur-viṃśati 'four-four-twenty' (4 x (4 + 20)).
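The competing bracketings are easy to compare side by side. This is a small sketch with the structures written out by hand:

```python
# Each key is a possible structure; each value is what it evaluates to.
bracketings = {
    "(4 x 20) + 4": (4 * 20) + 4,   # quatre-vingt-quatre as actually used
    "4 x (20 + 4)": 4 * (20 + 4),   # the hypothetical 'ninety-six' reading
    "3 x 9": 3 * 9,                 # Sanskrit tri-nava
    "4 x (4 + 20)": 4 * (4 + 20),   # the unattested catuś-catur-viṃśati
}

for structure, value in bracketings.items():
    print(f"{structure} = {value}")
```

The same three morphemes give 84 or 96 depending on where the brackets fall, which is why a single example cannot settle the structure.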

(Skt catur is cognate to Eng four and Fr quatre. Its final -r becomes -ś before c-: catuś-catur. Skt viṃśati 'twenty' is cognate to Fr vingt.)

Next: Dissecting the Dragon (I originally meant to include that in this post, but I decided to separate the two topics.)


12.1.1.3:31: HAPPY SIW YEAR 2012

This siw (Tangut: 'new') year is associated with the vəi (Tangut: 'dragon'):

The five Tangut characters under 'dragon' say '2012 year'. Can you identify the characters for

1. 'two'

2. 'thousand'

3. 'ten'

4. 'year'

And can you figure out whether the line at the bottom is read from left to right or right to left?

No knowledge of Tangut is required. Logic is sufficient.


11.12.31.20:23: TRAGICAL TANGRAPHY

This description of a cryptic crossword clue reminded me of some tangraphic analyses:

15D Very sad unfinished story about rising smoke (8)

is a clue for TRAGICAL. This breaks down as follows.

15D indicates the location and direction (down) of the solution in the grid

"Very sad" is the definition

"unfinished story" gives "tal" ("tale" with one letter missing; i.e., unfinished)

"rising smoke" gives "ragic" (a "cigar" is a smoke and this is a down clue so "rising" indicates that "cigar" should be written up the page; i.e., backwards)

"about" means that the letters of "tal" should be put either side of "ragic", giving "tragical"

"(8)" says that the answer is a single word of eight letters.

There are many "code words" or "indicators" that have a special meaning in the cryptic crossword context. (In the example above, "about", "unfinished" and "rising" all fall into this category). Learning these, or being able to spot them, is a useful and necessary part of becoming a skilled cryptic crossword solver.
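The three indicators in the clue correspond to three string operations. Here is a sketch of the assembly, with variable names of my own choosing:

```python
story = "tale"[:-1]    # "unfinished story": drop the last letter
smoke = "cigar"[::-1]  # "rising smoke": write "cigar" backwards

# "about": put the letters of the story on either side of the smoke.
answer = story[0] + smoke + story[1:]

print(story, smoke, answer, len(answer))
```

The split of "tal" into "t" and "al" is not stated by the indicator itself; it is the only split that yields a word.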

Tangraphs have no components equivalent to "15D" or "(8)", but "very sad" is like a semantic element of a tangraph and "unfinished story" and "rising smoke" are like cryptophonetics in tangraphs: e.g.,

=+

5916 1xã (transcription of Chinese 漢 *xã 'Chinese') =

all of 5882 1zaʳ 'Chinese' (cryptophonetic referring to its Chinese translation 漢 *xã 'Chinese'; a semantic compound of

=+

'small' + 'insect') +

right of 0789 2ɣʊ 'the surname Ghu' (function unknown)

High-frequency elements like ヒ (alphacode: cin) on the right of 5916 and 547 other tangraphs might be like the "code words" or "indicators" of cryptic crossword puzzles.

The indicator "about" reminds me of the term


5258 1ʔɔ̣ 'round'

used in tangraphic analyses to mean 'take the surrounding elements of the preceding character': e.g., 2634 is made up of the surrounding elements of 2639 plus the right side of 3678:

=

2634 1dʒwiõ 'publicize; propagate; declare; spread; to name' =

2639 2miee 'name' (semantic)

5258 (take the surrounding elements of 2639)

3678 2to 'to be born; to rise' (semantic)

2705 (take the right side of 3678)

I translate 5258 as 'frame' in analyses: e.g.,

2634 = frame of 2639 + right of 3678
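An analysis of this kind can be treated as a flat sequence in which each source character is followed by an indicator character. The sketch below assumes that 5258 and 2705 act as the indicators 'frame' and 'right', following the glosses 'take the surrounding elements of 2639' and 'take the right side of 3678'; the function names are hypothetical:

```python
# Indicator tangraphs keyed by character number (an assumption based on the glosses).
INDICATORS = {"5258": "frame", "2705": "right"}

def parse_analysis(sequence):
    """Pair each source character with the indicator that follows it."""
    sources = sequence[0::2]
    indicators = sequence[1::2]
    return [(INDICATORS[ind], src) for src, ind in zip(sources, indicators)]

def render(target, sequence):
    """Format an analysis in the 'X = part of Y + part of Z' style."""
    parts = [f"{part} of {src}" for part, src in parse_analysis(sequence)]
    return f"{target} = " + " + ".join(parts)

print(render("2634", ["2639", "5258", "3678", "2705"]))
```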

I still do not know whether the analyses from the Tangraphic Sea reflect the intent of the creator(s) of the script or were (independently?) devised later as mnemonic devices.


11.12.31.13:33: LEDYARD ON "THE SO-CALLED JURCHEN SCRIPT"

While looking through The Korean Alphabet for Middle Korean ss-words last night, I found this passage by Gari Ledyard (1997: 54; emphasis mine):

The so-called Jurchen script was more a code than a writing system; to this day its complete decipherment is unattained and probably unattainable given the few written texts that still exist. Although what exists is often partly decipherable because of surviving Sino-Jurchen glossaries, no one yet has figured out the principle of this writing - indeed it may not have had any. If it did no more than discourage Koreans from imitating it in developing their own writing, it made a noble contribution [to the development of hangul, the Korean alphabet].

14 years later, I have yet to see anyone explain the principle(s) of the Jurchen script. When I first took a serious look at it 15 years ago, it struck me as a random imitation of Chinese characters. Its strokes were mostly Chinese, but they weren't combined into phonetic or semantic elements recycled in multiple characters. Learning one character with a certain component would not help you learn the pronunciation or meaning of any other characters sharing that component. The recurring shape 山 has no apparent recurring function in Jurchen. How could anyone learn such a nonsystem of c. 1,000 characters, excluding variants? I used to think that the Jurchen script was to sinography what the Cherokee syllabary was to the Roman alphabet - a recycling of shapes without regard for phonetics.

Some [Cherokee] symbols do resemble the Latin, Greek and even the Cyrillic scripts' letters, but the sounds are completely different (for example, the sound /a/ is written with a letter that resembles Latin D).

However, my analogy was incorrect because the Jurchen elite were literate in Chinese, whereas Sequoyah was not literate in English. Sequoyah did not know how alphabets worked, so he independently invented a syllabary. The Jurchen, on the other hand, must have understood the semantophonetic principles of sinography, so why did they create a script that had no (obvious) principle?

Juha Janhunen did not think the Jurchen actually created a script. He viewed the Jurchen script as an offshoot of a Manchurian branch of the Chinese script:

Proto-sinography
    Sinography proper
    Manchurian sinography
        (the existence of the Parhae script is still controversial)
        Khitan large script
        Jurchen (large) script

Although I think Janhunen is correct, his view leads to more questions. What was the principle of the Khitan large script? Why does the Manchurian sinographic tradition seem to be based on different principles (if any?) from mainstream sinography? Do the Khitan and Jurchen (large) scripts seem to lack principles because they were originally designed for a third language spoken in Parhae? That third language would most likely be Koreanic (or possibly even Japonic) since Parhae was a successor to Koguryo. But I don't recall seeing anything hinting at Japonic-based phonetic elements in the Khitan and Jurchen (large) scripts and my attempts to find Koreanic-based phonetic elements have been unconvincing:

Koreanic *an 'not' in "An-certain about Oxen in Jurchen"

Koreanic *on- 'to come' in "Getting Back on the Jurchen Track"

I am more interested in the Khitan and Jurchen (large) scripts than the Khitan small script because the principles of the former are a mystery, whereas the principles of the latter are at least somewhat understood, though the details remain hazy and the phonetic values of many symbols await identification.

The stacking principle of the Khitan small script (and the Jurchen small script?) is very reminiscent of the stacking principle of hangul. I am still not certain that this similarity is just a coincidence. Could the stacking in all three scripts reflect stacking in an earlier fourth script (i.e., Parhae) rather than Khitan and/or Jurchen influence on hangul? If the occasional ligatures of the Khitan large script such as

<

<muɣoo> < <mu> + <ɣoo> 'snake'

predate the Khitan small script, they could be forerunners to the stacking of the Khitan small script.


11.12.30.23:59: SSEQUENCES (SSIC)

While looking up 0586 in Li Fanwen (2008: 100) last night for the analysis of 1306 in line 95 of the Golden Guide, I saw the entry for 0585

2śjị 'cogon grass' (in Gong's reconstruction; mine is 2ʃɨị)

and wondered how it was pronounced in pre-Tangut if Tangut tense vowels (in rhymes 61-75 in Gong's reconstruction and mine) were conditioned by earlier *s-clusters:

*sCV > *CCV > *CC > CṾ

Was 'cogon grass' once *sʃiH? (The -ɨ- in later 2ʃɨị is nonphonemic. /i/ is [ɨi] after alveopalatals: cf. Russian ши [ʃɨ]. *-H - a glottal stop or fricative - is the source of the second tone. *-H may ultimately be from an *-s in at least some cases.)

Just as Russian SS-clusters came from earlier *SVS-sequences: e.g.,

ссора < съсора (the prerevolutionary spelling) 'quarrel'

Tangut SS-clusters could have had similar origins: e.g.,

*sʃiH < *sɯʃiH 'cogon grass'

*ɯ is my cover symbol for a pre-Tangut vowel that conditioned high vowels in Tangut.

But the simple prefix *s- that Gong proposed may be another source.

A third source might be *h(V) or *χ(V) or *x(V): cf. Ramsey's (1997: 135) emphatic prefix *hɯ- in proto-Korean on the basis of the 雞林類事 Jilin leishi (1103-1104) transcription of 'to write' as

核薩 *xəʔ s (cf. Late Middle Korean ssɯ-)

Qiang languages have χC- and xC-clusters. Ronghong Qiang has xs- corresponding to Mawo Qiang khs-:

RQ xsə : MQ khsə 'new' < *k-sə (the root is *sə, cognate to Tangut 1siw < *sik and Old Chinese 新 *sin 'new')

Perhaps pre-Tangut *h- or *χ- or *x- could be from an even earlier *kV- that lenited to a fricative before fricatives after its vowel was lost: e.g.,

*kVSV > *kSV > *HSV > *SSV > *SS > SṾ

I derive Tangut aspirates from pre-Tangut *k-C- if they alternate with nonaspirates: e.g.,

1pị < *s-pi 'to aim at' (*s- is a transitive verb prefix)

1phi < *k-pi 'aim' (noun)

There are no fricative-initial words with such alternations since Tangut has no aspirated fricatives. Perhaps the reflexes of *kS-clusters may be found among SṾ-words with SV-cognates: e.g.,

2sie < *Cɯ-seH 'to know; knowledge'

2siẹ (rather than 2shie) < *-seH 'knowledge'

12.31.11:36: The Tangut words for 'know' are cognate to Tibetan shes- 'to know'. According to von Koerber's rule*, sh- is from *sy-. So I could rewrite the Tangut derivations as

2sie < *sjeH 'to know; knowledge'

2siẹ (rather than 2shie) < *k-sjeH 'knowledge'

I would no longer need *ɯ-prefixes to account for the upward bending of *e to ie. ie would simply be a glide-vowel sequence /je/ reanalyzed as a diphthong /ie/.

Tangut -H was probably from an *-s corresponding to the final -s of Tibetan shes- < *syes-. So pre-Tangut *sjes and pre-Tibetan *syes were identical. (The choice of j or y for the palatal glide is merely a convention.) Of course, one should not expect all pre-Tangut and pre-Tibetan forms to be identical: e.g., Tangut ʃɨạ 'seven' is not cognate to Tibetan bdun 'id.'

Could Tangut 'six' and 'seven' share a *k(V)-prefix?

1tʃhɨiw < *k(ɯ)-trik or *k(ɯ)-drik 'six' (cf. Tibetan drug; could Tangut -i- be from a *-y- < *-u- that assimilated to a front vowel *-i- in the prefix?)

1ʃɨạ < *kɯ-ʃa 'seven'

(or did ʃ < *kʃ- < *ks- < *kɯ-ʃ-? cf. Skt kṣ [kʂ] < *ks)

(or did ʃ < *ʃt- < *st-? cf. Mawo Qiang stə 'seven' and German st [ʃt] < *st)

12.31.12:09: Here are several kinds of *s-/*k-presyllables and their effects on Tangut syllables.

I. Dropped without a trace

Presyllable vowel matches height class of following vowel:

*Cɯ-Ci > Ci

*Cʌ-Ca > Ca

II. Dropped with a trace

Presyllable vowel height causes following vowel to bend:

*Cɯ-Ca > *Cɯ-Cia > Cia

*Cʌ-Ci > *Cʌ-Cəi > Cəi

III. Fused before presyllabic vowel (if any) could condition lenition

*s(V)-CV > *sCV > *CCV > *CC̣ > CṾ

*k(V)-CV > *kCV > ChV (if C is not a fricative; Sh- is not possible)

*k(V)-SV > *kSV > *xSV > *hSV > *SSV > *SS > SṾ (S is any fricative)

IV. Fused after presyllabic vowel conditioned lenition

*sV-sV > *sV-zV > *szV > *zzV > *zẓ > zṾ

*kV-tsV > *kV-dzV > *kV-zV > *kzV > *gzV > *ɣzV > *ɦzV > *zzV > *zz > zṾ

I couldn't think of cover symbols for 'lenited fricative' or 'lenited nonfricative', so I gave specific examples above.
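Types I and II above amount to a lookup on the pair (presyllable vowel, main vowel). The Python sketch below covers only the four combinations listed; the function name is mine, and passing any other combination through unchanged is a placeholder, not a claim about pre-Tangut.

```python
# Outcomes for mismatched vowel heights (type II). Matching heights
# (type I) leave the main vowel untouched.
BENDING = {
    ("ɯ", "a"): "ia",   # *Cɯ-Ca > *Cɯ-Cia > Cia
    ("ʌ", "i"): "əi",   # *Cʌ-Ci > *Cʌ-Cəi > Cəi
}


def drop_presyllable(pre_vowel: str, onset: str, vowel: str) -> str:
    """Return the Tangut syllable left after the presyllable is lost.

    Combinations not listed in the text fall through unchanged, which is
    an assumption made only to keep the sketch total.
    """
    return onset + BENDING.get((pre_vowel, vowel), vowel)


for pre, v in [("ɯ", "i"), ("ʌ", "a"), ("ɯ", "a"), ("ʌ", "i")]:
    print(f"*C{pre}-C{v} > {drop_presyllable(pre, 'C', v)}")
```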

Many consonants merged in lenition:

Consonant class (Homophones chapter) | Before lenition | After lenition
Labials (I) | *-p-, *-ph-, *-b- | v-
Dentals (III) | *-t-, *-th-, *-d- | l-
Alveolars (VI) | *-s-, *-ts-, *-tsh-, *-dz- | z-
Alveopalatals (VII) | *-ʃ-, *-tʃ-, *-tʃh-, *-dʒ- | ʒ-
Velars (V, VIII) | *-x-, *-k-, *-kh-, *-g- | ɣ-

Perhaps the glottal stop and sonorant consonants (nasals, liquids, and glides including v- /w/) did not lenite.
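The mergers can be written as a many-to-one mapping. In this Python sketch the names are mine, and returning unlisted consonants unchanged simply encodes the tentative suggestion that the glottal stop and sonorants did not lenite.

```python
# Lenited outcome -> the pre-Tangut medial consonants that merged into it.
MERGERS = {
    "v": ["p", "ph", "b"],            # labials
    "l": ["t", "th", "d"],            # dentals
    "z": ["s", "ts", "tsh", "dz"],    # alveolars
    "ʒ": ["ʃ", "tʃ", "tʃh", "dʒ"],    # alveopalatals
    "ɣ": ["x", "k", "kh", "g"],       # velars
}

# Invert to source -> outcome for lookup.
LENITE = {src: out for out, sources in MERGERS.items() for src in sources}


def lenite(consonant: str) -> str:
    """Return the lenited reflex, or the consonant itself if it is unlisted."""
    return LENITE.get(consonant, consonant)


print(lenite("tsh"), lenite("kh"), lenite("m"))
```

Eighteen earlier consonants collapse into five, so an initial such as z- could continue any of four different pre-Tangut medials.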

*I use Nathan Hill's (2011) names for Tibetan sound laws.


11.12.29.20:49: THE GOLDEN GUIDE: LINE 95: TANGRAPHS 471-475

95. Three out of five tangraphs are transcriptive characters not associated with any specific morpheme:

Tangraph number | 471 | 472 | 473 | 474 | 475
Li Fanwen number | 1936 | 0707 | 4660 | 3774 | 1306
My reconstructed pronunciation | 2xɛ̃ | 1tʃɨw | 1ʔiã | 2ʃɨõ | 1kiõ
Tangraph gloss | (transcription of Chinese) | district | (transcription of Chinese) | to guard | (transcription of Chinese)
Word | the surname 解 Xie (*xɛ) | the surname 周 Zhou (*tʃɨw) | the surname 燕/閆/鄢 Yan (*jã) | the surname 尚/商/賞 Shang (*ʃɨõ) or 昌/常 Chang (*tʃhɨõ) | the surname 龔/弓/宮/鞏 Gong (*kiũ) or 姜 Jiang (*kiõ)
Translation | Xie, Zhou, Yan, Shang/Chang, Gong/Jiang

471: 'High' on the left of 1936 is an abbreviated phonetic. Was there a Xie family that raised livestock?

=+

1936 2xɛ̃ (transcription of Chinese) =

left of 2949 2xɛ̃ 'skill' +

all of 2306 1pə 'small livestock'

I am not sure that 1936 should be reconstructed with a nasal vowel. It could transcribe Chinese syllables with oral and nasal vowels:

解薤 *xɛ

*xɛ̃

Perhaps 1936 was 2xɛj with a -j (cf. Gong's reconstruction 2xiəj).

(12.30.13:30: Li Fanwen 2008: 322 phonetically glossed 1936 as 郝 *xa, but the vowel doesn't match.)

472: 0707 is a semantic compound:

=+

0707 1tʃɨw 'district' (borrowed from Chn 州 *tʃɨw) =

bottom left of 1408 1lhiooʳ 'place, site, market, street, military formation' +

left of 2627 2lɨə̣ 'earth'

473: Were the components of 4660 meant to be reminiscent of Chn 炎/焱/焰 *jã 'flames'? 炎 and 焱 both consist of multiple 火 fires.

=+

4660 1ʔiã (transcription of Chinese) =

bottom right of 4408 1məə 'fire' +

left of 5659 1veʳ 'flourishing, luxuriant'

I am not sure whether 4660 had a simple initial j- (as reconstructed by Arakawa) or an initial ʔ- (as reconstructed by Gong). I chose ʔ- because of its fanqie initial speller:

=+

4660 1ʔiã (transcription of Chinese) =

0932 1ʔɨi 'many, more, much' +

1102 1kiã (transcription of Chinese)

But perhaps 0932 also had initial j-. 0932 transcribed Chinese syllables which were *ʔi and *ji in Middle Chinese. It is not clear whether the *ʔi/*ji distinction survived into Tangut period northwestern Chinese.

474: I suppose guarding enables the guarded to evade the effects of evil, but I would have expected a semantic compound like 'evil' + 'shield':

=+

3774 2ʃɨõ 'to guard' =

left of 3551 2niõ 'evil, wicked, bad' +

center and right of 3789 1phie 'to escape, evade'

(12.30.12:21: Possibly borrowed from Chn 避 *phi 'to avoid'? But I would expect that to correspond to Tangut 1phi, not 1phie. Tangut -ie matches the -ie of Early Middle Chinese *bieh, but the initials don't match. Could Tangut ph- be from *k-b- with a native prefix *k- rather than from Tangut period NW Chn *ph- from EMC *b-?)

3774 could represent affricate-initial as well as fricative-initial Chinese syllables:

*tʃɨõ

*tʃhɨõ

Why not transcribe those syllables with tangraphs for tʃɨõ and tʃhɨõ, syllables which existed in Tangut?

Although all Chinese syllables transcribed with 3774 had nasal vowels, Gong reconstructed it as 2ɕjow and I wonder if its rhyme was -ow with a nasal vowel. Gong's glide codas correspond to my nasal vowels in his rhyme groups VIII and XI:

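Rhyme group | Rhyme | Grade | Gong | This site (nasal interpretation) | This site (glide interpretation)
VIII | 41 | I | -əj | -ẽ | -ej
VIII | 42 | II | -iəj | -ɛ̃ | -ɛj
VIII | 43a | III | -jɨj | -ɨẽ | -ɨej
VIII | 43b | IV | -jɨj | -iẽ | -iej
XI | 56 | I | -ow | -õ | -ow
XI | 57 | II | -iow | -ɔ̃ | -ɔw
XI | 58a | III | -jow | -ɨõ | -ɨow
XI | 58b | IV | -jow | -iõ | -iow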

(I have excluded tense, retroflex, and long vowel rhymes for simplicity. Unlike Gong, I recognize a Grade IV distinct from Grade III.)
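The two interpretations in the table differ in a regular way: the glide version is the nasal version minus nasalization plus -j after front vowels or -w after rounded back vowels. Here is a small Python sketch of that conversion; the function name and the vowel-to-glide lookup are my own shorthand and cover only the vowels in the table.

```python
import unicodedata

TILDE = "\u0303"  # combining tilde marking nasalization
GLIDE_AFTER = {"e": "j", "ɛ": "j", "o": "w", "ɔ": "w"}

# (nasal interpretation, glide interpretation) for rhymes 41-43b and 56-58b.
PAIRS = [
    ("-ẽ", "-ej"), ("-ɛ̃", "-ɛj"), ("-ɨẽ", "-ɨej"), ("-iẽ", "-iej"),
    ("-õ", "-ow"), ("-ɔ̃", "-ɔw"), ("-ɨõ", "-ɨow"), ("-iõ", "-iow"),
]


def nasal_to_glide(rhyme: str) -> str:
    """Replace a final nasal vowel with the same vowel plus a glide coda."""
    # Decompose so that precomposed and combining spellings behave alike.
    decomposed = unicodedata.normalize("NFD", rhyme)
    if TILDE not in decomposed:
        raise ValueError(f"no nasal vowel in {rhyme!r}")
    plain = decomposed.replace(TILDE, "")
    return unicodedata.normalize("NFC", plain + GLIDE_AFTER[plain[-1]])


for nasal, glide in PAIRS:
    print(nasal, "->", nasal_to_glide(nasal), "(table:", glide + ")")
```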

2ʃɨow without a nasal vowel is close to Chn 守 *ʃɨw 'to guard', but I doubt the former was borrowed from the latter because the vowels don't match. I would expect Chn *ʃɨw to correspond to ʃɨw, a syllable that exists in Tangut.

475: 1306 represented the 龔 Gong of the late Tangutologist 龔煌城 Gong Hwang-cherng in the Forest of Categories.

1306 1kiõ was not a perfect match for Chn 龔/弓/宮/鞏 *kiũ but it was the best available match other than 1kiu. There was no Tangut rhyme -iũ.

Were any Gongs or Jiangs related to Su and/or Qian families?

=+

1306 1kiõ (transcription of Chinese) =

0586 2siu (transcription of Chinese: e.g., the surnames 蘇 *su [without *-i-!] and 宿 *siu, now both Su in modern standard Mandarin) +

3277 2tshia (transcription of Chinese: e.g., the surname 錢 *tshiã, now Qian in modern standard Mandarin)

3277 only transcribed Chn 千潛賤淺錢踐 *tshiã with a nasal vowel even though it belongs to the oral vowel rhyme group IV rather than the nasal rhyme group V. Were Tangut period northwestern Chinese vowels losing nasalization?


11.12.28.23:15: THE GOLDEN GUIDE: LINE 94: TANGRAPHS 466-470

94. Four out of these five are transcription characters not associated with any specific morpheme:

Tangraph number | 466 | 467 | 468 | 469 | 470
Li Fanwen number | 5916 | 2152 | 2635 | 2138 | 3617
My reconstructed pronunciation | 1xã | 1ʃɨi | 1xiõ | 2bəəu | 2xwe
Tangraph gloss | (transcription of Chinese) | (transcription of Chinese) | (transcription of Chinese) | grave | (transcription of Chinese)
Word | the surname 韓 Han (*xã) | the surname 施/史/時/石/師 Shi (*ʃɨĩ)? | the surname 馮/鳳/豐/酆/封 Feng (*fɨũ) or 方/房 Fang (*fɨõ) or 向 Xiang (*xɨõ) | the surname 慕 Mu (*mbəu)? | the surname 惠 Hui (*xwej)
Translation | Han, Shi, Feng/Fang/Xiang, Mu, Hui

466: 5916 has 5882 as a cryptophonetic (its Chinese translation was 漢 *xã) plus the mysterious right-hand element ヒ (alphacode cin):

=+

5916 1xã (transcription of Chinese 漢/韓/邯 *xã) =

all of 5882 1zaʳ 'Chinese' +

right of 0789 2ɣʊ 'the surname Ghu'

Does 0789 represent a Ghu family related to the Han?

The 馬韓 Mahan confederacy in Korea was called

2bæ 1xã (cf. Tangut period NW Chn *mbæ xã)

so Korea, the 韓國 'Han country', might be known as

1xã 2lhiẹ 'Han country'

in modern Tangut. Only two strokes (cin) would distinguish 'Korea' from 'Chinese'!

<>

1xã 'Korea' <> 1zaʳ 'Chinese'

467: Were the Shi the 'elder Nga'?

=++

2152 1ʃɨi (transcription of Chinese 施/史/時/石/師 *ʃɨĩ?) =

2888 2mə 'surname' +

1633 2pəụ 'elder' +

2075 2ŋa 'the surname Nga'

468: 2635 looks like a combination of 'earth' (indicating a geographic name? from which tangraph?) plus an element of unknown function (alphacode: dol) found in only eight other tangraphs that don't sound like xiõ.

=+

Although Nishida and Arakawa have reconstructed Tangut f-, I am skeptical because Chinese f-syllables were transcribed with tangraphs like this one listed in chapter VIII (glottal initials) of Homophones. (Velar x- is treated as a glottal initial and may have been glottal [h].)

469: The analysis of 2138 is unknown. It looks like 'earth' (cf. the 土 'earth' in Chn 墓 'grave') plus 'hand' plus a right-hand element of unknown function (alphacode: dal) found in 80 other tangraphs:

=++

2bəəu 'grave' is borrowed from Tangut period northwestern Chinese 墓 *mbəu 'id.' The reason for the Tangut long vowel is unknown. Could it compensate for the loss of a native Tangut suffix?

*CV-X > CVV

470: The analysis of 3617 is unknown:

=++

Its left component is 'person' but I don't know what the other two (alphacodes bal and juu) are doing. The sequence baljuu does not occur anywhere else. There are no other tangraphs pronounced xwe.

12.29.1:25: Could 'person' be from 2888 'surname' as in 2152 above?


11.12.27.23:45: THE ROOTS OF RAWNESS

Having just written about the etymology of Zhuang sawgun 'Chinese character', I should write about the etymology of the second half of sawndip. (The saw is the same.)

Despite the spelling, ndip 'raw' is [ɗip7] without a nasal. d without n is unaspirated [t] in Zhuang spelling. This usage is a carryover from Pinyin* in which d and t respectively represent unaspirated [t] and aspirated [th]. Zhuang has no [th], though it does have a [θ] written as s. The n of nd [ɗ] differentiates it from d [t]. The 1957-1982 spelling of [ɗ] was Ƌ, which might be a mirror image of the 1957-1982 letter Ƃ [ɓ] as well as a derivative of d.

Zhuang even-numbered tones usually developed in syllables with *voiced initials, but syllables with voiced implosive initials developed the odd-numbered tones associated with *voiceless initials:

*Proto-voicing | *Proto-initial | Tones
voiceless | *p-, *t- ... | 1, 3, 5, 7
voiced | *ɓ-, *ɗ- ... | 1, 3, 5, 7
voiced | *b-, *d- ... | 2, 4, 6, 8

Tone 1 is not indicated in spelling. Tones 2-6 are indicated by silent letters following a syllable:

Tone | 1957 spelling | 1982 spelling
2 | -ƨ | -z
3 | -з | -j
4 | -ч | -x
5 | -ƽ | -q
6 | -ƅ | -h

Note how similar the 1957 letters are to the numerals 2-6 and the Cyrillic letters г (italic), з, ч, and ь. (ƽ doesn't look like any Cyrillic letter.)

h can also be an initial letter in Zhuang, but z, j, x, and q are always tonal.

Syllables ending in stops can only have tones 7 and 8 which are indicated by the spelling of the stops:

Tone Spelling
7 -p, -t, -k
8 -b, -d, -g

Tones 7 and 8 are identical to 5 and 6, but this spelling convention avoids final digraphs like -pq for -p with tone 5, etc.
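The 1982 conventions are regular enough to read a tone number off the last letter of a syllable. This is a rough Python sketch, not an official algorithm: the function name is mine, and treating final -ng as a nasal rather than a tone 8 stop is my own addition, since the table above does not cover nasal codas.

```python
TONE_LETTERS = {"z": 2, "j": 3, "x": 4, "q": 5, "h": 6}


def tone_of(syllable: str) -> int:
    """Return the tone number implied by a syllable in the 1982 spelling."""
    last = syllable[-1]
    if last in TONE_LETTERS:
        return TONE_LETTERS[last]        # silent tonal letter
    if last in ("p", "t", "k"):
        return 7                         # stop final spelled as voiceless
    if syllable.endswith("ng"):
        return 1                         # assumption: -ng is a nasal coda
    if last in ("b", "d", "g"):
        return 8                         # stop final spelled as voiced
    return 1                             # tone 1 is unmarked


# ndip, saw and gun are from the text; cuengh is the second syllable of
# Vahcuengh, the Zhuang name for the language.
for s in ["ndip", "saw", "gun", "cuengh"]:
    print(s, tone_of(s))
```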

Li Fang-Kuei (1977: 129) reconstructed 'raw' in Proto-Tai as *dl/rip. The reflexes of PT *dl/r- in 'raw' vary from d- to ɗ- to n- to r-. Some of the sawndip spellings of ndip imply earlier phonetic similarity with Middle Chinese *ɳ- (which is r-like) and *l-:

生 'raw' + 尼 *ɳi

立 MC *lip + 生 'raw'

生 'raw' + 立 MC *lip

月 < 肉 'meat' + 立 MC *lip

立 MC *lip by itself

Other spellings have no (?) phonetic:

米 'rice' + 生 'raw' over 失 'lose'

生 'raw' + 勺 'ladle'

㐅 '?' + 力 *lɨk 'strength' (phonetic?; *-k is a grave consonant like -p)

㐅 appears in at least 13 sawndip characters. I don't know what its function is.

Sawndip may be the earliest indigenous Tai writing system. It would be interesting to reexamine existing reconstructions of Proto-Tai with sawndip evidence in mind. Although the spellings of ndip 'raw' imply an earlier liquid or even nasal, Pittayaporn (2009) reconstructed Proto-Tai 'raw' as *C̥.dip without either a liquid or a nasal. *C̥- is a presyllable with a voiceless initial. Pittayaporn compares his PT *C̥.dip with Blust's Proto-Austronesian *quDip. PAN *D is [ɖ]. Could PAN *quɖip have been borrowed into Proto-Kra-Dai as *qudrip**, simplifying to PT *C̥.dip and Norquest's (2008: 277) Proto-Hlai *Curiip and Proto-Be *Curjəp? Or did PKD inherit 'raw' from an ancestor shared with PAN or even from PAN itself?

Benedict's Austro-Tai (Austro-Kra-Dai in modern terminology?)

Proto-Austro-Tai (Proto-Austro-Kra-Dai)
    Austronesian
    Kra-Dai (including Tai)

Sagart: Kra-Dai as (Sino-)Austronesian subgroup

Proto-Sino-Austronesian
    Sino-Tibetan
    Proto-Austronesian
        Non-Muic subgroups of Austronesian
        Muic
            Non-Kra-Dai subgroups of Muic
            Kra-Dai

I used to be highly skeptical of a connection between Kra-Dai and Austronesian. As Pittayaporn wrote,

Benedict’s [Austro-Tai] work has been rightly criticized for its methodology and the quality of its evidence.

However, I now see

Undeniable evidence (Benedict 1942, Sagart 2004, and Ostapirat 2005) for some kind of relationship

But I have no opinion about which kind of relationship exists between Kra-Dai and Austronesian. For now, I can only recommend Pittayaporn (2009) as an overview of compression phenomena which may be relevant to compression in the histories of Chinese and Tangut.

*Not all Zhuang letters are used as in Pinyin. Exceptions:

Zh c is [ɕ] like Pinyin x, not Pinyin c [tsh]. Zhuang has no aspirates.

Zh j, x, q, z are tonal letters (see above), not consonants as in Pinyin.

Zh s is [θ], not [s] as in Pinyin.

Zhuang spelling indicates short vowels with an added -e- in closed syllables, whereas Pinyin has no devices for vowel length:

Short ae [a] oe [o]
Long a [aa] o [oo]

There is no length distinction in open syllables.

**12.28.00:11: A presyllable initial *q- is reconstructible in Proto-Kra-Dai on the basis of Buyang qaɗip 'raw' (Li Jinfang 1999 as cited in Sagart 2004: 50).


Tangut fonts by Mojikyo.org
Tangut radical and Khitan fonts by Andrew West
Jurchen font by Jason Glavy
All other content copyright © 2002-2011 Amritavision