13.5.18.23:59: MANCHU VOWEL P-Ū-ZZL-E
I initially wrote in "J_(r)_en",
In Manchu, u ... e (a yin-yin sequence) and ū ... a (a yang-yang sequence) are possible, but not ū ... e (a yang-yin sequence).
However, the yang-yin sequence ū ... e was in Old Manchu Jūsen (sic!). I quickly caught my mistake and added two words (in bold)
In standard written Manchu, u ... e (a yin-yin sequence) and ū ... a (a yang-yang sequence) are possible, but not ū ... e (a yang-yin sequence).
Did vowel harmony rules change between standard written Manchu and Old Manchu? No, but orthographic conventions changed:
| Manchu IPA | Yin/yang | Old Manchu spelling | Standard Manchu spelling | Cf. Khalkha vowels corresponding to Written Mongol |
| [u] | yin after velars; never occurs after uvulars; yin or yang elsewhere | ᡡ <ū> or ᠤ <u> | ᡠ <u> (new letter) | [u], [o] : ᠦ <ü> ([u], [o] : ᠤ <u> if preceded by a yin vowel) |
| [ʊ] | yang | ᠣ <u> | ᡡ <ū> | [ʊ], [ɔ] : ᠣ <u> |
| [ɔ] | ᠣ <o> | |||
| [ə] | yin | ᠡ <e> initially; ᠠ <a> elsewhere | ᡝ <e> | [ə], [a]: ᠡ <e> initially; ᠠ <a> elsewhere |
| [a] ([ɑ] after uvulars) | yang | ᠠ <a> | [a]: ᠠ <a> | |
| [i] | neutral | ᡳ <i> | [i]: ᠢ <i> | |
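The point of the table, that one romanized letter stands for different vowels under the two conventions, can be sketched in Python. This is only an illustration under stated assumptions: the names `READINGS`, `YANG_ONLY`, and `compatible_with_yin` are hypothetical, only the [u] and [ʊ] rows are encoded, and [u] is simply treated as able to occur in yin words.

```python
# Possible IPA values of two romanized letters under each spelling convention,
# taken from the [u] and [ʊ] rows of the table above (other rows are omitted).
READINGS = {
    "old": {"ū": {"u"}, "u": {"u", "ʊ"}},
    "standard": {"ū": {"ʊ"}, "u": {"u"}},
}

# Simplifying assumption: [ʊ] is strictly yang, while [u] can occur in yin words.
YANG_ONLY = {"ʊ"}


def compatible_with_yin(letter, convention):
    """Return True if the letter can spell a vowel that may occur in a yin word."""
    return any(value not in YANG_ONLY for value in READINGS[convention][letter])


for convention in ("old", "standard"):
    print(convention, compatible_with_yin("ū", convention))
```

Under these assumptions, <ū> before <e> is unproblematic in Old Manchu spelling but not in standard spelling, which is the situation of Old Manchu Jūsen.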
Was the variety of Mongolian known to the Manchu (technically still Jurchen until 1635) at the turn of the 17th century like modern Khalkha Mongol? Old Manchu spelling conventions are more or less what I'd expect from a Khalkha speaker. (There are exceptions: e.g., [suwəni] 'you' is spelled as ᠰᠤᠸᠠᠨᠢ <suwani> as well as ᠰᠦᠸᠡᠨᠢ <sūwani> in Old Manchu.)
The big mystery to me is why the ᡡ grapheme was reassigned to [ʊ] in standard Manchu.
5.19.13:09: I used to think that the change in spelling reflected a shift from a Turkic-style vowel system with palatal harmony to a Khalkha-style vowel system with height harmony:
| Yin/yang | Old Manchu | Standard Manchu | ||
| IPA | spelling | IPA | spelling | |
| yin | [y], [ø] | ᡡ <ū> or ᠤ <u> | [u] | ᡠ <u> (new letter) |
| yang | [u] | ᠣ <u> | [ʊ] | ᡡ <ū> |
| [o] | [ɔ] | ᠣ <o> | ||
| yin | [e] | ᠡ <e> initially; ᠠ <a> elsewhere | [ə] | ᡝ <e> |
| yang | [ɑ] | ᠠ <a> | [a] ~ [ɑ] | ᠠ <a> |
| neutral | [i] | ᡳ <i> | [i] | ᡳ <i> |
In this scenario, original [y] and [ø] merged into [ʏ] which then backed to [ʊ] and raised to [u] while original nonlow back vowels lowered: [u] > [ʊ] (only after uvulars?), [o] > [ɔ]. The use of <ü> for [ʊ] would date from stage 3 below:
| Initial | Yin/yang | Stage 1 Pre-Old Manchu | Stage 2 Early Old Manchu | Stage 3 Late Old Manchu | Stage 4 Standard Manchu |
| nonuvular | yin | [kʰy] | [kʰʏ] ᠺᠦ <kū> | [kʰʊ] ᠺᠦ <kū> | [kʰu] ᡴᡠ <ku> |
| [kʰø] | |||||
| uvular | yang | [qʰu] | [qʰʊ] ᠬᠤ <qu> | [qʰʊ] ᡴᡡ <qū> | [qʰʊ] ᡴᡡ <qū> |
| nonuvular, nonvelar | [su] | [su] ᠰᠤ <su> | [su] ᠰᠤ <su> | [su] ᠰᡠ <su> | |
| nonvelar | [qʰo] | [qʰo] ᠬᠣ <qu> | [qʰɔ] ᠬᠣ <qu> | [qʰɔ] ᡴᠣ <qo> |
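The spellings predicted by this table can be tabulated to show what a stage 3 text should look like. A rough sketch, assuming the hypothetical names `SPELLINGS` and `initials_with_u_macron` and using only the romanized spellings from the rows above:

```python
# Romanized spellings from the table above, keyed by the stage 1 syllable.
# Stage 1 has no spelling, so only stages 2-4 are listed.
SPELLINGS = {
    "kʰy ~ kʰø": {2: "kū", 3: "kū", 4: "ku"},
    "qʰu": {2: "qu", 3: "qū", 4: "qū"},
    "su": {2: "su", 3: "su", 4: "su"},
    "qʰo": {2: "qu", 3: "qu", 4: "qo"},
}


def initials_with_u_macron(stage):
    """List the initial letters that are followed by <ū> at a given stage."""
    return sorted(
        spelling[stage][0]
        for spelling in SPELLINGS.values()
        if "ū" in spelling[stage]
    )


for stage in (2, 3, 4):
    print(stage, initials_with_u_macron(stage))
```

Only stage 3 has <ū> after both <k> and <q>, so that is the pattern a stage 3 text would have to show.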
The trouble is that I don't know of any text with stage 3 characteristics: i.e., <ū> after both nonuvulars and uvulars. The closest thing I can find in the Old Manchu texts I have on hand (those in Roth Li's 2010 textbook) is "Manchu-Chinese cooperative living" (1621-1622) in which <ū> is in yin-voweled words except for ᠦᠯᠭᠢᠶᠠᠨ <ūlgiyan> ~ ᠤᠯᠭᠢᠶᠠᠨ <ulgiyan> which must be a yang-voweled word because of its <a>. I am hesitant to draw any conclusion from such an isolated error.
Worse yet, those early texts sometimes have both <u> and <ū> in yin-voweled words in the same text - a practice that the above table fails to predict: e.g., "Manchu-Chinese cooperative living" has ᠵᠤᠰᠡᠨ <jusen> as well as ᠵᠦᠰᠡᠨ <jūsen> 'Jurchen'.
Let's start over again. In 1599, when Nurhaci commissioned the adaptation of the Mongolian alphabet to Manchu, Manchu had at least two types of vowels that were written with the Mongolian letters ᠤ <u> and ᠦ <ü> (a combination of <u> and <i>). ᠦ <ü> is ū with a macron instead of a diaeresis in the Möllendorff romanization of Manchu. Presumably these two vowel types (not necessarily two vowels)
- sounded like the vowels written with those letters in Mongolian
- were the ancestors of the three vowels [u ʊ ɔ] of later standard Manchu
In Mongolian, the letter ᠦ <ü> cannot follow uvulars, and its Old Manchu counterpart ᠦ <ū> also did not follow uvulars in the texts I have seen. However, in standard Manchu, ᠦ <ū> usually follows uvulars (!). How did that happen? Here's one last scenario:
- Old Manchu had four labial vowels [y u ʊ ɔ]
- [y] was losing its palatality and merging with [u]
- [y] was written as ᠦ <ū> (ᠤ <u> after a yin vowel) and the rest were written as ᠤ <u>
- [y] was lost by the time the alphabet was finalized, leaving three labial vowels [u ʊ ɔ]
- [u ɔ] were much more common than [ʊ] which was mostly after uvulars:
| Vowel | Yin/yang | Uvulars | Velars | Nonuvulars and nonvelars |
| [u] | neutral | no | yes | yes |
| [ʊ] | yang | yes | no | rarely |
| [ɔ] | yes | |||
- Therefore [u ɔ] should be written with characters requiring less effort than [ʊ].
- [u] was written as yang <o> with a dot by analogy with noninitial [ə] which was written as yang <a> with a dot:
| Dotless yang / lower vowels | Dotted higher vowels |
| [o] ᠣ <o> | [u] ᡠ <u> |
| [a] ~ [ɑ] ᠠ <a> | [ə] ᡝ <e> |
- The existing letter ᠦ <ū> - similar to ᠣ <o> but differentiated by more than a dot - was then recycled for the low-frequency vowel [ʊ]. This does not necessarily mean that [ʊ] came from an earlier [y] and/or [u]; graphic continuity (i.e., the reuse of a shape) does not entail phonetic continuity (i.e., a sound change bridging the two sound values).
Similarly, the fact that the Latin letter y once stood for the Azerbaijani vowel [y] but now stands for the Azerbaijani consonant [j] does not mean that Azerbaijani [y] became [j]. (Compare Azerbaijani alphabets here.)
I still don't find that last account entirely convincing. A real answer would require a detailed study of spelling in all early Manchu texts. One might hope that a reconstruction of Old Manchu phonology - if it's really that different from the phonology of standard Manchu only decades (!) later - would map nicely onto what is known about Ming Jurchen phonology, but I fear that might not be the case, since written Manchu is not a direct descendant of the dialects transcribed by the Chinese.
13.5.11.21:59: J_(R)_EN
In "Back on Track" and "The Vowels of the Central Capital", I reconstructed Jurchen
~
as <jū> [tʂʊ] with [ʊ] instead of [u] as in Jin Qizong's reconstruction dʒu.
This character is also in?<jū.še> 'Jurchen' (Yongning Temple Stele, 1413)
which also has the spelling
?<ju.šen> (Sino-Jurchen vocabulary of the Bureau of Translators; cf. Manchu jušen 'serf, Jurchen')
So should I reinterpret the latter as <jū.šen>? I would rather not because the vowel sequence <ū ... e> violates Manchu vowel harmony (which I project back into Jurchen). Manchu vowels are 'yin' (higher), 'yang' (lower), or neutral:
| yin | i (neutral) | e [ə] | (none) | u |
| yang | a | o [ɔ] | ū [ʊ] (> u after nonuvulars) |
The Manchu vowel system is similar to the Middle Korean vowel system:
| yin | i (neutral) | ə | ɯ | u |
| yang | a | ʌ | o |
In standard written Manchu, u ... e (a yin-yin sequence) and ū ... a (a yang-yang sequence) are possible, but not ū ... e (a yang-yin sequence). (However, the yin-yang sequence u ... a is possible only if u is after a nonuvular consonant and is hence from an earlier *ū: e.g., juwa < *jūwa 'ten'. 5.12.3:43: Moreover, the yang-yin sequence ū ... e is in early [i.e., nonstandard] written Manchu Jūsen [sic!]. I will discuss it next time.)
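The harmony rule can be expressed as a small check on romanized words. This is a sketch under simplifying assumptions, not a full account of Manchu harmony: the names `vowel_classes` and `is_harmonic` are hypothetical, u is treated as compatible with both classes (since u after a nonuvular may continue earlier *ū), and i is neutral.

```python
import unicodedata

YIN = {"e"}
YANG = {"a", "o", "\u016b"}  # \u016b is ū
# "u" and "i" are deliberately left out: in this sketch they never break harmony.


def vowel_classes(word):
    """Return the set of harmony classes ('yin', 'yang') found in a word."""
    classes = set()
    # NFC makes sure that ū is a single character before it is compared.
    for char in unicodedata.normalize("NFC", word.lower()):
        if char in YIN:
            classes.add("yin")
        elif char in YANG:
            classes.add("yang")
    return classes


def is_harmonic(word):
    """A word is harmonic here if it does not mix yin and yang vowels."""
    return len(vowel_classes(word)) < 2


for word in ("jušen", "jūsen", "juwa", "duha", "hendu"):
    print(word, is_harmonic(word))
```

With these assumptions only jūsen is flagged, since it combines yang ū with yin e.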
If ū ... e wasn't possible in Jurchen, what was the original Jurchen ethnonym?
In Classical Mongolian, the Jurchen are the Jürčid with the yin vowel ü. Mongolian also has yin/yang vowel harmony (though this is traditionally interpreted in terms of palatality rather than height):
| yin | i (neutral) | e | ö | ü |
| yang | a | o | u |
If Jurchen yin/yang categories were preserved in Mongolian borrowings, then the Jurchen original was *Jurcen with yin vowels. (-d is a Mongolian plural marker that replaces stem-final -n.) Perhaps
was originally <jur.cen> or <jur.šen>. (It is not known whether *rc shifted to *rš before or after the Jurchen script was developed c. 1120. That certainly must have happened after the name was borrowed into Mongolian.)
At this stage,
<jur> ≠ <jū>
were not homophonous. But by 1413, both -r- and the u/ū distinction after nonuvulars were lost, so the two characters became interchangeable homophones, and the old character for <jū> was used to write a syllable that had once been <jur>:
<jū.še> for Juše [tʂuʂə] (not *Jūše *[tʂʊʂə]!) < Jurše(n) < Jurce(n)
The choice of
for še in 1413 does not necessarily imply that this character was once read rše, rce, or ce; it merely indicates that its initial consonant and vowel were identical to those of
<šen> < <cen>?
by 1413.
Oddly, if the above scenario is correct, the spelling
<jur.cen> or <jur.šen>
is more conservative even though I don't know of any attestations prior to circa 1500 (Pelliot's estimated date for the Sino-Jurchen vocabulary of the Bureau of Translators).
5.12.2:55: The fact that
<jur>
has only been found in the word for 'Jurchen' so far may indicate that it originally represented a syllable less common than <ju> (i.e., <jur>). Perhaps it was initially a logogram <jurcen> or <juršen> for 'Jurchen' and
<šen> < <cen>?
was added later as a phonetic clarifier.
I thought
<ju>
might be the phonogram for ju, but it is attested in isolation (i.e., not as part of a longer word) in the Nüzhen zishu (Jurchen Character Book) which may be a list of logograms. (That's the impression I got from the description of Nüzhen zishu in Kane 1989: 9. I have not yet studied it myself.) I suspect it might have originally been a logogram for juhe 'ice' which was later written with two characters as
<ju.he>
The second character <he> might be a phonetic clarifier.
<jū jur(cen?) ju(he?)>
are the only characters with the reading dʒu (= my ju) in Jin (1984). If the latter two were initially logograms, then perhaps
- the first character <jū> was a phonogram for both ju and jū
- the u/ū-merger after j was already complete by the time the Jurchen script was devised, so that first character was the only phonogram for ju < *ju and *jū
- the <ju>-phonogram could be among the hundred or so Jurchen characters without any known readings; ju/jū merged early, and that character was then replaced by <jū>
13.5.4.23:54: THE VOWELS OF THE CENTRAL CAPITAL
In "Back on Track", I reconstructed Jurchen
~
~
'road' (Jin 1984: 30-31, 95, 167)
as <jū.gū> [tʂʊɢʊ] with [ʊ] instead of [u] as in Jin Qizong's reconstruction dʒu(g)u.
(I have projected Manchu consonants according to Norman [2013] back into Jin Jurchen, but that may be anachronistic.)
(5.5.0:33: Even projecting Manchu-style vowel harmony back into Jurchen may also be anachronistic, but I want to see how far I can go with that hypothesis.)
The first character of 'road' also appears in
<jū.ūng dū> [tʂʊŋ dʊ]+
for Chinese 中都 'Central Capital', now pronounced Zhōngdū in modern standard Mandarin.
This Jurchen spelling may imply that the vowel of the northern Chinese readings of 中 and 都 was [ʊ] rather than [u] during the Jin Dynasty. If that vowel were [u], 中都 might have been transcribed as
~
which may have represented <ju.ung du> [tʂuŋ du] as opposed to <jū.ūng dū> [tʂʊŋ dʊ].
I reconstructed the reading of the final character of <jū.ūng dū> with <ū> because <ū> would be in harmony with the <a> of
<dū.ha> [tʊχɑ] 'intestine' : Manchu duha [tuχɑ]
<wa.dū.ra> [wɑdʊrɑ] 'kill-?-?*' : Manchu wa- [wa] 'kill'
but that character also appears in
?<hen.du.ru> [xənduru] 'say-HORTATIVE**' : Manchu hendu- [xəndu]
whose <e> should harmonize with <u>, not <ū>.
(5.5.0:15: All of the above Ming Jurchen spellings from the Sino-Jurchen vocabulary of the Bureau of Translators may postdate a merger of Jin Jurchen <du> and <dū>, so perhaps one or more of them would be wrong for Jin Jurchen.)
In any case, the vowel of 都 underwent the following shift in Chinese
*a > *ɑ > *ɔ > *o > *ʊ > *u
but it is not clear whether the Jurchen borrowed 都 before or after the final raising to [u].
Also, if I am wrong about assigning [ʊ] and [u] to readings of otherwise seemingly homophonous Jurchen characters, the existence of multiple characters for the same reading - e.g.,
<ju> (as opposed to <jū> and <ju>)
<ung> (as opposed to <ūng> and <ung>)
<du> (as opposed to <dū>, <dū>, <du>, and <du>)
- needs to be explained.
Different shapes for the same syllable need not entail heterophony. Prior to the standardization of kana in 1900, Japanese had always been written with multiple symbols per mora, and this complex allography disguised a simpler phonology.
*5.5.2:15: I don't know what the suffixes after wa- 'kill' are. I considered the possibility that wa-du-ra- was a verb stem consisting of wa- plus the noun-forming suffix -du and the verb-forming suffix -ra-, but what would wa-du- be? Some kind of weapon for killing? Would wa-du-ra- then mean 'use a wa-du-'?
Kiyose (1977: 117) read the final character as <la> and thought it "probably forms perfective participles" whereas Jin (1984)'s reading <ra> is like the Manchu imperfect participle suffix -ra.
**5.5.0:11: Gloss from Kiyose (1977: 63, 117).
13.4.27.23:57: BACK ON TRACK
William Rozycki included Manchu jugūn 'road' in his Mongol Elements in Manchu (1983: 171) - and rejected it as a Mongol element "on semantic grounds" since the corresponding Mongolian word jüg meant 'direction', not 'road'.
I would add two phonological arguments:
First, if Mongolian jüg (or some similar ancestor) were borrowed into Proto-Tungusic (or inherited from Proto-Altaic*), it should have had *i instead of u in languages that shifted *ü to i**: e.g., Evenki *jiɣun instead of juɣu 'direction'. I agree with Rozycki (1983: 171): Tungusic juɣ-words meaning 'direction' "are recent borrowings from Mo[ngolian]". They cannot predate the *ü to i shift and are not cognate to Manchu jugūn 'road'.
Second, the near-high (but not truly high) ū [ʊ] of Manchu jugūn [tʂuʁʊn] and the mid o of its Sibe cognate ǰoxon [tʂoʁon] 'id.' (Kim et al. 2008: 80) indicate that the word 'road' had nonhigh vowels, not high vowels like Mongolian jüg. I would expect Mongolian jüg to correspond to Manchu *jugun [tʂuɣun] and Sibe *ǰuxun [tʂuxun] with high vowels.
Their Jin Jurchen 'ancestor'***
~
~
'road' (Jin 1984: 30-31, 95, 167)
may have been pronounced [tʂʊɢʊ] and could be romanized as jūgū. *ʊ raised to high u in Manchu after j but not after g, and lowered to o in Sibe****.
13.4.28.4:46: The Ming Jurchen word ju transcribed as 住 [tʂu] in the Sino-Jurchen vocabulary of the Bureau of Interpreters (Kane 1989: 162, #133) may have been [tʂʊ(ː)] or [tʂu(ː)]. There was no way to distinguish between [ʊ] and [u] or indicate long vowels in Ming Chinese transcription.
It is not clear whether earlier [ɢʊ] fused with the preceding vowel, possibly lengthening it, or was a suffix (as proposed by Jin 1984: 31) that was either lost or always absent in that dialect.
I am skeptical that [ɢʊ] was a suffix since I know of no Manchu suffix -gū -[ʁʊ].
Moreover, the transcription 住兀 [tʂuu] for
<jū.gū> 'road'
in the Bureau of Translators vocabulary (Grube 1896, #57) may point to a long vowel, but it is possible that the Bureau of Interpreters dialect lost it. However, 住兀 [tʂuu] may have represented a Ming Jurchen [tʂʊʁʊ] with a [ʁ] without an exact equivalent in Ming Chinese.
*13.4.28.3:59: I do not think a Proto-Altaic language existed.
**13.4.28.4:01: According to a 1996 class handout by Sasha Vovin, Proto-Tungusic *ü merged with *i in some languages like Evenki and merged with *u in others like Manchu.
***13.4.28.4:04: Jin Jurchen may not be directly ancestral to Written Manchu and Sibe, but it is certainly very closely related to their ancestors.
****13.4.28.17:31: At a glance, it seems that Written Manchu ū corresponds to Sibe o which is therefore the regular reflex of pre-Sibe [ʊ] (Kim et al. 2008: 52, 74, 107):
WMa akū [aqʰʊ] : S ako [aqo] 'not exist' (is the lack of aspiration a typo?)
WMa gūsin [qʊsin] : S gošun [koɻun] 'thirty'
WMa hūcin [χʊtɕʰin] : S xocin [χotɕʰin] 'well'
I am not entirely certain, as I have not yet investigated all Sibe cognates of Written Manchu ū [ʊ], and I found one exception (Kim et al. 2008: 79):
WMa jakūn [tʂaqʰʊn] : S ǰakun [tʂaqʰun] 'eight' (instead of *ǰakon [tʂaqʰon])
I initially thought
WMa juwen [tʂuwən] 'loan' : S ǰomjə [tʂomjə] 'to borrow' (instead of *ǰumjə [tʂumjə]; Kim et al. 2008: 80)
was an exception, but its o is a fusion of [uwə] also found in Sibe ǰwə [tʂwə] ~ [tʂo] 'two' (Kim et al. 2008: 81) corresponding to Written Manchu juwe [tʂuwə] and Ming Jurchen juwe [tʂuwə] transcribed as 拙 [tʂwə] (Kane 1989: 362, #1110).
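The Written Manchu ū : Sibe o correspondence can be checked mechanically over the pairs cited above. A minimal sketch, assuming that the vowels of each pair line up one to one by position; the names `vowels`, `reflexes_of_u_macron`, and `PAIRS` are hypothetical, and only the four pairs quoted above are included.

```python
import unicodedata

VOWELS = set("aeiou") | {"\u016b"}  # \u016b is ū


def vowels(word):
    """Return the vowels of a romanized word in order."""
    return [c for c in unicodedata.normalize("NFC", word) if c in VOWELS]


def reflexes_of_u_macron(written_manchu, sibe):
    """Pair vowels by position and return the Sibe vowels matching WMa ū."""
    return [
        s for w, s in zip(vowels(written_manchu), vowels(sibe)) if w == "\u016b"
    ]


PAIRS = [
    ("akū", "ako"),
    ("gūsin", "gošun"),
    ("hūcin", "xocin"),
    ("jakūn", "ǰakun"),
]

# Collect the Written Manchu words whose ū does not correspond to Sibe o.
exceptions = [
    wma for wma, sibe in PAIRS if reflexes_of_u_macron(wma, sibe) != ["o"]
]
print(exceptions)
```

On this small sample the only pair that departs from ū : o is jakūn : ǰakun.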
13.3.9.23:23: GONE, GENÚN, ΓUNIΓ
I am still stunned by the passing of Toren Smith, the man who opened doors for me in the manga industry - and who made that industry possible in the United States.
The Khitan word
<g.en.ún> 'sad'
from the Eulogy for 宣懿皇后 Empress Xuanyi in the small script came to mind.
I wonder what the large script spelling of genún was. Could it have been something like
<g.en.ün>?
And I wonder how the frontness (?) of genún can be reconciled with the backness of Written Mongolian ɣuniɣ, a cognate suggested in Kane (2009: 114).
genún has
front (i.e., velar) g
front e
ú which might have been front [y]
(though Kane [2009: 30] pointed out that -ún appeared in words of both the [back?] a-group and [front?] e-group; could it have been a neutral central vowel [ʉ]?)
ɣuniɣ has
ɣ (i.e., back g; uvular despite its conventional notation; see Poppe's grammar)
back u
front yet neutral i < central *ɨ < back *ɯ?
Both words might descend from a Pre-Proto-Mongolic* root *g-n; Khitan -ún and Written Mongolian -iɣ cannot be reconciled and must be unrelated suffixes. I have no idea what the vowel of *g-n was. If it were, say, *ö, a vowel that was mid and front like Khitan e but rounded like Mongolian u, I would expect it to remain as ö in Written Mongolian.
*Pre-Proto- rather than simply Proto-Mongolic because Khitan is Para-Mongolic:
| Pre-Proto-Mongolic | |
| Para-Mongolic (extinct) | Proto-Mongolic |
| Khitan (extinct) | Mongolic languages proper |
'Para-Mongolic' means 'sister of Mongolic', so there could have been more than one Para-Mongolic branch, and there might not have been a single 'Proto-Para-Mongolic' language that was ancestral to two or more of those branches.
13.3.2.23:59: MANCHU 'HINGS' AND QUEENS*
Manchu han [χan] 'king' and katun [qatun] 'queen' are presumably borrowed from Mongolian qa(gha)n 'king' and qatun 'queen', yet they have different initial consonants. Were they respectively borrowed from Mongolian after and before Mongolian q lenited to [x]?
Next: Does Jurchen complicate that scenario?
*The English words king and queen share initial /k/, but are unrelated, whereas the Central Asian titles qa(gha)n 'king' and qatun 'queen' are related and are probably from Xiongnu; see Vovin (2007) for details.
13.2.20.23:59: (R)I(U)-DDLE OF THE SULFURIC SPHINX 2: HOT WATER BUBBLES
In my last post, I proposed that Japanese 硫黃 iō < *iwaũ 'sulfur' reflected several Koreanisms:
- r-avoidance even in borrowings from Chinese
- medial -h-loss
I concluded that iō was a Chinese word that had been filtered through some form of Koreanic whose traits (*ri- > i-, -h- > zero) have parallels in later Korean proper**.
However, David Boxenhorn suggested that the word could reflect changes in Japanese dialects. I realized that there must have been at least one native Japanese word for sulfur which is indigenous to Japan.
Then bitxəšï-史 quoted from the 日本国語大辞典 which I haven't been able to consult since 2004 (nine years ago - not everything is online yet!). 日本国語大辞典 derived the word from a native yu-awa 'hot water-bubbles' and reminded me of a variant reading yuō for 硫黃.
Finally, I checked Sakihara's Okinawan-English Wordbook and found yuuwaa 'sulfur'.
Here's what I think now:
Early Middle Chinese 硫黃 *lu wɑŋ 'sulfur' was borrowed into Japanese via Paekche as *ruwaũ.
There was a native term *yu nə awa 'bubbles of hot water' (first attested much later in 和名抄 as 由乃阿和) with a shorter variant *yu-awa 'hot water-bubbles' (is this attested anywhere?).
The two terms were confused, and *ruwaũ became *yuwaũ under the influence of the similar-sounding but unrelated native *yu-awa. This *yuwaũ is attested in 和名抄 as 由王 (interpreted by 岩波古語辞典 on pp. 1369 and 1376 as yuwa***).
Okinawan yuuwaa could be a mixture of the expected Sino-Okinawan 硫黃 *ruwoo < *ruwau and the similar-sounding but unrelated Okinawan *yuu-ʔaa < *yuu-ʔawa 'hot water-bubbles'.
The first syllable *yu was changed to *i. This irregular shift also occurred in yuk- 'to go' which became ik- (and both versions of 'to go' survive today alongside yuō ~ iō 'sulfur').
The resulting *iwaũ resembles iwa 'large rock', but I think that's coincidental, as sulfur crystals are not large.
I am still not entirely satisfied with that account, but it does have more explanatory power than my first attempt, and it avoids speculating about irregular nativization in early Koreanic (which was still involved as an intermediary language).
*I called this a "Koreanism" but failed to explain what I meant in my haste. Unlike Middle Chinese, Korean has never had native syllables ending in -w. The prescriptive style of Sino-Korean pronunciation in Tongguk chŏng'un did have readings ending in -ㅱ -w, but that coda was pedantic if not simply artificial; natural Sino-Korean readings lacked -w.
One might think Middle Chinese *-iw was borrowed as -i without -w in natural Sino-Korean, but in fact it was borrowed as -yu (still without -w). So Middle Chinese 硫 *liw 'sulfur' corresponds to prescriptive Sino-Korean ryuw (Tongguk chŏng'un IV: 83) and natural Sino-Korean ryu, not *ri. Nonetheless I thought that perhaps some Koreanic variety that was not a source of Sino-Korean or Sino-Japanese had a different rule: borrow Chinese *-iw syllables as *-i.
**That variety of Koreanic would have been like Khitan which was phonologically more innovative than Classical Mongolian, though the latter was first attested later.
***2.21.0:09: 岩波古語辞典 (p. 1376) regarded the y- of its yuwa (is this ever attested in kana?) as a substitute for *l- which was not permissible as an initial in Old Japanese. Are there any other early Chinese loans in Japanese with y- for *l-?
13.2.19.20:59: (R)I(U)-DDLE OF THE SULFURIC SPHINX
The Battle of Iwo Jima (硫黃島 'Sulfur Island') began 68 years ago today.
Japanese 硫黃 iō < *iwaũ 'sulfur' has an unusual first syllable. In theory the word should be
*ruō < *ruwaũ (if it were a borrowing from Early Middle Chinese *lu wɑŋ*)
or *ryūkō < *riukwaũ (if it were a borrowing from Late Middle Chinese *liw xwɑŋ)
but as far as I know, i is
- the only (?) instance of a Middle Chinese *l-syllable Japanized with a zero initial
- the only (?) instance of a Middle Chinese *-iw syllable borrowed into Japanese without an -u corresponding to -w
Both of these traits might be Koreanisms. Although both Korean and Japanese share the Altaic avoidance of initial r- in native roots, standard South Korean has gone further and avoided it even in Chinese borrowings:
Chinese *l- > Korean *r- > n- (> zero before i, y)
Hence the standard South Korean word for 'sulfur' is 유황 yuhwang without r- before y-. (I specify South Korean because standard North Korean has a restored r-: e.g., 류황 ryuhwang 'sulfur'. This r- is not necessarily pronounced in actual speech. See Young-Key Kim-Renaud's observation in Lee and Ramsey 2000: 342, note 11.)
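The rule above can be sketched as a function on romanized Sino-Korean readings. This is a minimal illustration: the function name `south_korean_initial` is hypothetical, it only looks at word-initial r- and n- in romanization, and roin 'old person' (South Korean noin) is an extra example that is not taken from the text above.

```python
def south_korean_initial(reading):
    """Apply *r- > n-, then n- > zero before i or y, to a romanized reading."""
    # Step 1: initial r- is replaced by n-.
    if reading.startswith("r"):
        reading = "n" + reading[1:]
    # Step 2: initial n- (original or from r-) is dropped before i or y.
    if reading.startswith("n") and reading[1:2] in ("i", "y"):
        reading = reading[1:]
    return reading


for north_korean in ("ryuhwang", "roin"):
    print(north_korean, "->", south_korean_initial(north_korean))
```

The sketch turns ryuhwang into yuhwang in two steps (via an intermediate nyuhwang), whereas roin stops at noin because its n- is not followed by i or y.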
The 黃 ō (rather than *kō) of Japanese iō indicates an early borrowing. My guess is that some variety of Koreanic - colloquial Paekche, perhaps? - nativized a Chinese *liw as *i, and a nativized *iwaŋ was borrowed into Japanese as *iwaũ. The trouble with this explanation - besides the lack of other examples of similar nativizations - is that *liw is a Late Middle Chinese reading, and the second half of *iwaŋ is based on Early Middle Chinese *wɑŋ. Did such a mixture really exist? Or is the absence of a consonant corresponding to Late Middle Chinese *x- also due to irregular nativization?
*liw xwɑŋ > *ihwaŋ > *iwaŋ
In modern Korean, /h/ is lost between a vowel and /w/ "when the word is spoken at normal conversational speed" (Lee and Ramsey 2000: 74). This phenomenon might also have occurred much earlier in some Koreanic dialect - but after the Chinese shift of *lu to *liw. Hence the 黃 ō (< *waũ) of Japanese iō is not truly from Early Middle Chinese *wɑŋ like the 黃 ō (< *waũ) of 黃金 ōgon (< *waũkəm) 'golden', but is actually a Late Middle Chinese *xwɑŋ in disguise.
*Although others would reconstruct *ɦw- or *ɣw- as the initial of Early Middle Chinese 黃 'yellow', I reconstruct *w- for the dialect that is the source of early Sino-Japanese (i.e., 'Go-on') 黃 ō (< waũ) since a voiced back fricative would have been borrowed as /Nk/ which would later become g-: e.g., 號 *ɣɑwh ~ *ɦɑwh as early Sino-Japanese /Nkau/ (now gō 'number').
It is strange that 黃 'yellow' follows rather than precedes 硫 'sulfur' in the disyllabic Chinese word 硫黃 'sulfur'. Could this modified-modifier order reflect substratal (e.g., Hmong-Mien, Austroasiatic, or Kra-Dai) influence in a Middle Chinese dialect? (Cf. remnants of such word order in Cantonese: e.g., 雞公 gai gung 'chicken male' for 'rooster' instead of 公雞 'male chicken' as in Mandarin.)
13.2.16.23:56: EGG-XPLANATION
In "(X)egg", I asked,
Also, why does Spanish huevo have an h- absent from Latin ovum and its other descendants (e.g., Portuguese ovo, French oeuf, and Italian uovo)?
David Boxenhorn sent me a link to the answer:
The letter H is always mute [in Spanish]. It is written [...] orthographically [but not etymologically] in initial hue- (in the traditional graphic there was not distinction between U and V and writing without initial h- could suggest the reading ve-), cf.:
hueso (<= L. ossum) bone, huevo (<= L. ovum) egg;
So h was a consonant letter that indicated the absence of a consonant phoneme (i.e., /v/).
龚勋 Gong Xun independently sent me the same solution and reminded me about how Spanish hu- stands for /w/: e.g., in Nahuatl [ˈnaːwatɬ] (in Nahuatl pronunciation).
All this made me realize why French huit 'eight' has a nonetymological h-. Wiktionary confirmed my guess:
From Old French uit, from Latin octō, the h was added to avoid confusion with vit.
But I still don't understand what's going on with 'egg' in Iranian:
Indo-Iranian: *āwya-(ka-)*
Indic: Sanskrit vi- 'bird'
Iranian (no Proto-Iranian form at Wiktionary)
x- (Persian)
East Iranian:
Ø- (Avestan, Khotanese, Ossetic)
h- (Pashto)
x- (Yaghnobi)
y- (Khwarezmian; y- was added "irregularly" to initial ā- according to Henning 1977: 489)
Did some dialect(s) of Proto-Iranian develop an initial x/h-?
*2.17.4:55: I don't know how *-ka- can be reconstructed at the Proto-Indo-Iranian level since Sanskrit vi- 'bird' doesn't contain the suffix -ka-. Is there an Indic reflex of *āwya-ka- that I don't know? Or is *āwya-(ka-) a Proto-Iranian rather than a Proto-Indo-Iranian reconstruction?
13.2.9.23:54: R-EDUNDANCY R-EDUCTION IN 'THREE' AND 'FOUR'
I have long been puzzled by the feminine stems of 'three' and 'four' in Sanskrit:
| gloss | masculine/neuter | feminine |
| 'three' | tri- (weak), traya- (strong) | tisra- |
| 'four' | catur- (weak), catvār- (strong) | catasr- |
I used to think that the -s- was inserted into the masculine/neuter stems:
tri- > ti-s-r-a- (with metathesis of i?)
catvār- > cata-s-r-
Years later, I read in Beekes (1995: 212) that
The [Proto-Indo-European] element -s(o)r- [in 'three' and 'four'] also perhaps appears in Hitt[ite] hassu-sara- 'queen' (from hassu- 'king')
but that didn't explain why tri- became ti- or catur- became cata-.
Today I realized something that eluded me for twenty years: the first -r- was dropped to avoid two -r-s in the same word:
Pre-Proto-Indo-European *tri-ser-* > Proto-Indo-European *tiser- > Skt tisras-
Proto-Indo-European *kʷetur-sr- > Skt catasr-
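The dissimilation proposed here is simple enough to state as a one-step rule. A minimal sketch, with the hypothetical function name `drop_first_r`, applied to the reconstructions above written without asterisks:

```python
def drop_first_r(form):
    """Delete the first r of a form that contains two or more r's."""
    if form.count("r") >= 2:
        first = form.index("r")
        return form[:first] + form[first + 1:]
    return form


for form in ("tri-ser-", "kʷetur-sr-", "tri-"):
    print(form, "->", drop_first_r(form))
```

Applied mechanically, the rule gives ti-ser- and kʷetu-sr- and leaves tri- alone, so by itself it keeps the *u of 'four' in place.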
This cannot be the whole story, though. It still doesn't account for the -a- in catasr- instead of the expected *catusr-. The unusual -a- cannot be a Sanskrit innovation because it is also in Avestan cataŋr- 'four' (f.; -ŋ- is from *-s-). Was *ur irregularly reduced to a syllabic *ṛ that became a to avoid two rs in the same cluster?
*Beekes (1995: 212) reconstructed Proto-Indo-European *-ser- in 'three' but *-sor- as the suffix that became Hittite -sara-.
2.10.0:17: Skjærvø (2003: 207) wrote:
The [Avestan] element -šr-/-ŋr- is an ancient suffix found in the fem. forms of the numerals "3" and "4" in several Indo-European languages. It may be related to strī- (< *srī-) "woman" and -ŋhar- in xvaŋhar- "sister" (if originally *xva-har- "one's own woman"?)
Avestan xvaŋhar- goes back to Proto-Indo-European *swesor-, so 'one's own woman' is not an Avestan innovation.
Wikipedia's entry for Proto-Indo-European *swésōr (the nominative singular with a long vowel in the second syllable) has a different etymology (I have rewritten the laryngeals using Beekes' [1995] values):
Possibly a compound of reflexive pronoun *swé (“self”) and *ʔésʕr̥ (“blood”), so literally “woman of one's own kin group” in an exogamous society.
Is there any evidence for laryngeals in 'sister', or were they lost without a trace?
13.2.3.23:45: UN-SHYR-TAIN EVIDENCE FOR RETROFLEXION IN TANGUT PERIOD NORTHWESTERN CHINESE
While checking the entry for
2gii '(to) hope'
in Li Fanwen (2008: 770), I noticed that the tangraph in the adjacent entry
1ʂɨəʳ 'the surname Shyr'
was used as a transcription character for Chinese syllables that are now pronounced shi [ʂr̩] in modern Mandarin: 十什失實室涉. For years I assumed that these syllables would have been pronounced *ʃi in the northwestern Chinese dialect known to the Tangut. I thought that *ʃ did not become retroflex ʂ until the last few centuries due to a chain shift:
Stage 1: *ʃ > ʂ
Stage 2: *s > ɕ (not quite ʃ, but close)
However, if the Tangut transcribed such syllables with a retroflex vowel according to most current reconstructions*, that might be evidence for reconstructing them with a retroflex initial *ʂ and possibly even a retroflex syllabic *r̩. I need to dig deeper before drawing any firmer conclusions.
2.6.2:27: In Gong's 1991 study of transcriptions of Chinese in the Tangut translation of 類林 The Forest of Categories, tangraphs read with retroflex vowels (i.e., with rhymes 77-103) are completely avoided with four exceptions:
1ʂɨəʳ (rhyme 92): 什涉寔 ?*ʂr̩ (see above)
1thwəʳ (rhyme 90): 盾 *thwə̃ < Middle Chinese *don
1mɔʳ (rhyme 96): 邈 *mɔ < Old Chinese *mrakʷ
2lwəʳ (rhyme 90): 論 *lwə̃ < Middle Chinese *lon(h)
There are no Tangut rhymes ending in -ə̃. Was -əʳ an attempt to approximate Chinese *-ə̃?
Is the correspondence between -ɔʳ and Old Chinese medial *-r- coincidental? I doubt that *-r- had left a trace as retroflexion over a millennium later in the Chinese dialect known to the Tangut.
The general avoidance of retroflex vowel transcription tangraphs indicates that the Chinese dialect known to the Tangut lacked retroflex vowels and that Tangut had not merged retroflex and plain vowels. (If such a merger had occurred, tangraphs with retroflex as well as plain vowels would frequently transcribe Chinese syllables.)
*2.26.2:29: Gong (1997) and Arakawa (1999) also reconstructed a retroflex vowel in 'the surname Shyr'. However, Sofronov's new 2012 reconstruction still completely lacks retroflex vowels:
| Tangraph | Gong | Arakawa | Sofronov | This site |
| ![]() | 1śjɨr | 1shI:r | 1-ə̂, 1-jə, 1-ɪ? | 1ʂɨəʳ |
I do not know how Sofronov would reconstruct the initial.
Sofronov reconstructed rhyme 92 three different ways. I do not know which reconstruction he would choose for this tangraph.
13.2.2.23:56: A HOPEFUL FAMILY OF FOUR
The tangraph (Tangut character) 2ʂwii 'to need, want, require' consists of two halves whose functions and sources are unknown:
2ʂwii 'to need, want, require' = ? + ?
It combines with other components to form three other tangraphs:
'top' or 'female' + 2ʂwii 'to need, want, require' = 1võ 'to wish' (< Chinese 望)
+
=
? + 2ʂwii 'to need, want, require' = 2gii '(to) hope'
+
=
? + 2ʂwii 'to need, want, require' = 2kiʳw 'wide, roomy*, shoe last'
If I did not know any Tangraphic Sea analyses, I might think that the structure of these characters could be explained by assuming that they reflect the structure of 'Tangut B', a hypothetical non-Sino-Tibetan language implied by the script but otherwise leaving no direct traces (!):
| Tangraph | Tangut (A) reading | Tangut B reading (A-E are algebraic symbols) |
| (image) | 2ʂwii | AB 'want' |
| (image) | 1võ | CAB 'wish' (verb AB 'want' with prefix C-) |
| (image) | 2gii | DAB 'hope' (verb AB 'want' with prefix D-) |
| (image) | 2kiʳw | EAB (adjective E 'wide' plus suffixes -A, -B or a suffix resembling Tangut A 2ʂwii) |
However, I don't think Tangut B is necessary to explain this set.
The first tangraph vaguely resembles the Chinese characters 欲 'desire' (cursive forms) and 須 'need' (cursive forms). Could it be a distortion of a whole Chinese character resembling two components found in other tangraphs?
The second tangraph is analyzed as a semantic compound in the Tangraphic Sea:
1võ 'to wish' = top of 1kiụ 'to pray' + all of 2ʂwii 'to need, want, require'
The analyses of the other two tangraphs are unknown, but I suspect they might be
2gii '(to) hope' = top of 1dzɨu 'love, affection' + all of 2ʂwii 'to need, want, require'
2kiʳw 'wide, roomy, shoe last' = right of 1zie 'wide' + all of 2ʂwii 'to need, want, require'
Could 'shoe last' be a derived meaning of 2kiʳw 'wide': i.e., 'the thing as wide as a foot'? There is a disyllabic word for 'shoe last' including 1zie 'wide':
1zie 2kiʳw
*These first two glosses are from Kychanov and Arakawa (2006: 677). I don't know of any textual evidence for them.
13.2.1.8:30: (X)EGG
Watkins (2011: 7) regarded Proto-Indo-European *ʕōw-yo- 'egg' (> Latin ovum, English egg; more descendants at Wiktionary) as a possible derivative of PIE *ʕewi- 'bird' (> Latin avis). PIE *ʕ normally becomes zero in Iranian: e.g., in Iron Ossetic 'egg' is айк ayk. So why is 'egg' خایه xāye with x- in Persian?
Also, why does Spanish huevo have an h- absent from Latin ovum and its other descendants (e.g., Portuguese ovo, French oeuf, and Italian uovo)? Hypercorrection: i.e., adding a nonetymological silent h- out of fear of omitting an etymological silent h-?
13.2.1.2:11: DARKENING (ALPHACISM?) IN NORTH FRISIAN AND SOUTHEAST ASIA
The Wikipedia article on North Frisian describes the lowering of ɪ to a as "vowel reduction", but that term makes me think of the reduction of vowels in unstressed syllables, not stressed monosyllabic content words like 'fish': e.g., Mooring North Frisian fasch and Fering-Öömrang North Frisian fask corresponding to English fish, German Fisch etc. I prefer the term darkening, an antonym of brightening which Matisoff used in his 2004 paper on raising and fronting of *a in Tangut: e.g.,
1mi 'not' < *ma (cf. Old Chinese 無 *ma 'not exist')
I also made up the term alphacism by analogy with terms like zetacism.
Whatever it's called also occurred in the ancestor of Cantonese: e.g.,
立 'stand'
Middle Chinese *lip > Mandarin li, Sino-Korean rip
but Cantonese lap
The largest stratum of Sino-Vietnamese was borrowed from a southern Late Middle Chinese dialect that was in transition: e.g., 'stand' is SV lập [ləp].
Similarly, there are transitional North Frisian dialects with partial darkening: e.g., Söl'ring fesk 'fish'.
Conversely, Tangut has partly as well as fully brightened forms: e.g., 'not exist' is
1mie < *ma (cf. Old Chinese 無 *ma 'not exist')
with a partly mid diphthong ie instead of the high monophthong i.
13.2.1.1:12: GERMAN -RR- : SATERLAND FRISIAN -DD-
I have been reading about Frisian lately. This photograph in the Wikipedia article on Saterland Frisian caught my eye because the German place name Scharrel corresponds to Saterland Frisian Schäddel. Is this a regular or even an occasional correspondence? If I knew nothing about German, I might guess that Saterland Frisian preserves a [d] that lenited to [r] between vowels in German. However, there is no such lenition in German. Was there a fortition in Saterland Frisian, or is this an irregular correspondence defying explanation? Could the two be separate attempts to represent some (substratal?) third name?
13.1.31.1:21: 'HTER
I initially thought Slovene hči 'daughter' couldn't be a descendant of Proto-Indo-European *dhugʕtḗr 'daughter', but I noticed that its oblique stem hčer- ended in -er, just like the oblique stem mater- of mati 'mother' which is obviously from PIE *méʕtēr. Then I found that Wiktionary derives hči from Proto-Slavic *dŭtʲi which is of course from PIE *dhugʕtḗr. č is obviously from *tʲ, but how did *d become h?
One might expect other cases of Slovene h- corresponding to d(h)- in other Indo-European languages*, but at a glance it seems that the normal Slovene reflex of *d(h)- is d-: e.g.,
| Gloss | Proto-Indo-European | Sanskrit | Russian | Slovene |
| two | *duoʔ | dvā́ | dva | dva |
| ten | *déḱmt | dáśa | desjat' | deset |
| to give | √*deʕʷ | √dā | dat' | dati |
| to place | √*dheʔ | √dhā | det' | deti |
| smoke | *dhuʕmós | dhūmá- | dym | dim |
An unusual consonant in 'daughter' is not unique to Slovene in South Slavic:
Serbo-Croatian kći
Macedonian ḱerka [cɛrka]
Bulgarian is the 'odd man out' because its dǎšterja ironically looks more 'normal' from a general IE perspective. (Do Bulgarian dialects have Macedonian-like forms?)
One might think that Slovene h-, SC k-, and Macedonian ḱ are traces of PIE *-gʕ-, but that cluster was gone in Proto-Slavic *dŭtʲi. (5:36: Or was it? The Wiktionary page for Proto-Indo-European *dhugʕtḗr 'daughter' lists the Proto-Slavic word for 'daughter' as *dŭkti with *-k-, though clicking on the link for that word goes to the entry for *dŭtʲi without *-k-.)
I think the Macedonian form is from *tʲer-ka < *dtʲer-ka: cf. in West Slavic how Polish córka 'daughter' corresponds to Czech dcera and Slovak dcéra.
However, I can't explain Slovene h- or SC k-. I guess Slovene h- might be from *k-, but where did that *k- come from? Is it the result of dissimilation and assimilation?
*dtʲ- > *gtʲ- > *ktʲ-
Assimilation not preceded by dissimilation would have led to a geminate that probably wouldn't have been dissimilated:
*dtʲ- > *ttʲ- (> *ktʲ- doubtful)
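The ordering argument can be spelled out in a small Python sketch. It treats each change as a plain string replacement applied in sequence. The rule names and the `apply_rules` helper are my own illustrative labels, and the rule formulations are assumptions based only on the two derivations above.

```python
TJ = "tʲ"  # the palatalized stop *tʲ that follows *d- in 'daughter'

# Each rule is an (old, new) pair; the names are illustrative labels only.
DISSIMILATION = ("d" + TJ, "g" + TJ)         # *d gives up its coronal place before *tʲ
VOICING_ASSIMILATION = ("g" + TJ, "k" + TJ)  # *g devoices before voiceless *tʲ
TOTAL_ASSIMILATION = ("d" + TJ, "t" + TJ)    # *d devoices with no prior dissimilation


def apply_rules(form, rules):
    """Apply the replacements in order and return every stage, input included."""
    stages = [form]
    for old, new in rules:
        form = form.replace(old, new)
        stages.append(form)
    return stages


if __name__ == "__main__":
    start = "d" + TJ
    # Dissimilation first: the cluster ends up with a velar.
    print(" > ".join(apply_rules(start, [DISSIMILATION, VOICING_ASSIMILATION])))
    # Assimilation first: the geminate no longer matches either later rule.
    print(" > ".join(apply_rules(start, [TOTAL_ASSIMILATION, DISSIMILATION, VOICING_ASSIMILATION])))
```

Under these assumed formulations, the second run stalls at the geminate because neither later rule has anything left to apply to.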
Bogadek's (1944) Croatian dictionary lists kćeti as a variant of htjeti 'to want'. In this case k must be from h rather than the other way around because it is from Proto-Slavic *xŭtěti (cf. Russian xotet', etc.).
ADDENDUM: The Interslavic word for 'daughter' is dočera. This word is transparent to Russian speakers who will recognize it because it resembles the oblique stem dočer- of doč', but I wonder how recognizable it would be to West and South Slavic speakers. Would they guess from context that moj syn i moja dočera means 'my son and my daughter'? (Syn is pan-Slavic with minor variations: e.g., Slovene and Serbo-Croatian sin.)
*Such a correspondence is rare but not impossible. Toisanese h- is partly from *dh-: e.g., 唐 hɔŋ 'Tang Dynasty' from Late Middle Chinese *dhaŋ.
13.1.30.23:15: THE AFRICAN ARGUMENT FOR PROTO-INDO-EUROPEAN EJECTIVES
I have long been bothered by the glottalic theory for Proto-Indo-European because I didn't know of a language whose voiceless ejectives had become voiced plain stops. But on Monday I learned of two languages with voiced allophones of ejectives.
Judging from these examples, Blin in Eritrea has voiced allophones of ejectives intervocalically and before voiced consonants. However, /kʷʼ/ deviates from this pattern: it is voiced initially in the same position as /kʼ/ [kʼ] and is debuccalized intervocalically. I presume /kʷʼ/ surfaces as [kʷʼ] in other positions: e.g., perhaps before voiceless consonants.
In Kwa'dza in Tanzania, "/kʼ/ and /kʼʷ/ are voiced [ɡ, ɡʷ] if a preceding consonant is voiced" according to Christopher Ehret (1980). Why aren't other ejectives like /tsʼ/ voiced in that position? Because they're affricates? I am reminded of how the Korean fricative /s/ doesn't voice intervocalically unlike the stops /k t p/ and the affricate /c/.
I also found this passage in the Wikipedia article on ejectives (emphasis mine):
In the languages where they are more obvious, ejectives are often described as sounding like "spat" consonants [which is what they sound like to me]; but ejectives are often quite weak and, in some contexts, and in some languages, are easy to mistake for tenuis or even voiced stops.
So it would not be surprising for Proto-Indo-European ejectives to become plain voiceless stops in Hittite, Tocharian, and Germanic or voiced stops in the other branches. I almost included Armenian, but on Monday I also learned that the Armenian reflexes of Proto-Indo-European stops are a lot more complex than I thought. More on them later.
13.1.27.15:00: PRE-ʔG-LOTTALIZATION, NOT ME-TATHES-I-S
This morning I had a dream about Armenian which made me check Beekes' (1995: 130) table of Proto-Indo-European, Proto-Germanic, and Armenian consonants. Along the way I rediscovered this passage on page 133 (emphasis mine):
... the glottal feature probably preceded the consonant: it was pre-glottalized, 'p, etc. Understood in this way, Lachmann's law for Latin is explained. This law states that a PIE [Proto-Indo-European] voiced (non-aspirated) stop b d g gʷ before a consonant lengthened a preceding vowel: for example: ag-ō, āc-tus, but veh-ō : vĕc-tus (with *ģʰ). The solution is that the glottal stop (ʔ) of the g = kʼ lengthens the preceding vowel: aʔg-tos > āctus. The glottal stop works, then, in the same way as a laryngeal [which leaves a trace as vowel length]: cf. eh2C > āC. One of the laryngeals was probably a glottal stop.
On page 126, Beekes identified the three laryngeals of PIE (*h1, *h2, *h3) as *ʔ, *ʕ, and *ʕʷ, so PIE *eh2C could be rewritten as *eʕC.
(Does any language have ʕʷ without ʔʷ? UPSID lists only one language with ʔʷ - Kabardian - and no languages with ʕʷ. I think Old Chinese might have had both. Modern Vietnamese has [ʔw], but that's a cluster /ʔ/ plus /w/, not a unit phoneme.)
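The notational rewrite of *eh2C as *eʕC can be done mechanically. Here is a minimal Python sketch, assuming the three values Beekes gave on page 126. The `to_phonetic` name is hypothetical, and the replacement is naive: it would also rewrite any unrelated "h1", "h2", or "h3" in a string.

```python
# Beekes' (1995: 126) values for the laryngeals, as cited above.
LARYNGEALS = {"h1": "ʔ", "h2": "ʕ", "h3": "ʕʷ"}


def to_phonetic(form):
    """Replace each h1/h2/h3 in a reconstruction with its proposed phonetic value."""
    for symbol, value in LARYNGEALS.items():
        form = form.replace(symbol, value)
    return form


if __name__ == "__main__":
    print(to_phonetic("*eh2C"))  # the sequence from Beekes' page 133
```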
If PIE 'voiced' stops (in the traditional reconstruction) are reinterpreted as preglottalized stops, then PIE *ʔeǵHom from "Me-tathes-I-s" could be reinterpreted as *ʔeʔǵHom and the second glottal stop was preserved in Kortlandt's (2012: 2) Proto-Balto-Slavic (PBS) *ʔeʔźun 'I'.
The *H after *ʔǵ apparently vanished without a trace in Proto-Balto-Slavic. This *H is needed to account for Sanskrit -h- in aham 'I'. Beekes (1990: 307) mentioned that Rix (1976: 177) reconstructed that uncertain laryngeal as *h2 and guessed that Rix thought only *h2 could condition aspiration in Indo-Iranian. Lunt (2001: 232) also reconstructed *h2, citing Greek ego, Latin ego, and Sanskrit aham, but not explaining his reasoning.
Maddieson (2011) wrote that sounds similar to implosives "are often referred to as “pre-glottalized voiced stops” by linguists working on Asian and Pacific languages." Could traditional PIE *b *d *g *gʷ be reinterpreted as implosives ɓ ɗ ɠ ɠʷ? I doubt it, because languages with implosives tend to have ɓ but not ɠ (e.g., Vietnamese), whereas PIE has the opposite pattern: traditional b is rare, but traditional g is common.
I also happened to open Beekes' (1988) Gatha Avestan grammar today and find a section on preglottalization on page 71:
In Indo-Iranian the preglottalization was still present at the time of Lubotsky's Law (see 53.2) and is preserved in modern Sindhi.
Modern Sindhi does have implosives in addition to the usual four types of Indic stops: e.g., ɓ as well as p ph b bh. Khubchandani (2003: 627) mentioned that Turner (1924) derived Sindhi implosives "from geminated voiced plosives": e.g., əɠʊ 'before' < Sanskrit agra- 'top, front' (with an intermediate stage like Pali agga). However, this does not account for initial implosives: e.g.,
ɗ̣ohʊ 'fault' : Sanskrit doṣa- 'id.'
ɗ̣əhə 'ten' : Sanskrit daśa 'id.'
ɗ̣is- 'see', ɗ̣ekh- 'show' : Sanskrit √dṛś 'see'
ɗ̣y- 'give' < *dH; cf. Sanskrit √dā < *deʕʷ 'id.'
ɗ̣əndʊ 'tooth' : Sanskrit danta- 'id.'
Note how the Sindhi reflexes of Sanskrit d are retroflex (indicated with the non-IPA subscript dot) as well as implosive. Does Beekes view Sindhi implosives as conservative? Ah, I see now. Kortlandt (2012: 1) wrote (emphasis mine):
I have argued that the Sindhi preglottalized voiced stops are an archaism (2010: 121-124). In this language, the unconditioned reflexes of the d and dh series are glottalic and aspirated, respectively, while dissimilation of the dh series before aspirates of recent origin has given rise to a plain voiced series, e.g. ’gāhu ‘bait’ < grāsa versus gāhu ‘fodder’ < ghāsa-. The glottalic articulation cannot be attributed to external influence because the surrounding languages do not present anything comparable.
One could derive ’gāhu from *ggāsa with an initial geminate from *gr, but that cannot account for the implosives corresponding to Sanskrit nongeminates in 'fault', etc. above.
I would add that loans from Sanskrit are also a source of voiced stops without preglottalization: e.g., Sanskrit duḥkha- 'pain' corresponds to native Sindhi ɗ̣ʊkhʊ and was borrowed as dʊkhʊ (forms from Khubchandani 2003: 637).
I found that article when looking for Kortlandt's 2004 article about preglottalization in English and Scandinavian. Could this phenomenon be a remnant from Proto-Indo-European?
When a phoneme is accompanied (either sequentially or simultaneously) by a [ʔ], then one speaks of pre-glottalization or glottal reinforcement. This is common in most varieties of English, RP included; /t/ and /tʃ/ are the most affected but /p/ /k/ also regularly show pre-glottalization. In the English dialects exhibiting pre-glottalization, the consonants in question are usually glottalized in the coda position. E.g. "what" [ˈwɒʔt], "fiction" [ˈfɪʔkʃən], "milkman" [ˈmɪlʔkmən], "opera" [ˈɒʔpɹə]. To a certain extent, there is free variation in English between glottal replacement and glottal reinforcement.
This makes me wonder if those preglottalized voiceless consonants were ever voiced in pre-Germanic. Did Proto-Germanic preserve Proto-Indo-European *ʔt, etc.?
The preglottalization of English codas reminds me of the use of บ <ʔb> and ด <ʔd> for final [p] and [t] in Thai. Were those codas preglottalized when the Thai script was created? (There is no <ʔg> in Thai, so final [k] is written as ก <k> in native words. Final [k] in loanwords is written etymologically: e.g., Thai [roːk] 'disease' from Sanskrit roga- 'id.' is written as โรค <rōg>.)
13.1.26.17:26: 'FREEDOM' AND 'LIBERTY' IN TJK
The pan-Sinospheric word for 'freedom' and 'liberty' is 自由, which could be interpreted as 'self-reason'. I don't know if Tangut, Jurchen, and Khitan had words for the concept.
Maybe such a word is in Kychanov and Arakawa's 2006 Tangut dictionary, but I can't search for Russian, English, or Chinese glosses. So the best I can do is calque the Sinospheric word in Tangut as
2səu 1ʔiew 'self-reason'
By coincidence, 2səu 1ʔiew is vaguely like *sɨ jɔ, the Old Vietnamese pronunciation of 自由, now tự do 'freedom'.
I don't know of any cognates for 2səu < *Cʌ-suH. There are two 2səu 'self' in Tangut which are derived from each other in the Precious Rhymes of the Tangraphic Sea:
2səu 'self' (Li Fanwen 2589) = [left and right of] 2səu 'self' + [right of] 2səu 'to plot, scheme, conspire'
2səu 'self' (Li Fanwen 2588) = left and right of 2səu 'self' + right of 2səu 'to plot, scheme, conspire'
The second analysis is strained, as the right side of 2səu 'to plot' (alphacode: jen) does not match the center of 2səu 'self' (alphacode: dax) which is apparently unique.
Perhaps 2səu 'to plot' is phonetic in both 'self' tangraphs and its abbreviation jen has been simplified in one of them to dax. But what are the functions of the 'water' element on the left (alphacode: cir) and the right-hand element (alphacode: dil)?
And what was the difference between the two 2səu meaning (?) 'self'?
The Precious Rhymes of the Tangraphic Sea defined LFW 2588 as 'Tangut person; we*' and LFW 2589 as 'acquiring profit' (i.e., 'self-interest'?).
Nishida (1966: 417) had no definition for LFW 2588 and defined LFW 2589 as 'dear person, husband, friend'.
Grinstead (1972: 136) also had no definition for LFW 2588 and defined LFW 2589 as '(lover)'.
Shi et al. (2000: 198) defined LFW 2588 as 'mutual assistance' and LFW 2589 as 'acquiring profit'. (The parts of speech are unclear.)
Kychanov and Arakawa (2006: 322) defined LFW 2588 as 'myself, my' and LFW 2589 as 'influence, authority'.
Li Fanwen (2008: 426) defined LFW 2588 as '(used before disyllabic verbs) self, oneself' (without any examples before a disyllabic verb) and LFW 2589 as 'self, I'.
I think 1ʔiew 'reason' could be a loan from Chinese 由 'id.' The analysis of its tangraph is circular:
1ʔiew 'reason' = left of 2niee 'heart' + left of 2ʔiew 'doubt' (phonetic) + left of 1nɔ̃ɔ̃ 'reason'
1nɔ̃ɔ̃ 'reason' = right of 1ʔiew 'reason' + left and center of 1nɔ̃ɔ̃ 'after, beside, too, and' (phonetic)
Could 2ʔiew 'doubt' be from 1ʔiew 'reason' plus a suffix *-H that conditioned the 'rising' (i.e., 'second') tone? I, um, doubt it, though the English phrase reasonable doubt comes to mind.
anakv.com translated Chinese 自由 'freedom, liberty' as Manchu sulfan, and one could antiquate this as Jurchen *sulpan. Norman translated its root sulfa (< Jurchen *sulpa) as 'at leisure, leisurely, idle, free, at ease, without cares, loose'. I don't know how *sulpan would have been written in Jurchen, as I cannot find any <sul> or <pan> in Jin (1984). Perhaps those syllables were written with two of the hundred or so Jurchen characters with unknown readings.
As for Khitan, one could calque Written Mongolian erke cilüge (modern standard эрх чөлөө), literally 'power space'.
*1.26.17:52: Shi et al. (2000: 198) translated the third and fourth characters of the Precious Rhymes of the Tangraphic Sea definition (LFW 0385 and 2065) as 'know' and 'help' but Li Fanwen (2008: 426) interpreted the difficult-to-read third character as 5091; 5091 and 2065 together mean 'we' (inclusive; Gong 2003: 607).
13.1.25.8:10: (I)OSÍJA
I wasn't sure if the first letter in the name of the subject of this icon was И <I> or Й <J>. David Boxenhorn found a site listing it as Иосия <Iosija>. One mystery solved; others remain:
- The page calls him Пророк Иосия <Prorok Iosija> 'Prophet Josiah' (not Hosea, contra Wikipedia!). But was Josiah a prophet?
- Is the second 'word' on the top left пророкъ <prorokŭ> 'prophet'? The two vertical lines on the left might be п <p>, the next г-like letters might be р <r>, the penultimate letter might be a tiny о <o>, and the last letter might be ъ <ŭ>. But those matches are weak and I don't see matches for the rest.
- Is the first 'word' on the top left (partly) a Cyrillic numeral? I see what might be a titlo atop the second and third characters. But ІИ (or ГИ?) don't make sense as Cyrillic numerals. 18 is ИІ (= 8 + 10), not ІИ (= 10 + 8), and ГИ is 3 next to 8, not 38 or 83. And what is the first character? А? Л? How does it relate to the following characters?
So apart from correcting the name <Iosija>, I'm still more or less where I was yesterday.
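The numeral reasoning in the last bullet can be checked with a rough Python sketch. It assumes the usual Cyrillic rule that letters run from higher to lower decimal places, except that in 11-19 the unit precedes І (10). The table covers only six letters (the ones mentioned above plus Л and П for the 38 and 83 cases), and it ignores the titlo and the thousands sign. The `read_numeral` name is hypothetical.

```python
# Escapes keep these Cyrillic letters from being confused with look-alike Latin ones.
VALUES = {
    "\u0410": 1,   # А
    "\u0413": 3,   # Г
    "\u0418": 8,   # И
    "\u0406": 10,  # І
    "\u041b": 30,  # Л
    "\u041f": 80,  # П
}


def read_numeral(letters):
    """Return the value of a Cyrillic numeral, or None if the letter order is invalid."""
    values = [VALUES[ch] for ch in letters]
    if len(values) >= 2:
        if values[-2] < 10 and values[-1] == 10:
            # 11-19: the unit is written before the ten, so swap into tens-units order.
            values[-2], values[-1] = values[-1], values[-2]
        elif values[-2] == 10 and values[-1] < 10:
            return None  # ten before unit, e.g. ІИ
    places = [len(str(v)) for v in values]  # 1 = units, 2 = tens
    if any(a <= b for a, b in zip(places, places[1:])):
        return None  # two letters of the same decimal place, e.g. ГИ
    return sum(values)


if __name__ == "__main__":
    for numeral in ["\u0418\u0406", "\u0406\u0418", "\u0413\u0418", "\u041b\u0418", "\u041f\u0413"]:
        print(numeral, read_numeral(numeral))
```

On this reading, ИІ comes out as 18, while ІИ and ГИ are rejected; 38 and 83 would need ЛИ and ПГ.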
13.1.24.7:00: ME-TATHES-I-S?
In "Balto-Slavic personal pronouns and their accentuation" (2012: 2), Frederik Kortlandt reconstructed Proto-Balto-Slavic (PBS) *ʔeʔźun 'I' corresponding to Sanskrit aham 'id.'
I was surprised by PBS medial *-ʔ-. Is it a metathesized reflex of Proto-Indo-European *H (an uncertain laryngeal)?
PIE *ʔeǵHom > PBS *ʔeʔźun
What is the evidence for this metathesis and for the continued presence of a laryngeal in PBS?
Also, did initial *ʔe(ʔ)- (which was accented unlike Sanskrit a-) regularly break to Proto-Slavic *ja-? (I am reminded of how Korean *e broke to yə, though this change was not confined to initial position: e.g., Sino-Korean 西 *se 'west' became 셔 syə [now 서 sŏ].)
Kortlandt's PBS oblique plural stem *noʔs- 'us' is also puzzling since it has a glottal stop corresponding to nothing in PIE *n(o)s-. Its second person counterpart PBS *woʔs- also has an unexpected glottal stop absent from PIE *u(o)s-. *-ʔ- in *woʔs- could be influenced by the glottal stop in *juʔs 'you' (nom. pl.), but there is no glottal stop in *mes 'we' (nom. pl.) to serve as a model for a glottal stop in *noʔs-. Maybe the glottal stops spread from the dual forms which have laryngeals in Kortlandt's PBS and PIE reconstructions, but I would expect duals to be remodeled after more common plurals rather than the reverse.
No, wait, I see that Kortlandt reconstructed intermediate stages *iʔnsme and *uʔsme for the PBS first and second person plural accusatives (later replaced by the genitives which became the forms in the previous paragraph). I would have expected PBS *ʔinsme and *ʔusme from PIE *nsme and *usme. More metathesis?
Was Kortlandt proposing tonogenesis here?
Since the acute of *noʔs and *woʔs [...] originated from the initial zero grade of *nsme and *usme [which became *iʔnsme and *uʔsme with secondary glottal stops; see above] while the acute of *tuʔ, *juʔs, dual *weʔ, *juʔ, *noʔ, *woʔ is of laryngeal origin
He seemed to be deriving acute accent from both primary and secondary glottal stops. The Vietnamese sắc tone (which happens to be written with an acute accent!) arose in syllables with voiceless initials and final glottal stops.
13.1.24.6:30: (J)OSÍJA
Why does this 18th century Russian icon of Hosea have ЙОСИЯ <JОСÍJA> with <J> if his Russian name - at least today - is Осия <Osija> and the original name was הושע <hwš`> without any <y>? Was the acute intended to represent the stress in Russian Оси́я?
I've been puzzled by the text beneath his name since Sunday. I think I've finally figured it out:
Се Богъ 'Behold God' (with the о inside the Б)
нашъ 'our'
и не прі 'and not will-be-ap-
ложітся '-pended'
I have normalized the capitalization. I don't understand the function of the accent marks which I have left out of my transcription. I assume the two dots on Ї are decorative and have nothing to do with the Ukrainian letter ї <ji>.
After Googling, I would expect the illegible text at the bottom to correspond to
инъ къ Нему 'other to Him'
so the whole phrase means something like
'Behold our God, and no other will be appended to Him.'
cf. the King James Bible, Baruch 3:35:
This is our God, and there shall none other be accounted of in comparison of him
but it doesn't look like it. It has two words with three and five letters, not three words with three, two, and four letters. And I still can't make out the text at the top left.
13.1.23.7:49: _IPPORAH AND _ION
I wondered if the voiced -zz- in English Nebuchadnezzar was due to an intervocalic [z] pronunciation of Latin -s- in Latin Nabuchodonosor, but David Boxenhorn mentioned a counterexample: Zipporah from Hebrew צפורה <ṣpwrh>. Initial ṣ- is obviously not intervocalic. The Z- has no precedent in Latin Seffora or Greek Sepphōra. Dutch and German Zippora also have Z- (though I presume the Dutch Z is [z] whereas the German Z is [ts]).
On Monday I found another example of this z : צ <ṣ> correspondence. Years ago I was surprised to learn that 'Zion' was Shion in Japanese. Now I know that the original was ציון <ṣywn> with a voiceless initial consonant. The OED lists S-forms in English up to the 18th century when S- and Z-forms coexist; Zion dominates thereafter. Zion is in the King James Bible of 1611.
According to Wikipedia,
The commonly used form [I presume this refers to Zion] is based on German orthography, where z is always pronounced [t͡s]
Is that true? A footnote leads to Dixon (1853: 132) which does not mention German:
Whether from a wish to be unlike the [Catholic] church, which they had abandoned, even in this slight matter, or from an anxiety to exhibit their acquaintance with the Hebrew text, the first Reformers, in their translations of the bible, rejected the established orthography of the scripture names, substituting for it another, which was modelled upon the Masoretic reading of the Hebrew text. Hence has arisen such a frequent discrepancy between Catholic and Protestant bibles - and of course between Catholic and Protestant writers - in the spelling of these names. The Catholic will say Elias, Eliseus, Sion, whilst the Protestant, following his bible, will say Elijah, Elisha, Zion, and so of a vast number of names of persons and places [...] However, James [...] had still sense enough to perceive that if the principle of the Reformers, in this particular, were fully carried out, it would make their translation ridiculous in the eyes of the people, who, perhaps, would be even provoked to laughter at hearing of the five books of Mosheh, the strength of Shimshon, or the wisdom of Shelomoh. Now the translators, as far as the principle, which guided them in this matter, was concerned, had just the same right, and no more, to change Elias into Elijah, and Josaphat into Jehosaphat, that they had to change Moses into Mosheh, Samson into Shimsohn [sic], or Solomon into Shelomoh : but the reader now understands why James thought fit to limit the operation of their principle.
So it seems that only less common names were changed to be closer to Hebrew. But I still don't know if the choice of z for צ <ṣ> was due to German influence. (How old is the affricate pronunciation of צ <ṣ>?)
Are Dutch and Danish (but not Norwegian or Swedish) Zion due to German influence even though z is not [ts] in those languages? (Danish also has Sion like its sister Scandinavian languages.)
Is 'Zion' really Zion in Haitian Creole even though it is Sion in French?