18.104.22.168:57: AURAL DOUBLES (PART 2)
A recap of part 1: Tangut had two syllables with similar fanqie ('to hear' as an initial speller plus a rhyme 20 final speller):
3369 1mia 'transcription character for Skt ma, mā' and sixteen homophones = 5026 1mi 'to hear' + 3853 1tia 'topic marker'
5025 2mia 'transcription character for Skt mya' = top and bottom left of 5026 1mi 'to hear' + left of 5314 2ʔia 'transcription character for Sanskrit ya'
If both syllables were mia (disregarding tones), why was 3369 1mia used to transcribe Sanskrit ma and mā without -y-? And why create 5025 2mia as an 'aural double' of 3369 1mia etc. if 1mia was already a good match for Sanskrit mya?
The answer to both questions is the same: 3369 etc. were actually 1ma, not 1mia, so a special character had to be created to transcribe Sanskrit mya.
But wait - if rhyme 20 was -a, then I can't reconstruct rhyme 17 as -a anymore. What was rhyme 17? To answer that question and the questions I asked at the end of part 1 -
Why did I reconstruct -i- in rhyme 20? Can this -i- be salvaged?
- I need to write about 'grades'. I've already covered the topic in "G-*r-adation in Chinese" (part 1 / part 2) and "G-*r-adation in Tangut" (part 1 / part 2), but I've changed my mind about a few things over the past day.
In the Yunjing rhyme tables for some unknown variety of Late Middle Chinese, a-type syllables were placed in four tables:
|Grade \ Table||27||28||29||30|
Vietnamese, Korean, and Japanese loans from Late Middle Chinese have /-(w)a/ for all of those rhymes, so their vowels must have been a-like. One could reconstruct a single Yunjing phoneme */a/ and compress the four tables into two:
|Grade \ Table||27+29||28+30|
But why didn't the author of the Yunjing do that? I think it's because */a/ had two allophones, back *[ɑ] and central and/or front *[a]. The *[ɑ] rhymes were placed in tables 27 and 28, while the *[a] rhymes were placed in tables 29 and 30. I reconstruct these allophones on the basis of correspondences with standard Mandarin and Cantonese. (The latter two languages are probably not descendants of the Yunjing language, but their ancestors were probably similar to it.)
|Grade \ Table||27+29||28+30|
|I||[ɤ] after velars, [wɔ] elsewhere|
after *back initials, [a] elsewhere
|Grade \ Table||27+29||28+30|
|II||[aː]|| [waː] after velars, [aː]
The Cantonese pattern is quite clear:
- Grade I: back vowel
- Grade II: central vowel
- Grades III and IV: front vowels
The Mandarin pattern is complicated by these shifts:
*[ɔ] > [ɤ] after velars, [wɔ] elsewhere
*[wɨɑ] > *[wja] > *[ɥa] > *[ɥɛ]
*[ɛ] > [ɤ] after retroflexes
Sino-Vietnamese, Sino-Korean, and Sino-Japanese data for some non-a
rhymes indicate that Grade IV was more palatal than Grade III (which
may have been entirely nonpalatal in the source dialects of SK and the
Go-on layer of SJ): e.g.,
||Sino-Korean (premodern spelling)
||-ŏn after back initials; -yŏn
||-on < *-ən
||-iên with palatalization of labial
initials: *pʲ- > t-, etc.
Similarly, Mandarin Grade IV [jɛ] is more palatal than Grade III [ɤ].
All these diverse sources give us some idea of what the four grades in Chinese were like:
- I was backer than the others
- IV was more palatal than III
We do not know for sure that Tangut also had grades. I do not know of any Tangut term for 'grade'. However, patterns of correlation between Tangut rhymes and Chinese grades in transcriptions have been known for over half a century. Moreover, those patterns also correlate with Tangut initials.
Here is a new Tangut-internal definition of 'grades'. One could identify the grade of a Tangut rhyme by looking at which initials may precede it:
|Grade \ Initial
The table above is only a first approximation.
I classify rhymes which can be preceded by any initial as Grade
III/IV. One could also consider such rhymes Grade V, though such a term
would have no parallel in the Chinese tradition.
Compare that distribution of initials with the distribution of
Chinese initials in Yunjing:
|Grade \ Initial
||*w- and labiodentals
The two patterns are not identical, but there are similarities:
- Labiodentals and r- never appeared in Grade IV.
- Dentals and sibilants were in near-complementary distribution with shibilants.
- l- was infrequent in Grade IV.
I think these similarities were due to Chinese influence on Tangut. Of course, Tangut had its own history, which is why the parallels are not absolute: e.g.,
- Tangut had Grade I and II v- unlike Chinese (in which *w- became *ɣw- in Grades I and II - a change absent from Tangut).
- Tangut had Grade I r- unlike Chinese (in which *r- became *l-; Yunjing *r- is from *n-)
The distribution of initials in each grade tells us whether certain
grades were 'friendly' or 'hostile' toward certain initials. Such
'attitudes' give us clues about the phonetic characteristics of both
grades and initials. For example, the fact that shibilants never occur
in Grade IV, the most palatal of the grades, tells us that they were
not palatal in either the Yunjing language or Tangut. That is
why I reconstruct retroflex shibilants. One can also make historical
inferences from (near-)complementary distribution: e.g., Chinese
shibilants derived from dentals and sibilants, and Tangut shibilants
may have partly derived from dentals and/or sibilants.
Having established strong parallels between grades in the two
languages, I used to think that Grade IV a was the same in Yunjing
Chinese and Tangut: i.e., -ia. But I could not explain why
Tangut rhyme 20 -ia
- transcribed Sanskrit -a and -ā (and nearly all rhyme 20 characters for Sanskrit -ya syllables were fanqie tangraphs combining part of an initial speller with the left side of the character transcribing Sanskrit ya: e.g., 5025)
- was transcribed as -a(H) in Tibetan
Now I think I have a solution:
||Transcribed in Tibetan as
||-a (rare; only after shibilants)
If the Chinese dialect known to the Tangut was similar to the Yunjing language, it had four kinds of a-rhymes which were similar to Tangut rhymes 17-20.
The Grade IV a-type rhyme of late Tang Dynasty northwestern
Chinese was transcribed in Tibetan as -ya, matching the *-ia
I reconstructed for the Yunjing language. Maybe that rhyme was
still *-ia in the eleventh century, and the Tangut thought its
front (?) *a was like the front vowel of their rhyme 20.
The Tangut transcribed Sanskrit central a and ā - vowels absent from their language - with both back ɑ (rhyme 17) and front a (rhymes 18-20).
I suspect that rhyme 20 was once an *-ia that simplified to -a
after all initials except glottal stop. Hence the rhyme 20 tangraphs
1767 1ʔia and 5314 2ʔia
transcribed Sanskrit ya and yā. There was no rhyme 20 *ʔa. The *i of pre-Tangut *-ia may have been conditioned by a preceding presyllable with a high vowel, as Japhug cognates identified by Guillaume Jacques (2006) lack i:
0335 1pha < *Cɯ-pha : J ɯ-phaʁ 'side'
1530 1ma < *Cɯ-ma : J smar 'river'
2098 2ŋa < *Cɯ-ŋa-H : J aʑo < *ŋa-jaŋ 'I' (also cf. Old Chinese 吾 *ŋa 'I')
4225 1sa < *Cɯ-sa : J kɤ-sat 'to kill' (also cf. Old Chinese 殺 *ksat 'to kill')
4459 2ba < *Nɯ-ba-H 'to cut': J kɤ-mbaʁ 'to be cut'
Tangut 3926 and 4601 2na < *Cɯ-naH 'thou' and second person singular verb suffix
correspond to Old Chinese 汝 *Cɯ-naʔ 'thou'.
If Grade IV rhyme 20 lacked -i-, and Tangut Grade IV was
characterized by frontness contrasting with the backness of Grade I, I
can revise my
vowel reconstructions as follows:
|IV: fronter/higher||i||e < *ie
||ə < *iə||a < *ia
||y < *iu
||ø < *io|
That table is not as simple as its predecessor from four months ago, but it fits the Tibetan and Sanskrit transcription evidence better.
22.214.171.124:59: AURAL DOUBLES (PART 1)
I remain troubled by my reconstruction of Tangut rhyme 20 (1.20/2.17) as -ia. Let's look at the transcription evidence for (or should I say against?) the syllable 1mia from my last post:
1. In Pearl in the Palm,
0092 1mia 'mother'
was transcribed in 12th century northwestern Chinese as 麻 *mbɤa. Granted, there was no Chinese *mia, so this does not necessarily mean 1mia is wrong.
2. On the other hand, it is possible to write mya in the Tibetan script, and yet 0092 was transcribed eight times as ma. Moreover, all rhyme 20 syllables were consistently transcribed without -y-. The Tibetan evidence favors reconstructions of rhyme 20 like Arakawa's -a: and Sofronov's -a (see this table).
3. Moreover, rhyme 20 was often used to transcribe Sanskrit -a and -ā. That is another point in favor of Sofronov's -a. Sofronov did not reconstruct a length distinction in Tangut, whereas Arakawa did. I would expect Arakawa's length distinction to correspond to the length distinction of Sanskrit, but it doesn't: e.g., Arakawa's long -a: may correspond to Sanskrit short -a as well as long -ā, and vice versa. (10.23.1:09: Gong's length distinction that I used to carry over into my reconstruction also did not correspond to Sanskrit length:
|Sanskrit||Tangut rhyme||Sofronov||Arakawa||Gong||This site until recently||This site now|
|23||-âˁ, -jaˁ, -äˁ||-ya'||-iaa||-ææ||-ɤa'|
|a, ā||24||-aɯ, -âɯ||-a:'||-jaa||-iaa||-ia'|
Colors indicate length: pink = short, green = mixed, blue = long.
Rhyme 22 could not have been a simple -a or -aa, as it was never used to write Sanskrit. Rhymes 18 and 23 were also un-Sanskrit.)
If rhyme 20 were -ia, there would be no reason to create a special fanqie character
5025 2mia = top and bottom left of 5026 1mi 'to hear' + left of 5314 2ʔia 'transcription character for Sanskrit ya'
to transcribe Sanskrit mya, since one of the seventeen 1mia characters with the fanqie
5026 1mi 'to hear' + 3853 1tia 'topic marker'
would have been sufficient. However,
3369 1mia
actually transcribed Sanskrit ma and mā without -y-!
(10.23.0:33: One might think that 5025 was created for Sanskrit mya because the second tone was favored for Sanskrit words. But tones in Sanskrit transcription seem to be random: e.g.,
- ma was transcribed with both 3369 1mia and 4737 2ma
- mi was transcribed with 5026 1mi, the initial fanqie speller for 5025 and 3369
- Cya syllables were transcribed with both first and second tone tangraphs
I doubt tones in Tangut transcriptions of Sanskrit had anything to do with Vedic pitch accent which was absent from Buddhist Sanskrit.)
In Arakawa's (1997) Nishida-style reconstruction, the reason for 'aural doubles' - tangraphs with slightly different fanqie containing 5026 1mi 'to hear' - is clear: 5025 was 2myaɦ, whereas 3369 and its sixteen homophones were 1maɦ without -y-.
In Arakawa's own reconstruction, 5025 might be 2mya: contrasting with 3369 and sixteen other 1ma:. (Yet there is no 2mya: or 2ma: on pp.128-129 of Arakawa's 1997 syllabary, though there are seventeen 1ma:.)
|Tangraph||Li Fanwen number||Sanskrit transcription value||Nishida-style from Arakawa 1997||Arakawa 1997?||Gong
|5025||mya||2myaɦ||(2mya:?; not in his syllabary)||2mja
|3369||ma, mā||1maɦ||1ma: (with long vowel for Skt ma!)||1mja
||1mia (with -i- for -i-/-y-less Skt ma, mā!)|
(10.23.1:46: I don't know how Sofronov would reconstruct 5025 and 3369 today. In 1968, he reconstructed them as 2ma and 1ma.)
At least everyone agrees that rhyme 20 was a-like, which is why I render it as -a in my lay transcription of Tangut.
Next: Why did I reconstruct -i- in rhyme 20? Can this -i- be salvaged?
126.96.36.199:59: WHY SO MIA-NY?
I have been writing about names of Kumārajīva
lately (part 1 / part 2) such as Tangut
3948 3369 3284 2152 3284 (again!) 1kɨa' 1mia 2lɨa 1ʂɨi 2lɨa
The tangraph transcribing mā was one of the rhyme 1.20 syllables in the Tangraphic Sea that I listed last week. Most were written with one or two tangraphs, but 1mia was written with seventeen! (For comparison I have also included the corresponding rising tone syllable 2mia with rhyme 2.17.)
|Tangraph||Li Fanwen number||Reading||Li Fanwen gloss||Type (* = only in dictionaries)|
|0092||1mia||mother (cf. 3334)||free morpheme 1|
|0409||former times (only in dictionaries?; combines with regular word for 'day')||bound morpheme 1*|
|1178||first half of 1mia 2nie 'end' (only in dictionaries; cf. 3369)||free morpheme 1 in a compound 'end-tail'*|
|1215||first half of 1mia 2mɤe' 'to think of, to long for' (only in dictionaries)||morpheme half 1*|
|1216||ten thousand (loan from Late Old Chinese 萬 *mɨanh 'id.'?)||free morpheme 2|
|1458||second half of 2ni' 1mia 'salamander' (only in dictionaries)||bound morpheme 2* after a Chinese loanword 鯢 'salamander'|
|1530||river||free morpheme 3|
|1721||stirrup||free morpheme 4|
|1803||first half of 1mia 1ɬiu' 'gray', name of an ancestor (only in dictionaries)||morpheme half 2*, free morpheme 2*|
|2270||last syllable of (2mɪ) 2mɪ 1mia 'a kind of bird' (only in dictionaries)||morpheme part 3*|
|2648||first half of 1mia 1khiu 'underground' (1khiu is 'under')||bound morpheme 1|
|3334||female, woman (cf. 0092)||free morpheme 1|
|3369||end, tail, east (only in dictionaries; cf. 1178); first syllable of 1mia 2ɬiụ 'plantain' and 1mia ?xa 'water buffalo'; transcription of Sanskrit ma, mā||free morpheme 1*, morpheme half 1, morpheme half 2, (not in Tangut words)|
|3527||analogy; generally; doubt, fear (i.e., uncertain); and; few; should (i.e., to be time for), time; clothes||free morphemes 5-11|
|3569||fishing hook||free morpheme 12|
|3718||second half of 1ɣa 1mia 'doorframe' (1ɣa is 'door')||bound morpheme 2|
|5118||second half of 1niu 1mia 'earring' (1niu is 'ear')||bound morpheme 3|
|5025||2mia||transcription of Sanskrit mya||(not in Tangut words)|
Why are there so many 1mia - and no native 2mia? The lower frequency of second tone syllables indicates that the source of the second tone must have been something extra which I reconstruct as a final glottal *-H by analogy with Chinese.
I reconstruct *Cɯ-ma(C) as the pre-Tangut source of 1mia. The high presyllabic vowel conditioned the breaking of the main vowel:
*C₁ɯ-ma(C₂) > *C₁ɯ-mɨa > *mɨa > 1mia
I don't know when the final consonant was lost relative to vowel breaking.
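The derivation above can be sketched as an ordered series of string rewrites. This is only a toy illustration of the chain as stated in this entry: the function names are made up for the sketch, the tone is not modeled, and the final consonant is left out because its loss cannot be ordered relative to breaking.

```python
def vowel_breaking(form):
    # a > ɨa in the main syllable if the presyllable ends in high ɯ
    pre, sep, main = form.partition("-")
    if sep and pre.endswith("ɯ"):
        main = main.replace("a", "ɨa", 1)
    return pre + sep + main

def presyllable_loss(form):
    # drop everything up to and including the hyphen
    return form.partition("-")[2] if "-" in form else form

def fronting(form):
    # ɨa > ia
    return form.replace("ɨa", "ia")

def derive(form):
    # apply the rules in the order given in the text and keep every stage
    stages = [form]
    for rule in (vowel_breaking, presyllable_loss, fronting):
        stages.append(rule(stages[-1]))
    return stages

print(" > ".join(derive("Cɯ-ma")))
```

A form without a high-vowel presyllable passes through vowel_breaking unchanged, which is the point of the conditioning.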
The various 1mia may have had different presyllabic and/or final consonants in pre-Tangut: e.g.,
*kɯ-map, *tɯ-mak, *pɯ-ma, etc.
I count 24 types of 1mia:
17 in texts (not just dictionaries; pink):
12 free morphemes (0092 = 3334, 1216, 1530, 1721, 3527 [seven homophones!?], 3569)
3 bound morphemes (2648, 3718, 5118)
2 parts of polysyllabic morphemes (3369 [two homophones])
7 only in dictionaries (blue; possible 'ritual language' words and/or words that didn't happen to appear in Buddhist, Confucian, military, etc. texts: e.g., 'salamander'):
2 free morphemes (1178 = 3369, 1803)
2 bound morphemes (0409, 1458)
3 parts of polysyllabic morphemes (1215, 1803, 2270)
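As a quick check on the arithmetic, the counts in the list can be tallied as follows. The dictionary keys are labels invented for this sketch; only the numbers come from the list.

```python
# Counts of 1mia types taken from the list; the key names are hypothetical labels.
in_texts = {"free": 12, "bound": 3, "part_of_polysyllabic": 2}
dictionary_only = {"free": 2, "bound": 2, "part_of_polysyllabic": 3}

def total(counts):
    # sum the counts of one group
    return sum(counts.values())

text_total = total(in_texts)
dict_total = total(dictionary_only)
grand_total = text_total + dict_total
print(text_total, dict_total, grand_total)
```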
Green indicates a tangraph (3369) that represents one morpheme only in dictionaries and parts of words in texts.
Further analysis may be able to reduce the number of types of 1mia: e.g., the 1mia in 1458 2ni' 1mia 'salamander' may be 'river' and the 1mia in 4681 5118 1niu 1mia 'earring' may be 'hook'.
Although one could describe tangraphy as 'logography' (i.e., as a word-per-character writing system), 3527 might have represented up to seven unrelated words! Conversely, the word 1mia 'female' was written with two tangraphs (0092 and 3334) depending on whether it referred to mothers or females in general. And 1mia 'end' was written differently depending on whether it was an independent word (3369) or in the compound 1178 5734 1mia 2nie 'end-tail'.
10.22.1:54: A high degree of homophony is tolerable: e.g., English can can mean
1. to be able
2. a container
3. to place in a container
4. prison (if preceded by the?)
5. toilet (if preceded by the?)
6. to be ready for release (in the can)
7. to be released from employment (mostly passive: was/got canned?)
8. Canada (e.g., in Canwest)
and various other meanings I have never encountered. Context is sufficient to disambiguate these many uses.
None of those meanings are opposites. One might look up
1530 1mia and 2648 1mia
in Li Fanwen (2008) and think they are near-opposites ('river' and 'land'), but in fact the latter apparently only occurs in the disyllabic expression
2648 5399 1mia 1khiu 'underground'
and I suppose that is much more common than
1530 5399 1mia 1khiu 'under a river'
so there is little risk of ambiguity. (In Google, under a river has 8.74 million hits, which sounds like a lot, but underground has 335 million hits! And many references to under a river involve underwater construction that would have been unimaginable to the Tangut nearly a thousand years ago.)
188.8.131.52:21: 'ZEN': A REMNANT OF TANGUT EMPIRE CHINESE?
KJ Solonin's article made me think about the Tangut name for Zen
3504 1ʂɨã =
all of 2833 2diẽ 'calm, quiet' (probably 'not' + top and bottom right of 'to move')
left of 5593 1bɤo' 'to look, watch, observe'
as well as the Tangut names of Kumārajīva (part 1 / part 2). 1ʂɨã is a borrowing from Tangut period northwestern Chinese 禪 *ʂɨã which in turn is from Late Old Chinese (LOC) *dʑian, a Sinified form of Pali jhāna- (< Sanskrit dhyāna 'meditation'). (Japanese Zen is from Middle Chinese 禪 *dʑien.) Coblin (1994: 323) reconstructed 禪 as *śan ~ *źan in the 9th and 10th centuries AD on the basis of these Tibetan transcriptions:
大乘中宗見解: shan, zhan
南天竺國菩提達摩禪師觀門: zhan, Hzhan
LOC *dʑ developed differently in premodern northwestern Chinese and in Mandarin in 'level' tone syllables:
|Premodern northwestern Chinese||*ź > *ś > *ʂ|
|Mandarin||ch [tʂʰ]||sh [ʂ]|
I don't understand the phonetic motivation for the split. Why were 'nonlevel' tones incompatible with a voiced affricate? (Voiceless affricates were possible before 'nonlevel' tones.)
Although modern northwestern Chinese generally has Mandarin-style reflexes of *dʑ, 禪 'Zen' still has a fricative initial in some varieties (Coblin 1994: 323):
Early 20th century Xi'an (as recorded by Karlgren): ʂæ̃ (tone unknown)
I thought these fricatives might be substratum retentions. I had either forgotten or overlooked this passage earlier in Coblin (1994: 101):
Occasional exceptions are found [to the Mandarin pattern of reflexes of *dʑ ...], e.g. 禪  (QYS źi̯än) "Zen Buddhism": [mid-Tang Chang'an] *dźan > *źan; CSZ [colloquial Suzhou] *śan (~ *źan?); XN [Xining]: ʂã⁴⁴; DH [Dunhuang]: ʂæ̃²⁴. These exceptional modern reflexes appear to derive directly from forms like those found in CSZ.
I looked for those "occasional exceptions" and found
蟬 LOC *dʑian 'cicada' is ʂæ̃²⁴ as well as tʂʰæ̃²⁴ (cf. standard Mandarin chan) in Xiaoxuetang's Xi'an data
辰 LOC *dʑin 'fifth Earthly Branch' is ʂɛ̃ (tone unknown) in Karlgren's Xi'an data (Coblin 1994: 361) and ʂẽ²⁴ as well as tʂʰẽ²⁴ (cf. standard Mandarin chen) in Xiaoxuetang's Xi'an data
This last graph has two Sino-Korean readings, chin (without aspiration!) and shin. The first reading may be an old borrowing from Early Middle Chinese *dʑin; the second is from Late Middle Chinese *ɕin.
The multiple Sino-Korean readings of 什 (in 鳩摩羅什 'Kumārajīva') may also be from different strata of borrowing: 집 chip from Early Middle Chinese *dʑip and 십 ship from Late Middle Chinese *ɕip. (집 chip becomes -jip with secondary voicing after a sonorant. That voicing is due to a Korean phonological rule and does not preserve the voicing of Early Middle Chinese *dʑip.)
A third Sino-Korean reading 습 sŭp is difficult to explain; it may be from a different Late Middle Chinese dialect in which *-ip became *-ɨp rather than vice versa.
The Xining reading of 禪 'Zen' also has an irregular 'yin level' tone (which would normally reflect an earlier *voiceless initial) instead of the expected 'yang level' tone (reflecting an earlier *voiced initial). I don't think the tone of 禪 'Zen' indicates that it had a voiceless initial in pre-Xining. I hypothesize that the original dialect of the region had a 'yang level' tone that sounded like the 'yin level' tone of the Mandarin dialect that displaced it.
If I am correct, then a study of irregular tones in Xining may reveal something about the substratal tone system. Unfortunately, it may not reveal the exact values of the tones at the time of borrowing because all tones - substratal and superstratal - may have changed since then. So I don't know if 44 was the 'yang level' tone contour in the substratum dialect.
It would be interesting if other modern northwestern dialects also have a seemingly 'yin level' tone for 禪 'Zen'.
Dunhuang only has one 'level' tone which may be a merger of earlier 'yin level' and 'yang level' tones.
I don't know the modern Xi'an reading of 禪 'Zen', but I do know that both the substratal fricative-initial and superstratal affricate-initial readings of 蟬 'cicada' and 辰 'fifth Earthly Branch' have 'yang level' tones in modern Xi'an. Were the tones of the substratal readings shifted to match the superstratal tones?
One last question: Why would northwestern Chinese retain an old word for 'Zen'? The answer probably has something to do with the religious history of the region.
I am reminded of how Japanese Buddhist terminology consists of Early Middle Chinese-based borrowings (呉音 Go-on) that were not displaced by Late Middle Chinese borrowings (漢音 Kan-on) during the Tang Dynasty: e.g., 禪 Zen was not replaced by a newer borrowing *Sen. (One might think that Zen Buddhism was practiced in Japan before the Tang Dynasty, but in fact it took root in the 12th century when 1ʂɨã 'Zen' was practiced in the Tangut Empire. An old reading Zen was used for a new school because of the strong association between Go-on and Buddhism in Japan.)
On the other hand, Korean Buddhist terminology generally consists of Late Middle Chinese borrowings: e.g., 禪/선 Sŏn 'Zen' probably replaced an earlier borrowing that would have become modern 전 *Chŏn. A rare exception is the 什 -jip in 鳩摩羅什/구마라집 Kumarajip. But that is not the most common reading of 鳩摩羅什. Here are Google frequencies for the three readings of the name:
구마라십 Kumaraship: 215,000
구마라집 Kumarajip: 21,900
구마라습 Kumarasŭp: 19,300
The newer reading 십 ship outnumbers the older reading 집 jip by nearly ten to one.
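The ratio can be checked directly from the three counts above. The variable names are made up for this sketch, and the hit counts are simply the figures quoted in this entry.

```python
# Google hit counts as quoted in the text; keys are romanized readings.
hits = {"Kumaraship": 215_000, "Kumarajip": 21_900, "Kumarasŭp": 19_300}

# How many times more frequent the newer reading is than the older one
ship_to_jip = hits["Kumaraship"] / hits["Kumarajip"]

# Share of all three readings taken by the newer reading
ship_share = hits["Kumaraship"] / sum(hits.values())

print(round(ship_to_jip, 1), round(ship_share, 2))
```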
The older voiced affricate reading of 禪 'Zen' has left no trace in Sino-Vietnamese. The only Sino-Vietnamese reading of 禪 is Thiền from southern Late Middle Chinese *ʑien; there is no *Chiền from southern Early Middle Chinese *dʑien.
184.108.40.206:54: THE TANGUT NAMES OF KUMĀRAJĪVA (PART 2)
The third Tangut name of Kumārajīva shares no characters with the other two:
1429 4575 4710 4867 1kiew 2mo 1lo 1ʂɨəʳ
It is obviously based on Tangut period northwestern Chinese 鳩摩羅什 *kɨwmbɔlɔʂɨi from a 4th century *kumaladʑip.
As I mentioned yesterday, 1429 is also the transcription character for 鳩 in the Tangut translation of the Forest of Categories (Gong 2002: 438).
4575 and 4710 are also transcription characters for Sanskrit mo and lo (Arakawa 1997: 111).
4867 was also used to transcribe other Chinese characters pronounced *ʂɨi (十實失室) and 涉 *ʂɨa (Li 2008: 770). The retroflexion in Tangut may have reflected subphonemic vowel retroflexion in Chinese after retroflex affricates: /ʂi/ = [ʂɨiʳ] and /ʂia/ = [ʂɨaʳ].
In theory the name could have been borrowed in a more Sanskrit-like form as *kʊ ma raʳ dzi va via Tibetan kumaradziba [kumaradziwa] or directly from the variety of Sanskrit known to the Tangut which had [dz] for j. (My Tangut reconstruction has no rhyme -u. Retroflexion was almost always obligatory after r- in Tangut.)
I was curious to see how Kumārajīva was rendered in other languages. Judging from Wikipedia entry titles:
Czech Kumáradžíva preserves the long vowels.
Polish Kumaradżiwa [kumaradʐiva] has retroflex dż for Sanskrit palatal j [dʑ]. I would have expected *Kumaradziwa [kumaradʑiva] with palatal dz (pronounced like dź [dʑ] before i). The combination of retroflex dż and palatal i is unusual in Polish. I wonder if that i is pronounced [ɨ] as in the normal Polish combination ży [ʐɨ].
Ukrainian Кумараджива [kumaradʐɪva] has [ɪ] instead of [i]. I presume the spelling was taken from Russian Кумараджива [kumaradʐɨva].
Korean 쿠마라지바 [kʰumaradʑiba] has an un-Sanskrit (and English-influenced?) initial aspirate. I presume it is a modern term. Older names are 鳩摩羅什 Kumarasŭp/Kumaraship/Kumarajip (the last character is read three different ways) and 羅什 Nasŭp (with initial r- becoming n- before a-).
THE TANGUT NAMES OF KUMĀRAJĪVA (PART 1)
Having just written about Chinese transcriptions of Indic, I thought it was neat that I then stumbled upon KJ Solonin's tentative identification of
2152 3284 1ʂɨi 2lɨa
as a Tangut transcription of the name of Kumārajīva (1998: 411, 414 #80), translator of the Lotus Sutra and other Buddhist texts into Chinese. Kumārajīva's Chinese name was 鳩摩羅什, pronounced *kumaladʑip in the 4th century AD. In the Tangut period northwestern dialect of Chinese, it would have been read as *kɨwmbɔlɔʂɨi. If the two names are connected, the Tangut name might be an accidental inversion of
*3284 2152 2lɨa 1ʂɨi
corresponding to 羅什 *lɔʂɨi, an abbreviation of 鳩摩羅什 *kɨwmbɔlɔʂɨi.
(This abbreviation was obviously created by a Chinese speaker, as a
natural break in the Sanskrit would be between Kumāra 'boy, prince'
and jīva 'life'.)
Unfortunately, the name 2lɨa 1ʂɨi only appears once in the
text that Solonin translated. However, a transcription of the full name
鳩摩羅什 *kɨwmbɔlɔʂɨi does appear in the Hongchuan preface of
the Lotus Sutra (Li 2008: 533; see Nishida 2004
on the Tangut Lotus Sutra):
3948 3369 3284 2152 3284 (again!) 1kɨa' 1mia 2lɨa 1ʂɨi 2lɨa
There are several things that are odd about this spelling.
First, 3948 1kɨa' is a poor match for Chinese 鳩 *kɨw. It is a transcription character for Sanskrit ka and kya (Arakawa 1997: 110, 116; Kychanov and Arakawa 2006: 692). In the Tangut translation of the Forest of Categories, 鳩 *kɨw was transcribed as
1429 1kiew
which is a much better match (Gong 2002: 438). 1429 is also a
transcription character for Sanskrit (?) kyu (Grinstead 1972:
111) and is the first character in a different transcription I'll
Second, 3369 1mia (rhyme 20) has an -i- that
corresponds to zero in Chinese 摩 *mbɔ and Sanskrit and Tibetan ma
(Arakawa 1997: 110, Kychanov and Arakawa 2006: 234).
Maybe I should follow Sofronov and Arakawa and stop reconstructing -i- in rhyme 20.
Third, 3284 2lɨa (rhyme 19) has an -ɨ- that corresponds to zero in Chinese 羅 *lɔ and Sanskrit la (Arakawa 1997: 110).
I have yet to see a fully satisfactory solution to the problem of
reconstructed Tangut medials seemingly reflecting nothing in
transcriptions of Chinese and Sanskrit.
Fourth, 3284 appears again, corresponding to zero in the
four-syllable Chinese name. The first four syllables of this longer
Tangut name are obviously based on Chinese (hence 2lɨa for 羅 *lɔ
rather than *raʳ for Sanskrit ra). I would have
expected a fifth syllable to be
a transcription of Chinese 婆 *phɔ < *ba for Sanskrit va in longer Chinese names for Kumārajīva:
鳩摩羅什婆 *kɨwmbɔlɔʂɨiphɔ < *kumaladʑipba
鳩摩羅時婆 *kɨwmbɔlɔʂɨiphɔ < *kumaladʑɨba
鳩摩羅耆婆 *kɨwmbɔlɔtʂɨiphɔ < *kumalatɕiba
Having not seen the text where Li found this longer transcription, I don't know if this second 3284 is a typo (I doubt that, as even the Chinese translation has a doubled syllable: 鳩摩羅什羅) or in the original. Kychanov and Arakawa (2006: 692) do not list any words beginning with 3948. Maybe this longer name is a confused blend of *1kɨa' 1mia 2lɨa 1ʂɨi and the short inverted name 1ʂɨi 2lɨa.
At least 2152 1ʂɨi is a perfect match for Chinese 什 *ʂɨi, and is attested as a transcription of the last syllable of the name 李七什 *lɨi tshi ʂɨi (Li 2008: 356).
Next: Another Tangut name for Kumārajīva.
220.127.116.11:51: TESTING STAROSTIN'S 'LATE-RAL' SCENARIO
(I rhyme lateral [ˈlætəɹo] and scenario [səˈnæɹio]. 'Late-ral' is [ˈlejtəɹo] with a linking schwa to preserve the resemblance to [ˈlætəɹo].)
One of the biggest sound changes in Chinese was the loss of laterals:
Old Chinese *l- in type A syllables > Middle Chinese *d-
Old Chinese *hl- in type A syllables > Middle Chinese *th-
Old Chinese *l- in type B syllables > Middle Chinese *j-
Old Chinese *hl- in type B syllables > Middle Chinese *ɕ-
(The nature of the Old Chinese type A/B distinction is disputed, but the Middle Chinese initials are uncontroversial.)
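The four correspondences above amount to a lookup keyed on the Old Chinese initial and the syllable type. A minimal sketch, in which the table and function names are invented for illustration and initials outside the four listed are deliberately left undefined:

```python
# (Old Chinese initial, syllable type) -> Middle Chinese initial, as listed above
LATERAL_SHIFT = {
    ("l", "A"): "d",
    ("hl", "A"): "th",
    ("l", "B"): "j",
    ("hl", "B"): "ɕ",
}

def middle_chinese_initial(oc_initial, syllable_type):
    # Return the Middle Chinese reflex, or None if the pair is not one of the four listed
    return LATERAL_SHIFT.get((oc_initial, syllable_type))

print(middle_chinese_initial("hl", "A"), middle_chinese_initial("hl", "B"))
```

The same voiceless lateral thus yields a stop in type A and a fricative in type B, which is why the type distinction matters even though its phonetic nature is disputed.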
In my last entry, I mentioned two conflicting chronologies for the lateral shift in Chinese. Schuessler (2009) reconstructed Middle Chinese-like initials (*j-, *ɕ-, *d-, *th-) in his Later Han Chinese (i.e., Eastern Han / Late Old Chinese), whereas Starostin mostly reconstructed transitional fricatives or laterals for that period:
|Old Chinese syllable type||Early Old Chinese||Late Old Chinese||Middle Chinese|
|A||*l- (Starostin: *l- and dɮ-)||*l-||*d-|
|*hl- (Starostin: *tɬ-)||*hl-||*th-|
|A and B||*r-||*l-|
|*hl- (Starostin: *tɬ-)||*ɕ-|
(I use the same notation regardless of scholar for ease of comparison. I list Starostin's reflexes of his Early Old Chinese *tɬ- and *dɮ- because they correspond to *hl- and *l- in others' reconstructions. Starostin's EOC *hl- behaved differently from others' *hl-; it became Late Old Chinese and Middle Chinese *h- [= others' *x-]. For arguments against Starostin's lateral affricates, see Sagart 1999. I have included EOC *r- for comparison.)
To test Starostin and Schuessler's reconstructions of Late Old Chinese (LOC), let's look at Eastern Han transcriptions of Indic from Coblin (1983).
If Starostin is right:
- LOC *l- should transcribe Indic l
- LOC *hl- shouldn't be used in transcription because there was no Indic voiceless hl
- LOC *r- should transcribe Indic r
If Schuessler is right:
- LOC *d- from EOC *l- could transcribe Indic d
- LOC *th- from EOC *hl- could transcribe Indic th
- LOC *l- from EOC *r- should transcribe both Indic l and r (since LOC no longer had *r-)
Both would agree that LOC *ɕ- should transcribe Sanskrit ś [ɕ].
As I already noted last time, the correspondence of Starostin's *ʑ- / Schuessler's *j- to Indic y- [j] is ambiguous since Starostin would have said that *ʑ- was the closest available initial due to the absence of *j- in his LOC. Correspondences between this LOC initial and Sanskrit c-, j- [ɟ], ś- [ɕ], and s- suggest that it was "a fricative or affricate of some sort" (Coblin 1983: 63): e.g., Starostin's *ʑ-.
In the transcriptions of 安世高 An Shigao (mid-2nd c. AD) we find that:
- Indic d and even intervocalic -t- were transcribed with Starostin's LOC *l- / Schuessler's *d- (18, 19; the numbers are from Coblin 1983)
- Indic l was transcribed with Starostin's LOC *r- / Schuessler's *l- (13, 15, 28)
These patterns are not quirks of An Shigao; they can also be found in the transcriptions of 支婁迦讖 Zhi Loujiachen/Lokakṣema (mid-2nd c. AD; his name has 婁 Starostin's LOC *r- / Schuessler's *l- for Sanskrit l-) and 康孟詳 Kang Mengxiang (late 2nd-early 3rd c. AD). All three men were non-Chinese who settled in Luoyang, so their transcriptions probably represent the same dialect.
The only Indic th in An Shigao's transcriptions was transcribed with 替 whose EOC initial is ambiguous. It definitely had *th- in Middle Chinese and must have had *th- here. Starostin might have taken that as evidence for reconstructing 替 with *th- in EOC.
th is a low-frequency consonant, so it's not surprising that there are no instances of it transcribed with original or secondary *th-. (Oddly Lokakṣema transcribed it as the coda-onset sequence -t s- in 55.)
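The comparison can be laid out as a small scoring exercise. This is a crude sketch: it only encodes the two clear correspondences from An Shigao listed above (Indic d and Indic l), it treats a match as exact identity between the LOC initial and the Indic consonant, and all names in it are invented for illustration.

```python
# LOC initials that each scenario assigns to the two EOC sources in question
SCENARIOS = {
    "Starostin": {"EOC *l- (type A)": "l", "EOC *r-": "r"},
    "Schuessler": {"EOC *l- (type A)": "d", "EOC *r-": "l"},
}

# Observed pairs: (Indic consonant, EOC source of the transcribing character's initial)
OBSERVED = [("d", "EOC *l- (type A)"), ("l", "EOC *r-")]

def matches(scenario):
    # Count observations whose Indic consonant equals the scenario's LOC initial
    loc = SCENARIOS[scenario]
    return sum(1 for indic, source in OBSERVED if loc[source] == indic)

for name in SCENARIOS:
    print(name, matches(name), "of", len(OBSERVED))
```

Under these simplifying assumptions, only the scenario with hardened laterals matches both observations.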
I conclude that the following chain shift had occurred in the Luoyang dialect of LOC by the mid-2nd century AD:
*r- > *l- (type A) > *d-
This is contrary to Starostin's 'late-ral' scenario in which the laterals hardened later.
I also reconstruct a parallel change
*hl- (type A) > *th-
on the grounds that it would be odd if *hl- lagged behind its voiced counterpart *l-. Unfortunately there is no Indic transcription evidence for that.
Phonetic glosses such as
'聖 *hlieŋh (type B; > MC *ɕieŋʰ) is read like 通 *hloŋ (type A; > MC *thoŋ)' (Xu Shen 1063, b. in 召陵 Zhaoling 200 km SE of Luoyang, fl. c. 100 AD)
'天 *hlein (type A; > MC *then) read as 身 *hlin (type B; > MC *ɕin)' (Gao You 243, b. in 涿 Zhuo, fl. c. 200 AD)
indicate that *hl- did not harden in other LOC dialects during the early centuries of the first millennium AD. The glosses would not make sense if *hl- had already become *th-.
10.17.23:17: Some LOC glosses that seem bizarre might make more sense if we don't try to shove the words into the standard paradigm defined by the Chinese lexicographical tradition. For instance, perhaps Xu Shen pronounced 通 as something like *hliøŋ with a front diphthong similar to 聖 *hlieŋh. The expected Old Chinese reconstruction 通 *hloŋ is mechanically derived from Middle Chinese *thoŋ, whereas my hypothetical *hliøŋ would have vowel warping conditioned by a presyllable in an Old Chinese variant *Cɯ-hloŋ or *Hɯ-loŋ. Perhaps *Hɯ-loŋ was the earliest form which developed along two paths:
Early fusion: i.e., before *ɯ conditioned vowel warping
*Hɯ-loŋ > *hloŋ > *thoŋ (Middle Chinese prestige form recorded in dictionaries)
Late fusion: i.e., after *ɯ conditioned vowel warping
*Hɯ-loŋ > *Hɯ-luoŋ > *hluoŋ > *hlioŋ > *hliøŋ (> Middle Chinese *ɕyøŋ?; nonprestige and extinct?)
For more examples of variation between fused and unfused presyllables, see the discussion of Phan Rang Cham (Austronesian) and Ruc and Nha Heun (Austroasiatic) in Sagart (1999: 15-17).
18.104.22.168:12: TILTED TONGUE
On Monday, I wrote,
I wrote the pre-Tangut source of ld- as *L-. External evidence may help us identify what *L- was.
Last night I mentioned
3190 1ldwia 'tongue' = (4226 1ldwị + 0537 1pia) + 1223 2phɤo' (Mixed Categories of the Tangraphic Sea 11.122)
as one of the syllables with a fanqie including the mysterious additional character 1223.
1ldwia is probably related to the many l-words for 'tongue' in Sino-Tibetan: e.g.,
Old Chinese 舌 *mɯ-lat or *m(ɯ)-ljat (Baxter and Sagart 2014: *mə.lat)
also cf. 舐/舓/咶 *mɯ-leʔ or *m(ɯ)-ljeʔ (B&S 2014: *Cə.leʔ) 'to lick'
and perhaps 舔 *hlˁimʔ < *qlimʔ or *Hʌ-limʔ (? - I can't find any attestations before the 13th century AD; nonetheless it resembles lem-words for 'tongue' elsewhere in Sino-Tibetan and may be very old) 'to lick'
It is not possible to determine whether Middle Chinese *ʑ- in 'tongue' and 'lick' is from *mɯ-l- or *m(ɯ)-lj-. Coblin (1986) reconstructed medial *-i- for 'tongue' at the Proto-Sino-Tibetan level.
If the third word is related, and if the root was *√lj, then I can reconstruct
*m(ɯ)-lj-a-t (a-grade)
It is tempting to reconstruct *m(ɯ)-lj-a-j-ʔ (a-grade), but the phonetics 氏 and 易 point to *e.
*qli-m-ʔ or *Hʌ-li-m-ʔ (zero grade; the *j of the root became *i if no vowel followed)
Classical Tibetan ljags /ldʑags/ < *n-ljaks (Jacques, "The laterals in Tibetan")
CT j is an affricate /dʑ/, whereas pre-Tibetan *j is a glide.
Although it would be nice if Tibetan had *m- like Chinese, *m-lj- would have developed into mj- /mdʑ/, not lj- /ldʑ/ (Jacques, "The laterals in Tibetan").
Written Burmese hlyā
I cannot explain the variation in final consonants (Old Chinese *-t and *-[m?]ʔ, pre-Tibetan *-ks, Written Burmese zero). I presume they are all suffixes.
The pre-Tangut source of 1ldwia must be a combination of the following elements:
ld- may be from a consonant prefix plus root *l-
-w- is from a labial prefix *P- (and that prefix might have combined with *l- to form ld-)
-i- is from *-j- and/or a presyllabic *-ɯ-
a final stop could have been lost without a trace
the tone indicates there was no final *-H
If the root was *√lj, that narrows down the possibilities.
The simplest reconstruction would be *m-lja whose *m- would combine with *l- to form ld- and condition the medial glide -w-.
A more complex reconstruction *P-N-lja would have separate sources of -d- and -w-.
Forms for 'tongue' in Horpa varieties seem to be from *P-lj-: fʑa, vɮɛ, etc. See STEDT and the rGyalrongic Languages Database (item #36).
According to Guillaume Jacques ("The laterals in Tibetan"), Li Fang-Kuei, Coblin, and Gong all reconstructed *n-l- as the source of Written Tibetan ld- (whereas Jacques reconstructed *d-l- since his *n-l- became WT Hd- /nd/.) Perhaps *N-l- similarly became ld- in Tangut. *N- may have been an *n- as in pre-Tibetan *n-ljaks 'tongue' or an *m- as in Chinese *m(ɯ)-ljat.
The only other word out of the eight I discussed yesterday that might have a cognate - with emphasis on might - is
0841 1ɬwiẹ 'oblique, slanting, inclined' = (2814 2ɬị + 3439 1piẹ) + 1223 2phɤo' (Mixed Categories of the Tangraphic Sea 12.122)
Before I go on to a possible cognate, I realize what 1223 is doing here and in various other cases. I think 1223 in such contexts means 'combine the initial of one syllable with a labial-initial syllable to form a syllable with medial -w-': e.g.,
1ɬiẹ + 1piẹ = 1ɬpiẹ > 1ɬwiẹ
Could this suggest that -w- was [v] or [β] and that Tangut labials lenited in coherent speech (as opposed to words pronounced in isolation): i.e., 1ɬiẹ 1piẹ was pronounced [ɬiẹ viẹ] or [ɬiẹ βiẹ]?
Another possibility is that labials were followed by a subphonemic glide [w]: e.g., 1piẹ /piẹ/ was [pwiẹ] and
1ɬiẹ + [1pwiẹ] = 1ɬwiẹ
There was no contrast between /P/ and /Pw/ in Tangut.
That does not explain the highly anomalous fanqie for 2417 (which does not have a labial-initial final speller; moreover, its final speller has a different rhyme with the wrong tone!):
2417 1ʂwɨọ 'to need, want' = (0245 2ʂwɨi + 1449 2tʂhwɨoʳ̣̣) + 1223 2phɤo' (Tangraphic Sea 55.222)
Moreover, 1223 is redundant in cases like the one above and
5679 1khwɤa 'remnants' = (2554 1khwɤe + 4314 1bɤa) + 1223 2phɤo' (Tangraphic Sea 26.211)
in which the initial speller has -w-. Perhaps this use of 1223 originated in fanqie for words like 0841 and was overextended.
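My guess about 1223 can be stated as a tiny procedure. The Python below is a sketch, not a claim about how the Tangut lexicographers worked: the syllables are segmented by hand, the set of labial initials is my assumption, and the function name is invented for the example.

```python
# Sketch of the hypothesized 1223 rule; segmentation is supplied by hand.
LABIALS = {"p", "ph", "b", "m"}  # assumed set of labial initials


def combine_with_1223(initial: str, speller_initial: str,
                      speller_rest: str, tone: str) -> str:
    """Initial of the first speller + -w- (replacing the labial of the
    second speller) + the rest of the second speller."""
    if speller_initial not in LABIALS:
        raise ValueError(
            f"final speller initial {speller_initial!r} is not labial"
        )
    return f"{tone}{initial}w{speller_rest}"


# 3190 'tongue': initial ld- of 4226 plus 0537 1pia
print(combine_with_1223("ld", "p", "ia", "1"))  # 1ldwia
```

A final speller like 1449 with a non-labial initial raises an error here, which is the formal counterpart of the 2417 anomaly.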
Back to cognates: 0841 1ɬwiẹ could go back to *S-P-KE-la:
*S- conditioned the tense vowel
*P- conditioned -w-
*K- fused with *l- to form ɬ-
*-E- conditioned the raising and breaking of *a to ie
The root *la would be shared with Old Chinese 邪 'awry' *sla (spelled 斜 from the 2nd century BC onwards for 'slanted'). But it is not clear if 邪 had an *l-root.
First, other *l-less reconstructions of 邪 are possible: e.g.,
*sja (Schuessler 2009 and this site)
*sə.ɢA (B&S 2014, which reconstructs the left side 牙 of 邪 as *m-ɢˤ<r>a; Schuessler 2009 reconstructs *ŋrâ and I reconstruct *ŋra)
Second, the lateral phonetic 余 *la of the later spelling 斜 is not strong evidence for an *l-root if
- Baxter and Sagart's *sə.ɢa is correct
- *l- had shifted to *ʑ- by the 2nd century BC
- *sə.l-, *s-l-, *s-ɢ-, and *sə.ɢ- had merged into something like *sj- or *zj- (i.e., a *ʑ-like cluster) by the 2nd century BC
However, Starostin reconstructed a different chronology in which laterals remained lateral as late as the 2nd century BC (i.e., during the Western Han):
邪 *lhia > 邪/斜 Western Han *lhia > Eastern Han *zhia
余 *dɮa > Western Han *la > Eastern Han *ʑa
Eastern Han transcriptions of Sanskrit y- are ambiguous. Starostin might have said that Chinese *ʑ- was used for Sanskrit y- because there was no *j-. On the other hand, Schuessler would say that Chinese *j- was used for Sanskrit y-.
22.214.171.124:20: BIRD WORDS
At the end of my last entry, I asked what 1223 was doing in this Tangut fanqie:
1363 1swia 'time' = (5323 1swi + 0537 1pia) + 1223 2phɤo' (Tangraphic Sea 29.132)
The analysis of 1223 2phɤo' 'gentle, harmonious, together, pair' is unknown, but it looks like 'bird' + 'word':
It is in eight fanqie in the first and third surviving volumes of the Tangraphic Sea. It might have been in the lost second volume as well.
|Volume/Page/position||Li Fanwen number||Initial class||Rhyme||Reading (Nishida-style, Arakawa 1997)||Reading (this site)||Fanqie initial speller||Fanqie final speller||Gloss|
|1.26.211||5679||V||1.18||1khamba||1khwɤa||2554 1khwɤe||4314 1bɤa||remnants (only in dictionaries?)|
|1.29.132||1363||VI||1.20||1špwaɦ||1swia||5323 1swi||0537 1pia||time, transcription character for Chinese 宣 *swiã, 修 *siu|
|1.55.222||2417||VII||1.48||1štšhor||1ʂwɨo||0245 2ʂwɨi||1449 2tʂhwɨoʳ||to need, want|
|1.84.253||1029||V||1.80||1kwɑr||1kwaʳ||2503 1kʊ̣||5528 1baʳ||to cry, weep, sob|
|3.11.111||0732||IX||1.64||1hlwạ||1ɬwiạ||1770 1ɬwi||5370 1piạ||ash, dust|
|3.11.122||3190|| ||1.20||1ɬwaɦ||1ldwia||4226 1ldwị||0537 1pia||tongue|
|3.12.111||2238|| ||1.67||1hlwị||1ɬwị||0239 1ɬiə||5212 1pị||the surname Lhwi|
|3.12.122||0841|| ||1.61||1lwɛ̣||1ɬwiẹ||2814 2ɬị||3439 1piẹ||oblique, slanting, inclined|
What is the function of 1223? It can be translated into Chinese as 合 'together', the word used in Middle Chinese transcriptions of Sanskrit to indicate that two syllables were to be read as one: e.g.,
娑婆二合 *sa ba TWO TOGETHER for Sanskrit sva
One might expect 1223 to appear in fanqie for Sanskrit transcription characters, but it doesn't; in fact, one of the fanqie is for the basic word 3190 1ldwia 'tongue'. Why wasn't its fanqie simply
4226 1ldwị + 0537 1pia
without 1223? Fanqie are by definition combinations of initials and finals; wouldn't 1223 be redundant?
In any case, 1223 is not a carryover from the Chinese lexicographical tradition, since 合 does not appear in Chinese fanqie.
1223 is interpreted in at least three ways in Arakawa's Nishida-style reconstruction:
1. Read as a sequence of two syllables:
(1kĭɛ2 + 1mba) TOGETHER = 1khamba
This is the only disyllabic reading in Arakawa's Nishida-style reconstruction. Why isn't the combination 1kĭɛ2mba or 1kamba (if the second rhyme is copied in the first syllable)?
2. Read as a combination of the initials of the two syllables and the rhyme of the second syllable:
(1swiɦ + 1paɦ) TOGETHER = 1špwaɦ
(2ši + 2tšhɔr) TOGETHER = 1štšhor (not 2štšhɔr!)
3. Redundant in the other five instances which might as well be normal fanqie
The first two interpretations are highly unlikely. I don't know of any transcriptions of 5679. And I doubt Chinese 宣 *swiã and 修 *siu would have been transcribed with a very un-Chinese cluster špw-.
So that leaves the third interpretation, which is also unsatisfying. What, if anything, does 1223 indicate that distinguishes these eight syllables from all others in the Tangraphic Sea? I can't help but fear that the instances of 1223 in the lost second volume might not shed light on this mystery.
A PHONETIC KEY TO TANGRAPHIC SEA RHYME 1.20
Nearly fifty years have passed since the Russian translation of the Tangraphic Sea, and the Chinese translation of that dictionary turned thirty last year. An English translation would be nice but perhaps also redundant since Tangutologists should be able to read Russian and/or Chinese. Of course, English would be nice for many non-Tangutologists. What I would like to see (and make) is a Tangraphic Sea with reconstructed character readings. Since I have been writing about rhyme 20 syllables lately, here are the readings for the
rhyme 20 1sia 'to do (only in dictionaries?); transcription character for Chinese *sa, *sã and Sanskrit sa, sā'
entries in the first (level) tone* volume of the Tangraphic Sea. You can see the characters in Andrew West's online Tangraphic Sea. I have added the initial classes from Homophones. Groups are divided by circles in the original text.
|Page/position||Initial class||Group||Reading||Fanqie||Number of tangraphs|
|29.132||VI||1swia||(5323 sw- + 0537) 1223||1|
The absence of classes II (v-) and VII (retroflex shibilants) is a trait of Grade IV rhymes.
Class IV (ɲ-?) is rare.
Some groups divided by circles correlate with homophone groups (e.g., 1-4), but others don't: e.g., the fifth group is a mixture of class III and V syllables.
Fanqie initial speller 3031 is ambiguous (see "When Rhyme 21 Is Really Rhyme 20" and "When 1825 Is Really 1829"). I would not expect 3031 to represent dz- here, since dz-tangraphs were placed in the Mixed Categories volume of the Tangraphic Sea.
I see now that I mixed up the fanqie of 1829 and 1825 (as well as those characters themselves) last week. Great. For the record, the correct fanqie are
1829 'to heat up, burn' 1tshia = 3278 1tshi + 1693 1sia (Tangraphic Sea 28.271)
1825 is from 1829 with a prefix *P- in addition to the *Kɯ- that conditioned aspiration and vowel breaking:
1825 1tshwia 'to roast, warm up' and 5041 1tshwia 'stove, furnace' =
0311 1tshwiə + 1289 1lwia (Tangraphic Sea 29.141-29.142)
*Kɯ-tsa > 1829 tshia
*P-Kɯ-tsa > 1825 tshwia
(The bare root is in Tibetan tsha 'hot' whose initial aspiration is secondary. More cognates here.)
5041 is presumably an extended use of 1825 (i.e., 'where food is warmed up', 'device for heating').
In theory one might expect only one fanqie final speller for all rhyme 1.20 syllables or two (one for -ia and another for -wia), but in fact there are ten! That does not mean there were ten subtypes of rhyme 1.20 syllables. Nearly all of those ten can be linked in a complex fanqie tree:
Members of that tree are in pink in the first table. (I have colored 0537 somewhat differently since it is followed by 1223. I will write about 1223 in my next entry.)
I placed 1693 at the root since its fanqie final speller is ... itself! 1693 is the final speller of 3179 and 0618, 3179 is the final speller of 4620 which is the final speller of 3583 and 2019, etc.
The final spellers 1289 and 1825 for -wia form a closed circle. 1289 is the final speller of its final speller 1825 (see above for the fanqie of 1825).
1289 1lwia 'lower limbs, legs' = 2302 1lɨə + 1825 1tshwia (Tangraphic Sea 29.143)
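The links named above are enough for a small sketch of how one can walk the fanqie tree. The Python below covers only the final-speller relations I have spelled out in this entry (not all ten final spellers), and the function name is invented for the illustration. It follows final spellers until it reaches either a self-spelled root or a closed circle.

```python
# Final-speller links mentioned in the text: tangraph -> its final speller.
# Only these eight links are included; the full tree has more members.
FINAL_SPELLER = {
    "1693": "1693",  # spelled with itself: the root
    "3179": "1693",
    "0618": "1693",
    "4620": "3179",
    "3583": "4620",
    "2019": "4620",
    "1289": "1825",  # 1289 and 1825 spell each other
    "1825": "1289",
}


def trace_final_spellers(number: str):
    """Follow final spellers from `number` until one repeats.

    Returns (path, closing), where `closing` is the already-visited
    tangraph that ends the walk. If closing == path[-1], the walk ended
    at a self-spelled root; otherwise it ran into a closed circle.
    """
    path = [number]
    while True:
        nxt = FINAL_SPELLER[path[-1]]
        if nxt in path:
            return path, nxt
        path.append(nxt)


print(trace_final_spellers("3583"))  # (['3583', '4620', '3179', '1693'], '1693')
print(trace_final_spellers("1289"))  # (['1289', '1825'], '1289')
```

The first walk ends at the root 1693, while the second never leaves the -wia pair.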
I don't know why 1363 swia wasn't spelled with either 1289 or 1825:
1363 1swia 'time' = 5323 1swi + 0537 1pia + 1223 2phɤo' (Tangraphic Sea 29.132)
Next: What is 1223 doing in that fanqie?
10.14.21:21: The numbers at the ends of
1ldia1 'to come' and 1ldia2 'to return, transport'
indicate that they were treated as nonhomophonous (heterophonous - why isn't that word used more in linguistics?) in the Tangraphic Sea (and in Homophones!) even though their fanqie seem to indicate they are homophones. Their final spellers belong to the same tree (see above), and the initial speller of 1ldia2 is derived from the initial speller of 1ldia1. See "Come Again?" for details.
"1.20" in the title of this post refers to tone one, rhyme 20.
The Tangraphic Sea volume for the second [rising] tone has been lost. Rhyme 2.17 is the rising tone counterpart of rhyme 1.20. The rhyme numbers do not match since not all level tone rhymes have rising tone counterparts and vice versa: e.g., 1.6, 1.13, and 1.16 lacked rising tone versions. Arakawa (1997) lists rhyme 1.20 and 2.17 tangraphs side by side.
126.96.36.199:43: THE COMING CLAN
Yesterday I reconstructed a Tangut word for 'come' with ld-. Other words for 'come' have the same fanqie initial speller (0475), so they can also be reconstructed with ld-:
3456 1ldia < *Cɯ-La 'to come'
*C- might be the *S- conditioning vowel tension (indicated with a subscript dot) in the words below. *Sɯ- could have been lost after the vowel conditioned breaking (see below) but before *S- could condition tension.
Normally *ɯ conditions the breaking of *a to ɨa after *l-. Did *a break to ia after *L-?
4106 1ldɨə̣ < *S-Lə 'to come'
2373 1/2ldɨẹ < *Sɯ-La/ə-j(-H) 'to come'
The root vowel is ambiguous.
The Precious Rhymes of the Tangraphic Sea has two entries for this character, one in the level tone volume and the other in the rising tone volume. Although there are other characters with two readings, I don't know of any other case in which the two readings only differ in tone.
5727 1ldɨə̣ < *S-Lə 'to transport, come' (homophone of 4106; cf. how 3456 is nearly homophonous with 3502 'to transport', written as a mirror image of 5727 and derived from it):
I wrote the pre-Tangut source of ld- as *L-. External evidence may help us identify what *L- was. There are many Sino-Tibetan words for 'come' with l-; at least one (Mandarin 來 lai < *mʌ-rək) is not related to the others. Do the Tangut words belong to this clan of l-words? If so,
- do the other languages preserve a root-initial l- that gained a prefix in Tangut?
cf. how *d-l- became ld- in Tibetan (Jacques, "The Laterals in Tibetan")
- or does Tangut preserve a cluster reduced to l- in other languages?
- or are both Tangut ld- and non-Tangut l- from a third source in Proto-Sino-Tibetan?
188.8.131.52:15: COME AGAIN?
(23:09: The title refers to this idiom and to the fact that 3456 'come' is followed by 3502, another Tangut character containing it in Homophones.)
After two steps backward ... one step forward ... I hope.
In my last post, I mentioned
3456 1lia (Grade IV) 'to come' = 0475 1liu (Grade IV) + 3583 1tia (Grade IV)
which has no homophones: it is in the isolated liquid-initial section of Homophones (A edition, 55A54).
Right below it in Homophones (A edition, 55A55) is
3502 1lia (Grade IV) 'to return, transport' = 4464 1lɨə̣ (Grade III) + 2019 1thia (Grade IV)
which looks like 3456 'come' plus 'hand' and is derived from all of 'come' and the left side of 5727 1lɨə̣ 'transport, come' (also containing 'hand' and 'come' in reverse order) in Tangraphic Sea:
I have followed Gong who reconstructed 3456 and 3502 as homophones in spite of the fact that they are isolates. It would also be hard to distinguish them in context since both are motion verbs. But if they weren't homophones, what was the difference between them?
Could they have had different initials? Their initial spellers are of different grades (III and IV). So perhaps 3456 had Grade IV [l] whereas 4464 had Grade III velarized [ɫ]. If they had identical finals, I would have to posit a phonemic distinction between /l/ and velarized /ɫ/. Sofronov (1968 II: 308) reconstructed 3456 as 1la and 4464 as 1lda. But how could there be such a distinction if the two initial spellers were part of the same fanqie chain?
4464 1lɨə̣ (Grade III) = 0475 1liu (Grade IV) + 1493 siə̣ (Grade IV)
(There was no /ɨə̣/ : /iə̣/ distinction; the quality of the first vowel was dependent on the initial.)
Tai (2008: 201) reconstructed the initial of that chain as ld- since it was transcribed in Tibetan as ld- (11 times) and zl- (3 times), but never as a simple l- (Tai 2008: 198). That initial was transcribed in late 12th century northwestern Chinese as *l- which is not necessarily evidence for reconstructing Tangut l-. Chinese *l- would have been the best available substitute for an un-Chinese *ld-. (There was no *d- in that Chinese dialect.) Hence there seem to have been two kinds of 1ldia.
I cannot reconstruct either 3456 or 3502 with -w- since the fanqie do not contain such a medial. The final spellers were transcribed in Tibetan without -w- (Tai 2008: 210):
3853: ta (37 times)
2019: tha (9 times)
3853 was also used to transcribe Sanskrit ṭa, ta, and tā without -v- (Sanskrit had no -w-).
The Chinese transcriptions 怛 *ta and 達 *tha for 3853 and 2019 lack *-w-.
None of the transcription evidence supports the -i- required by my Grade IV hypothesis or Gong's -j-. Sofronov's (2012) -a is much more likely for rhyme 20 which he regarded as Grade I, not IV. The l- from earlier in this post would be unusual before a Grade IV rhyme but normal before a Grade I rhyme. Sofronov (2012) sometimes reconstructed more than one value for a single Tangut rhyme, but rhyme 20 was not one of them. At this point I can only combine Tai's ld- with Sofronov's 1-a and be agnostic about the difference between the two 1lda-like syllables (3456 and 3502).
184.108.40.206:21: WHEN 1825 IS REALLY 1829
What's worse than having to publicly correct a mistake on a blog? Having to publicly correct that correction!
Andrew West pointed out that the correct fanqie for Tangut character 3371 (and its homophones 0596 and 1283) is
3371, 0596, 1283 1dzia = 3031 + 1829 (not 1825!)
I got the idée fixe that 1825 was the final speller and didn't notice that the character in the fanqie was actually 1829, which has the same left-hand radical 'fire' and a similar right-hand radical, in the handwritten copy of the Tangraphic Sea in Wenhai yanjiu (1983) or in Arakawa's Seikago tsūin jiten (Tangut rhyme dictionary, 1997).
Notice that I have not supplied readings for 3031 or 1829.
I have already explained why 3031 is ambiguous, and I will add one more complication here:
- 3031 is the initial speller for 3371, 0596, and 1283 which are in the Mixed Categories of the Tangraphic Sea. For some reason, all dz-, dʐ-, and ɬ-syllables were placed in Mixed Categories along with a seemingly random smattering of other syllables. That suggests 3371, 0596, 1283 had dz-.
- On the other hand, 3031 is in the 'rising' tone volume of Precious Rhymes of the Tangraphic Sea instead of the Mixed Categories volume. That implies 3031 did not have dz-.
The fanqie for 1829 indicates -w- ... or does it? There is no transcription evidence for the -w- of 1829, its final speller 1289 1lwia or 0259 1lwia, the only homophone of 1289. -w- is an attempt to account for why 1289 1lwia is not in the same homophone group as
3456 1lia 'to come'
whose Chinese transcription 辢 *la has no *-w-. (Then again, that transcription is not ironclad proof that 3456 didn't have -w-, because the Chinese known to the Tangut had no syllable *lwa. Nonetheless a Tangut lwia could have been transcribed in Chinese as 辢合 *laCLOSED with a small 合 'closed (mouth)' diacritic to indicate -w-.) 1289 and 3456 had the same initial (l-) and rhyme (1-ia), so they presumably had different medials (-w- and zero).
If 1289 was 1lwia, then 1829 was 1tshwia, and 3371, 0596, and 1283 were 1dzwia ... which conflicts with the use of 0596 as a transcription character for Sanskrit ja without -v- (there is no -w- in Sanskrit).
Let's suppose that 3371, 0596, and 1283 were 1dzia without -w- and that their fanqie final speller 1829 was 1tshia without -w-. 1829 and 1825 were in different homophone groups even though they had the same initial (tsh-) and rhyme (1-ia), so they presumably had different medials (zero and -w-). But if 1825 was 1tshwia, why was it transcribed in Tibetan as tsha instead of tshwa? Was the subscript -wa character accidentally omitted?
This is so frustrating. I want to end on a more positive note. Andrew West recently created an online Homophones lookup tool. You can input the Li Fanwen 2008 numbers I use for tangraphs to see that
- 3371, 0596, 1283 1dz?(w)ia are in the same homophone group (31A46-48; all Homophones numbers here are from the A edition; different editions have different numbers)
- 1829 1tsh(w)ia (the final speller for those three syllables) and 1825 1tsh(w)ia (which I confused with 1829) are in different homophone groups (31B36 [which has no homophones] and 33A13-14 [a set of two homophones: 5041 and 1825])
- 1289 1lwia (the final speller for 1829) and 3456 1lia are in different homophone groups (53B78-54A11 [a set of two homophones: 1289 and 0259] and 55A54 [which has no homophones])
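The reasoning behind reconstructing -w- can be written out mechanically: if two syllables share an initial and a rhyme but sit in different homophone groups, something else must distinguish them. The Python sketch below uses only the four tangraphs and A-edition group numbers listed above; the data layout and the function name are my own invention for the illustration, and it says nothing about what the extra contrast actually was.

```python
from itertools import combinations

# (Li Fanwen number, initial, rhyme, Homophones A-edition group) as given above
SYLLABLES = [
    ("1829", "tsh", "1.20", "31B36"),
    ("1825", "tsh", "1.20", "33A13-14"),
    ("1289", "l", "1.20", "53B78-54A11"),
    ("3456", "l", "1.20", "55A54"),
]


def pairs_needing_extra_contrast(syllables):
    """Return pairs with the same initial and rhyme but different groups."""
    pairs = []
    for a, b in combinations(syllables, 2):
        same_initial_and_rhyme = a[1:3] == b[1:3]
        different_group = a[3] != b[3]
        if same_initial_and_rhyme and different_group:
            pairs.append((a[0], b[0]))
    return pairs


print(pairs_needing_extra_contrast(SYLLABLES))
# [('1829', '1825'), ('1289', '3456')]
```

Both pairs are the ones for which I posited a medial contrast (-w- versus zero).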
Alas, Homophones does not give any concrete information about the homophone groups beyond their initial classes: e.g., 3371, 0596, 1283, 1829, and 1825 belong to the sixth class (alveolars) and 1289 and 3456 belong to the ninth class (liquids). The Tangraphic Sea lists homophone groups organized by rhyme with fanqie, but fanqie for most 'rising' tone syllables are lost, and readings for fanqie spellers are dependent on a mixture of transcription evidence and educated guesswork (e.g., the reasoning for reconstructing -w- above).
WHEN RHYME 21 IS REALLY RHYME 20
(10.11.18:25: Formerly titled "Tangut Grade III -a('): Rhymes 19 and 21 (Part 2)", but I changed the title since this entry has nothing to do with either rhyme apart from my confusion of rhymes 20 and 21.)
If you don't want to constantly make a fool of yourself in public, don't blog about Tangut.
For the past couple of days, I've been reconstructing 3371 as 1dzɨa' with Grade III rhyme 21 which would be unusual after dz-, but its fanqie in the Mixed Categories of the Tangraphic Sea clearly indicates that it has Grade IV rhyme 20 which is normal after dz-:
3371 1dzia = 3031 2dzi + 1825 1tshwia (sic; should be 1829!)
Even this corrected (?) reading remains problematic for several reasons.
First, the initial might be ts-. The evidence is ambiguous:
1. There is no fanqie for 3031, the initial speller of 3371.
2. 3031 was used to transcribe
Chinese characters with *ts-readings
Sanskrit ci (pronounced [tsi] in the variety of Sanskrit known to the Tangut, probably via Tibetan which had [ts] for Sanskrit c).
3. 3031 was transcribed in Tibetan as both Hdza and Htsa. The phonetic value of H- is uncertain: it could have represented prenasalization or a voiced back fricative.
4. Another character
1290 2?-ew 'ordinal suffix, class, limitation'
with 3031 as a fanqie initial speller was transcribed in Tibetan as tsa, tsi(H), gtsiH, and gdzi(H).
5. 3371 was homophonous with
0596 'to grow'
a transcription character for Sanskrit ja (pronounced [dza] in the variety of Sanskrit known to the Tangut, probably via Tibetan which had [dz] for Sanskrit j).
Second, it would be odd for a -wia graph (1825; sic - should be 1829!) to be a fanqie final speller for -ia without -w-. But it would also be odd for Sanskrit ja [dza] to be transcribed as dzwia instead of dzia.
The Tibetan transcription of 1825 is tsha, not tshwa. So maybe 1825 lacked -w- after all. And maybe it lacked -i- as well. A Sofronov-style reconstruction of 1825 as 1tsha may be best. But then how can one explain the different fanqie for the other 1tsha (or 1tshia) in the Tangraphic Sea?
1829 'to heat up, burn' 1tshia = 0311 1tshwiə + 1289 1lwia
(10.14.20:00: This is actually the fanqie for 1825!)
Maybe 1829 had -w- and 1825 and its homophone
5041 'stove, furnace'
did not. Their fanqie has no -w- in either speller:
1825 and 5041 1tshia = 3278 1tshi + 1693 1sia (used to transcribe Sanskrit sa)
(10.14.20:00: This is actually the fanqie for 1829!)
I will revise my reconstructions accordingly:
|Tangraph||Sofronov 1968||Li Fanwen 1986||Gong||Nishida-style reconstruction in Arakawa 1997||This site|
|1829||1tsha||1tsha||1tshja||1tshaɦ||1tshwia (formerly 1tshia)|
|1825||1tshwa||1tshɛ||1tshjwa||1tshaɦ²||1tshia (formerly 1tshwia)|
(10.14.20:04: No, judging from the corrected fanqie, Sofronov and Gong were right to reconstruct -w- in 1825 and 5041! Which means that the equation below is still 'broken' or 'unbalanced', depending on your preference in metaphors.)
Plugging that revised reconstruction of 1825 back into the fanqie at the beginning of this post results in a balanced equation:
3371 1dzia = 3031 2dzi + 1825 1tshia
The two homophones of 1825 listed in Mixed Categories of the Tangraphic Sea share that fanqie and should also be read as 1dzia:
0596 'to grow' and 1283 'stomach' (attested only in dictionaries)
This entry demonstrates how errors and their corrections can cause chain reactions in Tangut reconstructions.
I have eliminated one type of apparent anomaly in rhyme 21: the combination of an alveolar initial dz- with the Grade III medial -ɨ-. But other anomalies remain, and I will examine them in future entries.
220.127.116.11:55: TANGUT GRADE III -A('): RHYMES 19 AND 21 (PART 1)
Last night I mentioned the words (phrases?)
3371 0378 1dzɨa' 2ʔʊ 'curled hair' and 3371 1144 1dzɨa' 2dị 'bun (of hair)'
and noted that their first syllables had an anomalous initial-rhyme combination. (No, actually they don't!)
3371 has the Grade III rhyme 21 (= 1.21/level tone rhyme 21 and 2.18/rising tone rhyme 18). (10.10.20:01: The true rhyme of 3371 is 20.) Here are the latest reconstructions of that rhyme and its immediate neighbors in the first rhyme cycle:
|Rhyme||Tibetan transcription||Gong 1997||Arakawa 1999||Sofronov 2012||This site|
In Gong's reconstruction, there is no Grade III/IV distinction, and many rhymes are redundant: e.g., rhymes 21 and 24. Hence Gong regarded
3371 1dzjaa (rhyme 21; = my 1dzɨa') 'hair worn in a bun; peak' and 4075 1dzjaa (rhyme 24; my 1dzia') 'thrifty'
(10.10.20:02: 3371 should be 1dzia with rhyme 20.)
as homophones in spite of their placement into different rhymes and homophone groups in the Tangraphic Sea. They are not homophonous in the other three reconstructions.
In Arakawa's reconstruction, rhyme 21 is the only Grade IV rhyme, and it has a combination of the -y- of his Grade II and the vowel length of his Grade III.
Sofronov's reconstruction is very different from all others: e.g., it has Grade II and Grade IV variants of rhyme 21. Sofronov reconstructs five subtypes of a-rhymes corresponding to three subtypes in the other reconstructions.
In my reconstruction, Grade III rhymes are characterized by medial -ɨ- and are distinct from Grade IV rhymes with -i-. Grade III and IV rhymes typically have different initials:
III: v- (= w- in most other reconstructions), shibilants (tʂ-, tʂh-, dʐ-, ʂ-, ʐ-), l- (cf. Grade II which occurs with shibilants but not sibilants or r-)
All of these initials are associated with Grade III in the Late Middle Chinese (LMC) of the rhyme table tradition. (So are many other LMC initials other than sibilants and *ɣ-.) In LMC, Grade III was nonpalatal and Grade IV was palatal. Assuming that the Tangut carried over that distinction into their analysis of their own language, Tangut Grade III initials must have been nonpalatal. Tangut l may have been velarized [ɫ].
IV: all other initials (cf. Grade I which occurs with all non-shibilants)
However, this correlation between grade and initial is not absolute: e.g., 1dzɨa' has a dz- that normally should precede a Grade IV rhyme. Hence the distinction between medial /ɨ/ and /i/ is phonemic as well as phonetic, and the Tangut created separate rhyme categories whenever the medial could not be predicted on the basis of the initial. Minimal pairs like 3371 and 4075 above necessitated the separation of rhymes 21 and 24. (10.10.20:05: 3371 1dzia [not 1dzɨa'!] and 4075 1dzia' actually differ in terms of the presence or absence of the mysterious feature that I write as -', not in terms of medials.)
On the other hand, I presume all medials in rhyme 27 were nondistinctive (and predictable?*) as suggested by the mixture of Grade III and IV in this rhyme 27 fanqie:
1ʂɨã (Grade III) = 2ʂɨu (Grade III) + 1kiã (Grade IV!)
Hence there was no need to create separate rhyme categories for -ɨã and -iã syllables.
I'll start looking at the unpredictable medials of rhyme 21 and its -'-less counterpart 19 this weekend.
*It is possible that -ɨ- and -i- were completely interchangeable in rhymes like 27: e.g.,
1ʂɨã ~ 1ʂiã (cf. Grade III rhyme 36 1ʂɨe; there is no Grade IV rhyme 37 *1ʂie)
1kɨã ~ 1kiã (cf. Grade IV rhyme 37 1kie; there is no Grade III rhyme 36 *1kɨe)
It is also possible that rhyme 27 had only one medial (-ɨ- or -i-) after all initials, so all rhyme 27 syllables were Grade III or IV.
It is not possible to choose between these alternatives at this point. It might be more accurate to write the medial of rhyme 27 with an algebraic symbol like -I-. However, I have already used that symbol to represent a lost unstressed presyllabic vowel conditioning the raising and fronting of pre-Tangut *a to i. I assign medials to rhyme 27 syllables following the general pattern: -ɨ- after shibilants (there are no v- or l-rhyme 27 syllables) and -i- after other initials.
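The general pattern I follow for rhyme 27 is simple enough to write as a rule. The Python below is only a sketch of that working convention, not a claim about Tangut phonology; the shibilant set is the one listed for Grade III above, and the function name is invented for the example.

```python
# Working convention for rhyme 27 medials: -ɨ- after shibilants, -i- elsewhere.
SHIBILANTS = {"tʂ", "tʂh", "dʐ", "ʂ", "ʐ"}


def rhyme_27_medial(initial: str) -> str:
    """Return the medial assigned to a rhyme 27 syllable with this initial."""
    return "ɨ" if initial in SHIBILANTS else "i"


for initial in ["ʂ", "k"]:
    print(f"{initial}- takes medial -{rhyme_27_medial(initial)}-")
```

The two initials in the loop are those of the rhyme 27 fanqie cited above. There are no v- or l-rhyme 27 syllables, so the sketch does not need a separate case for them.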
18.104.22.168:32: WHIP = TSU + SHARP + ?
If 0219 2tseʳw 'whip' has three sources, the first two might be one of three tangraphs with a TSU-type reading and 3767 1reʳw 'sharp, pointed end':
What might be the third? There are nine tangraphs sharing a right side with 0219 that I didn't cover last Saturday:
|0054||1tswa||hair worn in a bun or coil||HAIR|
|0375||1ka||second syllable of 2phʊ 1ka 'boots worn in rain or snow'||HAIR (fur boots?)|
|0378||2ʔʊ||second syllable of 1dzɨa' 2ʔʊ 'curled hair'||HAIR|
|1144||2dị||second syllable of 1dzɨa' 2dị 'bun (of hair)'||HAIR|
|2279||1swa||second syllable of 2siọ 1swa 'a kind of grass'||SWA|
|4021||1swa||second syllable of 1niu 1swa 'ear ornament'||SWA|
|4371||1dạ||second syllable of 2me 1dạ 'hair'||HAIR|
|5133||2rieʳ||wool, feather, fine hair||HAIR|
All of the above characters either represent (parts of) words for hair or syllables homophonous with 1swa 'hair'. So 2tseʳw 'whip' is either 'TSU + hair' or 'TSU + sharp + hair'.
Two of the above characters (0378, 1144) are only attested after
3371 1dzɨa' 'hair worn in a bun or coil; peak (< like a bun of hair on the top of the head?)' = 2750 1ɣɤu 'head' + 1lwʊ̣ 'to mix, blend'
They may be adjectives modifying 1dzɨa'.
Both the structure and pronunciation of 3371 are odd to me (10.10.20:15: because I reconstructed 3371 incorrectly! It should be 1dzia with a Grade IV rhyme, not 1dzɨa' with a Grade III rhyme.) I wouldn't describe a bun or coil as mixed and blended hair. And Grade III rhymes with -ɨ- normally don't follow alveolars. I will take a closer look at -ɨa' tomorrow.
22.214.171.124:57: WERE TANGUT WHIPS SHARP?
On Sunday I concluded that the left side of 0219 2tseʳw 'whip' might be an abbreviation of some tangraph with a TSU-type reading, though I admit the phonetic match is poor:
2tseʳw 'whip' < left of 1tshwiu, bottom left of 2dziu', or right of 2dʐwɨiw?
I also identified the rest of 0219 as being from
2061 2pɤẹ̃ 'hair'
as a whole. And on Saturday I used Google to demonstrate that whips are associated with hair in English, though of course there is no guarantee the Tangut also had such an association.
2061 of course consists of two components. Maybe each of those components in 0219 2tseʳw 'whip' is from a different source. Let's look at eleven possible sources of
the center of 0219:
|2434||1bie||to mend, patch||BE, TATTER (i.e., to fix tatters)|
|3088||1bie||second syllable of 2bə 1bie 'dung beetle'||BE|
|3090||2ɬọ||first syllable of 2ɬọ 2ɬwi 'ugly and old'; can it stand alone?||UGLY|
|3558||2pɤẹ̃||first syllable of 2pɤẹ̃ 2ba 'flattery'||BE|
|3767||1reʳw||sharp, pointed end||SMOOTH (left and center from 1963 'smooth')|
|4330||1ʔị||ladle, scoop||I (bottom center and right from 3101 2ʔị 'to repeat')|
|4817||?ɬə||plane for carpentry||LHY|
I have excluded five tangraphs containing 2061.
The classes can be grouped into three families:
TATTER < BE > HAIR
SMOOTH > LHY > (UGLY if 2ɬọ had a ?ɬə tangraph as phonetic)
The last is an unusual case, as the shape of the bottom center component of 4330 1ʔị 'ladle' does not match its source 3101 2ʔị 'to repeat' in its Tangraphic Sea analysis:
The source of the top and bottom left of 4330 1ʔị 'ladle' is 4368 2dwʊ 'chopsticks'.
Among these characters, the best candidate for a source of 0219 'whip' is 3767 1reʳw 'sharp, pointed end'. I wish I knew more about Tangut material culture. Did Tangut whips have sharp ends?
126.96.36.199:06: THE APPEARANCE OF ANGER
Two of the Tangut words in yesterday's table
0924 2niạ 'anger, rage' and 0996 2mə 'appearance, spirit'
were borrowings from Chinese 惱 'angry' and 模 'pattern' according to Li Fanwen (2008: 156, 167).
The first etymology would work only if there was a pre-Tangut prefix *Sɯ- of unknown function (!) added to *nawʔ from Middle Chinese *nawˀ. The *S- of the prefix conditioned vowel tension (indicated by a subscript dot) and the high vowel *ɯ of the prefix conditioned the -i- in the main syllable:
*Sɯ-nawʔ > *Sɯ-nɨawʔ > *Sɯ-nɨaɯʔ > *S-nɨaɯʔ > *nnɨaɯʔ > *ṇɨaɯʔ > *ṇɨ̣ạɯ̣ʔ > 2niạ
The relative chronology of changes is not entirely clear, though *a-breaking must have preceded *ɯ-loss and *S-tension.
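The ordering claim can be checked mechanically. Here is a minimal Python sketch of the first four steps of the derivation above, under loud assumptions: the rule names are my own labels, each change is reduced to a plain string rewrite, and *a-breaking is modelled as applying only while the high-vowel presyllable is still present. The later steps involving tense vowels are left out.

```python
# Toy model of the first steps of *Sɯ-nawʔ > ... (rule names are hypothetical labels).

def a_breaking(form):
    # *a > *ɨa, assumed to be conditioned by a surviving high-vowel presyllable.
    return form.replace("a", "ɨa", 1) if "ɯ-" in form else form

def w_vocalization(form):
    # *-w > *-ɯ after the broken vowel.
    return form.replace("ɨaw", "ɨaɯ")

def presyllable_vowel_loss(form):
    # *Sɯ- > *S-
    return form.replace("ɯ-", "-", 1)

def s_assimilation(form):
    # *S-n > *nn
    return form.replace("S-n", "nn")

def derive(form, rules):
    """Apply the rules in order and return every intermediate stage."""
    stages = [form]
    for rule in rules:
        form = rule(form)
        stages.append(form)
    return stages

ORDER_IN_POST = [a_breaking, w_vocalization, presyllable_vowel_loss, s_assimilation]
LOSS_FIRST = [presyllable_vowel_loss, a_breaking, w_vocalization, s_assimilation]

print(" > ".join(derive("Sɯ-nawʔ", ORDER_IN_POST)))
print(" > ".join(derive("Sɯ-nawʔ", LOSS_FIRST)))
```

With the order given in the post, the last stage is nnɨaɯʔ. If the presyllabic vowel is lost first, breaking never applies and the toy derivation ends in nnawʔ without -ɨ-, which is why breaking has to come before *ɯ-loss.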
I once thought Tangut rhymes ending in the algebraic symbol -' (corresponding to what I used to reconstruct as long vowels) once had final consonants:
-V' (= -VV) < *-VC
If that were the case - and I don't think it was* - then the absence of -' in 0924 2niạ would not rule out an earlier final consonant (i.e., *-w) since -' could not occur with tense vowels. This complementary distribution is a clue to the identity of -', which must have had some phonetic characteristic that was incompatible with tense vowels.
The second etymology is highly improbable because Middle Chinese 模 *mo 'pattern' should correspond to Tangut *2mʊ, not 2mə. (See Gong 2002: 413 for examples of MC *-uo : Tangut -u which is equivalent to MC *-o : Tangut -ʊ in my reconstruction. I regret not including the raising of *-o to *-ʊ in pre-Tangut.)
There are isolated instances of the correspondences
Tangut -ə : Japhug rGyalrong -u < *-o, -ɯ < *-u
in Jacques (2006), but the general pattern is clear:
Tangut -ʊ (= Jacques' -u) : Japhug rGyalrong -u < *-o, -ɯ < *-u
2mə 'spirit' may be an unrelated homophone of 2mə 'pattern' that was written with the same character.
The Precious Rhymes of the Tangraphic Sea analyzed the graph 0996 for 2mə as being from
the top of 1365 and the bottom of 4744 2ʔiõ 'appearance' (a loan from Middle Chinese 樣 *jɨaŋʰ or Tangut period northwestern Chinese *jõ).
Li may have been tempted to derive 2mə from Middle Chinese 模 *mo 'pattern' since the word appears with the clarifying character 4744 in Homophones:
He translated that collocation as 模樣 'pattern' which would have been read as *mo jɨaŋʰ in Middle Chinese - a near-mirror image of 2ʔiõ 2mə! I think this resemblance is coincidental. In Tangut period northwestern Chinese, 模樣 was something like *mbʊ jõ which would have been borrowed into Tangut as *bʊ 2ʔiõ. (Tangut tones for Chinese loans are unpredictable, so I have not indicated the hypothetical tone of the first syllable.)
The analysis of 0924 2niạ 'anger, rage' is unknown. Perhaps it was from the top and bottom left of 0948 1na 'to steal' (phonetic) plus 'demon' (semantic) extracted from one of forty-nine different possible characters:
('Demon' has left-hand and right-hand forms which are interchangeable in tangraphic analyses.)
None of the other 'demon' characters mean 'anger', so none stand out as more likely sources than others.
*My old -V' < *-VC hypothesis would not predict Tangut-Japhug rGyalrong comparisons such as these from Jacques (2006):
'nose': 5700 2ni' (not *2ni) : J sna
Correlations between Tangut -' and Japhug final consonants in sets such as
'needle': 4935 1ɣa (not *1ɣa') : J ta-qaβ
'fruit': 2436 1mia' : J sɯ-mat
may be coincidental.
188.8.131.52:59: WHAT PLUS 'HAIR' EQUALS 'WHIP'?
If the center and right components of
0219 2tseʳw 'whip'
2061 2pɤẹ̃ 'hair',
what is the source of the left-hand component
None of the 69 other characters with that component are a plausible semantic match for 2tseʳw 'whip' which may belong to the TSU phonetic class:
|LFW2008||Tangraph||Reading||LFW2008 gloss||Class(es)||Class codes|
|0009||1ʂwɨo||to appear; to raise (< 'cause to appear'?)||APPEAR||S1|
|0020||1tʂɨa||road, way (literal and metaphorical: 'manner'); to lay bricks||CHA, ROAD||P1, S2|
|0486||2paʳ||horse with white trotters||PAR||P4|
|0503||1tʂɨa||the surname Cha||CHA||P1|
|0745||2vɨe||the surname Ve||VE||P5|
|0752||1tʂɨa||ceremony, courtesy||CEREMONY, CHA||S3, P1|
|0760||2dʐɨe||to judge, decide||JE||P6|
|0948||1na||to steal, rob||NA||P7|
|1003||1lew||full, filled, satisfied||not HOLLOW?, LU? (but analysis has 1630 2dziẽ 'carve'!)||S4|
|1026||1tʂwɨa||the name Chwa; luck||CHA||P1|
|1071||2dziu'||first half of 1071 1226 'to hide, conceal'||HIDE, TSU?||S5, P8?|
|1082||2riʳ||second syllable of surnames ending in Rir||RIR||P2|
|1094||2ʐɨə||to go without a burden||GO||S6|
|1226||?T-||second half of 1071 1226 'to hide, conceal'||HIDE, TU?||S5, P3?|
|1360||1va||to hide, conceal||HIDE||S5|
|1364||1ŋa||hollow, void||HOLLOW, NGA||S4, P9|
|1588||1tʂɨa||sheep guardian god||CHA, SHEEP||P1, S8|
|1630||2dziẽ||to carve||CARVE, JE||S9, P6|
|1641||2dʐɨa||lamb||CHA?, SHEEP? (but analysis has 1043 1lew 'full')||P1?, S8|
|1651||1tshwiu||to salute||CEREMONY, TSU?||S3, P8?|
|2663||1kwiə̣||to kowtow, worship on bended knees||CEREMONY||S3|
|2755||2lwəʳ||the surname Lwyr||LWYR||P10|
|2972||1ŋa||to spread; Grinstead: 'empty'||HOLLOW?, NGA||S4?, P9|
|3049||1xwaʳ||to melt, thaw; to confess (< 'melt down' and release information?)||XA, SPEAK||P11, S10|
|3575||2ni||to listen, hear||EAR||S7|
|3579||2kie||impressive and dignified, eminent||APPEAR (i.e., prominent?), CEREMONY?||S1, S3?|
|3813||2vɨẹ||to see someone off||VE||P5|
|3821||2lʊ||to enjoin; to tell; to give a present||CEREMONY?, LU, SPEAK||S3?, P12, S10|
|3828||1tʂɨə||to give a present; to enjoin; to tell; to know||CEREMONY?, CHA, SPEAK? (but no CEREMONY, CHA, or SPEAK graph in analysis which has 3813 2vɨẹ 'to see someone off')||S3?, P1?, S10?|
|3874||1ʔiə||hunger||HOLLOW (lacking food)||S4|
|3920||1kiụ||to bow, salute||CEREMONY||S3|
|4153||2lɨiw||to gather, assemble; transcription character||LU?||P12?|
|4201||?kha||casket, small box||XA?||P11?|
|4469||2ʂɨi||to go toward, depart||GO||S6|
|4475||?xa||to puff, blow; transcription character||XA||P11|
|4534||2dʐwɨiw||hungry||HOLLOW (lacking food; but analysis has 130 'source')?, TSU?||S4?, P8?|
|4681||1niu||ear||EAR, NU||S7, P16|
|4682||2khiə'||chimney, window, hole, space||HOLLOW, KHY||S4, P13|
|4696||1bạ||cymbals||BA, CYMBAL||P15, S11|
|4744||2ʔiõ||appearance, shape; transcription character||APPEAR||S1|
|4761||1ʂwɨa||to speak, say||SHA, SPEAK||P14, S10|
|4762||1tʂhɨe||to go, walk||GO||S6|
|4766||2bə||a kind of vegetable||BA||P15|
|4812||2rioʳ||to brush, wipe, whisk||RIR?||P2|
|4822||2dzwiə||to go, walk||GO||S6|
|4849||1niu||the surname Nu||NU||P16|
|4894||1mio||to listen, hear||EAR||S7|
|5126||1lɨu'||to carve, engrave||CARVE, LU||S9, P12|
|5412||2lwəʳ||ceremony, rite; to get a haircut; transcription character||CEREMONY, LWYR||S3, P10|
|5693||1vɪʳ||to listen, hear||EAR, VE||S7, P5|
|6010||1kiụ||to bow, salute (= 3920)||CEREMONY||S3|
I have numbered phonetic (P) classes by order of first occurrence in the table above. Class names are in my lay romanization for Tangut which ignores the four grades, vowel tension, and the unknown distinction indicated by -'. Y represents central nonlow vowels.
Phonetic classes organized by Homophones chapter
|Chapter||Initial type||Phonetic class|
|I||Labials||P4. PAR, P15. BA|
|III||Dentals||P3. TU, P7. NA, P16. NU|
|V||Velars||P9. NGA, P13. KHY|
|VI||Alveolars||(no pure VI classes)||P6. JE,
|VII||Alveopalatals (actually retroflex shibilants?)||P1. CHA, P14. SHA|
|IX||Liquids||P2. RIR, P10. LWYR, P12. LU|
Some of the phonetic classes could be combined (P4. PAR + P15. BA, P1. CHA + P14. SHA, P10. LWYR + P12. LU).
P6 and P8 might be split, as I am not certain that mixing class VI and VII initials was permissible in Tangut phonetic series.
I have also numbered semantic (S) classes by order of occurrence:
10.6.0:59: Some of those 27 classes could be combined into even bigger classes using ambiguous graphs as pivots: e.g., 0020 can either be CHA or ROAD, so CHA and ROAD graphs could be grouped together. Here is one particularly large group containing 18 classes:
That diagram is meant to be read from left to right: e.g.,
CEREMONY > LU > CARVE > JE
Two smaller groups are
CYMBAL > BA > PAR
EAR > NU, VE
Three classes cannot be combined with others: GO, NA, RIR.
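Grouping classes through ambiguous graphs is a connected-components problem: classes are nodes, and any graph assigned to two or more classes links them. Here is a rough Python sketch using only a handful of rows from the table above. The subset is mine, the queried assignment of 3821 to CEREMONY is treated as certain purely for illustration, and `group_classes` is a hypothetical helper, so the output is not the full 27-class result.

```python
from collections import defaultdict

# A small subset of the table: LFW2008 number -> classes (selection is illustrative).
GRAPH_CLASSES = {
    "0020": ["CHA", "ROAD"],
    "0752": ["CEREMONY", "CHA"],
    "3821": ["CEREMONY", "LU", "SPEAK"],  # CEREMONY is queried in the table
    "5412": ["CEREMONY", "LWYR"],
    "5126": ["CARVE", "LU"],
    "1630": ["CARVE", "JE"],
    "4696": ["BA", "CYMBAL"],
    "4681": ["EAR", "NU"],
    "5693": ["EAR", "VE"],
    "4469": ["GO"],
    "0948": ["NA"],
    "1082": ["RIR"],
}

def group_classes(graph_classes):
    """Merge classes that share a pivot graph (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for classes in graph_classes.values():
        root = find(classes[0])
        for other in classes[1:]:
            parent[find(other)] = root

    groups = defaultdict(set)
    for cls in list(parent):
        groups[find(cls)].add(cls)
    # Largest group first, then alphabetical.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: (-len(g), g))

for group in group_classes(GRAPH_CLASSES):
    print(group)
```

On this subset the sketch yields one large group containing CEREMONY, LU, CARVE, and JE, two smaller groups, and three singletons (GO, NA, RIR), matching the overall shape described above.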
Thus one could say there are six kinds of :
But I doubt literate Tangut actually looked at, say, 0760 2dʐɨe 'to judge' and thought, 'its left side indicates that it has a JE-like reading like 1630 2dziẽ 'carve', derived from the right side of 5126 1lɨu' 'to carve', in turn derived from the bottom left of 3821 2lʊ 'to give a present', in turn derived from the center of 5412 2lwəʳ 'ceremony'':
How did the Tangut learn and perceive their own script?
184.108.40.206:57: ARE WHIPS LIKE HAIR?
It was fun to use tentative Unicode code points for Tangut characters and components in my last post, but now I'm going to use Li Fanwen 2008 numbers again.
I've been trying to figure out the graphic etymology of
0219 2tseʳw 'whip'
The left side is shared with 69 other characters which don't seem to have any phonetic or semantic similarity to 2tseʳw 'whip'. I'll look at them again and post a list tomorrow.
The center and right components appear in five other characters. I already mentioned the first yesterday:
|LFW2008||Tangraph||Reading||LFW 2008 gloss||Character structure|
|2pɤẹ̃||hair||left of 'hair' + left of another graph for 'hair'|
|2mioʳ||second syllable of 2177 0227 1pə 2mioʳ 'rude, coarse, careless'||'language' + 'hair': i.e., coarse words are rude|
|2phʊ||boots worn in rain or snow||'boots' next to 'hair': i.e., furry boots|
|2giu||silk, silkworm||'bug' atop 'hair' (i.e., silk thread)|
|2ɬɤi||smooth, glossy||'not' next to 'hair'|
If the right two-thirds of 0219 were taken as a unit, then 'hair' is the most likely source. Although a whip is not much like a hair, it is even less like 'rude', 'rain and snow boots', 'silk(worm)', or 'smooth'. Moreover, none of the five sound like 2tseʳw.
I'll break up that two-thirds and see if I can find more plausible graphic sources.
10.5.0:30: Are whips associated with hair on Google?
"whip like a hair": 0 results
"whips like hair": 2 results
"whips made of hair": 7 results
"hairs like whips": 229 results
"hairs whip": 374 results
"hair whips": 32,100 results
"hair like whips": 39,400 results
"whip hair": 62,200 results
"hair whip": 93,500 results
"hair like a whip": 273,000 results
Of course modern English usage is not the key to the ancient Tangut mind. Nonetheless, the whip-hair connection is stronger in the 21st century than I had thought.
220.127.116.11:51: UNICODE TANGUT COMING IN JUNE 2016
This has been an exciting week. First, Baxter and Sagart's new Old Chinese reconstruction, then the catalog of Khitan large script characters, and in less than two years, 6,126 Tangut characters plus the Tangut iteration mark and 753 Tangut radicals. Andrew West has documented the long road his team has taken. Bravo!
Finding Tangut characters is easy in Unicode. For example, if I want the first character I mentioned on Wednesday, I can just search for its Li Fanwen 2008 number (0219) on this code chart, and voila!
U+17366 2tseʳw 'whip'
And I can find the second character I mentioned on Wednesday (Li Fanwen 2008 number 1877) by looking through the range of characters sharing its left-hand radical U+1896E (= Nishida 219, gloss unknown). Oddly the source graph for its left side according to the Combined Homophones and Tangraphic Sea has a different radical (U+18954 = Nishida 218 'dog/fox'):
U+1785F 2ʔiəʳ 'whip' =
left (!) of U+175EF 2khɤi 'yak'
all of U+18571 2phʊ 'tree'
Why does 'yak' plus 'tree' equal 'whip'?
The analysis of U+17366 2tseʳw 'whip' is unknown. There are 69 other characters containing the component
U+1892C (= Nishida 103, gloss unknown),
16 other characters with the middle component
U+18942 (= Nishida 275, gloss unknown),
14 other characters with the right-hand component
U+18975 (= Nishida 134, gloss unknown),
and five other characters containing the middle and right hand components: e.g.,
U+173F3 2pɤẹ̃ 'hair'.
Is a whip like a giant hair? Maybe. Or maybe there's a more likely source of the right two-thirds of U+17366 2tseʳw 'whip'. I'll look at the possibilities tomorrow.
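For readers who want to sort code points like the ones above into characters and components, here is a small Python sketch. The block ranges are those of the published Unicode Standard rather than anything stated in this post, and `block_of` and `parse_u_plus` are hypothetical helper names. The code points quoted in this post were tentative, so the sketch only says which block a value falls in, not that it identifies the same character in the final standard.

```python
# Block ranges as published in the Unicode Standard (an assumption external to the post).
BLOCKS = [
    (0x16FE0, 0x16FFF, "Ideographic Symbols and Punctuation"),  # Tangut iteration mark
    (0x17000, 0x187FF, "Tangut"),
    (0x18800, 0x18AFF, "Tangut Components"),
]

def parse_u_plus(label):
    """Convert a label such as 'U+17366' into an integer code point."""
    if not label.startswith("U+"):
        raise ValueError(f"not a U+ label: {label!r}")
    return int(label[2:], 16)

def block_of(code_point):
    """Return the name of the block containing code_point, or None."""
    for start, end, name in BLOCKS:
        if start <= code_point <= end:
            return name
    return None

for label in ["U+17366", "U+1785F", "U+1892C", "U+18975"]:
    print(label, block_of(parse_u_plus(label)))
```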
18.104.22.168:59: THE KHITAN LARGE SCRIPT IN SRI LANKA
I never expected Khitan to be discussed in
Sri Lanka <ś.ri l.ang.k.a>
at WG2 meeting 63. To be more precise, it was the Khitan large script that came up, not the Khitan small script above. I'm much less confident about this attempt to write the name in the large script:
<ś(i) ri la ang ka>
Even if one or more of those characters turns out to be inappropriate for transcribing Sri Lanka, I'm certain that a large script spelling would take up more space than its small script equivalent since the former is not clustered into word blocks like the latter.
The first of the large script characters is identical to the Chinese character 已 pronounced i in Liao Chinese, the northeastern dialect known to the Khitan a millennium ago. Should Khitan large script characters be unified with Chinese characters in Unicode?
The unification was proposed to minimize the security issues caused by co‐existence of similar shaped characters in the CJK Unified Ideograph [i.e., Chinese character] block and Khitan Large Script block.
Not knowing what the security issues are, I oppose unification. Unifying Chinese characters and the Khitan large script would be like unifying Latin A, Greek Α, and Cyrillic А. Would Greek and Cyrillic lookalike letters (e.g., Γ and Г) be assigned to one or the other alphabet while letters unique to Greek or Cyrillic (e.g., Δ and Д) were assigned to separate alphabets? My mind reels.
I also don't think unifying Jurchen (large) script characters resembling Khitan large script characters is a good idea. To me, Chinese characters, Khitan large script characters, and Jurchen (large) script characters are like the Latin, Greek, and Cyrillic alphabets: related scripts that should be kept apart in spite of partial visual overlap.
Encoding issues aside, I've been excited to browse the longest list of Khitan large script characters I have ever seen:
Proposal on Encoding Khitan Large Script in UCS
Part 1: Characters 0001-0472
Part 2: Characters 0473-0963
Part 3: Characters 0964-1455
Part 4: Characters 1456-1930
Part 5: Characters 1931-2218
(10.3.1.1:30: This last file does not include 已 <ś(i)> attested in the epitaph for 多羅里本 Duoluoliben [a.k.a. 突呂不 Tulübu, 1081], though it does include 己 [#1938] and 巳 [#1941] which also look like Chinese characters.)
I especially appreciate the inclusion of images of original characters. (10.3.0:06: But I wish I understood the codes for their sources.) I wanted to continue my series on Baxter and Sagart's new Old Chinese reconstruction, but I had to interrupt it to mention this breakthrough in Khitanology.
Alas, that list does not include any characters that Viacheslav Zaytsev may have discovered in Nova N 176, the longest known Khitan text in either script. As much as I'd love to be able to type the Khitan large script in Unicode as soon as possible, I wonder if it might be a good idea to wait until the characters in that book have been catalogued. It might be odd to have a first Khitan large script encoding covering all texts but Nova N 176. Typing words from what may be the most important Khitan text in the far future might involve going back and forth between a primary Khitan large script block and a Khitan Extended-A block. Awkward.
10.3.1:18: ADDENDUM: The Khitan large script proposal lists several inscriptions that I have never heard of before:
1. 耶律大王墓誌 Epitaph for Prince Yelü (personal name not given; 1051)
2. 耶律準墓誌銘 Epitaph for Yelü Zhun (1068)
3. 耶律李家奴墓誌銘 Epitaph for Yelü Li Jianu (1081)
4. 留隱太師墓誌銘 Epitaph for Master Liuyin (1109)
I wish I could see them.
22.214.171.124:59: GSR 0000 IN BAXTER AND SAGART (2014): PART 1
I didn't know Baxter and Sagart's new book Old Chinese: A New Reconstruction came out until it was released in the US yesterday, almost two weeks after it was released in the UK on 18 September. I'm not surprised it's sold out in the UK. I've waited years for it. I'll have to wait even longer because I can't afford it. But for now at least I can look at the reconstructions which the authors have kindly shared with the public (alternate URL). All reconstructions in this post are Baxter and Sagart's unless I state otherwise.
Will these reconstructions ultimately displace those of Karlgren's Grammata serica recensa (GSR, 1957)? We shall see.
For years I would recommend Schuessler's Minimal Old Chinese (2009) reconstructions to nonspecialists, as they incorporate many post-GSR elements widely accepted among scholars today (e.g., a six-vowel system) while excluding more controversial proposals. (I also recommend his Later Han Chinese in the same book. By definition it's too early to be Middle Chinese, but it's close, and I prefer it to nearly all Middle Chinese reconstructions I've seen.)
I dream of publishing my own reconstruction, but I really should finish my Golden Guide translation first, among many other things. I'd also like to publish a complete list of my readings of Tangut characters and the pre-Tangut sources of those readings. Both my Chinese and Tangut reconstructions have only been available in scattered form on this site and a couple of publications.
Enough about me. Let's start looking at Baxter and Sagart's Old Chinese reconstructions organized by GSR numbers. (Alas, the characters in the PDF are not directly searchable, though one can indirectly find them by searching for their Unicode code points.) At the top of the list are characters without GSR codes. Baxter and Sagart assigned them the number 0000.
The first character is 𠓥 *pe[n] 'whip', an alternate spelling of 鞭.
鞭 is a semantic-phonetic compound ('leather' + *be[n]) whereas 𠓥 is a compound of 攴 'strike' (itself a semantic-phonetic compound of 卜 *pˁok atop 又, a drawing of a hand) beneath something looking like 入 'enter' with a short horizontal line inside it. Those top components don't look like a pictograph of a whip to me, but I presume they're semantic. Another variant 𠓠 simply has 入 'enter' on top. See more variants here.
The brackets around the coda indicate that Baxter and Sagart "are uncertain about its identity". In this case, the coda might have been *-r. We know for sure that 'whip' ended in *-n in Middle Chinese, but Middle Chinese *-n could be from Old Chinese *-r as well as *-n*.
*pe[n] turns out to be an uncontroversial reconstruction. Pan, Zhangzheng, and Schuessler all reconstruct it as *pen. I am the odd man out, as my system requires a high vowel presyllable to account for the vowel breaking (partial vowel height matching) in Middle Chinese (MC):
*Cɯ-pen > *Cɯ-pien > *pien (= pjien in Baxter's MC transcription "not intended as a reconstruction")
My Old Chinese *pen without a high vowel presyllable (e.g., 邊 'side') remained *pen in Middle Chinese. Baxter and Sagart reconstruct 'side' as *pˁe[n] with a pharyngealized *pˁ- that in my view blocked breaking. Such pharyngealized initials distinguish their reconstruction from most others. I only reconstruct pharyngealization in Middle Old Chinese; it developed in (pre)initial** consonants preceding 'lower' vowels (*ʌ *e *a *o) and spread through the syllable:
*pen > *pˁen > *pˁeˁnˁ
On the other hand, my Old Chinese *Cɯ-pen was not subject to pharyngealization because its preinitial preceded the 'higher' vowel *ɯ. Pharyngealization and its absence conditioned vowel allophones that became phonemic after the loss of pharyngealization in Late Old Chinese (OC):
|Graph||Baxter and Sagart's OC||This site||Baxter's MC|
|Early OC||Middle OC||Late OC, MC|
|𠓥/鞭||*pe[n]||*Cɯ-pen||*Cɯ-pien > *pien||*pien||pjien|
|邊||*pˁe[n]||*pen||*pˁen > *pˁeˁnˁ||*pen||pen|
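The conditioning in the table can be stated as a one-line test: look at the first vowel of the (sesqui)syllable and pharyngealize only if it is 'lower'. The Python sketch below is a toy version of that test. The function names are hypothetical, the 'higher' set lists only the vowels that appear in this post's examples, and the optional `uvular_rule` flag models the untested suspicion about uvular initials mentioned in the second note below.

```python
LOWER_VOWELS = set("ʌeao")   # the 'lower' vowels named in the post
HIGHER_VOWELS = set("ɯi")    # only the higher vowels used in the post's examples

def first_vowel(form):
    """Return the first vowel of a form such as 'Cɯ-pen', or None."""
    for ch in form:
        if ch in LOWER_VOWELS or ch in HIGHER_VOWELS:
            return ch
    return None

def pharyngealizes(form, uvular_rule=False):
    """True if the form would pharyngealize in Middle Old Chinese (toy model)."""
    # Hypothetical extension: a uvular initial with no presyllable always pharyngealizes.
    if uvular_rule and "-" not in form and form.startswith("q"):
        return True
    return first_vowel(form) in LOWER_VOWELS

for form in ["pen", "Cɯ-pen", "Cʌ-qi", "Cɯ-qi", "qi"]:
    print(form, pharyngealizes(form), pharyngealizes(form, uvular_rule=True))
```

Under this model *pen pharyngealizes and *Cɯ-pen does not, which is the contrast between 'side' and 'whip' in the table.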
*10.2.0:51: I reconstruct *-n unless (1) a phonetic series or word family also contains Middle Chinese *-j readings pointing to *-r and/or (2) external evidence points to *-r. Baxter and Sagart's policy of reconstructing *-[n] is safer since there is no guarantee that all Old Chinese *-r belonged to such phonetic series or word families and/or can be reconstructed on the basis of external evidence.
I have not found any true cognates of the Chinese word for 'whip'; all lookalikes in the region are borrowings.
The Tangut words for 'whip' are completely different:
0219 2tseʳw < *T-tse(k/w)H (common) and 1877 2ʔiəʳ < *T-ʔəH or *ʔərH (only in dictionaries?)
**10.2.1:02: Preinitials are onsets of unstressed presyllables whereas initials are onsets of stressed syllables. Hence the preinitial of *Cɯ-pen was *C- (an unknown consonant) and its initial was *p-. The height of the vowel after the first consonant in a (sesqui)syllable conditioned the presence or absence of pharyngealization in Middle Old Chinese.
I suspect that uvular initials always conditioned pharyngealization regardless of the following vowel unless preceded by a high vowel presyllable, but I have not yet investigated that hypothesis:
*qi > *qˁi (but *ki > *ki)
*Cʌ-qi > *Cˁʌˁ-qˁeˁiˁ > *kei (same outcome as *Cʌ-ki)
*Cɯ-qi > *Cɯ-ki > *ki (same outcome as *Cɯ-ki)
126.96.36.199:40: STILI IN OFFORD AND GOGOLITSYNA (2005)
Offord & Gogolitsyna (2005; hereafter OG) is the first book of Russian for foreign learners that I have ever seen with extensive coverage of variation in Russian. Although Japanese is well known for its complex speech levels, I was surprised tonight to find that McClure's (2000) book on Japanese in the same series covers the same topic in a 33-page chapter that is less than half as long as OG's two chapters combined (72 pages). I think it would be possible to write a full-length book on variation for learners of Japanese. OG identify three registers of Russian. I have grouped their lowest varieties into a fourth register ranked below their first register (R1):
Demotic, youth slang, prison slang, thieves' cant, vulgar language
Everyday spoken conversation. I would have extreme difficulty making out compressed forms such as monosyllabic [grʲu] for trisyllabic говорю 'I say' (p. 10). I have wondered how learners cope with compression in English.
"This is the norm of the educated speaker, the standard form of the language that is used for polite but not especially formal communication [...] It is the register that the foreign student as a rule first learns and which is most suitable for his or her first official or social contacts with native speakers. [...] This register is perhaps best defined in negative terms, as lacking the distinctive colloquial features of R1 and the bookish features of R3" (p. 14)
Apart from textbook Russian, this is the style I am most acquainted with. OG noted the feature that stands out in my mind:
Various means are used to express a copula for which English would use some form of the verb to be, e.g. состои́т из [consists of], зaключáeтся в [concludes in], прeдстaвля́eт собо́й [presents itself as], all meaning is (4.2). (p. 15)
All three expressions for 'is' can be found in Zaytsev (2011)'s paper on Khitan. (Can be found is itself a bookish synonym for is in English.)
c. Journalism/political debate
Literary and online language can mix elements from across the above spectrum.
OG also discuss regional variation.
All that makes me ponder how little is known about Tangut, Jurchen, and Khitan. I suspect that Tangut words only appearing in odes and dictionaries are from a traditional, colloquial register whereas the bulk of surviving Tangut texts are in an elevated, Chinese-influenced register. Even less is known about Jurchen and Khitan. Surviving texts in those languages are largely in inscriptions representing a 'monumental' register. One huge possible exception is the Khitan book that Zaytsev (2011) identified; it may have been written in a different style.
188.8.131.52:59: NOT BEING THERE ANYMORE: RUSSIAN GERUND VARIANTS
Russian has several types of gerund suffixes. Six books for English-speaking learners include notes on when to use them:
|Aspect||Suffix||Example||Reiff (1883: 181)||Forbes (1916: 171)||Arant (1981: 119)||Pul'kina & Zakhava-Nekrasova (1992: 371)||Offord & Gogolitsyna (2005: 328)||Wade (2011: 386, 389)|
|Imperfective||(-shibilant) -я / (+shibilant) -а||встречая 'meeting'||"written tongue"|
|(consonant +) -учи / (vowel +) -ючи||встречаючи 'meeting'||"familiar language"||"peasants", "popular poetry"||not mentioned; even будучи 'being' is absent||"popular parlance"; "generally avoided in the modern literary language" with the sole exception of||only будучи 'being'||only будучи 'being', three others*|
|Reflexive imperfective||(-shibilant) -ясь / (+shibilant) -ась||встречаясь 'meeting'|
|Perfective||(-shibilant) -я / (+shibilant) -а||войдя 'having entered'||"common with reflexive verbs"|
|(vowel +) -в||встретив 'having met'||"written tongue"||interchangeable||"preferred in written styles" to -я/-а|
|(vowel + ) -вши||встретивши 'having met'||"familiar language"||"peasants", "popular poetry"||"less frequently" used than -в||"archaic flavour"; "may also occur" in "the colloquial register" or "demotic"||not mentioned|
|(consonant +) -ши||вошедши 'having entered'||"rarely used"|
|Reflexive perfective||(-shibilant) -ясь / (+shibilant) -ась||разбредясь 'having wandered in different directions'|
|(vowel + ) -вшись||встретившись 'having met'|
|(consonant +) -шись||ведшись 'having been in progress'|
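The distribution of the -в / -вши / -ши family in the table is simple enough to sketch in Python. This is a toy illustration, not a full account of Russian gerund formation: `perfective_gerund` is a hypothetical helper, the stems are taken as given from the table's examples, and the -я/-а type (войдя) is not covered.

```python
RUSSIAN_VOWELS = set("аеёиоуыэюя")

def perfective_gerund(stem, reflexive=False, long_form=False):
    """Attach -в, -вши, or -ши (plus -сь if reflexive) to a perfective stem."""
    if stem[-1] in RUSSIAN_VOWELS:
        # Vowel-final stems: short -в, or -вши (obligatory before reflexive -сь).
        suffix = "вши" if (long_form or reflexive) else "в"
    else:
        # Consonant-final stems take -ши.
        suffix = "ши"
    return stem + suffix + ("сь" if reflexive else "")

print(perfective_gerund("встрети"))                  # short form
print(perfective_gerund("встрети", long_form=True))  # long form
print(perfective_gerund("встрети", reflexive=True))  # reflexive
print(perfective_gerund("вошед"))                    # consonant-final stem
print(perfective_gerund("вед", reflexive=True))      # consonant-final reflexive
```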
1. Is -учи /-ючи still in "popular parlance" today?
Google Ngrams has no data for встречаючи, so here are two more pairs of gerunds:
читая vs. читаючи 'reading' (the former is always more common)
делая vs. делаючи 'doing' (ditto)
The title refers to the most common surviving -учи /-ючи gerund in Будучи там, the Russian title of Being There (1979). Будучи 'being' has 4.99 million Google results. See below for the Google statistics of other -учи /-ючи gerunds mentioned in Wade (2011).
2. I am surprised that the perfective gerund suffix -а/я is still around. It could be confused with the homophonous imperfective gerund suffix (though the latter attaches to many more stems). And yet войдя 'having entered' is much more common than its synonym вошедши in Google Ngram Viewer. (There is no risk of confusing inflected imperfective and perfective gerunds [i.e., stem-suffix sequences] as opposed to suffixes in isolation as long as each aspect has a different stem: e.g., the imperfective gerund corresponding to войдя/вошедши 'having entered' is входя 'entering' with a different stem вход-.)
3. I am also surprised that -вши is in decline (Wade does not even mention it!) though its reflexive counterpart -вшись is common.
встретивши was once more common than встретив, but their fortunes reversed shortly before the Revolution.
In short, I would expect the imperfective and perfective gerund suffixes to be maximally differentiated over time and internally consistent:
-я(сь)/-а(сь) vs. -(в)ши(сь)
But that's not the case!
9.30.1:36: Added a column for Offord & Gogolitsyna (2005: 328) and Google Ngrams links.
*The three are
едучи 'traveling' "is sometimes found in poetic or folk speech" (p. 386; 97,300 Google results)
жить припеваючи 'to live in clover' (p. 386; 124,000 Google results)
крадучись 'stealthily' (p. 394; 391,000 Google results)
184.108.40.206:14: TRANSCARPATHIAN RUSYN MASCULINE 'JA-NIMATES'
The Transcarpathian Rusyn (TR) and Prešov Rusyn (PR) masculine animate declension in Magocsi (1979: 83) and Magocsi (1979: 83) is straightforward in the singular: all endings are added to an invariable stem brat-:
|dative||*bratru||bratu, bratovy||bratovi||bratu, bratovi||bratu||bratovi||bratru, bratrovi|
|locative||*bratrě||bratu||bratovi||brati, bratovi||bracie||brate||bratu||bracie||bratovi||bratru, bratrovi|
The TR dative and locative plurals resemble the Russian plurals, but that must be a coincidence, as TR is not contiguous with Russian; it is spoken in the Transcarpathian Oblast' "which borders upon four countries: Poland, Slovakia, Hungary, and Romania." I wonder if those TR ja-plurals were influenced by Polish whose ci is from *tj. The TR nominative plural is unlike those of Polish or Slovak.
TR bratüm < *bratomŭ may be an older TR dative plural or a very old borrowing from Slovak predating *o-fronting and *-ŭ-loss.
Moreover, the Russian plural forms are based on an old feminine collective which must have replaced an earlier regular masculine plural *braty still preserved in the other East Slavic languages. On the other hand, all non-j TR forms are from brat- rather than the feminine collective *bratĭja.
The Serbo-Croatian 'plural' braća 'brothers' is still a feminine collective singular unlike Russian brat'ja which takes plural endings except in the old nominative singular (now reinterpreted as a plural). Hence none of its endings are cognate to those of the original masculine plurals.
Polish has a mixture of old singular and plural forms of that collective. I assume the old feminine accusative singular *bracię has been replaced by the old feminine genitive singular braci to conform to the genitive-as-accusative pattern of masculine animates. (23:30: The old feminine vocative singular would have been *bracio; it has been replaced by the old nominative singular since masculine plurals have identical vocatives and nominatives.)
Slovak combines that collective (reinterpreted as a masculine plural) in the nominative with forms of brat- in all other cases.
Notes on other forms
Stem: Only Czech preserves the second *-r-.
Nominative/accusative singular: Originally identical but differentiated later when the genitive came to be used as the accusative singular. See Schenker (1993: 108).
Dative/locative singular: Apparently partly merged in TR and Ukrainian. Fully merged in PR, Slovak, and Czech. Dative for locative reminds me of the dative after German prepositions.
What is the origin of -ovy/-ovi?
PR y normally does not correspond to Ukrainian i. Why does PR have -y instead of -i?
Instrumental singular: Did Polish and Czech generalize -em from other paradigms? Czech -em in this paradigm must postdate *r shifting to ř before *e (a change visible in the vocative).
Belarusian unstressed *o became a.
Nominative plural: In spite of my transliteration, TR/PR bratȳ [bratɨ] is homophonous with Belarusian braty [bratɨ] but not Ukrainian braty [bratɪ].
Genitive plural: Originally homophonous with nominative and accusative singular. How did *-ovŭ (the source of most forms above) and *oː (the source of Czech ů [uː]) develop?
The *o before *ŭ fronted to ü in TR and lost its rounding in PR and Ukrainian.
*-v became Belarusian -ŭ.
Russian -ev is an allomorph of -ov after -j-.
Dative plural: Is -a- instead of -o- in most of East Slavic other than TR and PR by analogy with the instrumental -ami?
Is PR bratom due to Slovak influence postdating the *o > i shift before *ŭ?
Is TR bratüm due to Slovak influence predating *o-fronting?
Accusative plural: Czech preserves the original homophony of accusative and instrumental plural. All other modern languages have accusative plurals from genitive plurals.
Instrumental plural: Schenker (1993: 89) could not explain the original ending *-y. It was replaced by -mi endings by analogy with other declensions.
-a- in East Slavic could be from the -ami of the -a-declension.
Locative plural: Is Czech the only language in the table with a reflex of *ě? Most of East Slavic seems to have generalized -a- from the instrumental and/or dative plural. Polish braciami has the -ami of an a-declension instrumental plural. Slovak may have generalized -o- from the genitive/accusative and/or dative plural. PR o must be from the dative plural since *o borrowed from the old genitive/accusative plural *-ovŭ would have fronted to *i.
9.28.23:57: I forgot to ask if the -j- in TR bratjam and bratjach is in all masculine consonant-final dative and locative plural forms or is only in a subset of those forms. I could answer my own question by looking for all masculine consonant-final dative and locative plural forms in Magocsi (1979), but my copy is not machine-searchable, and that would be time-consuming. My guess is that (1) TR brat belongs to a small class of masculine animate nouns which once had alternate plurals based on feminine singular collectives and (2) all other TR masculine animate nouns share the endings -am and -ach with masculine inanimates and neuters.
220.127.116.11:03: НЕСПРАВНІ СЛОВА
Magocsi (1979: 82) listed fifteen English loanwords in American Rusyn that he regarded as "incorrect" (несправні <nespravni>). They contain a number of surprises from an English speaker's perspective:
1. 'Displaced' stress
Verbs are borrowed with the stressed suffix -ва́ти <-váty>. The English roots are unstressed: e.g.,
bother > бадерова́ти <baderováty> (not *báderovaty)
Is the stress in the following word by analogy with other -ня <-nja> words?
grocer > ґросе́рня <grosérnja> (not *grósernja)
The stress in 'watch out!' is by analogy with its native equivalent:
watch > вачу́йте <vačújte> (not *váčujte) : мирку́йте <myrkújte>
Also see 'cookies' and 'cousin' below.
2. Assignment of monosyllabic consonant-final nouns to the feminine -a declension
yard > я́рда <járda> (not *jard)
car > ка́ра <kára> (not *kar)
mine > ма́йна <májna> (not *majn)
store > штор <štor> (not *štóra; the initial consonant is irregular)
Polysyllabic consonant-final nouns were assigned to the masculine consonant-final declension:
carpet > ка́рпет <kárpet>
closet > кла́зет <klázet>
3. Double plural
cookies > куке́сы <kukésȳ>
I suppose the Rusyn plural ending -ȳ was added to kukés- because *kuki would end in an un-Rusyn -i- and could not be declined.
Is there a singular kukés?
I'm surprised the stem isn't *kúkiz-.
4. Spelling-based borrowings?
Rusyn y is [ɪ].
cousin > кузи́н <kuzýn> (not *kázyn)
picture > пі́кчер <píkčer> (not *pýkčer)
run (?) > рунова́ти <runováty> 'to drive' (not *ranováty)
The -e- in kukésȳ 'cookies' may also be influenced by spelling.
5. Vowel not matching spelling or pronunciation
drive > дрейвова́ти <drejvováty> (not *drájvovaty)
Oddities like this make me wonder about the dialect(s) and nonnative, non-Rusyn English that Rusyn speakers heard.
If someone asked me how to distinguish between modern written Russian, Belarusian, and Ukrainian without actually knowing the languages, I'd tell them to look for letters specific to each orthography:
ъ <''> is only in Russian
є <je> and ї <ji> are only in Ukrainian
ў <ŭ> is only in Belarusian
The problem with that approach is the low frequency of those letters:
ъ <''> is the rarest letter in Russian
є <je> and ї <ji> are eight-point letters in Ukrainian Scrabble
ў <ŭ> is the 12th least frequent letter* in the Narkamaŭka Belarusian orthography and the 11th least frequent letter in the Taraškievica orthography
Here is a different approach using higher-frequency letters:
- if a text contains і, it is either Ukrainian or Belarusian
- if a text contains і and и, it is Ukrainian
- if a text contains і and ы, it is Belarusian
- if a text contains и and ы, it is Russian
This table shows the distribution of the three letters:
Note that и has different phonemic values in Russian and Ukrainian.
і is the third most frequent letter in Belarusian and a one-point letter in Ukrainian Scrabble.
ы is the 5th most frequent letter in the Narkamaŭka Belarusian orthography and the 4th most frequent letter in the Taraškievica orthography, but is the 19th most frequent letter in Russian.
The Russian, Belarusian, and Ukrainian words for 'identification' exemplify the different distributions of those letters:
R идентификация <identifikacija>
B ідэнтыфікацыя <identyfikacyja>
U ідентифікація <identyfikacija>
The Russian word would be an even better example if it contained ы as well as и.
Belarusian has one difference absent from the table above: э where the others have е.
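The letter-based approach above can be sketched as a small function. This is only a rough illustration of the і / и / ы heuristic, not a real language identifier: the function name `guess_east_slavic` and the "undetermined" fallback are my own assumptions, and the heuristic itself says nothing about a text that contains only one of the three letters or all three at once.

```python
def guess_east_slavic(text: str) -> str:
    """Guess Russian, Belarusian, or Ukrainian from the letters і, и, ы."""
    letters = set(text.lower())
    has_dotted_i = "\u0456" in letters  # Cyrillic і
    has_i = "\u0438" in letters         # Cyrillic и
    has_yery = "\u044b" in letters      # Cyrillic ы

    if has_dotted_i and has_i and has_yery:
        # Assumption (not part of the heuristic): all three letters together
        # fall outside the three-way scheme, e.g. mixed-language text.
        return "undetermined"
    if has_dotted_i and has_i:
        return "Ukrainian"
    if has_dotted_i and has_yery:
        return "Belarusian"
    if has_dotted_i:
        return "Ukrainian or Belarusian"
    if has_i and has_yery:
        return "Russian"
    # Assumption: one letter alone (e.g. only и) is not enough to decide.
    return "undetermined"


print(guess_east_slavic("ідэнтыфікацыя"))  # Belarusian
print(guess_east_slavic("ідентифікація"))  # Ukrainian
print(guess_east_slavic("идентификация"))  # undetermined (и without ы)
```

The last call shows why the Russian word for 'identification' is a weaker example than the other two: it contains и but no ы, so the strict version of the heuristic cannot classify it on its own.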
So far, so good. But then I finally got around to looking at the Rusyn alphabet this week. I've known about Rusyn for years without knowing that its alphabet was like a combination of the Russian and Ukrainian alphabets. It has
- ё, ы, ъ like Russian
- є, і, ї like Ukrainian
I don't know anything about Rusyn, much less its historical phonology. My guess is that Rusyn did not merge *y and *i unlike Ukrainian:
Did Pannonian Rusyn merge all three vowels into и? If so, then it is like Ikavian Serbo-Croatian in that respect.
On Tuesday I discovered a Transcarpathian variant of the Rusyn alphabet with two more letters in Magocsi (1979): ӱ <ü> and ю̈ <jü>.
ӱ <ü> is from *o before a short high vowel:
*nočĭ 'night' >
Russian ночь <noč'>
Belarusian ноч <noč>
Transcarpathian Rusyn нӱч <nüč> (fronting) (p. 14)
Ukrainian ніч <nič> (fronting and loss of rounding)
I can't explain this correspondence:
*děvica 'girl' >
Russian девочка <devočka>
Belarusian дзяўчына <dzjaŭčyna>
Transcarpathian Rusyn дӱвочку <düvočku> 'girl' (acc. sg., p. 23) (I would expect *divočku)
(9.27.0:05: I'm pretty sure the nom. sg. is düvočka. Did *ě round before *o?)
Ukrainian дівчина <divčyna>
ю̈ <jü> is much rarer than ӱ <ü>. Here are two examples from *e before a short high vowel:
*medŭ 'honey' >
Russian and Belarusian мёд <mjod> [mʲot]
Transcarpathian Rusyn мню̈д <mnjüd> [mɲyd] (p. 37)
Ukrainian мед <med> [mɛd]
(9.27.0:30: Transcarpathian Rusyn [mɲ] is reminiscent of Czech [mɲ] from *mj-, though the two languages are not contiguous. Transcarpathian Rusyn's neighbor Slovak has [m] corresponding to Czech [mɲ].)
*anŭgelŭ 'angel' >
Russian ангел <angel>
Belarusian анёл <anjol>
(9.27.0:32: Coincidentally reminiscent of Slovak anjel. Did Belarusian simplify *ng to n?)
Transcarpathian Rusyn агню̈ль <ahnjül'>, ангел <anhel> (p. 52)
(The former has an irregular palatalized -l' and the latter looks like a later loan.)
Ukrainian ангел <anhel>
Another example is from *ju before a short high vowel:
*ključĭ 'key' >
Russian, Belarusian, and Ukrainian ключ <ključ>
Transcarpathian Rusyn клю̈ч <kljüč>
*The Belarusian frequency lists include Russian letters absent from Belarusian at the bottom: и, ъ, щ. I presume those letters appeared in Russian names and words in Belarusian texts. I have excluded those letters from my ranking.
18.104.22.168:51: MIENSK I MINSK
(The title is from Менск і Мінск 'Miensk and Minsk', the first song I ever heard in Belarusian.)
I was puzzled by this section of the English Wikipedia entry on Minsk:
The Old East Slavic name of the town was Мѣньскъ (i.e. Měnsk < Early Proto-Slavic or Late Indo-European Mēnĭskŭ), derived from a river name Měn (< Mēnŭ). The direct continuation of this name in Belarusian is Miensk (pronounced [mʲɛnsk]). The resulting form of the name, Minsk (spelled either Минскъ or Мѣнскъ), was taken over both in Russian (modern spelling: Минск) and Polish (Mińsk), and under the influence especially of Russian it also became official in Belarusian. However, some Belarusian-speakers continue to use Miensk (spelled Менск) as their preferred name for the city.
It does not explain where Minsk came from. The standard Belarusian reflex of Proto-Slavic *ě ('jat') is e (with palatalization of the preceding consonant indicated by -i- in Łacinka). Russian has the same reflex of jat. Among the East Slavic standard languages, only Ukrainian has i from jat. The Slavic root for 'white' in Беларусь Belarus' 'White Rus' ' has jat:
Belarusian бел- bieł- [bʲɛl]
Russian бел- bel- [bʲɛl] (Łacinka disguises the fact that the Belarusian and Russian roots are homophonous)
Ukrainian біл- bil-
Polish biał- [bʲaw]
(More descendants here.)
One might think that Minsk is a borrowing from Ukrainian (in which the word is Мінськ Mins'k with the shift ĭs > s'), and in fact Vasmer credits Ukrainian influence rather than outright borrowing. The Belarusian Wikipedia in the current official orthography states that according to Aničenka (1987), the spelling Minsk adopted in 1939 incorporates the Ukrainian reflex of jat.
The Taraškievica Belarusian and Russian Wikipedias mention another explanation by Abremska-Jabłońska in Kramko and Štychaŭ 2001: the influence of the Polish name Mińsk (Mazowiecki) '(Masovian) Minsk'.
The Russian Wikipedia says the i-spelling in Latin dates from 1502 when Minsk was under Lithuanian rule. The Polish-Lithuanian Commonwealth was still 67 years in the future.
At first I thought it was likely that the Poles renamed Minsk after their own Mińsk, but why would non-Poles* alter the name to match a name in a foreign country? And centuries later, why would the BSSR adopt a Ukrainianized name for Minsk?
Here is an uninformed guess: Did the originators of the spelling Minsk perceive the local Belarusian reflex of jat to be i-like: i.e., an [e] or [ɪ] higher than Belarusian e [ɛ]? Such a high reflex could have later lowered and merged with [ɛ]. Or this hypothetical high-jat dialect could have been replaced by an [ɛ]-jat dialect.
*9.26.0:52: I don't know who wrote the Latin documents containing Minsk. They could have been Lithuanians or Belarusians. In any case, they did not have the option of writing a higher e with the dotted letter ė which was absent from the earliest Lithuanian alphabet of 1547. (In modern Lithuanian orthography, plain e is [ɛ] and dotted ė is [eː]. The Lithuanian Wikipedia article on the Lithuanian alphabet gives me the impression that dotted ė is only a little over a century old.)
22.214.171.124:54: BROTHER-IN-LAWS IGOR AND OLEG
I am barely a dilettante at Slavic, so I constantly fear that I am raising Comparative Slavic 101-level questions whenever I bring up the subject. Yesterday I asked why *e in *děverĭ 'brother-in-law' didn't raise in Ukrainian. Today I learned that the late George Shevelov himself (1979: 309) wasn't sure:
The reason for the appearance of e in [standard Ukrainian] díver 'husband's brother' is unclear. Could it be an influence of NU [northern Ukrainian] dialects where e is restored in unstressed syllables?
So maybe that wasn't such a bad question after all. I don't know about these next questions, though.
Another word from my last post, Russian Igor' / Ukrainian Ihor / Belarusian Ihar, is from Old East Slavic In(ŭ)gvarŭ which in turn is a loan from Old Norse Ingvarr. Let's go through this word from left to right:
According to Shevelov, nasal + consonant sequences did not exist at the time. Hence there were four options to deal with Old Norse Ing-:
1. Borrow as is in spite of native phonotactics: Ingvarŭ
2. Insert ŭ to break up the ng-cluster: Inŭgvarŭ
3. Drop the n to avoid the ng-cluster: the ancestor of Igor'
4. Replace In- with native nasalized Ę- to break up the ng-cluster.
All but the last option were exercised. A nasal vowel would have become *Ja- in modern forms like Russian *Jagor', etc.
G weakened to h in Ukrainian and Belarusian.
I have not seen the change *va > o anywhere in Slavic. Are there other examples? Was Old Norse va something like [wɒ] or [wɔ] which would have been close to Old East Slavic *o? Belarusian a in Ihar is from o and is not a direct retention of the Old Norse vowel.
Why does Russian have -r' < *-rĭ if the Old East Slavic word ended in *-ŭ?
Ukrainian final -r in theory could be from either *-rŭ or *-rĭ, but the -r of Ihor must be from *-rĭ since palatalized r appears before endings: Ihorja instead of *Ihora, etc.
Belarusian r is always unpalatalized, so the endings of Ihar do not reveal whether its -r was from *-rŭ or *-rĭ: e.g., Ihora, etc.
Another Norse name in East Slavic is Russian Oleg / Ukrainian Oleh / Belarusian Aleh from Old Norse Helgi via Old East Slavic Olĭgŭ.
Old East Slavic had no H-. (As already stated, the later h of Ukrainian and Belarusian is from g.) Old Norse -e- was borrowed as Old East Slavic Je- with a prothetic J-. This Je- then became Jo- and ultimately O-; cf. Proto-Slavic *ezero > Russian/Ukrainian ozero / Belarusian vozera 'lake'. Belarusian lowered unstressed O- to A-.
'Strong' ĭ before a 'weak' ŭ lowered to e in East Slavic. (See Wikipedia on the 'strong'/'weak' distinction.)
Why does the -i of Old Norse Helgi correspond to Old East Slavic -ŭ instead of -ĭ? I am reminded of how Russian third person verb endings end in -t from -tŭ instead of the expected -t' from -tĭ corresponding to Ukrainian -t', Belarusian -c', and - far outside Slavic - Sanskrit -ti.