
Some notes on the original Rigvedic language

in the context of the OIT (or, Out-of-South-Asia) theory
of the Indo-European Urheimat
[Rigvedic as an originally kentum Harappan language: a hypothesis]

PART 1

by Igor A. Tonoyan-Belyeyev
July 2018

§1. It is usually taken for granted that the Rigvedic dialect does not differ much from the
other Vedic dialects (thus representing the same chronological stage as they do) and,
moreover, that it has much in common with both the documented Iranian and the Middle
Indo-Aryan dialects (which would presumably allow us to date it, in terms of absolute
chronology, as much younger than it could be).
In addition, there are some misconceptions with regard to Old Indo-Aryan
itself, in general.

§2. The origin of retroflexes (cerebrals, cacuminals, etc.). There are two major
misconceptions concerning the origin of Indo-Aryan retroflexes, which are widespread and
misleadingly taken as proven.
(1) "Indo-Aryan retroflexes appeared and developed under the influence of the
Dravidian languages". This is an entirely unproven thesis, as there are at least two
counter-arguments to it. One of them is that the absolute majority of words containing
retroflexes in the oldest Indo-Aryan texts are of proven Indo-European origin (we
should also note that Indo-Aryan does not have alveolars, which are present in
Dravidian; moreover, Old Indo-Aryan does not even have original short simple e and o
vowels, which are present in both Dravidian and most non-Aryan Indo-European
languages). The other is that we have at least one example of an "internal" Indo-European
evolution of retroflexes from combinations of dentals with a liquid far outside the
area of the Dravidian languages, namely in the Scandinavian languages (where rt, rd, rn, rs
yield the corresponding retroflexes in Norwegian and Swedish); this does not prove
our thesis, but it shows that Indo-European languages are able to develop retroflexes
without any Dravidian influence. It should be added that some Uralic languages
show traces of retroflexes, too (but this problem is beyond the scope of the present paper).
(2) "Indo-Aryan retroflexes in words of Indo-European origin developed from r +
dental combinations". This is a more specialized misconception, but its influence is too
destructive to be ignored. In fact, it is Rigvedic that allows us to demonstrate the validity
of a different approach to this problem. First, it is in Rigvedic that we find a special
intervocalic sandhi of the "etymological" phonemes ḍ and ḍh, which between two vowels
become -ḷ- and -ḷh-, respectively (the same process is known much later from the Middle
Indo-Aryan Pāḷi language as well, whereas New Indo-Aryan has intervocalic -ṛ- and -ṛh-
instead). Second, in all Old Indo-Aryan dialects it is the combinations lt, ld, ldh, ln, ls that
are completely forbidden, not rt, rd, rdh, *rn (rṇ), *rs (rṣ). Third, as opposed to Iranian
(at least Avestan and Old Persian), both Rigvedic and all other Old Indo-Aryan dialects
actually preserve many cases of "original Indo-European L" (though not in the position
described above), which makes them significantly different from the Iranian dialects. Fourth,
in some Old Indo-Aryan dialects (other than Rigvedic) we find simple "L" in place of
Rigvedic ḍ/ḷ (cf. iḷā ~ ilā, etc.). "Semi-Prakritized" cases like kṣudra ~ kṣulla should not be
ignored either.
Keeping all of the above in view, we should state that:
1) Originally, Old Indo-Aryan retroflexes could have begun their development without any
need of deep contact with Dravidians (or influence from them), that is, outside of any
zone of intensive contact with them (which should mean that there were likely no
Dravidians present in Punjab in any epoch, including Early and Mature Harappan);
2) Originally, retroflexes could have been completely absent from the original Rigveda,
which was then gradually transformed phonetically ("adjusted to current pronunciation")
in line with the development of the later Vedic dialects of those speakers who transmitted it
orally from generation to generation (and who composed the late Vedic texts where
retroflexes are metrically proven). This argument requires a thorough check of the text
of the Rigveda with respect to its metrical consistency in all cases where we replace
retroflexes with the corresponding "L+dental" combinations (except in the cases of ṣ and ṇ,
which are of various origins, not only biphonemic). Thus, Rigvedic could originally (in the
original text, when it was first composed) have had no retroflexes at all but, instead, only
the clusters lt, ld, ldh, ln, ls, and simple n and s in those places (after "cerebralizing" r
and so on) where late Vedic and then Sanskrit developed ṇ and ṣ, respectively. Please note
that from the early Prakrits (including Pāḷi) on, "original" Old Indo-Aryan ṣ "again"
becomes simple s.
This feature might be called the "Lambdaic origin of Old Indo-Aryan retroflex stops".
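
The metrical check proposed above can be sketched in code. The following is only a minimal illustration under stated assumptions: words are given in precomposed IAST transliteration, only word-internal syllable weight is considered (no sandhi across word boundaries), vocalic ḷ is not handled, and the helper names `to_lambdaic` and `weights` are hypothetical. It replaces retroflex stops with "L+dental" clusters and compares the heavy/light pattern before and after.

```python
import re
import unicodedata

VOWEL_CLASS = "aāiīuūeo"
VOWEL_RE = re.compile("ai|au|[aāiīuūṛṝeo]")
LONG = {"ā", "ī", "ū", "ṝ", "e", "o", "ai", "au"}
# An aspirate digraph counts as ONE consonant for syllable weight.
ASPIRATE_RE = re.compile("[kgcjṭḍtdpbḷ]h")


def to_lambdaic(word):
    """Replace retroflex stops by hypothetical 'L+dental' clusters."""
    word = unicodedata.normalize("NFC", word)
    # Intervocalic ḷ / ḷh are treated as sandhi forms of ḍ / ḍh.
    word = re.sub(f"(?<=[{VOWEL_CLASS}])ḷ(?=h?[{VOWEL_CLASS}])", "ḍ", word)
    # ṭ -> lt, ḍ -> ld (so ḍh -> ldh automatically); ṣ and ṇ are left alone.
    return word.replace("ṭ", "lt").replace("ḍ", "ld")


def weights(word):
    """Return a string of H (heavy) / L (light), one letter per syllable."""
    s = unicodedata.normalize("NFC", word)
    s = ASPIRATE_RE.sub(lambda m: m.group(0)[0], s)
    nuclei = list(VOWEL_RE.finditer(s))
    pattern = []
    for i, m in enumerate(nuclei):
        nxt = nuclei[i + 1].start() if i + 1 < len(nuclei) else len(s)
        cluster = s[m.end():nxt]  # consonants following this vowel
        heavy = m.group(0) in LONG or len(cluster) >= 2
        pattern.append("H" if heavy else "L")
    return "".join(pattern)


for w in ("iḷā", "mīḷha"):
    print(w, weights(w), to_lambdaic(w), weights(to_lambdaic(w)))
```

After a long vowel (mīḷha) the substitution leaves the weight pattern unchanged, while after a short vowel (iḷā) it turns a light syllable into a heavy one; it is cases of the latter kind that a real check against the metre would have to examine.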

§3. The original absence of the "jh" phoneme (voiced aspirate palatal stop). Some
Indo-European reconstructors take it as proven that "Common Indo-Iranian" already possessed
a kind of jh-like phoneme (in two forms, for the "first palatal" and the "second palatal"
series). But the fact is that the Rigveda, archaically, does not have any jh phoneme, having
h in both cases instead (a phoneme without a plosive element, though phonologically
functioning as a stop, as it cannot be doubled), while only late Old Indo-Aryan and Middle
Indo-Aryan have a real jh phoneme. Some scholars argue that, just as the Prakrits have idha
where Vedic has iha (with "dh" having lost its plosive element), the Rigvedic dialect
could in the same way have lost an original jh. But the problem goes much deeper. Old
Indo-Aryan is the only dialect group within Indo-European (in contrast to both Iranian and
all newer stages of Indo-Aryan itself) which possesses this voiced velar fricative (in Middle
Indo-Aryan it becomes unvoiced, while Iranian has only g, j, z, ž corresponding to it, and
the same is the case with Balto-Slavic and Armenian as well). As the Pratyāhāra-sūtra-s to
the grammar of Pāṇini show, there existed at least two h's in Old Indo-Aryan, one
functioning as a stop and the other representing a kind of sonorant. In this case, the
latter h originally developed directly from a Proto-Indo-European laryngeal (with which the
two voiced aspirated stops of the velar series coincided). And, in this case, Sanskrit
words like hala "plough" (cf. its development in Sindhi) really correspond to Latin ara,
arare, etc. This again sharply separates Indo-Aryan from Iranian, where the two velar
series are distinguished throughout from the very beginning (with g/j for the "second
palatal" *jh and z/ž for the "first palatal" *jh); the same holds for the other satem dialects
(thus grouping Iranian with Balto-Slavic and Armenian rather than, as intuition would
suggest, more closely with Indo-Aryan), with the sole exception of Indo-Aryan and,
probably, Nuristani.
But as jh eventually developed in the later Vedic dialects and then in Middle Indo-Aryan,
its original absence should have been a distinctive feature of only the oldest
form of the Indus-Sarasvati valley Aryan dialect, that is, Rigvedic (as Rigvedic
geography is documented within the Rigveda itself). Here we have two features: Rigvedic
is opposed even to the other Old Indo-Aryan and Middle Indo-Aryan dialects in not having jh;
Old Indo-Aryan (including Rigvedic) is opposed to Iranian and the other satem dialects in
having plosive j and voiced fricative h for both voiced "palatal" series (which are thus
not distinguished), whereas these are distinguished in Iranian, Baltic, Slavic, Armenian,
and so on (but those non-Indo-Aryan satem dialects, in turn, mostly do not distinguish
between aspirates and non-aspirates, Armenian excepted).
All this makes us think that, originally, there existed in Rigvedic only one voiced
aspirate phoneme (h0,1,2) for three original Indo-European phonemes, that is, a)
some cases of preserved laryngeals, b) the voiced velar/palatal aspirate, and c) the voiced
"labiovelar" aspirate.
We should add that Rigvedic is originally likely to have had only one voiced
non-aspirate stop for both velar series as well, that is, j1,2 (though with different
alternation patterns), in contrast to most other satem dialects (including Iranian).

§4. Highly probable original absence of the vowels ṛ, ṝ, and ḷ in the oldest Indo-Aryan. We
know that at least Classical Sanskrit (and Late Vedic) and some Slavic dialects have or
had syllabic r and l sounds, that is, vowel counterparts of these phonemes. It is generally
accepted by Indo-European reconstructors that such phonemes (together with syllabic m
and n, following de Saussure and later scholars) actually existed in Proto-Indo-European.
As for Indo-Aryan, their presence is said to be one of its "archaic" features.
But, looking at Iranian and Greek, we see that the corresponding elements are always
represented by clusters containing real vowels (a, i, u, e, o) plus r, l, n, m. As for the
Slavic languages, they are not unanimous in this respect: most of them have
vowel+sonorant or sonorant+vowel combinations instead (or even vowel+sonorant+vowel,
as East Slavic does). All this makes us think that this feature was not common
Indo-European and developed separately (and rarely!) in some dialects and at a relatively
late stage. Thus, the Rigvedic language is again "adjusted" to the later Vedic stage, when
such groups were pseudomonophonematized (for a short period, as in Middle Indo-Aryan
they are again absent). Now, the first misleading idea is that if a sound is present in the
"phonemic alphabet", it really existed. Let us consider the long Indo-Aryan vowel ṝ: in
dictionaries, it appears within some verbal roots like kṝ- (to scatter), pṝ- (to fill), etc. But
when we look at the real words derived from those abstract roots, we find only the groups
kar/kir/kīr and par/pur/pūr instead of forms containing ṝ. The only position where this
pseudophoneme was "written" is the group of ṛ-final stems, where the Accusative masculine
plural had -ṝn, the Accusative feminine plural had -ṝs, and both had a Genitive plural in
either -ṛṇām or -ṝṇām. Thus, the original nonexistence of long ṝ is preliminarily
demonstrated. As for the vowel ḷ, it is documented in only one Old Indo-Aryan root, kḷp-,
so it is in any case a marginal phoneme (like ph).
To prove the original nonexistence of short ṛ is a bit more difficult (and it can be
postulated only for the earliest stage of Vedic). First, masculine and feminine stems
ending in -ṛ have an abnormal -us/-ur in the Genitive singular instead of either *-ras (as in
earliest Vedic vasvas in contrast to later vasos) or *-as/ar/ās/ar (or, less probably, -ṛs).
Second, the weak grades of both ra/la and ar/al sequences have the same result in Indic,
which makes us suspect it to be secondary. Third, at least in some cases there exists no
weak grade for ra/la/ar/al sequences (no alternation). Fourth, in Iranian we normally have
ar instead (Old Persian arta etc.). Fifth, in Old Indo-Aryan itself we already have doublets
like kṛmi~krimi ("worm"). There are still other considerations, but we are not going to go
into them here. We simply state that it is highly probable that no ṛ / ṝ / ḷ vowels existed in
original Rigvedic, but only ar/ir/ur/ār/īr/ūr, al/il/ul/āl/īl/ūl, ra/ri/ru/rā/rī/rū and
la/li/lu/lā/lī/lū (this hypothesis in no way contradicts the "spirit of Pāṇini", by the way).

§4. The Phonology of the Harappan Script. As archaeological evidence in favour of the
Out-of-India theory accumulates, we are going to prepare here some basis for
correct work on deciphering the Harappan script (ca. 2600-1900 BC). Many scholars and
amateurs have tried to solve this problem. Most of them worked without any specific
method. All those attempts were summarized and surveyed by the notable Iravatham
Mahadevan and other scholars. The prevailing hypothesis has been that it was a Dravidian
language that was spoken (and written) in Mature Harappa. But even those who thought the
Harappan language was Aryan or Indo-Aryan did not try to use the results of systematic
Proto-Indo-Aryan phonological reconstruction (both internal, based on the Prātiśākhyas,
Pāṇini, etc., and comparative, in the Indo-European context). Our idea is that the language
of the Mature Harappan period was lexically and grammatically close to or identical with
Rigvedic and, to a lesser degree, Samavedic and Atharvavedic, but as for its phonology, it
was significantly different, as the language of those oldest samhitas was "juniorized" by
the people among whom these texts circulated until the basics of phonology and the methods
of artificially supported memorizing, the pāṭha-s, were elaborated (which was done only at
the beginning of the so-called Sūtra [post-Vedic] period, and not earlier). That is why the
Vedic texts look glottochronologically much more homogeneous than they are. The
Brāhmaṇas, Āraṇyakas, and some of the earliest Sūtras (like those of the Baudhāyana
school), in turn, are likely to have been composed after 1900 BC (the former two mainly
during the so-called Late Harappa, and the Sūtras and the latest versions of the former two
in other areas, including that of PGW and the like). The earliest editions of the Yajurveda
can be dated to ca. 1900 BC, while the youngest ones (of the so-called "White Yajurveda")
could probably be dated to ca. 1300 BC or even a bit later. Still, some mantras and other
passages in any samhita or Brāhmaṇa could be much older than the corresponding text as a
whole. In any case, any period after the Late Harappan (ca. 1900-1300 BC) should be called
post-Vedic, as no Vedic text proper was likely to be composed or even rearranged thereafter.
Early Harappa (3300-2600 BC), Mature Harappa (2600-1900 BC), and Late
Harappa (1900-1300 BC) should, in turn, be called, respectively, Early Vedic (or Early
Proto-Vedic), Middle Vedic (or "Main Vedic"), and Late Vedic (or Transitional Vedic).
Generally speaking, the family mandalas of the Rigveda and some of the oldest mantras
within the Samaveda, the Atharvaveda and, probably, the Black Yajurveda (Maitrayani) were
originally composed in the Early Vedic language (and it is this language that was the basis
for the "Standard written language" of the next period, that is, the Mature Harappan
"hieroglyphic" language). Mandalas 1 and 10, as well as some later hymns within other parts
(including the Valakhilya and the khila-s), together with most of the Atharvaveda and
Samaveda and, probably, a somewhat larger part of the earliest Black Yajurveda mantras,
were originally composed in the Middle Vedic language. The rest of the Vedic texts proper
were composed or rearranged during the Late Vedic period, only at the end of which the first
signs of the so-called "Prakritization" emerged. In any case, it is the Rigvedic
language which is likely to be the closest to the written language of the Indus-Saraswati
civilization (in terms of its lexicon and the main part of its grammar). Thus, our task is to
reconstruct here its original phonology, which has so far been the greatest obstacle in
attempts to decipher the Harappan script. The Rigveda has preserved some features of even
Mature Harappan phonology (which we partly discuss in this article), but it has acquired
many features of the Late Harappan (that is, Late Vedic) dialects, incorporated through the
pronunciation of the text without linguistic reflection until the Prātiśākhya methodology
and pāṭha devices developed.
Our purpose here is to plan two kinds of phonological reconstruction: the
oldest one and the moderate one. In the oldest reconstruction, the Proto-Indo-Aryan
language is reconstructed, which was the language of the oldest mantras of the Rigveda,
while in the moderate reconstruction we try to show how Mature Harappan
could have sounded by the end of the period, that is, by ca. 1900 BC. The Early Vedic
language was presumably still a centum dialect of Indo-European, while Middle Vedic
had already become "semi-satemized" (which is discussed briefly below).

§5. The original biphonemic character of "ch". S-mobile. Scarcity of unvoiced aspirates
except "th" and "ch". Indo-Aryan "kṣ ~ ch" oscillation. A special sound for the
"Indo-European unvoiced palatal" present only in Indic, Old Persian, and Nuristani (with
some traces in Baltic as well), but not in Avestan, Armenian, or Slavic.
As we know from Indo-European reconstructions, Old Indo-Aryan ch developed
from the Indo-European cluster *sk'. According to Sanskrit grammars, this ch was
prosodically still biconsonantal (that is, cch) in most positions, and always so between
vowels.
As for the other unvoiced aspirates, which are usually thought to be a special feature of
the "Indo-Iranian languages", only ch and th (which becomes ṭh after ṣ and in some
other cases) are actually frequent enough in Vedic. Labial ph is extremely rare, while in
reconstruction it corresponds to original sp (as in phena) and sbh (as in sphūrj-), the latter
case being preserved only in combination with s-mobile.
But th is also highly defective, as it cannot stand at the beginning of any Vedic root
or word. It may stand either after s-mobile or medially. As there exists no initial simple th
(unlike in later stages of Indo-Aryan), it can be supposed that all original initial st-groups
were either preserved or lost only their s-mobile. But, within words, th and sth (together
with ṭh and ṣṭh) would have had different origins, the former being the development of
original *st, while the latter is the product of the evolution of *sdh; moreover, we have
some rare cases where th alternates with dh, as in the Dhātupāṭha: nāth ~ nādh, and the like.
In any case, in the Earliest Indo-Aryan language there were no unvoiced aspirates
as separate phonemes. There existed only "s+unvoiced stop" or "s+voiced aspirate
stop" combinations (and, if we consider the possible written form, they should have been
represented either by ligatures or by special sequences, but not by simple signs). Rigvedic
metre can help us here and hence is very useful in reconstruction: where an *st group can
be metrically expected (after a short vowel where the syllable should be heavy), this *st
was the prototype of later th; after metrically light syllables, th was an allophone of dh (a
relatively rare case); while in cases of sth the prototype is always *sdh (please note that
*sdh is normally polyvalent with respect to its further development in all positions except
root-initial, as in mēdhas<*masdhas, mīḍha<*misdha, etc.). Some cases are difficult,
such as the superlative suffix -iṣṭha- (Greek speaks in favour of *-ista-, while Indo-Aryan
itself makes us expect *-isdha- instead). But such isolated cases should not become
obstacles to the general approach, as they can be solved later (once more facts have been
accumulated and generalized).
Now, only the phonemes and phonemic groups kh and kṣ remain to be explained. In
many cases, kh can be explained as having developed from former *sk. But this phoneme
is rather rare in the Rigveda. As for kṣ, we should keep two facts in view. We have
already shown that originally there were no retroflexes in the oldest Indo-Aryan. But in
some other satem languages (including Iranian and Balto-Slavic), the special RUKI rule
was already valid without any traces of retroflexes. This means that a kind of unvoiced
palatal sibilant might already have existed even before the first retroflexes appeared in
Old Indo-Aryan. In some rare cases this *ks group might even have been simplified to ṣ (as
in ṣaṣ). But still, at the oldest stage no RUKI rule was valid, and so only one sibilant
might have existed, that is, plain s, while the unvoiced stop of the "first palatal series"
should still have been a stop (as in Nuristani).
In Old Indo-Aryan, there exists a special sandhi of ś~ch (śatam~tac_chatam,
pṛcchati~praśna, etc.) which is absent from every other Indo-European branch, including
Iranian. As we have already mentioned, ch developed from Indo-European *sk', while here
we should add that ś developed from Indo-European *k'. The problem is to explain how
and why they became interchangeable in sandhi. We would explain it as follows.
The Indo-European unvoiced palatal *k' was, via some kind of generalized
contamination, reinterpreted in Indo-Aryan as always containing s-mobile. Whenever it
stood before a stop (including nasal stops), it was realized as *sk'>ś; whenever after a stop
(optionally so after nasals), it underwent metathesis, yielding *k's>ch. Now, in cases like
Greek skhizo ~ Sanskrit chid- we should reconstruct not *sk'id- but *sg'hid- instead. So ch
is in fact the product of original s+h (IE *sg'h-) and not of *sk'. Now we see that
Indo-Aryan does not distinguish between the original voiced stops of the "first" and
"second" palatal series, while it distinguishes between the products of unvoiced stops
(k ~ c ~ ś1), the products of s-mobile + unvoiced stops (sk/kh1 ~ śc ~ ś2), and the products
of s-mobile + voiced aspirates (kh2/kṣ2 ~ kṣ1 ~ ch). Moreover, this pattern may additionally
explain why the RUKI rule began to act after k. Originally, it acted here only after a
labiovelar *ku/gu/ghu, that is, after u, without any "K" being needed in the rule. But later
it was generalized to any instance of k.
Pāṇini describes the "ch" phoneme after short vowels and in some other cases as
having a "t" increment preceding it. From Late Vedic on, ch is a frequent variant
for some instances of original kṣ and śy (kakṣa[pa] / kaśya[pa] ~ kaccha[pa]). Meanwhile, a
couple of Old Vedic words, namely kṣam ("earth") and ṛkṣa ("bear"), had, according to
Anatolian and Tocharian data, the prototypes *tk'am~tg'ham and *ṛtk'a~ṛg'ha, actually with
that "incremental t" of Pāṇini's description of a related phoneme. But, as we have already
noted that the reflexes of Indo-European *k' were generalized as *sk' (with s-mobile), these
two prototypes should have sounded as *tsk'am~tsg'ham and *ṛtsk'a~ṛtsg'ha, respectively,
which actually explains where the *s of those kṣ came from (please note that in Indo-Aryan
the common non-Indic Indo-European rule tt/dt>st was NEVER valid, so that s
could not have developed from *t [despite all attempts to artificially explain Indic tt in
place of both tt and dt as secondarily developed from non-Indic *st]); actually, t was
dropped, while s was metathesized with k, thus yielding the documented Indo-Aryan state.
Now, among the "Indo-Iranian languages", we have a separate sound for that unvoiced
"first palatal" stop only in Old Indo-Aryan (ś), Old Persian (θ), and Nuristani (ċ and the
like), Avestan having simple s. Slavic and Armenian also have simple s in this case, while
Baltic has preserved mostly traces of former palatality. And it is only Nuristani which has
preserved a plosive here, while, speaking of the "first palatal series" as a whole, only Old
Indo-Aryan and Old Persian (and not Avestan!) preserved stops here among the voiced sounds
(d in Persian and j in Indo-Aryan, the others having plain sibilants).
Now, the reason why later Old Indo-Aryan has a special sandhi rule of ṣṭ / [ḍ]ḍh for
"first palatal + t" combinations is that, before the development of retroflexes, it was ś
(that is, *sk', containing an s-like sound) that was used there: sk't>śt>ṣṭ and
g'ht>sg'ht>sk'dh>śdh>ṣḍh (further, ḍh with lengthening or modification of the preceding
vowel).
Let us now consider the basic system of early Old Indo-Aryan plosives and their
combinations with s-mobile.

                              unvoiced    s-mobile +            voiced    voiced      s-mobile +
                              simple      (unvoiced) simple     simple    aspirate    voiced aspirate

Plain velar / labiovelar      k~c         sk ~ śc               g~j       gh ~ h      skh ~ kṣ
    [+ kh] [+ kh]
First palatal / velopalatal   ś           ś                     j         h           ch
    [+ ch] [+ ch] [+ kṣ]
Dental                        t           st                    d         dh          sth
    [+ th] [+ th] [+ th]
Labial                        p           sp                    b~v       bh          sph
    [+ ph, sv*] [+ ph]

§6. Original Rigvedic vowels and the system of elementary Harappan syllabograms. As we
have already discussed above, it is highly probable that originally, in the oldest Indo-Aryan,
only three vowels existed, namely a, i, u, together with their long counterparts ā, ī, ū.
The diphthongs (e, ai, o, au of Classical Sanskrit) should have developed much later from
the combinations a+i, a+(a+i), a+u, and a+(a+u), while the vowels ṛ, ṝ, and ḷ should
originally have had forms like *iri/uru/iru/uri, *īri/ūri/īru/ūru, and *ili/ulu/ilu/uli ("u"
being possible only in contact with a labial or a "labiovelar"), respectively.
As we have already considered the system of stops and their combinations with
s-mobile in the previous paragraph, let us now consider the remaining consonants.
As for pure, original sibilants, there existed only one, i.e. s.
As for nasals, there are only two etymological nasals corroborated by other IE
languages, that is, n and m.
As for liquids, there are two as well, r and l.
It is doubtful whether a glottal stop ʔ still existed in Proto-Indo-Aryan or had already
disappeared from it. It should have existed, as otherwise diphthongs would already have
developed (but the very late Old Persian shows that they might not have developed until a
relatively late time, so this sound might have been unwritten but present).
As we have already noted, only three simple vowels were present, that is, a, i, u.
Finally, there should have existed non-syllabic variants of i and u, that is, y and v/w.
As for combinations of the remaining consonants with s-mobile, sn, sm, sv, sy, sr
actually exist in Indo-Aryan, while *sl is absent.
Finally, as the Prātiśākhya-s tell us that voiced and unvoiced simple stops are
interchangeable in final position, let us hypothesize that a closed syllable should have
ended in one of the following sounds (please keep in view that no visarga or anusvāra could
have existed in Early Old Indo-Aryan): 1) -k/g, 2) -ś/-j, 3) -t/-d, 4) -p/-b, 5) -n, 6) -m,
7) -r, 8) -s, assuming that it could have ended in only one consonant and not in a sequence
(with special rare exceptions such as ūrk in Sanskrit). More precisely, voiced and aspirated
consonants might have existed only before vowels and sonorants (y r l v n m); otherwise
they coincided with the corresponding unvoiced stops.
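
The neutralization rule stated at the end of this paragraph can be illustrated with a small sketch. It assumes that a word is already split into a list of segments, and that the pairing of each voiced or aspirated stop with its unvoiced counterpart follows the list of finals given above (so j is paired with ś); the fate of h in this position is not stated in the text and is left untouched. The names `DEVOICE` and `neutralize` are hypothetical.

```python
VOWELS = set("aiuāīū")
SONORANTS = set("yrlvnm")

# Assumed pairings, read off the list of permitted finals
# (-k/g, -ś/-j, -t/-d, -p/-b); aspirates fall together with them.
DEVOICE = {
    "g": "k", "gh": "k",
    "j": "ś",
    "d": "t", "dh": "t",
    "b": "p", "bh": "p",
}


def neutralize(segments):
    """Keep voiced/aspirated stops only before a vowel or a sonorant."""
    result = []
    for i, seg in enumerate(segments):
        nxt = segments[i + 1] if i + 1 < len(segments) else None
        licensed = nxt is not None and (nxt in VOWELS or nxt in SONORANTS)
        if seg in DEVOICE and not licensed:
            result.append(DEVOICE[seg])
        else:
            result.append(seg)
    return result


print(neutralize(["v", "ā", "g"]))        # word-final g
print(neutralize(["a", "g", "n", "i"]))   # g before a sonorant is kept
```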

§7. Enumerating the elements of the hypothetical Harappan Aryan written language. Let us
first enumerate the simple (elementary or "non-ligature") syllables which are likely to
have been written with simple signs of the Harappan script.
H1) a.
H2) i.
H3) u.
H4) va.
H5) vi.
H6) ya.
H7) yu.
H8) ka.
H9) ca.
H10) śa.
H11) ki.
H12) ci.
H13) śi.
H14) ku.
H15) cu.
H16) śu.
H17) ga.
H18) ja.
H19) gi.
H20) ji.
H21) gu.
H22) ju.
H23) gha.
H24) ha.
H25) ghi.
H26) hi.
H27) ghu.
H28) hu.
H29) ta.
H30) ti.
H31) tu.
H32) da.
H33) di.
H34) du.
H35) dha.
H36) dhi.
H37) dhu.
H38) na.
H39) ni.
H40) nu.
H41) pa.
H42) pi.
H43) pu.
H44) ba.
H45) bi.
H46) bu.
H47) bha.
H48) bhi.
H49) bhu.
H50) ma.
H51) ra.
H52) ri.
H53) ru.
H54) la.
H55) li.
H56) lu.
H57) sa.
H58) si.
H59) su.
Some of these syllables are extremely rare, so they may not be attested in the scarce
Harappan documents (some 5 thousand inscriptions as of 2018). Such are ki, ghi, ghu, bi, la,
li, lu. In any case, approximately 50 elementary syllables should have corresponding
elementary signs. Please note that, in contrast to Brāhmī or Nāgarī-like scripts, the
syllables (Harappan signs) with a common consonant but different vowels will
generally not resemble each other here and will not, as a rule, have a common graphic
element (though very rare exceptions are possible).
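
The count of elementary syllables can be reproduced mechanically. The sketch below simply rebuilds the list H1-H59 from its consonants and the three vowels, with the restrictions visible in the list itself (only va, vi; only ya, yu; only ma), and then removes the seven syllables named as extremely rare. The variable names are illustrative only.

```python
VOWELS = ("a", "i", "u")

# Onsets that combine with all three vowels in the list H1-H59.
FULL_ONSETS = ("k", "c", "ś", "g", "j", "gh", "h", "t", "d", "dh",
               "n", "p", "b", "bh", "r", "l", "s")

# Onsets with a restricted vowel set, exactly as enumerated in the list.
RESTRICTED = {"": VOWELS, "v": ("a", "i"), "y": ("a", "u"), "m": ("a",)}

inventory = {c + v for c in FULL_ONSETS for v in VOWELS}
inventory |= {c + v for c, vs in RESTRICTED.items() for v in vs}

# Syllables described in the text as extremely rare.
RARE = {"ki", "ghi", "ghu", "bi", "la", "li", "lu"}
common = inventory - RARE

print(len(inventory), len(common))
```

Under these assumptions the full inventory has 59 members and 52 remain after the rare ones are set aside, which is consistent with the figure of approximately 50 elementary signs.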
This list may not be exhaustive: some syllable-finals may be represented by
separate signs (as there is no special virāma here, it being too abstract for this stage of
the script), this being especially the case with -n, -m, -r, -s. Another possibility is that
such syllables would be written with ligatures, for example -mas (verbal ending,
1pl.pres.act.) as ma+si/sa, tanūs ("body") as ta-nu-u+sa/su, etc. Syllables with long vowels
and future diphthongs may also (at least sometimes) be written with ligatures: ka+a or ka-a,
tu+u or tu-u, ka+i or ka-i, ta+u or ta-u (the hyphen here representing separate writing and
the plus sign representing "condensed" or ligature writing). Actually, long homogeneous
vowels might have been written only where a doubt could arise (otherwise, as in prose it is
not that important, they might not always have been distinguished, just as in the
Sumero-Akkadian cuneiform script). "Second grade diphthongs", that is, ai and au, might be
written in different ways, as [C]a+i / [C]a-i / [C]a+u / [C]a-u (here not differing from
future e and o), as [C]a+a+i / [C]a+a+u, [C]a+a-i / [C]a+a-u, or, rather, as [C]a-a+i /
[C]a-a+u, and so on. The frequent Harappan sign " might have served to denote the
prolongation of the preceding vowel (but this is a relatively abstract idea for the Bronze
Age).
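
The spelling conventions for long vowels and future diphthongs described above can be written down as a tiny function. This is a sketch only: it assumes one uniform joiner per syllable ("+" for ligature writing, "-" for separate writing), so the mixed variants such as [C]a+a-i or [C]a-a+i are not covered, and the names `EXPANSION` and `spell` are hypothetical.

```python
# Each vowel of the attested language is expanded into the elementary
# vowels a, i, u, following the decompositions given in the text.
EXPANSION = {
    "a": "a", "i": "i", "u": "u",
    "ā": "aa", "ī": "ii", "ū": "uu",
    "e": "ai", "o": "au",        # future diphthongs a+i, a+u
    "ai": "aai", "au": "aau",    # "second grade" diphthongs
}


def spell(consonant, vowel, joiner="+"):
    """Spell one C+V syllable as a sequence of elementary signs."""
    parts = EXPANSION[vowel]
    signs = [consonant + parts[0]] + list(parts[1:])
    return joiner.join(signs)


print(spell("k", "ā"))        # ka+a
print(spell("t", "ū", "-"))   # tu-u
print(spell("k", "ai"))       # ka+a+i
```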
What is more or less clear is that initial syllables with consonant clusters (i.e.
initial in real words) will mostly be written with ligatures. Among them, those with initial s
are the most important. Some other types are important, too. Let us try to enumerate them
preliminarily.
First of all, "future" common Indo-Aryan ch (for *s+h) and kṣ (for *s+gh) might be
either ligatures or special signs. Please note also that any syllable beginning with a
consonant cluster which stands in initial position in at least one word denoting a
pictorially representable thing may be represented not by a ligature but
by a simple, unanalyzable sign. Thus:
H60) cha.
H61) chi.
H62) chu.
H63) kṣa.
H64) kṣi.
H65) kṣu.
Other important "initial" ligatures (binary ones) are:
1) ska
2) ski
3) sku
4) śca ~ sca
5) ści ~ sci
6) ścu ~ scu
7) sta
8) sti
9) stu
10) sdha ~ stha
11) sdhi ~ sthi
12) sdhu ~ sthu
13) spa
14) spi
15) spu
16) sbha ~ spha
17) sbhi ~ sphi
18) sbhu ~ sphu
19) sna
20) sni
21) snu
22) sma
23) smi
24) smu
25) sya
26) syu
27) sva
28) svi
29) sra
30) sri
31) sru
Those with second "r":
32) kra (NB!, *cr does not exist in OIA, but *cra/cri/cru ligature still might have existed to
denote a cṛ syllable which actually exists)
33) kri
34) kru
35) śra
36) śri
37) śru
38) gra
39) gri
40) gru
41) ghra
42) ghri
43) ghru
44) hra
45) hri
46) hru
47) tra
48) tri
49) tru
50) dra
51) dri
52) dru
53) dhra
54) dhri
55) dhru
56) pra
57) pri
58) pru
59) bra ~ vra
60) bri ~ vri
61) bru ~ vru
62) bhra
63) bhri
64) bhru
65) mra
66) mri
67) mru
68) sra
69) sri
70) sru
Those with second "y" (the syllable "yi" could not have had a simple designation, if it
ever existed):
71) cya
72) cyu
73) śya
74) jya
75) hya
76) tya
77) dya
78) dyu
79) dhya
80) pya
81) bhya
82) mya
Those with second "v":
83) kva
84) kvi
85) śva
86) śvi
87) jva
88) hva
89) tva
90) tvi
91) dva
92) dvi
93) dhva
Those with second "n":
94) kna
95) kni
96) knu
97) cna (NB!, no ñ sound there!)
98) cni
99) cnu
100) śna
101) śni
102) śnu
103) gna
104) gni
105) gnu
106) jna
107) jni
108) jnu
109) ghna
110) ghni
111) ghnu
112) hna
113) hni
114) hnu
115) tna
116) tni
117) tnu
118) dhna
119) dhni
120) dhnu
121) pna
122) pni
123) pni
124) pnu
125) bhna
126) bhni
127) bhnu
128) mna
129) mni
130) mnu
Those with second "m":
131) kma
132) kmi
133) cma
134) cmi
135) śma
136) śmi
137) gma
138) gmi
139) jma
140) jmi
141) hma
142) hmi
143) tma
144) tmi
145) dma
146) dmi
147) dhma
148) dhmi
Some of the above are only within words and thus are very rare. This is just a
preliminary list to show that the number of Harappan signs fits very well into the Indo-
Aryan phonology and not that of Dravidian or other language which are in terms of their
syllabic structure much more simple.
Ternary combinations "s + stop + sonorant" are not that frequent, but still there are
some cases of frequent syllables of this structure.
Any "stop + (non-nasal) stop" combination should normally not form a ligature, the
former belonging to the preceding syllable while the latter being the initial of the next
syllable, like nakta – na+k(a/u)-ta, etc.
For some reasons, syllables with r + consonant might have been also represented
with ligatures, but for now it is unclear. The same might be valid in case of l + consonant
for at least lt, ld, ldh, ls combinations which later developed into retroflexes.
What is incorrect is to try to find parallels between the Brāhmī script
and the Harappan script (as Subhash Kak tried to do): even if they are genetically
related, this would yield nothing, since Brāhmī itself could not be read from a knowledge
of Nāgarī alone, without special effort, until James Prinsep deciphered it, notwithstanding
the fact that both scripts actually have the same structure. The Harappan script is
completely different from Brāhmī in its structure and basic principles (and hence in its
syllabostatistics): it is not an abugida but a purely syllabic script, like cuneiform, with a
small number of possible logograms.

§8. Acrosyllabic principle. In considering how we can identify the syllabic
value of at least the simple signs, we should remember that it was the acrosyllabic
principle that was valid at that time (cf. the Mycenaean sign NI, representing a fig-tree,
along with the actual Greek nikylea for "fig-tree"). This principle actually preceded the
Semitic "acrophonic principle", which was just a kind of reduction and simplification of
the former. But this method is only heuristic, as we cannot be sure about the Mature
Harappan lexicon (the Rigvedic language was sacral, so the Rigvedic lexicon might have
overlapped only partly with the Harappan one). So we need some independent means of
verification, and such a means actually exists!

§9. Statistical definition. The most important means of defining more or less precise
values for the group of the most frequent Harappan signs is the so-called syllabostatistics,
which should be counted separately for several Vedic texts in a series of forms, from "as is"
to reconstructed ones. In this way, at least some phonetic (syllabic) values might be obtained
in a manner either independent of that of the previous paragraph (acrosyllabic, based on
pictorial considerations) or partly corroborating it.
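
The absolute-frequency count described here can be sketched in Python as follows. This is
only a minimal illustration, not the author's procedure: it assumes that the text has already
been divided into syllables (here marked by hyphens), and the function names and the toy
word list are hypothetical placeholders, not Vedic data.

```python
from collections import Counter

def syllable_frequencies(words):
    """Count syllables in words given as hyphen-separated syllable strings.

    Assumption: syllabification has been done beforehand; this function
    only counts, it does not decide where syllable boundaries fall.
    """
    counts = Counter()
    for word in words:
        counts.update(s for s in word.split("-") if s)
    return counts

def ranked(counts):
    """Return (syllable, count) pairs, most frequent first, ties alphabetical."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

# Toy input for illustration only (hypothetical, not a real corpus sample).
sample = ["pra-tha-ma", "pra-ti", "dvi-pa-da", "tri-pa-da"]
freq = syllable_frequencies(sample)
top = ranked(freq)[:3]
print(top)
```

The same counting could be run once on a text "as is" and once on its reconstructed form,
so that the two ranked lists can be compared with the ranked list of Harappan signs.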

§10. The way deciphering may be carried out even without a "Rosetta stone". The
decipherment will begin to be proved (first partly and then in full) when at least the thirty
most frequent Harappan signs yield "assigned" values that combine, in different
inscriptions, into well-known Rigvedic words and phrases which, at least in most
cases, make pragmatic and logical sense. Thus, the readings should not sound like abstract
ritual pseudoformulas or unclear names. The inscriptions should turn out to be short phrases
with grammatically bound and coherent elements. This can be called cascade multiple
coherent (consistent) reading. It is the only criterion available in the absence of a "Rosetta
stone", which is likely to be found in the future in Mesopotamia, or in Iraq, or in Rakhigarhi
(after its full excavation), or at other sites, especially in Gujarat, Sind, and Makran, or, finally,
in some collections of already excavated but unexplored artifacts (especially those from
Mesopotamia and Syria).
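
A very reduced form of this consistency check might look like the Python sketch below. It
tests only whether a candidate set of sign values turns inscriptions into entries of a word
list; the grammatical coherence required above is not modelled. The sign labels (S1, S2, ...),
the assigned values, the lexicon, and the function names are all hypothetical.

```python
def read_inscription(signs, values):
    """Join the assigned syllables, or return None if any sign is unassigned."""
    if any(sign not in values for sign in signs):
        return None
    return "".join(values[sign] for sign in signs)

def coherence_score(inscriptions, values, lexicon):
    """Share of fully readable inscriptions whose reading is in the lexicon."""
    readings = [read_inscription(signs, values) for signs in inscriptions]
    readable = [r for r in readings if r is not None]
    if not readable:
        return 0.0
    hits = sum(1 for r in readable if r in lexicon)
    return hits / len(readable)

# Hypothetical assignment and toy data, for illustration only.
values = {"S1": "pa", "S2": "da", "S3": "dvi"}
inscriptions = [["S1", "S2"], ["S3", "S1", "S2"], ["S2", "S1"], ["S4", "S1"]]
lexicon = {"pada", "dvipada"}
score = coherence_score(inscriptions, values, lexicon)
print(round(score, 3))
```

An assignment that is correct should keep a high score as more signs and more inscriptions
are added, whereas a wrong one should break down quickly; this is the "cascade" aspect of
the criterion.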

§11. Conclusion. To begin more or less methodical work, we should keep in mind that a
number of problems need to be solved:
1) to distinguish between variant writings of the same Harappan sign and genuinely
different signs that merely look alike.
2) to classify Harappan signs into graphically elementary and compound (the
criterion would be that at least one of the parts of a presumably compound sign functions
as a separate sign as well, but much better when both parts do so).
3) to make a list of most probable elementary Harappan graphemes.
4) to make a list of simple syllables of Old Indo-Aryan (Rigvedic), in two groups:
those functioning as word or root initials and those found only inside word forms.
5) based on the Rigveda and other Vedic texts, to prepare a list of words (nouns)
which can be represented pictorially, and to begin to look for possible correspondences
(Classical Sanskrit is of no use here, as many words changed their meanings in it, and as
its synonymic series are too long, which makes it impossible to choose the correct variant
among that multitude).
6) to prepare absolute frequency statistical charts for both Harappan signs and
Vedic syllables reanalyzed in terms of reconstruction.
7) to prepare positional and combinatory statistical charts for both sets.
8) to try to find correspondences between acrosyllabic pictorial candidates and
statistical candidates.
9) to define a first series of no fewer than twenty syllables, as otherwise they cannot
form any complete inscriptions.
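
The positional statistics of step 7 can be sketched in the same way for both sets, since an
inscription (a sequence of signs) and a word (a sequence of syllables) have the same shape.
The Python sketch below is only an illustration under that assumption; the function name and
the toy sequences are hypothetical.

```python
from collections import Counter, defaultdict

def positional_counts(sequences):
    """Count how often each unit occurs in initial, medial, final or sole position."""
    table = defaultdict(Counter)
    for seq in sequences:
        last = len(seq) - 1
        for i, unit in enumerate(seq):
            if last == 0:
                position = "sole"      # the unit is the whole sequence
            elif i == 0:
                position = "initial"
            elif i == last:
                position = "final"
            else:
                position = "medial"
            table[unit][position] += 1
    return table

# Toy sequences for illustration only (hypothetical labels).
sequences = [["A", "B", "C"], ["A", "C"], ["B"]]
table = positional_counts(sequences)
for unit in sorted(table):
    print(unit, dict(table[unit]))
```

Signs and syllables with similar positional profiles (for example, mostly initial, or mostly
final) would then be candidates for the comparison in step 8.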
