
Extract from the Revue Informatique et Statistique dans les Sciences humaines

XXIII, 1-4, 1987. C.I.P.L. - Université de Liège - All rights reserved.


Revue Informatique et Statistique dans les Sciences humaines
XXIII, 1-4, 1987, pp. 99-110
Scansion and Analysis
of Prakrit verses
by text-processing programs
C.M. MAYRHOFER
The use of computers in the study of metre has several attractions, above
all the prospect of saving time and labour. There exist professional metricians,
but for most students of language and literature, metre in its technical aspects
is a rebarbative subject, all the more so for its dense and contradictory
documentation. However, before one can use such devices, one must be able to
state the rules of the metre in question. Now, even those who study literature
for pleasure may be interested in the formalization of metrical rules, in so far as
the rules may be thought of as a model of the knowledge, the know-how, which
in certain traditions of poetic composition is handed down from one generation
to the next.
European literary scholars who have studied metrics as a practical science
are likely to have absorbed the rules for scanning classical Greek and Latin
verse. In several projects these rules have been adapted to text-processing
programs and the results have appeared in a number of publications 1.
Unfortunately machines can make only an approximation of the scansion
of a given line of verse, even if the approximation is right in as many as 95% 2
of cases. To improve this score their programmers would have to follow the
path of research into automatic translation, because the metrical properties
of a text in classical Greek or Latin can be securely ascertained only after its
meaning has been determined. In other words, the Greek and Latin alphabets
1 W. OTT, Metrische Analysen zur Ars Poetica des Horaz, Göppingen, 1970; S. HOCKEY,
I. MARRIOTT, Oxford Concordance Program, version 1.0 : user's manual, Oxford, 1980,
repr. 1984, pp. 334-337; discussion and bibliography in C. BUTLER, Computers in Linguistics,
Oxford, 1985, p. 20.
2 OTT, p. 7.
do not sufficiently specify the phonetic properties of their respective languages
to permit accurate metrical analysis on the basis of the text alone, whether by
humans or computers.
The case of another classical language with a long tradition of metrical
analysis, Sanskrit, as well as of its relations the Prakrits, is very different.
The devanagari alphabet, in which these languages are usually written, has
about twice as many characters as the Roman. This multiplicity of characters
bears witness to the precision of phonetic analysis in the Indian grammatical
tradition. It seems likely that a preoccupation with the accuracy of orally
performed ritual texts inspired the development of this highly analytic alphabet,
which has, therefore, in principle, none of the ambiguities of the Greek and
Roman alphabets, and the rules for scanning Sanskrit and Prakrit verses can
be translated into a text-processing program which will, again in principle,
produce metrical analyses with 100 percent accuracy. This has been done, at
least for Prakrit 3.
The project to which I now turn concerns a particular Prakrit language.
The Prakrits are regarded in the Indian tradition as the heirs of Sanskrit,
the classical language par excellence. One of them, however, is qualified as
apabhramsa, "decadence", being the furthest removed from the parent language.
It is true that in some respects it is the nearest to the spoken languages of the
family. Nevertheless in the evolution of literary styles in India it too became a
classical language, and a substantial number of texts in Apabhramsa, some of
them very long, have been discovered and published, almost all of them during
the course of this century 4. Still, it remains the runt of the family so far as
scholarship is concerned. The tradition of linguistic and metrical commentary
is less secure than in the case of Sanskrit and of the other Prakrits, and
modern scholarship has not yet settled the rules of the language. A general
problem is that the editing of the texts is not sufficiently advanced to permit a
reliable survey of its features, and conversely, the absence of an authoritative
phonology and grammar means that the editing of texts is hazardous. In order
to make progress in either direction, one must try to make progress in both
simultaneously. The special interest of metre in Apabhramsa studies is that
all known texts in the language are metrical. There is no Apabhramsa prose.
Consequently one has the opportunity to benefit from the predictable features
of verse to control the accuracy of an exemplar of a text, and by the same token
the obligation to develop an adequate description of these features.
3 Dr. H. Nakatani of Tokyo University informed me in January 1985 that he had the use
of such a program.
4 References in S. LIENHARD, A History of Classical Poetry : Sanskrit - Pali - Prakrit,
Wiesbaden, 1984, index p. 290, s.v.
The first step is the conversion of the texts into machine-readable form,
which process takes the form of typing them in transliteration under a program
which provides line-references in a standard format. For the transliteration,
which involves rendering the 38 or so characters used in Apabhramsa by the
26 of the Roman alphabet, certain decisions had to be taken and adhered to.
I decided to use the established system, in particular the representation of
aspirate consonants by digraphs (e.g. "Kh"), except that instead of a variety of
diacritical marks (points above and below the letters, microns and macrons) I
use majuscules for most marked letters and random characters for one or two. It
would have greatly simplified the programs described below if I had avoided the
digraphs, firstly because they constitute a peculiar set of characters, secondly
because "h" in the usual system, and in mine, has an independent existence
as a letter; but to keep to convention seemed preferable mnemonically. The
best system for this purpose is the one which causes the least hesitation in the
operator who is converting and typing simultaneously.
Although such work ideally demands no reflection on the operator's
part, it represents a significant stage in metrical analysis because, despite
its multiplicity of letters, the devanagari alphabet is not perfectly suited to
Apabhramsa. Many of the letters are never used, except when an older Prakrit
or Sanskrit itself is quoted; some letters are used interchangeably; some sounds
are not represented by any standard letter. Two details will suffice to describe
the significance of this for metrical analysis, but first some terms will have to
be defined. The metrical basis of Apabhramsa verse is quantitative, that is,
the recurring elements of the system are classified by duration, not stress or
any other linguistic feature. The fundamental element is the syllable, which I
shall not attempt to define here except to note that its essential constituent is
a vowel. For metrical purposes what matters is that a syllable may be either
"long" or "short". Vowels too are classified as "long" or "short". Any syllable
which contains a long vowel is long; a syllable containing a short vowel may be
short or, if a group of two consonants follows the vowel, long. Now, the vowels
in Sanskrit transcribed as "e" and "o" are regarded as diphthongs and hence
by definition long. In the Prakrit languages, however, there exist both long and
short "e" and "o" vowels. The majority of examples occur before a group of
two consonants, indeed they are conditioned by that environment 5. Hence the
existence of these short vowels is of no metrical significance. But they can occur
in other contexts as well, they can be metrically short, and the alphabet has
5 The so-called law of two morae : W. GEIGER, Pali Literatur und Sprache, Strasbourg,
1961, p. 42. The rule that a short vowel becomes long if followed by a double consonant
is enunciated by the Indian grammarian Hemacandra (1088-1172 AD); see R. PISCHEL,
Grammatik der Prakrit-Sprachen, Strasbourg, 1900, repr. Hildesheim, 1973, pp. 72-3.
no way of distinguishing them from the long vowels "e" and "o". Secondly, the
Prakrits have two kinds of nasalization of vowels which have different metrical
values 6. In written and printed texts these sounds are represented sometimes by
two different signs, sometimes by one and the same. Clearly, if the texts are to
be transliterated, as they must be for computer processing, the transliteration
should be explicit and consistent with regard to these matters. And when one
makes a choice of this kind one is adding information to the text, in effect
performing the first step in scansion.
The scanning program itself is a homespun affair with no pretensions to
elegance or security. It would no doubt have both these qualities if I had used the
services of a professional programmer, but unless one is lucky enough to have at
one's disposal a programmer interested in literary problems, it is probably easier
to learn the rudiments of a suitable language (in this case SIMULA) and do it
oneself. Briefly, the program takes as input a file in a standard format in which
each line represents a line of verse; by a series of conditions it translates each
line into characters of three types standing respectively for short syllable, long
syllable and word-division; this string is read into an array. The array is passed
to a procedure which converts each element into one of three symbols, which have
a rough resemblance to the symbols of micron ("U"), macron ("_") and caesura
("|") as used in European metrical scholarship. The array is then output as a
line. The most complicated part is the translation from alphabet into metrical
types, which needs to be sensitive to a context that in different cases can include
the character before, the character after, and up to two characters following a
word division after, the character under consideration. The output line consists
of a reference to the line of text, an integer representing the metrical length
of the line in terms of short-syllable equivalents, and the line as scanned. The
integer is used for sorting a corpus of scanned lines into blocks of related lines,
the metres being traditionally classified by the length of the line.
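The translation step can be illustrated by a minimal sketch in Python. It is not the SIMULA program described here, and it rests on assumptions made only for the example: a toy transliteration in which lowercase "a", "i", "u" are short vowels, uppercase "A", "I", "U" and all "e" and "o" are long, aspirate digraphs such as "kh" count as one consonant, a space marks a word-division, and consonants are counted across a word-division. The names `scan` and `consonants_after` are hypothetical.

```python
SHORT_VOWELS = set("aiu")
LONG_VOWELS = set("AIUeo")      # assumption: "e" and "o" are always long in this toy scheme
VOWELS = SHORT_VOWELS | LONG_VOWELS
ASPIRABLE = set("kgcjtdpbTD")   # assumption: these letters form a digraph with a following "h"

def consonants_after(text, i):
    """Count the consonants between the vowel at position i and the next vowel.
    Spaces are skipped, and an aspirate digraph counts as a single consonant."""
    count = 0
    j = i + 1
    while j < len(text) and text[j] not in VOWELS:
        if text[j] == " ":
            j += 1
            continue
        if text[j] in ASPIRABLE and j + 1 < len(text) and text[j + 1] == "h":
            j += 2              # digraph: one consonant written with two characters
        else:
            j += 1
        count += 1
    return count

def scan(line):
    """Return (metrical length in short-syllable equivalents, list of symbols).
    Symbols: "u" short syllable, "_" long syllable, "|" word-division."""
    symbols = []
    for i, ch in enumerate(line):
        if ch in VOWELS:
            is_long = ch in LONG_VOWELS or consonants_after(line, i) >= 2
            symbols.append("_" if is_long else "u")
        elif ch == " " and symbols and symbols[-1] != "|":
            symbols.append("|")
    length = sum(2 if s == "_" else 1 for s in symbols if s != "|")
    return length, symbols

length, symbols = scan("dhamma kaha")
print(length, "".join(symbols))
```

The context sensitivity mentioned in the text appears here only in `consonants_after`, which looks ahead past a word-division; a realistic scanner would also need the explicit marking of short "e" and "o" and of the two kinds of nasalization discussed earlier.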
The symbols used could easily be replaced by others, for example ":" and
"S" in imitation of the symbols sometimes used in works by Indian scholars.
More crucial than their shape is the number of columns that they occupy; the
long-syllable symbol occupies four, the short-syllable symbol two, with in either
case the rightmost column being a blank space for the word-division symbol,
which when it occurs does not occupy a column of its own but fits into the
other symbols. In this way, as the line is read from left to right, the number of
columns traversed bears an exact relationship to the metrical length of the line
at that point.
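The column convention can be sketched as follows, again in Python rather than SIMULA. The function name `render` and the exact glyphs chosen for the cells ("___ " for a long syllable, "u " for a short one) are assumptions for illustration; only the widths of four and two columns and the placing of the word-division mark in the blank right-hand column come from the description.

```python
def render(symbols):
    """Render scansion symbols ("_", "u", "|") in fixed-width cells.
    A long syllable takes four columns and a short one two; the rightmost
    column of each cell is blank unless a word-division mark is fitted into it."""
    cells = []
    for s in symbols:
        if s == "|":
            if cells:                       # fit the mark into the previous cell
                cells[-1] = cells[-1][:-1] + "|"
        elif s == "_":
            cells.append("___ ")
        else:
            cells.append("u ")
    return "".join(cells)

line = render(["_", "u", "|", "u", "u"])
print(repr(line))
```

With this layout every short-syllable equivalent corresponds to two columns, so a line of metrical length n is always 2n columns wide, whatever its mixture of long and short syllables.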
6 PISCHEL, pp. 131-2.
The purpose of this is to make it possible to examine a listing visually and
quickly pick up any irregularities. Though not every fault in the text will result
in an irregular scansion, it is almost certain that any irregularity in scansion
will be caused by a fault in the text, introduced at some stage in the copying
process from writer to printer to computer file. Further, since as we have seen
the transliterated text carries more information than the original, some errors
will be located in this information. Finally it is possible that the program may
not be capable of correctly handling a particular combination of characters,
though by now, after many adjustments to eliminate banal inadequacies, any
new failures will add significantly to what is known about the language. Hence,
once a text has been typed in transliteration, it is very useful to print out the
scansion of the file and have it by one's side while collating the transliteration.
In Apabhramsa texts a particular metre may be used throughout a block
of verses, after which comes another block of a different metre, or a recurring
strophe-like arrangement composed of verses of different length may be used, or
the verses may be mixed. Especially in the last two cases it is useful to sort the
file of scanned verses by the integer representing the metrical length of each
line. This then produces blocks of related verses, which again are useful for
visual checks. The correctly-scanning lines will be grouped together by metre,
and any verses which contain a metrical fault that influences their length will
fall outside the block. Often there will occur between one block of regular lines
and the next a well-defined group of stragglers that need editorial attention.
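A minimal sketch of this sorting step, in Python, is given below. The record layout (reference, metrical length, scansion) follows the description of the output line; the sample references and scansions are invented for the example, and `group_by_length` is a hypothetical name.

```python
from itertools import groupby

def group_by_length(records):
    """Sort (reference, length, scansion) records by metrical length and
    return a dict mapping each length to its block of records."""
    ordered = sorted(records, key=lambda r: r[1])   # stable: text order kept within a block
    return {length: list(block)
            for length, block in groupby(ordered, key=lambda r: r[1])}

# Invented sample data: two regular 16-unit lines, one 24-unit line, one straggler.
records = [
    ("1.1", 16, "..."),
    ("1.2", 24, "..."),
    ("1.3", 15, "..."),
    ("1.4", 16, "..."),
]
for length, block in group_by_length(records).items():
    print(length, [ref for ref, _, _ in block])
```

A line whose length falls between two well-populated blocks, like the 15-unit record here, is the kind of straggler that calls for editorial attention.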
In such a listing the eye will pick up, not only irregularities, but also
regularities. A particular metre is defined, not only by its length in short-
syllable equivalents, but also in terms of permitted sequences of long and short
syllables. In a block of related lines the compulsory, preferred and forbidden
sequences form visible patterns, and by means of such an examination it is
possible to check the traditional rules and to suspect further rules not specified
by the tradition. There is no doubt a circularity in using the rules of scansion to
correct a text and at the same time using the text to deduce rules of scansion,
but such a practice is inherent in the process of editing a text.
When finally the whole file scans regularly, one can proceed to use the
scanned file, the output of the first stage, as input for a further stage, that of
submitting the traditional rules and one's own intuitions concerning the nature
of a given metre to a rigorous analysis. There are two fundamental approaches
to metrical analysis at this level, sometimes qualified as "outer" and "inner"
metrics : both study permitted sequences of long and short syllables, but the
former concentrates on the verse and its articulations, the latter concentrates
on the word in relation to the verse and its articulations. Thus for example in
the Greek and Latin hexameter, the verse must begin with a long syllable (-);
that advances the count of short-syllable equivalents to 2; the count may then
jump to 4 with another long syllable (- -), or it may increase first to 3 (- u)
then to 4 (- u u) with short syllables, but it must then advance from 4 to 6 :
- - - or - u u - ; the count can never be 5. Then follows one long syllable or two
short and so on, to the line limit of 24. The regular increment in the line of 4
short-syllable equivalents, consisting of one long syllable followed by a long or
two shorts, constitutes a foot. Indian poetics also uses the word equivalent to
"foot", but in a different sense. The word gana is used with reference to Prakrit
verse for a concept somewhat like that of the foot, except that not every gana
in a verse has the same value, so that the formula of a verse might be, instead
of for example 6 * 4 as in the Greek and Latin hexameter, something like
6+4+3+6+4+1, and the specifications for a particular gana can range from
"only this sequence of long and short syllables is allowed" to "any combination
of long and short syllables is allowed provided that the total of the short-syllable
equivalents is n" 7.
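The counting argument can be made concrete with a short Python sketch. It represents a verse as a list of syllable weights (1 for short, 2 for long) and a metre as a list of gana lengths; the function name `split_into_ganas` is hypothetical. The sketch checks only that the running count lands exactly on each boundary, which is the sense in which the hexameter count "can never be 5"; it does not check the further rule that each hexameter foot begins with a long syllable.

```python
def split_into_ganas(weights, formula):
    """Split syllable weights (1 = short, 2 = long) into groups whose totals
    follow `formula`. Return the groups, or None if a syllable straddles a
    boundary or the verse is longer or shorter than the formula."""
    ganas, current, total = [], [], 0
    targets = iter(formula)
    target = next(targets, None)
    for w in weights:
        if target is None:          # syllables left over after the last gana
            return None
        current.append(w)
        total += w
        if total > target:          # a long syllable straddles the boundary
            return None
        if total == target:
            ganas.append(current)
            current, total = [], 0
            target = next(targets, None)
    if current or target is not None:
        return None                 # verse shorter than the formula
    return ganas

hexameter = [4] * 6
print(split_into_ganas([2, 1, 1, 2, 2] * 3, hexameter))
print(split_into_ganas([2, 1, 2, 1] * 4, hexameter))   # count would reach 5: None
print(split_into_ganas([1, 1, 2, 2, 1, 1, 2, 1, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1],
                       [6, 4, 3, 6, 4, 1]))
```

The same function serves for a uniform formula such as 6 * 4 and for a mixed one such as 6+4+3+6+4+1, which is the practical difference between the foot and the gana described above.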
The program which deals with the "outer metric" of Apabhramsa verse
begins by collecting certain data interactively from the terminal : the name
of the metre (for the heading), the number of gana in a verse, the number of
short-syllable equivalents in each gana. The sum of these is used to select verses
in the file for closer attention; the program then asks the user to prescribe two
features of the verse in the form of two columns of the scanned file (e.g. in
column 21 and column 25 the character "U" must appear) so that different
metres of the same length can be distinguished. After the data are elicited, the
program reads the scansion file line by line, checks each line for its metrical
length, and if the length is that of the verses under investigation, applies the
column-matching test. If the line passes the test, the program splits it into ganas
as prescribed. Each gana is matched against an array in which are accumulated
the gana-types found during the current run, and a two-dimensioned integer
array keeps count of the occurrences of each type in each place of the verse. The
results are tabulated with the actually-occurring gana-types, sorted by length,
along one axis and a heading identifying the gana by its place in the line along
the other; the intersections give the number of times that the gana takes a
particular form at a particular place. See table 1 for an example of the output
of this program; the verses analysed are from the 8th chapter of Siricanda
Kahakosa.
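The tallying stage can be sketched in Python as below. The sketch assumes that each verse has already been split into gana shapes, written as strings in which "u" is a short syllable, "_" a long one and "|" a word-division; the sample verses are invented, and `tabulate_ganas` and `morae` are hypothetical names. A dictionary of counts stands in for the two-dimensioned integer array.

```python
from collections import Counter

def morae(shape):
    """Metrical length of a shape string: "u" counts 1, "_" counts 2."""
    return shape.count("u") + 2 * shape.count("_")

def tabulate_ganas(verses):
    """verses: list of verses, each a list of gana shapes in order of place.
    Return {shape: [count at place 1, count at place 2, ...]} with the shapes
    sorted by metrical length."""
    counts = Counter()
    for ganas in verses:
        for place, shape in enumerate(ganas):
            counts[(shape, place)] += 1
    places = max(len(ganas) for ganas in verses)
    shapes = sorted({shape for shape, _ in counts}, key=lambda s: (morae(s), s))
    return {shape: [counts[(shape, p)] for p in range(places)] for shape in shapes}

# Invented sample: two verses of three ganas each (6 + 4 + 3).
verses = [
    ["u u|_ _", "u u|_", "u u u|"],
    ["_ u u|_", "u u|_", "u|u u|"],
]
for shape, row in tabulate_ganas(verses).items():
    print(f"{shape:<10}", row)
```

Each row of the result corresponds to one gana-type and each position in the row to one place in the verse, which is the layout of table 1.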
7 For a view of the basis and development of gana metres, see A.K. WARDER, Pali
Metre : a contribution to the history of Indian literature, London, 1967, chapter vi.
There are important collections of data in C. CAPPELLER, Die Ganachandas, Leipzig, 1872,
repr. in Kleine Schriften, ed. S. LIENHARD, Wiesbaden, 1977, pp. 35156; L. ALSDORF,
Apabhramsa-Studien, Leipzig, 1937, repr. Nendeln, 1966; H. BHAYANI, The Samdesa rasaka
of Abdul Rahman..., Bombay, 1945.
TABLE 1 : GANA TYPES
Number of doha read in scande: 40

GANA               1   2   3   4   5   6
u|                 0   0   0   0   0  40
u u u|             0   0  25   0   0   0
u|u u|             0   0  13   0   0   0
u u u              0   0   2   0   0   0
u u|__             0  11   0   0   6   0
u|u __             0   2   0   0  15   0
__ u u             0   1   0   0   0   0
__ __              0   7   0   0   7   0
__|u u             0   3   0   0   0   0
u u __             0   4   0   0  12   0
u u|u u|           0   1   0   0   0   0
u u u u            0   3   0   0   0   0
__|__              0   3   0   0   0   0
u u u u|           0   1   0   0   0   0
u u|u u            0   3   0   0   0   0
u __ u             0   1   0   0   0   0
u u|u u __         1   0   0   1   0   0
u u|u u|u u        0   0   0   1   0   0
u u|__ u u|        1   0   0   0   0   0
u u|__ __          2   0   0   1   0   0
u u __|u u|        1   0   0   0   0   0
u u u|u u u|       0   0   0   1   0   0
u u|__|__          1   0   0   1   0   0
__ u|u __          2   0   0   1   0   0
__ u u __          2   0   0   0   0   0
u u __ __          0   0   0   1   0   0
__ __ __           2   0   0   0   0   0
__ u|u|u u         1   0   0   0   0   0
__ u u|__          4   0   0   6   0   0
u u|u u u u|       0   0   0   1   0   0
u u|u u u u        0   0   0   1   0   0
__ __ u u          1   0   0   3   0   0
__ u u __|         1   0   0   0   0   0
__ u __ u|         0   0   0   1   0   0
__|__ u u|         3   0   0   2   0   0
__|u u u u         0   0   0   2   0   0
u u u u|u u|       1   0   0   1   0   0
u u|u u|__         1   0   0   3   0   0
__ u u|u u|        0   0   0   1   0   0
u u __ __|         0   0   0   1   0   0
__ u|u|__          1   0   0   1   0   0
__ __ u u|         1   0   0   1   0   0
u u u u|u u        1   0   0   0   0   0
u u u|u __         2   0   0   1   0   0
u u u __ u         1   0   0   0   0   0
__ u __ u          0   0   0   1   0   0
(...)

"Inner" metric cannot be entirely separated from "outer". It is clear that
the permitted sequence of long and short syllables in a verse determines the
shapes of words that are admissible in that verse at particular places in the
verse. But "inner" metric is also concerned with, to put it one way, the metrical
length of words occurring in a verse, to put it another way, with the places in
a verse where one word may end and another begin. Concerning the former,
there are regularities to be discovered in the favoured word-shapes of a given
length, both absolutely and in relation with their place in the verse; traditional
metrics has nothing to say about these. Concerning the latter, traditional
metrics prescribes places in the verse where there is always a word-division,
but is reticent about the places where there is never a word-division. This is
not the place to give details of the phenomena to be investigated; the method
of the program which tabulates them is, briefly : for a line of given metrical
length, which passes the column-content test as above, read the metrical shapes
between word-divisions and store them in a text array of which each element
contains one type, with a count being kept in a two-dimensioned integer array
of the occurrences of each type in a particular place. The results are then
tabulated with the actually-occurring word-types, sorted by length, along one
axis, and along the other axis a heading which divides the line into columns
numbered to represent the place in the verse, so as to locate the places where
the words begin; the intersections give the number of times that a word of a
particular shape begins at a particular place in the line. To facilitate the study
of the general metrical nature of the text, the sum of all the occurrences of a
word of a particular shape is given at the end of a row. One can then see at a
glance if, for example, anapaestic words are favoured over dactylic, or whatever.
To facilitate the study of the obligatory and forbidden word-divisions, the last
row gives the sum of each column. Table 2 gives an example of the output of
this program, using the same material as in table 1.
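A minimal Python sketch of this tabulation follows. It assumes the scanned line is available as a list of symbols ("u", "_", "|") and numbers the starting places by column, with four columns for a long syllable and two for a short one, so that every word begins at an odd column. The function names and the sample lines are hypothetical.

```python
from collections import defaultdict

def word_starts(symbols):
    """Yield (starting column, shape) for each word of a scanned line."""
    col, start, word = 1, 1, []
    for s in list(symbols) + ["|"]:       # a final division closes the last word
        if s == "|":
            if word:
                yield start, " ".join(word)
            word, start = [], col
        else:
            word.append(s)
            col += 4 if s == "_" else 2

def tabulate_words(lines):
    """Return (table, row_sums, col_sums), where table[shape][column] is the
    number of times a word of that shape begins at that column."""
    table = defaultdict(lambda: defaultdict(int))
    for symbols in lines:
        for start, shape in word_starts(symbols):
            table[shape][start] += 1
    row_sums = {shape: sum(cols.values()) for shape, cols in table.items()}
    col_sums = defaultdict(int)
    for cols in table.values():
        for start, n in cols.items():
            col_sums[start] += n
    return table, row_sums, dict(col_sums)

lines = [["_", "u", "|", "u", "u"], ["u", "u", "|", "_", "u"]]
table, row_sums, col_sums = tabulate_words(lines)
print(row_sums)
print(sorted(col_sums.items()))
```

The row sums show which word-shapes are favoured overall; a column whose sum equals the number of lines marks an obligatory word-division, and a column that never appears marks a place where no word begins.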
The aim of these programs is essentially to reconstruct the criteria of
correctness in the various metres of Apabhramsa verse, or to verify the criteria
in cases where they are provided by traditional commentaries or have already
been established by modern scholarship. It seems entirely appropriate to use a
computer to perform quickly and accurately the inherently disagreeable tasks
of gathering and sorting the data, as a means to the end of deducing the rules
by which the poets worked, so to speak. I pass over the problem of the status
of these rules. There is probably no way of deciding whether the regularities
which emerge from this presentation of the data arise from the system of the
language itself or from the craft of the poet working within a tradition; most
cases would no doubt involve a mixture of the two in varying proportions. I
have resisted the temptation to explore the regularities for themselves beyond
the point where they may intuitively be associated with or considered as models
for the practice of the poets. But that there are perspectives in the formal basis
TABLE 2 : WORD TYPES
Number of 24-morae lines read in scande: 40
LIST OF WORD TYPES
1 :
2:
3 :
4:
5:
6:
7:
8:
9:
10:
11:
12 :
(... )
37 :
38 :
39 :
40 :
41 :
42 :
43 :
44 :
u
u u
u u u
__ U
U __ U
__ UU
U U __
U U U U
__ uuu
____ U
______ u u u
UUU __ UUUU
U UUU
U U U U __ U
__ U U __ U U __ U
____ U U __ U U U
UUUU __ UUUUUUU __ UU
U U U __ U U U U U U U U __ U __ U U U __ U
of versification beyond this point is constantly impressed on one. To take one
example : in a rhyming couplet in the paddhadika metre there are 32 short-
syllable equivalents, and an obligatory word-break for the rhyme at the mid-
point. Hence the longest possible word contains 16 short-syllable equivalents,
and words of this length are not uncommon. In this poetic language there is no
simple inverse relationship between word-length and frequency. The number of
possible shapes of words between 1 and 16 short-syllable equivalents is about
4000. Some of these are excluded by the metre, but probably, the metre being
very accommodating, fewer than half : say then that the number of possible shapes
is 2000. In practice about 100 seems to be the maximum, and it is a practical
matter for the programmer because it influences the dimensioning of the arrays.
To explain why relatively few of the possible word-shapes are used one would
probably have to invoke the formal properties of the language itself.
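The figure of about 4000 can be checked by a short calculation, sketched here in Python. It assumes that a word-shape is simply a sequence of short syllables (1 unit) and long syllables (2 units), so that the number of shapes of length n is the number of shapes of length n-1 (ending in a short) plus the number of length n-2 (ending in a long); the function name `shapes` is hypothetical.

```python
def shapes(n):
    """Number of sequences of shorts (1) and longs (2) with total length n."""
    a, b = 1, 1                 # shapes(0) = 1 (empty sequence), shapes(1) = 1
    for _ in range(n):
        a, b = b, a + b
    return a

total = sum(shapes(n) for n in range(1, 17))
print([shapes(n) for n in range(1, 7)])   # 1, 2, 3, 5, 8, 13
print(total)                              # 4179
```

The total for lengths 1 to 16 is 4179, which agrees with the estimate in the text; set against it, an observed maximum of about 100 shapes is small indeed.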
A further use for a file of scanned verses is the study of rhyme. As it happens,
not only are all known Apabhramsa texts in verse but also nearly all
the verses are rhymed. Traditional poetics recognizes rhyme or rather rhyme-like
phenomena as a repetition of sounds at the end of words or homonymous
words at the verse end. This leaves room for further investigation in terms of
the lexical choice, morphology and metrical shape of rhyming elements. To begin
with, it is necessary to define rhyme, as understood in European poetics,
in terms of metre. Words with similar endings are only rhymes if they occur at
a determined place in a verse: typically, the end of the verse, but that raises
the question of the relationship between the verse and the lines as presented
on the page. For example the verse referred to above can be printed as
(1) (16)
(17) (32)
or as
(1) (32)
In the former case, each line will end with a rhyming word; in the latter,
there will be a compulsory word-break after the 16th place, and the word which
ends there will rhyme with the word at the end of the line. A more complicated
example is shown in the figure below, in which the numbers represent the place
in the verse, the letters represent the words which rhyme with one another in
the same or in successive lines :
(1) (10)a (18)a (31)b
(1) (10)c (18)c (31)b
It would undoubtedly have simplified the programming if the verses were
typed so that rhymes occurred only at the end of lines, but for the sake of ease
in collation and cross-reference the lines represent the verses as they are usually
printed. So the program must find the places in the verse where a rhyming word
is to be expected, and extract that word. This is done by reading the text file
and the scanned file concurrently (the latter, being derived from the former,
corresponds with it line by line) and counting in any particular line, starting
at the end of the line, the number of word-division markers that occur in the
scanned text before the prescribed metrical place is reached (this is of course
1 in the most frequent and trivial case, end-rhymes); then extracting the word
from the text which occurs after the same number of word-divisions, again
counting from the right. The same process is repeated as necessary in order to
extract the other member of the rhyming pair, and once that has been done,
the elements that are common to the ends of both words are extracted. This is
simple enough when both occur on the same line, a little more delicate when
they occur on different lines, and quite awkward when there is a mixture of
the two schemes as in the example above. In practice I have a procedure for
the case of rhymes in the same line and another for the case of rhymes in a
different line and I simply rewrite a section of the program to cope with each
particular job. It would be more elegant, but very time-consuming, to write
a program that could accommodate every pattern. When the rhyming words
have been extracted, the program sends to a file a record consisting of the
elements common to the endings of the two words, in other words the rhyme,
reversed for sorting purposes, and the two words involved, together with the
line reference of one of them.
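The construction of the output record can be sketched in Python as follows. The sketch covers only the final step, once the two rhyming words have been extracted; it compares the words letter by letter, so it takes no account of digraphs, and the sample words, references and function names are invented for the example.

```python
def common_ending(a, b):
    """Longest string of characters shared by the ends of both words."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return a[len(a) - n:]

def rhyme_record(reference, a, b):
    """Record for the rhyme file: the rhyme reversed (so that sorting groups
    rhymes from their final letter backwards), the two words, a line reference."""
    return (common_ending(a, b)[::-1], a, b, reference)

# Invented sample pairs; the last shares no ending at all.
records = sorted([
    rhyme_record("1.2", "kamalu", "vimalu"),
    rhyme_record("1.4", "gayau", "thiyau"),
    rhyme_record("1.6", "karai", "harau"),
])
for record in records:
    print(record)
```

Pairs with no common ending receive an empty rhyme and therefore sort to the head of the list.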
Of these higher-level programs, the least useful, in the sense of giving
new insights, is the most obvious, that which classifies gana. Scholars have
collected this kind of material before, and it is at least reassuring to find
that the results produced by my program correspond with those produced
manually by others. If such statistics are needed again, they may be entrusted
to a machine. The rhyme program was written and applied in the hope of
confirming a hypothesis that not every allowable and actually-occurring word-
ending is used in rhyme, but that a relatively small group of rhyming elements
is used relatively frequently. It did not fulfil this hope. There are matters to
be discovered about the choice of rhyming words, but the phenomenon is not
simply lexical or morphological as was first thought. On the other hand the
sorted output of rhyming words revealed at the head of the list an unexpected
group of blanks, where the words in rhyming position did not end with common
elements. More precisely, the common elements were of a kind that the program
was not written to detect. Such rhymes are of considerable interest for the study
not only of the poetics but also of the phonemics of the language: it appears
that the poetic rules deem certain non-identical letters to correspond with one
another and certain non-null letters to be null. These rules are of course not
direct evidence for the phonemics of the language but they clearly need to
be taken into account under that heading 8. The rhymes can perfectly well be
collected manually. However when one is not sure at the outset what one is
looking for, it is an advantage to be able to repeat the process of collection
with different criteria in negligible time, and this can be done by means of
minor modifications to the program.
8 For discussions of "impure" rhymes see H. JACOBI, Bhavisatta Kaha von Dhanavala :
eine Jaina Legende in Apabhramsa, Munich, 1918, pp. 52-3; G. BAUMANN, Drei Jaina-Gedichte
in Alt-Gujarati, Wiesbaden, 1975, pp. 1921.
The most useful program is without doubt the one which classifies word-
types. Not only does it yield interesting information on the structure of the
verses which are submitted to it, but it also provides a table of the frequency
of occurrence of each metrical shape of the words in a text. Inasmuch as metrical
shape is considered to be a factor in the historical morphology of the
Indo-Aryan languages 9, this material has an importance beyond the study of
poetic practices, and in a subsequent project this function of the program will
be applied to the automatic construction of word-lists arranged by metrical
shape from the texts under consideration.
TABLE OF DISTRIBUTION OF WORD TYPES
BY STARTING POINT
      1   3   5   7   9  11  13  15  17  19  21  23  25  27  29  31  33  35  37  39  41 ...  SUM
  1   0   0   0   3   0   1   0   0   0   0   0   0   0   0   0   0   1   0   0   0   0        5
  2   8   0   2   0   3   0   2   0   1   0   0  13   0  11   0   5   0   2   0   3   0       50
  3   6   0   3   0   1   0   4   0   0   0   0   0   0   5   0   2   0   0   0   0   0       21
  4   3   0   0   0   0   0   0   0   3   0   2   0   0   2   0   0   2   1   0   0   0       13
  5   4   0   1   0   0   0   0   0   4   0   0   0   0   3   0   0   0   9   0   0   0       27
  6   0   0   0   1   0   0   0   1   0   0   0   0   0   0   0   1   0   0   0   0  15       18
  7   5   0   4   0   4   0   0   0   0   0   0   0   0   6   0   2   0   0   0   0   0       21
  8   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0        1
  9   1   0   0   0   1   0   1   0   0   0   0   0   0   2   0   1   0   0   0   0   0        6
 10   0   0   0   0   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0        1
 11   0   0   0   0   0   0   0   0  10   0   0   0   0   0   0   0   0   0   0   0   0       10
 12   0   0   1   0   0   0   2   0   0   0   0   0   0   0   0   0   0   0   0   5   0        8
(...)
 37   0   0   0   0   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0        1
 38   0   0   0   0   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0        1
 39   0   0   0   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0        1
 40   0   0   0   0   0   0   0   0   0   0   0   0   0   1   0   0   0   0   0   0   0        1
 41   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0        1
 42   0   0   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0        1
 43   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0        1
 44   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0        1
SUM  40   0  14   7  15   1  13   2  21   0   2  13   0  38   0  16   5  17   0  12  15
9 O. VON HINÜBER, Das ältere Mittelindisch im Überblick, Vienna, 1986, p. 90.
