Yaron Matras Ed. Romani in Contact The History, Structure and Sociology of A Language

ROMANI IN CONTACT
AMSTERDAM STUDIES IN THE THEORY AND
HISTORY OF LINGUISTIC SCIENCE
General Editor
E.F. KONRAD KOERNER
(University of Ottawa)
Series IV - CURRENT ISSUES IN LINGUISTIC THEORY
Advisory Editorial Board
Henning Andersen (Los Angeles); Raimo Anttila (Los Angeles)
Thomas V. Gamkrelidze (Tbilisi); John E. Joseph (Hong Kong)
Hans-Heinrich Lieb (Berlin); Ernst Pulgram (Ann Arbor, Mich.)
E. Wyn Roberts (Vancouver, B.C.); Danny Steinberg (Tokyo)
Volume 126
Yaron Matras (ed.)
Romani in Contact
ROMANI
IN CONTACT
THE HISTORY, STRUCTURE
AND SOCIOLOGY OF A LANGUAGE
Edited by
YARON MATRAS
University of Hamburg
JOHN BENJAMINS PUBLISHING COMPANY

AMSTERDAM/PHILADELPHIA
The paper used in this publication meets the minimum requirements of
American National Standard for Information Sciences Permanence of
Paper for Printed Library Materials, ANSI Z39.48-1984.
Library of Congress Cataloging-in-Publication Data

Romani in contact : the history, structure, and sociology of a language / edited by Yaron
Matras.
p. cm. (Amsterdam studies in the theory and history of linguistic science. Series
IV, Current issues in linguistic theory, ISSN 0304-0763 ; v. 126)
Papers presented at a workshop held May 1993, Hamburg, Ger., and later named the 1st
International Conference on Romani Linguistics.
Includes bibliographical references and index.
Contents: On typological changes and structural borrowing in the history of European
Romani / Vít Bubeník -- On the migration and affiliation of the Domba : Iranian words in
Rom, Lom, and Dom Gypsy / Ian Hancock -- Plagiarism and lexical orphans in the European
Romani lexicon / Anthony Grant -- Interdialectal interference in Romani / Norbert Boretzky
-- Verb evidentials and their discourse function in Vlach Romani narratives / Yaron Matras
-- Notes on the genesis of Caló and other Iberian Para-Romani varieties / Peter Bakker --
Romani lexical items in colloquial Romanian / Corinna Leschber -- Romani standardization
and status in the Republic of Macedonia / Victor A. Friedman -- Trial and error in written
Romani on the pages of Romani periodicals / Milena Hübschmannová.
1. Romany language-Congresses. 2. Languages in contact-Congresses. I. Matras, Yaron,
1963- . II. International Conference on Romani Linguistics (1st : 1993 : Hamburg, Germany)
III. Series.
PK2896.R66 1995
491'.499-dc20 95-15437
ISBN 90 272 3629 1 (Eur.) / 1-55619-580-X (US) (alk. paper) CIP
Copyright 1995 - John Benjamins B.V.
No part of this book may be reproduced in any form, by print, photoprint, microfilm, or any
other means, without written permission from the publisher.
John Benjamins Publishing Co. P.O.Box 75577 1070 AN Amsterdam The Netherlands
John Benjamins North America P.O.Box 27519 Philadelphia PA 19118-0519 USA
ACKNOWLEDGEMENTS
In May 1993, researchers in Romani linguistics from seven countries
gathered in Hamburg for a workshop entitled "Romani in Contact with
Other Languages". Following a two-day discussion on papers and general
perspectives in Romani linguistics, they decided to continue meeting on a
regular basis, and to re-name the workshop: it became the First International
Conference on Romani Linguistics. As these lines are being written, the
final touch is being given to the organizational work in preparation for the
Second International Conference on Romani Linguistics, to be held in
Amsterdam.
The papers collected in this volume are revised versions of some of the
contributions presented at the Hamburg conference. I wish to thank the
participants, who agreed to share their ideas and insights with one another
and with an interested public in this form. I also thank Jochen Rehbein and
Ian Hancock, who helped initiate the Hamburg workshop, and the Program
on Multilingualism and Language Contact at the University of Hamburg,
which, supported by the German Research Society, agreed to share part of
the financial burden involved in organizing the conference and producing
the volume.
I am very grateful to Victor Friedman, Ian Hancock, Sarah Thomason,
and Jochen Rehbein, who served as an advisory board, for their valuable
comments and suggestions and for their contribution to the editorial work.
Finally, I owe very special thanks to Ulrike Erichsen, who invested
countless days and weeks tackling numerous editorial and technical
problems in order to enable a prompt completion of this project.
Yaron Matras
Hamburg, November 1994
CONTENTS
Introduction
Yaron Matras
On typological changes and structural borrowing in the history of
European Romani
Vít Bubeník
On the migration and affiliation of the Dōmba: Iranian words
in Rom, Lom and Dom Gypsy
Ian Hancock
Plagiarism and lexical orphans in the European Romani lexicon
Anthony Grant
Interdialectal interference in Romani
Norbert Boretzky
Verb evidentials and their discourse function
in Vlach Romani narratives
Yaron Matras
Notes on the genesis of Caló and other Iberian
Para-Romani varieties
Peter Bakker
Romani lexical items in colloquial Romanian
Corinna Leschber
Romani standardization and status in the Republic of Macedonia
Victor A. Friedman
Trial and error in written Romani on the pages of
Romani periodicals
Milena Hübschmannová
INTRODUCTION
Language contact has been a central issue in Romani linguistics ever
since its very early beginnings. The fact that a language existed with no
coherent geographical area, and that bits and pieces of that language most
obviously paralleled some of Europe's best known idioms, while others
remained obscure in their structure and origin, had led scholars and other
spectators to many speculations regarding the classification and definition
of the Gypsy varieties.
It was Johann Rüdiger who, drawing on clues provided by some of his
colleagues, was the first to provide a solid explanation for the origin of the
Romani language, tracing it back to the Indo-Aryan varieties of India
(Rüdiger 1782). Rüdiger was an empiricist, who collected Romani data via
fieldwork and compared them to descriptions of Hindostani (Hindi)
available in the literature. He concluded that Romani was in essence Indic,
and that it differed from its modern genetic relations in a similar fashion as
do related languages in Europe, such as German, English, and Danish, or
Italian and French, from one another (Rüdiger 1782: 70). Furthermore, he
pointed to the fact that Romani "copied" the structures of the surrounding
languages, reproducing them by using elements of the inherited Indic stock,
as exemplified by the emergence of definite and indefinite articles, or by a
general syntactic re-arrangement (p. 71, 77). A critic of his times and a
hardly acknowledged explorer of new methodological paths, Rüdiger not
only provided the proof for the real genetic affiliation of Romani, but he
also taught us that appreciating the impact of language contact is crucial for
understanding the typology of the language.
Nearly a century after the appearance of Rüdiger's article, Franz
Miklosich published his first contribution in a series surveying the Romani
dialects of Europe (Miklosich 1872-1880; 1874-1878). Miklosich relied on
the borrowed lexical component of the various dialects in order to trace the
migration routes of the Roma through Europe, and established their
European origin in a Greek-speaking community, basing his arguments on
the Greek element in the lexicon and morphology which all varieties of the
language share (Miklosich 1872-1880, III).
Discussions of the Indic component of Romani and its position in
relation to the genetically affiliated languages of India flourished in the late
1800's and in the early decades of this century. Here too, scholars pointed to
the hybrid character of Romani, some arguing that contact was responsible
for the fact that Romani shared distinct features with different varieties of
subcontinental Indo-Aryan. While Turner (1926) advocated a historical
layer-solution, arguing for the emergence of Romani in Central India and a
subsequent migration to the Northwest, where it remained unaffected by
later developments within the Central languages, Sampson (1926: 29)
suggested a merger of various dialects as a possible solution.
Since Gilliat-Smith (1915-1916), dialect classification in Romani has
relied heavily on the distinction between Vlach and non-Vlach varieties, the
Romanian influence upon the former being one of the significant criteria.
Besides reviving this latter distinction (Kochanowski 1963), modern
Romani linguistics has directed much of its efforts toward investigating the
structural and sociological background for the emergence of those Gypsy
varieties which are based on the surrounding so-called host-languages, but
retain, at least in part, a Romani vocabulary (see Bakker & Van der Voort
1991 for an overview). Growing interest in these varieties during recent
years (cf. Bakker & Cortiade 1991) has partly to do with the fact that their
structures are more easily accessible to non-Romanologues than are those of
'inflected Romani' (so the term for Romani proper), but it is also connected
to the theoretical questions which they pose. Much discussion has been
dedicated to seeming similarities between such idioms and Creoles (e.g.
Hancock 1970; Boretzky 1985; Acton 1989), before the term Para-Romani
(Cortiade 1991) was suggested. Romani 'mixed' varieties have been
presented as further evidence for the existence of a 'mixed language' type
which owes its genesis to a process of 'broken transmission' across
generations - evidence put forth in order to challenge the conventional
notion of gradual genetic development as the primary course of linguistic
evolution (Thomason & Kaufman 1988).
But inflected Romani, a genuine genetic relation of Modern Indo-
Aryan, also possesses structural features which confront descriptivists with
a challenge. It has been shown that apart from borrowing lexical items,
Romani also treats borrowed morphological material systematically,
assigning much of it selectively to the borrowed lexicon, a phenomenon
referred to as thematic vs. athematic grammar (Hancock 1993). While this
is generally characteristic of the language as a whole (cf. Kostov 1973;
Boretzky 1989), specific dialects also show their own typical patterns of
adoption (Kostov 1973; Igla 1989; Boretzky & Igla 1991). Besides selective
integration of borrowed morphology, Romani dialects have incorporated
phonemes and, in some cases, phonological distinctions based on those of
the surrounding languages (Boretzky 1991; Boretzky & Igla 1993).
Although areal typology in the context of Balkan studies has largely ignored
the language, investigations into single grammatical constructions have
pointed out Balkan features in Romani (Kostov 1962; Friedman 1985, 1991;
Boretzky 1986, 1993), and recently an attempt has been made to classify the
core syntactic structure of Romani as a case of typological shift stimulated
heavily, perhaps even triggered by the dynamics of language convergence
in the Balkans, rendering a balkanized Indic language (Matras 1994).
Much of the attention which Romani has increasingly received in recent
years is thus connected to the fact that contact has played such a central role
in shaping its grammatical structure, there being virtually no
unilingual Romani community. Such interest in turn reflects growing
awareness that language contact has its impact upon numerous domains of
everyday communication in modern society, with linguistic manifestations
ranging from individual code switching and institutionalized bilingualism to
grammatical interference and structural change. Apart from the interest in
structural aspects of language contact, however, general appreciation of
Romani is on the rise as, following the political transition in eastern Europe
and an ongoing cultural and political re-unification process among the
Romani communities in Europe, the presence of a Romani-speaking
population of a significant size is acknowledged, and new ways of
incorporating its cultural needs, interests, and demands into existing
institutional and societal frameworks are being explored. Much activity has
therefore centered lately around efforts at conventionalizing Romani for use
as a vehicle of literary communication. But here too, contact proves to be a
primary consideration since every written variety of Romani coexists
alongside a national literary language, with which it shares a variety of
functions.
Approaching the language via this unique aspect - its ever-surfacing
character as a language in contact - this volume seeks to give expression to
part of the wide range of research represented in today's field of Romani
linguistics. It reflects both traditional and most recent domains of interest,
dealing with issues of Romani origin, dialect diversity, mixed varieties, and
Romani loans, as well as grammatical categories, discourse-pragmatics,
standardization and literacy, and a critical assessment of the discipline itself.
It is rather striking that interest in Romani among Indologists seems to
have decreased since the first part of this century, when scholars of older
and modern Indo-Aryan made significant contributions to the study of
Romani structure and origin. This old tradition in Romani studies is revived
by Vit Bubenik (St John's), who looks at a number of areas in Romani
morphosyntax from the perspective of Middle and Modern Indo-Aryan
languages and dialects. Bubenik shows that, while preserving some basic
Indo-Aryan formation patterns at the morphosyntactic level, Romani also
adheres to the 'average European' SVO, prepositional, analytic type. This
results in a typological competition of structures within the language,
which, as Bubenik points out, renders a genuine case of overall systemic
complication. This observation based on empirical evidence from Romani
dialects merits special attention, as contact is often believed to lead
necessarily to structural simplification.
As mentioned above, conjectures regarding the origin of the Romani
people and their language, and subsequently concerning the time, the
nature, and the circumstances of their exodus from India are at least as old
as Romani studies itself. Ian Hancock (Austin) attempts to illuminate some
aspects of a puzzle which is yet to be solved by examining lexical elements
of Iranian origin in three Gypsy languages: European Romani, Near-Eastern
Domari with its varieties, and Armenian Lomavren. Hancock compiles the
relevant items from scattered sources, and concludes on the basis of a
comparative analysis that the loan vocabulary of Iranian origin shared by
any of the languages is so scarce (that shared by all three, in fact, non-
existent), that it is highly unlikely that the three groups crossed Iranian-
speaking territory together.
Lexical issues are also the focus of Anthony Grant's (Bradford)
contribution. Grant discusses the problem of 'lexical orphans', that is to say,
words which are attested only in one dialect of the language, and their
importance for reconstructing the common lexical core of early Romani, a
task which, beyond its descriptive achievement, could bear significance
with respect to standardization efforts. But he also approaches Romani
studies itself in a critical manner, tracing plagiarism and dissemination of
spurious data across the pages of some of the discipline's most prominent
early representatives. In doing so, the author calls our attention to a delicate
but crucial matter which touches on the credibility of work considered basic
in Romani philology, and entails a special challenge to those involved in
contemporary Romani studies.
Romani dialectology, much like the investigation of dialects in other
languages, has traditionally been rather strict in drawing isoglosses to mark
interdialectal boundaries. Norbert Boretzky (Bochum) demonstrates that
this is, in reality, a rather difficult endeavor. Although there are strict social
boundaries between Romani communities, mixing takes place at the
phonological, morphological and probably also lexical levels. Concentrating
on Romani dialects of former Yugoslavia, Boretzky examines oral data and
data gathered from informants, but also written Romani prose. This
demonstrates how literary Romani is gaining territory, and how (beyond its
primary tasks) it also furnishes, like any other literary language does,
material for descriptive analyses.
Nevertheless, our main source for Romani narration remains the oral
tradition of the language, whether in the institutionalized form of story-
telling, or simply in the way of reconstructing one's own experiences. Yaron
Matras (Hamburg) introduces discourse-pragmatics into Romani studies in
an attempt to shed new light on the function of a particular grammatical
category in the language, namely the split in past tense formation. Here,
Romani occupies a unique position among those subcontinental Indic
languages believed to be more closely akin to it due to its lack of ergativity.
On the other hand, the function of the structure in question is interpreted by
Matras as a coding of evidentiality, a typical feature of the Balkan
languages. We thus have a further case illustrating the hybrid character of
the typological formation of the language.
Hypotheses concerning the genesis of Para-Romani languages vary,
some scholars arguing for the conscious creation of a secret language
common to Gypsies and other marginalized groups, others assuming a
gradual integration of material from the 'host-language' into Romani, and
yet another view explaining their emergence as a case of language shift.
Peter Bakker (Amsterdam) discusses sources on Caló, a Para-Romani
variety of the Iberian peninsula. On the basis of the apparent functions of
some retained morphological elements of Romani origin, and especially the
nature of the Spanish components, Bakker remarks that an early genesis of
Caló, that is, its emergence shortly after the arrival of the Roma in the
peninsula, is most probable.
From the point of view of language-contact studies, Romani is
generally treated as the recipient language for borrowed grammar and
lexical elements. Romanian influence on Romani in particular has been given
much attention, having played an important role in shaping what we now
designate as the Vlach Romani dialects. Corinna Leschber (Berlin)
confronts us with the opposite perspective, surveying evidence for the
impact of Romani on colloquial Romanian. Checking published sources
against empirical fieldwork with various control groups, she compares the
status Romani words have acquired in the speech of ethnic Romanians with
that of similar items among Romanian-speakers of Romani origin. While
semantic shift, explained by the need to fill slots in taboo-domains, is
characteristic of the first, semantic consistency, that is, strong affinity with
the Romani source, is typical of the second. We thus gain a valuable insight
into the mechanisms of language loss and the emergence, instead, of a
community form of speech, bearing perhaps some significance for a general
understanding of the genesis of Para-Romani varieties as well.
The process of assisting an oral language to become a vehicle of written
communication is often referred to as 'standardization', but in fact it is much
more complex than selecting or defining a standard norm. The last two
papers of this volume present contrasting case studies on the emergence of
an orthographic norm and, more generally, of a literary variety of Romani.
Victor A. Friedman (Chicago) discusses officially-backed efforts to design a
standard Romani alphabet in the Republic of Macedonia. Friedman
emphasizes the fact that normativization can also be seen as a contact
process, just like lexical or structural borrowing, since the development of a
Romani alphabet is taking place in contact with the elaboration processes of
other standard languages. He illustrates how decisions take into account
these contact factors when selecting and codifying a norm. Milena
Hübschmannová (Prague), on the other hand, shows how in the Czech
Republic written, normative Romani is developing gradually and almost
spontaneously through trial and error. She surveys some of the features of
these literary varieties, tracing many of them back to the merger with
structures of the contact language Czech, and makes special reference to the
comparatively large inventory of Romani publications in this country. Each
of the two papers by Friedman and Hübschmannová provides an analysis of
a specific context of standardization, but a comparative examination of
some of the data on orthographic problems reveals that while contact with
the respective national languages is indeed crucial, an orientation toward an
international Romani variety can also be detected, both in the way
phonemes are recognized and in the manner in which their graphemic
representation is established.
All the authors in this volume assume a primarily descriptive point of
view. However, those actively engaged in promoting the use, study,
recognition and standardization of the Romani language in order to fill
whatever social, cultural and political needs are felt necessary to be covered
by it, might feel encouraged by the fact that this collection also aims at
stimulating further discussion and involvement in Romani studies and its
applied domains. By emphasizing the importance of curiosity and
discovery, as opposed to prejudice and ignorance, this book can constitute
but a very modest contribution to normalizing the difficult position still
assigned to the Roma by the majority in our society.
YARON MATRAS
REFERENCES
Acton, Thomas (1989) The value of 'creolized' dialects of Romani. In:
Šipka, Milan et al. (eds.) Jezik i kultura Roma. Sarajevo: Institut za
proučavanje nacionalnih odnosa. 169-180.
Bakker, Peter & Marcel Cortiade (eds.) (1991) In the Margin of Romani.
Gypsy Languages in Contact. (Studies in Language Contact 1).
Amsterdam: Institute for General Linguistics.
Bakker, Peter & Hein van der Voort (1991) Para Romani languages: An
overview and some speculations on their genesis. In: Bakker, Peter &
Marcel Cortiade (eds.) In the Margin of Romani. Gypsy Languages in
Contact. (Studies in Language Contact 1). Amsterdam: Institute for
General Linguistics. 16-44.
Boretzky, Norbert (1985) Sind Zigeunersprachen Kreols? In: Stolz,
Thomas, Norbert Boretzky & Werner Enninger (eds.) Akten des I.
Essener Kolloquiums über 'Kreolsprachen und Sprachkontakte'.
Bochum: Brockmeyer. 43-70.
Boretzky, Norbert (1986) Zur Sprache der Gurbet von Pristina
(Jugoslawien). Giessener Hefte für Tsiganologie 3:1-4, 195-216.
Boretzky, Norbert (1989) Zum Interferenzverhalten des Romani.
(Verbreitete und ungewöhnliche Phänomene). Zeitschrift für
Phonologie, Sprachwissenschaft und Kommunikationsforschung 42:3,
357-374.
Boretzky, Norbert (1991) Contact-induced sound change. Diachronica 38:1,
1-15.
Boretzky, Norbert (1993) Conditional clauses in Romani. Sprachtypologie
und Universalienforschung 43:2, 83-99.
Boretzky, Norbert & Birgit Igla (1991) Morphologische Entlehnung in den
Romani-Dialekten. (Arbeitspapiere des Projekts Prinzipien des
Sprachwandels Nr. 4). Essen: Universität Essen, Fachbereich Sprach-
und Literaturwissenschaften.
Boretzky, Norbert & Birgit Igla (1993) Lautwandel und Natürlichkeit.
Kontaktbedingter und endogener Wandel im Romani. (Arbeitspapiere
des Projektes Prinzipien des Lautwandels 15). Essen: Universität Essen,
Fachbereich Sprach- und Literaturwissenschaften.
Cortiade, Marcel (1991) Romani versus Para-Romani. In: Bakker, Peter &
Marcel Cortiade (eds.) In the Margin of Romani. Gypsy Languages in
Contact. (Studies in Language Contact 1). Amsterdam: Institute for
General Linguistics. 1-15.
Friedman, Victor A. (1985) Balkan Romani modality and other Balkan
languages. Folia Slavica 1/3, 381-389.
Friedman, Victor A. (1991) Case in Romani: Old grammar in new affixes?
Journal of the Gypsy Lore Society 5/1-2, 85-102.
Gilliat-Smith, Bernard (1915-1916) Report on the Gypsy tribes of north-east
Bulgaria. Journal of the Gypsy Lore Society (New Series), 9/1-4, 65-
109.
Hancock, Ian (1970) Is Anglo-Romanes a creole? Journal of the Gypsy Lore
Society 3, 49/1-2, 41-44.
Hancock, Ian (1993) A Grammar of Vlax Romani. Austin: Romanestan.
Igla, Birgit (1989) Kontakt-induzierte Sprachwandelphänomene im Romani
von Ajia Varvara (Athen). In: Boretzky, Norbert, Werner Enninger und
Thomas Stolz (Hgg.) Vielfalt der Kontakte. (Beiträge zum 5. Essener
Kolloquium über Grammatikalisierung, Natürlichkeit und
Systemökonomie. Bd. 1). Bochum: Universitätsverlag N. Brockmeyer.
67-80.
Kochanowski, Jan (Vania de Gila) (1963) Gypsy Studies. New Delhi:
International Academy of Indian Culture.
Kostov, Kiril (1962) Aus der Syntax der Zigeunersprache Bulgariens.
Linguistique Balkanique IV, 131-146.
Kostov, Kiril (1973) Zur Bedeutung des Zigeunerischen für die Erforschung
grammatischer Interferenzerscheinungen. Linguistique Balkanique
XVI-2, 99-113.
Matras, Yaron (1994) Untersuchungen zu Grammatik und Diskurs des
Romanes (Dialekt der Kelderaša/Lovara). Wiesbaden/Berlin:
Harrassowitz.
Miklosich, Franz (1872-1880) Über die Mundarten und Wanderungen der
Zigeuner Europas. I-XII. Wien: Karl Gerold's Sohn.
Miklosich, Franz (1874-1878) Beiträge zur Kenntnis der
Zigeunermundarten. I-II. Wien: Karl Gerold's Sohn.
Rüdiger, Johann Ch. Ch. (1782 [1990]) Von der Sprache und Herkunft der
Zigeuner aus Indien. Reprint from: Neuester Zuwachs der teutschen,
fremden und allgemeinen Sprachkunde in eigenen Aufsätzen, Part 1.
Leipzig 1782, 37-84. Hamburg: Buske.
Sampson, John (1926 [1968]) The Dialect of the Gypsies of Wales. Oxford:
Clarendon Press.
Thomason, Sarah G. and Terrence Kaufman (1988) Language Contact,
Creolization, and Genetic Linguistics. Berkeley: University of
California Press.
Turner, Ralph L. (1926) The Position of Romani in Indo-Aryan. Journal of
the Gypsy Lore Society III-5/4, 145-194.
ON TYPOLOGICAL CHANGES AND STRUCTURAL
BORROWING IN THE HISTORY OF EUROPEAN ROMANI
VIT BUBENIK
Memorial University of Newfoundland
0. Introduction
Recently there has been an upsurge of interest among historical linguists,
typologists and sociolinguists in the interference phenomena of European Romani.
Thomason and Kaufman's Language Contact, Creolization, and Genetic Linguistics
(1988) brought them to the attention of North American audiences. One of their
achievements is their explicitly stated borrowing scale (pp. 74-6), which assigns
numerical values from 1 to 5 on the basis of lexical and structural borrowing:
category (1), lexical borrowing only; (2) and (3), slight structural borrowing;
(4), moderate structural borrowing; and (5), heavy structural borrowing.
Various Romani dialects appear at the heavy end of their borrowing scale
(category 5); they underwent changes through borrowing that range from
moderate to heavy. As is well known, these include lexical items of all types
(to enlarge a very restricted vocabulary of ca. 1,000 inherited words of Indic
origin); borrowed phonemes (including entire contrastive sets such as
palatalized consonants in Russian Romani); development of a typical Balkan
periphrastic future; Aktionsart prefixes borrowed from Slavic languages; a plural
suffix borrowed from Romanian in Vlach dialects; special paradigms,
modeled on Greek, for inflecting borrowed nouns, adjectives, and verbs,
etc. This interference has been going on and still continues, especially in the
sphere of grammatical categories and syntactic rules.
Any work on interference phenomena from Medieval and contemporary
European languages - starting in the 13th century - has to refer to the
pre-European linguistic type of Romani. This type has been labeled Thematic
grammar by Hancock (1992: 11). Its grammatical rules apply to the original
Indic words, and to words acquired from the other languages the 'Dōmba'
(cf. Hancock, this volume) met during their migration through Asia before
reaching Europe (most notably Middle Persian, Kurdish, Armenian and
Byzantine Greek). Athematic grammatical rules apply to words acquired from
the Medieval and contemporary European languages: Medieval Greek, South
Slavic, Romanian, East and West Slavic, Hungarian, German, etc.
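The division of labour between the two rule sets can be sketched as a simple lookup. The following Python fragment is only an illustration, assuming that each word is tagged with its source language; the function name `grammar_type` and the two set names are hypothetical, and the sets contain only the languages named above (the "etc." of the text is not expanded).

```python
# Source languages as listed in the text; the grouping into two Python sets
# is this sketch's own encoding, not part of Hancock's description.
PRE_EUROPEAN_SOURCES = {
    "Indic", "Middle Persian", "Kurdish", "Armenian", "Byzantine Greek",
}
EUROPEAN_SOURCES = {
    "Medieval Greek", "South Slavic", "Romanian", "East Slavic",
    "West Slavic", "Hungarian", "German",
}


def grammar_type(source_language: str) -> str:
    """Return which rule set applies to a word from the given source language."""
    if source_language in PRE_EUROPEAN_SOURCES:
        return "thematic"
    if source_language in EUROPEAN_SOURCES:
        return "athematic"
    # Languages not named in the passage are deliberately left unclassified.
    raise ValueError(f"source language not covered by the sketch: {source_language}")


for language in ("Indic", "Armenian", "Romanian", "Hungarian"):
    print(language, "->", grammar_type(language))
```

The lookup makes one point of the passage explicit: the relevant criterion is the stage of the migration at which a word was acquired, so that Byzantine Greek and Medieval Greek fall on opposite sides of the divide.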
Two issues will be examined in this paper:
(i) What was Romani morphosyntax (=Thematic grammar) like during its
formative pre-European stage?
(ii) What is the typological distance between the state of affairs during this
ancestral Middle Indo-Aryan (MIA) stage, on the one hand, and its
contemporary European descendant, on the other hand (or, in
Thomason-Kaufman's terminology, the degree of structural borrowing by European
Romani (ER) as a result of strong cultural pressure from European
languages)?
A satisfactory answer to (i) boils down to our ability to place
ancestral Romani on the Stammbaum of MIA languages and dialects, and to be
explicit about its contacts with the linguistic varieties met during the
out-migration from India. This has been done relatively successfully for
phonology and lexicon; the results for morphology are - as far as I have
surveyed the field - less conspicuous. To mention only three eminent
Romanologues: Turner (1926) in his influential article on the position of
Romani in Indo-Aryan devoted less than two pages to morphology (but 38 to
matters of phonology). Sampson's presentation (1926) is more balanced
(phonology 39 pages, word formation 57 pages, inflection and syntax 106 pages),
but many of his intuitive statements about matters of historical morphosyntax
have to be carefully re-examined; and, finally, Bloch (1933) in his classic
Indo-Aryan from Vedas to Modern Times used Romani data very sparingly
in his portrayal of the MIA and New Indo-Aryan (NIA) stages. Another
serious shortcoming was the manner in which the Romani data were fitted
into the overall mosaic of MIA languages and dialects: not enough was
known about the crucial late MIA and early NIA stages because most
Apabhramsa and certain early NIA texts were published only quite recently, in
the 1970s and 1980s. I have been working during the last decade in the area of
MIA linguistics and would like to present some of my ideas about the
morphosyntactic characteristics of pre-European Romani (or, perhaps,
Proto-European Romani).
I propose to revisit the following areas of Romani morphosyntax:
1. the formation of the future tense
2. the imperfect and conditional
3. the preterite/perfect, and
4. the analytic and synthetic passive.
1. Formation of the future tense

Synchronically speaking, in the Carpathian dialects the formant -a is
added to the present tense forms in all the persons; in the h-dialects the -s in
the 2nd Sg and the 1st Pl is replaced by -h (but it remains in the s-dialects):
(1)         Present    Future: h-dialects    s-dialects
Sg  1       -av        -av-a                 -av-a
    2       -es        -eh-a                 -es-a
    3       -el        -el-a                 -el-a
Pl  1       -as        -ah-a                 -as-a
    2/3     -en        -en-a                 -en-a
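The rule stated before (1) can be sketched in a few lines of Python. This is a minimal illustration, assuming the present-tense endings given in (1); the function name `future_ending` and the dialect labels "h" and "s" are hypothetical conveniences of the sketch, not terminology taken from the descriptive literature.

```python
# Present-tense endings as listed in (1), keyed by (number, person).
PRESENT_ENDINGS = {
    ("Sg", 1): "-av",
    ("Sg", 2): "-es",
    ("Sg", 3): "-el",
    ("Pl", 1): "-as",
    ("Pl", 2): "-en",
    ("Pl", 3): "-en",
}

# Slots in which the h-dialects replace final -s by -h (2nd Sg and 1st Pl).
S_TO_H_SLOTS = {("Sg", 2), ("Pl", 1)}


def future_ending(number: str, person: int, dialect: str) -> str:
    """Build the future ending by adding the formant -a to the present ending."""
    if dialect not in ("h", "s"):
        raise ValueError("dialect must be 'h' or 's'")
    ending = PRESENT_ENDINGS[(number, person)]
    if dialect == "h" and (number, person) in S_TO_H_SLOTS:
        ending = ending[:-1] + "h"  # -es > -eh, -as > -ah
    return ending + "-a"


for slot in PRESENT_ENDINGS:
    print(slot, future_ending(*slot, "h"), future_ending(*slot, "s"))
```

The two dialect groups differ in only two cells of the paradigm; everywhere else the future is simply the present form followed by -a.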
As is well known, the 2nd and 3rd Pers Pl display an identical suffix,
-en, which originally was the suffix of the 3rd Pl. The reason for its
extension to the 2nd Pers Pl is seen in the fact that the 2nd Pers Pl became
homophonous with the 3rd Pers Sg. The 3rd Sg *-et > -ed > -el and the 2nd
Pl *-eth > -ed > -el would have merged into -el as a result of the lateralization
of t (via d) of the original suffixes *-et and *-eth.
I could not find any satisfactory answer to what this -a is historically and
the following section is meant as a contribution towards the elucidation of its
diachronic status. Typologically speaking, the strategy of forming the future
tense by cliticizing one of the "particles" to the finite forms of the present
tense is well-established in the Indo-Iranian languages. There are at least three
different particles used in Indo-Iranian languages to accomplish this task. The
best known is Hindi ga: which can be traced back to the Middle Indo-Aryan
form gaya:, i.e. the past participle of the verb "go" (<OIA gata). Hindi ga:
shows different marking for gender (gi: Fem) and number (ge: Pl
Masc/Fem).
In Modern Rajasthani dialects the ga:-particle may or may not show
marking for gender and number. Thus in the Western (Marwari) dialects go:
shows marking for gender and number in the same fashion as Hindi does.
The masculine suffixes are different: Hindi ga: (Sg), ge: (Pl) vs. Marwari go:
(Sg), ga: (Pl). In the South-Eastern dialects spoken in Malwa (around the city
of Ujjain) ga: is indeclinable. The other future tense particle used in
Rajasthani dialects is la: (used also in Marathi, Bhili and the Himalayan
dialect). In Marwari the particle la: is indeclinable (as ga: in Malvi), while
Jaipuri declines lo: for gender and number in the same fashion as Marwari
does in the case of the particle go:.
Given the parallelism with the ga:-future the source of the la:-future
might be the past participle of the OIA (Sanskrit) verb la:- "grasp, take" (this
verb was replaced in MIA by le-, presumably, by non-proportional analogy
with de- "give").
All these forms are surveyed in (2):
(2) Formation of the future tense in Rajasthani dialects
Malvi Marwari Jaipuri

M F M F
Sg 1 cal: cal: =go: cal: =lo
2/3 cale ai ai
=ga: =la: =gi: =li:
Pl1 cal: : :
2 calo 0 =ga: 0 =la:
3 cale ai ai
Structure: "go"+Pres/Pers/No = PRT (Malvi)
= ga: (< "go"+PP)
"go"+Pres/Pers/No = PRT (Marwari)
= la: (< "take"+PP)
"go"+Pres/Pers/No = PRT+Gender/No (Jaipuri)
M F
= lo (< "take"+PP)
H:
la:
But now the crucial question: how old are these constructions in our late
MIA or early NIA texts?
To the best of my knowledge they do not appear in the Apabhramsa
texts. As far as the early NIA texts are concerned, it should be noted that the
ga:-future is not found in the medieval poets from the central area; on the
other hand, it did not replace completely the old synthetic sigmatic future in
the regional dialects (e.g. Bundeli calha "I will go" continues the OIA
sigmatic future carisya:mi). As far as the la:-future is concerned, it is
documented in the southern area in Marathi since the 12th century and in Old
Western Rajasthani (ancestral to Old Marwari) since the 16th century (data in
Tessitori 1915).1
As in the central area, in Rajasthan the la:-future never ousted completely
the sigmatic future (e.g. in Jaipuri these two coexist: mar:=lo ~ ma:rasy:,
the latter from OIA marisya:mi).
To round out our typological survey, I would like to add that in Rajasthani
(and Gujarati) and certain dialects of Western Hindi (Bangaru spoken in the
Eastern Panjab) also the progressive aspect/immediate future may be formed
by cliticizing the copula to the finite forms of the present tense (unlike Hindi
which may cliticize the copula only to the non-finite participial forms to form
its general present, marta: h: "I strike" or the progressive aspect/continuous
present mar raha: h: "I am striking"). Similarly, on the Iranian side, Pashto
forms its imperfective future by cliticizing the particle ba to the present tense
forms (cf. Bashir 1991: 62). These data are presented in (3):
(3) Formation of the progressive aspect/immediate future in Rajasthani and
Bangaru, and imperfective future in Pashto
Marwari Bangaru Pashto

(Bagri) (=Jatu)
Sg 1 ma:r:=h: ma:r:=s: lwgem

2 e=hai ai=sai e =ba
3 e=hai ai=sai i
"I am/will be striking" "I am/will be striking" "I will be falling"
The earliest attestations of the above constructions are from the 16th
century in Old Western Rajasthani (cf. Tessitori 1915).2
Having said all this, I am not inimical to the idea of considering the
suffix -a of the future tense in ER as a relic of the particle such as known
from Modern IA languages in (2) and (3); whether it was originally a clitic
form of the copula (as in Bangaru or Rajasthani) or the verb "to go" (as in
Hindi and Rajasthani) or the verb "to take" (as in Rajasthani) we will not be
able to decide with any degree of certainty.3 A propos the former two, the
copula and "to go", it should be kept in mind that their forms merged
phonologically in ER. A propos the verb "to take", Bloch (1933: 288) made
an intriguing comment that 'the convergence with the use of the Russian
Romani la- "to take", should be noted'. A few observations of mine follow.
North Russian Romani (Wentzel 1980: 95) forms its imperfective future
by combining the verb lava "to take" in the present with the subjunctive form
of the main verb. Boretzky (1989: 369) proposed that the foreign model
could have been Ukrainian which - unlike Russian - forms its future by
cliticizing the present forms of the verb jati "to take" to the infinitive.
Alternatively, given our evidence for the "take"-future in medieval
Rajasthani (16th c.) and Old Marathi (12th c.) we might propose a genetic
solution in the sense of a common historical source. The problems, however,
might be insurmountable. How old is the "take"-future of Russian Romani?
How far back can we project the "take"-future of early NIA languages? Both
proposals are schematized in (4):
(4) The "take"-future in North Russian Romani
Ukrainian Russian Romani (structural borrowing)

Sg 1 citat=imu lva te g'inav "I will read"
2 =imes lsa es
3 =ime lla el
(genetic solution) *kar-am la: (< "take")
Old Rajasthani North Russian Romani

(no morpho-syntac- (change in morphosyntactic typology):
tic change) SOV > SVO
(MV Aux) MV Aux > Aux MV
kar:=la: lava te kerav
2. Formation of the imperfect and conditional

The imperfect and present conditional are formed by attaching the
formant -as to the present tense forms; the past conditional by attaching the
same formant to the forms of the perfect; their singular paradigms are given in
(5):
(5) Present Imperfect and Present Conditional

Sg 1 -av -av
2 -es -eh -as
3 -el -el
Perfect Past Conditional

Sg 1 -d-j-om -d-j-om
2 -al -al -as
3 -a(s) -ah
It would seem that - synchronically speaking - we are dealing with the
same strategy as in the future-tense formation, i.e. that of attaching the
particle to finite forms (either the present or the perfect) resulting in typologically
rare introflexion. It must be of some significance that neither Domari nor
Lomavren (Sampson 1926: 192) possess this type of formation of the
imperfect. Diachronically, as we saw above, we are in a position to provide
some more or less convincing Indo-Iranian parallels for the future-tense
formation by attaching the finite form of the copula or a particle to the finite
forms of the present; in the case of the imperfect and the conditional it is more
difficult.
The formant -as has been traced back to the OIA copula as- "be", either
in the perfect form a:sa or the imperfect form a:st ~ a:si:t (cf. Sampson 1926:
192). The latter form ended up as a:si during the late MIA times
(Apabhramsa); and it could be used in all the persons and numbers in
participial tenses or the passive construction. For the latter see the example in
(17). Sampson (1926: 192) stated that:
Miklosich (xi 40) on the strength of Hungarian Gypsy forms in ahi
(=*asi) concludes that this -as or rather -s... is the Gypsy 'si (=isi) "est";
though as he acknowledges, it is difficult to understand how the addition
of the Present of the verb "to be" could have imparted to the Indicative
Present an Imperfect signification.
If one identified this *asi with Apabhramsa a:si in the past tense
meaning, this semantic obstacle would presumably be removed but the
constructional dilemma would remain. Sampson (1926: 192) compared the
formation of the Welsh Romani (WR) imperfect and pluperfect with that of
Panjabi. His examples are shown in (6):
(6) Panjabi WR (Sampson 1926: 192)

Imperfect ja:nda: sa: Jav-as
"I (he, etc.) was going"
Pluperfect gia: sa: gi:m-as (< gilo: "gone")
"I (he, etc.) had gone"
But this comparison is based only on a similarity of the WR formant -as
and the Panjabi form of the copula in the past. It would seem that Sampson
was not aware of the constructional dilemma involved in his comparison
when he said:
... (in) the Indian group of languages...the Imperfect...is compounded
of the Present Participle and the Past tense of the verb substantive, (e.g.
Hindi ja:ta: tha:, Panjabi ja:nda: sa:), and then Panjabi, it may be noted,
has also a Pluperfect with sa: formed from the Past Participle, (e.g. gia:
sa: "I (he, etc.) had gone"). (Sampson 1926: 192)
Panjabi actually conjugates its auxiliary (san, sy, si, sa, sow, sn; cf.
Shackle 1972: 35) when it forms its imperfect and pluperfect; and in this
respect Grierson (1903-1922: 629) mentions that:
the common form of the past tense of the verb substantive is usually si:
(=3/Sg "was") for both masculine and feminine singular, and for the
masculine plural. This is generally explained as the feminine of sa:, but
much more probably it is a corruption of some old form akin to Prakrit
a:si:.
Sampson appears to have taken Panjabi sa: as a counterpart to Hindi tha:
whose non-finite source is fairly well known; tha: goes back to Apabhramsa
thia and ultimately to OIA sthita "stood", i.e. the PP form of the root stha:
"stand". Panjabi si: traceable back to Apabhramsa a:si - as maintained by
Grierson - could have been reinterpreted as the feminine form of the PP
"been" corresponding to Hindi thi:. Unfortunately, Sampson did not
substantiate his intuitive claims and the whole matter has to be reinvestigated.
As a minimal groundwork in this respect one needs a reliable survey of
contemporary Panjabi and Western Hindi dialects and a good idea of what
Old Panjabi was like. In addition, in certain Western Hindi dialects (Bangaru,
also called Jatu or Hariani) the imperfect is formed by attaching the PP form
tha: to the present participle (as in Hindi), or to the verbal noun in the oblique
case; in Rohtak tha: is attached to the finite forms of the present tense; i.e. the
same strategy as that involved in the formation of the progressive aspect/immediate
future, shown in (3), is used. The data in (7) are taken from
Grierson (1916/1968: 255):
(7) Formation of the imperfect in Bangaru (=Jatu)
      Bangaru                                      Rohtak
      Pres Part=tha:     Verbal Noun+OBL=tha:      Pres Tense=tha:
Sg 1                                               ma:r:
   2  ma:rada:=tha:      ma:re=tha:                ai     =tha:
   3                                               ai
      "I was striking"   (lit.) I was at/on        (lit.) I strike = was
                         striking                  you strike = was
                                                   he strikes = was
In typological perspective it is easier to rationalize the forms of the
conditional than those of the imperfect. I want to mention two synchronic
parallels from modern IA dialects. In Bangaru the conditional may be formed
as in Hindi (using the bare imperfective participle) or by attaching the
3/Sg/Pres form of the copula to the present forms of the main verb (cf.
Grierson IX.1.255). In Lahnda (called misleadingly Western Panjabi) the
conditional is formed in the same fashion (cf. Grierson 1919/1968:267);
here, however, the indeclinable form of the copula, ha:, is probably a reduced
form of the 3rd Sg Past (i.e. not the present): a:ha: "was". Some examples
are given in (8):
(8) Je ma ny: kar:=hai, to: ma mar:(=hai) (Bangaru)
If I so do+1/Sg/Pres=is then I die+1/Sg/Pres (=is)
"If I had done so, I should have died"
ma:ra:=ha: (Lahnda)
strike+1/Sg/Pres=was
"I would have struck/if I had struck"
ma:ren=ha: (Lahnda)
strike+3/Pl/Pres=was
"they would have struck/if they had struck"
If the ER imperfect marker -as goes back to OIA a:sa "has been"
(Perfect) or a:si:t "was" (Imperfect), then Lahnda would furnish the closest
synchronic parallel. The singular paradigms of these three languages are
surveyed in (9):
(9) Formation of the conditional in Bangaru (Western Hindi), Lahnda
(North-Western IA) and European Romani: "I would have struck"
Bangaru Lahnda European Romani

Sg 1 ma:r ma:r: marav
2 ai =hai e =ha: es +as
3 ai ("is") e (<"was") el (<"was"?)
3. Perfect
Upon the examination of the suffixes of the perfect, we may ascertain
that:
(i) they are different from the suffixes of the present
(ii) in the 1st and 2nd Pers they are identical with the suffixes of the copula
in the present tense (the 3rd Pers is of a nominal origin; witness the plural
suffix -e shared by the 3rd Pl perfect and the plural form of the PP; the 3rd
Sg Perf is kerd'a vs. the singular form of the PP kerdo), and
(iii) they are preceded by the yod (=palatal glide) in Baltic, Balkan and Vlach
dialects (in the latter group -y- is used in Kalderas but not in Lovari, cf.
Hancock 1993: 48). In Slovak Romani, the yod palatalizes the preceding
apical consonants t', d', l' (cf. Hübschmannová et al. 1991). The forms of
Vlach and Slovak Romani are shown in (10):
(10) Vlach Romani              Slovak Romani
     Perfect       Copula      Perfect        Copula
     ker-d-em      sim         ker-d'-om      som
          -(j)an   san             -d'-al     sal
          -(j)a(s) si              -d'-a(s)   hin-o
          -(j)am   sam             -d'-am     sam
          -(j)an   san             -d'-an     san
          -(in)e   si              -d-e       hin-e
A propos their diachrony we have a statement by Wentzel (1980: 95):
"historisch gesehen ist das Perfekt eine Verbindung des Partizips mit dem
Verb som "sein" im Präsens":
kerd(o) + (s)om "I have made".
Put differently, Wentzel understands the perfect "I have made" as coming
from the nominal construction (lit.) "I am (the) having-made (one)". This
statement leaves a lot to our imagination and it may be understood in two
ways: (i) the speakers of Romani used the suffixes of the copula (i.e. they
separated them from the radical s) to finitize the PP which was enlarged by
the yod: kerd-j-om; or, (ii) the speakers of Romani cliticized the copula to the
PP to finitize it. The problem with (ii) is obvious: it does not tell us how the
s (or h) was lost: kerd-som (or hom) > kerd-j-om. In addition both (i) and
(ii) are silent on the origin of the stem-forming yod. While solution (i) - the
segmentation of the copula - is not impossible (but it has to be argued for!),
solution (ii) has the merit of being the historical scenario in Indo-Iranian
languages. As is well known, after the loss of synthetic past tenses (Imperfect,
Aorist and Perfect) Prakrits started forming their Preterite/Perfect analytically
by cliticizing the copula to the PP (the form in -da < OIA -ta).
Some examples from the Prakrit Niya documents (of the 3rd c. A.D.) are
given in (11):
(11) kadamhi (<kada=mhi)

make+PP=I+am
"I have made"
pesidamhi (<pesida=mhi)
"I have sent"
Notice their spelling as a single grammatical word which indicates that
the clitic forms of the copula reached the status of verbal suffixes. Similar
examples may be provided from any historical period and any geographical
area in India. (Actually we are dealing with a common process found in many
other language families, e.g. Slavic (Czech) šla=jsi "you (F) went" becomes
šla+s in the colloquial language.) A couple more examples from Medieval and
contemporary dialectal Avadhi data from Beames (1872-79: 148) are given in (12):
(12) ma:ra:=hau > ma:ra + (Avadhi)

strike+PP/M=I+am
"I(M) have struck"
ma:ri:=ha > ma:ri + (Avadhi)

strike+PP/F=I+am
"I(F) have struck"
vs. Standard Hindi: cala: ha-
Given all the above we are in a position to claim that the Proto-Romani
Perfect could be reconstructed as the PP with the copula cliticized. Using the
h- forms of the copula they could appear as given in (13):
(13) Proto-Romani Perfect
kerdo=(h)am > kerd+om (1/Sg)
kerda=(h)am > kerd+am (1/Pl)
The h of the root could be lost through contraction during the cliticization
process. For s-dialects the segmentation of the copula would have to be
proposed.
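One possible reading of the contraction in (13) can be spelled out as a minimal sketch. It rests on two assumptions of mine that the reconstruction itself does not state explicitly: that the h of the copula is simply dropped, and that the final vowel of the PP prevails over the initial vowel of the clitic. The function name and its default argument are likewise illustrative.

```python
def contract_perfect(participle: str, copula: str = "ham") -> str:
    """Cliticize an h-initial copula to a vowel-final PP and contract.

    Assumed steps (one reading of (13), not an established rule):
      1. the h of the copula is lost;
      2. the PP-final vowel replaces the initial vowel of the clitic.
    """
    if not copula.startswith("h"):
        raise ValueError("this sketch only handles h-initial copula forms")
    if participle[-1] not in "aeiou":
        raise ValueError("the participle is expected to end in a vowel")
    clitic = copula[1:]             # step 1: (h)am > am
    return participle + clitic[1:]  # step 2: kerdo + am > kerdom
```

Under these assumptions contract_perfect("kerdo") yields the 1/Sg form kerdom and contract_perfect("kerda") the 1/Pl form kerdam, matching the two outputs in (13). The s-dialects would need a different treatment, as noted above.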
However, the major problem still has not been solved. What is the
source of the yod intervening between the PP stem and the suffixes of the
copula? Diachronically speaking, I would prefer to keep this yod separate
from the yod which appears in the stem of the passives and inchoatives (cf.
Section 4). Assuming that the ancestral speakers of Romani left India without
it (i.e. kerdom would be earlier than kerdjom) we would have to provide a
potential source for it in the yod-initial copula of one of those many languages
they encountered en route (as it happens the East Iranian, Greek and Slavic
languages possess a yod-initial copula).4
But whatever its source, the addition of the yod in the perfect enhanced
the contrast between it and the PP. Without it certain forms would come
"dangerously" close to each other; e.g. in Slovak Romani kerdo "made" (PP)
vs. kerd'a "he/she made" (it should be observed that the latter form is often
enlarged by the imperfect marker -(a)s : kerd'as); or, mard'o- in mard'om "I
have struck" differs only by its palatality from the PP mardo "struck". On the
other hand, the ambiguity remains in the plural where kerde "made" (Pl form
of the PP) is homophonous with the copula-less PP kerde "they made" (but
notice Vlach kerd(in)e).
In terms of morphosyntax, ER treats both the agent and the subject of the
perfective event in identical manner; put differently, ER is unique in the
context of Central and North-Western IA languages in not possessing the
ergative construction. To use a simple example, in Hindi perfective events
involving transitive verbs have to be grammaticalized by the ergative
construction (i.e., the agent has to be marked by the ergative postposition and the
PP agrees with the indefinite objects), e.g. mai-ne kutta: ma:ra: hai "I have
hit a dog". In perfective events involving intransitive verbs, e.g. mai gaya:
h: "I have gone", the PP agrees with the subject in the absolutive form
without any postposition. ER treats both the agent and the subject of the past
perfective events in identical manner: me mard'om dzukles "I have hit a dog"
and me gejl'om "I have gone". As shown in (13) the perfect forms display
the PP finitized by the copula, continuing thus the pre-ergative Prakrit type
ma:rida=mhi (< ma:ritas asmi) lit. "I am the one who has struck", and
gada=mhi (< gatas asmi) lit. "I am the one who has gone". These examples
are surveyed in (14):
(14) mai=ne kutta: ma:ra: hai (Hindi) Ergative-absolutive type
I=ERG dog hit+PP/M is
"I have hit a dog"
mai gaya: h:
I go+PP/M be+1/Sg
"I have gone"
me mard'om dzukles (Slovak Romani) Nominative-accusative type
I hit+PP+1/Sg dog+ACC
"I have hit a dog"
me gejl'om
I go+PP+1/Sg
"I went"
kukkur ma:rida=mhi (pre-ergative Prakrit) Nominative-accusative type
dog+ACC strike+PP/M=be+1/Sg
"I have struck a dog"
gada=mhi
go+PP/M=be+1/Sg
"I went"
4. Formation of the passive

There are two types of the passive in ER: analytic and synthetic. The
analytic passive is formed in the usual fashion by combining the PP form
with the copula in the present, past or future tense; examples in (15) are from
Slovak Romani:
(15) o skamind hin/sas/ela thodo [Hübschmannová et al. 1991: 631]
"The table is/was/will be laid"
It is perhaps no coincidence that all the examples of the analytic passive I
came across in my limited perusal of Slovak Romani texts (Hübschmannová
1990; Gia 1991, Olh and Demeter 1992) involved the 3rd Pers. If formed
in other persons the constructions with the PP agreeing in referential gender
and number obtain:
(16) som mardo/i I am beaten + M/F

sal mardo/i you are beaten + M/F
sam marde we are beaten5
Typologically speaking, the construction displaying the auxiliary before
the main verb is in harmony with the overall SVO typology of European
Romani. Boretzky (1989: 370) considers this construction to be calqued on
the Romanian or Slavic (Serbo-Croatian) analytic passive. In my opinion,
there is no need to assume a lexical borrowing of the auxiliary "be" in this
construction since the Middle Indo-Aryan dialects contemporary with Proto-
Romani possessed it. But the structural interference from European super-
and adstratal languages would be seen in the direction of clisis. In prakritic
Proto-Romani - as in SOV languages generally - the auxiliary was enclitic to
the main verb and the change in the overall typology of ER from SOV to SVO
could, of course, come about as a result of its contact with the European
languages of an SVO typology.
Some MIA examples (from Mrcchakatika) of "be"-passives are provided
in (17):
(17) Middle Indo-Aryan "be"-passive
sandesena pesidamhi ( < pesida:=mhi) [Mrcchakatika]
news+INSTR sent=I+am
"I(F) was sent with news"
gahidosi ( < gahido=si)
"you(M) are taken"
pesiya ve vi a:si desantarau [Svayambhudeva, Paumacariyu, 7-10th c.]
send+PP two PRT were country another
"We two were sent to another country"
Considering the examples for the perfect in (11) kadamhi "I have made"
and pesidamhi "I have sent" we reach a dysfunctional state of affairs
presented in (18); that is, the same construction of the PP plus the copula may
grammaticalize the active "I have sent" (i.e. I am the one who has sent) and
the passive "I am/have been sent".
(18) Perfect and passive in Middle Indo-Aryan: (11) and (17)
By dysfunctional I do not mean a complete breakdown in communication
since the MIA dialects possessed synthetic passive forms in (-ijj < OIA
-ya). But as shown in (18) the alignment between morphology and semantics
was such that the single form ma:ridamhi was ambiguous between the active
and the passive, and the resultative perfect "I am/have been struck" could be
grammaticalized by two different forms ma:ridamhi and ma:rijji (only the
latter one presenting an unambiguous passive morphology albeit ambiguous
between perfective and imperfective interpretation).
During their history the IA languages solved this dilemma of an
insufficient contrast between the active and the passive perfect by enhancing it
in essentially three ways which are summarized in (20):
(i) Late Classical Sanskrit created an unambiguous active perfect by
attaching the derivational suffix -vant to the PP (thus, ma:ritava:n asmi "I
have hit" vs. ma:ritas asmi "I have been hit").
(ii) On the other hand, both Sauraseni and Vracada Apabhramsa (to judge by
the situation in their medieval descendants) created an unambiguous passive
by using the auxiliary bhu: "become" (thus, ma:rida=mhi "I have hit" vs.
ma:rida bhavam "I am (being) hit", lit. "I become hit").
Some examples of MnIA analytic "become"-passives are given in (19).
Sindhi forms its imperfective passives by combining the imperfective passive
participle ma:ribo (from OIA gerundive < ma:ritavya) with either the copula
(in the present) or the verb "to become" (in the past). Nepali may form its
passives by combining the forms of "become" with adjectives. In Hindi and
Rajasthani only the nominalized form (as the adjectival passive participle)
survived: here the PP of "become" passivizes the PP of the main verb which
lost its original passive meaning, e.g. ma:ra: hua: "(being) struck", lit.
strike+PP become+PP.
(19) ma:ribo :hiy: (Sindhi)

being-struck be+1/Sg
"I am being struck"
ma:ribo hose
being-struck became+1/Sg
"I was being struck"
pasal band bhayo (Nepali)

shop close+PP become+PP
"The shop has (become) closed"
ma:ra: hua: (Hindi)

strike+PP become+PP
"(being) struck"
ma:riyo huwo (Marwari)

strike+PP become+PP
"(being) struck"
However, in the long run, even (ii) proved to be an unsatisfactory solution
because the two auxiliaries "be" and "become" merged ultimately into one set
of forms as a result of phonological attrition. As shown in (20.iii), it became
necessary to adopt another more distinct passive auxiliary, namely, the verb
"to go" (documented in this function since the 10th/11th century) which
yielded the typical "go" (ja:na)-passive of Modern Hindi. At the other end,
the ergative postposition -ne guaranteed the unambiguous active
interpretation of mai-ne ma:ra: (hai) "I (have) hit" (vs. the passive mai ma:ra:
gaya: h: "I have been hit"). These three solutions are surveyed in (20):
(20) Three "solutions" to the homophony of the perfect and passive in
MIA (cf. 17)
(i) Activization of the PP in Late Classical Sanskrit
Active   ma:rita+va:n asmi      "I have hit"
         (activized PP)
Passive  ma:ritas asmi          "I am/have been hit"
                  I am
(ii) "become"-passive in Middle Indo-Aryan > Sindhi, Nepali
Active   ma:rida=ham            "I have hit"
                 I am
Passive  ma:rida=(b)ha(v)am     "I am hit"
                 I become
(iii) "go"-passive in early New Indo-Aryan > Hindi
Ergative mai=ne ma:ra: (hai)    "I have hit"
Passive  mai ma:ra: gaya: (h:)  "I have been hit"
                    gone (I am)
We do not have to worry about the third solution since there is no trace
of the "go"-passive in ER. In what follows I will attempt to show that the
synthetic passive of ER goes back to the MIA "become"-passive (solution ii).
The synthetic passive in ER is formed by the suffix -jov which is
attached to the stem of the preterite plus non-final affix, the so-called NFA
(Hancock 1992: 41). The same suffix -jov is used to derive inchoative verbal
forms from adjectives. In Slovak Romani, according to Hübschmannová et
al. (1991: 647), the derivational suffix is -uv and the final consonant of the
verbal stem is palatalized (d,g > d', k,t > t', l > l', n > , st > t'). The
source of palatality is obviously the yod of the suffix as preserved in Vlach
Romani. Some examples are presented in (21):
(21)       Passive                    Inchoative verb
           mard-o "beaten"            phur-o "old"
           mard-j-ov-                 phur-j-ov-
Vlach R    mardjovav                  phurjovav
Slovak R   mard'uvav "I am beaten"    phuruvav "I get old"
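The two derivational patterns in (21) can be restated as a small sketch. The function names and the 1/Sg ending -av as a separate step are illustrative choices; the palatalization table deliberately includes only the correspondences that are unambiguous in the description above (d, g > d'; k, t > t'; l > l') and makes no claim about the remaining ones.

```python
# Stem-final consonants palatalized before the Slovak Romani suffix -uv.
# Only the correspondences stated unambiguously in the text are listed.
PALATALIZATION = {"d": "d'", "g": "d'", "k": "t'", "t": "t'", "l": "l'"}


def vlach_first_sg(stem: str) -> str:
    """Vlach Romani: stem + yod + -ov + 1/Sg -av."""
    return stem + "j" + "ov" + "av"


def slovak_first_sg(stem: str) -> str:
    """Slovak Romani: palatalized stem-final consonant + -uv + 1/Sg -av."""
    final = stem[-1]
    return stem[:-1] + PALATALIZATION.get(final, final) + "uv" + "av"
```

With the stems of (21), vlach_first_sg("mard") gives mardjovav and slovak_first_sg("mard") gives mard'uvav, while the stem phur-, whose final r is not in the table, gives phurjovav and phuruvav.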
A propos their diachrony the discussion in Sampson (1926: 111) is
confusing. Sampson disagreed with Miklosich who connected this suffix
with the Sanskrit passive suffix -ya in view of the difficulty of tracing the
Romani -ovel back to Sanskrit -yati, since OIA -ya should end up as -ijja.
Sampson favored the earlier view of Pott and Paspati, who regard the suffix
-iov, -ov as identical with the verb uv-, ov- "to become"; however Pott and
Paspati do not tell us where the yod in -iov comes from.
It would seem to me that both Miklosich, and Pott and Paspati were
right. The suffix -ya appears as -ya in older Prakrits (e.g. kari:yati, di:yati >
later karijjai, dijjai) or it may palatalize the preceding n (e.g. haai <
hanyate). It could conceivably survive as yod - i.e. without undergoing af-
frication (as in ya:- >ja:- "go") - because of its morphological function (put
differently, the rule of affrication would be suspended within the derivational
paradigms of passives and inchoatives). All this would have to be taken into
account by proponents of an earlier outmigration from India. However, even
the proponents of a later outmigration could point out that the passive marker
-ijja could revert to -i:ya (in Old Western Rajasthani: i:ya > -ijja > i:ya > ia,
cf. Tessitori 1914-16).
In my opinion the remaining portion of the suffix, -ov or -uv, is
considered most sensibly as descended from the verb "to become", (i.e. MIA
bhav) by Pott and Paspati, but it should be added that even their proposal is
not in keeping with phonological rules (one would expect bh to be reflected
as ph or b!). Given all the above I am inclined to consider the yod in the
suffix of Romani passives and inchoatives as not of the same origin as that
used in the formation of the perfect (in Section 3 I proposed to connect it
with the yod of the copula). The yod in the inchoatives and passives can be
traced back to the extremely productive IA suffix -ya. In OIA this suffix
derived a great variety of verbal and nominal categories - in addition to the
inchoatives and passives also the adjectives and gerundives.6 As an example,
the IA ancestral forms of the Romani inchoatives phurjovav "I get old" and
loljovav "I turn red" are presented in (22):
(22) IA ancestral forms of the ER inchoatives
OIA vrddh- r/lohit- *vrddha-ya- lohita:-ya-
MIA phur(d)- lod- phur-(d)-y- lod-y
ER phur- lol- phur-y- lol-y-

"old" "red" "get old" "turn red"
Our investigation would be incomplete without referring to the formation
of the causative. As mentioned above, the IA suffix -ya formed not only
passives and inchoatives but also causatives. ER forms its causatives by the
suffix -ar from nouns, adjectives, and rarely from verbs and passive
participles. The latter two categories are shown in (23):
(23) Causativization in Slovak Romani (Hübschmannová et al. 1991: 647-9)
rat "night"              rat'-ar-el "lets X stay overnight"
sov-el "sleeps"          sovl'-ar-el "puts X to sleep"
rov-el "weeps"           rovl'-ar-el "makes X weep"
pherd-o "filled, full"   pherd'-ar-el "fills"
The causativizing suffix -ar is most likely descended from MIA -a:d
which survived as -ad in Gujarati (cf. Turner 1926: 279). It should be added
that late MIA possessed also the suffix -a:r (e.g. baisa:riyau "he seated him"
in Svayambhudeva's Ritthanemicariu [1.9.3], ascribed to 7-10th century). As
above for the synthetic passive, I want to claim that the yod (seen in Vlach
Romani, e.g. ratjarel) goes back to the OIA derivational suffix -ya. The l seen
in sovl'arel and rovl'arel is found also in Hindi (sula:na:, rula:na:). As a part
of my argument I suggest considering this l as an intrusive consonant
between p (ancestral to v) and the yod; the p in sov ( < svap) belonged to the
root whereas that in rov ( < roda-p(a)ya-) belonged to the causative suffix
-paya. The sequence of historical events leading to Hindi and Romani
descendants from MIA svap-aya and *rod-apaya is presented in (24):
(24) Historical development of Hindi and Romani causatives
"put X to sleep" "make X weep"

MIA svap-aya roda-apaya
syncopation svapya ropya
intrusive l suplya soplya roplya
lenition suvl sovly rovly
causative sul-a: sovly-a:d/r rul-a: rovly-a:d/r
(Hindi) ' (Hindi)
sovly-ar rovly-ar
sovl'-ar rovl'-ar
(Romani) (Romani)
Thus, paradoxically, the yod, the original causativity marker, has been
reduced to the feature of palatality in Slovak Romani, while the intrusive l
ended up as a full segment which in Hindi co-indexes the causativity (contrast
sona: with sula:na:).
Finally, a few comments regarding the source of -il- in the past passive,
e.g. ker-d-il-em "I was born" (in Vlach Romani, cf. Hancock 1992: 41). The
same -il- is used to express the past of the inchoative verbs, e.g. phur-il-em
"I got old", lol-il-e "they turned red". The finitizing suffixes are those of the
copula in the present, as shown in (10). As mentioned by Turner (1926: 279)
the most obvious candidate is the Prakrit suffix -illa- (cf. Sanskrit -ila-,
equivalent of -vant). In Prakrits it was used to enlarge the participles (e.g.
a:ga-elliya: "having come" corresponding to Sanskrit a:gata-va:n). In Modern
IA languages it is found in Marathi (dekhila: "seen", vs. Hindi dekha:),
Gujarati and the Eastern languages (cf. Bloch 1933: 267).
5. Conclusion
One of the striking results of our enquiry into the contact phenomena of
European Romani was an overall picture of systemic complication rather than
that of simplification, however the latter is defined. Obviously, the length and
depth of contact play an important role in determining exactly what happens.
Let us remind ourselves of another well documented example, namely, that of
Asia Minor Greek discussed extensively by Thomason and Kaufman (1988:
65) under the motto "Turkish soul in the Greek body" (from Dawkins 1916:
198). The strong cultural pressure of Turkish on Asia Minor Greek, lasting
almost one millennium, resulted in heavy structural borrowing (the highest
point on their scale) affecting the innermost core of language, such as the
introduction of Turkish vowel harmony and agglutinative morphology into
several of the most affected Greek dialects. Data from European Romani may
be used to exemplify such a strong degree of structural borrowing that we
may actually talk about massive replacement of large portions of inherited
grammar. On the other hand, we saw some remarkable retention of MIA
grammatical categories under the thick overlay of structural borrowings from
European languages. Their co-existence in ER strikes me as a genuine case of
overall systemic complication: the co-existence of the system of the definite
article with the system of seven cases; the co-existence of inherited
postpositions with borrowed prepositions, and the appearance of typologically
rare circumpositions (documented in Pashto, Amharic and Asia Minor Greek,
cf. Campbell et al. 1988); the co-existence of the inherited deflected adjectival
agreement with the borrowed full adjectival agreement. In verbal categories,
ER in addition to the inherited present, perfect and the passive, displays an
innovative reflexive voice; in addition to the inherited indicative and
imperative an innovative subjunctive. In phonology Russian and Czech
Romani are more complex than either the MIA ancestor or the Slavic donors
of the feature of palatalization. The consonantal system of Russian Romani
includes voiceless aspirates descended mostly from the MIA murmured stops
but also a full set of palatalized consonants.
One of the most remarkable archaisms of ER, namely, its nominative-
accusative typology, has not been - to the best of my knowledge - discussed
in pertinent literature. There are two conceivable answers to this problem: (i)
Romani possessed the ergative construction before its ancestral speakers left
India, but lost it later on as a result of its long contacts with European
languages which do not have it; or, (ii) Romani did not have a chance to
develop it because the Domba left India before this construction crystallized
during the later period of MIA. The first solution could be entertained if we
could prove that the Domba left India relatively late, during the centuries of
the Muslim expansion in the 11th-12th century (cf. Hancock 1993: 1-2 and this
volume), when the ergative construction was more or less established.
Alternatively, one may side with those Romanologues who favor an early
outmigration from India (Kenrick 1976, Kaufman 1984). Either way one has
to conclude that ER in its nominative-accusative typology preserved a
remarkable archaism of late MIA (cf. Matras, this volume). Thus, the
evidence of ER is of cardinal importance in solving one of the fundamental
problems of IA linguistics: that of the emergence and development of the
ergative construction (cf. most recently Bubenik 1994).
NOTES
1
Tessitori (1915:81):
na bolai=li: [Packhyna, 310; 1500-50 A.D.]
"[If] you will not speak"
amhe pachai kar:=la: [Upadesamlblvabodha 288; 1500-50 A.D.]
2
Old Western Rajasthani (Tessitori 1915:78):
ja: cha "I am going" (16th c.)
kahai chai "You are saying" (16th c.)
bhamai chai "He is wandering about" (16th c.)
Old Gujarati (Dave 1935:49):
sium kahau chau "what are you going to say?" (16th c.)
3 According to Ješina (1882:27,31) in Czech Romani the future is formed by attaching
the copula/verb "to go" in the future tense to the root of the main verb:
Sg 1  cor-ava  <  cor      +  (v)aba       ~  avava
   2     -eha              +  (v)eha          aveha
   3     -ela              +  (v)ela          avela
                  "steal"  +  "I will be"     "I will go"
4 Paspati (1870: 95) mentions the presence of yod in the perfect as one of the
distinguishing characters of the dialect of the sedentary Gypsies vs. its absence in that of
the nomadic Gypsies: (in his orthography) kerghim vs. kerdm (nomadic).
5 Hübschmannová (p.c.) informs me that mardo in the analytic passive constructions is
used in its secondary meaning "punished", e.g. joj mard i le Devlestar "she is punished by
God/fate". To translate "I was beaten by the father" one would prefer the synthetic
passive mard il om le dadestar to the analytic *somas mardo. The analytic passive occurs in
the proverb somas buter mardo sar calo "I was more beaten than full".
6 OIA derivational suffix -ya:
kri-y-te "is made" finite passives (deverbative)
tavis-y_- "is mighty" inchoatives (denominative)
ka:r-ya "[which is] to be done" gerundives (deverbative)
mukh-ya "principal" adjectives (denominative)
REFERENCES
Bashir, Elena L. (1991) A Contrastive Analysis of Pashto and Urdu.
Washington: Academy for Educational Development.
Beames, John (1872-79) A Comparative Grammar of the Modern Aryan
Languages of India. London: Trübner.
Bloch, Jules (1933) L'indo-aryen du Véda aux temps modernes. Paris:
Maisonneuve. (English translation by A. Masters (1965) Indo-Aryan
from the Veda to Modern Times. Paris: Maisonneuve.)
Boretzky, Norbert (1989) Zum Interferenzverhalten des Romani. ZPSK 42,
357-74.
Bubenik, Vit (1994) The Structure and Development of Middle Indo-Aryan
Languages. Delhi: Motilal Banarsidass.
Campbell, Lyle, Vit Bubenik and Leslie Saxon (1988) Word order
universals: refinements and clarifications. Canadian Journal of
Linguistics 33 (3), 209-230.
Dave, Trimbaklal N. (1935) A Study of the Gujarati Language. London: The
Royal Asiatic Society.
Dawkins, R.M. (1916) Modern Greek in Asia Minor. Cambridge: Cambridge
University Press.
Giňa, Andrej (1991) Bijav. Romane priphende. Praha: Apeiron.
Grierson, George A. (1903-22) Linguistic survey of India. Delhi: Motilal
Banarsidass.
Hancock, Ian (1992) Notes on Romani Grammar. London and Austin:
Romanestan Publications.
Hancock, Ian (1993) A Grammar of Vlax Romani. London and Austin:
Romanestan Publications.
Hübschmannová, Milena (1990) Kale ruži. Hradec Králové: Krajské kulturní
středisko.
Hübschmannová, Milena, Hana Šebková, Anna Žigová (1991) Romsko-český
a česko-romský kapesní slovník. Praha: Státní pedagogické
nakladatelství.
Ješina, Josef (1882) Romáňi čib čili jazyk cikánský. Praha: Urbánek.
Kaufman, Terrence (1984) Exploration in Proto-Gypsy phonology and
classification. Paper read at the Sixth South Asian Language Analysis
Roundtable. Austin, Texas, May 25-26th.
Kenrick, Donald (1976) A contribution to the early history of the Romani
people. Occasional Paper of the Romani Institute, No.3 London.
Miklosich, Franz (1872-81) Ueber die Mundarten und Wanderungen der
Zigeuner Europa's 1-12. Wien: Karl Gerold's Sohn.
Oláh, Vlado and Gejza Demeter (1992) O Del vakerel ke peskere chave.
Praha: Česká biblická společnost.
Paspati, Alexandre G. (1870) Études sur les Tchinghianés ou Bohémiens de
l'Empire Ottoman. Constantinople: Koroméla.
Pott, A.F. (1844) Die Zigeuner in Europa und Asien. Halle: Fricke.
Puchmeyer, Anton (1821) Románi čib: Das ist Grammatik und Wörterbuch
der Zigeuner Sprache. Prague: Wildenbrunn.
Sampson, John (1926) The Dialect of the Gypsies of Wales. Oxford:
University Press.
Shackle, C. (1972) Punjabi. London: Hodder & Stoughton.
Tessitori, L.P. (1914-1916) "Notes on the Grammar of the Old Western
Rajasthani", The Indian Antiquary.
Thomason, Sarah G. and Terrence Kaufman (1988) Language Contact,
Creolization, and Genetic Linguistics. Berkeley: University of California
Press.
Turner, Sir Ralph (1926) The position of Romani in Indo-Aryan. Journal of
the Gypsy Lore Society, Third Series 5 (4), 145-89.
Wentzel, Tatjana V. (1980) Die Zigeunersprache. Leipzig: VEB Verlag
Enzyklopädie Leipzig.
ON THE MIGRATION AND AFFILIATION OF THE DOMBA:
IRANIAN WORDS IN ROM, LOM AND DOM GYPSY*
IAN HANCOCK
University of Texas
0. Introduction
Charles Godfrey Leland (1875:637, see also Andreas 1926:129) reported
the observation of an Iranian gentleman who, on the subject of Roman
Persian words, such as one hears from peasant grandmothers." About one
hundred are discussed below, only some of which may be accepted
incontestably as being of Iranian origin; but there are literally dozens of items in
the Romani dialects which remain etymologically unaccounted for, and it is
likely that a good number of them may eventually also be identified as
Iranian.
Paspati (1870:1), in his oft-repeated quote, said that the key to the
history of the Roma is to be sought in the study of the Romani language. Since
the revelation in 1760 to Vályi István that the Romani spoken in the heart of
Europe was actually of Indian origin (subsequently published in 1776),
scholars have to this day been attempting to determine when and why the
ancestors of the Roma left their homeland, whom they are most closely
related to there, and under what circumstances, and by what route, they
entered Europe. A great number of hypotheses have been proposed
(summarized in Hancock 1988), the most persuasive of which have certainly
relied upon linguistic, rather than purely historical, and sometimes
anecdotal, evidence.
Using linguistic material, the migratory route out of Central India can
be reconstructed, particularly through an examination of the sources of the
lexical items adopted along the way; thus beyond the now well-established
Central and Northwestern core, the presence of Dardic (Miklosich 1872,
Pischel 1883, Grierson 1908) and Burushaski (Berger 1959, Kenrick 1976)
elements suggests - assuming no great changes in linguistic territory over
the past millennium - that their westwardly route passed through the Hindu
Kush, south of the Turkic languages and north of Pashto (neither of which
is well represented in Gypsy), along the southern and southwestern
shoreline of the Caspian Sea through Iranian linguistic territory (Persian
and Kurdish) and the southern Caucasus (Ossetic, Georgian and Armenian), on
through the Byzantine Empire (Medieval Greek) and thence into Europe.
The job of the historian is to reconstruct the social events involving
those early migrants which took place within the geographical territory
suggested by this lexical material. Were the ancestors of the present-day Roma
Rajputs, as one current hypothesis maintains (Hancock 1991), and, if they
were, was it from this area that they moved out to encounter the Afghani
and Turkic Ghaznavids, or was it further south? Such information is necessary to
support or disprove that theory. And if, according to the same hypothesis,
the Pratihara did indeed constitute an element in the social (and
subsequently genetic) make-up of those who left India, were they of Scythian
origin, and were those Scythians the ancestors of the Iranian-speaking
Ossetes, as Abaev indicates (1964:xi)? If so, can Iranian linguistic influence
be dated back to the pre-exodus period? It is in these areas of Gypsy
linguistic and social history that the Romanologist must seek his answers.
1. Traditional classifications of the Gypsy languages

During the 19th century, starting with Pott (1844-5, and especially
1846), the debate centred upon these and other questions, which had dealt
entirely with the Romani dialects of Europe, broadened to include the speech
of the Syrian Gypsies or Doms; a quarter of a century later, it had also come
to include that of the Armenian Gypsies, or Loms, first mentioned in print
by Paspati (1870:17). The linguistic data from all three were intriguing,
because the languages of the Doms and the Loms (Domari and Lomavren,
respectively) were, in part, Indian. It was assumed, therefore, that they
constituted branches of the original Gypsy migration which had separated from
the main body before it finally entered Europe. This linguistic stimulus
coincided with what seemed to be historical support, with Brockhaus'
suggestion that the word Rom had its origin in the Indian word dom¹. In a
review of De Goeje, Colocci (1907:279) urged caution in drawing too
sweeping a conclusion from the available data:
To imagine that just because the Gypsies of Europe and their brothers
in Asia share a common linguistic core, one should therefore conclude
that there was a single exodus of these people [out of India], and
furthermore that the unity of their language only slightly weakened by
the still nebulous state of the documentation.
Unity of language might well prove unity of origin; but there could
still have been different migrations, chronologically and geographical
ly, without that fact being too apparent from the lexical adoptions
acquired by the mother tongue in the countries through which they
passed; all the more so since those migrations were very rapid.
To conclude, therefore, that the unity of their exodus rests upon the
recognition of the unity of the substrate of their language, strikes me as
a proposition which shouldn't be universally accepted without [first
incorporating] the benefit of a [lexical] inventory.
Since Colocci's time, linguistic material dealing with a number of
related languages and dialects has been recorded and analysed, but the debate
continues despite having reached what seemed to have been a resolution in
1927. In that year Turner (1927:176), who also successfully argued for an
ultimately Central rather than North-Western origin for Romani, concluded
that
. . . the morphological differences between European and Syrian
Romani [i.e. Domari] are very considerable, and many of the
resemblances can be referred back to a common Indian origin rather
than necessarily to a post-Indian period of community.
He was saying, in other words, that while Romani and Domari are both
Indic, it did not mean that the ancestors of both necessarily formed one
population while they were still in India. Nevertheless it is still widely accepted
that the Dom, Lom and Rom branches of Gypsy are related in terms of their
original speakers' having left India as one population and separating only
once having passed through Persia. One recent statement in support of this
is found in Ventzel (1991:102):
The route of the ancestors of the Roma passed through Iranian
linguistic territory. One group established itself in the north and the west
of Persia, from whence it subsequently moved into Syria and Palestine,
and [thence] into Armenia. The majority, however, reached the Byzantine
Empire. Those whom we call Tsiganes are the descendants of those
nomadic tribes who passed on through to the Balkans.
More recently still, Harcourt Films released a film entitled The Romany
Trail in which this historical account was presented with animated maps
and interviews with Gypsies in Egypt. Their exodus from India was
explained by referring to Firdausi's Shah Nameh, an epic account of a mass
migration of Indians into Persia in the 5th century (Marre 1992).
Turner was arguing against Sampson, who stated (1923:161) that
"Gypsies, on first entering Persian territory, were a single race speaking a
single language." In the same paper, Sampson (p. 169) provided a
genealogical sketch of the relationships of these languages:
Bhen or Bheni Gypsies (the original settlers in Persia)
    Ben Gypsies
        Nawar, of Palestine
        Kurbat, of N. Syria & Persia
        Karaci, of Asia Minor, Transcaucasia and Persia
        Helebi, of Egypt
    Phen Gypsies
        Bosa or Posa, of Armenia & S. Caucasus
        Byzantine Gypsies
            European Gypsies
Turner's argument, that there were different migrations out of India at
different historical periods, seems not to have received much
acknowledgement, although its sentiment is echoed in a recent book on Gypsies by
Fraser (1991:39), who cautions that
. . . despite Sampson's insistence that both sprang from a single source,
some of Domari's dissimilarities from European Romani create doubts
about how far we can assume that the parent community was uniform.
Attempts to describe Proto-Gypsy (for which the name Dombari has
been proposed, and its speakers Domba²), such as that by Kaufman (1984),
have rested upon reconstructions which utilize forms found in Romani,
Domari and Lomavren. Such reconstructed forms have, therefore, incorporated
phonological changes found in all three, as well as acknowledging lexical
material which supposedly belonged to the proto-language. Thus Kaufman
(1984) was able to claim that "the fact that p[roto]-Gypsy can be
reconstructed means that such a language could have existed" (p. 38), and on that
basis go on to reconstruct such non-Indic Proto-Gypsy forms as those for
"lord" and "pear" (from Kurdish xod and Persian amrd respectively),
though at the same time maintaining that in India, several hundred
miles to the east of those languages, Proto-Gypsy "was an actual language
... in the Central Zone or on the border between the C[entral] and W[estern]
zones" (1984:39-40).
Some years before Turner voiced his suspicions that Dom Gypsy had a
different linguistic history from Rom Gypsy, Finck (1907:49-50) had also
made the same claim for Lomavren, which he believed was probably of
sauraseni descent, unlike Romani, which he saw as a Dardic language.
However, since Lomavren survives only as a lexicon (like Angloromani or
Caló), a fact first mentioned by Papasian (1901:126), conclusions regarding
its specific Indic affinity cannot be addressed on the basis of its structure. It
is significant, however, that not one word in the overwhelming Armenian
component of Lomavren appears among the ca. 45 Armenian-derived items
in European Romani (Redzosko 1984).
There are reasons why a separate origin, at least for Domari, seems
likely. First, if the ancestors of its speakers left India during the fifth
century AD (the hypothesis which rests upon Firdausi) and entered Europe
in the early Medieval period, where were they located and under what
circumstances were they existing during those intervening six centuries spent
in the Middle East before leaving the area of Arabic linguistic influence and
entering the Byzantine Empire? And why, during such a long period of
time, was there such scant lexical impact from Arabic upon European
Romani³ when just two centuries of contact with Medieval Greek - a mere
third of that time - has resulted in over two hundred items from that
language being adopted? Domari, on the other hand, is more than half Arabic
in its contemporary lexicon. Secondly, although Sampson disputes this
(1926:125, para. 272), according to Macalister (1914:9), Domari retains
vestiges of a third, neuter grammatical gender, which reflects the time of its
exodus from India; the neuter gender became lost in the neo-Indic
languages before the ancestors of the Rom (and presumably the Lom) left, but
after those of the Dom left. The present paper discusses a third argument:
the impact of the Iranian lexical content, that is, words adopted from
Persian, Kurdish, and Ossetic. If the Domba had moved into, and resided in,
Iranian-speaking territory as one population before dividing into the Syrian,
Armenian and European groups, we would expect the lexical material
accreted there to exhibit a high degree of overlap in each group. The
findings, however, argue against this.
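The expectation stated above can be made concrete as a set comparison. The following Python sketch is illustrative only: the function name pairwise_overlap and the toy item labels (e1, e2, ...) are hypothetical placeholders, not data from the lists in this paper. It computes, for each pair of languages, the proportion of Iranian-derived items shared by both out of all those found in either (the Jaccard index).

```python
from itertools import combinations


def pairwise_overlap(lexicons):
    """Return the Jaccard overlap of Iranian-derived items for each language pair.

    `lexicons` maps a language name to a set of source etyma. The labels
    used below are hypothetical placeholders, not the paper's actual lists.
    """
    result = {}
    for a, b in combinations(sorted(lexicons), 2):
        union = lexicons[a] | lexicons[b]
        shared = lexicons[a] & lexicons[b]
        # An empty union means neither language has any items: report 0.0.
        result[(a, b)] = len(shared) / len(union) if union else 0.0
    return result


# Toy placeholder data: abstract labels stand in for Iranian etyma.
toy = {
    "Romani": {"e1", "e2", "e3", "e4"},
    "Lomavren": {"e3", "e5"},
    "Domari": {"e4", "e6", "e7"},
}

for pair, score in pairwise_overlap(toy).items():
    print(pair, round(score, 2))
```

Under the single-population hypothesis all three pairs would be expected to score high; consistently low scores would point towards separate periods of contact.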
There are of course limitations to this analysis. The non-European
Gypsy languages have not been adequately examined or classified, and the
lexical material available to us was collected from dialects in decline. Thus
the massive Iranian (Kurdish) influence upon Domari, for instance,
conceals the originally-acquired vocabulary, if indeed the two can be treated
separately. Furthermore the Iranian and the Indic languages are related and
share very many cognates; likewise there has been intensive lexical
balkanization throughout the area, with similar forms turning up in otherwise
unrelated languages. The widespread occurrence of an item such as
"conversation", "monkey", "enemy" or "melon", however, while arguing for
its retention in Romani and the other languages, makes the determination of
the language of direct transmission difficult. Sometimes phonology
provides a clue; thus Balkan Romani vaxt or vakti "time," while also found in
Persian as vaqt, is omitted here since its form suggests Turkish (vakti) as the
source of immediate adoption for both languages. In listing the Persian
items in Albanian Romani, for example, Mann (1933:14) discusses only
those which have not also been adopted by Turkish, although I have not
followed this constraint here (e.g. soba). Very many of the items in the
following lists are not exclusive to the source-language listed, nor may they
necessarily be indigenous to that source, but are probably adoptions
themselves. Some have been omitted for other reasons; če "what," for
example, is both Romani and Persian, but in Romani it is found only in the
Vlach dialects⁴, and is probably an adoption from Romanian (in which
language /č/ as a development from Latin /k/ is predictable). Persian če,
furthermore, is the literary form; the colloquial pronunciation is [či].
Again, Persian has magar "unless," which is similar to Romani makar
"although, at least," but it may easily be demonstrated that the two words
are not related (cf. similar forms in Albanian, Serbian, Romanian, etc.).
2. The languages
All sources for the following items are listed in the acknowledgements
and the bibliography; only in particular cases are they cited for individual
items. Given the evident migratory route of the early Romani population, a
closer examination of other languages besides those dealt with here (e.g.
Pamir, Pashto, Zebaki, etc.) is clearly in order, and will be part of a larger
study.
The only attempt to date to list and identify the various languages and
ethnonyms which occur in the literature is Kenrick (1976). Those dealt with
here are, besides Romani, Lomavren, Domari, Karaci and Mitrip:
Lomavren is the language of the Lom, or Bosa as they are called in
Turkish. Because it is now only a variety of Armenian, it has probably existed
in intimate contact with that language for a very long time. According to
Narodi Kavkaza, Vol. 2, page 40 (cited in Kenrick 1976:32), there are few
Lom now remaining in Armenia, but they constitute a viable community in
neighboring Azerbaijan. Benninghaus (1991) lists a number of localities in
eastern Turkey which are also inhabited by Lom.
Domari is the language of the Dom or Nawar, who inhabit Syria,
Lebanon, Egypt, Israel and neighboring countries.
Karaci is a variety of Domari, as the higher incidence of shared forms
between the two (below) demonstrates. The population refers to itself as
Dom. The name Karaci may derive from the town of Karai near Isfahan;
the proposed origin in Turkish kara "black" + -ci "agentive nominalizer" is
not grammatical in that language (although kara-ci "land-dweller" is; there
is also a Turkish form karaca [karadʒa], which means "somewhat
swarthy"). It is applied to groups inhabiting western Turkey, Syria, Iran and
Iraq also. Kenrick (loc. cit.) lists a North and a South dialect which differ
appreciably, and there are probably others. Jochelson (1928:172) estimated
a population of half a million Gypsies in Asiatic Russia 65 years ago.
Andrews (1989:139) lists a figure of "10,633 registered peripatetics," in the
Turkish census called göçebe ve gezginci çingeneler ("nomadic and
travelling Gypsies").
Mitrip is also a variety of Domari, and may be a name of Arabic origin
meaning "musician," cf. Ar. Motribiyya, "Gypsy." The population's self-
designation is Dom. The speakers of this dialect live along the Turkish
borders with Iran and Iraq in Batman, Elmayaka, Van and other towns. All
Mitrip data here are from Benninghaus.
3. Iranian items in the languages⁵

While lists of Romani items of Iranian origin have appeared in a
number of publications (most recently Boretzky & Igla 1994), no concerted
attempt has been made to search for all of them in that language, nor to
identify them in all of the Gypsy languages under discussion here or to
propose conclusions based upon that regarding the affiliation of those
languages. For Romani, I have included my own etymological investigation
and that of Boretzky and Leschber in particular (in personal
communication), as well as the sources listed in the bibliography. For the other
lists, I have had to rely solely upon the published sources also listed in the
bibliography.
3.1. Those in Romani

In addition to the items listed below, Iranian has made at least two
possible grammatical contributions to Romani, though neither is found in
Domari or (apparently) Lomavren. The first is the suffix {-as}, which as an
enclitic is added to the present root plus person/number endings to form the
imperfective, and to the preterite stem plus person/number endings to form
the pluperfective (dikhav "I see", dikhavas "I was seeing", dikhlem "I saw",
dikhlemas "I had seen"). This has been interpreted as an innovation
paralleled in a number of other neo-Indic languages, thus Hi. jātā thā, Pan. jāndā
sī "was going" (cf. discussion in Bubenik, this volume). In each case, the
particle following has been analysed as having derived from some form of
the BE verb in Sanskrit. Sanskrit, however, does not form its imperfective
in this way, and both Miklosich (1872, vol. xi, p. 40) and Sampson
(1926:192) admit that "it is difficult to understand how the addition of the
present of the verb 'to be' could have imparted to the indicative present an
imperfect signification." Sampson suggests that the morpheme originates in
an imperfective form of Skt as-, through Pkt s. In an article dealing with
this, Bloch (1932:59-60) claims that Sampson's argument is presented "de
façon inacceptable," and also states that the structure is "sans analogue dans
les langues indiennes, et hors de probabilité ailleurs." Boretzky (in personal
communication, 1993) suggests that
Dikhava-s(i) and dikhlem-as(i) are good internal grammaticalizations of
pre-European times, cf. Hungarian Romani dialects having -ahi and the
Prizren dialect in Kosovo with simai < simahi < sim-asi "I was".
Bloch's statement is wrong, at least with regard to Romani itself,
because there is good parallel in the Arli dialect: dikhava sine and
dikhijum(a) sine, functionally a new imperfect and pluperfect
respectively, although their functions are not all that clear, [and] sine is
often reduced to ne, the normal way grammaticalization takes.
In Persian, the perfective is constructed by placing the short forms of
the BE verb after the past participle. In Indic languages, the function of this
particle is generalized for all persons and numbers, thus in Panjabi "giā sī 'I
(he, etc.) had gone'" (Sampson 1926). In Persian⁶, the past participle is
invariable, but the perfective marker (i.e. the short forms of the BE verb)
changes for person and number: rafte am "I have gone", rafte i "you have
gone", rafte ast "he, she has gone" etc. Romani differs from each of these,
in that the person/number marker is retained, and the third person singular
imperfective/pluperfective morpheme has been generalized for all persons/
numbers: dikhav-as, dikhes-as, dikhel-as, etc. The loss of final /t/ from the
hypothesized underlying Persian ast is a normal phonological feature of the
language (cf. [vas] for <vast>, "hand", [bus] for <bust> "skewer", etc.)⁷.
The second grammatical acquisition from Iranian is the comparative
suffix {-der}, enclitic to the oblique adjective: baro "big", bareder
"bigger". While Sanskrit had the related form -tara, Sampson has noted
(1926:150) that the structure "is lacking in A[siatic] Gyp[sy] (i.e. Domari)
as well as in the Indian vernaculars", though he provides parallels with Skt.
at para. 125.3 on p. 53. Hindi uses the oblique adjective followed by
postpositional se: bora "big", bore se "bigger", and Banjara does the same:
moto "big", mote si "bigger". Gujarati adds -kartm or -th to the nominal
preceding the adjective: mot "big", str mo t-kartm hokr he "a
woman is bigger than a girl", while Sindhi "uses the Iranian comparative in
-tar" (Campbell 1991:1235). In the Iranian languages, however, the
Romani construction is found. In both Pehlevi (Middle Persian) and
Modern Persian, the comparative suffix is -tar: bozorg "big," bozorgtar
"bigger." Kurdish has -ter: gewre "big," gewreter "bigger," while the Ossetic
form -der comes closest of all to Romani: saw "black," sawder "blacker."
The possibility of an Ossetic origin is strengthened by the fact that the
Common Romani comparative feder (or fededer) "better" may also be
Ossetic; see discussion for this entry below. Pashto has the cognate
comparative morpheme tor, but this functions prepositionally only.
Kostov (1963:99) has suggested that the model for forming the numerals
between eleven and nineteen (with des "ten" + -u-: des-u-jekh "11,"
des-u-duj "12," etc.) is either Persian or Armenian; the R. model is neither Skt.
nor New Indian; cf. Skt. ekādaśa, dvādaśa, Hi. gyārah, bārah, "11, 12," but
cf. Arm. tasn-u-mek, tasn-u-erku "11, 12," P. bist-o-yek, bist-o-do "21, 22"
(Romani has conjunctive ta for compound numerals above 20). In some
Vlach dialects, conjunctive -u- is not used in compounds with the
non-Indian numerals efta "7," oxto "8" and enja "9." Further discussion of
numerals is given below.
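The compounding pattern just described can be restated as a small rule. The Python sketch below is only a schematic illustration: the function name teen_numeral and the parameter no_linker_units are hypothetical, the forms are spelled as printed above, and the hyphens in the output are a display convention for showing the segmentation, not an orthographic claim.

```python
def teen_numeral(unit, no_linker_units=frozenset()):
    """Compose a numeral between 11 and 19 as des + -u- + unit.

    `unit` is the Romani word for a number from 1 to 9, spelled as in
    the text. `no_linker_units` holds the units that attach without the
    conjunctive -u-, as the text reports for some Vlach dialects.
    """
    parts = ["des"]
    if unit not in no_linker_units:
        parts.append("u")
    parts.append(unit)
    return "-".join(parts)


# Units that some Vlach dialects compound without -u- (per the text).
VLACH_NO_LINKER = frozenset({"efta", "oxto", "enja"})

print(teen_numeral("jekh"))                   # des-u-jekh
print(teen_numeral("duj"))                    # des-u-duj
print(teen_numeral("efta", VLACH_NO_LINKER))  # des-efta
```

The default call models the general pattern; passing VLACH_NO_LINKER models only those Vlach dialects in which the linker is dropped before the non-Indian numerals.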
The Romani forms listed below are from an examination of all
available wordlists and do not represent any single dialect. Unless otherwise
specified, the first entry in each case is Persian:
R1  aj  certainly!  K ar "certainly!, yes!" In R compounds ajso, ajsar. Cf. D ais "well now!".
R2  jna  mirror  yneh "mirror" (Albanian R.). Possibly < Tk.
R3  akana, akna  now  aknun "now," but cf. Hi akhan "now." Prob. convergence.
R4  alav-  set on fire  alv "flame", cf. Skt dhayati, Skt anala "fire," Sindhi anal(a) < ult. Drav. anal
R5  amal  friend  haml "friend," + Hi Old P. mal, Oss. mbal, Skt amtya, Pkt *amatta "companion"
R6  ambrol, brol  pear  amrud "pear," K emrd + Tk armud
R7  angut  finger  angut "finger," but cf. Skt angusta-, and next
R8  anguštri  ring  anguta "ring," a loan from Hi. angutar. K engtere
R9  arii  tin  arij "tin," Oss arj, + Arm.
R10  asav, asjav  (wind)mill  syb "mill," also K asyaw "water mill"
R11  avcin  steel  Oss afseing, Kurdish spna
R12  avdžin, avgin  honey  angubn "honey," K hingv
R13  azb-  touch  azmudn "try out, test;" doublet with R119 zumav-
R14  baxt  luck  baxt "luck," K, Arm, Hi
R15  berk  breast  bark, birkat "breast;" ultimate Ar source cited by Miklosich (above) appears to be a ghost word.
R16  bezex  sin  bazah "sin," K beze Tk beze
R17  bi-  without  bi "without," + K + Skt. See Bloch (1926:139).
R18  bibi  aunt  bb "aunt," but prob. < I
R19  bori  daughter in law, bride  K bur "daughter in law," but cf. the more prob. Banjara bori.
R20  burnek  handful  burunk "gain, acquisition"
R21  burr  straw  K borr "pile of straw." Sampson (1926:48-9) says "origin somewhat obscure," but compares Hi b "bush, shrub."
R22  bust  spit, skewer  sbux "prick, pierce"
R23  bustan  garden  K bistan, "garden." (< Ar.?)
R24  buzex  spur  sbux "prick, pierce"
R25  buzno  goat  buz "goat," Afg vuz
R26  erxan, erxaj, akano  sky  arx "sky." Miklosich's Ar. source seems to be a ghost word.
R27  i, ii, i  anything, nothing, not  ci, ih "what." Bloch (1926:139) dismisses Skt. indefinite enclitic -cid because it cannot exist unbound (Sampson, 1926:60).
R28  inri  plane (tree sp.)  enar "plane-tree" (Albanian R.)
R29  ukni, upni  whip  buk "active, quick, nimble"
R30  desto  handle; stick  dastah "handle, haft"
R31  divno  conversation  dvan, K diwan "council", also Tk, Balkan lgs. Non-final stress suggests later adoption in Europe.
R32  diz  town, fortress  diz, diz "fort," K diz "fort"
R33  dorjv, dorjvo  sea, lake, river  dary "sea"
R34  do-, du  to milk  K du-, du- "to milk"
R35  duman, dumano  enemy  dolman "enemy," also Rum, Tk, Ar, Psh, Ur.
R36  damutro  son in law  dmd "son-in-law," cf. D potra "son", but almost certainly I, with convergence, cf. Peh damat, Skt mtr.
R37  di, (pl. da), ozi  stomach, heart, soul  K jn "life," dzagr "stomach." Convergence of unrelated Ir forms in R?
R38  dukel  dog  aghl "jackal." This is usu. listed as < Skt jakuta "dog," but Sampson says this has no descendants in neo-Indic. Soravia (1988:5) however lists Mynwle jukel, Sansi chhkkal, Kanjari jhu nkil (these languages are discussed in Grierson, 1922).
R39  feder, fededer  better  Oss fid "strong." Sampson tentatively suggests Skt bhadratara "better," but development of R /f/ would be hard to explain. {-der} may itself be < Oss {-der}. Feder is the only comp. form surviving in the Vlach dialects, in which it means (as adverbial maj feder) "rather, sooner."
R40  gertjno, girtno  throat, gullet  K gerdin "throat," cf. Rum dial. girtan, poss. Slav. (See also R50.)
R41  harbuz, arbuz  melon  xrbuz "kaveh melon, donkey-mouth melon," Gk and Tk have karpuz(i); also Rum.
R42 kam- to desire Peh. kmitan "wish," cf. Skt kmyate. Turner says this
is not IR; cf. Hi kam "desire, wish," although we would
have expected the R form *kav- if it were < I.
R43 kanzavri hedgehog xrndaz "porcupine" (?) (Balkan R. only, and almost
certainly < Gk skanzxoiros
R44 katun sheet, cloth qutn "cloth" < Ar. Also Gk & Slavic, see
Sampson, (1926:136)
R45 kermso, rat karmu "rat"
kermsa, kermso
R46 khangeri church kangura "battlement, turret". Discussion at D 6 below;
see Sampson (1926:133)
R47 kirmo worm kirm "worm," + Hi, cf. Skt kpnih, Pkt kimi. (Arabic
adopted the Skt form as qirmiz, from which the P form
is derived. Cf. E kermes "cochineal insect," also
carmine, crimson). Turner believes this to be directly
from Skt.
R48 ki , ke , ke silk kaz "raw silk"
R49 kisi pouch ks, ksa "pouch," < Ar.
R50 korr neck, gullet K karrhk, P grdn "neck"
R51 korro, koro blind kr "blind," + Arm kuir, K kor, Hi kor
R52 kui cup kze "pot, vessel," K kz
R53 kun, kunsus corner kunj "corner"
R54 kta, kti wrestling match koti "wrestling," K koe "effort." Albanian R.
R55 kutik belt kuti "girdle worn by Parsis"
R56 lalo, mute, dumb lal "mute," + Hi, cf. D lala (cf. the Lallere Sinti).
lavoro, lavodo
R57 lav word lafz "word," K lebz (<Ar). Sampson (1926:191)
rejects Skt lpa/Hi sallp. Tk has laf, a common
form in many Balkan R dialects.
R58 lis, li terror larzdan "tremble," cf. D rzri "tremble."
R59 lono, loeno joyful ron "bright, splendid" K rro in "delighted, happy,
lit up." Sampson (1926:200) says not < Skt las, but cf.
Skt rocana "bright, pleasing, lovely." See also L22,
L29
R60 majmno monkey meymun "monkey," K, Tk maymn, Rum momuie,
and other Balkan lgs.
R61 mjna plain, savanna K mejdan (< Ar)
R62 mamux, sloe mahmiz "sloe fruit," but cf. Tk mahmuz (< Ir?)
mamuxo
R63 mjve, fruit mlveh "fruit," K mewe, but cf. Tk meyva.
mvo, mibao
R64 mero person K merdum "people," Old P merah "husband"
Greek & Spanish Romani
R65 mom, mum wax mom "wax," + K + Hi
R66 momeli, candle The P word for this is am, in K it is mom as
memel i, mum(b)ali above. The R form, according to Sampson (1926:82,
233), is formed from mom + the adjectival suffix
{-al-}, the word therefore meaning "(the) waxy (one)."
The form might however be Arm, thus momeln "of
wax."
R67 muso, musos, mouse mus "mouse," cf. Hi ms, msak
muakos, mia
R68 na, anav name nam "name," + Hi
R69 nian sign, signal nean "sign," K nian "sign, mark, spot, meeting place"
R70 orde, here Oss orth "here"
vorde, arde
R71 pata, footwrap ptva, pataba "leggings," cf. Hi; E "puttee"
patay, patavo
R72 paver-, rear, foster parvardan "nourish", + Hi. Cf. Skt. prati "protect,
parvar- nourish"
R73 pendex, nut banduk "filbert", cf. K bendak,
penax, penaxa Gk fountouiki "hazelnut, filbert."
R74 perde curtain prdeh "curtain." Cf. Hi, Ur purda "veil," hence E
"purdah." But R form prob. immediately < Tk perde
R75 phurt, bridge purt, purd "bridge," but cf. Skt *prtu- "bring across"
phurd, phurt
R76 por feather par "feather" (prob. not cognate with R patrin "leaf."
See Sampson, 1926:282 and Bloch, 1926:140), though
cf. Pkt pama-
R77 poom, wool pam "wool," "down"
poum
R78 poti hide, skin post "skin," postn "fur," + Hi
R79 poxtan, cloth paxta "cotton"
potan
R80 rad- set out on a K rra-y "set out on journey" (R form probably
journey compounded with d- "give;" Hi rahn unlikely).
R81 res-, ares-, arrive rasidan "reach"
reg-
R82 rez vineyard K rrez "vine, vineyard"
R83 rizer-, shake, tremble larzdan "shake," see Sampson (1926:316)
lisdr-, izdr-
R84 ruv wolf rbh "fox", cf. Oss robas, ditto
R85 rril fart ridn "defecate." Sampson (1926:314) doubts
origin in I dhirh
R86 sini, sinia table, tray sn "salver," + K, but also Tk and widespread
Balkanism
R87 sir, sirr garlic sir "garlic," + Hi + K
R88 soba stove K soba "stove." This word has the same form and
meaning in Romanian; in South Slavic and in Romani
dialects spoken in those areas, it means "room," i.e. the
area warmed by the stove. Probably K and R < Tk
R89 gaj able saj "be able," say-ad "it can," K san "be able, capable,
ready" (Wahby & Edmonds 1966)
R90 ol, gel whistle sor, sut "a whistle," + Ur, Hi
R91 tablo warm tab-an "heat, radiation", tab-es "radiance", K taf
"warmth." Bloch (1926:140) dismisses Sampson's Skt
tapia, but cf. Hi tap(a)t "hot, warm."
R92 tang, tango narrow tang "narrow," + Hi, Ur. D tanga
R93 taxtaj tumbler, glass tast "vessel, bowl," K test "metal bowl," also Tk
R94 trijax, boots Oss tsyryx'x "boot." Gk has tsarouxi.
dirax, erda, khera
R95 tover, tovel axe tapar "axe," cf. K tapar, Arm tavar
R96 vazd- raise, lift K vazdn'a "I lift myself." See Miklosich 1872, vol. 10,
p. 93, also Sampson 1926:20.
R97 veg, vo woods besa "forest" (Manzandaran dial. vis)
R98 vurdon, waggon wardyn. Also Oss wordne "waggon", P bordn
verdo, vardo "wheel" K gerdne "carriage" (Wahby & Edmonds
1966)
R99 xa- eat xidan "eat" (it has been suggested that this is not
< Skt khdati because no Domari forms have a /d/).
R100 xamov- yawn, gape xamiyza "yawn," + Ar
R101 xanav-, dig OP xan- "dig." Cf. OI khan- "dig," and next
hunav-, honav-
R102 xanduk deep xandaq "grave, pit," K xendeq "trench"
R103 xandz an itch xrdan "itch," xrndan "scratch," xrxr "an itch"
R104 xanrrnd- scratch as above. Cf. also K xuran "itch," Arm xantel "itch"
See Sampson (1926:177).
R105 xar valley, gully xr, xre "slope, valley," but cf. Skt khadda "hole"
R106 xeljlax, xeljal herbs kirkir "kind of bean or pea", cf. Gk kekris
R107 xer donkey xar "donkey," but cf. Pkt khara-
R108 xevis, xejic, cornmeal mush K xav "cold, uncooked meal"
xivici, xovici
R109 xirpa term of abuse; K xirpo "term of abuse"
asshole
R110 xolov, sock, stocking Oss xalaf. Wolf's comparison with Bulgarian
xoliv, xoluv holev (1960:134) reveals no such word found in that
language. Tk has orab, do.
R111 xulano, chief, leader, K xola "landlord"
xulaj host
R112 xumer dough, dumpling xamr "dough", Also Ar, Arm
R113 xurdo, small, petty, xurd "small." Ashiti (an archaic Iranic dialect
xuredo insignificant between Kurmanci and Persian) has xori "it's a
waste of time." H khurd less likely.
R114 zen saddle zin "saddle," K zi n
R115 zeja back Plural of zin (< zinja > zija > zeja, with
characteristic Vlach loss of intervocalic nasal) but
not with this meaning in P or K.
R116 zet, diet, oil zeit "oil"
zetino
Jin "a spirit, genii"
R118 zor, ruz-, power, strength zor "power," + Arm + K
o
R119 zumav- try, test, prove azmudn "try out, test" + K. A doublet with R13 azb-:
Gjerdman & Ljungberg (1963:203) attempt no
etymology for azb-, while Sampson (1926:409)
suggests a Romanian source for zumav- but does not
provide one.
3.2 Those in Lomavren

No native or Iranian grammatical adoptions are evident in Lomavren,
but Persian-derived word-formation in that language is discussed in Finck
(1907:53ff.). The items below are cited as Persian by the same author.
L 1 andz inclination, desire andoz
L2 ansev apple siv Kar seb
L3 ansnani strange, unknown K nas "know" + Arm neg.
L4 antazi basket ndazeh "a measure"
L5 sb horse sb
L6 vd flour vd
L 7 axolar, xola lord xula, K xola
L 8 axvar contemptuous, evil axvr "hateful"
L 9 bb gate bb + Ar
L 10 barbar equal brabr
L11 bezeh, pa Sax damage, injury bazah
L12 drm I have darm
L13 daste handle dastah
L14 dmrav violin damrv "mandolin"
L15 dombare tambourine Old P dombareh
L16 ghen today ghen "instant, occasion"
L17 gar soup garm = "warm," as metaphor
L18a ham, him again him "back again"
L18b himni bring back him + Arm
L19 isi this in isi ghen "time, occasion"
L20 jari whore yr "lover, friend" adopted into Sh and Ka
L21 jiroSi wealth, goods yiro "livelihood"
L22 jiro fire, light ron "clear, bright"
L23 kargah workplace krkh "type of frame tent"
L24 kurax foal, filly koreh
L25 nakeajtejan suddenly naghan
L26 parpar against bar
L27 piyazav onion piaz
L28 por, pur fill por
L29 rowan light, not dark ron "to light." Cf. L22, R59
L30 alvar trousers lvar
L31 sib apple K seb
L32 suzan needle suzn, cf. Hi s
L33 var stone, tooth bar, D wat, but rejected by Sampson
(1926:24)
L34 vist twenty bist
L35 vusta master bezeh, K beze
L36 xar eat xordan, R xa-
L37 xod god xoda "god"
3.3. Those in Syrian Domari

I have not attempted to list all of the Iranian-derived items in Domari
since, as Kaufman says (1984:4), it "shows massive (but not quite overwhelming)
influence from Iranian...most of the Iranian vocabulary in Dom-Gypsy
is of Kurdish origin;" Macalister (1914:14), for instance, discusses the
Persian model for the Domari genitive construction. The language is also
very heavily influenced lexically and structurally by Arabic, dealt with
exhaustively by Littmann (1920), who deals only in passing with its
Persian-derived element (1920:32).
It is significant that in all three of the Gypsy branches discussed here,
the words for "seven," "eight" and "nine" are not native items, suggesting,
as Pischel (1883:360) has already noted, that they may not have been
brought out of India. In Romani they have been adopted from Greek, while
in Lomavren and Domari they are Persian. The same is true for the words
for "thirty," "forty" and "fifty;" Romani has Greek words for these, and
Lomavren and Domari have Persian.
Listed here are those items with parallels in either or both Romani and
Lomavren.
D 1 bi without bi, K be Also Ind.

D 2 if saliva tuf
D 3 daurik-pani sea dary
D 4 d village dih
D 5 jau barley dzau
D 6 kangri waggon kangura "turret." This word is not listed in
Macalister. Pott indicates that the Middle
Eastern waggons gave the appearance of
having turrets, and compares Spanish Romani
cangallo "waggon" and EurR kangeri
"church".
D 7 kr blind kr, adopted by Hi as kor. Also K kor, Arm
kuir
D 8 ku beard ksa
D 9 lla dumb, mute lal
D 10 nla dark blue nl (?< Ar)
D 11 pai behind pas
D 12 ptna wool pam
D 13 pau foot p, pe
D 14 qar ass xar
D 15 ras- arrive rasdan
D 16 raw- depart, go, move ra-y. See R80
D 17 razari tremble larzdan
D 18 su needle suzan cf. Hi. sul
D 19 tnga narrow tang cf. R tang
D20 wrsar rain barsanda, K bar
D21 xal maternal uncle xal (?< Ar)
D22 xja god,sky, heaven xud
D 23 ya or ya
D 24 zari mouth Old P zafar
D 25 zrda yellow zard
3.3.1. Domari equivalents of nonshared Iranian items in Romani

The fact that most Romani items which are of Iranic origin are different
from their equivalents in Domari also argues against their common
acquisition. These include:
Romani Domari
amal friend beli, sahib
angutri ring ngli, wirg
avin steel stm
azb- touch mnr
baxt luck miritk
berk breast i, xri
bust/buzex spit; spur sx
buzno goat kli
erxan sky, star lhii, xy
do- milk xlaur
katun tent xmi, kuri
khangeri church ktirnkki
kisi pouch, pocket b
kugtik belt kli
lig terror bi, xf
momeli candle mixri
pendex nut bgnn
poti skin, hide kl, qal
poxtan cloth pti, tli
res- arrive had-her
ruv wolf snts-xlaiki
ol whistle ibab
taxtaj drinking vessel ktik.
tover axe ptt
vazd- pick up limm-kerr
xeljax herbs gas
xulaj lord kurik-saui
zor power qauwa
zumav- try jar
3.4 Those in Karaci
K 1 alav flame lw
K 2 angudari ring angutar
K 3 bafir snow barf
K 4 banir cheese panr
K 5 bugandus he dug kandan
K 6 ia hole, well ah
K 7 dava camel dav- "run"
K 8 deh village dih
K 9 dost friend dst
K10 dara candle iry
K11 daver barley Jaw
K12 gand sugar qand, kand "candy"
K13 gyrmyzi beautiful; red girmiz "crimson," cf. kirmo
K14 hafta week hafta (also "seven")
K15 hanaq trick, joke hanaq-myqa "do not trifle"
K16 hargu everywhere haru
K17 erimda graze ardan
K18 urabaura diverse Jr-ba-Jr
K19 ku beard kusa
K20 kuli cap kulah
K21 luleh leg lla "tube"
K22 mahni they stopped mandan
K23 mahudi cloth mht
K24 meimun monkey maimn
K25 mega a wood mesa
K26 mia he dies mir
K27 miftav moon mahtb
K28 muzi shoes mza
K29 myg, mygu mouse m
K30 naziq thin nazk
K31 nila dark blue nil
K32 rdaq duck urdak
K33 paa after, since pas
K34 paf foot p, pe
K35 pigiq she-cat pilk, puak
K36 plav stewed rice pilv
K37 qar ass xr
K38 qadum I desire qasdam "my desire"
K39 qi in order that ki
K40 quq wolf gurg
K41 rang colour rang
K42 seb apple sib
K43 sirqa vinegar sirka
K44 sovdasyz trade (n) sawd
K45 gax horns x
K46 ir lion er
K47 or salted r
K48 tarki dark trk
K49 taza fresh taza
K50 tu spittle tuf
K51 tuz smooth tz
K52 varsinda rain barsanda
K53 xa'a egg xya
K54 xalum uncle xl
K55 xania spring, well xani
K56 xuja god xud
K57 xyrsi hear xirs
K58 ya or ya
K59 zardavi yellow zard
K60 zever mouth zafar (Old P.)
3.5 Those in Mitrip (B Batman dialect, E Elmayaka dialect, V Van dialect)
M 1 bavere! E come! K were!
M 2 bever B father K bav
M 3 ire V grass, grassland K re "meadow"
M 4 davir, dawir B mother K da, de
M 5 geji B woman K ke,
M 6 hesp V horse K hesp
M 7 kutaviva B where (are you K kuda "where"
going)?
M 8 lafte B child K law
M 9 merif V man K merif
M 10 rawele E stand up! K rabe!
M 11 zman B language K zman
4. Analysis
The following breakdown of shared forms includes items which are
probably of Indic rather than Iranian origin ("stone", "eat", "needle",
"rain"), or which might have been adopted from Arabic ("monkey," "dark
blue," in Karaci). Not included are cognates which, because of semantic
shift, were probably inherited independently and cannot support common
acquisition (for example R khangeri "church" and D kangri "waggon," R
lono "happy" and L rowan "bright," R kirmo "worm" and K gyrmyzi
"red," R efta "seven" and K hafta "week").
    Romani                  Lomavren    Domari   Karaci
 1  angutri "ring"          -           -        angudari
 2  barr "stone" ?          var         wt       -
 3  bezex "sin"             bezeh       -        -
 4  bi "without" ?          bi          bi       -
 5  briind "rain" ?         -           warsar   varsinda
 6  desto "handle"          daste       -        -
 7  dorjvo "sea"            -           daurik   -
 8  izdr- "shake"           -           razari   -
 9  korro "blind"           -           kr       kor
10  lavodo "mute"           -           ll       lal
11  ma(j)muno "monkey" ?    -           -        meimun
12  mia "mouse"             -           -        my
13  poom "wool"             -           pm       -
14  rad- "depart"           -           raw-     -
15  res- "reach"            -           ras-     -
16  suv "needle" ?          suzan       su       -
17  tang "narrow"           -           tng      -
18  xa- "eat" ?             xar         -        -
19  xer "donkey"            -           qar      qar
20  - "apple"               sib, ansev  -        sib
21  - "god"                 xod         xuja     xuja
22  - "after"               -           pai      pa
23  - "ass"                 -           qar      qar
24  - "barley"              -           dzau     dzaver
25  - "beard"               -           ku       ku
26  - "dark blue"           -           nila     nila
27  - "foot"                -           pau      paf
28  - "mouth"               -           zari     zever
29  - "or"                  -           ya       ya
30  - "spittle" ?           -           f        tu
31  - "uncle"               -           xal      xalum
32  - "village"             -           de       deh
33  - "yellow"              -           zerda    zardari
Of the above items, Romani and Lomavren share eight, Romani and
Domari/Karaci share sixteen, Lomavren and Domari/Karaci share three, and
Domari and Karaci share sixteen, to be expected since they are both Dom
languages. None of the eleven Persian-derived items in Mitrip occur in any
of the other languages, and there are no Iranian items at all shared by all
three (Rom, Lom and Dom).
Among the total of 119 items of Iranian or possible Iranian origin in
Romani, the eight shared with Lomavren constitute about one fifteenth or
roughly 7 %; the sixteen shared with Domari/Karaci just over one sixth, or
about 15.5 %. Of the total of 36 items of Iranian or possible Iranian origin
in Lomavren, those it shares with Domari/Karaci constitute one eighth, or
about 12 %. Of the sixty items in Karaci, sixteen are shared with Domari,
which is about 26 %. These ratios would be even lower if doubtful items,
such as those followed by a question mark, were not counted. The correspondences
would then be two items shared by Romani and Lomavren or
2.3%, eleven items shared by Romani and Domari-Karaci, or 11.5%, and
two items shared by Lomavren and Domari, or 2.3%. The proximity
suggested by the greater percentage of shared Rom-Dom items (than Rom-Lom
items) contrasts oddly with Sampson's scheme in which the first
hypothesized split was between the Ben and the Phen groups.
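The first set of ratios can be rechecked by simple division. The sketch below is only an arithmetic aid: it takes the counts as stated in the running text (eight, sixteen, three and sixteen shared items against totals of 119, 36 and 60), not as re-derived from the table, and the dictionary `shared` and the helper `percent` are names introduced here for illustration. Its results need not coincide with the rounded figures quoted above, which are left as printed.

```python
# Counts and totals as stated in the text (assumption: not re-counted from the table).
shared = {
    "Romani-Lomavren": (8, 119),
    "Romani-Domari/Karaci": (16, 119),
    "Lomavren-Domari/Karaci": (3, 36),
    "Karaci-Domari": (16, 60),
}


def percent(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


ratios = {pair: percent(count, total) for pair, (count, total) in shared.items()}

for pair, value in ratios.items():
    print(f"{pair}: {value}%")
```

Whatever the exact figures, the ordering is what matters for the argument: the Romani-Domari/Karaci share is larger than the Romani-Lomavren share.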
5. Conclusion
It may be assumed that the low incidence of shared Iranian-derived vocabulary
between Romani and Lomavren or Romani and Domari/Karaci,
and the absence of any items shared by all three, argue strongly against
their separation after having coexisted in Iranian-speaking territory. Even
Sampson (1923:164) admits that "lacking in Nuri [i.e. Domari] are several
important loanwords [from Persian occurring in Romani], which may perhaps
be regarded as evidence that the two bands had separated before these
later Persian borrowings were absorbed into the speech of the western
Gypsies." Whether these figures prove separation within Indian territory, or
outside of India before reaching Persian territory, however, remains to be
demonstrated using other criteria.
NOTES
I should like to thank Ali Jazyery and Mohammed Ghanoonparvar for their help with
Persian, Ms. Corinna Leschber for her help with Kurdish, Ms. Naciye Kunt for her help
with Turkish, and Angus Fraser, Anthony Grant, Norbert Boretzky, Victor Friedman and
Donald Kenrick for their valuable comments on earlier drafts of this paper.
1 Although Leland (1873) gave himself the credit for making this connection, as he did
for the "discovery" of Shelta sixty-five years after McElligott first wrote about it
(Hancock, 1984:385), it was in fact Hermann Brockhaus who first suggested it, in a letter to Pott
dated July 16th, 1841. The relevant section of that document (found in Pott, 1844, Vol. I,
on page 42) is important, and bears reproduction here:
In the collection of fairy tales of the Somadeva edited by myself, we find Tar. 13 l.
96 (page 169), and in Kalhana's History of Kashmir, e.g. V. 353, the word Domba
(with retroflex d), and Wilson makes at this point the comment that this name
indicates a kind of pariah. Since this word is missing from the Sanskrit dictionaries, and
thus is not considered to be classical by Indian grammarians, it must therefore belong
to the words borrowed from the colloquial vernaculars. In Hindi, we actually find the
word d'ma, feminine d'mn, with which a person of the lowest class is labelled.
Might not this word dom be the same as the Gypsy Rom? Doesn't this perhaps refer to
a tribe originally living in north-western India which, being subjugated, were
degraded to the status of pariahs? The fact that a people don't call themselves by a
name indicating something dishonorable is obvious; only through subjugation can the
name of a people become a name of opprobrium among the victors.
2 By Kaufman, 1984.
3 The position (first mentioned by Groome, 1963:xxxiii, and discussed in Hancock,
1988:206) which assumes that the Dom and the Rom left India together and passed through
Persia before the 7th century rests upon the supposed lack of Arabic influence in the latter,
the argument being that after that date, the spreading influence of Arabic upon Middle
Eastern languages would have been reflected in Romani. However, while Arabic began to
affect the liturgical vocabulary of Persian at this early date, it did not have any considerable
impact upon the colloquial speech until the 11th or 12th centuries. In addition, the route of
the Domba through Persia seems to have been along the shoreline of the Caspian Sea,
geographically remote from Arabia and, presumably, the original linguistic influence of
Arabic. Nevertheless, Arabic has made some impact upon Romani. De Goeje (1903:55)
claimed that the Arabic items he lists "all occur in European Gypsy dialects...and they
sufficiently establish the theory that all the Gypsies of Europe have lived for a long time
among Arabic-speaking people." Although Miklosich (1877 ff., vol. 6, pp. 63-64) has
made a convincing case against most of them, and some are found neither in Romani nor
Arabic, they are repeated here because of the inaccessibility of the original list, and for
future reexamination: agor "end," Ar xir, alikati "time," Ar al-ikt, axal- "understand,"
Ar 'aqal, baxt "luck," Ar baxt, berk "breast," Ar berka, aro "dish," Ar ahn, eni
"earring," Ar odhni ("ear"), oro "deep," Ar ghr, handako "ditch," Ar xandak, jar "heat,"
Ar harr, kghed "paper," Ar kghed, kha "house," Ar kha, kx, katuna "tent," Ar
qaitun, ke, kez "silk," Ar ke, kazz, kisi "purse," Ar ksi, kotor "piece," Ar kot'a, kurko
"Sunday," Ar kurki, ma(j)muno "monkey," Ar maimun ("happy"), mom "wax," Ar mom,
moxto "box," Ar motn, pendex "nut," Ar bondoq, tremo "vestibule," Ar trima, xasar-
"lose," Ar xasar, xev "hole," Ar kav, xud-, ud- "seize," Ar axadi, xumer "dough," Ar
kamr, zeiti "oil," Ar zeiti.
4 While Vlach e "what" only coincidentally resembles the Persian, Common Romani Ci
and its variants meaning "nothing, anything" may be of Iranic origin. This item is discussed
at R 25.
5 Abbreviations in the lists are as follows: Afg Afghani (Pashto), Ar Arabic, Arm
Armenian, D Domari, Dar Dardic, E English, Gk Greek, Hi Hindi, I Indian, Indic, Ir
Iranian, K Kurdish, Ka Kashmiri, Kar Karaci, L Lomavren, OP Old Persian, OI Old
Indian, Oss Ossetic, P Persian, Pan Panjabi, Peh Pehlevi, Pkt Prakrit, Psh Pashto, R
Romani, Rum Rumanian, Sh Shina, Skt Sanskrit, Tk Turkish, Ur Urdu.
6 This form is only found in literary Persian today, but may have been a part of the
colloquial language a millennium ago.
7 For the treatment of prothetic /v-/ in vast and other thematic items in Romani, see
Turner (1975). For an alternative analysis of {-as}, see Bubenik, this volume.
REFERENCES
Abaev, V.I. (1964) A Grammatical Sketch of Ossetic. The Hague: Mouton & Co.
Andreas [Robert Scott Macfie] (1926) Review of Sampson (1926) Journal
of the Gypsy Lore Society 3 (5), 126-134.
Andrews, Peter (ed.) (1989) Ethnic Groups in the Republic of Turkey. Wiesbaden:
Ludwig Reichert Verlag.
Bakker, Peter, & Marcel Cortiade, eds. (1991) In the Margin of Romani:
Gypsy Languages in Contact. Studies in Language Contact, No. 1. Amsterdam:
Instituut voor Algemene Taalwetenschap.
Benninghaus, Rüdiger (1991) Les Tsiganes de la Turquie orientale. Etudes
Tsiganes 3, 47-60.
Berger, Hermann (1969) Die Burusaski Lehnwörter in der Zigeunersprache.
Indo-Iranian Journal 3 (1), 17-43.
Bloch, Jules (1926) Review of Sampson (1926). Journal of the Gypsy
Lore Society 3(5), 134.
Bloch, Jules (1932) Survivance de Skt. st en indien moderne. Bulletin de
la Societe de Linguistique de Paris 33, 55-65.
Boretzky, Norbert, & Birgit Igla (1994) Wörterbuch Romani-Deutsch-Englisch
für den südosteuropäischen Raum. Wiesbaden: Harrassowitz.
Campbell, George L. (1991) Compendium of the World's Languages.
London: Routledge.
Colocci, Adrian (1907) Review of De Goeje (1903). Journal of the Gypsy
Lore Society, New Series, 1, 278-280.
De Goeje, M.J. (1903) Mémoire sur les Migrations des Tsiganes à travers
l'Asie. Mémoires d'Histoire et de Géographie Orientales, No. 3. Leiden:
Brill.
Finck, Franz N. (1907) Die Sprache der armenischen Zigeuner. St. Petersburg:
Imperial Science Academy.
Finck, Franz N. (1907a) Die Grundzüge des armenisch-zigeunerischen
Sprachbaus. Journal of the Gypsy Lore Society, New Series 1(1), 34-60.
Fraser, Angus (1991) Looking into the seeds of time. (Romani glotto-chronology).
Tsiganologische Studien, No. 1.
Gjerdman, Olof & Erik Ljungberg (1963) The Language of the Swedish
Coppersmith Gipsy Johan Dimitri Taikon. Uppsala: Lundeqvist.
Grierson, G.A. (1908) India and the Gypsies. Journal of the Gypsy Lore
Society, New Series, 1(1), 400.
Grierson, G.A. (1922) Gypsy Languages. Vol.XI of the Linguistic Survey of
India. Delhi: Motilal Banarsidass.
Groome, Francis H. (1963) "Introduction", Gypsy Folk Tales. London:
Jenkins.
Hancock, Ian (1984) Shelta and Polari. In: Trudgill, (ed.) (1984), 384-403.
Hancock, Ian (1987) Il contributo armeno alla lingua romani. Lacio Drom,
23(1), 4-10.
Hancock, Ian (1988) The development of Romani linguistics. In: Jazyery &
Winter (1988), 183-223.
Hancock, Ian (1991) Romani foodways: The Indian roots of Gypsy culinary
culture. Roma 35:5-19.
Jazyery, Ali, & Werner Winter (eds.) (1988) Languages and cultures:
Studies in honor of Edgar C. Polomé. Berlin & New York: Mouton de
Gruyter.
Jochelson, Waldemar (1928) Peoples of Asiatic Russia. Washington: The
American Museum of Natural History. Gypsies mentioned at 114-115
and 170-172.
Kaufman, Terrence (1984) Explorations in proto-Gypsy phonology and
classification. Paper presented at the session on Romani linguistics at
the Sixth South Asian Languages Analysis Conference Roundtable,
Austin, Texas. May. Unpubl. ms.
Kenrick, Donald (1976) Romanies in the Middle East. Roma 1(4), 5-8; 2(1),
30-36; 2(2), 3-39.
Kostov, Kiril (1963) Grammatik der Zigeunersprache Bulgariens: Phonetik
und Morphologie. Unpublished doctoral dissertation, Humboldt University,
Berlin.
Leland, Charles Godfrey (1873) The English Gypsies and Their Language.
London: Trübner.
Leland, Charles Godfrey (1875) A visit to the Gypsies. The Academy, June
19th, 637.
Littmann, Enno (1920) Zigeuner-Arabisch: Wortschatz und Grammatik der
arabischen Bestandteile in den morgenländischen Zigeunersprachen.
Bonn: Kurt Schroeder Verlag.
Macalister, R.A. (1914) The Language of the Nawar. London: Bernard
Quaritch (Gypsy Lore Society Monograph No. 3).
Mann, Stuart (1933) Albanian Romani. Journal of the Gypsy Lore Society,
Third Series, 12(1), 1-32.
Marre, Jeremy, producer/director (1992) The Romany Trail Pt. 1: Into
Africa. Beats of the Heart Series, Shanachie Records, SH 1210.
Harcourt Films, Inc.
Miklosich, Franz (1872-1881) Ueber die Mundarten und die Wanderungen
der Zigeuner Europa's. Vienna: Karl Gerold.
Papasian, Vrtanes M. (1901) Les Boschas (Tsiganes) arméniens. Erivan:
State Printer.
Paspati, Alexandre G. (1870) Etude sur les Tchinghians ou Bohemiens de
l'Empire Ottoman. Constantinople: Karomla.
Pischel, W. (1883) Die Heimath der Zigeuner. Deutsche Rundschau, 36,
353-375.
Pott, Augustus F. (1844-5) Die Zigeuner in Europa und Asien. Two
volumes. Halle: Heynemann Verlag.
Pott, Augustus F. (1846) Ueber die Sprache der Zigeuner in Syrien.
Zeitschrift für die Wissenschaft der Sprache 1, 175-186.
Redzosko, Y. le (1984) Armenian contributions to the Gypsy language.
Ararat 25 (4), 2-6.
Sampson, John (1923) On the origin and early migration of the Gypsies.
Journal of the Gypsy Lore Society, Third Series, 2 (4), 156-169.
Sampson, John (1926) The Dialect of the Gypsies of Wales. Oxford: The
Clarendon Press.
Soravia, Giulio (1988) Di alcune etimologie zingariche. Archivio
Glottologico Italiano 73, 3-11.
Trudgill, Peter, ed. (1984) Language in the British Isles. Cambridge:
University Press.
Turner, Ralph (1927) The position of Romani in Indo-Aryan. Journal of the
Gypsy Lore Society, Third Series, 5 (4), 145-183.
Turner, Ralph (1975) So-called prothetic v- and y- in European Romani.
In: Collected Papers, 1912-1973. London: Oxford University Press.
331-335.
Wolf, Siegmund (1960) Wörterbuch der Zigeunersprache. Mannheim:
Bibliographisches Institut.
Ventzel, Tatiana V. (1991) Le 'Bosa,' parler 'insulaire' des Roms d'Armenie.
In: Bakker & Cortiade (1991), 102-105.
Wahby, Taufiq & C.J. Edmonds (1966) A Kurdish-English dictionary.
Oxford: The Clarendon Press.
PLAGIARISM AND LEXICAL ORPHANS
IN THE EUROPEAN ROMANI LEXICON*
ANTHONY P. GRANT
University of Bradford
0. Introduction
A considerable amount of attention has been paid to the extraction from
the European Romani lexicon of elements from the genetic Indo-Aryan
component, and to profiling the broad etymological structure of the vocabulary.
This paper discusses two issues in the documentation of the European
Romani lexicon which have hitherto received scant attention. The first is that
of the authenticity of many of the data themselves, their status as
representative samples of the dialect under discussion and as bona fide Romani.
Much "Romani" data is actually fabricated from collections made from
several dialects, or contains a number of metanalysed (and often invented)
words, and several instances of this are discussed. The second issue is that of
lexical orphans, genuine words of whatever origin, but especially firmly-established
and integrated borrowings, in the way in which Poplack and
Sankoff (1984) used the term, which are attested only in one dialect in one or
more reliable sources, and which do not derive from the lexicon of the host
language. Such words - and not least early loans from the Greek and South
Slavic strata common to all dialects of European Romani - are important for
tracing the migrations of the speakers of a particular dialect and in
documenting the earlier history of the dialect.
1. General principles of plagiarism and the dissemination of spurious data
The biggest problem which one encounters when examining earlier
printed sources in search of old rare words which are no longer to be found
in a dialect is that of plagiarism, and thence of tracking down the ultimate
source of a given vocabulary. By plagiarism I mean the practice of taking
over linguistic data from other sources and passing it off as the fruit of one's
own work. I would also include the unquestioning incorporation of other
people's material in one's own work without identifying the elements which
have been plagiarised, even if this incorporation has been acknowledged. In
short, I am using "plagiarism" as an umbrella term to refer to those practices
which result in material which does not comprise part of the lexicon of a
Romani dialect being passed off as such.
One might divide plagiarism into two categories. The first would be
copying of data without acknowledging the source of words copied from
other material, which is plagiarism as understood in a legal sense. The other
category might be more accurately called "data fabrication": creating
"Romani" data from the records of other dialects, from material belonging to
other languages (usually Hindi or Sanskrit), or making words up out of thin
air, often by creating ghost-words because of one's misconstruction of a
certain item. Inventing large amounts of Romani lexicon is rare, and data
fabrication has usually been the result of plagiarisation of material itself
intrinsically dubious (since most Roma were illiterate until this century, we
can usually rule out borrowing by Roma of allodialectal or spurious forms
into their dialects, and can attribute questionable elements to the dishonesty
of the person presenting the collection, "improving" materials already
collected). Bogus indicisms, instances of metanalysis and desk-loans from
other dialects are often to be found in the same work. I will therefore treat the
two phenomena jointly.
The propagation of inauthentic data can be found in materials on most
dialects. The documentation of European Romani abounds with books which
include, or even entirely comprise, material stolen from other sources, often
at second or third hand, material that is misunderstood, inaccurately analysed,
swiped from a Hindi lexicon, or simply invented ex nihilo.
The motivations for plagiarism are numerous. Some authors who filled
their dictionaries with words from several different dialects were obviously
keen to ensure that as much as possible of the Romani lexicon was presented
between their covers (this may have been the case with Hrkal 1940), some
did not appreciate that Romani dialects were discrete and diverse phenomena,
while others must have looked with dissatisfaction at the mass of assimilated
German or Slavic (or whatever) words in their data, or at the small amount
of data which they had been able to gather altogether, and would have felt that
the responses should be improved by adding Indic terms. When the words
for "lake" and "side" in Sinto were the Slavic-derived zero and the opaque
rig (whose Indic etymology was perhaps not immediately clear to those
researchers), the temptation to adopt the relevant Hindustani words tallo and
kunara into a recognised Indic language must have been strong indeed
(though the latter word is a Farsi loan in Hindi!). Other common book-loans,
deriving from Hindustani via Grellmann like those two, such as banduk
"rifle, gun", a word of Turkish origin, are equally spurious as tokens of
Romani.
The first known instance of plagiarism ofRomanidata is that committed
by Samuel Bjrckman, a Swedish pastor and scholar of the early eighteenth
century, who wrote a Latin work, Dissertatio Academica de Cingaris (1730),
in which he passed off a number of items lifted bodily from Johannes
Scaliger's or Bonaventura Vulcanius' late sixteenth-century vocabulary as his
own, despite claims that he had checked it over with aRomani-speakerwho
was then in prison. The fact that Bjrckman's glossary is copied from
Vulcanius-Scaliger can be shown by the fact that both share the same
orthography, often spelling the same words identically, thus both use <ch>
for several sounds including /x/, // and //, both list the same words, such as
<buchos> "book", and both share mistakes, such as final <-t> for /-l/ in
<tzuket> "dog" (Romani dzukel) or <for> for por "feather". As can be seen,
the similarities in transcription and the shared mistakes give him away.
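
The reasoning used here, identical spellings plus shared mistakes, can be stated as a simple comparison. What follows is only an illustrative sketch in Python: the function name shared_evidence, the use of English glosses as keys, and the three-entry toy glossaries are assumptions made for the example, built from the forms quoted above; they are not the full vocabularies, and this is not a procedure described in the sources.

```python
def shared_evidence(source, suspect, correct_forms):
    """Compare two glossaries keyed by gloss.

    Returns (identical, shared_errors):
      identical     - glosses spelt exactly alike in both lists
      shared_errors - those identical entries that also deviate from
                      a known correct form
    """
    identical = {g for g, form in suspect.items() if source.get(g) == form}
    shared_errors = {
        g for g in identical
        if g in correct_forms and source[g] != correct_forms[g]
    }
    return identical, shared_errors


# Toy data: only forms quoted in the text; the layout is hypothetical.
vulcanius = {"book": "buchos", "dog": "tzuket", "feather": "for"}
bjorckman = {"book": "buchos", "dog": "tzuket", "feather": "for"}
correct_forms = {"feather": "por"}  # the text gives por for "feather"

identical, shared_errors = shared_evidence(vulcanius, bjorckman, correct_forms)
print(sorted(identical))
print(sorted(shared_errors))
```

Identical spellings of correct forms are weak evidence on their own, since two honest collectors could agree; it is the entries in the second set, where both lists repeat the same mistake, that point to copying.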
The publication of the economist Grellmann's work, itself largely derived
from secondary sources (both published and unpublished) of varying accuracy
and dialect affiliation, rather than a great deal of independent fieldwork
(Wolf 1960: 39), and coloured by what he had heard and read about Sanskrit
and Hindustani, was a significant event which often had rather dire
consequences for Romani linguistic scholarship, as he introduced a number of
ghost-words into the published material on Romani through the perpetuation
of printing errors, especially into the records on Central European and Sinto
dialects, for example such nonexistent words as feizrile "tomorrow" (for
*teizrile, compare Sinto tisrla), or telel "animal" (where German Tier
"animal" has been misread for tief "deep"; tlal means "underneath" in
Romani; "deep" is the Armenian loan xor). Grellmann's book was plagiarised
and excerpted more than any other single work, and its mistakes and falsities
were copied along with the more genuine data.
2. Artificial indianising in Romani lexica

Grellmann also set in motion the "Indianising" syndrome (also dubbed
offensively as "Einschwärzung": Wolf 1960: 233), whereby words from
Indo-Aryan languages (often loans from Persian or Arabic, mediated through
Hindustani), which were never part of the inherited Indic lexicon in Romani,
are taken over by an author into a body of Romani data in order to reinforce
the "Indic" characteristics of the language.1 Rüdiger had just shown the
essentially Indic affinities of Romani (Rüdiger 1782), and Grellmann and
others were not slow to give the facts of Romani's Indo-Aryan affinities a
push by improving Romani data in the direction of Sanskrit or by
improvising "Romani" data from Indic words completely unattested in
Romani. This Indianising trend was to happen occasionally for a long time
before the practice of borrowing straight from Hindi and Urdu (including
vocabulary derived from Indo-Aryan, Arabic, Persian and even English) into
Balkan Romani was legitimised as "language planning" in such works as
Kepeski and Jusuf (1980), attention to whose avid adoption of Hindi-Urdu
material into an emergent literary variety of Romani based on Arlija
(incorporating words containing voiced aspirates, or unrecognised loans
from English) has been drawn by Igla (1991: 84).
One author in the 1960s, himself a Rom, tried to palm off such Indic
"loans" as genuine Romani words in a Romani glossary (Kochanowski
1965). In the second volume of this chaotic and flawed work we find a
glossary of "Eastern Baltic Romani", a dialect of which he was apparently
one of the last two speakers (pages 384-419), this list being drawn from
Kochanowski's own diary. The language is basically North Russian Romani,
with (rather disturbingly, unintegrated and suffixless) loans from Russian,
Polish, Latvian, French, and a few unusual terms, for instance: man "to
think", jas "to hope", and tarang "a wave". That such stems have been taken
from a Sanskrit dictionary rather than really belonging to the Indic component
of Romani can be seen by the fact that some of them would have different
shapes if they were genuine Romani; thus *man would be men because of
the vowel change of /a/ > /e/ in closed syllables. He has apparently not
borrowed any material from Grellmann.
3. Sample cases of plagiarism in European Romani dialects

There are egregious examples of plagiarism and the use of inauthentic
data in the documentation of American Angloromani (Prince 1907;
smoulderingly reviewed by Sampson 1908-1909), in Central European
dialects (Hrkal 1940, vide infra), and in the published work of other
scholars, such as Decourdemanche (1908), and Wlislocki (1884). The
former, among other things inventor of a spurious Gypsy alphabet, is unread
today, but the baleful influence of the latter, who invented a "Transylvanian
dialect" out of Romungro, Lovari and invented data, was still felt in the
1960s, and one of his works was used and noted in Wolf (1960), who
evidently thought the work genuine, while words from the other works of
Wlislocki were included in a commentary on Wolf (1960) by Johann
Knobloch (Knobloch 1964), together with a number of words, mostly
derived from Hungarian, from texts which Knobloch had collected at Nazi
concentration camps in Austria as material for his doctoral dissertation
(Knobloch 1953), from speakers of various forms of Romungro and Lovari,
shortly before they were taken away to the gas chambers. Wlislocki seems to
have preferred to invent his own lexemes (for instance glete "tongue",
authentic Romani chib), rather than to draw from Grellmann's polluted well
like his predecessors.
Sinto dialects have often been documented in works largely derived from
other works whose lexical content is of questionable authenticity. The first
extensive source for Sinto lexicon is the dictionary compiled by the
Thuringian judge Wilhelm Ferdinand Bischoff (Bischoff 1827), who drew
on printed sources, including Grellmann's work (which contains material
from Romungro and Lovari as well as Sinto), and on his own fieldwork,
some of which was evidently carried out with Vlach-speaking Roma. This
mélange is perpetuated through the nineteenth century by authors who stole
data from Bischoff's book: a word recorded by Bischoff or maybe taken by
him from Grellmann's book would be taken from his dictionary and included
by the magistrate Richard Liebich (Liebich 1863), and others working from
Liebich's book would add it to their list.
Eventually Rudolf von Sowa sorted out the problems in this field,
producing a sensible dictionary of Sinto, based on his own fieldwork in
Westphalia and East Prussia and on printed sources, and keeping separate the
Baltic Romani dialect of East Prussia, which others had lumped with Sinto
simply because East Prussia was German territory. In his dictionary of 1898
von Sowa printed the more reliable words in italics and the others, including
words recorded in no other dialect and other more dubious ones, a good half
of the ones collected, in plain type. This list was incorporated bodily into
Finck's grammar, complete with errors, where Finck included dubious
words. Some words listed in Finck (1903), taken from earlier sources,
whose authenticity I doubt are: <karaw> "I pull out" (first listed by Bischoff
as "Kahraf, ich rcke aus", and attested for no other dialect, but included in
Rishi 1974); <k'eledo> "penance", that is, Bue, for Rue, also meaning
"soldier, beau, lover"; see a form such as Northern Polish Romani xalado
"soldier", Kraus and Zippel <chellado> "lover" in von Sowa (1898) and in
Wolf (1960: 133); <kora> "hour" (which some authors trace back to
Hungarian koran "late" (!) although German Sinto has no Magyar loans, and
any such listed for German Sinto are a sign that the source under consultation
is false; it probably derives from metanalysis of jekh ora "one hour" as *je
kora); and probably also: <dsajel> /dajl/ "to freeze", which is reputedly of
Indic origin (allegedly from Sanskrit jaayati), but unknown outside Sinto and
first noted by Bischoff; elsewhere "freeze" is expressed by forms of mraz-
from South Slavic, for instance Welsh Romani mzin "it's freezing", or by a
verb derived from pho "ice", from Greek, e.g. Lovari pahosarl.
Quite why Finck perpetrated such a fraud is a mystery, but deceive people
he did, and he attempted to cover his trail, as one sees from the preface to his
book (especially pages VII-VIII), where he tells of having consulted with
Sinti during his time as a professor at Marburg (1896-1900), then checking
his material with von Sowa's dictionary. The modern published lexica of
Sinto dialects show the soundness of von Sowa's instincts as to which items
were bogus, as they are not noted by scholars working with native speakers,
for example, being unlisted by Calvet and Formoso (1987).
A curious instance of misrepresentation of lexical data through erroneous
decontextualisation occurs in Wolf (1960), with several proper names from
Sinto. These names are taken from a published conversation in Sinto, taken
down in the late 1830s by the seminarian Tielich at the artificial Romani
colony set up by the Lutheran Church at Friedrichslohra in Thuringia, and
reproduced in Pott (1844-1845: 1: 491-497). The speakers to whom the cues
are attributed have German and Sinto names, thus the Sinto Anton is known
as Polla in Sinto, the speaker Wilhelm's name is given as Hater, the speaker
Franz has the Sinto name Kringla, and so on. These were individual people,
and they had individual Sinto names as well as German ones; however, Wolf
translates Polla as "Anton", for example, as if it were the stock translation of
the name, and as if it could be applied as a Sinto name to anyone who was
called Anton, which is not the case.
Borrowing at second and third hand was taken a stage further, when one
Eduard Hrkal compiled a dictionary of "Central European Romani" from a
series of dubious sources - Jesina for Carpathian, Finck for Sinto, Wlislocki
for the spurious Transylvanian dialect (Hrkal 1940), thus succeeding in
incorporating a number of forms, such as kunara, whose history of falsity
extended over a century and a half.
The cases of data contamination in the published records of Carpathian
Romani and Drindari are especially striking. Fr. Anton Jaroslav Puchmayer's
book on Czech Romani (Puchmayer 1821) is reliable, probably the first
wholly reliable source on European Romani; it is based on his own field-
work, and he does list separately the words collected by Grellmann which he
was unable to elicit.2 The Puchmayer vocabulary was incorporated by Rudolf
von Mitrowic, Graf Wratislaw in his unoriginal work on the Roma, where it
was mixed with Sinto data from Liebich and with Hungarian Romani and
other data taken from Grellmann. This in turn was used as a source by Fr.
Josef Jesina for his work; he incorporated mistakes as well, added a bundle
of forms from Hungarian Romani, and words from Hungarian, Czech, and
German, misspelt quite a few entries, and also borrowed the Angloromani
verb fordel for "forgive". There is a version of the Lord's Prayer in this
Romani in Jesina's book, incorporating the stolen verb for "forgive", with
the wrong imperative ending: if it were genuine Romani, it would be *forde,
not <ford>!
As to the dissemination of questionable forms in Balkan Romani, a group
which has so far had less dictionary work done on it than its diversity and
conservatism would warrant, one should mention the curious case of the
extensive Balkan Romani vocabulary published by Marchese Adriano Colocci
(1889), which is mostly Drindari, with interpolations in Turkish or Bulgarian
where the informant could not give a Romani word (or where these loans
were part of the consultant's parole). According to Gilliat-Smith (1926), this
was collected in the field by a Greek physician, Dr. Caramanos, and
communicated by him to the Greek scholar of Romani in Thrace, Dr. Alexandre
Paspati, who had intended to send it to Messrs. Smart and Crofton, who had
worked on the remnants of inflected English Romani in Manchester (Smart
and Crofton 1874; Grant 1992), but who then sent it to Colocci instead,
having annotated it with forms from his own collection. Colocci then
published the vocabulary in Caramanos' French-influenced spelling, with
Paspati's annotations from his fieldwork in his own (different!) spelling
system, making two orthographies in one work; he did this without saying
that there was dialect mixture in the work, that some data were taken from
Paspati's book (1870), or noting that he himself had not collected it.
My suggestion to anyone examining Romani vocabulary diachronically
(for example, investigating loan-strata and lexical loss in a dialect, or
compiling a synthetic lexicon of a given dialect) and coming across words
which have been lifted from another source, especially one covering another
dialect, is that they follow their more cautious instincts and list such terms
separately, drawing on their knowledge of the migration routes of the group
of Roma in question (bearing in mind, for instance, that there are no loans
from Hungarian or Romanian in the Kalajdzi dialects), and do not use them
as evidence in trying to clinch any theoretical point, while noting and
monitoring forms of dubious authenticity and cross-checking them with more
reliable sources and data from other dialects.
4. Lexical orphans in European Romani

Another important source of unusual vocabulary found in the lexicon of a
Romani dialect is the presence of lexical orphans, that is to say, words which
are attested in only one dialect, and which have not been passed to other
dialects from Common European Romani. All dialects of Romani contain
words which are not taken from the host-language as recent loans, and which
are not known in other dialects of Romani.
Several sorts can be distinguished: cognate orphans, which go back to
Indo-Aryan, loan orphans, and unique lexical innovations. Loan orphans can
be further divided into: early loan orphans, which are taken from languages
with which the Roma came into contact (one might subdivide these into pre-
European and post-European loans, the latter consisting of words from Greek
and South Slavic, languages with which all speakers of Romani were in
contact before the dialectal divisions occurred), later loan orphans, borrowed
from the languages of countries through which the Roma passed after the
European internal migrations began, en route to their present destination
(these words are especially useful in allowing one to track the migration
routes of groups), and of course the large number of loans from the host-
language which are peculiar to the Romani dialect(s) in contact with this
language. Unique lexical innovations are words for which we at present have
no convincing etymology, and they are found only in one dialect. Each dialect
usually has a few percent of such words in its lexicon; Caló has a very high
percentage indeed (at a rough guess, more than fifty percent). Most dialects
show instances of every sort of lexical orphan.
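
The taxonomy just outlined can be laid out as a small decision procedure. The sketch below is a reading aid only, written in Python under stated assumptions: the function name classify, the language labels, and the membership of the PRE_EUROPEAN set (Persian and Armenian are used as examples) are illustrative choices, not part of the classification as published, and the host-language given for Welsh Romani is likewise assumed.

```python
# Assumed example labels; only Greek and South Slavic are named in the text
# as the post-European early sources.
PRE_EUROPEAN = {"Persian", "Armenian"}
POST_EUROPEAN = {"Greek", "South Slavic"}


def classify(dialects, etymon_language, host_language):
    """Assign a word to one of the categories of lexical orphan.

    dialects        - set of dialects in which the word is attested
    etymon_language - source language of the word, or None if unknown
    host_language   - current host-language of the dialect's speakers
    """
    if len(dialects) != 1:
        return "not an orphan"
    if etymon_language is None:
        return "unique lexical innovation"
    if etymon_language == "Indo-Aryan":
        return "cognate orphan"
    if etymon_language == host_language:
        return "host-language loan"
    if etymon_language in PRE_EUROPEAN:
        return "early loan orphan (pre-European)"
    if etymon_language in POST_EUROPEAN:
        return "early loan orphan (post-European)"
    return "later loan orphan"


welsh = {"Welsh Romani"}
print(classify(welsh, "Indo-Aryan", "English"))  # naj "knee"
print(classify(welsh, "Greek", "English"))       # butsa "ball"
print(classify(welsh, "German", "English"))      # swegla "tobacco-pipe"
print(classify(welsh, None, "English"))          # stigl "alone"
```

The order of the tests matters: a word with no etymology is set aside first, so that the loan categories are only ever applied to words whose source is actually known.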
The most interesting lexical orphans from the Indologist's point of view
are what may be called cognate orphans, that is, words which can be traced to
an ancestral language (especially if they are attested in a language itself
ancestral to this ancestral language), but which are attested only within a
certain dialect or within a certain sub-group in one of its descendants. Thus,
an otherwise pan-Indo-Aryan word recorded only in one dialect of Romani
would be a cognate orphan, because Romani has descended from Middle
Indo-Aryan.
For example, in Welsh Romani, we find naj for "knee", as well as the
more common form derived from cang. Now naj meaning "nail", or in Vlach
dialects "finger", is well known, but naj meaning "knee", derived from
Sanskrit nalaka, is a unique retention in Welsh Romani, and we can be sure
that John Sampson, who recorded this word, did not make an error of
interpretation or attempt deception by including such a word in his work.
Loan orphans are loanwords recorded only in one dialect: thus in Welsh
Romani one finds such words from Greek as butsa, "ball", lutria "scullery,
place where dishes are kept", words which are not recorded in any other
Romani dialect. There are quite a few old loanwords exclusive to British
Romani (Welsh Romani and English Romani) to be found in Sampson's
work.
This raises the question of how far back up the Stammbaum one should
attribute old loanwords attested in a single dialect, and what criteria one
should impose when defining and delimiting Common European Romani
vocabulary. In his review of Rishi's dictionary (Rishi 1974), Terrence
Kaufman stated (Kaufman 1979: 135) that his projected pandialectal
dictionary of Romani would only include those European loans recorded in
more than one dialect group, and (by implication) loan orphans would also
be inadmissible; thus, by his criteria butsa and lutria would not be included
in his dictionary because they have not been recorded in other dialects. This is
on the whole a fair criterion, and I would add the rider that loans which form
part of the second or host-language of the Roma where a given dialect is
currently spoken, or which belong to a language also known to the Roma
speaking a certain dialect, should also be excluded, simply because practically
all the lexicon of the host-language is potentially Romani; thus Bulgarian
loans should be excluded from such a dictionary where dialects of Bulgarian
Romani are concerned, even if they are common to several subgroups, and a
similar remark might apply to Turkish loans in dialects spoken by groups in
Bulgaria whose second language is Turkish, simply because such loans are
innumerable and borrowing from Turkish is practised freely. With the
exception of a few loans into Ursari, Bulgarian loans are not found outside
Bulgaria, and Turkish loans in Romani dialects only otherwise occur if
mediated through Slavic, Romanian or Hungarian, as the Roma did not pass
through Turkish-speaking territory on their way to Europe.
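
Kaufman's threshold and the rider proposed above amount to a filter on attestations, which can be sketched as follows. This is one possible reading, not a definitive formalisation: the function name admissible, the pairing of each dialect group with the languages its speakers know, and the group and language labels in the examples are all assumptions made for illustration.

```python
def admissible(source_language, attestations):
    """Decide whether a European loan qualifies for a pandialectal dictionary.

    attestations - iterable of (dialect_group, languages_known_to_speakers)
    An attestation counts only if the loan's source language is not a host
    or second language of the speakers (the rider); the loan is admitted if
    at least two distinct dialect groups remain (Kaufman's threshold).
    """
    qualifying_groups = {
        group
        for group, known_languages in attestations
        if source_language not in known_languages
    }
    return len(qualifying_groups) >= 2


# Hypothetical records; group names and contact languages are illustrative.
butsa = [("British", {"English", "Welsh"})]
heel_from_patuna = [("Carpathian", {"Slovak"}), ("Balkan", {"Macedonian"})]
bulgarian_loan = [("Drindari", {"Bulgarian"}), ("Kalajdzi", {"Bulgarian"})]

print(admissible("Greek", butsa))
print(admissible("Greek", heel_from_patuna))
print(admissible("Bulgarian", bulgarian_loan))
```

On this reading a Greek loan attested in one group fails on the threshold alone, while a Bulgarian loan shared by several groups in Bulgaria fails because of the rider, even though it passes the raw count.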
On the other hand, historical circumstances have caused the situation of
the lexicon of the Vlach group of southern dialects to be unique: as a result of
the Roma on formerly Romanian territory undergoing four centuries of
slavery, and dispersion from the 1860s onwards, with loss of the knowledge
of Romanian among these groups, numerous Romanian loans are found in
the dialects of groups who have not spoken Romanian and who have not
been in touch with one another for over a century - Gurbetis in Bosnia,
Grebenarja and Kalburdzja in Bulgaria, Lovara and Kalderasa all over the
world, and under the above criteria a Romanian loan common to two or more
of these would be ripe for inclusion.
The score of Hungarian loans found in dialects spoken in the Czech
Republic, in Slovakia and in southern Poland would be inadmissible under
Kaufman's criteria since the dialects are all in the Carpathian dialect group,
nor does occurrence of these words in Romungro (whose host-language is
Hungarian) aid their case.
Apart from loan orphans, every dialect has a small number of unique
lexical innovations, words found in no other dialect, whose etymology is a
mystery, and which are not among the numerous recent loans from the
language of the host country. Sometimes these mysterious words, found in a
vocabulary, can be explained as being "ghost-words" coined by misreading a
previous record. Thus the strange word <okunjelus>, recorded by Zippel for
East Prussian Romani and meaning "hops", is a misreading of a handwritten
<chmjelus>, from Polish chmiel "hops", and Colocci's word <djuuri>
"soup" is a misreading for <djumi>, a form cognate with the pan-European
Greek loan zumi.
Other strange words attested in no other dialect remain loan orphans or
are just mystifying, unique lexical innovations. I shall give a few examples
from dialects of the Northern group, which shows considerable internal
diversity (far more so than is shown by the Vlach dialects, for instance), and
which was presumably the first one to split off from the body of Romani
speakers, with a time-depth of at least half a millennium (1417, the oft-
quoted date of the arrival of the Roma into Central Europe, may serve as a
terminus ante quem for its final split from the Southern dialects).
Thus, I have listed some from Welsh Romani, of varying etymologies,
including "unknown". These include: bilano "gimlet", naj "knee" (< Indic),
roxerel "he castrates", butsa "ball", lutria "scullery" (< Greek), dyta "pipe,
whistle" (< Slavic), swegla "tobacco-pipe" (< German), skuta "nut" (< Shelta;
this may in turn be based on Northern English /nut/ "nut"), stigl "alone",
pii "certainly" (the last two are of unknown etymology).
It must be pointed out that not all words of unknown etymology are
confined to one dialect. There are a number of pandialectal words, such as
gadzo "non-Rom", or por "tail", for which satisfactory etymologies have not
been found. And there are words confined to one subgroup, which form part
of the shared lexicon of that group, whose origins are mysterious.
Naturally there are loan orphans (often from French or Italian) and unique
lexical innovations in German Sinto, too, words such as matrli "potatoes",
for example, and an incautious approach to the study of the Sinto lexicon
might cause such words to be discarded, simply because they do not occur in
other dialects. In this case, evidence from the lexica of subdialects, such as
the Sinto dialects of France and northern Italy, is important. There is much
false material in many of the published Sinto lexica which must be sifted from
the genuine material, but we do not want to throw the baby out with the
bathwater.
There are also a number of lexical orphans in Finnish Romani. This
dialect shares many of the Greek and Slavic loans which are to be found in
Sinto and Welsh Romani, as well as having maybe 700 stems borrowed from
Swedish (e.g. vella "evening" from kväll, jego "one's own" from egen,
dzenom "through" from genom, ilako "evil" from archaic elak "loathsome"
crossed with illa "evil", and msava "to roam" from msa "to roam";
Thesleff 1901). Furthermore, there are some strays, isolated loans from
Scots (banaka "breadcake" from bannock) and Norse (zurupos "settlement"
from þorp). Some words from earlier strata, such as jinderdi "rainbow"
(recorded in the speech of the Lajenge Roma and rather fancifully claimed by
some to be related to Sanskrit indradhanu "Indra's bow") and celadoin
"swallow" (< Greek), are not found outside Finnish Romani.
Swedish was a source language for loans into Finnish Romani which, of
course, was not exploited by other Romani dialects, the Roma having reached
Finland after a long stay in Sweden; furthermore, Swedish was long a
prestigious language in Finland, and most Finnish Kaale (as they call
themselves: "black", like the Spanish and Welsh Gypsies), bear Swedish
surnames. Thus all Swedish loans can be said to be "loan orphans". The
words from Norse and Scots are the only words from these languages in
Finnish Romani.
Finnish Romani also had its unique unetymologisable elements, such as
phab "cheek, side of face", also found in Scandinavian Romani mixed
languages but not elsewhere, and also ummi "women's defensive weapon".
The former is a word of unknown etymology not found in any other
dialect; other dialects usually have reflexes of cham. The latter is also of
unknown etymology and seems to have been regarded by Thesleff as an odd
word, and it is not unique to Finnish Romani, by a hair's breadth. In a case
which shows the value to Romani philology of pre-modern materials, it
otherwise occurs only as <schiimije> "skewer", in the pre-1570 wordlist of
Johan van Ewsum of Groningen, Netherlands, our earliest example of a
Sinto dialect, and one of our earliest samples of Romani (Kluyver 1910-
1911). Thus it is noted both for Sinto and for Finnish Romani, and could be
attributed back to the Northern Division of European Romani, just as a word
of unknown source like paijar "case, cover, border, edge", shared by Finnish
Romani, Romnimos and Sinto, would be an item of shared lexicon in
Northern Romani.
Other dialects, apart from those here discussed, have their share of old
loanwords which are otherwise unknown in other dialects. Examples will be
drawn from the Greek-derived part of the lexicon. Thus Slovak Romani has
ten "an equal, a counterpart" from a form connected with the Greek verb
teriazo, which I have otherwise only seen in terenices "stalemate, a draw; an
equal result between two marbles-players", attested in the Kalajdzi dialect of
the Burgudzi in Macedonia, and kindly brought to my attention by Norbert
Boretzky; nor have I met Finnish Romani guduni "bell", also from Greek (the
etymon is kuuni) elsewhere apart from North Russian Romani. Yet another
Greek loan, timin "price, cost", occurs only in North Russian Romani, and
as a loan in Russian Kalderas (Demeter 1990). The Greek loan kockarida
"hiccough", from kloksos, is only found in Carpathian Romani, while forms
for "heel" derived from patuna are to be found only in Carpathian and some
Balkan dialects.
All of these are valid if not widely dispersed early loans into Romani,
from the Greek stratum of lexicon, the first truly European stratum in the
Romani lexicon, an important element which bridges the gap between
thematic and athematic grammar (older loans being assimilated into the
essentially Indic grammar), and which still awaits full documentation and
study.
Kaufman's criterion, that a European loanword only be included in a pan-
dialectal dictionary of Romani if it is attested in at least two of the twenty
groups, is fundamentally sound, but it might be tempered with some
historical hedging, especially in the case of loan orphans which are obviously
relics of the early migration of the Roma in that they are taken from Greek or
South Slavic. One may argue that a hard-line approach to the admission to the
shared lexicon of European Romani of loans noted only in one dialect group
can obscure matters of historical or cultural interest, as can overreliance on
data from a better-recorded subdialect.
I shall close with examples relating to the Greek stratum. Even a brilliant
work such as Sampson (1926) does not give a complete picture of British
Romani at all times, and omits some elements not recorded by Sampson and
which need to be supplied from other sources. Thus, we see that Welsh
Romani has the Greek word for "heaven", ranos, in ravnos, as well as such
words as valgra "market", from Greek ayor. Neither of these words is
found in other dialects. The pan-dialectal word for "town", foros, is not
found in Welsh Romani, where gav, elsewhere "village", is now "town", and
"village" is vlija, but Andrew Boorde collected it in 1547 for English Romani
in our oldest sample of Romani.
The Greek numerals borrowed into Romani are usually felt to be eft oxt
enj trinda sarnda pennda, "7, 8, 9, 30, 40, 50", which are still used in
many Romani dialects (though the word for "forty" has oddly dropped out of
Kalderas and is replaced by a compound form starvardes, while Alexandre
Paspati also noted eksinda for "sixty" in a Drindari-type dialect spoken at
Yambol in Bulgaria). There is no evidence for the Greek numerals in
Sampson's Welsh Romani, apart from desto xori "eighteen pence", though
"7, 8, 9" are listed in England by Jacob Bryant in 1784 (see Sampson 1910-
1911) and were apparently heard by him rather than copied from elsewhere.
But Welsh Romani has drika "dozen", from Greek /eka/, known in no
other dialect, while Kotel Drindari has dekapnde "fifteen", unattested
elsewhere (Donald Kenrick, personal communication). Could it be that once
most of the Greek cardinal numeral system (apart from terms for 1-6, 10, 20
and 100) was used in Romani?
5. Conclusion
As Grellmann pointed out over two centuries ago, the history of the
Roma is to be found in their language, and the length of their stay in various
countries can be gauged by the proportion of loans into Romani from each
source. An exploration of the mustier corners of the lexica of Romani dialects
will tell us much more about the history and travels of the various groups,
and of the relations obtaining between particular groups. However, when
studying the lexica of one or more Romani dialects, it is important to bear in
mind the ignoble tradition of plagiarism, which is especially strong in works
incorporating material from Grellmann's book. One need also bear in mind
that a few percent of that part of the lexicon of each dialect which is not taken
from the host-language is unique to itself, in other words, that it comprises
lexical orphans, and that these are often historically opaque and
unetymologisable but sometimes contain important cultural information. Careful
studies of the Slavic and especially the Greek elements, lexical and otherwise,
in Romani dialects would be especially valuable.
NOTES
I would like to thank Peter Bakker, Jette Bolle, Norbert Boretzky, Vít Bubenik, Jean
Cooney, Angus Fraser, Victor Friedman, John Green, Ian Hancock, Milena
Hübschmannová, Birgit Igla, Donald Kenrick, Corinna Leschber, Christopher Sheppard,
and Hein van der Voort for their assistance and encouragement in the research of which this
paper is an offshoot.
Najs tumnge!
1 It is through the dialect mixture in Grellmann's work that the Hungarian and most of
the Romanian loans (and the false Indicisms) "entered" the Sinto lexicon. There are no
Magyar loans in any Northern European Romani dialects apart from Carpathian ones, and
only a few Romanian ones. The early speakers of Northern Romani dialects apparently
went via Croatia and Slovenia into Germanophone territory in what is now Austria (and
maybe briefly into Bohemia) before entering Germany.
2 An opposite example is a word allegedly meaning "false", vingro, which is listed by
Puchmayer (1821: 51) as a word which he could not elicit from his consultant, but noted as
occurring in Grellmann's book; Grellmann also noted a word latschila, also supposed to
mean "false" and also unrecorded by careful observers. Grellmann took these words from an
early vocabulary, apparently an Angloromani list, which listed: Lachilo, vingro "false".
The intrusive comma led Grellmann to believe that there were two words there when there
was in fact a single word, a form of *lahelavngero "good-word person", that is, one who
says good words but who does not support them with good deeds, and thus a false person.
As one might expect, Jesina (1886) swallowed the two words whole, as lailo and vingo.
REFERENCES
Bischoff, Ferdinand (1827) Deutsch-Zigeunerisches Wörterbuch. Ilmenau.
Calvet, Georges and Bernard Formoso (1987) Lexique tsigane 2: le sinto
piémontais. Paris: PoF.
Colocci, Adriano (1889) Gli Zingari. Torino: Loescher.
Decourdemanche, Jean-Adolphe (1908) Grammaire du tchingané ou Langue
des Bohémiens errants. Paris: Geuthner.
Demeter, Ptr and Stepan (1990) Cygansko-Russkij Slovar'. Moskva:
Izdatel'stvo "Russkij Jazyk".
Finck, Franz Nikolaus (1903) Lehrbuch des Dialekts der deutschen
Zigeuner. Marburg: Elwert.
Gilliat-Smith, Bernard (1926) Some letters from Paspati to Smart and
Crofton. Journal of the Gypsy Lore Society, 3rd series, 5: 103-119.
Grant, Anthony Paul (1992) Linguistic obsolescence in 19th century
Lancashire: the case of English Romani. Paper read at First Manchester
Postgraduate Linguistics Conference, University of Manchester, March 14
(1992).
Grellmann, Heinrich (1783) Die Zigeuner. Ein historischer Versuch über die
Lebensart und Verfassung, Sitten und Schicksahle dieses Volks in
Europa, nebst ihrem Ursprunge. Dessau und Leipzig.
Hrkal, Eduard (1940) Einführung in die mitteleuropäische Zigeunersprache
mit Wörterverzeichnis. Leipzig: Otto Harrassowitz.
Igla, Birgit (1991) Probleme der Standardisierung des Romani. In: James R.
Dow and Thomas Stolz (eds.) Akten des 7. Essener Kolloquiums über
"Minoritätensprachen/Sprachminoritäten". Bochum: Brockmeyer. 75-90.
Jesina, Josef (1886) Romni cib oder Zigeuner-Sprache (Grammatik,
Wörterbuch, Chrestomathie). Leipzig: List und Francke.
Kepeski, Krume & Saip, Jusuf (1980) Romani gramatika. Skopje: Nasa
Kniga.
Kaufman, Terrence (1979) Review of Rishi (1974). International Journal of
the Sociology of Language 19, 131-144.
Kluyver, A. (1910-1911) Un glossaire tsigane du seizième siècle. Journal of
the Gypsy Lore Society, 2nd series, 4, 131-142.
Knobloch, Johann (1953) Romani-Texte aus dem Burgenland. Burgenländische
Forschungen 24. Eisenstadt: Burgenländisches Staatsarchiv.
Knobloch, Johann (1964) Bausteine zur Lexikographie der Zigeunerdialekte.
Anthropos 59, 128-158.
Kochanowski, Jan (1965) Gypsy Studies. New Delhi: Sata-pitaka 25, 26.
Liebich, Richard (1863) Die Zigeuner in ihrem Wesen und in ihrer Sprache.
Leipzig: Brockhaus.
Paspati, Alexandre (1870) Études sur les Tchinghianés ou Bohémiens de
l'Empire Ottoman. Constantinople: Koroméla.
Poplack, Shana and David Sankoff (1984) Borrowing: the synchrony of
integration. Linguistics 21 (269), 99-135.
Pott, August Friedrich (1844-1845) Die Zigeuner in Europa und Asien.
Halle: Heynemann.
Prince, John Dyneley (1907) The English-Rommany jargon of the American
roads. Journal of the American Oriental Society, 28, 271-308.
Puchmayer, Anton Jaroslav (1821) Romni Cib, cili ciknsky jezyk. Praha.
Rishi, Weer Rajendra (1974) Multilingual Romani Dictionary. Chandigarh:
Roma Publications.
Rüdiger, Johann Chr. Chr. (1782) Von der Sprache und Herkunft der Zigeuner
aus Indien. Neuester Zuwachs der teutschen, fremden und allgemeinen
Sprachkunde in eigenen Aufsätzen, Bücheranzeigen und
Nachrichten. 1, 37-84.
Sampson, John (1910-11) Jacob Bryant. Journal of the Gypsy Lore Society,
2nd series, 4, 162-194.
Sampson, John (1908-1909) Review of Prince (1907). Journal of the Gypsy
Lore Society, 2nd series, volume 2 (1), 74-84.
Sampson, John (1926) The Dialect of the Gypsies of Wales. Oxford University
Press.
Smart, Bath Charles and Henry Thomas Crofton (1874) The Dialect of the
English Gypsies. London: Asher.
Sowa, Rudolf von (1898) Wörterbuch des Dialekts der deutschen Zigeuner.
Abhandlungen für die Kunde des Morgenlandes, XI. Band., 1. Teil.
Leipzig: Deutsche Morgenländische Gesellschaft.
Thesleff, Arthur (1901) Wörterbuch des Dialekts der finnländischen Zigeuner.
Helsinki: Acta Scientarum Societatis Fennicae XXIX.
Wlislocki, Heinrich von (1884) Die Sprache der transsilvanischen
Zigeuner. Leipzig: Friedrich.
Wolf, Siegmund A. (1960) Großes Wörterbuch der Zigeunersprache (romani
tsiw). Wortschatz deutscher und anderer europäischer Zigeunerdialekte.
Mannheim. [reprint: 1987, Hamburg: Helmut Buske Verlag]
Wratislaw, Rudolf, Graf von Mitrowic (1868) Versuch einer Darstellung der
Lebensweise, Herkunft und Sprache der Zigeuner. Prague.
INTERDIALECTAL INTERFERENCE IN ROMANI
NORBERT BORETZKY
Ruhr-University Bochum
0. Introduction
Romani is split up into a great number of dialects, and we very often find
speakers of different dialects living in the same community or even in the
same neighbourhood. Therefore, we would assume dialect mixture to come
about on a rather large scale. The following is a first attempt at collecting and
systematizing the phenomena of dialect interference in Romani, but before
going into detail let us discuss the factors favouring or impeding interference.
It is plausible that linguistic contact should occur at places where speakers
of various groups of Roma live together. We are not in a position to claim
that this has been the case in Central and Western Europe at all times, but in
the south-eastern part of our continent where stable urban settlements existed
for many centuries different groups living together in one area can be
considered the normal form of settlement up to our days. In Serbia and
Macedonia we normally come across no less than three groups in one town.
In general, Gypsies live in towns rather than in small scattered villages.
However, today we find small groups or single families in typical villages as
well. Apparently, the type of settling depends on the Gypsies' way of earning
their living. Thus, in Kosova the Bugurdzides, traditionally blacksmiths, are
found in villages as well. There is rarely a village without at least one
Bugurdzi family. Greater towns have their ciganske mahale (Gypsy quarters)
where separate groups live next to each other but in different streets or sub-
quarters rather than totally mixing.
According to what has been said here we might expect a far-reaching
levelling of dialects or, at least, quite chaotic variation, but this is not at all the
case. On the contrary, it is more difficult to find instances of mixture or
mutual influence, even random mixture, than to determine recent dialect
distinctions. In the following we want to discuss some factors that might have
controlled the Roma's linguistic behaviour.
a) Religion. Official religion has no bearing on the linguistic behaviour
of the Gypsies. It appears that groups changed their religious denomination
quite easily, assimilating to the major religion of a given region. In Greece,
the majority of the Gypsies adhere to Orthodoxy, whereas in the regions
immediately to the north of Greece Islam is the prevailing denomination. In
northern Serbia, Vojvodina and Romania the Roma are Orthodox Christians
(except for more recent immigrants from the South). It is possible that
different religious affiliation prevented families from intermarriage, but religion
as such was no hindrance to language contact.
b) Tribal particularism. It can be assumed that Gypsies did not come
to Europe in a great undifferentiated bulk; it is more likely that, on their
way from India, they entered Europe gradually, in all likelihood speaking
slightly distinct dialects. Migrations and splittings resulted in the formation of
new groups, but this did not result in a global levelling between dialects.
Rather, linguistic differentiation increased during European times. There was
no common leadership, no political bodies, no common religion. Therefore,
it is understandable that particularistic feelings prevailed. The image of other
groups was (and is to date) composed of quite negative features: the others
do not behave in an accepted way, they steal, and, what is even more
important here, they speak in a strange way or even incorrectly - the correct
language is one's own dialect. The linking feature of not being a gadzo did
not suffice to create a positive common feeling or something like solidarity. It
only meant not belonging to the country's majority. It is evident that
particularism influenced marriage relations.
c) Intermarriages. It is reported that, for older times, marriages
between members of different tribes or groups were contracted very seldom. As
far as I know, for the middle-aged Bugurdzides in Kosova this holds even
today. In most of the families I became acquainted with, both partners were
Bugurdzides. One Bugurdzi at the age of fifty who had married a Gurbetka
was constantly being teased by his relatives. Among Bugurdzides there is a
saying: ha, pi, e vlahosa te n' ovel tu buci "eat and drink, but don't mix up
with Vlach-Gypsies". Certain groups of the Kalderas tribe are said to keep to
themselves very strictly, especially those who emigrated to France and to the
United States. This is not true, however, for people from North Serbia and
the Vojvodina with whom I became acquainted. Among them intermarriages
become more and more frequent. The reasons for this development are
manifold: increasing emancipation leading to individual choice of partners;
economic factors (e.g. a more moderate price to be paid for the bride); the
enormous fluctuation as well as the recent emigration opening up new
contacts. Nevertheless, it is my impression that marriages arranged by the
parents are overwhelmingly contracted within one group even today.
There seem to have been strict rules for mixed couples for a long time:
After the marriage is contracted the wife moves into the household of her
husband's parents, and she is expected to learn her husband's dialect as soon
as possible - which she normally does. There certainly remains some kind of
foreign accent, but lexicon and morphology are acquired correctly. Thus, it is
less likely that much of the mother's dialect is handed down to the children.
As a rule, speakers are aware of foreign forms, and they are therefore in a
position to consciously avoid them. Of course, exceptions do occur. I was told
about a Gurbetka married to a Bugurdzi who obstinately refused to adopt her
husband's dialect, which also affected her children's speech.
d) Use of contact languages in inter-group communication.
Speakers of closely related dialects normally communicate in their respective
varieties. This is true for the various Kalderas-groups, for Kalderas and
Lovari, and in the south for various groups of Arli. Misunderstandings can
be cleared up by discussion, but very often this is not even necessary, since
the differences between the dialects are known to both sides.
But whenever groups speaking very differing dialects come into contact,
and, what is more important, groups that have incorporated loan words from
different source languages, recourse is made to a common contact language.
A contact language is used especially in those cases where the groups do not
live in one region and do not have any experience with each other's dialects.
Kalderas and Arli, when meeting in Germany, have great difficulty making
themselves understood in Romani; they use Serbian instead, the reason being
that Kalderas has many important loan words from Romanian, whereas Arli
has adopted a lot of indirect turcisms, most of which are not known in
northern parts. The same can even be observed between Kalderas and
southern Gurbet (Kosova, Macedonia), although both belong to the Vlach group.
The situation is different, however, for the groups settling together in Kosova
and Macedonia. Here the Gurbet and to an even greater degree the
Bugurdzides try to adjust their speech to that of the more numerous Arli as much as
they can. In the long run these practices may lead to dialect mixing.
Where groups from different countries without a common language at
their disposal come into contact, they, of course, try to use Romani. When a
group settles in the neighbourhood of another group, the newcomers try to
learn as much as possible of the dialect of the established group, as expected.
Rajko Duric told me that this even happened with a group of Askalije from
Kosova who settled near Mladenovac (south of Belgrad) and came into
contact with members of the Gurbet tribe. Although the Askalije's native
language is (by definition) Albanian and they often deny being Gypsies, they
began to learn Romani and the second generation acquired quite a good
command of it.
e) Prestige. It is known that languages do not influence each other in a
random way; rather, the direction of interference is dependent on differences
in prestige. As far as I can see, differences of this kind between groups of
Roma and their dialects are either negligible or altogether absent. No Gypsy
would admit that another group has or should have a higher prestige than his
own. Economic and political factors do not play any role either, since no
group is in a position to make others dependent; Gypsies normally do not
employ other Gypsies; employers are the gadze and their institutions, e.g. the
Serbians, Macedonians, Albanians etc. in former Yugoslavia.
f) Number of speakers. The only factor that might influence speech
behaviour is the different numbers of speakers. Bugurdzides are few
compared to the Arli and to the Gurbet, and the Arli appear to outnumber the
Gurbet. It is my impression that Bugurdzides try to speak some sort of Arli
when addressing Arlis, but I am not sure if the Gurbet readily adapt
themselves to the Arli dialect. At any rate, it is not at all certain whether such
behaviour automatically provokes interference. To consciously incorporate
elements from other dialects would involve acknowledging those dialects as
better or older - and this is never conceded. In general, all Gypsies are very
much convinced that their own dialect is the purest and oldest, whatever this
may mean, and that other groups have a mixed or corrupted language. In this
respect, people with some education do not differ from illiterate people.
1. Problems with the identification of the dialects

Recent interference between groups in contact is rather easy to establish,
if there are related groups living at other places, which could serve as control
groups. It is more difficult, however, to uncover older, historical influence
between the dialects, because it is not at all clear what the major dialects were
like in early times. We do not know how to define the dialects linguistically,
which traits are old or which are regional innovations etc. It will be an
important task for the near future to develop a classification that is based on
linguistic criteria rather than on geographical factors and tribal denominations.
Obviously, this would be the best basis for analysing interference. In the
following, we want to discuss the importance of some isoglosses for
identifying the major dialects.
1.1. Older dialect isoglosses

Apparently, there are only a few distinctions that can be considered old
and that possibly stem from pre-European times. If we come across such
traits in dialects where we would not expect them, we may infer that they
have become part of the dialect by interference.
1.1.1. s-/h-
This alternation is found in the copula, and moreover in sar/har "how",
savo/havo "which", so/ho "what", i.e. in function words (interrogatives).
h- must be derived from s-, since this type of change occurs quite often in
various languages. In all likelihood it goes back to India, because the same
distinction is found between the new Indo-Aryan languages. Aspiration of s-
is most characteristic for Sinti and closely related dialects; in Arli1 it appears
in the copula only, along with s-forms:
hinjum, hinjan, hi (i) besides sinjum, sinjan, si/isi
In some varieties of Arli we have free variation, others seem to prefer one
or the other form. It is almost impossible to determine which form is the basic
one for which variety. Where forms with h- appear in other dialects, they
may have been introduced by interference (cf. 2.2.1.d).
There was an occurrence of h- in the speech of one Gurbetka (Pozarevac). In
this case, it is totally unclear how it has become part of the dialect.
This phenomenon must not be confused with the aspiration of final
(grammatical) -s occurring, for instance, in Bosnian Gurbet.
1.1.2. 2nd sing.past -an/-al and copula san/sal

It is difficult to imagine that one of the variants derived from the other
during the European development of Romani. There is a slight possibility that
-an may be the result of assimilation within the combination -al tu ("thou"),
but this seems rather unlikely. -al occurs in Sinti and other central dialects,
but there also seem to be Sinti-like dialects displaying -an - in all probability
an old interference (cf. 2.2.2.i).
1.1.3. 1st sing. past -om/-em

As far as I know, the second variant can be found in Vlach dialects only,
and, it seems, in all of them. Therefore, wherever we come across dialects
apparently belonging to the Vlach group but having -om instead of -em, this
may have been brought about by interference. Thus, dialects in Romania,
especially the Bukovina dialect (Miklosich 1874), but also in the Ukraine and
South Russia (Barannikov 1934: 95, 98) display -om 2 , although the
Bukovina dialect comes close to Kalderas in most features (cf. 2.2.2.k).
It is important to determine which of the two forms is the older one;
probably -om, because -em can be derived from it by umlaut (assimilation),
i.e. -j-om > -em. We are faced, however, with a more general problem: was
there a proto-Vlach that has been imported to Romania as such, or did this
type of Vlach only emerge in Romania? If the second assumption holds, the
-om of the Bukovina dialect should be considered an archaism.
1.1.4. -nr-/-nd(r)-/-rn-/-r-
This variation seems to be less old than the aforementioned, but it is not a
recent one either. The various forms can be derived from one another
phonetically in the following way:
a) -ṇḍ- (Indian) > -nd- (preserved in Prilep and in one of the varieties
described by Paspati 1870, the dialect of the nomads; cf. Tableau
comparatif 118ff);
b) -ṇḍ- > -nṛ-, cerebrality being preserved in the sound of [ṛ] (in Kalderas,
partially in Lovari and Gurbet), and further
c) -nṛ- > -nr- with loss of cerebrality (Dzambazi, older Sinti; cf. Bischoff
1827);
d) -nṛ-/-nr- > -ndr- (occasionally found in Greece, Bulgaria, and the Ukraine);
e) -nr- > -rn- (in northern Gurbet, in Bosnian Gurbet; cf. Uhlik 1942);
f) -nṛ- > -r (Arli, Bugurdzi; central and northern dialects, among them Sinti,
cf. e.g. Finck 1903; Drindari, which is closely related to Bugurdzi, as
well as Erli have the strong cerebral instead, cf. Gilliat-Smith 1914).
On the whole the reflexes of -nd- have a clear-cut and plausible
distribution, but some details are rather puzzling: southern Gurbet has -nr-, whereas
the northern variant has -rn-. Can this be explained as a late metathesis of -nr-
> -rn-? The same distinction can be found within certain subdialects in Greece
(Vlach-type; cf. Igla 1989).
1.2. Innovations
The problem with innovations in general is that they may either have
come about independently in different dialects, or else they owe their
existence to contact. It is less convincing to assume a simultaneous initiation of a
given sound change in an area of interrelated dialects. If similarities are not
the result of common change, but of simple chance, then, of course, they
cannot tell us anything about (historical) dialect coherence.
1.2.1. The loss of final -s

The loss of grammatical -s occurs in the Balkan dialects of Romani to a
varying degree. There are no instances of it in Bugurdzi (as well as in many
central dialects); it is weakly developed in Kalderas and Lovari, and also in a
dialect spoken in Prilep (Macedonia), which is similar to Arli with regard to
many other features; it is more developed in northern Gurbet, and
consistently carried out in Arli and in variants of southern Gurbet. It is
rather unlikely that this phenomenon can be utilized in order to classify the
dialects historically, and it may not be the result of interference either, since a
sound change of this type may very well occur at different points
independently, although the beginnings of this process in Romani may be very
old.
As for the differences within Gurbet it is worthwhile to assume
interference. The northern variants show -e < -es in the obliquus masc., but have
preserved -es in the 2nd person sing., whereas in the southern variant this -s
was dropped too. The latter may have happened under the influence of Arli
(South Serbia, Kosova, Macedonia), although an internal development cannot
be ruled out with certainty. The state of Bosnian Gurbet, where -s has
changed to -h or disappeared altogether, points to the second possibility rather
than to interference; cf. 2nd sing. dzane, te mudareh. Sometimes even
word-internal grammatical s becomes h (leste > lehte etc.).
1.2.2. The -el/-ol isogloss

For the 3rd sing. -el can be considered the old formant, first because of
its general appearance in the dialects, second because of its accordance with
2nd sing. -es and 2nd/3rd plur. -en, and third because of its etymology (-el <
Old Indian -ati is regular). The other variant, -ol(a), is found alongside -el(a)
in Arli and in the dialect of Prilep, but also in southern Gurbet and Dzambazi,
and there perhaps even more consistently. It is, however, absent in Prizren. It
is remarkable that speakers of Arli acknowledge -el(a) as a possible form,
although they prefer -ol(a). How is this process to be reconstructed? Did it
start in Arli and spread as an ongoing change over the neighbouring dialects
of Gurbet/Dzambazi and others, or has -ol(a) been borrowed from Arli in a
later process? In the speech of southern Gurbet -el seems to be absent, but
this need not mean that the change is older in Gurbet than in Arli (cf. -el in
the northern variety of Gurbet). The behaviour of Prilep speakers is rather
unusual: they consider -el to be the correct form, but nevertheless prefer -ol in
their speech. I am tempted to take this as an indication of a recent influence on
the part of Arli and/or Dzambazi. It should be noted that the Paspatian dialect
of Greece, which is close to that of Prilep, shows no traces of -ol (cf. Paspati
1870). From all this we can conclude that at least in Prilep -ol came up by
interference, maybe after a longer-lasting influence from Arli and Dzambazi,
whereas the speakers of the Prizren dialect (Kosova) preserved -el because
they immigrated in more recent times.
The reasons for the change -el > -ol are by no means clear. The process
may also have been an assimilation of e to velar l.
2. Interference phenomena
2.1. Planned, intentional, conscious interference
2.1.1. Written sources
Interference can be brought about consciously, or it can occur as the
consequence of an unintended act. We find instances of the first type when
individuals or groups try to standardize Romani in order to use it as a broader
means of communication, mixing elements from different dialects. We also
find it when local standards are to be developed on the basis of one dialect,
incorporating elements from other dialects that have become obsolete in the
basic dialect or that do not seem to be very distinctive; cf. the following cases:
A. Saip Jusuf of Skopje, a Dzambaz by origin, chose Arli as the basis for his
grammar (Kepeski & Jusuf 1980), but he managed to avoid extreme forms of
Arli, and he adopted some elements from other dialects, although in quite a
random way; cf. gndinela "to think" from Kald. gndil (a romanism); khonik
"nobody" and khanci "nothing", extinct in all Arli varieties known to me,
from Dzambazi; po, pi "on, at, above" also from Dzambazi; maj
(comparative) from Vlach, along with po- (comparative) and naj- (superlative), which
are borrowed from Macedonian3. As for the copula, forms beginning with h-
have been avoided, since they differ too much from Vlach and other dialects.
B. Contrary to this procedure, Ali Krasnici of Prishtina used his local variety
of Gurbet when writing his short stories (1981 and 1986). He even preserved
such idiosyncratic forms as munro "my" instead of the more widespread and
better understandable miro, moro, mo. To be sure, there are foreign elements
in his language, but it is difficult to determine what their status is:
a) are they older elements that have become firmly integrated into the
dialect; or
b) are they elements taken up unconsciously; or
c) are they utilized consciously in order to enrich the language?
In the following we will try to evaluate the importance of some of these
elements.
a) Demonstratives with word-initial k-, e.g. kava, fem. kaja "this", kova
"that", kate "here" appear along with others beginning with g-: gova goda
godo "that", godolese "therefore", g(e)ja "thus", gasavo "such", gothe
"there". It is possible that forms containing k- gradually have been
assimilated into the dialect which originally had g-forms only (cf. northern
Gurbet). On the other hand, we find k-forms in Bosnian Gurbet (Uhlik
1942), but here as well it is not clear whether this is a result of dialect mixture
or not. Also, it is possible that Gurbet never lost the k-forms totally.
Although it is difficult to present clear-cut criteria, I believe that the k-forms
in the language of Krasnici are of foreign origin.
A morphological case is the past of inklel "go out, go up"; in Vlach it
should be inklisto (northern Gurbet, Kalderas), but here we have ikislo,
ikisli, which in all likelihood derive from Arli.
b) In hutlo "he jumped" the cluster -tl- is borrowed from Arli, xuklo
being the correct Vlach form. It appears to be a rather unconscious
borrowing, brought about by the author's familiarity with the other dialect. The
normal Vlach form of the verbal (and deadjectival) noun is -ipe (-ape), but in
the texts -ipa is also used; cf. darajpa "fear", sastipa "health". -ipa might be a
contamination of -ipe and Arli -iba, but since even -ipa is not unknown in Arli
(e.g. in the Arli of Gnjilane), it may have been borrowed as such. Again, we
cannot say if forms ending in -ipa are in common use in southern Gurbet, or
if we are confronted with individual borrowing here.
For "forget" we find bistrol, but also a metathesized form bristol. The
latter is an idiosyncratic form; the better known bistrol may have been used
unintentionally.
c) punro "foot" is the form we expect in this dialect, and it seems to be
used throughout, but for the derivation "step" we find prnalo < prno, which
is common in northern Gurbet. It is likely that prnalo is taken from another
dialect, since the derived word was not available in the author's own dialect.
A rare word is aindz "field", normally used only in Bugurdzi. The author
maintained that he heard it from older Gurbet speakers, but he conceded that
today it is unknown to speakers of his dialect. This is clearly a case of
conscious enrichment.4
prandosarel (pe) "to marry" is normally used in Arli and Bugurdzi, but it
did not belong to the inventory of Vlach. In all likelihood the author
borrowed it from Arli, possibly because the Vlach romanisms mrtil (pe) and
insuril (pe) have become obsolete in his dialect.
C. In the language of Rajko Duric (Belgrad) a certain development can be
observed. Whereas in his earlier poems (1980) he followed his own Gurbet
dialect rather strictly, in the later ones (1982) he chooses neutral forms that
are more accessible to speakers of other dialects. In the poems of 1980 we
encounter many palatalized forms such as nachel "to pass" instead of nakhel,
which constitutes an unfavourable sound alternation (nakh-av, but nach-el);
or we have genitive and dative forms without -k as in abijav-es-e "wedding"
instead of abijav-es-ke, dat. and gen.fem. to masc. abijav-es-ko, by which
the distinctiveness of the case formant is reduced. However, there is a good
deal of variation; cf. abijavese along with vakarimaske "speaking, speech"
and dichimata "look" besides dikhel. Contrary to this, in the 3.p.sg. -el is
used throughout, although -ol is not wholly unknown to northern Gurbet. In a
publication from 1982 the strong palatalization of Gurbet has consistently
been abandoned; however, there are still some choices that clearly challenge
dialect levelling; cf. forms with unpalatalized l in dikhlan "you saw" instead
of the more common dikhljan, cirikla instead of ciriklja, plur. of cirikli
"bird". Also, it does not seem sensible to omit the -s in the formant of
acc.sing.masc. -es, e.g. chave for chav-es, which thus becomes
homonymous with nom.plur. chave. Forms like brokh "breast", an idiosyncratic
variant for the much more widespread brekh, are another indication of the fact
that an ideal language form has not yet been found.
2.1.2. Oral texts

Intentional interference is found in oral communication as well. Some
speakers adapt themselves to the dialect of another speaker either because
their own dialect is less common and they want to increase their chances of
being understood, or else because they have a better command of the other
person's dialect than the other person has of their own dialect. A total switch
between two dialects would be of little interest linguistically, and it hardly
ever occurs.
A. The following material is taken from a story told by a Dzambazka from
Kumanovo. The language is interspersed with Arli elements, probably
because my companion addressed her in something similar to Arli. In my
opinion this is an ad-hoc mixture rather than a blending of dialects that has
become traditional in Kumanovo. The following elements appear not to be
indigenous, but taken from the local Arli.
a) Phonology:
- bers, -bersengiri (Arli) alongside bres (Dzambazi);
- maro alongside manro/mangro;
- tikni alongside cikni;
- dive alongside g'ive (phonetically uncertain);
- on, oj alongside von, voj (uncertain);
- k instead of g in demonstratives: kava, kal, okoja, kate alongside gova,
gothar; in all likelihood this variation is quite common.
- pro "foot" alongside prno, punro
b) Morphology:
- definite article fem. i instead of Vlach e (probably the only possible
form);
- genitive -kiri etc., cf. efta-bersengiri, lakiri alongside lengi;
- preposition tar "from", with article tar-o, tar-i (tari mi chej) alongside
katar (katar o Munster);
- preposition ko, ki "in" (ki Germanija) besides an-o, an-i (ano lil, ani
strand); apparently, this alternation has become customary;
- preposition ko, ki "on, at" (ke staklje "on crutches") besides po, pi
(po aviono);
- preposition kare/kara "at, near" (kare mande, kare late) instead of kaj
or the pure locative;
- copula past tense ine besides sa;
- copula subjunctive te ovol instead of avol (uncertain form);
- present ending in -a (dzava, kerola) besides the shorter form (kerol),
which is much more frequent. The longer forms with -a may represent
remainders of the imperfect in -as as well.
- 1st sing. past tense vakerom "I said" (only once, uncertain) alongside
the normal -dem;
- pluperfect formed by the past + copula past: sar ali naakari (i)ne "how
she has/had come here" (perhaps usual in the local Dzambazi);
- past ikljili instead of inklisti "she came out" (paradigmatic levelling or
interference? 5 );
- past alo alji "he/she came" besides aviljem avilo avilji (n' avio,
n' avlji "was not" is indigenous!). As for the variation alo/avilo it is not
clear if the contracted form has been borrowed ad hoc or has become
usual.
c) Vocabulary:
- po-zala, zalagica "something, a little" instead of cora, xanci etc.;
- herbuze "watermelons" instead of lubenice (herbuze, lubenice te
vakerav pravo amaro "h., l. - to use our own word"). Both of them
are loans but originate in different languages (herbuze < Turk.,
lubenice < Serb.).
As has been mentioned above, some of these phenomena are now firmly
integrated in Dzambazi; they are not random borrowings in this text.
B. Something similar happened when Rade Uhlik let a Bugurdzi from
Urosevac/Ferizaj tell him a fairy tale. The story-teller knew that Uhlik was a
fluent speaker of Vlach (Bosnian Gurbet) and tried to adapt himself to this
dialect by sporadically, but rather consciously, using Vlach forms; cf.
- short genitives: biparen-go "without money", daja-ki "mother's"
(fem.) instead of -oro, -iri;
- non-palatalized past forms: dikhla "saw" instead of dichas;
- nan(e) alo "was not/didn't become", which might be a contamination
of Gurbet n'avlo and Bugurdzi na ulo;
- definite article fem. e in e ra "at night" instead of r,
- reflexive plur. pumende "to themselves", which is Arli rather than
Vlach;
- of dakerla and dodarla "wait" only one (probably the first) should
be indigenous.
C. The situation is different with a text recorded by Kostov (1962) in
Ihtiman, Western Bulgaria. As Kostov himself stated, the dialect has a Vlach
basis (cf. such typical Gurbet forms as negat. ni, plur.instr. vojnicenca "with
the soldiers" from rum. pl. voinici with [c']) but we find a lot of elements that
cannot be original Vlach.
a) Phonology:
- mandro "bread" with ndr (an archaism?);
- e borja acc. "the bride" with preserved j (possibly an archaism);
- hijas chivgjas chivde "to throw" (non-Vlach) besides huv chudas;
- haj "daughter" (non-Vlach) instead of hej;
- iklistilo "to go out", ikalel "to take out" instead of inklisto inkalel;
- cid- "to draw" (non-Vlach) besides crd-;
- kaxni "hen" besides kajni;
- phanle "to bind" (< phandle) (non-Vlach) instead of phangle.
b) Morphology:
- grammatical -s in:
  nom.sing.masc. prxos stlos (from Drindari?)
  nom.plur.fem. prtes (from Drindari?)
  3rd sing.past xaljas (an archaism?)
- definite article fem. i: i lisica, and-i furnja instead of e;
- definite article plur. o: o (portes), and-o (khera), o (gurv);
- demonstrative plur. kal "these" (Gurbet should be kal; may be an old
form);
- copula som, past 1st sing. -gjom (non-Vlach; cf. the Bukovina texts
of Miklosich 1874);
- cidingjas "drew", past tense extended in -in- (typically Arli);
- copula past sins (sic!) (non-Vlach) alongside sas;
- present with -a: dan-a "they go" (non-Vlach);
- loan verbs in -in-: mislinel, cudinel pes, cidingjas alongside isprat-isar-
gjas (Arli);
- verbal noun in -ibe: maribe.
c) Vocabulary:
- xri (and zlak) instead of xanci, cora etc. "something, a little" (xari
is non-Vlach).
For this text we have no indication whatsoever that the storyteller mixed
forms of different dialects in order to make himself better understood. It
seems that this is a dialect in which the quoted interferences occur quite
82 NORBERT BORETZKY
normally. This can be taken for granted in those cases where the text has no
alternatives, as for instance with grammatical -s, with the feminine article i,
with 1st sing.past -om, and the expanded past forms in -in-.
2.2. Stable interference phenomena

The main concern of this paper is to inquire into those cases of
interference that have become firmly established. To be sure, there are some such
cases, but it is not always easy to produce convincing proof of borrowing.
2.2.1. Phonology
a) Demonstratives beginning with g- are considered typical of the Gurbet
group of dialects. Here the older Indian k- has been replaced by g- for
unknown reasons. Now, a Pristina informant for Gurbet gave me the k-
forms as the normal ones in his dialect, for instance kava kova kote, although
in the text of a song forms with g- were used as well. I would like to interpret
this state as a free vacillation that has come about under the influence of the
prevalent Arli. (Cf. what has been said above about the distribution of k- and
g- in the texts of Krasnici; 2.1.1. B)
b) Palatalized k g appears in Macedonian Dambazi as [k' g'], whereas in
the closely related Gurbet either [6 dz] contrasting with /c dz/ (northern
variant) or [c d] (Kosova) is found. Since the Gurbet as a Vlach tribe must
have immigrated from the North, they should have brought to Macedonia an
already palatalized variant of the [6 dz] type. Under the influence of the local
Albanian and Serbian these have merged with old /[c d]/, but in Macedonia
in all likelihood the earlier sounds [k' g'] have been restituted. This
reconstruction is corroborated by the fact that the velars are lacking in original
clusters containing s; cf. les-e < les-ke "him", les-i < les-ki "her" possess.
The only way of explaining this discrepancy is to assume an intermediate
stage [lesk'e], since the k could not have been dropped in a form with velar
[k], i.e. in [leske]. The restitution of [k' g'] in Dambazi has been effected
under the influence of Macedonian, but probably supported by Macedonian
Arli, which as a Romani dialect displays the same words (cognates) as
Gurbet. This might be called etymological support. The reconstruction
scheme runs as follows:
kinel > k'inel > cinel (northern Gurbet) > cinel (Kosova Gurbet);
> k'inel (restitution in Dambazi);
leske > lesk'e > *lesce (earlier Dambazi) > lese (both in Gurbet and
Dambazi without restitution).
c) The Arli of Gnjilane/Gilan (eastern Kosova) is distinguished from
other varieties of Arli (Macedonia, and most of Kosova, too) by a peculiar
isogloss. Whereas in Macedonia, for instance, in the varieties of Skopje and
Stip as well as in the dialects of Prilep (Macedonia) and Prizren
(southwestern Kosova) and elsewhere, sk- before palatal vowels has been
preserved or has yielded [sk'], k has been lost in Gnjilane. Thus we find
les-iri instead of leskiri "her" possess. and les-e instead of leske "him", the
intermediate stages to be reconstructed being *lesciri and lesce. In analogy to
this even the masculine form les-koro has been reshaped to les-oro, an
exceptional phenomenon to be observed in this dialect only.
How can this development be interpreted? In Gnjilane and its
surroundings, where speakers of Gurbet must have been more frequent than those of
Arli, the velars underwent a palatalization process under Gurbet influence. In
the end, k g before e and i resulted in [c d], and in -sk- the k disappeared
altogether, as in Kosova Gurbet. The reshaping of leskoro > lesoro however,
which has no model in Gurbet, can be characterized as a kind of overreaction,
a morphological analogy rather than a phonetic process. It should be noted
that overreaction happens quite often in language contact, with the result of
the influenced language developing further than the influencing one.
If we compare this with the case presented under b) we can see that
influence has gone in different directions and has had a different impact: in b)
Dambazi (Gurbet) was influenced by Arli, in c) Arli was influenced by
Gurbet. Whereas in Macedonian Dambazi the process is a partial restitution
that can be conceived of in purely phonetic terms, in Gnjilane Arli the process
can be characterized as the transfer of a sound change plus a morphological
(analogical) expansion. (Sporadic forms in the Gnjilane texts like the Vlach
past dzeljem instead of dzeljum are also an indicator of interference on the
part of Gurbet.)
d) An unsolved problem is the distribution of s- and h- in the copula
forms of Central European Romani dialects. German Sinti and related
varieties, but also the Hungarian Cerhari dialect (Mészáros 1976: 352)6 have h-
throughout; cf. Holzinger (1993: 108f):
        pres.          past
        sg.    pl.     sg.    pl.
1st     hom    ham     homs   hams
2nd     hal    han     hals   hans
3rd     hi     hi      his    his,
but a Sinti variety of Piemont (Senzera 1986) presents two paradigms, one of
them with s-:
som sal si besides om al i, apparently < hom hal hi, h- being lost under
the influence of Italian. In the Sinti of the Slovenian Prekmurje, neighbouring
villages differ as to s- and h- (Igla, personal communication; Štrukelj 1980),7
and other dialects mix both forms in one paradigm. There are Central and
North European non-Vlach dialects, however, that have preserved s-
throughout (for North Russia cf. Wentzel 1980; for Slovak and Hungarian
dialects cf. von Sowa 1887: 92f, and Hancock 1990; for South Poland cf.
Kopernicki 1930). Data are presented in the following table:

          North-Russian       Slovakian            Southern Poland
          pres.    past       pres.    past        pres.       past
1st sg.   som      somas      som      somas       som         somas
2nd sg.   san      sanas      sal      salas       sal         sal
3rd sg.   (i)s1    (i)s1s     hi/ehi   ehas/has    hin/ehin    sas/esas
1st pl.   sam      samas      sam      samas       sam         samas
2nd pl.   san      sanas      san      sanas       san (esan)  sanas
3rd pl.   (i)s1    (i)s1s     hi/ehi   ehas/has    hin/ehin    sas/esas

The texts by Kopernicki (1930) reveal that hin/ehin functions as a past
tense form as well (cf. the Arli past sine), and further that nane means both
"is not" and "was not". It is not clear whether this dialect has a variant nasas
(or a similar form) or not.
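
The distribution of s- and h- in the present-tense paradigms quoted above can also be tabulated mechanically. What follows is only a minimal Python sketch and not part of the original analysis: the names PARADIGMS, onset_profile and h_slots are hypothetical, the forms are those cited in the text, and where the sources give variants (hi/ehi, hin/ehin) only the first variant is entered, which is a simplifying assumption.

```python
# Present-tense copula forms in the order 1sg, 2sg, 3sg, 1pl, 2pl, 3pl,
# as quoted in the text. Assumption: for variant pairs such as hi/ehi or
# hin/ehin only the first variant is entered.
PARADIGMS = {
    "German Sinti": ["hom", "hal", "hi", "ham", "han", "hi"],
    "Slovak": ["som", "sal", "hi", "sam", "san", "hi"],
    "Southern Poland": ["som", "sal", "hin", "sam", "san", "hin"],
}

PERSONS = ["1sg", "2sg", "3sg", "1pl", "2pl", "3pl"]


def onset_profile(forms):
    """Return 's-', 'h-' or 'mixed' according to the initial consonants."""
    onsets = {form[0] for form in forms}
    if onsets == {"s"}:
        return "s-"
    if onsets == {"h"}:
        return "h-"
    return "mixed"


def h_slots(forms):
    """List the person/number slots whose form begins with h-."""
    return [person for person, form in zip(PERSONS, forms) if form.startswith("h")]


for dialect, forms in PARADIGMS.items():
    print(dialect, onset_profile(forms), h_slots(forms))
```

On these data German Sinti comes out as h- throughout, whereas the Slovak and Southern Polish paradigms are mixed, with h- confined to the 3rd person.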
How did these paradigms come about? If the s-/h-isogloss is old, perhaps
going back to Indian times, the mixed paradigms could have been generated
by interference. If the sound change s- > h- is younger, and took place during
the European period, the change could be interpreted as a gradual process,
i.e. as an internal reduction of the most unmarked form, the 3rd person si >
hi, which was later transferred on to other forms of the paradigm. In the
Slovak dialect the change stopped here, whereas in German Sinti the h-onset
extended over the whole paradigm.8 But even if the second interpretation is
historically correct, the copula paradigm of some subdialects, for instance that
recorded by Kopernicki, should have been formed by interference rather than
by internal development only, although one can imagine a sound change
affecting the 3rd person of the present tense but not the closely related past
tense form. Moreover, other doublets in the Slovakian dialect do point to
interdialectal interference; cf. so and ho "what", sar and har "how", but
perhaps only havo "which". A possible source for the restitution of the s-
forms is Lovari.
In the Hungarian Vend dialect (Vekerdi 1984: 72), which roughly
follows the Slovak paradigm, a specialization of si and hi has taken place: hi is
"is", whereas si means "have" in the construction si man etc. "I have". It may
well be that reduction began with enclitic si "is" but did not extend to fully
stressed si in si man etc.
To sum up, it can be said that aspiration must have taken place on
European soil, at least in some dialects, and that not all mixed or varying
paradigms owe their existence to interference.
In Vlach dialects, for instance in Lovari and in Bosnian Gurbet, a variant i
/-j' < si can be found as well, but this reduction is restricted to the 3rd person.
e) The change čh dž > ś ź can be taken as a feature distinguishing
Kalderas and Lovari from other dialects. In dialects spoken in West Ukraine
(cf. Barannikov 1934) forms with čh dž alternate with those containing ś ź,
which clearly points to interference. I am not sure as to the basis of those
dialects, but it seems to me that we are confronted with a deformed Kalderas.
The newly emerged ś ź are rather marked sounds, and it is less likely that
they should have been transferred into other dialects.
f) In Bugurdzi the word "day" has the forms zis and zies < dives. A
Kovac from Skopje cited the form zives, which might be a contamination of
indigenous zies and Arli dive.
g) The normal form for "town" in Bugurdzi is ziz. A speaker from
Southern Kosova used diz instead, apparently under the influence of Arli,
which prompted another speaker to blame him for not speaking the pure
dialect.
2.2.2. Inflectional morphology and related phenomena

There are a number of phenomena to be discussed here.
a) The definite article in the plural in Vlach is e throughout, but in the
Gurbet of Prishtina o is used in free variation with e. The o clearly goes back
to Arli.
b) The definite article fem.sing. in the Gurbet of Prishtina reveals
influences from the same dialect: it should be Vlach e, but what we have is
non-Vlach i instead.9
c) The reflexive pronoun pes- "himself" in Lovari has developed a plural
form pen- by way of analogy with 3rd sing.masc. les-ko "his", fem. la-ko
"her" and 3rd plur. len-go "their" (cf. Pobozniak 1964: 49). While, in
general, this innovation did not spread to the closely related Kalderas, we
find it in a variety spoken in the Vojvodina. Surely, the speakers do not call
themselves Kalderasa, but they are certain about not being Lovara. Moreover,
they speak a dialect which differs from Kalderas varieties spoken in Romania
and in Serbia south of the Danube river in unimportant details only. I assume
that plur. pen- has been adopted through contact with Slavonian Lovari
dialects.
d) The preposition pe "on, at" became obsolete in practically all varieties
of Arli, and in most of them the preposition an "in" is not in use anymore,
both being replaced by ke, originally "to, at" (with article ko ki ke). This
process appears to have spread to Kosova Bugurdzi, where pe never occurs,
and an is randomly replaced by ke (but not the other way round). What is
noticeable in this case is that distinctions which appear useful if not necessary
have been abandoned because of pressure exerted by a more important
dialect. Such cases are by no means unusual in language contact.
In Macedonian Dambazi we find a comparable situation. Here ke also
substitutes for an, but ke is used to an even greater extent; cf. ki bolnica "in
the hospital", ki Germanija "in Germany"; ke staklje "on crutches". Maybe
substitution even operates vice versa, i.e. sometimes an appears where at an
earlier stage only ke was admitted; cf. potpisisajlji ano ljil "she signed the
paper", which may be interpreted "she gave her name on the paper"; ljilja ...
ani strana te phirol "she began to limp", literally "she began to walk to the
side"; ni primin tu ani buki "they don't employ you", literally "they don't take
you to work". The examples are not fully conclusive, though.
e) Dambazi as a Vlach dialect has the Romanian-derived comparative
marker maj, but alongside it po (for the comparative) and naj (for the
superlative) as well.10 Since po and naj are Slavic elements we cannot say
with certainty whether they are borrowed from Slavic (Macedonian) directly
or from Arli, where they have become the only means of comparison,
presumably a long time ago. Even a third possibility, i.e. a joint influence
from Macedonian and Arli on Dambazi, cannot be excluded.
f) In Bugurdzi sine "was" is the normal form of the copula past and, as
expected, na-sine the negative form of it, but I heard older speakers using na
sas, i.e. a Vlach form. It is rather unlikely that the Bugurdzi should have
borrowed it from Gurbet, firstly because the Gurbet form of the copula is sa, not
sas, and secondly because it is Arli that serves as a source for borrowings in the
first place. (The Arli, i.e. Turk. yerli "resident, local", have been settling in
the South of the Peninsula presumably since the Roma's arrival in Europe,
whereas the Gurbet/Dambaz re-immigrated from the North.) Therefore, we
have to assume that sas in na-sas is an older form of the copula that survived
in combination with negative na- alone. This means, however, that sine is an
innovation in Bugurdzi that might well have been taken from Arli. The
hypothesis of an Arli origin of sine is corroborated by Drindari sjas (Kenrick
1967:78). Judging from a number of common phonetic changes both dialects
must have had a common ancestor.11
g) In the Drindari of Kotel, the subjunctive of "to be" is formed with ach-,
originally "to remain", but besides it ti ovil (past ulu) can be found. Kenrick
(Ph.D. Thesis) assumes that ov- in this variety of Drindari has been
borrowed from other dialects. (Since Bugurdzi and Bulgarian Kaljdzi also
have ov-, this form should not be totally alien to Drindari.)
h) In the same dialect the future tense is formed with ma-, mo- etc. <
mang- "to want", but kam- is used as well, again due to dialect mixture. It
has to be noted that the full verb kam- is not normally used in this dialect,
which points to the fact that kam- is probably of foreign origin.
i) The 2nd person sing.past tense of the verb underwent interference as
well. In a dialect spoken in the Austrian Burgenland, the speakers of which
call themselves Louvara, we find forms like dikl-al "you saw" and sund-al
"you heard", but for the copula san "you are" with an n (Knobloch 1953).
Since the Louvara may have spoken a Vlach dialect before they moved
westward, -al in all likelihood has been borrowed from a central dialect.
k) We already mentioned the 1st person sing.past in -om instead of -em in
the Bukovina dialect (Miklosich 1874), which is unexpected in view of the
fact that this variety displays all characteristics of a "good" Vlach dialect
(Romanian and Serbian Kalderas); cf.:
- h d > z (orthographically s z);
- vowel centralization i > % e > o in contact with sibilants etc.;
- weak palatalization of the velars before i: giv "wheat" > [d'iv];
- tl > kl, e.g. in xuklo from xutav "jump";
- contraction of pijen "they drink" > pen;
- loss of j- in jek(h) "one";
- reduction of katar "from" and andar "out of" > kata anda;
- 0-plurals with masc. nouns of the type dand "tooth", grast "horse";
- obliquus fem. of adjectives on -a, e.g. bar-a; ekh-a grasnja;
- reduction of len "them" acc. > le;
- future form of the verb on -a: lel-a "he will take"; kro < krav-a "I
will do" (contractions of this type are known from Hungarian Lovari);
- 3rd person sing.past of motion verbs both on -o/-i and -j-as (as in
Serbian Kalderas);
- sode "how much" instead of kobor, kozom etc.;
- article forms with l: le, la, l.
What deviates from average Vlach is article forms like nom.plur. ol,
acc.sg.fem. ola and the like. An innovation not known to me from other
dialects is the 3rd pers. sing.past tense on -ou, e.g. bel'ou instead of belo or
bel'as.
As for -om instead of -em, I assume that this formant entered the dialect
by interference from non-Vlach dialects of the Ukraine. This is corroborated
by another correspondence with dialects spoken north of it: the negator is na,
both for the indicative and the imperative, whereas in Kalderas we have ci
and in Gurbet ni for the indicative, but na for the imperative. In this the
Bukovina dialect shows the same characteristics as all dialects of the Ukraine
and Russia known to me; cf. na phenel "he does not speak" and na phen!
"don't speak!" Thus, the evidence points to the following: What we are faced
with is a Vlach dialect of the Kalderas type which has become altered in
contact with northern dialects, but in a few traits only.
l) From Bugurdzi a peculiar vacillation can be quoted: in the 1. pers.
sg.past tense we have both -om and -um, which are used indiscriminately. I
was unable to elicit from the speakers which form was the "correct", the
"good" one, because my informants changed their opinion all the time. Since
-um is common in Arli (and in the dialects of Prizren and Prilep as well), the
variation may be caused by influence from this dialect. Bulgarian Drindari,
which is closest to Bugurdzi, has -im, a contracted form that cannot tell us
much (from -ijom or -ijum?).
m) In Bugurdzi the use of the gerund (verbal adverb) is decreasing; tuj +
present is taking its place.12 I was able to find a few instances of the old
gerund though, some of them ending in -indos/-andos, others in -indoj/
-andoj. Of those only the first variant seems to be indigenous, because
Bugurdzi is a dialect that preserved final grammatical -s very consistently,
whereas it is lost in Arli and southern Gurbet; -indoj/-andoj may be borrowed
from a Vlach dialect, but it is more likely that there are varieties of Arli
spoken in Kosova which display -indoj as well.13
n) A case of interference in word formation can be found in Bugurdzi.
The verbal noun shows much variation, since we have the widespread -ipe
along with -ibe, -iba and even -ipa. I am unable to establish which of these
forms is the indigenous one, mainly because this category is not formed
automatically from all verbs and adjectives, and speakers are uncertain as to
the right form. At least -iba has been borrowed from Arli. Bulgarian Drindari
uses both -ipe and -ibe (the first for denominal, the second for deverbal
derivations; cf. sahcip < sastip "health", xoxeib < xoxa(v)ip "fraud"), so
both may belong to the inherited inventory of Bugurdzi. -ipa should then
have come about by contamination of -ipe and -iba.
o) Drindari has pimus "drink" and xamus "food" besides piibi and xaibi
(Kenrick, Ph.D. Thesis), -mus clearly being of Vlach origin. Since -mus does
not occur freely but only in the two words quoted here, this is a case of
lexical borrowing rather than of a transfer of a word formation morpheme.
2.2.3. Vocabulary
At this level of language, interference is especially difficult to detect. All
European dialects have an essentially identical inherited vocabulary at their
disposal, and whenever we come across a word (etymon) in a subdialect
which is not documented in affiliated dialects, the given word may be an
archaism. To give an example: The word bov "stove" is not used in Kosova
Bugurdzi, whereas in Kumanovo (northwestern Macedonia) it was known
and, as I was assured, in general use. Is this a relic in a variety of Bugurdzi
or did the speakers adopt it from Dambazi?
In Bugurdzi kirvo and kirvi "godfather" and "godmother" are known,
although the institution of godparenthood is seldom practiced. What is more,
in Bugurdzi the sequence ki- should have developed to ci-. Both factors point
to borrowing, perhaps from Dambazi [k'irvo].
For Drindari, Kenrick (Ph.D. Thesis) lists a number of words that appear
not to be indigenous: sukar "beautiful" and sukaribi "beauty" besides lachu
(sukar is absent from Bugurdzi, too); phen-ke (a word calling someone's
attention), in all likelihood from a Kalderas-like dialect (with the meaning of
"says he/she, said he/she; that means"); suv "needle" (Vlach) alongside siv;
satra "tent" alongside katuna and cerxa; soru (Vlach) "head" alongside seru;
vas "hand" alongside the idiosyncratic as(t) of the dialect; za- alongside
nkja- "go, come"; va "yes". It has to be considered that va and za- are old
words now becoming obsolete. The same perhaps holds for rom with the
meaning "Gypsy" (not with the meaning "husband"), which is rarely used
alongside the derivation romaici < *roman-ici (a diminutive form, cf.
rom-oro in other dialects).14
3. Conclusion
The investigation of interdialectal interference in Romani attempted here
provides convincing evidence for the assumption that there was not only
short-term interference in speech (parole), but also interference with lasting
effects on the language (langue). Many details could not be clarified mainly
because we still know too little about the historical splitting-up and
development of the dialects and the earlier distinctive features of the main
branches of Romani. On the whole, we can state that there is less interference
than might be expected under the given circumstances, with single dialects
having remained astonishingly homogeneous.
I hope that it has become clear that decisions on interference depend to a
high degree upon our insights into dialectal ramification as well as on the
accessibility of data.
Since in most cases it has been related dialects with very similar structures
which influenced one another, the amount of change through interference
remains rather modest. Basic structures have not been changed, and adaptation
processes of greater extent have not taken place.
In general, it is lexical elements that undergo the most far-reaching
adaptation, not only with regard to their sound shape but also with regard to
inflectional class, syntactic behaviour and meaning. However, since we have
few clear cases of lexical borrowing little can be said about this topic. In
Bugurdzi the verb "live" appears in two variants, vest-iz-ava with the formant
-iz- typical of this dialect, and vest-in-ava, which because of -in- can be
considered to be of Arli origin. This means that morphological adaptation has
only been optional.
With morphological interference the following cases create no problems
for the replica dialect:
- Bukovina Vlach -om instead of -em (2.2.2.k);
- Bugurdzi -um in addition to -om (2.2.2.l);
- Bugurdzi -indoj in addition to -indos (2.2.2.m);
- Bugurdzi na-sine instead of na-sas (2.2.2.f), making the
paradigm more homogeneous;
- Dambazi po- and naj instead of maj- (2.2.2.e), involving
categorial expansion;
- Kalderas plur. pen- in addition to pes- (2.2.2.c), contributing to
harmonizing morphological structures;
- Gurbet article i instead of e (2.2.2.b), paralleling the article with the
feminine formant -i.
Bugurdzi -iba and -ipa in addition to -ipe (2.2.2.n), however, create
too many variants without adding new functions. Moreover, the new variants
do not contribute to matching the grammatical ending with gender: -a in -iba
is no better qualified for expressing masculine gender than was -e in -ipe.
Bugurdzi ke instead of pe, and sometimes even instead of an (2.2.2.d), has
led to abandoning useful semantic distinctions without compensatory
advantages.
But even in the last cases listed here the changes did not disturb
inner-systemic relations, understanding, etc. The only exception to be quoted here
is the replacement of the Gurbet plur. article e by o (2.2.2.a). If there are or
have been masculine plurals in -0 (as in Kalderas), e.g. sing. plur. dand
"tooth, teeth", then the introduction of o no longer allowed for
distinguishing the plural from the singular. The effect might have been that
plural forms without endings had to be abandoned.
In the domain of phonology nearly all the changes caused by interference
are of minor importance. As a rule, no new phonemes have been introduced
into the replica (excepting perhaps the restitution of čh dž for ś ź in Ukrainian
Vlach dialects), and new phonetic types are rare.
NOTES
1 If no other source is given the forms quoted here are collected by myself. For forms
that can be considered as generally known to scholars of Romani philology no sources are
given either.
2 Only some of the dialects west of the Dnepr have -em.
3 The indigenous -eder has not been adopted, however, probably because it has fallen
out of use in most dialects of Serbia and Macedonia.
4 This word occurs in Bulgarian Erli too, in the form of alindz (personal communication
B. Igla). Since the form containing -l- must be older than Bugurdzi aindz, an indigenous
Gurbet aindz is impossible.
5 The past form iklilo may be taken from Arli, which has both iklisto and iklilo (cf.
Bugurdzi iilo, too), but it may as well be a process of inner-dialectal levelling, iklilo
being reshaped according to the present form ikljol. This assumption is corroborated by
Prishtina Gurbet iklilo. Both forms have passed into the passive paradigm. Since the same
change can be observed in Arli, in Bugurdzi and, as mentioned above, in Kosova Gurbet,
but neither in Kalderas nor in northern Gurbet, the southern dialects seem to have carried
out this change in a joint action. It is difficult to decide in which of the dialects the process
began.
6 Because of some typical traits (e.g. neg. ci), Cerhari is considered by Mészáros a Vlach
dialect, but this should not be taken for granted. It can be said with certainty that ci occurs
not only in Vlach dialects, and that it does not go back to rum. nici. We find it in German
Sinti varieties as well; cf. Holzinger 1993: 40, 45.
If the dialect had a Vlach basis, then the copula paradigm should have been borrowed from
another dialect.
7 In Štrukelj we find present som sal, but hi; sinja "was", but -s- > -h- in grammatical
suffixes; cf. zal-a-hi ipf. "went" < *dal-a-si.
8 The Romani of Wales as described by Sampson 1926 has om an si, not hom han hi;
this too makes us hesitant to consider Sinti hom han hi as a very old paradigm.
9 It is possible that in Romani e represents the older form, among others because we find
it in dialects that are very similar to Arli, e.g. the dialect of Prizren and that of Prilep,
where it may be a relic form. From e we get to i quite easily, by way of analogical
adaptation to the feminine -i found in nouns, adjectives and pronouns, whereas for an
adaptation in the opposite direction no plausible explanation can be found.
10 The old indigenous comparative formant -eder survives in relic forms only.
11 We cannot exclude the possibility of sine and sas (sjas) once having existed side by
side, in complementary distribution, but as long as we have no indication in favour of this
assumption we should disregard this possibility.
12 tuj is clearly of Albanian origin. It is interesting to note that Bugurdzi did not copy the
Albanian construction in all parts: whereas Albanian does not distinguish the grammatical
persons, in Romani a personal construction evolved because of tuj being combined with the
finite present.
13 In Macedonian Arli -indor is found rather than -indoj or -indo.
14 Even the Caló of Spain reveals some traces of dialect mixture. The following doublets
are documented: piro and pindro "foot", minrio "my" and min-daj "(my) mother" against
monro "friend", apparently < munro "my" (Kalderas); the verbal noun in -pe, -pen and -ben
(uncertain); borrowed masculine nouns in -o and -os, cf. ro/iros, foro/foros, drupos/trupo;
pokin- and plasar- "to pay" (Boretzky 1992:18f).
REFERENCES
Barannikov, A.P. (1934) The Ukrainian and South Russian Gypsy Dialects.
Leningrad: Izdatel'stvo Akademii Nauk SSSR.
Bischoff, Ferdinand (1827) Deutsch-Zigeunerisches Wörterbuch. Ilmenau:
Verlag B.F. Voigt.
Boretzky, Norbert (1992) Romanisch-zigeunerische Interferenzen (zum
Caló). Erfurt & Jeßing & Perl (eds.). Prinzipien des Sprachwandels. I.
Vorbereitung. In: Beiträge zum Leipziger Symposium des Projekts
"Prinzipien des Sprachwandels" (PROPRINS) vom 24.-26. 1991 an der
Universität Leipzig. Bochum: Universitätsverlag Brockmeyer. 11-37.
Đurić, Rajko (1980) Prastara reč - daleki svet/Purano svato - o dur them.
Beograd: Narodna Knjiga.
Đurić, Rajko (1982) A i U/A thaj U. Beograd: Narodna Knjiga.
Finck, Nikolaus von (1903) Lehrbuch des Dialekts der deutschen Zigeuner.
Marburg: Elwertsche Verlagsbuchhandlung.
Gilliat-Smith, Bernard (1914) The Dialect of the Drindaris. Journal of the
Gypsy Lore Society, New Series VII, 260-298.
Hancock, Ian (1990) A Grammar of the Hungarian-Slovak (Carpathian,
Bashaldo, Romungro) Romani Language. Manchaca, Texas:
International Romani Union (USA).
Holzinger, Daniel (1993) Das Rómanes. Grammatik und Diskursanalyse der
Sprache der Sinte. Innsbruck: Institut für Sprachwissenschaften.
(Innsbrucker Beiträge zur Kulturwissenschaft 85).
Igla, Birgit (1989) Das Romani von Ajia Varvara. (Unpublished Diss., Ruhr-
University Bochum).
Kenrick, Donald S. Morphology and Lexicon of the Romani Dialect of Kotel
(Bulgaria). (Unpublished Ph.D. Thesis)
Kenrick, Donald S. (1967) The Romani Dialect of a Musician from Razgrad.
Balkansko Ezikoznanie 11/12, 71-78.
Kepeski, Krume & Jusuf, Šaip (1980) Romani Gramatika - Romska
Gramatika. Skopje: Naša Kniga.
Knobloch, Johann (1953) Romani-Texte aus dem Burgenland. Eisenstadt.
(Burgenländische Forschungen, hrsg. vom Landesarchiv und
Landesmuseum, Heft 24.)
Kopernicki, Izydor (1930) Textes tsiganes. Kraków: Nakład Polskiej
Akademji Umiejętności.
Kostov, Kiril (1962) The Vixen and Pirusambi. Journal of the Gypsy Lore
Society XLI:1-2, 31-38.
Krasnići, Ali (1981) Cergarende jaga/Čergarske vatre. Priština: Jedinstvo.
Krasnići, Ali (1986) Iripe ano dzuvdipe/Povratak u život. Priština: Jedinstvo.
Mészáros, Gyula (1976) The Cerhari Gypsy Dialect. Acta Orientalia
Academiae Scientiarum Hungaricae XXX:3, 351-367.
Miklosich, Franz (1874) Märchen und Lieder der Zigeuner der Bukovina.
Mundarten und Wanderungen der Zigeuner Europas IV. In:
Denkschriften der phil.-hist. Cl. der Kaiserlichen Akademie der Wissenschaften.
Wien. XXIII, 273-327.
Paspati, Georgios Alexandros (1870) Études sur les Tchinghianés ou
Bohémiens de l'Empire Ottoman. Constantinople: Imprimerie Koroméla.
Pobozniak, Tadeusz (1964) Grammar of the Lovari Dialect. Kraków:
Państwowe Wydawnictwo Naukowe.
Sampson, John (1926) The Dialect of the Gypsies of Wales. Oxford:
University Press.
Senzera, Luigi (1986) Il dialetto dei Sinti Piemontesi. Lacio Drom 22:2.
Sowa, Rudolf von (1887) Die Mundart der slovakischen Zigeuner.
Göttingen: Vandenhoeck und Ruprecht.
Štrukelj, Pavla (1980) Romi na Slovenskem. Ljubljana: Cankarjeva Založba.
Uhlik, Rade (1942) Bosnian Romani (ed. by F.G. Ackerley). Journal of the
Gypsy Lore Society XX, 100-140, XXI, 24-55 and 110-141, XXII,
305-323.
Vekerdi, J. (1984) The Vend Gypsy dialect in Hungary. Acta Linguistica
Academiae Hungaricae 34, 65-86.
Wentzel, Tatjana W. (1980) Die Zigeunersprache. (Nordrussischer Dialekt).
Leipzig: Enzyklopädie Verlag.
VERB EVIDENTIALS AND THEIR DISCOURSE FUNCTION
IN VLACH ROMANI NARRATIVES
YARON MATRAS
0. Introduction
A number of scholars have drawn attention, in a descriptive manner, to
a structural split in the formation of the simple past tense in Romani.
Miklosich (1873: 45) observes that in the third person singular, the past
participle can appear alone, without an auxiliary: "asnilo, risit".1 Similarly,
Sampson (1926: 194) notes that "beside the 3rd. sg. in -as, we find that
Gk.Gyp. and other Southern Eur. dialects use the bare participle sg. in -o
(m.) and -i (f.)". Gjerdman & Ljungberg (1963: 114) add a partial restriction
to "a preterite which is qualified by gender", stating that it tends to appear
with intransitive verbs of motion. Boretzky (1986: 205) refers to the split as
a "peculiar phenomenon", which is restricted to a number of verbs in the
third person singular, while the majority of verbs appear only in the
inflected form, carrying the personal suffix -a/-as. Hancock (1993: 48-51)
classifies the active past participle as an "irregular preterite", and suggests,
in order to avoid confusion with the adjectival or passive participle, "to
incorporate a rule into the standardized grammar that would restrict past
participles to the pre-nominal position".
What is the origin of the split in the form of the third person of simple
past tenses in Romani, and what is the motivation behind its retention with a
number of verbs? Romani is a Balkanized Indic language, a classification
justified by the fact that it shows Balkanic features on the level of sentence
organization. The split in the form of simple past tense verbs of the type
avilas "he/she came" (inflected preterite), vs. avilo "he came", avili "she
came" (active past participle), found in Romani overwhelmingly with verbs
of motion and change of state, is connected, I propose, to the process of
language convergence in the Balkans. At the same time it also reflects the
unique position occupied by Romani within the Indic languages,
documenting an intermediary stage in the development from Middle to
Modern Indo-Aryan. It therefore provides a good example for the formula
'functionalization of the Indic stock of forms in accordance with the Balkan
model', which governed the emergence of European Romani on the level of
sentence organization.2
In retaining the split within the system of simple past tenses as
inherited from the transition period from Middle to Modern Indo-Aryan,
Romani not only adheres to the formal model of the contact languages of
the Balkans, but it also adapts to the functional requirements of this model:
the formal split is a functional split. This becomes apparent when the
relevant forms are considered within the context of actual discourse in
natural communicative interaction.
Below I suggest that the nature of the split in Romani strongly
resembles that of some of the other Balkan languages: it enables the speaker
to indicate his/her attitude toward knowledge and the procedure by which
knowledge is acquired and transmitted. By doing so, it adds to the tense-
aspect system an epistemic dimension, referred to as 'evidentiality'.3 In the
following I adopt a pragmatic approach to evidentiality, which basically
agrees with the notion developed by Givón (1982): evidentiality is regarded
as an interactional device which appears with propositions that are
challengeable by the hearer and may therefore require evidentiary
justification.
1. Through ergativity to evidentiality

There are basically two patterns for forming the simple past tense in
modern subcontinental Indo-Aryan: The Central languages, as represented
in Figure 1 by Hindi, form the more conservative type. They have retained
the Middle Indo-Aryan active past participles. The transitive participle in
Hindi generally agrees with the direct object, a feature of Indo-Iranian
ergativity.4 The intransitive participle, which is our concern here, agrees
with the subject in gender and number. This is also the pattern represented
by the active past participle in Romani, which is however, unlike in Hindi,
restricted to the third person singular and in most dialects also to a closed
set of lexical verbs, namely those expressing motion or other change of
state, the majority of them being intransitive (see appendix).
active participle                 inflected past

Hindi                             Bengali
& 'Central' languages             & 'Outer' languages

maĩ āy-ā                          āmi el-um
I come:PART-MAS                   I come:PART-1SG
"I (m) came"                      "I (m/f) came"

maĩ āy-ī
I come:PART-FEM
"I (f) came"

vo āy-ā                           se el-o
he/she come:PART-MAS              he/she come:PART-3SG
"he came"                         "he/she came"

vo āy-ī
he/she come:PART-FEM
"she came"

Romani

                                  me avil-em
                                  I come:PART-1SG
                                  "I (m/f) came"

vov avil-o                        vov avil-as
he come:PART-MAS                  he come:PART-3SG
"he came"                         "he came"

voj avil-i                        voj avil-as
she come:PART-FEM                 she come:PART-3SG
"she came"                        "she came"

Figure 1: The split in third person past tense verbs
in Romani and its position in Modern Indo-Aryan
The so-called 'Outer' languages of the Indian subcontinent, represented
here by Bengali, have developed an inflected past tense by integrating into
the participle personal copulative forms, which have become personal
affixes. In these languages there is generally agreement with the subject, both
with intransitive and with transitive verbs. This pattern corresponds to the
emergence of inflected forms in Romani, as shown on the bottom right-hand
side of the figure.
Thus Romani generally corresponds to the 'Outer' languages - this is in
fact the case with regard to a series of grammatical formations - but it also
preserves the more conservative active participle.
Bubenik (this volume) suggests two possible explanations for the fact
that Romani does not show ergativity: (i) Proto-Romani had shared the
ergative construction with other Indo-Aryan languages, but lost it after
coming into contact with the (non-ergative) languages of Europe; (ii) The
ancestors of the Roma left the Indian subcontinent before ergativity
emerged in Indo-Aryan.
The co-existence of the inflected past and active participle in Romani
may be used to argue for the first option, if one were to consider the
participle avilo to reflect the old ergative type. Once ergativity was lost, the
language retained active participles that agreed with a beneficiary which
was also the subject. 'Semantic ergativity' would have thus been allowed to
survive within a nominative-accusative syntactic system. This would
explain why the active participle in Romani is restricted to verbs denoting a
change of state: with such verbs it is the syntactic subject, and not the
object, on which the outcome of the process may be observed.
Alternatively, one could maintain that the participle is pre-ergative,
representing yet an older stage in the development toward ergativity. This
development began with the gradual spread in Old Indo-Aryan of the
intransitive participle, leading ultimately to a generalization of the transitive
participle as well, with the participial forms finally replacing the inflected
past tense in late Middle Indo-Aryan (cf. Chatterji 1926: 938-940). The
stage we witness in Romani would then be that of an active participle which
had not yet established itself with transitive verbs.
Either way, Romani, a nominative-accusative language, still shows
traces of the ergative development in Indo-Aryan by possessing the active
participle alongside the inflected forms.
In inheriting two forms for the past tense, one (the adjectival past
participle) denoting a state which is the result of an action, the other (the
inflected preterite) referring to the action itself and its agent, early Romani
was not unique among the languages of the Balkans. Friedman (1986: 179,
184ff.) describes the development of the Common Slavic aorist and
imperfect (inflected past) into the Balkan Slavic (Bulgarian and
Macedonian) definite past marking the speaker's confirmation of the
information conveyed. The Common Slavic perfect, on the other hand,
acquired the contextual meaning of non-confirmation from its contrast to
the marked confirmative aorist/imperfect.5
Similarly, in Albanian (Friedman 1986: 180ff.) the inverted perfect
developed into the admirative paradigm, which concentrated on the state
resulting from an event rather than on the event itself, and is used to express
unexpectedness (such as surprise, doubt, reportedness, or irony). In Turkish
(Aksu-Koç & Slobin 1986: 165) the historical participle in -mIş developed
first into a perfect tense, then into a general inferential suffix indicating
inference from a secondary source of knowledge (see also discussion in
Haarmann 1970: 39-59).
Even though there are distinct invariant functions as well as variant
meanings associated with the various forms of verbal 'evidentials' in the
Balkan languages, and despite language-specific development patterns, the
languages mentioned seem to share a general tendency: they functionalize
an inherited structural distinction in the system of past tense verb formation,
using it to convey the speaker's mode of experience and transmission of
knowledge about a completed event.6 In all the languages concerned,
reference to the state resulting from an event or action is interpreted as the
speaker's restricted access to the process underlying the result.
2. Explicit knowledge vs. situative evidence

The following examples are taken from a 24,000-word corpus of tape-
recorded narratives in a Kelderash/Lovari contact variety, spoken in
Hamburg, Germany by first and second generation immigrants from Poland and
Slovakia. The corpus encompasses narratives embedded into several types
of discourse, including biographical stories, fairy-tales, jokes, contributions
to a political debate and a political lecture.7
On the whole, active participle forms of the type avilo are somewhat
less frequent than the corresponding inflected preterite forms of the third
person singular: there are fewer than eighty instances in the entire corpus,
involving eighteen different lexical verbs (see appendix). The use of the
inflected past and the active participle by the same speaker in the same
conversation in examples (1)-(2) below provides us with a
first clue for explaining their distribution:
(1) a. But zumavenas te integrujin pe,
many they-tried that they-integrate REFL
arakhenas buja ande fabriki,
they-found jobs in factories
b. Bucjaren rodenas ando kapitalismo nevo.
workers they-searched in capitalism new
c. Ale ci pardzilas khanci pa ginduri sar
but not changed:3SG nothing on thoughts how
train e Rom.
they-live the Roma (RF/1/129-131)
a. Many tried to integrate, they found jobs in factories,
b. They were looking for workers in the new capitalism.
c. But nothing changed with regard to the prejudices on how
the Roma live.
The excerpt is taken from a lecture on Romani history. The speaker
recalls the social development which affected the Romani community in
Europe at the turn of the century: as a result of industrialization, new jobs
became available. Many Roma attempted to improve their living conditions
by taking up such jobs, but society remained hostile to them and would not
abandon traditional prejudices. The statement in c. is derived from the
speaker's personal knowledge as an expert on Romani history. It shows the
inflected form, with a personal affix, for the verb "changed" - pardzilas.
Now compare the active participle form of the same verb in example (2):
(2) a. Sa khetane saj phenav tumenge kado:
all together can I-say to-you this
b. Ke sar sas de katar avilam ame
that how was since from we-came we
ande Europa zi adzes,
in Europe until today
c. ci pardzili pa amari situacija
not changed:PART on our situation
khanci.
nothing (RF/1/265)
a. All together I can tell you this:
b. That as it has been since we arrived in Europe and until this
day,
c. nothing has changed with regard to our situation.
In segment c. the speaker is directly addressing the audience. The
participle focuses on a state which is the result of an underlying process or
event, rather than on the speaker's explicit knowledge about this event itself.
The situative evidence conveyed by the form of the participle is accessible
both to the speaker and to the hearers. By using the participle the speaker
calls upon his audience to carry out an inference procedure based on
situative evidence, rather than simply accept the speaker's experience or point of
view.
Romani informants, whom I asked to explain the difference between
avilas and avilo, claimed that the latter (active participle) implied a more
recent arrival of the actor. Although such an interpretation could hold for
the opposition between inflected preterite and active participle in (1)-(2),
there is generally no evidence that a language of the Romani 'type' should
possess a form enabling it to categorize the length of time lapse between
event and speech action, except if by a short 'time lapse' we mean reference
to the immediate speech situation.
Now, the claim that the participle is related to the immediate speech
situation needs to be relativized, since there are numerous occurrences where
this is not, at least explicitly, the case.8 Nevertheless, the distribution of the
active participle in the documented excerpts of discourse proves that it is
not used at random, but that it carries a different invariant meaning than the
inflected preterite:
(3) a. Numajekh, o Jono arakhdzilo ando Cexo,
only one the Jono was-born:PART in Czech
b. Aj vov kothe ande temnica arakhdzilo.
and he there in prison was-born:PART
c. Ke phari sas e Katica, arakh/ xutinde
because pregnant was the Katica found they-caught
la le stofonsa taj o/ xoxadas la gaja
her the textiles-with and the cheated the woman
taj phandade la, e romni phari.
and they-arrested her the woman pregnant
d. Taj arakhdzilas o Jono ande temnica, no.
and was-born:3SG the Jono in prison well
(Mri/3/6-9)
a. Only one, Jono, was born in the Czech Republic,
b. And there he was born in prison,
c. Because Katica was pregnant, they found/ they
caught her with the textiles and the/ she had
cheated the woman and they arrested her, and she
was pregnant,
d. And (so) Jono was born in prison.
In (3), a passage from a biographical narrative, the fact that one of the
family members was born in the Czech Republic, and in prison, is presented
as a curious detail in the family history. The participle is chosen here to
convey an exceptional and surprising state of affairs. The element of
surprise in segments a.-b. is transmitted by the speaker's presentation of the
facts as being based on circumstantial evidence. On the other hand, the
presentation of the same event in segment d. relies on the preceding
background information. Here the speaker is drawing a conclusion on the basis
of detailed proof, and it is here that a switch into the inflected preterite takes
place.
While in examples (1)-(2) the distribution of the inflected preterite and
the active participle is connected to the difference between expert
knowledge and actual situative evidence, in (3) the opposition is used to
express the contrast between virtual evidence for an unexpected state of
affairs and well-established conclusions. The term 'evidential', if used in a
pragmatic, rather than a structural sense, seems to me to capture the
invariant meaning of the form: It is used to approach an event via
circumstantial evidence rather than from explicit, prepared knowledge.
Friedman (1986: 185) rejects the term 'evidential' for the
non-confirmative in Balkan Slavic and Albanian on the grounds that it does not
mark the source of information, but the speaker's attitude toward it. In this
respect Balkan Slavic and Albanian differ from Turkish, where the
possibilities for choosing a form are much more restricted. With respect to
the opposition in Romani, 'source' and 'attitude' do not seem contradictory to
me. As seen in (3), it is the speaker's 'attitude' that motivates the choice of
the participle, which marks the event as inferred from secondary evidence.
The same event can appear again in the inflected preterite after the
background for the presentation has been prepared in advance.
As argued by Givón (1982), evidentiality does not relate to the actual
truth of the proposition, but to the 'contract' drawn between speaker and
hearer in the communicative transaction. Choosing the adequate form for
the simple past tense is a strategy applied in order to ensure that the
assertion is accepted by the hearer. When the information conveyed is
unexpected or exceptional, and the speaker's assertive authority is at stake,
the speaker can resort to the evidential form, thereby disclaiming direct
responsibility for the report on the actual underlying event. Rather than
mark the speaker's plain 'attitude' or 'source of information', evidentiality is
related to the speaker's monitoring of the hearer's preparedness to process
propositions and accept assertions.
The following pair of discourse excerpts provides another example:
(4) a. Aj po dujto var po kaver kurko gelem pale,
and on second time on other week I-went again
aj pale sikhadem o vast,
and again I-showed the hand
b. Ta avilas kothe baro reisoro sas aj pinzardas
and came:3SG there big director was and recognized
ma ke simas aba jokvar, hacares?
me because I-was already once you-understand
c. Aj phendas mange ke kamin von te
and said to-me that they-want they that
khelav filmo.
I-play film (NL/1/104-106)
a. And the second time, the week after that I went
again, and I raised my hand again.
b. And an important film director came and he
recognized me, because I had already been there
once, you understand?
c. And he told me that they want me to act in a film.
(5) a. Rovav, keras.
I-cry we-do
b. So avel? O Gurano, sundas ke leske phral
what comes the Gurano heard that his brothers
othe si, hacares.
there are you-understand
c. Avel, sas les jekh kufero, kade baro.
comes was him one suitcase so big
d. Taj akana avilo, i kaj ame besas, avilo
and now came:PART until where we we-sit came:PART
muro/ kodo, o Gurano.
my this the Gurano.
e. Akana rovas, so te keras, vorta kon avel:
now we-cry what that we-do suddenly who comes
muro dad.
my father (Mri/6/74-78)
a. I'm crying, we're doing.
b. Who arrives? Gurano, he heard that his brother is
there, you understand.
c. He's coming, he had a suitcase, this big.
d. And now he came, up to where we were sitting, my/
this one came, Gurano.
e. Now we're crying, what shall we do, suddenly who
arrives: my father.
In (4) the speaker is telling about her adventures during a visit to
Universal Studios in Los Angeles. The type of narrative can be classified as a
'report' (cf. Rehbein 1984: 89ff.): She reports on the course of reconstructed
events which she has not only witnessed, but already processed, evaluated
and categorized, and she does so by presenting them in a planned and
serialized sequence. Throughout the entire excerpt the inflected preterite is
used, including the form of the intransitive verb of motion avilas "he came".
By contrast, in (5) the speaker is re-experiencing the original situation.
Events are described in the present tense, as if speaker and hearer were
witnessing them in their original scene. In segment d. the participants are
taken by surprise by the appearance of Gurano, indicated by the active
participle avilo "he came". The state of arrival is first captured on the basis
of perceived situative evidence, from which the underlying event is then
inferred. The linear presentation of events is not pre-planned, but is
constantly re-organized as the hearer is guided through various pictures of
the scene. Connectiveness is achieved through attention markers, such as
akana "now" (segments d. and e.) and staging questions (so avel?, "what's
coming?", in segment b., and kon avel:, "who arrives:", in e.), whereas in
(4) it is marked by simple additive conjunctions, the hearer's attention
having already been drawn to an entire chain of assertions.
Comparing the simple past tenses used in (4) and (5), we see that the
opposition inflected preterite/active participle is connected to the
organization of knowledge in discourse. The participle is used to mark information
which is not pre-structured and where the speaker needs to resort to
circumstantial evidence in order to maintain assertive authority.
3. Virtual evidence
Aksu-Koç & Slobin (1986) regard the inferential procedure as
involving sensory evidence. In (3) we saw, however, that the evidential form
can also occur when the event in question is detached from the sensory
domain of the current speech situation. In such instances it is not actual
evidence which the speaker resorts to, but virtual evidence which is part of
more general knowledge. Compare the next example:
(6) a. Sun, des bers dine les, ( ) ande jekh cir
listen ten years they-gave him in one cell
korkro sute les.
alone they-threw him
b. Kothe sichilo te iskril:
there learned:PART that writes
c. Ungrika, rumunicka, sa o skrimata zanelas
Hungarian Rumanian all the scripts knew
muro dad.
my father (Mri/2/61-62)
a. Listen, they gave him ten years, ( ) they threw
him in a solitary cell.
b. There he learned to write:
c. Hungarian, Rumanian - my father knew how to write
all the languages.
In segment b., the speaker uses the participle or evidential when talking
about the unexpected fact that her father, having had no schooling
whatsoever, a) learned to write, and b) acquired his skills in solitary confinement
in prison. Once again the participle is used to mark an exceptional,
surprising event. However, the statement conveyed by the participle is not
based on sensory evidence. In choosing the participle, the speaker avoids
reconstructing the actual action of learning, and instead presents the
accomplished fact.9 Rather than assume direct responsibility for reporting on an
unusual event, the speaker concentrates on the results of the event, evidence
for which is less disputable and virtually accessible, as elaborated on in
segment c.
(7) a. O Robert Ritter kaj sas o sefo pe amari
the Robert Ritter who was the chief for our
teoreticno/ pa amaro te/ pe teoretino
theoretical for our that for theoretical
mordo te suden ame andre sas pala marimo
murder that they-throw us in was after war
ando Frankfurt doxtoro le forosko.
in Frankfurt doctor of-the town
b. Ando panz-var-des-ta-panz-to/ pan-var-des-ta-
in fifty-fifth fifty-
jekh-to bers mulo peske.
first year died:PART REFL (RF/1/246-247)
a. Robert Ritter, who was responsible for our
theoretical/ for our/ for the theoretical murder,
for our detention [in camps], was a municipal
doctor in Frankfurt after the war.
b. In the year fifty-five/ fifty-one he just died.
In (7) the speaker in a political lecture reports on one of the key persons
responsible for the Nazi genocide of Gypsies, Robert Ritter, who escaped
trial and conducted a normal working life after the war. The evidential form
in mulo, "died" can be interpreted in this particular context as an expression
of irony, the speaker disconnecting himself and his personal conviction
from the fate of the criminal. This contextual meaning is reinforced by the
reflexive peske, literally "to himself", implying that Ritter had lived and
died at his own discretion, without anybody disturbing his peace.
The ironic interpretation of the verb itself is connected to the inference
procedure triggered by the participle: the speaker was not involved in the
actual event, but was only able to witness its outcome (not by being present
or actually seeing Ritter's dead body, but by receiving the relevant
information). He is therefore not to be held responsible for the provoking
aspects of the event, i.e. for the fact that Ritter died in peace, and not, as he
would have deserved, as a criminal convicted of genocide.
Again, in choosing the evidential, the speaker is monitoring hearer
participation and acting in anticipation of the hearer's possible reactions. He
does so on the basis of a set of common values and attitudes, which allow
him to foresee some of the hearer's own judgements.
The use of the Romani evidential in the preceding examples bears
certain similarities to specific meanings conveyed by the Balkan Slavic and
Albanian non-confirmative, as described by Friedman (1986), such as
unexpectedness, surprise, irony, or reportedness. Friedman regards non-
confirmation itself only as one of the possible contextual meanings of the
indefinite past. Non-confirmation cannot be seen as the invariant function of
the Romani participle either. Examples (2), (3) and (5) do not support a
non-confirmative interpretation since the speaker is well aware of the results
of the event. In (6) and (7) uncertainty may be connected to certain details,
but again the actual occurrence is not denied.
In its core the form seems to convey a disclaimer of direct knowledge
about the internal course of an event, but does not question the event itself.
However, knowledge about the event is restricted to those pieces of
information which the speaker is able to derive from evidence for the
accomplished result that is available to him. Confirmation of the event
therefore relies on a secondary processing procedure, applied to 'primary'
elements of knowledge pertaining to the result.
With the inflected preterite, the propositional content reflects the
speaker's direct knowledge of a completed event. From this knowledge, the
speaker derives his recognition that the event in question has produced a
result, which may have some bearing on the current speech situation.
Knowledge concerning this result is conveyed implicitly by the predicate
acting in the proposition.
In the evidential, direct knowledge concentrates on the result.
Knowledge about the internal structure of the event resurfaces only as a
'by-product'; it is inferred through secondary processing of the circumstantial
evidence provided by the relevant state of affairs. The reconstruction of an
event by means of the evidential form is therefore epistemic: it requires
further, non-verbal development of existing knowledge.
This opposition between different tasks for processing knowledge
figures in the structural formation of the verb: the inflected preterite is
marked for person; it portrays the direct involvement of an agent in the
action or event. This necessarily implies that the speaker has some
knowledge about the actor, and so about the internal course of the event.
The evidential, however, is only marked for gender. Its adjectival formation
allows a detachment of the perceived result from the actor responsible for
its emergence. Rather than name the responsible actor, the gender suffix
helps locate the referent on which the outcome of the event is detectable.
4. Figurative evidence
So far, two types of evidence were discussed which allow inference
about an underlying process, event, or action, as expressed by the active
participle: actual evidence that is part of the sensory dimension of the
speech situation, and virtual evidence which is embodied within the scope
of accessible, general knowledge. Typical of reconstructive narratives is,
following Rehbein (1989: 166ff.), the transposition of the center of attention
(Origo) from the dimension of actual, situative perception (Wahrnehmung)
into the domain of imaginary conception (Vorstellung). The participants in
the interaction assume the roles of witnesses in the original reconstructed
scene of events. As a result of such transposition, knowledge conceived of
in an imaginary setting performs as perceivable evidence and can be used as
a basis of information about an underlying event. I call this type of evidence
'figurative':
(8) ( ) kana avilo muro dad, taj avilo vov -
when came:PART my father and came:PART he
skileti, skileti!
skeletons skeletons (Mri/8/102)
When my father came, and he came - [they were]
skeletons, skeletons!
(9) Akana avilo o kako Nono anel kodo baro/
now came:PART the uncle Nono brings this big
so ame akana, sa kidisajle o savore
what we now all they-gathered the children
pasa leste.
near him (Mri/6/88)
Now uncle Nono came carrying this big/ now all of
us children gathered around him.
(10) a. No astarlo o rasaj te izdral, te
well started:PART the priest that shivers that
izdral.
shivers
b. Taj dikhel pe leste jekh rom: "Ta so
and looks at him one man and what
hi rasaj a?"
is priest (MCL/4/27-39)
a. Well, the priest started to shiver, to shiver.
b. And one person looks at him: "So what's the matter
priest?"
The participle is used again to express surprise and unexpectedness,
capturing the striking effect of the action on the overall situation, rather than
the internal dimension of the particular occurrence itself. In reproducing this
effect the speaker is relying on the hearer's imagination and ability to
picture the original situation. In (8) the description of the two persons who
have just been released from prison (skileti "skeletons") lacks a verb or any
marker of tense, time, or place: it is not presented from the perspective of
the current speech situation, and it directly reproduces the reconstructed
scene. In (9) the time deixis akana "now", repeated in the second part of the
utterance, places the reconstructed event within the present center of
attention. In (10) figurative reproduction relies on the present tense and the
direct quote.
The active participle thus provides a figurative account of the outcome
of an underlying event as perceived in the original scene. While the speaker
claims not to have witnessed the actual event, he shares the information
upon which his assertion is based with the hearer by transposing the hearer
into the imaginary setting of the original scene.
5. Contextual evidence
The following examples document how evidentials can be used to mark
a turning point in the discourse by combining the 'surprise-effect' -
unpreparedness for a new piece of information - with clues pertaining to the new
event which have already been established as background information. The
speaker resorts to this background information while performing the
inference procedure, thus sharing with the hearer a common basis of
contextual evidence:
(11) () ta dikhlas mura da taj prinardas pe
and saw my mother:ACC and met REFL
lasa aj nale, nakhlotar lasa.
with-her and they-escaped ran-away:PART with-her
(Mri/1/66)
( ) and he saw my mother and he got to know her
and they escaped, he ran away with her.
(12) Muro papo atunci nas lenca, ande kaver
my grandfather then was-not with-them in other
foro varekaj sas, kade ke vov las soko taj
town somewhere was so that he took shock and
nas but vrama vareso, xasajlo, ci
was-not much time something disappeared:PART not
anenas kaj-lo, gindinas ke mulas.
they-knew where:PART they-thought that died
(DN/1/76)
At the time my grandfather wasn't with them, he
was in another town, so that he was shocked and he
wasn't there for something like a long time, he
disappeared, they didn't know where he was, they
thought he died.
In both examples the participle is used to foreground established
information. It forms a kind of paraphrase of the previous proposition: "they
escaped" > "he ran away with her", "he wasn't there" > "he disappeared".
The sequence created by the preceding inflected preterites is
progressive, i.e. it represents the actual order of occurrence of the reconstructed
events, each action following the outcome of the previous one. The
participle interrupts this sequence. The information it conveys is processed
'backwards': it is not the action that produces an outcome, but rather the
outcome, based on information repeated from the context, from which the
underlying action is derived. By reversing the progressive flow of events,
the participle singles out a piece of information from the linear sequence,
establishing a discourse climax.
The sudden interruption of the sequence is an unexpected conclusion
drawn by the speaker on the basis of contextual evidence derived from her
own verbalized sequence of assertions. The speaker realizes the effect of the
established facts on the hearer after they have already been presented, and
promotes them retroactively to a decisive turning point in the story.
(13) a. () zanas phiravenas taj bikinenas, jo,
they-went they-travelled and they-sold yes
o kako Bego, muro dad.
the uncle Bego my father
b. Sar areslo o Bego kothe?
how arrived:PART the Bego there
c. So te phenav tuke, o kako Bego
what that I-say to-you the uncle Bego
gelotar te/ ce kothe te lel
went-away:PART that something there that takes
peske romnja taj avilo parpale, no.
REFL wife and came:PART back well
(Mri/5/16-19)
a. They travelled around and did business, yes, uncle
Bego, my father.
b. How did Bego get there?
c. What shall I tell you, uncle Bego went to/
somewhere to get married and he came back, right.
The indented utterance in segment b. of example (13) documents the
surprised reaction of a listener to the assertion presented by the speaker in
segment a. By using the evidential in areslo "he arrived", the listener
indicates that he was not expecting the participant Bego to be relevant for this
part of the story. The information that Bego was in fact present is derived
from the contextual evidence provided by the speaker in the preceding
utterance. In c. the speaker reacts by capturing the necessary background
information as a set of accomplished facts, represented by the participles.
Rather than deliver a detailed report on the actual course of events which
led to the state of affairs conveyed by her assertion in a., she disclaims
knowledge of the internal development of this state of affairs, accepting it
as a completed point of departure.
In doing so, the speaker disclaims responsibility for the communicative
gap that has emerged following the hearer's expression of surprise in b.
Having failed to prepare the hearer for the information conveyed by the
assertion in a., the speaker herself, by choosing the evidential forms, claims
not to have been aware of the relevant background details.
6. Planning the discourse

In planned discourse, evidentials are often used to prepare the immediate
information background for the presentation of a foregrounded
event. The participle represents a completed state, which constitutes a fixed
point of departure. The underlying event is backgrounded as attention is
distracted from its internal structure and focused on its accomplished result.
The information presented thereafter is thus promoted into the center of
attention:
(14) a. Sa e dokturi e maj bare besenas othe,
all the doctors the more big they-lived there
kesave eh/ Zidovuri.
such Jews
b. Aj kana eh/ o marimo zanes sa line len
and when the war you-know all they-took them
avri.
out
c. Aj jekh kher asilo taj amen othe
and one house remained:PART and we there
andre gelam.
in we-went (Mri/10/21-22)
a. All the most important doctors lived there, such eh/ Jews.
b. And when eh/ the war, you know, they deported them all.
c. And there remained a house and we moved in there.
(15) Kade kana djas la, avili voj khere,
so when gave her came:PART she home
asundas o imperato kado.
heard the king this (Mri/1/25)
And so, when she gave it to her, she came home, the king
heard about this.
In (14), the evidential forms are incorporated into a pre-planned paratactic
structure. In (15) the participle interrupts an open hypotaxis: in separating
the actual subordinated clause from the main clause, it delays the
initialization of the complex sentence, adding emphasis to the last part of
the utterance.
Such strategies for backgrounding information in discourse are found
to be configured on the sentence level in a widely occurring type of complex
construction. Here, contextually presupposed or other background
knowledge resurfaces in the subordinated clause in preparation for new
information, which is carried by the main part of the sentence. Background
and foreground functions are thus integrated within the structural complexity
of the hypotaxis:
(16) a. Aj phabardas o vine kakala sa aj ingerde
and burned the veins these all and they-brought
le ande spital.( )
him in hospital
b. Kade kana sastilo, sastilo, line
so when recovered:PART recovered:PART they-took
taj/ von taj nasle duj ene
and they and they-escaped two people
ando Cexo.
in Czech (Mri/1/53-55)
a. And he burned his veins, all these, and they brought him to
the hospital.( )
b. So when he recovered, he recovered, they got going and/ and
the two of them escaped to the Czech Republic.
(17) Anda kodo turne xacarena va ke
From this you you-understand:FUT already that
angla tranda-e-tri-to bers kana avilo Hitlero
before thirty-third year when came:PART Hitler
aba ando Njamco sas o nacijonalnosc.
already in Germany was the nationalism (RF/1/172)
From that you will have already understood that before the
year thirty three, when Hitler came, nationalism already
prevailed in Germany.
(18) Ando jekh-mija-ta-enja-sela-to bers, ande
in one-thousand-and-nineteen-hundredth year in
kodola bers kana astrdzili e industrijacija,
those year when began:PART the industrialization
astarde e Rom te arakhen peske thana
they-began the Roma that they-find REFL places
pasa foruri, ke ci maj nasavenas le kade
near towns because not more they-drove them such
sar maj anglal.
as more before (RF/1/127)
In the nineteen hundreds, in those years when
industrialization began, the Roma began looking for places
in the vicinity of the towns, since they were no longer driven
away as they had been before.
Note that in (17)-(18), evidentials appear in parenthetic subordinations
which mirror a date mentioned in the preceding part of the utterance. The
participle portrays a situation corresponding to the relevant point in history,
and it is not intended to represent the internal stretch of an event or action.
While verbalizing the subordination, the speaker seems to be already
aware of the propositional content with which he wishes to proceed.
Parenthetic subordinations, as in (17)-(18), prove to be carefully inserted
into a pre-planned utterance structure. Having already set the goal for the
speech action, the speaker treats supplementary background information
syntactically according to its pragmatic status: he subordinates it, allowing
for its incorporation into the utterance without revising the overall plan.
Moreover, by subordinating the additional description provided by the
evidential, the speaker is actually indicating that he already has at his
disposal a plan for the further development of the utterance.
7. Reformulating interactional evidence

In examples (11)-(12) we encountered evidentials used to paraphrase
preceding parts of the utterance. The last example in this paper, presented
in this section, is also connected to reformulating speech actions. Here,
evidentials are used to recapitulate an entire chapter of the story, while the
content of the story is treated as an accomplished result of the preceding
interaction.
The hearer H (represented in the transcript by the indented utterances)
is interviewing the speaker S on episodes in her biography. The interview is
being tape-recorded, and the excerpt quoted in example (19) is taken from
the beginning of a new cassette. The short break while changing cassettes
creates a gap in the interaction, which H attempts to close by returning S to
the point where her story was interrupted:
(19) a. Taj atunci,
and then
b. Las la o kko D/o kko Roko
took her the uncle the uncle Roko
c. o Roko las la.
the Roko took her
d. No,
right
e. Taj vov atunci kana anklisto avri o
and he then when came-out:PART out the
Roko sas ando kher le dilengo sute
Roko was in home of-the crazy they-threw
les, co dad,
him your father
f. Ja.
yes
g. ... pala kodo sar kurisardas tuke,
after that how beat you
h. Jo.
yes
i. Asilo kothe dopas bers, anklisto avri
stayed:PART there half year came-out:PART out
j. Aha,
k. Taj las peska romnja ( ), las peska romnja
and took his wife took his wife
aj gelotar.
and went-away:PART
l. Gelotar/ te phirel.
went-away:PART that travels
m. Aj gel/ gelo ke kodo Iva.
and went went:PART to this Iva
n. Ke kodo Iva.
to this Iva
o. Aj o Iva sas les o sefo Aleksandro.
and the Iva was him the chief Alexander
p. O Iva das duma te lel peske romni ( ),
the Iva gave word that takes REFL wife
zanes,
you-know
q. Kodo Aleksandro ...
this Alexander
r. Taj kuko/ aj les trade les ande Rumanija
and that-one and him they-sent him in Romania
coro, aj kana avilo...
poor and when came:PART
s. Abschiebung dine les,...
expulsion(GERMAN) they-gave him
t. Jo, taj kana avilo ande Rumanija...
yes and when came:PART in Romania
u. ... ke vov sas pe xoxamne lila kothe.
because he was on forged papers there
v. Jo. Taj kana avilo ande e Rumanija kothe
yes and when came:PART in the Romania there
Bego las les ando vast,
Bego took him in arm
w. Ale ci maj traisardas but.
but not more lived much
x. Kade sundam kodi vorba, kade so Rok/
so we-heard that word so what Rok
y. Kodo Iva gardila maj but sar panz bers,
that Iva hid more much as five years
z. Kana sunas/ sunelas ke kaj sam
when hear heard that where we-are
naselas,
fled (Mri/7/5-29)
a. And then,
b. Uncle D/ uncle Roko took her.
c. Roko took her.
d. Right,
e. And then when he came out, Roko was in a
psychiatric hospital, they sent him, your father,
f. Yes.
g. ... after he had beaten you,
h. Yes.
i. He stayed there half a year, and came out.
j. Aha,
k. And he took his wife ( ), he took his wife and left.
l. He left/ to travel.
m. And he wen/ he went to this Iva.
n. To this Iva.
o. And Iva had a boss, Alexander.
p. Iva was talking about getting married ( ), you know.
q. That Alexander...
r. And that one/ and him they deported him to Romania, the
poor guy, and when he came...
s. They gave him an 'Abschiebung' [=expulsion],
t. Yes, and when he came to Romania...
u. ... because he was there with forged documents.
v. Yes. And when he came to Romania this Bego helped him
out.
w. But he didn't live much longer.
x. Anyway, we heard this story, what Rok/
y. This Iva hid for more than five years.
z. When he hear/ when he heard where we were, he ran away.
The communicative function of the evidentials in the excerpt is easy to
reconstruct on the basis of a mapping of the discourse positions involved:
Segments a.-o.: Recapitulation by H
H dominates; he reformulates the story as told previously by S. With
the exception of segment b., where S participates in recapitulating
the story, S's role is generally restricted to signalling confirmation,
with the additional elaboration on a minor detail in segment l. (te
phirel, "to travel").
Segments p.-q.: Intervention by S
S first assumes an active role in contributing new information to the
story. H, however, continues, disregarding S's intervention.
Segments r.-u.: Announcement by S
S attempts to take over the narrator's role, announcing a new chapter:
she establishes a new scene by inserting a temporal subordination
(kana avilo, "when he came"). H loses the main turn, and is now
commenting on information provided by S in segment r. H's
comments interrupt S's attempts to start a new chapter in the story. S
repeats her announcement several times.
Segment v. to end: Presentation by S
S succeeds in taking over the narrator's role, completing the
announcement of a new chapter and entering the chapter itself.
How are the evidentials distributed among these positions in the discourse
excerpt? Quite simply: there are two types of evidentials. The first
appears in the recapitulating phase, in segments a.-o., and is used to mark
all verbs of motion or change of state that appear in the summarized story:
anklisto, "went out", asilo, "remained", gelotar, "left". The story itself is
thereby promoted to an accomplished result of the underlying narrative; it is
captured in its entirety as a completed fact. Rather than being developed
progressively from an internal or involved perspective, it is rediscovered
from an external, passive perspective, having already reached its terminal
point.
The second type of evidentials appears in segment r. (avilo, "came"),
and is repeated in segments t. and v. This type corresponds to the
evidentials discussed in the previous sections, i.e. to those used within a
complex sentential frame for planning the discourse: it appears in a
backgrounded, preposed temporal subordination, which is used to announce a
new chapter in the story.
In using evidentials to recapitulate an already told story, the speaker (in
example 19 the hearer, H) treats an entire interactional episode as
circumstantial evidence. He signals that his ability to outline the course
of the reconstructed events is not based on direct knowledge of those
events, but rather on the previous presentation of the entire story by another
speaker (in this case: S).
But as segment l. shows, even S, who is the source of the information,
participates in this mode of reformulating her own story. H and S thus both
approach the part of the story already told via its result, backgrounding the
entire chapter and treating it as a point of departure for a new phase in the
interaction. The evidentials in (19) can therefore be said to mark a specific
position within the pattern of narration, namely that involving mutual
control of common knowledge before new knowledge is transmitted.10
8. Conclusion
Using comparative typology and discourse pragmatics in a combined
approach, I attempted to illustrate the function of a grammatical category in
Romani which has until now been treated as an 'outcast' by most structural
descriptivists of the language. To summarize: the active participle conveys
the speaker's restricted knowledge of the internal course of events, and
concentrates instead on their outcome. The underlying event is approached
via external circumstantial evidence for its occurrence, such as the state
resulting from it.
Although evidentials can superficially be interpreted as an indication of
the speaker's access to or attitude toward knowledge, when taken in their
actual communicative context they turn out to appear when the speaker
needs to resort to circumstantial evidence in order to maintain assertive
authority, for instance in anticipation of the hearer's disbelief or criticism.
Operating within the basic functional scope of the form, the speaker is
therefore not indicating doubt, but rather consolidating interactional authority
in assertions the propositional content of which is, as described by Givón
(1982: 24), "open to challenge by the hearer and thus requires - or admits -
evidentiary justification".
As seen in a number of examples, notably in (3), (5) and in virtually all
the examples in sections 6 and 7, such functions may be typical of specific
positions in interactional patterns, and therefore contribute to the organization
of entire actions of speech in discourse. Evidentials thus prove to be
discourse-related categories, the scope of which reaches beyond qualifying
single propositional contents.
From a typological point of view, Romani occupies a special position
within Indo-Aryan. It shows inflected past tense forms with subject-concord
and no ergativity, corresponding to the eastern branch of the 'Outer'
languages (e.g. Bengali). On the other hand, the presence of an active past
participle with beneficiary-concord still links it, though remotely, to the
'Central' languages. The result is a mixed type, partly resembling that of
Gujarati, where an adjectival form of the participle in -l survives
alongside the perfect tense formation in -y (cf. Masica 1991: 300-302).
In functionalizing the old participle and deriving from it a past tense
evidential, Romani, or at least the Vlach variety discussed here, shares a
Balkan areal feature with Balkan Slavic, Albanian and Turkish, but it also
appears to join a vast linguistic area in which evidentials are
grammaticalized in the verb system. This area stretches from Central Asia
to the Baltic region in the northwest and to the Balkans in the southwest,
encompassing various Siberian, Turkic, Finno-Ugric and Caucasian
languages (see discussion in Haarmann 1970).
The fact that the evidential system in Romani is confined to the third
person singular need not necessarily be attributed to the donor structure (the
adjectival participle); it could also be appreciated as further proof of a
universal tendency to justify propositions that are less certain from the point
of view of 1) general human experience, and 2) the specific experience the
speaker and the hearer share (cf. discussion in Givón 1982). Such
propositions tend to exclude events in which the speaker or the hearer are
themselves active participants, for obvious reasons (cf. Anderson 1986:
277).
Although the process of language convergence in the Balkans provides
an attractive explanation for the emergence of the evidential category in
Romani, and indeed it is difficult, owing to the general development in
Romani, to view the existence of evidentiality in Romani and in other
Balkan languages as purely coincidental, this explanation is partly obscured
by a number of circumstances. First, as pointed out convincingly by
Boretzky (personal communication), there appear to be contemporary
Balkan dialects of Romani where use of the active participle is not subject
to speaker's choice, but where past-tense forms are arranged in a structural
complementary distribution, participles being employed with intransitive
verbs of motion, while inflected forms are used with all other verbs. This
could be taken as a reflection of an older stage of the language, before
evidentiality emerged, but it would indeed be difficult to explain why
Balkan convergent development 'skipped' such varieties.
Furthermore, it is not clear how exactly contact could have contributed
to the emergence of evidentials in the Vlach variety, since Vlach Romani of
the type discussed in this paper has had its most intensive contacts in the
Balkans with Greek and Romanian, both languages that lack evidentiality. It
is possible to counter this doubt with common sense, figuring that there
must have been contact at least with Balkan Slavic before reaching the
Romanian territories. But here the evidence presented by Kostov (1973) on
borrowed evidentials in Bulgarian Romani provides a second challenge,
since it is structural borrowing that is involved, and not functionalization of
the inherited stock of forms.
However the counter-arguments are pursued, I believe enough evidence
was presented in this paper to show that the active participle in Vlach
Romani is a functional category, and as such bears a certain typological
affinity to the 'evidential' categories in Balkan Slavic, Albanian and
Turkish, and that Vlach Romani in this respect shares an areal feature with a
number of historically contiguous languages. Given that documentation on
the early period of Romani which could be relevant to our discussion is
non-existent, it is not surprising that a diachronic hypothesis cannot succeed
in illuminating all points in a fully satisfactory manner.
Appendix
List of the evidential forms found in the Hamburg Romani Corpus
anklisto "came out"
arakhdzilo "was born"
areslo "arrived"
asilo "remained"
astrdzili "began"
avilo, avili "came"
dilajlo "gone crazy"
gardzilo "hid"
gelo(tar) "went (away)"
mulo "died"
nakhlo(tar) "passed (by)"
naslo(tar) "went through", "escaped"
phdzili "broke"
pardili "changed"
sastilo "recovered"
sichilo "learned"
xasajlo "disappeared"
zaklucisajlo "joined"
NOTES
1 Miklosich interpreted the personal endings of the inflected past tense as auxiliaries,
derived from the present copula form (cf. discussion in Bubenik, this volume).
2 It was Rüdiger, the pioneer of Romani linguistics, who first suggested that Romani
was using inherited Indic forms to "copy" the structure of a European contact language
(Rüdiger 1782: 71). Miklosich concluded from his survey of Romani dialects that the
primary contact language for all Romani dialects was Greek, and that a Greek-speaking
area had been the early European homeland of the Roma (Miklosich 1873: 4). Balkanic
features of Romani syntax are discussed by Kostov (1973: 105ff.), Friedman (1985),
Boretzky (1993: 98), and Matras (1994; in print).
3 In their editors' introduction, Chafe & Nichols (1986) regard evidentiality as a grammatical
device used by speakers to convey the source and reliability of their knowledge.
Likewise, Anderson (1986) defines evidentials as a justification of a factual claim. Typical
meanings associated with evidentials are hearsay and inference from an observed result (cf.
Haarmann 1970; Aksu-Koç & Slobin 1986). For a discussion of definitions and types of
evidentiality cf. Willett (1988).
4 Exceptions are animate as well as definite direct objects, which take the accusative
suffix -ko. Here the verb assumes a 'neutral' form.
5 According to Kostov (1973: 107-108), the Bulgarian participial suffix -l is borrowed
into the Romani dialect of Sliven, where it attaches to the verb, as in Bulgarian, in various
tenses (imperfect, preterite, anterior future) to indicate hearsay or indirect evidence.
6 Perfect tense forms are generally an attractive source for forming evidentials in language
since their resultative meanings may be drawn upon to focus on the result, rather
than the event itself (cf. discussion in Willett 1988; Haarmann 1970).
7 The data were collected as part of a larger study on Romani Grammar and Discourse
(see Matras 1994). I wish to acknowledge a University of Hamburg grant and support from
the German Research Foundation, provided through the Program on Multilingualism and
Language Contact at the University of Hamburg, which made it possible for me to collect
and analyze the data.
8 Anderson (1986: 277) even claims that evidentials are rarely used when the fact
reported on is directly observable by both speaker and hearer. This has been mentioned by
Givón (1982), who regards "deictic obviousness" as high on the certainty scale of propositions
where evidentiality is not required.
9 sich-, "to learn", is a transitive verb. Nevertheless it involves a change of state, since
the outcome of the process can be observed on the subject, a learned person.
10 See Rehbein (1984, 1989) and Ehlich & Rehbein (1986) for a theory of repetitive
discourse patterns.
REFERENCES
Aksu-Koç, Ayhan A. & Dan I. Slobin (1986) A psychological account of
the development and use of evidentials in Turkish. In: Chafe, Wallace
& Johanna Nichols (eds.) Evidentiality: The Linguistic Coding of
Epistemology. Norwood: Ablex. 159-167.
Anderson, Lloyd B. (1986) Evidentials, paths of change, and mental maps:
Typologically regular asymmetries. In: Chafe, Wallace & Johanna
Nichols (eds.) Evidentiality: The Linguistic Coding of Epistemology.
Norwood: Ablex. 273-312.
Boretzky, Norbert (1986) Zur Sprache der Gurbet von Pristina
(Jugoslawien). Giessener Hefte für Tsiganologie 3:1-4, 195-216.
Boretzky, Norbert (1993) Conditional sentences in Romani.
Sprachtypologie und Universalienforschung 46:2, 83-99.
Chafe, Wallace & Johanna Nichols (eds.) (1986) Evidentiality: The
Linguistic Coding of Epistemology. Norwood: Ablex.
Chatterji, Suniti Kumar (1926 [1970]) The Origin and Development of the
Bengali Language. London: George Allen & Unwin.
Ehlich, Konrad & Jochen Rehbein (1986) Muster und Institution.
Untersuchungen zur schulischen Kommunikation. Tübingen: Narr.
Friedman, Victor A. (1985) Balkan Romani modality and other Balkan
languages. Folia Slavica 7/3, 381-389.
Friedman, Victor A. (1986) Evidentiality in the Balkans: Bulgarian,
Macedonian and Albanian. In: Chafe, Wallace & Johanna Nichols
(eds.) Evidentiality: The Linguistic Coding of Epistemology. Norwood:
Ablex. 168-187.
Givón, Talmy (1982) Evidentiality and epistemic space. Studies in
Language 6-1, 23-49.
Gjerdman, Olof & Erik Ljungberg (1963) The Language of the Swedish
Coppersmith Gipsy Johan Dimitri Taikon. Uppsala: A.B. Lundequist.
Haarmann, Harald (1970) Die indirekte Erlebnisform als grammatische
Kategorie. Eine eurasische Isoglosse. Wiesbaden: Harrassowitz.
Hancock, Ian (1993) A Grammar of Vlax Romani. Austin: Romanestan.
Kostov, Kiril (1973) Zur Bedeutung des Zigeunerischen für die Erforschung
grammatischer Interferenzerscheinungen. Linguistique Balkanique
XVI-2, 99-113.
Masica, Colin P. (1991) The Indo-Aryan Languages. Cambridge:
Cambridge University Press.
Matras, Yaron (1994) Untersuchungen zu Grammatik und Diskurs des
Romanes (Dialekt der Kelderasa/Lovara). Wiesbaden/Berlin:
Harrassowitz.
Matras, Yaron (in print) Structural Balkanisms in Romani. In: Reiter,
Norbert (ed.) Sprachlicher Standard und sprachlicher Sub-Standard in
Ost- und Südosteuropa. Berlin: Osteuropa Institut.
Miklosich, Franz (1873) Über die Mundarten und Wanderungen der
Zigeuner Europas III. Wien: Karl Gerold's Sohn.
Rehbein, Jochen (1984) Beschreiben, Berichten und Erzählen. In: Ehlich,
Konrad (ed.) Erzählen in der Schule. Tübingen: Narr. 67-124.
Rehbein, Jochen (1989) Biographiefragmente. Nicht-erzählende
rekonstruktive Diskursformen in der Hochschulkommunikation. In:
Kokemohr, Rainer & Winfried Marotzki (eds.) Studentenbiographien I.
Frankfurt: Lang. 163-254.
Rüdiger, Johann Ch. Ch. (1782 [1990]) Von der Sprache und Herkunft der
Zigeuner aus Indien. Reprint of: Neuester Zuwachs der teutschen,
fremden und allgemeinen Sprachkunde in eigenen Aufsätzen, 1. Stück,
Leipzig 1782, 37-84. Hamburg: Buske.
Sampson, John (1926 [1968]) The Dialect of the Gypsies of Wales. Oxford:
Clarendon Press.
Willett, Thomas (1988) A cross-linguistic survey of the grammaticization of
evidentiality. Studies in Language 12-1, 51-97.
NOTES ON THE GENESIS OF CALÓ AND OTHER
IBERIAN PARA-ROMANI VARIETIES
PETER BAKKER
University of Amsterdam
0. Introduction
This paper deals with the different varieties of Romani spoken in or
originating in the Iberian peninsula, especially Caló (Spanish Romani).
Iberian Romani is taken as the collective name for the varieties of Romani
spoken on the peninsula. These are now all reported (perhaps unjustly) to be
extinct or close to extinct. The argument presented here is therefore based on
publications rather than fieldwork. Virtually all the material published shows
the complete loss of the inherited grammatical system and an adoption of the
grammatical system of the host region.
Below, I present data which may help uncover the genesis of Caló. First,
I discuss the mixed Romani dialects in general (section 1) and some historical
factors related to the Gypsies in the Iberian peninsula (section 2). Then, I
briefly discuss the place of Iberian Romani among Romani dialects (section
3). Next, I discuss some structural features of Caló, in order to assess the
nature of the mixture in Caló (section 4), and provide some data on the other
Para-Romani dialects of the Iberian peninsula (section 5). After that I briefly
mention the social functions of Caló (section 6). Furthermore, I discuss the
oldest sources of Iberian Romani in order to assess the possible origin
(section 7). With these historical, historical-linguistic, structural and
classificatory facts in mind, a hypothesis is formulated concerning the genesis of
the Iberian Para-Romani dialects, in particular Caló (section 8).
As yet, there has been no attempt to explain the genesis of Caló itself,
save in some more general overviews. These were based only on a limited
number of sources.
1. Para-Romani languages
To date, about ten varieties of Romani have been identified which
have a Romani lexicon but which have lost the native grammatical system.
Instead, these dialects use the morphosyntax of the surrounding languages
(see Bakker & Van Der Voort 1991 and especially Boretzky & Igla 1994).
These are called Para-Romani languages, a term originally coined by Marcel
Cortiade. They roughly display the following characteristics: the vocabulary
is Romani (hence basically Indic), but nearly all the phonology, morphology
and syntax are non-Romani. Although all Romani dialects borrow heavily
from the languages of the host country, no cases are reported of languages
with a Romani grammatical system and a non-Romani lexicon. Until now,
Para-Romani languages have been described, documented or mentioned in
the literature in connection with the grammatical systems of Swedish,
Norwegian, German, Catalan, Portuguese, English, Basque, Spanish, Greek,
Persian, Turkish, and Armenian. All these varieties must be seen as
languages in their own right, and not as dialects of Swedish etc. or Romani.
Several theories have been put forward to account for the genesis of these
dialects: saving a dying language by preserving the lexicon, gradual massive
grammatical borrowing, the conscious creation of a mixed language,
relexification, and language intertwining (the combination of one lexicon
with the morphosyntax of another language). Not all of these are
mutually exclusive, but the debate on their genesis is still going on.
The Para-Romani language of the Spanish part of the Iberian peninsula is
relatively well documented. Both speakers and outsiders have written down
and published vocabularies or grammatical studies of this language. Almost
all of this, however, is in languages other than English. This study will
introduce this language to an English-speaking audience.
The language is generally called Caló [kalo] in the literature.
Undoubtedly the Romani word kalo "black" is the source. It is used both as an
ethnic self-designation and as a name for the language. Speakers also call it
Romano (Quindal 1867: 49), a nominalized adjective derived from the noun
Rom 'Gypsy, man' with the Romani suffix -ani/-ano.
Caló as a Para-Romani language should not be confused with other
languages and slang varieties which are also called Caló. All these languages
have in common that they are cryptolectal or in-group languages embedded in
Spanish structures. This name for non-Romani languages is reported from
places as far as California (Polkinhorn et al. 1983). Although such
cryptolectal languages may have borrowed some words from Romani or Gypsy
Caló, they will not be discussed here. When we mention Caló in this paper, it
refers exclusively to the Gypsy language of that name, as spoken by people
of Gypsy descent in Spain, Portugal and South America.
As a Para-Romani language, Caló is particularly interesting for three main
reasons. First, there is a lot of material, covering two or perhaps even three
centuries. There is no other Para-Romani dialect documented over such a
long period of time.
Second, in contrast to most other Para-Romani languages, Caló also
makes abundant use of cryptic devices, apparently meant to conceal the
meaning of the communication from outsiders.
Third, apart from Caló, which has an (Andalusian) Spanish grammatical
system, there appear to be several other varieties of Para-Romani languages
which came about under contact with languages of the Iberian peninsula, viz.
one with a Catalan grammatical system, one with a Portuguese grammatical
system (in Brazil), and one with the non-Romance language Basque. These
varieties all seem to be derived from one branch of Romani as they all share
some lexical particularities apparently not shared with other varieties of
Romani (see section 4).
2. Gypsies in the Iberian peninsula

If the description of "Egyptian" acrobats and jugglers by the Byzantine
writer Nicephorus Gregoras indeed refers to Gypsies, the first Gypsies may
have reached the Iberian peninsula in the first decades of the fourteenth
century (Fraser 1992: 48). The earliest undeniable reference to Gypsies dates
from 1425, when a group was granted safe-conducts in Aragon and Asturia
(Fraser 1992: 76). They are first mentioned in Andalusia in 1462, in
Portugal in the first decades of the sixteenth century, and in Navarra in the 16th
century. Since then there has been a continuous presence of Gypsies on the
peninsula.
The number of Gypsies today in the Iberian peninsula is estimated at
between 300,000 and 700,000. They live in both urban and rural areas, with
major centres in Madrid and Andalusia. For all of them (with the exception of
some newly arrived groups), the dominant and first language today is
Spanish, particularly the Andalusian dialect. Many Gypsies know at least
some Caló words, but even the most knowledgeable speakers may not know
more than fifty or a hundred words (McLane 1977, 1985). There seems to
have been an ongoing decline of the languages for a long time, resulting in a
large-scale shift to Spanish, with an occasional Caló word.
3. Lexical classification of Iberian Romani among Romani dialects
Like any other language, Romani consists of dialects. These are grouped
into clusters of dialects, but there is no single accepted classification. In
general, these classifications are based on three factors: internal sound
developments in words inherited from Indic and Persian languages, shared or
different grammatical features and the main source languages of borrowed items
(hereby excluding the language of the host country). The first two are of
course common in any dialect classification, but in the Spanish case the
grammatical system is lost and hence cannot be used. The use of borrowed
items is rather typical of Romani dialectology alone.
On grounds of the lexicon (phonological peculiarities of inherited words
and source language of the borrowings), Caló and the other Iberian dialects
have been classified differently (Kaufman 1979, cf. Hancock 1988). There
are a few words used in the Iberian dialects of Romani which deviate in
certain ways from all other varieties. This means that their classification as a
different branch seems justified. The clearest cases are the following:
(1) The words for "boy" and "girl" are raklo and rakli in Romani, but lakri in
the Basque Country, lacro/lacrin in Brazil, lacro and lacri in Caló.
(2) The words for "father" and "mother" are dad and dai in common Romani,
but bato and bata in Brazil, bato and bati in the Basque Country, bato and
bata in Caló, bato (and dai) in Catalonia.
Further research may reveal other such Iberian similarities. On the
grounds of the lexicon, different classifications have been proposed. Most
researchers classify Iberian Romani as relatively isolated among Romani
dialects: Kaufman (1979) makes it a separate branch of European Romani, on a
par with the Northern, Balkan, Vlach, Greek and Zargari dialects. Kenrick
(see Hancock 1988) classifies Iberian Romani as a Balkan dialect on a par
with Vlach and southern Balkan Romani. Boretzky (1992) shows that Caló
has a significant number of words in common with the Northern dialects, in
particular Sinti and Angloromani. Iberian Romani, therefore, seems to be
historically a separate branch, most closely related to the Northern dialects
and non-Vlach Balkan dialects.
4. Some Romani- and Spanish-derived features

Caló is not a spontaneous ad-hoc mixture of Romani and Spanish. The
Spanish part (at least in the 19th century) differs from the varieties of Spanish
spoken locally, and in some cases Spanish is not spoken locally (e.g. in
Portugal, see below). In some cases, Gypsies speak archaic varieties of a
language, e.g. the variety of Catalan spoken by the Gypsies near Perpignan
in North Catalonia. In this section, I will present some of the differences.
4.1. Andalusian features

The Spanish component of Caló is not ordinary standard Spanish but
rather the Andalusian dialect. This is true to a certain extent for all sources of
Caló, not only in Andalusia but also in Madrid. Apparently, Caló
originated in Andalusia, from where it spread to other areas. In Portugal, for
instance, Calão has an Andalusian Spanish rather than Portuguese base.
Also, Caló retains some archaic features lost in modern Spanish. For instance,
old Spanish ende "since", modern Spanish desde, Caló ende (Keller 1892).
A Catalan influence is also suggested in words such as matejo "self", Catalan
mateix, Castilian mismo.
Andalusian Spanish differs from standard Spanish in a number of points.
a) Spanish /ʎ/ (orthographic 'll') is pronounced [j].
b) Final consonants tend to be dropped, even those of grammatical endings
(for instance plural -s).
c) The phonemes /s/ and /θ/ are to an extent in free variation.
d) /d/ and /g/ between vowels are often dropped, e.g. Spanish asadura, And.
saura, Spanish pasado, And. pasao. This is also found in other Spanish
dialects.
e) 'Confusion' between (some?) /r/ and /l/. Spanish el, And. /er/ or Id.
f) Alternation (in some words) of Cast. /h/ and And. /f/.
g) prothetic morphemes a-, des-, en-, es-.
The fact that we find these Andalusian features in all Caló varieties
suggests that they came into being in Andalusia and later spread to other parts of
the Iberian peninsula. (For details, see Boretzky 1992: 32). Catalonian Romani
Castilian (Boretzky 1992: 29-34).
4.2. Phonology
The phonology of Caló is Spanish; all Romani words are adapted to the
phonology of Andalusian Spanish. This means that Romani phonemes not
existing in Spanish are lost in Caló. For instance, in Caló there are no
aspirated stops. Romani aspirated stops become unaspirated stops in Caló
(except /th/, which becomes /č/). It is clear, however, that the aspirated stops
were still used when the Gypsies arrived on the Iberian peninsula. The
aspirated /th/ of Romani became /č/ rather than /t/, for instance chem /čem/
'land' < Romani them, chute /čute/ < Romani thud 'milk' (Boretzky 1992).
Further, the voiced affricate /dž/ became /č/ in Caló and /v/ became /b/, etc.,
following Spanish phonology.
One of the features of Caló inherited from Romani is its stress pattern, or
at least some aspects of it. Whereas Spanish words rarely have stress on the
final syllable, this is commonly so in Romani and in Caló (inherited from
Romani). See, for instance, the stress markers in example (1) below. This
may give Caló a distinct flavour.
Phonotactic constraints seem, at first sight, to be the same in Caló and
Spanish, in that the syllable structure is identical. However, there are
exceptions as well: in Caló there are words ending in -m, but these do not seem
to exist in Spanish. A recent study suggests that the phonotactic constraints of
Spanish and Caló are not completely identical. This thought-provoking study
by Dietz and Mulcahy (1988) compares the combinations of letters in a Bible
chapter in Castilian and Caló. They conclude on the basis of statistical
evidence "that Caló and Castilian differ greatly in the way they form and
distribute basic linguistic particles" (i.e. 'letters' or perhaps, by implication,
phonemes). For example, in the samples of the same text in Spanish and Caló
versions, the phoneme /č/ (the digraph "ch") was counted 2,056 times in the
Caló text and only 242 times in the Spanish text, so that the Caló count is
nearly 850 % of the Spanish one (Mulcahy & Volland 1986: 144 n. 4).
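The arithmetic behind the quoted percentage can be checked in a few lines. The following is a minimal sketch in Python that uses only the two counts cited above; the variable names are illustrative and are not taken from the cited study.

```python
# Counts of the digraph "ch" as quoted above (Mulcahy & Volland 1986: 144 n. 4).
calo_count = 2056
spanish_count = 242

# The Caló count expressed as a percentage of the Spanish count.
ratio_percent = calo_count / spanish_count * 100

# The relative increase of the Caló count over the Spanish count.
increase_percent = (calo_count - spanish_count) / spanish_count * 100

print(f"Caló count as % of Spanish count: {ratio_percent:.1f}")
print(f"Increase over Spanish count (%):  {increase_percent:.1f}")
```

On these two counts, a figure of nearly 850 % corresponds to the ratio of the Caló count to the Spanish count; read as a relative increase, the same counts give roughly 750 %.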
How can we explain this? There are several possibilities. With very few
exceptions, Caló phonemic structures are possible in Castilian. This is to be
expected, since the Romani words are adapted to Spanish phonology. It may
be, however, that the frequency of certain combinations of phonemes differs
between the two languages because the lexicon of Caló remains basically
Romani and follows some non-Castilian features of Romani. For instance,
Romani /c/, /č/, /čh/ and /th/ all became /č/ in Caló, which could explain the
high proportion of /č/. In fact, this is what the results of Dietz & Mulcahy's
study suggest.
Whatever the explanation is, we agree that Caló cannot simply be a
'Gypsified' version of Castilian (Dietz & Mulcahy 1988: 15).
4.3. Morphology
Caló uses Spanish derivational affixes, such as the diminutives -ico, -ito,
the superlative -ísimo, the adverbial ending -mente, etc. Apart from that, it
uses a number of prefixes (apparently without any meaning) on verbs, such as
a-, en-, de-, des-, some of these derived from Andalusian Spanish.
Inflection is also Spanish. For instance, it has the nominal plural
inflection -s and whole verb paradigms from Spanish. In Caló, all verbs are
conjugated according to the class of Spanish verbs ending in -ar, which is the
most regular and unmarked class. This class is also used for borrowed verbs
in Spanish. From this point of view, the Romani verb in Caló behaves like a
borrowed verb in Spanish.
Although Caló morphology is so close to Spanish, it does preserve some
Romani morphemes, productively or not. A small number of these are used
in Caló. First, there is the suffix -pen, forming abstract nouns from verbs and
adjectives. Second, there are the gender suffixes -i (F) and -o (M) used with
adjectives in Caló and nominalized adjectives. Whereas the nouns with these
endings in Romani are probably lexicalized forms, if they were used with
adjectives they would probably still be productive. Third, there are also
instances of the noun plural suffixes -ia(s) for feminine nouns and -e(s) for
nominalized adjectives, followed by the Spanish plural suffix -s.
Some authors also use Romani case endings, probably only
non-productively. In any case, these inflected pronouns are not used as one
would expect considering their function in Romani: Caló uses datives or
genitives with subject function, and this is ungrammatical in Romani. There
are several instances of the case marker -ha for the comitative/instrumental
case, as in romi-ha, dal-ha, sila-ha. One finds the -ha endings for instance in
Sinti and Balkan and central dialects. It is undoubtedly related to the Caló
preposition sar 'with' and the Romani instrumental case -sa(r) 1.
There are more 'frozen' Romani morphemes in Caló: many Caló verbs
have the Romani personal conjugation markers -av and -el between the verb
stem and the Spanish conjugation. They are derived from the Romani first
person singular ending and the third person singular ending respectively, but
they do not function as such. To these, Spanish inflection is added. Quindalé
(1867) mentions a semantic difference between verbs with the -el- element
and those without, based on the same root. Those with -el- denote a more
intensive action, e.g. chinar "to cut", chinelar "to harvest"; querar "to do",
querelar "to execute". This is an innovation in Caló; it does not exist in either
Spanish or Romani. Caló apparently makes use of Romani inflection to
denote 'aspect' or 'Aktionsart'.
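The pattern described in this section (stem, optional frozen -el- element, Spanish -ar inflection) can be sketched mechanically. The following Python fragment is only an illustration under stated assumptions: the function names are hypothetical, the endings are those of the standard Spanish present indicative of -ar verbs, and the generated present-tense forms are outputs of the rule, not forms attested in the sources.

```python
from typing import List

# Present indicative endings of the standard Spanish -ar conjugation
# (an assumption about which paradigm to illustrate; the sources cited
# above only state that Caló verbs follow the -ar class).
AR_PRESENT_ENDINGS = ["o", "as", "a", "amos", "áis", "an"]


def infinitive(stem: str, intensive: bool = False) -> str:
    """Build an infinitive: stem, optional frozen -el-, Spanish -ar."""
    return stem + ("el" if intensive else "") + "ar"


def present_forms(stem: str, intensive: bool = False) -> List[str]:
    """Attach the Spanish -ar present endings to the (extended) stem."""
    base = stem + ("el" if intensive else "")
    return [base + ending for ending in AR_PRESENT_ENDINGS]


# The two verb pairs mentioned above.
for stem in ("chin", "quer"):
    print(infinitive(stem), infinitive(stem, intensive=True))

# Mechanically generated paradigm, for illustration only.
print(present_forms("chin"))
```

The sketch treats -el- as an optional element inserted between stem and inflection, which mirrors the description above; it makes no claim about which verbs actually take the element.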
Gender assignment in Caló deserves a special study. Here, I will just
present some observations. Both Romani and Spanish distinguish masculine
and feminine nouns. Assignment of gender to nouns in Spanish is partly
based on the semantics (especially with animates), partly on the phonological
structure of the stem, especially the final vowel or consonant. In Caló, this is
also the case. Quindalé (1867: 51) says that words ending in a consonant or
in the vowels -, -e, -o, or -u are masculine, whereas words ending in -i, -a,
or -i are feminine.
4.4. Lexicon
McLane (1977, 1985) recorded a few hundred words considered Caló by
the Gypsies of Guadix and other parts of Andalusia and Spain. He calculated
(1985: 188) that 68 % of the lexical items in his corpus are of Indic (hence
Romani) origin and 22 % of Spanish origin (many of those distorted). We
also find the usual sources of non-Indic words in Caló from the pre-European
layer of loan elements present in all Romani dialects, such as Armenian,
Persian, Greek and Slavic languages. Boretzky (1992), using a wider
vocabulary, detected words from Slavic languages, in some cases specifiable
as South Slavic, Czech or Polish (Caló dosta "enough" < South Slavic dosta;
Caló kornes "laced boot" < Czech skornje; Caló stajnia "horse-stable" <
Polish estanja). There are also a few Hungarian words in Catalonian Romani,
such as arany "silver" < Hungarian aranj and cin "colour" < Hungarian szín.
Many of these words, however, are not specific to Caló; in fact they can be
found in other Romani dialects as well. Boretzky further identified a number
of words which Caló has in common with the Sinti and Angloromani dialects
(Boretzky 1992: 16), the so-called 'northern dialects'. Further, there are words
taken from Germania (Spanish Cant) in Caló (see below 6.3).
It should be mentioned that Iberian Romani also seems to have Greek
(and perhaps Persian) words not attested in other Romani dialects. A few
others have been identified as Arabic (e.g. (j)azari "ten" < Arabic ?ara;
Moroccan variety). Overall, the Caló lexicon is highly aberrant among
Romani dialects because of the singular phonological development, the
cryptic devices used (see below) and the relatively large number of words of
unknown origin.
4.5. Word Order

Word order is the same as in Spanish.
4.6. Innovations
Although there is a clear Romani component and a clear Andalusian
Spanish component, there are also elements which are neither. I focus on two of
these: the use of innovative place names and the use of cryptolalic
formations.
4.6.1. Place Names

Local place names are among the first elements to be used by immigrant
groups when using their native language in the land of immigration.
Nevertheless, the Gypsies in Spain did not (always) do so. They formed new place
names, many with unclear etymons (see below), instead of adopting Spanish
place names. This list of place names is extracted from Sales Mayo (1870).
Some Caló place-names:
Barcelona: bajari (Sales Mayo 1870:8)
(Barcelons: bajan) (Sales Mayo 1870:8)
Extremadura: chim ye manr (Sales Mayo 1870:24)
Granada: meligrana (Sales Mayo 1870:47)
Guadalquivir: len bar (Sales Mayo 1870:43)
Habana: boban (Sales Mayo 1870:13)
Jerez: borbreo (Sales Mayo 1870:13)
Judea: bordajia (Sales Mayo 1870:13)
(Judio: bordaj) (Sales Mayo 1870:13)
Londres: llundun (Sales Mayo 1870:45)
Madrid: madrilati (Sales Mayo 1870:13, 45)
Sevilla: safacoro (Sales Mayo 1870:13)
Some of these are clearly circumlocutions, such as len bar "big river"
for Guadalquivir, and chim ye manr "land of the bread" for Extremadura.
Others are distortions of the existing place names, such as madrilati for
Madrid and llundun for London. Some of them are cryptolalic formations,
such as boban for Habana. It is a pun on Spanish haba "bean" and Caló bobi
"bean": one part of the word is replaced by the Romani word with the same
meaning (see below for other distortions of this type).
This indicates that speakers intended to make their speech unintelligible to
outsiders by avoiding borrowing, and instead making up new words. We
find the same phenomenon with place names in Angloromani (Hancock
1984a). However, it is not typical only of Para-Romani dialects: the inflected
dialect of Finland and the Sinti dialect in Germany also have cryptic forms for
place names.
4.6.2. Cryptolalic formations

Cryptolalic formations are not limited to place names. Wagner
(1937-1938) discusses a score of these cryptolalic forms for a variety of words.
Usually, when a Spanish word contains a syllable which is also an existing
Spanish word, then the small word is translated into Romani. For instance,
the Spanish name for "March" is Marzo. This word contains the sequence
mar, identical to the Spanish word mar "sea". In Caló, the sea is loria
(Romani dorjav), hence Marzo becomes Loriazo. Another example is the
Caló word for "namesake". In Spanish it is tocayo. This closely resembles
the Spanish verb tocar "to touch". The Caló verb bajamb-ar means "to
touch", so the Caló word for "namesake" becomes bajambayo. Another
example, mentioned by Keller (1892: 171), is Caló sardenar "to condemn",
from Spanish condenar, both first syllables meaning "with". A final example
is the word for "die, dice" in Caló. The Spanish word is dado, which
happens to be homophonous with the Spanish word for "given" (past
participle). In Caló, "given" is diao, the past participle form of the verb
diar. The Caló noun for "die" is therefore also diao.
Cryptolalic formations like these may be devices to keep the language
unintelligible to outsiders, such as those who learned some of it (see above).
Such processes are common in secret languages.
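The substitution process described in this section can be sketched as a simple string operation. The following Python fragment is a minimal illustration, assuming that the embedded Spanish element and its Caló equivalent are supplied by hand; the function name is hypothetical, spellings follow the forms as printed above (in lower case), and nothing here models how speakers actually selected the element to be replaced.

```python
def cryptolalic(spanish_word: str, embedded: str, calo_equivalent: str) -> str:
    """Replace the first occurrence of an embedded Spanish element
    by its Caló equivalent, leaving the rest of the word intact."""
    if embedded not in spanish_word:
        raise ValueError(f"{embedded!r} does not occur in {spanish_word!r}")
    return spanish_word.replace(embedded, calo_equivalent, 1)


# (Spanish word, embedded Spanish element, Caló equivalent), as discussed above.
examples = [
    ("marzo", "mar", "loria"),    # mar "sea" replaced by loria
    ("tocayo", "toc", "bajamb"),  # stem of tocar replaced by stem of bajamb-ar
    ("condenar", "con", "sar"),   # con "with" replaced by sar
    ("dado", "dado", "diao"),     # dado "given" replaced by diao
]

for word, part, equivalent in examples:
    print(word, "->", cryptolalic(word, part, equivalent))
```

Only the first occurrence is replaced, since in each of the examples above the translated element is the initial part of the word.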
5. The four Iberian Para-Romani languages

Caló is not the only Para-Romani dialect of the Iberian peninsula. In
Bakker (1991) I showed that the Romani dialect of the Basque Country was
(or is) a mixed dialect like Caló, a Para-Romani dialect or an 'intertwined'
language. It has a Basque grammatical system and a Romani lexicon. It is
therefore structurally different from Caló, especially since Basque is not an
Indo-European language.
It is less well known that Catalonia also had such a mixed language, with
(again) a Romani lexicon and a Catalan grammatical system. This differs
from the Catalonian Romani dialects described in Ackerley (1915), which
show a reasonably intact Romani grammar (as the only dialect of the
peninsula). There are a few texts in a novel in Catalan concerning Catalonian
Gypsies that are written in Catalonian Para-Romani. The following text is
illustrative. The text and its Catalan translation are both from Vallmitjana
1908 (as cited in Leblon 1982: 63). The analysis and interpretation are mine2.
Catalonian Para-Romani        Catalan in source
De la mutzi d'un aranu        De la pell d'un gat
van nyisquerb un dical        ne varen treure un mocador
un busn en diquelava          un home s'ho mirava
panant: Quin sambanban        dient: ¡Que es bonic!
"They go to take of a neckerchief made of a cat's skin. A non-Gypsy
saw it, saying: How beautiful!"
Interpretation:
De la mutzi d'un aranu    Van nyisquerb      un dical
of the skin of-a cat      they-go take-off   a neckerchief
un busn       endiquel-ava   pan-ant:   Quin   sambanban
a non-Gypsy   see-3.PAST     say-ing:   what   beautiful
The Romani-derived words are mutzi (< R. morthi "skin"), dical (< R.
diklo "neckerchief"), nyisquerba (< R. (n)ikalav "to take off"), busn
"non-Gypsy" (< R. buzno "goat"), endiquelava (< R. dikhel "he sees"), panant
(< R. penel "he says"). The words aranu and sambanban are of unknown
origin.
The Gypsies in Portugal are reported to have spoken (or to speak) a variety
close to Caló, though slightly influenced by Portuguese (Coelho 1892).
These Gypsies must have come to Portugal from Andalusia. In the following
example from Coelho (1892: 9), the Calão example has Spanish rather than
Portuguese function morphemes such as the verbal inflection and the
possessive pronoun mi vs. meu, the personal pronoun yo vs. eu and the
question word quien vs. quem:
(1) Ai! mi patarr maro, a quien me combisar yo? (Calão)
    Ai! meu pae morreu, a quem me encommendarei eu? (Portuguese)
    Ai! mi padre morto, de quien me confiar yo? (Spanish)
    Ai my father dead, to who REFL rely 1.SG.NOM
    "Ai my dead father, on whom can I rely?"
Gypsies in Brazil spoke (or speak) a Para-Romani language with a
Portuguese grammatical system and a Romani lexicon. Moraes (1885, 1886)
gives only a handful of songs, totalling twenty lines, in his study, all
discussed by Sowa (1889). I will give one of those here as an illustration. The
text and the English translation are from Sowa, the interpretation is mine:
Brazilian Para-Romani         English source
Quando, de, tu merinhaste     When thou diedst, O mother,
mana tambem merinhou          I too died, -
em tanto nachadipem           into such a forlorn condition
de mena tudo jalou            Have I wholly gone
Interpretation
Quando, de,       tu    mer-inh-aste   mana   tambem   mer-inh-ou
when, o mother,   you   die-?-2.PST    me     too      die-?-1.PRES
em tanto   nachadi-pem   de mena   tudo      jalou
in such    ill-ness      of me     totally   go-1.PRES
Here the content words are also Romani: de (< R. dai), merinhaste (< R.
merel "he dies"), mana (< R. manca "with me"), nachadipem (< R. nasul
"evil" or nasvalo "ill", plus abstract noun suffix -pe(n)), mena (< R. manca
"with me"), jalou (< R. džal "he goes"). The -inh- suffix is unclear. It may be
related to the -in- element found in some words in Caló between the
(Spanish-derived) stem and the inflection, which is also used in other Rom
37-38).
Did the four varieties of Iberian Para-Romani (Basque, Spanish, Catalan
and Brazilian) come into being independently or was the vocabulary
transferred from one of them to the others?
There are several indications that they did come into being independently.
All dialects use case-inflected pronouns (with Romani case endings) as the
unmarked pronoun. The intriguing thing is that all dialects use different
cases: Portuguese/Spanish Caló uses the dative (sometimes the locative),
Basque Romani the nominative and Brazil the comitative. There is not enough
data on the Catalonian dialect, but two other European Para-Romani dialects
again use different cases: Scando-Romani of Sweden uses the ablative and
possessive, and Angloromani the locative. These results are summarized in
Table 1.
Table 1: Personal pronouns in some Para-Romani dialects.
Andalusia:   mangue     < mange    DAT
             menda      < mande    LOC
Portugal:    mangue ?   < mange    DAT
Basque C:    me         < me       NOM (also once: amenge DAT)
Brazil:      mena       < manca    COM
Britain:     mandi      < mande    LOC
Sweden:      mannder    < mandar   ABL
             mero       < miro     POSS
It is unlikely (but not at all impossible) that the people who had already
lost the Romani grammatical system would adopt a different grammatical
system when moving to a different region. The Portuguese Gypsies in any
case preserved the Spanish grammatical system with no shift to Portuguese.
The fact that the pronouns are taken from different inflected forms, also
points to an independent development: once lost, the other pronouns could
not have been recovered. It is further apparent that the forms selected for the
root of the verbs (imperative or first or third person singular) are not parallel
in the different languages (although there is not enough data from Catalonia).
This strongly suggests that these dialects came about independently,
except for the Portuguese Para-Romani, which is an extension of Andalusian
Caló.
6. Social functions of Caló

In a number of respects Caló is unusual from a sociolinguistic point of
view. First, it is considered a secret language. This it has in common with a
number of other Para-Romani dialects (cf. Kenrick 1979 for Angloromani).
Second, Caló, unlike other Para-Romani languages, was known by many
non-Gypsies as well. Many people with first-hand knowledge of the
language have noted this. I discuss this below (6.1). Third, the language has
been called 'close to extinct' during the past century and a half. This is also
discussed below (6.2). Finally, there is a strong interaction with secret
languages such as Germania (Spanish Cant). There are many cases of
mutual borrowing and it is often hard to keep the two languages apart. This is
discussed in (6.3).
6.1. Caló known by non-Gypsies

Many sources claim that non-Gypsies living near Gypsies also spoke
Caló. This was the case in the 19th century and also in the late 20th century.
Brown (1922) remarked: "In no other country is a knowledge of Gypsy as
widespread as among the inhabitants as in Spain." The people who knew
Caló were local inhabitants, including policemen. One of the most recent
field-researchers, Merrill McLane, notes the following: "There are also poor
Castilians who know more Caló than many Gypsies" (1977: 304). He also
states that "Caló has long been used by Gypsies and non-Gypsies in trading
horses, mules and burros" (McLane 1985: 190).
It is not surprising that languages of marginal groups also contain Caló
elements, such as the language of the non-Gypsy nomads, the Quinquis, and
different varieties of slang and argot such as Germania.
6.2. Caló a dying language?

The Caló language was already considered close to extinct in 1843, when
Borrow wrote that Caló was "at the last stage of its existence" (cited in
McLane 1977). Comments like these have appeared ever since, for
instance in Colocci (1888: 289), who wrote:
As to their language, the greater part of the Gitanos at the present day
speak Spanish, and they employ the Spanish phraseology, only
substituting some Caló words, and modifying some Spanish words with
the terminations saro-sara and une-una. (..) The true Caló still exists, in
the precise sense of the word. But only a limited number of these words
are now used; the rest is Castilian. Each individual Gitano knows only a
small portion of it. Nevertheless, in my conversations with Gitanos,
above all with those of Sierra Morena, and, in particular, with their old
people, I have collected - here or there - some hundreds of words, which,
perhaps, I shall one day publish.
This already sounds very similar to what McLane wrote almost a century
later, when he stated that the language is "in its final step towards extinction"
(McLane 1977: 303). It therefore seems that this language has been
languishing for a long time. It is dying because the words are less and less
used and increasingly forgotten and replaced by Spanish words.
6.3. Germania and Caló

Germania is the most important secret language of Spain. Its grammatical
system is that of Spanish, and its lexicon stems from various sources. Some
words are distorted Spanish or words from foreign languages (including
some from Romani), but the bulk of the vocabulary seems to be of unknown
origin. I am not aware of an etymological study of Germania. George
Borrow stated that "[b]y far the greatest part of the vocabulary [of
Germania] consists of Spanish words used allegorically, which are, however,
intermingled with many others, most of which may be traced to the Latin and
Italian, others to the Sanscrit or Gitáno, Russian, Arabic, Turkish, Greek,
and German languages" (Borrow 1843: 117). In this it is similar to the Cant
in Britain.
The name 'Germania' is a distorted form of a foreign language's name,
as is often the case with this type of secret language (for example Dutch
'Bargoens' < Burgundy; Danish 'Keltring' < Celtic; German 'Rotwelsch' <
Welsh or Walloon).
In all sources of Caló, we find Caló and Germania words in the same
texts and vocabularies. It seems that the two languages were, from a very
early date, used in similar situations, influencing one another. In older texts
in Germania we already find a few Caló words, i.e. words of Romani origin.
In Hill (1945) Germania texts from 1609 are quoted which already contain
some Romani words. The oldest sources for Caló also contain a number of
Germania words. Hill's (1921) word list (see 7.1), supposedly from the 17th
century, contains Germania words like pio for "teeth" (Caló dai, Romani
dand) and gumarra "chicken" (Germania gura, Caló cai, Romani kaxni or
khajni).
The earliest published Caló materials, those published by Bright (1818),
also contain a number of Germania words.
6.4. Variability: function words in different sources

Like any language, Caló is not homogeneous, neither in time
nor in space. There are words which are strictly in regional usage, and there
are also differences in what is taken from Spanish and what from Romani.
This can be illustrated by the function words. Table 2 shows a number of
function words from different sources. The first three columns give the
English words and their Spanish and Romani equivalents. The other columns
indicate a selection of sources and what words are given in these sources for
the function words discussed.
Table 2: Some function words in several Caló sources (preliminary overview)
ENG SPA ROM 1818 1841 1848 1853 1870 1900 1915
BR. BORR. CAMP. JIM. SALES TINEO PABANO
(do)not no na no no ne na, ne na(nai) ne

a un yekh un yes yes, yeque yes yes, yequi yes, yequi
and y thaj y ta y, ta ta,y
by por per per per per, pre? per, pre per, pre
for para pa, somia pa pa pa, somia pa, somia
in, on en and en on an, on on on an, on on
me, I me/yo me,mange me, nu nu me, mangue me; na
nor ni na na, ne na ne
of de de de e e e e(PL es) e
of the M del del dor ya dor dor
of the F de la de la d la ya ya ya (PL yas) ya (PL yas)
one, some uno/a/s yesque/i yesque/i yequ, yes yeque yesqui/e
or 0 vaj 0 0 0
that que sos que sos sos sos sos sos
the M el o/e el o, or or o, or o, or o, or
the F la e/i/la la la a a a a a
the PL.F las e/o las ar, as ar, as as ar, as ar, as
the PL.M los e los os, 1er 1er, os os, 1er lor, os 1er, os
to the M al al or aor
(BR = Bright; BORR = Borrow; CAMP = Campuzano)

All the functional elements are of Romani or Spanish origin except for the
italicized forms somia for "for" and nu for first person singular, both of
which are of unknown origin.
One result is clear: the sources show no evolution from a more
Romani-oriented variety towards a more Spanish-oriented variety. The first text
(Bright 1818) has almost exclusively Spanish function words, whereas later
texts have more function words from Romani. In fact, the number of
Romani-derived function words used in sources after 1850 is greater than before
1850, contrary to what one would expect if Caló is seen as a gradual
evolution away from Romani towards Spanish. This could result from
regional differences, however, or increasing purism.
For the final analysis, all sources should be studied, and in a more
thorough way than was possible within the scope of this paper.
7. The oldest sources of Iberian Romani

If we want to explain the genesis of Iberian Romani, we have to look for
the earliest sources. As yet, no exhaustive historical research has been done
concerning mentions of the Gypsies' languages. However, there are a few
early remarks about the language of the Gypsies in Spain.
Already in 1608 the Spanish writer P. Martin Delro, in his book
Disquisitionum Magicarum, said that the language of the Gypsies "was a
vernacular invented by them to replace their native tongue, which they had
forgotten" (cited in Spanish in Pabanó 1915: 179)3.
One early source claims that the Gypsies around 1600 were fluent in
Greek. This may point to a previous presence in a Greek-speaking area,
presumably not long before. It also seems that the Gypsies lost their language
quite early and, then, may have completely replaced it with Caló within 150
years after arrival in Spain. More historical research is needed here.
If we look at the two earliest sources of Caló (or Iberian Romani in
general), we see that they already show a complete
hispanization of the language. I will discuss both of them in some detail.
7.1. Anonymous word list 17th century

The oldest source is a word list found in the Biblioteca Nacional in
Madrid by John M. Hill and published early this century (Hill 1921). It is
called Lengua egipciaca; y mas propio: Guirigay de Gitanos [The Egyptian
language, more properly, jargon of the Gypsies]. Hill seems to suggest that it
dates from the 17th century, but he does not indicate how he arrived at this
conclusion.
The list contains 61 words, almost all of them clearly Romani. Although
there are no phrases in the list, it is also clear that the Romani inflection is not
inherited. Indeed the words have inflection: verbs end in -ar, and plural
nouns end in -s, both in the Spanish and in the Caló columns. There are also
some shared derivational endings in both sections, such as -ador (or rather its
Andalusian form -aor) for a person who performs an action. Although there
are some Romani grammatical morphemes, these are most probably
lexicalized forms, such as gachó and gachí for "servant" and "maid", with the
Romani masculine and feminine endings. The -o and -i endings are
productive in Romani proper for adjectives, but not for nouns, and we only find
them used on nouns in this list. The same holds for pux and puxy for "old man"
and "old woman" respectively. The x is probably a misreading for r, since the
Romani source word is the adjective puro (M), puri (F). If this list is really
from the 17th century, it means that without any doubt there were already
completely hispanicized varieties of Romani by then.
7.2. Bright
The first dated publication concerning Caló, and the first one which
contains sentences, is Bright's European travel account. In an appendix, he
compares the Gypsy languages of Hungary, Britain and Spain (Bright 1818:
lxxviii-xcii). Bright gives roughly 150 words and some 25 sentences. In
view of the fact that Bright's book is so hard to obtain, his text material is
presented in an appendix to this paper.
Bright did not collect the material himself, but he received it from one of
his friends (Bright 1818: ix). In the word list many plural nouns end in -s and
verbs in -ar. More important, he gives a number of sentences which clearly
show that it is Caló we are dealing with and not Romani with inherited
inflection.
The proportion of Spanish and Romani lexical elements used here differs
considerably from one sentence to another, as we see comparing (1) and (2),
in which Spanish elements are italicized.
(1) ochanaba mangue loque chik (Bright, Caló)
know-3 1SG that-which tell-2SG
"I know not what you tell me" (Bright, English)
(Romani dzanav "I know", man-ge "to me", chile ??;
perhaps dlav 'I sing'?)
(2) gillate de mi que no te pueda indicar (Bright, Caló)
go-2PL.IMPER from me that NEG you can-3 see-INF
"get out of my sight" (Bright, English)
(literally: "go from me (so) that he cannot see you")
(Romani gel-em "I went", dikh-av "I see",
with Andalusian in-; cf. 4.1.)
In short, there is no doubt that in 1818 Caló was already the mixed
language known from later sources. It even has the Romani dative form used in
all grammatical cases, as in the later sources (see also section 5).
7.3. Inflected Romani in the Iberian peninsula

There are few sources of inflected Romani from the Iberian peninsula.
Apart from the inflected dialect of Catalonia, there are only two brief texts
recorded in the Basque Country in the 19th century, in a period when all the
other sources from the Basque Country already show that the mixed variety
was spoken (cf. Bakker 1991). As these were songs and prayers, they can be
expected to have been preserved longer. This indicates that the inflected
language as a means of communication was lost early outside Catalonia. The
texts from the Basque Country are given in Appendix 2.
8. The genesis of Caló

No hypotheses have been formulated as to the genesis of Caló. We
would like to know, however, when, why and how Caló came into existence.
Was it a gradual or a sudden process? Was it conscious or unconscious?
Under what circumstances did it happen?
The lack of hypotheses concerning the genesis of Caló contrasts with the
study of Angloromani and Scandoromani. For the first, there has been a
debate, notably between Donald Kenrick and Ian Hancock, as to the genesis
of this language.
The differences between the two positions can be summarized as follows.
Hancock believes that Angloromani was consciously created by Gypsies and
British outlaws in the 16th century and that Angloromani co-existed with
inflected Romani for a long time (Hancock 1984a, 1984b). Kenrick,
however, believes it was a gradual development taking place mostly in the
19th century, whereby more and more English grammatical elements were
integrated into Romani (Kenrick 1979). Hancock's ideas were followed up in
his paper on Scandoromani (Hancock 1992). A third hypothesis was
formulated by Boretzky & Igla (1994), in a comparative study on
Para-Romani dialects. They suggested a reversal of language shift, at a point when
the youth had shifted to the language of the host country, but could still rely
on the older generation to conserve the Romani lexicon.
These hypotheses were based on parallel cases, not on Caló itself. Caló is
sometimes mentioned in these studies. Hancock (1984a) also mentions Caló
in his papers on the genesis of Angloromani, showing many parallels in
functional and structural features. On the one hand this is no proof in itself,
since each case must be studied individually. On the other hand, the parallels
are strong and the facts unusual, so that a comparison is justified.
In the foregoing we presented (among others) the following facts
concerning Caló:
- The Gypsy population in the peninsula has been relatively settled for many
centuries.
- The Spanish elements show archaic features.
- The Spanish elements show Andalusian features.
- Already in 1609 they were reported to have lost their own language.
- Around 1600 they were reported to speak Greek.
- In the earliest source of Iberian Romani (17th century?) the language was
already thoroughly hispanicized.
- The earliest source of Romani already contained Germania lexical items.
- Germania in the early 17th century hardly contains Caló words.
- The function words show no evolution from more Romani oriented towards
more Spanish oriented varieties.
- Caló contains distorted Spanish items.
- Caló was also known by quite a few non-Gypsies.
This is compatible with an early genesis of Para-Romani, i.e. relatively
soon after the arrival on the peninsula. If Hill's date for the word list is correct,
Iberian Romani was already a Para-Romani language in the 18th century
and, if Delrio can be trusted, already around 1600. A few facts can be used as
arguments against a gradual development, especially the early sources and the
non-evolution of function words.
Nothing in itself gives convincing evidence for either of the hypotheses.
Nevertheless, an early genesis (16th century) seems most likely, considering
the conservatism of some Spanish elements, Delrio's remark and the nature
of the early sources. It may have been a conscious creation, perhaps related to
an attempt at the reversal of language shift.
9. Conclusions
On the Iberian peninsula and in Latin America, at least four Para-Romani
languages have come into being, most likely independently of one another.
Caló combines Iberian Romani vocabulary with Andalusian Spanish
grammar, Errumantxela in the Basque Country combines Iberian Romani
vocabulary with Basque grammar, around Barcelona an unnamed variety
came into being combining Catalan grammar and Iberian Romani, and in
Brazil a Portuguese grammar variety came into being. The Para-Romani
variety of Portugal is derived from Spanish Caló and is called Calão. An early
genesis for Caló is suggested by some archaic Spanish remnants, pointing to
conservatism on the part of its speakers. Perhaps the language already existed
in the 16th century, several generations after the arrival in Spain. Regional
differences remained, some being closer to Spanish, others to Romani, as far
as the use of function words is concerned. Only in Catalonia did the inflected
language survive into the 19th century, apparently beside a Para-Romani
variety.
Further research on Caló is needed on a number of subjects. An
etymological dictionary is badly needed, as well as a critical assessment of the
sources, including the question of who took over what from whom. Moreover,
a thorough grammatical study is also needed, in which both the Spanish and
Romani source dialects should be taken into account.
NOTES
1 Boretzky (p.c. 1993) has observed that in some Balkan dialects (notably Arli) both -har
and -ha are used beside one another, and the Gurbet dialect of Romani has the suffix -sar,
and he thinks that the preposition sa(r) and the case ending -sa(r) existed side by side for a
longer time (as with most of the other prepositions/case endings). Caló shows traces of
this.
2 I thank M. López Abellán for his help with Catalan.
3 Pabanó wrote: "decia que era un habla inventada por ellos para suplir su idioma nativo,
que se les había olvidado" (Pabanó 1915: 179). In a superficial search in Delrio's voluminous
work, I was not able to locate the exact page for this quotation.
Appendix 1
THE CALÓ SENTENCES FROM BRIGHT (1818).
Spanish-derived elements are italicized.
Las ducais me marel-an                  Trouble kills me
Vastel-a-te cate                        Sit by me
ochanaba mangue loque chile             I know not what you tell me
prastarela                              run! run
no orobeles mi dai                      Do not weep mother for my health
por la estipen de la mangue
Apande umd el bundal                    Shut the door
Abele umd acot                          Come hither
Naguese umd                             Go! begone
Endineme un prajo                       Give me a cigar
Abele umd a jamar                       Come to eat
Voy a sobelar                           I am going to sleep
Se va a romandi-ar                      He is going to marry
Naguemos a jonjobarle                   Let us go and deceive him
Amcabado umd                            You are a thief
Abela la pani                           It rains
Esta chai es lili                       This girl is very wild
Haber el boqui de un dever terero       To be very hungry
Que engispo?                            What do I see?
Se ha endiado el parn a la chai         The money was given to the girl
Gillate de mi que no te pueda indicar   Get out of my sight
No se gille umd                         Do not leave me
porque terelo ir de esta cocorri        I fear to go hence alone
En el chen de los chindoquenctos        In the country of the blind
el que avela un sacai es un clai        he who has one eye is a chief
Romandiate con este chavo               Marry this fellow
Malos menguis te jamelan                May the devils eat you
Mal fen tengas tu cuerpo                (curse)
Mal fen tengas                          (curse)
Gitano Song:
Del estaribel me sacan                  They take me from the prison
Montadito en un jun                     mounted on an ass
Y me van acurrubando                    and flog me
Por las calles catorr                   through the streets
Utterance:
Chavo gillate.                          Be off boy.
que vienen los Doms a cogerte           The officers are coming to take you
Date con los carcos en el Buerengi      Give your shoes against your breech
Appendix 2
The two texts below were taken from Cénac Moncaut (1855: 345), whose informant was
somebody named Sansberro. Sansberro was one of the Cascarots in Ziburu (Ciboure), a
fishing village on the northern Basque coast, close to Donibane Lohitzun (Saint Jean de
Luz). The Cascarots are said to be descendants of marriages of Gypsies with Basques (see
Webster 1889). The first text is a Catholic prayer.
(Source text)   (French in source)    (Romani)          (my translation)
Leba Tusquet    Au nom du Père        Le Batoske        For the Father
Echa Bisquet    Et du Fils            e chaves-ke       For the Son
Le Apelinguet   Et du Saint-Esprit    le apenic-ke      For the Holy Spirit
Taberamente     Ainsi soit-il         t'avel amende     That he comes to us
This short text contains a few words typical of Iberian Romani, such as bato for "father" (Romani
dad). The word apelinguet for "Holy Spirit" is unclear, but it may have to do with penice
(written peniche), the Caló form for 'Holy Spirit', presumably derived from Greek pneuma.
The interpretation of the second text is more complex, since the translation does not seem
to fit the text and it contains some words which are not clear.
(source text)               (source translation)
Usti, usti, chajori         Regardez, regardez, cette fille,
mindre foucar moyorr        Avec sa jolie figure,
Samend caracolenge          Qui va chercher des escargots
(Romani interpretation)     (corrected translation)
usti, ust, chajori          stand up, stand up, little girl
minre, sukar muiore-        of me, with a pretty little mouth
sa, amende caracol-enge     to us for snails
REFERENCES
Ackerley, Frederick G. (1915) The Romani speech of Catalonia. Journal of
the Gypsy Lore Society, New Series 8, 99-140.
Acton, Thomas & Donald Kenrick (1984) Romani Rokkeripen To-Divvus.
The English Romani Dialect and Its Contemporary Social, Educational
and Linguistic Standing. London: Romanestan Publications.
Bakker, Peter (1991) Basque Romani - a preliminary grammatical sketch of
a mixed language. In: Bakker & Cortiade (eds.) 1991, 56-90.
Bakker, Peter & Marcel Cortiade (eds.) (1991) In the Margin of Romani:
Gypsy Languages as Contact Languages. Studies in Language Contact I.
Amsterdam: Publikaties van het Instituut voor Algemene Taalwetenschap 58.
Bakker, Peter & Hein van der Voort (1991) Para-Romani languages: an overview
and some speculations on their genesis. In: Bakker & Cortiade
(eds.) 1991, 16-44.
Boretzky, Norbert (1992) Romanisch-Zigeunerische Interferenzen (zum
Caló). In: Prinzipien des Sprachwandels. I. Vorbereitung. Beiträge zum
Leipziger Symposium des Projektes "Prinzipien des Sprachwandels"
(PROPRINS) vom 24.-26.10.1991 an der Universität Leipzig, Jürgen
Erfurt, Benedikt Jessing, Matthias Perl (Hgg.). Bochum: Studienverlag
Dr. N. Brockmeyer. 11-37.
Boretzky, Norbert & Birgit Igla (1991) Morphologische Entlehnung in den
Romani-Dialekten. Arbeitspapiere des Projektes "Prinzipien des
Sprachwandels", Arbeitspapier Nr. 4. Essen: Fachbereich Sprach- und
Literaturwissenschaften an der Universität GH Essen.
Boretzky, Norbert & Birgit Igla (1994) Romani Mixed Dialects. In: Peter
Bakker & Maarten Mous (eds.) Mixed Languages. 15 Case Studies in
Language Intertwining. Amsterdam: IFOTT. 35-68.
Borrow, George (1841) The Zincali... With an original collection of their
songs and poetry, and a copious dictionary of their language. London:
John Murray. 2 Vols. [Reprint 1923: London: Constable.]
Borrow, George (1843 [1899]) The Bible in Spain. London: John Murray.
Bright, Richard (1818) Travels from Vienna through Lower Hungary.
Edinburgh: Archibald Constable & Co.
Brown, Irving (1922) The knowledge of Gypsy by the Gentiles of Spain.
Journal of the Gypsy Lore Society, 3rd series 3, 143-144.
Campuzano, R. (1848) Orijen, usos y costumbres de los jitanos y diccionario
de su dialecto. Madrid. [2nd edition, 1851. Facsimile reprint, Madrid
1980: Heliodoro].
Cénac Moncaut, M. (1855) Fragment de vocabulaire gitanos. In: Cénac Moncaut,
M. Histoire des Pyrénées et des Rapports Internationaux de la
France avec l'Espagne depuis les temps les plus reculés jusqu'à nos
jours. Paris: Amyot.
Coelho, Francisco Adolpho (1892) Os Ciganos de Portugal com un Estudio
sobre el Caló. Lisboa: Imprensa Nacional.
Colocci, Adriano (1888) The Gitanos of today. Journal of the Gypsy Lore
Society, First Series 1, 286-289.
Delrio, Martin A. (1633 [1608]) Disquisitionum Magicarum. Cologne: P.
Henning. [originally published in 1608].
Dietz, Henry G. & F. David Mulcahy (1988) Romani of a third place: a statistical
analysis of nineteenth-century Caló and Castillian. In: Carmen
DeSilva, David J. Nemeth & Joan Grumet (eds.) Papers from the 8th
and 9th Annual Meeting of the Gypsy Lore Society, North American
Chapter. New York: Gypsy Lore Society, North American Chapter. 1-17.
Fraser, Angus (1992) The Gypsies. London: Routledge.
Hancock, Ian F. (ed.) (1979) Romani Sociolinguistics. (International Journal
of the Sociology of Language 19). The Hague: Mouton.
Hancock, Ian F. (1984a) The social and linguistic development of Angloromani.
In: Acton & Kenrick 1984. 89-134.
Hancock, Ian F. (1984b) Romani and Angloromani. In: P. Trudgill (ed.)
Language in the British Isles. Cambridge: CUP. 367-383.
Hancock, Ian F. (1988) The development of Romani linguistics. In:
Jazayery, Mohammed Ali & Werner Winter (eds.) Languages and
Cultures. Studies in Honor of Edgar C. Polomé. Berlin etc.: Mouton De
Gruyter. 182-223.
Hill, John M. (1921) A Gypsy-Spanish word-list. Revue Hispanique 53,
614-615.
Hill, John M. (1945) Poesías Germanescas. Bloomington, Indiana: Indiana
University Publications 15.
Jiménez, Augusto (1853) Vocabulario del dialecto gitano... con varios rezos,
cuentos. 2nd edition. Sevilla: Imprenta del Conciliador. [originally: Sevilla,
1846: Gutierrez de Alba]
Kaufman, Terrence (1979) Review of Rajendra Weer Rishi 'Multilingual
Romani Dictionary.' In: Hancock (ed.) 1979, 131-144.
Keller, A. (1892) Einfluss des Spanischen auf die Sprache der in Spanien
lebenden Zigeuner. Zeitschrift für Romanische Philologie 16, 165-173.
Kenrick, Donald (1979) Romani English. In: Acton & Kenrick 1984, 79-88.
Also in: Hancock (ed.) 1979, 111-120.
Leblon, Bernard (1982) Les Gitans dans la littérature espagnole. Toulouse:
France-Ibérie Recherche.
McLane, Merrill F. (1977) The Calo of Guadix: a surviving Romani lexicon.
Anthropological Linguistics 19, 303-319.
McLane, Merrill F. (1985) Romani speech domains in Spain and Portugal.
In: J. Grumet (ed.) Papers from the 4th and 5th Annual Meeting of the
Gypsy Lore Society, North American Chapter. New York: Gypsy Lore
Society, North American Chapter. 188-198.
Moraes, Alexandre José de [Mello] (1885) Cancioneiro dos Ciganos. Poesía
Popular dos Ciganos de Cidade Nova. Rio de Janeiro: B.L. Garnier.
Moraes, Alexandre José de [Mello] (1886) Os Ciganos no Brasil.
Contribuição Ethnographica. Rio de Janeiro: B.L. Garnier.
Mulcahy, F.D. & Anita Volland (1986) The Gospel according to St. Luke: a
preliminary analysis of the same text in three Romany dialects. In:
Joanne Grumet (ed.) Papers from the 6th and 7th Annual Meeting of the
Gypsy Lore Society, North American Chapter. New York: Gypsy Lore
Society, North American Chapter. 135-145.
Pabanó, R.M. (1915) Historia y Costumbres de los Gitanos. Barcelona:
Muntaner y Simón. [Facsimile Reprint, 1980, Madrid: Ediciones Giner].
Polkinhorn, Harry, Alfredo Velasco & Mal Lambert (1983) El Libro de Caló.
San Diego: Atticus Press.
Quindal, Francisco (1867) [pseud. of F. Sales Mayo]. Diccionario Gitano.
Madrid: Oficina Tipográfica del Hospicio.
Sales Mayo, Francisco de (1870) El Gitanismo. Historia, Costumbres y
Dialecto de los Gitanos. Con un epítome de gramática gitana, primer
estudio filológico publicado hasta el dia, y un diccionario Caló-Castellano,
que contiene, ademas de los significados, muchas frases
ilustrativas de la acepcion propia de las palabras dudosas. Madrid:
Librería de Victoriano Suárez.
Sowa, Rudolf von (1889) The dialect of the Gypsies of Brazil. Journal of the
Gypsy Lore Society, First Series 1, 57-70.
Tineo Rebolledo, J. (1900) A Chipicalli (la Lengua Gitana). Granada:
Imprenta de F. Gomez de la Cruz.
Vallmitjana, Juli (1908) Sota Montjuic. Barcelona: L'Avenç.
Wagner, Max L. (1937-1938) Stray notes on Spanish Romani. Journal of the
Gypsy Lore Society, Third Series 15, 134-138; 16, 27-52.
Webster, Wentworth (1889-1889) The Cascarrots of Ciboure. Journal of the
Gypsy Lore Society, First Series 1, 76-84.
ROMANI LEXICAL ITEMS IN COLLOQUIAL ROMANIAN
CORINNA LESCHBER
Free University, Berlin
0. Introduction
The subject of Romani borrowings in Romanian generally touches upon
different areas of past and present Romanian colloquial speech. It concerns
the language of the youth (school children and students) as well as that of
soldiers, historical argot and contemporary slang, so-called vulgar language,
and the language of newspapers in Romania from the end of the 1800's and
into the 1930's. In this contribution I deal with the adaptation of Romani
words into Romanian and their semantic developments and stylistic
changes. I focus on the question of which Romanian words originating from
Romani are still in use today, by whom, and with what semantic content,
and attempt to find out whether the users of these borrowings are conscious
of their origin, and whether the words they use are applied only in certain
circumstances.
1. The status of Romani in related literature

Sfîrlea (1989) points out that there is a regrettable gap in Romanian
sociolinguistics: there is no analysis of the effect of linguistic contact
between Romanians and the national minorities according to age groups.
Steinke (1989) notes in his article on Romanian Sondersprachen (i.e.
marginal or special languages) that there is no dictionary which reflects the
current state of the Romanian Sondersprache. The lack of such a book is all
the more noticeable in that the major dictionaries hardly even touch the
Sondersprachen.
In a ninety-two page essay, Graur (1934) deals with Romanian words
which had their origins in Romani. He illustrates these words with examples
taken from the literature and the press. One of the main sources drawn upon
is the humorous weekly Veselia (i.e. "Happiness"), published since 1891.
Graur points out that the authors of these articles and books tried to create a
characteristic effect through the use of certain items and phonetic traits that
suggest to the reader a Gypsy context.
There have been instances of Romanian journalists introducing artificial
imitations of Romani to make their newspaper articles more interesting.
They often employed aspiration of initial vowels, additional suffixes such as
-os, -ete, -engher, and verse form for these artificial passages. Around the
beginning of the twentieth century one finds such artificially created
Romani texts and Gypsy anecdotes in cabaret and revues in Veselia. From
1922-28 a certain Pribeagu worked for the weekly magazine Pardon and
produced fake Gypsy texts, although he spoke no Romani. These texts later
appeared in the newspaper Dimineața (Morning) on the "funny page".
Assuming that the material Graur presents is from authentic sources, it
is apparent, when scrutinizing a word with the help of any dictionary, in this
case the three-volume Romanian-German dictionary by Tiktin & Miron
(1986-1990), that Romani etymology is usually not recognized, and words
of Romani origin are traced to other languages. Schroeder (1989) dedicates
a total of eleven and a half lines to the subject of "Gypsy words". According
to Schroeder, the lexemes found in the Romanian language which derive
from Romani are seldom neutral, but for the most part are depreciatory and
vulgar.
An important contribution to the study of Romanian argot was written
by the Frenchman Juilland, who pointed to semantic changes which words
of Romani origin have undergone (Juilland 1953: 433), as well as to the
discussion aroused by Vasiliu (1933/34a-b) with regard to a number of
Romani etymologies mentioned by Graur (cf. also Graur 1937).
In the post-1947 era, the subject of Romani etymologies was taboo in
Romania. This is exemplified by Graur (1960: 368 ff), who complained that
"certain classes are changing the language as they like", and characterized
this process as "a tasteless development of the language, much like Jargons
and Argots" (p. 373), concluding that "these useless varieties are to be
eliminated" (p. 321). Such varieties, he claims, are "irrevocably condemned"
(p. 373) by the working classes (cf. also discussion in Bochmann
1980: 23). Graur's position is especially dramatic when seen in the light of
twenty-five years of his own carefully documented research on the Romani
element in Romanian. Graur, Wald & Stati (1971) state that "to occupy
oneself with socially dependent lexical and semantical changes is to deny an
interest in linguistics". It is exactly these aspects with which the present
contribution is concerned.
Although Ivănescu (1980) outlines the arrival of the Roma in Romania
on the basis of historical sources, there is a complete lack in his Istoria
limbii române of any references to the influence of their language upon the
Romanian language. While Ivănescu provides at least a few clues for further
research in this direction, the famous Romanian linguist Rosetti ignores this
subject completely (Rosetti 1978). The term Romani orignete, which is
occasionally used in Romania, does not even appear in the index of the 936-page
book.
Vascenco's (1991) material for a contribution on Romanian words of
unknown etymology was taken from the DEX (1975), as well as from its
1988 supplement. Ignoring Graur's important study, Vascenco does not
comment on the etymology of numerous lexemes and common Romanian
expressions which derive from Romani.
Since 1989, it has again become possible to publish work on Romani-related
subjects in Romania. There exists, for example, a variety of books
by Gheorghe Sarău, including a Romani-Romanian dictionary with a
grammar supplement (Sarău 1992).
2. Sociolinguistic remarks
In looking at Romani lexical borrowings in Romanian, we are dealing
with two types of phenomena:
(1) Individual lexical elements of Romani as a native language were
preserved in the course of language-loss and adopted into Romanian;
(2) Native speakers of Romani transmitted elements of their mother tongue
into Romanian in order to enlarge the scope of expressions.
Transfer of such lexical and stylistic elements from Romani into
Romanian led to a large number of derivatives in Romanian, consisting of
the basic Romani lexemes, to which Romanian derivational morphology is
added. Such linguistic creativity may be observed in the colloquial variants
of many other languages as well (cf., for example, Matras 1991, or Bakker,
this volume). Another relevant issue is the adoption of the Romani variants
by the Romanian youth, which may have been prompted by their tendency
towards non-conformity. Their need to uncover and employ Romani
expressions may have been a response to the great pressure put upon them
during the Ceaușescu era to conform to its social dictates. Part of the reason
for the adoption of Romani elements to express dissatisfaction with the
Ceaușescu regime may have been the prevalent belief, based on the
stereotype of the Roma, that this people would be most likely to escape the
socio-political pressure of the Romanian dictatorship.
Many of the Romanian informants for the present study, however,
knew nothing about the etymology of a large part of the four-hundred
lexemes that were elicited in the interview. It can therefore be assumed that
Romani elements were introduced step-by-step into Romanian, first by
bilingual Roma who allowed Romani elements to enter into their Romanian,
and then by the Romanians who copied them. The inherent expressive value
that the Romani borrowings hold for Romanians has had its own
dynamic impact upon their further development.
3. Methods
The field work for the present study was carried out with Romanians
and Romanian Roma who claim that they cannot speak Romani. A control
group was formed of Roma whose mother tongue is Romani and who
consider themselves as bilingual. Twenty-two Romanians and Romanian
Roma participated in the interview. Fifteen of them were willing to take part
in a longer interview involving a thirteen-page questionnaire. The
questionnaire is based on lexemes discussed by Graur (1934) and Juilland
(1952), the only significant studies on this subject. The discussion of the
circa four-hundred lexemes took about four hours for every person and was
divided into two sessions of two hours each. There was an almost equal
number of women and men, and of Roma and Romanians. About half of the
participants have lived in Germany for short periods, while the other half
lives permanently in Romania. Most of the interviewees are under thirty.
The interviews were conducted in 1992 and 1993, and three of them in the
spring of 1988.
The interviewees were asked to describe and more clearly define the
context in which the lexemes were used, and reflect upon possible
synonyms. The atmosphere was purposely kept cheerful and informal in
order to elicit as many instances as possible.
A selection from the accumulated material was made for this
presentation. One of the criteria for the choice of the lexemes in the section
Lexical adoptions from Romani is that these examples have undergone a
semantic development. The most important selection factor, however, was
the extent to which the borrowings were known to the informants, reflecting
an important aspect of current Romanian usage. The
material included in this section is generally considered to reflect usages
which are common in contemporary Romanian speech. Expressions with
which the informants were not familiar have not been included.
4. Lexical adoptions from Romani

A selection of some of the four-hundred examples has been picked to
show some interesting connotations and semantic developments. Lexemes
which remained semantically close to their original etymon are not
mentioned (for example numerals). Wherever possible, I refer to the
spelling used in Boretzky & Igla (1994) when giving an etymon from
Romani.
bfta
Romani baxt (subs.) "happiness, luck, benefit, blessing, joy", baxtal (adj.)
"happy, lucky".
Romanian bft "success, good luck" (for encouragement before
examinations and fishing in the language of the pupils and students), bfta
also means "bye-bye", as well as "fortune, chance, influence, glory". Tiktin
& Miron (1986-1990,I: 267) point out "the Gypsy origin of this Romanian
argot word". Romanian bafts, baftosa (adj.) "happy", with the Romanian
adjective ending masc. -os, fem. -osa, is used "in order to make luck stay".
Cf. Graur (1934: 124), Juilland (1952: 157).
barosn, baros nca

Romani bar (adj.) "large, big, important".
Romanian barosn (masc. subs.), barosnca (fem. subs.), with the
Romanian augmentative suffix for nouns -an, to form masc. nouns, resp.
with Romanian fem. diminutive suffix -anca - a composite suffix - and a
connecting consonant -s-, denotes "a person with a lot of money", "a person
with good relations".
Romani bar san "you're great" cannot form the basis for barosn as it
does not conform to the Romanian rules of word formation. Cf. discussion
in Graur (1934: 126; 1936: 197), Juilland (1952: 158).
benga
Romani bendza-, cf. beng (subs. sg.) "devil", beng (subs. pl.) "epilepsy",
bengal, bendal (adj.) "devilish, mad, insane, angry, wild", also as a noun
"epileptic".
Romanian has bnga "devil", but this meaning has become rare. It was
explained as "a kind of name". We can find Romanian bnga in the
expressions (sta) a dat n benga (street jargon) "somebody who cannot
control his movements", "clumsy", or du-te n bnga "piss off!", and
Romanian beng (subs.) "(something) arrogant", "something which is
growing", bnga (surprisingly as an adj.) "bad". Another variant is a fi
bengs "to be clumsy" with the Romanian adj. suffix -os for derivatives of
nouns that mark the possession of a quality.
A curse is Romanian bengsule! with the Romanian masc. vocative
ending -ule, and Romanian bengals!, an adjective used as a noun with no
vocative ending, cf. Romani bengal "devilish". A further derivation is
zbengut (adj.) "naughty", with the Romanian participle ending -it. One
informant mentioned that South Romanians made fun of North Romanians
using the expression bongose "magical spell"; cf. Graur (1934: 131),
Tiktin & Miron I (1986-1990: 360), bonghen, bongheni, with the
Romanian fem. diminutive suffix -i. Cf. also Graur (1934: 128 ff.),
Juilland (1952: 158).
be, be tele
Romani bes, imperative of beel (vb. intr. 3. Sg.) "to sit (down), to stay, to
remain", best (part.) "sitting"; Romanian:
1. be "stay, sit"
2. be "to sit down" (reference to a position for coitus), also be in flaut.
3. betele "sitting", but also "sit down", cf. Juilland (1952: 158).
4. Popular etymology related to a be, a ba "to suffer from flatulence, to
break wind", and the correlated noun be "poop".
5. a betel pe cineva "to grumble, to criticize somebody" (also in the usage
of intellectuals).
6. In crude usage be! "take off, get out of here!", be de aicy! "piss off!".
Cf. also Romanian be ndra, "stay inside!", and Romanian belean "(you)
were sitting". Cf. Graur (1934: 129), Juilland (1952: 158).
bitri
Romani bsto "twentieth".
Romanian bitri "money", with the Romanian masc. Nomina agentis
suffix -ar, (pl. -ari), usually used for forming derivatives from nouns. In this
case, a numeral is taken as the basis. Cf. Graur (1934: 130), Juilland (1952:
159).
a buli
Romani bul "buttocks, ass", adj. buljan "back, arse", buldel (vb. 3. sg.)
"to have intercourse".
Romanian a buli is a productive verb, the original meaning of which is
"to make love; to fuck", as was mentioned already by Juilland (1952: 159),
but its current meaning is closer to "to fail in a task, to screw up, to blow it,
to make a mess of things". Cf. Romanian am bulit-o "I destroyed it". The
simple meaning is still expressed in Romanian te bulsc. It is used towards
women in a sexual context and towards men in a fight to mean "I'll beat you
up". One person even mentioned abid "to fuck" in one word.
Romanian bulel (subs.), as in bulela asta vine de la el "this crap
came from him", bulel and bulangel also mean "to screw like a rabbit",
with the Romanian fem. abstract suffix -el, used as in this example to
form derivatives from verbs, bulendr "rag, whore", with the Romanian
fem. augmentative suffix -ndra, is strongly depreciatory.
Cf. also buln, with the Romanian augmentative suffix -an for nouns:
1. "chicken leg",
2. "pretty woman's legs" in ce bulne mit are (neutr. pl.)
3. "the thighs of a whore",
4. "truncheon".
A derivation is bulangu "idiot", with the Romanian suffix -giu for
nouns and adjectives to form masc. Nomina agentis. Cf. Graur (1934: 131
ff., 1936: 196 ff.), Juilland (1952: 159).
a se bungh
Romani bango (adj.) "bent, crooked, slanting", but also "guilty, unjust".
Romanian a se bungh (vb. refl.) "to stare, to goggle, to gawk", a se
zbungh (vb. refl.) "to look", grammatically adapted to the fourth Romanian
verbal class with the help of an infinitive i-ending, see below zbanghiu
(adj.) "squinting, cross-eyed" etc. Cf. Romanian a bungh "to grab, to catch,
to find a solution, to spy out", as in am bunghit-o "I caught him", also
Romanian bonght (part.) "achieved, succeeded". Cf. discussion in Graur
(1934: 195) and (1937: 222), Juilland (1952: 160).
cnci
Romani khnci "nothing".
Romanian cnci has become an interjection: "nonsense" or "no way!"

in the context of trying to borrow money. Somebody explained it as "an
interjection or a negation". Cf. Graur (1934: 132 ff.).
candriu (adj.), candrel (subst.)

Romani khand (subs.) "stench", also "troublemaker", khand (adj.)
"stinking".
According to an older informant, in the 1930's this did not have a
negative meaning in Romanian. According to her, Romanian candriu meant
"clever" and "having something to do with singing". For Graur it meant
"drunk" or "crazy". Informants mentioned Romanian e candriu "(he is) ill,
adventurous", and a fi candriu "to be nuts, to be crazy". Tiktin & Miron
(1986-1990) mention Romanian candriu "a little dizzy from alcohol, happy"
(colloquial style), of unknown etymology. It is formed with the Romanian
masc. adjective suffix -iu. Romanian candrt means "beaten".
One derivation is the reflexive verb a se candr "to get drunk, to get
sick", but in the usage taci, c te candresc (cf. a candr) "shut up, or I'll beat
you up" there is already the nuance of "to beat, to hit", so that candrel
already means "fight, sickness", formed with the Romanian fem. abstract
suffix -el, which forms derivatives from verbs, in this case from
Romanian a candr. In the humorous Romanian weekly Veselia
(17.6.1911), candrel still meant "drunkenness". Cf. Graur (1934: 133).
car; crici
Romani kar "penis".
Romanian car, and with the Romanian suffix for nouns -ici with a
different grammatical function [or -ciu, cf. Pascu (1916: 324 ff.)]: crici
"male sex organ", carici "female sex organ", but cf. Boretzky & Igla (1994:
136) Romani karci "man, lady's man", and as masc. noun in: Romanian
freac cariciul (not a literal translation) "he masturbates" (with Romanian
masc. def. article). It shows a high lexical productivity: caricioic (fem.)
"nymphomaniac" (in Romanian denominated as pulrist), with a Romanian
fem. composite suffix -oaic for adjectives and nouns, with which
denominations for female beings are formed from masc. nouns. Romanian
cariciometru or cariciopd "huge sex organ", also used among children in
word-games, with the aid of two suffixoids; a carici (vb.) "to fuck",
grammatically adapted to the fourth Romanian verbal class with an
infinitive i-ending; cariceal "fucking, masturbation", with Romanian fem.
abstract suffix -el; caricii "sex-maniac" or "large male sex organ", with
Romanian masc. augmentative suffix -oi, slightly derogatory. Cf. Graur
(1934: 134 ff.), Juilland (1952: 161).
a cardi
Romani kharl (vb. tr. 3. sg.) "to call", cf. Boretzky & Igla (1994: 157),
akhardem "I called", akharl (vb. tr. 3. sg.) "to call, to invite", akhard
(part.) "invited, called", (cf. also Romani vacardol (vb. passive, intr.) "to
agree upon", vacard (part.) "said, promised", vacarl (vb. tr.) "to speak, to
talk, to say, to promise").
Romanian a cardi "to talk", cf. noi cardim "we are talking, we are
having a chat", cardel "talk", with the Romanian fem. abstract suffix
-el. Cf. Graur (1934: 133), (1936: 198), (1937: 223), Juilland (1952: 160).
crl; caro
Romani kherl (adv.) "from the house", "out of the house", cf. Romani kher
(subs.) "house", or Romani kerl (vb. tr. 3. sg.) "to make" etc. But cf. also
Romani kor, kor, kar (prep.) "in, at", Romani kori (adv.) "there, in that
direction", (prep.) "at, to", cf. Boretzky & Igla (1994: 148).
Romanian carel; caro in: car de aici, carl de aici, crl potca "get
lost!". See also in Juilland (1952: 161) car "flee! run away!". Cf. Graur
(1934: 134), Juilland (1952: 161).
ciangalsule!
Romani chagl, chaglin (part.) "ugly", cf. Boretzky & Igla (1994: 55) also
chang-, Romani changal s le "he is ugly", possibly overheard by
Romanians as a vocative-form. Another explanation is a derivation from
Romani changal (adj., masc.) with the aid of the Romanian suffix -os, for
the formation of nouns from adjectives to mark possession of a quality, in
Romanian masc. vocative form with masc. vocative ending -ule:
*ciangal()- (o)s- ule > ciangalsule.
The meaning could not be further explained by the informant (cf. also
ginglule, cialapadiule, janghinsule), but these words were insulting.
cialapadiule
Romani capldile "stupid"
Romanian cialapadiule "an insult for men", "clod, idiot", with the
Romanian masc. vocative ending -ule.
a ciord
Romani corl (vb. tr.), part. cordo "to steal", cf. cor "thief", cordi "female
thief".
Romanian a ciord "to steal", cf. ciordel "stealing", with the
Romanian abstract suffix -el, which forms fem. nouns from verbs and
adjectives, for describing the result of a previous act; ciorditr "thief", with
the Romanian masc. suffix for Nomina agentis -tor, which serves to derive
verbal adjectives from other verbs; substantival use. Cf. Graur (1934: 139;
1936: 198), Juilland (1952: 162).
ciriclu
Romani cirikl (fem. subs.) "bird", cf. also cirikl (masc. subs.) "bird,
sparrow".
It is used very often by all Romanians, but, as one informant pointed out,
they do not know its meaning. In: "Ciordt-i ciricliu, barburt-i ciripi" it
was perhaps a kind of magical spell.
Romanian ciricliu (masc. subs.) is probably based on Romani fem.
subs. cirikl; this is indicated by the still preserved feminine form of the
adjective in: ciordt-i ciricliu... The i-ending is adapted to the Romanian
morphological system with the help of Romanian masc. substantive ending
-iu, but the form could suggest also a fem. vocative with the ending -io. Cf.
Graur (1934: 139).
a ciumid
Cf. Romani cumdel "to kiss" (vb. tr. 3. sg.), (prt.) cumidja, (part.) cuntido,
cumidino "beloved", cum, cumi (subs.) "kiss".
Romanian a ciumid "to kiss", but also a ciumid (pe cineva) "to
degrade, to humiliate somebody", cf. the rough expression ciumda-mi bul!
"kiss my ass!". But Romanian ciumte in bi, ciumte! "hey, dummy!" is
related to slav. um "pestilence", Romanian cim "id." with Romanian
substantive suffix -ete, with a diminutive or depreciatory nuance. Cf. Graur
(1934: 140).
a se coflei
Romani kovlo (adj.) "soft, gentle, weak", kovljl (vb. passive, intr.) "to
soften, to slacken" etc.
Romanian a se cofle (vb. refl.) "to get squashy" (melons or fruits), "to
get slack or loose" (skin). The adaptation to the fourth Romanian verbal class
occurs by means of a connective element -- with the help of the infinitive
verbal ending -i. Cf. Graur (1934: 142).
dvia
Romani devla "o God!", vocative of del, devel "God", "sky, heaven"
Romanian dvia "head, mind". For a semantic discussion see Graur
(1934: 149), Juilland (1952: 163).
dic
Romani dikh, imperative of dikhl (vb. 3. sg.) "to look", dikh mo! "hey,
look!"
Romanian dic mo! "hey look!", (cf. mo), dic la el ce obraznic este "get
a load of that guy!", dic fza! (slang) "hey, take a look at that!" or "get a
load of this!". Cf. Graur (1934: 150).
diliu
Romani dil (adj.) "crazy, mad", (subs.) "madman, fool"
Romanian diliu "a crazy guy"; it is formed with the Romanian masc.
adjective suffix -iu, used here as a noun. Cf. Romanian a dili "to destroy".
Romanian diliu is not verified in Graur (1934), Juilland (1952).
dita; dtai
Romani dita! (interject.) "here! look!", cf. Boretzky & Igla (1994: 73) <
dikh-ta!
Cf. Romanian dtai! (it is used when the Romanians want to point to
something big and extraordinary), dtamai (mocking, ridiculing) "look at
him". For mai see Boretzky & Igla (1994: 173). Cf. Graur (1934: 150 ff.).
gagu, gagca
Romani gadz (subs. masc.) "non-Gypsy man, husband, peasant, master,
landlord". Romani gadz (subs. fem.) "non-Gypsy woman, woman, wife".
Romanian gagiu 1 means:
1. "a handsome guy" as in ce gagiu mit "what a handsome guy",
2. "man",
3. "a tough guy",
4. "gigolo",
5. as a rough greeting-formula ce faci, bi, gagule! "(hey man!) how are
you doing?",
6. "a dude" (among the youth),
7. "lover" in e gagul meu "it's my lover".
In the South Romanian dialect, the form is gagic, probably a regression
from gagic (fem.), with the meaning of "groom, admirer, lover".
Romanian gagc means:
1. "girl-friend",
2. "an easy girl",
3. "wife, dear, lover",
4. "pretty girl".
Romanian gagic is very common, even the cartoon-figure "Popeye"
used it to call his girl-friend (on Romanian TV, during the Ceaușescu era),
in the Romanian feminine vocative-form gagco!. One Romani informant
without any knowledge of Romani said gagu meant "a Romanian person"
or "a Romanian man". Another Romani informant (also without any
knowledge of Romani) said "man/woman"; another said "non-Gypsy".
Roma with perfect Romani uniformly translated gagiu and gagic as "man"
and "woman". One female Romani informant was upset about the proposed
words, and corrected them into (in Romanian orthography, but phonetically
identical with the Romani etyma) gagi "Romanian man", and gag
"Romanian woman".
Its lexical productivity is shown in: gagicr "a guy with a lot of
girlfriends", with the Romanian masc. suffix -ar for Nomina agentis,
gagicrel as in las gagicrel la o parte! "forget your women-stories"
or "stop giving us your stories". Romanian gagicrel with the Romanian
abstract suffix -el, which forms fem. nouns from verbs and adjectives, for
describing the result of a previous act. Cf. Graur (1934: 152 ff.), Juilland
(1952: 164 ff.).
gen
Romani dzan "they go", dal (vb. 3. sg.) "to go, to rush, to hurry".
Romanian gena 2 in: e gen pe tine "he runs after you" or "he is after
you", cf. Graur (1934: 155 ff.), and see a gin.
ginglule
Romani dzungal "bad, mean, dirty" (about people and their actions).
Romanian ginglule! "curse", with the Romanian familiar
augmentative suffix for nouns and adjectives with depreciatory character
-u, and Romanian masc. vocative ending -ule. Cf. Graur (1934: 155).
a gini
Romani verbal-stem dan- in danl (vb. 3. sg.) "to know"; Graur (1934:
156) also mentioned the variant din- (gin-).
Romanian a gini "to notice, to look, to steal a glance". Nowadays, a
gini means "to stare", and especially "to spy, to observe" in connection with
the police and the secret service. Cf. Romanian a le gini "to be in the know,
to know what one is doing with...". Its Romanian derivations are: ginela
"spying", with the Romanian abstract suffix -el, which forms fem. nouns
from verbs and adjectives, for describing the result of a previous act, and
ginitr "spy in a political context", and especially "store-detective", with the
Romanian masc. suffix for Nomina agentis -tor, which serves to develop
verbal adjectives from other verbs. Within the control group of Roma whose
first language was Romani, the Romanian verb a ști "to know" was
associated with a gini, which confirms the etymology mentioned above. Cf.
Graur (1934: 155 ff.; 1936: 198), Juilland (1952: 165).
hacan
Romani akan (adv.) "now, immediately, there!"
Romanian hacan "get away!" or "(get) on the side". Also acan,
harcan. For a semantic discussion see Graur (1934: 157), cf. also Juilland
(1952: 166).
hai
Possibly Romani ajl (interjection) "yes!"
Romanian hai is often considered generally Balkan, but nevertheless
mentioned by Graur (1934: 158). Although its origin is obscure, it is still
worth mentioning, and the interviewees explained it to mean:
1. "scandal",
2. as in a face hai (often, vulgar) "to make fun of someone". This expression
has entered the literary language and is known to all Romanians,
3. "to make a good mood, to make people happy",
4. "to cause a fight, to create a scandal",
5. "to laugh loudly",
6. "to bother others".
As a derivation with the Romanian adjective ending -os, -oas, there is
an adjective hais (masc.), haiosa (fem.) "happy", in: un om hais "a
happy man" or este o femeie haios "a cool lady". Hais, haios can also
mean "pretty", for example, in connection with new clothing. Lupu (1972)
mentions hai, -s as "strange, extravagant", as in ce hais eti mbrct!
"how strangely you are dressed". Cf. Graur (1934: 158).
hafarl
Romani av ord "come here!".
Romanian hafarl, haord "come over! come here!". The following
observation of a Romanian informant is an example of how a neutral
expression, overheard by a Romanian, suddenly assumes a new meaning:
"In the suburbs of Bucharest there were Roma, who raised pigs, and they
said to the pigs hafarl, and they also used that word to insult each other." It
appears that we are dealing not with hafarl, but with misinterpreted
Romanian haord or Romani av ord. Probably the Roma were calling their
pigs to feed them. When the Romanian children heard the Roma call other
people to come over, they had the impression that they were using a curse, since
they associated it with the pigs. For haord see Graur (1934: 161 ff.), Juilland
(1952: 166).
a hali
Romani xal (vb. 3. sg.) "to eat".
Romanian a hal (vb.) "to eat" in the following expressions:
1. trebuie s hlesc ceva "I've got to eat",
2. s nu faci aa, c astfel eti halit "don't do it, or you'll get devoured,
caught".
3. In a more gentle application: ce frumoas eti, haliti-a gura "you're cute,
I could eat you up".
4. taci, c te hlesc "shut up, or I'll eat you up", whereby two major
meanings emerge: the derived "to get somebody, to hit somebody", and the
original "to eat (much and quickly), to pig out", also in a figurative sense.
Cf. Romanian hal "hunger". Derivations thereof are: Romanian haleal
(generally) "a large casual meal or a large crude dinner party with several
persons and with simple food". Some informants explained that if need be,
one person can hold his own haleal dinner-party. As a nominalized verb
haleal means "eating, meal, feeding", with the Romanian abstract suffix
-el, which forms fem. nouns from verbs and adjectives, for describing the
result of a previous act. A derivation using a Romanian word formation
suffix is halimatr "quarrel", but also "bread", with the Romanian suffix
-tur, resp. -atur, which forms fem. nouns from verbs. Cf. Graur (1934:
159 ff.), Juilland (1952: 166).
halimi
Romani xalimata (subs. pl.) "ruin" etc.
Romanian halimi "scandal, quarrel", as in a face halimi "to quarrel",
which is not connected to Romani xalma "quarrel", as there is a different
accent, but rather to Romani xalipe, pl. xalimata "ruin, destruction, itching,
boundless desire". According to one informant it is possibly related to
Romani xalm (vb. 1. sg. pret.), to xal "to eat". Cf. Graur (1934: 161).
ta
Romani ta! (interj.) "just look!"
Romanian ta! "look!", ta - dic la el, ta-mi, ta-mi, ta la el "look (at
him)!". Cf. Graur (1934: 150, 163).
janghinsule
Romani dangl, danglino (part.) "famous, known, intelligent", cf.
Boretzky & Igla (1994: 82), budangl (adj.) "experienced, learned", (subs.)
"expert".
Romanian janghn/janghins "clever, jealous, envious"; the meaning
of Romanian janghinsule is "clod, idiot", in the Romanian masc. vocative
form with the suffix -ule. The origin of these words (ginglule,
cialapadiule, ciangalsule and janghinsule) was not explained, but the
informant was sure that it can be traced to Romani. He frequently heard
these expressions from Romani children, who used them in addressing other
children.
lovle
Romani lov (subs. pl.) to lov "money".
Romanian lovle "money" 3, with a Romanian artic. fem. pl. -le. Cf.
Graur (1934: 165), Juilland (1952: 166 ff.).
macht
Romani makhiv "to get drunk"; macarl, makjarl (vb. 3. sg.), cf. Boretzky
& Igla (1994: 172) "to make drunk".
Romanian machit, maht (part. pret.) "drunken", cf. a se maki (vb.) "to
get drunk", but also "to stare at", with the Romanian participle ending -it.
Graur (1934: 166), Juilland (1952: 167).
mndea
Romani mnde (pers. pron. 1. sg. locative), to me "I" (pers. pron. 1. sg.
nom.).
Romanian mndea "to me" in cite pe mndea "throw it to me".
Romanian mndea can express "something familiar/trusted" as in vnu la
mndea instead of vinu la mine "come here to me". Its meanings are:
1. "come to daddy" (as if doing a favor, patronizing),
2. "I dare you to come here" (before a fight, to a weaker opponent),
3. "come to me" (in the sense of "I'm the boss, I can help you"). It is also
used by women. Cf. Graur (1934: 167), Juilland (1952: 167).
mangti
Romani manghn "treasure, property", manginal (adj.) "wealthy, rich". But
cf. also the Romani verbal stem mang- "to beg".
Romanian mangti (subs. pl.) "money", with a Romanian neutr. abstract
suffix for nouns -ot, here in the plural: -oți. Romanian manghitr,
manghitr, "thief, beggar", a mangli, a mangli "to steal, to beg". Cf. Graur
(1934: 167), Juilland (1952: 167).
mardi
Romani mard (subs.), "(little) coin".
Romanian mardi (subs. pl.) "money", mardu (subs. sg.) "leu, franc" (the
singular form is never used). Cf. Graur (1934: 168; 1936: 199), Juilland
(1952: 167).
a mardi
Romani marl (vb. 3. sg.) "to beat", part. mard "beaten", but also as subs.
"coin", cf. Romanian mardi.
Romanian a mardi "to beat", as in te mardsc "I'll beat you up", also
mard "to beat", mardel "walloping", with the Romanian abstract suffix
-el, marditr "who is beating", with the masc. suffix for Nomina agentis
-tor, mardei "pugnacious", with the Romanian suffix for Nomina agentis
-a. Cf. Graur (1934: 168), Juilland (1952: 167 ff.).
matl
Romani mat (adj.) "drunk", cf. Romani macl, dial. matl "to get drunk",
cf. Boretzky & Igla (1994: 172), see macht.
Romanian matl (adj.) "totally drunken", a se matol (vb.) "to get
boozed up", mati (subs. pl.) "boozers", in: ei snt mati "they are boozers".
Graur (1934: 169; 1936: 199), Juilland (1952: 168).
a mierli
Romani merl (vb. 3. sg.) "to die", muljarl (vb. 3. sg.) "to kill, to murder".
Romanian a mierli (vb.) "to do in, to murder, to bump off", but also "to
die", as in o mierlte "(I'm sure) he will die", "he's going to kick the
bucket", a mierlit-o "he died", "he kicked the bucket". For discussion see
Graur (1934: 170; 1936: 199), Juilland (1952: 168).
mit
Romani mist (indeclin. adj.; adv.) "good".
Romanian mit "nice, good, pretty". Every Romanian knows this
word. Derivations are: mitocreal in the usage lsa mitocrel la o
parte, treci la subiect "knock it off, let's get down to business", with the
abstract suffix -eal, mitocar "somebody who often jokes at the expense of
others" 4, with the masc. suffix for Nomina agentis -ar. Romanian mit also
appears in connection with "to make jokes", or "to make stupid or bad
jokes", and in a lu la mit "to make fun of" and a face mit de cineva "to
talk ironically about somebody". Cf. Graur (1934: 171 ff.; 1936: 199),
Juilland (1952: 168).
moln
Romani mol "wine".
Romanian mol, moln "new wine", with a Romanian augmentative
suffix -an, which forms masc. nouns from other nouns, adjectives, verbs
and interjections. Cf. Graur (1934: 173; 1936: 199), Juilland (1952: 169).
mie
Romani muj "mouth, face".
Romanian mie "face" or vulgarly "gob, mug", and muin "id.", with a
Romanian masc. augmentative suffix -an. Cf. a da la mide (in a sexual
context), a duce cu mida "to lie". Cf. Graur (1934: 174), Juilland (1952:
169).
nasl, nasl
Romani nasl (indecl. adj.) "bad, damned".
Romanian nasl (adj.) "ugly, dumb", (subs.) "dirty fellow", nasl (adj.)
"something ugly". To drive away a person, it's enough to say mai, naslule!
(in the Romanian masc. vocative form with -ule) "get away, ugly!", eti un
nasl! "you are ugly!". Derivations are Romanian o nasulie "a bad, horrible
or ugly event", with the abstract suffix -ie, for the formation of fem. nouns
from other nouns, adjectives and verbs which describe properties and
conditions, and nasolel "id.", with the fem. abstract suffix -eal, a refl.
verb: a se nasuli "to get worse" (for example a situation), in: s-a nasolit
treb. The informants were not familiar with nasoale 5. Cf. Graur (1934:
176), Juilland (1952: 169 ff.).
nasfarliu, nasparliu
Cf. Romani nasulip (subs.) "evil, vice, damnation, epilepsy" and nasvalip
(subs.) "illness", nasval, nasfal (adj.) "ill, sick".
Romanian nasfarliu, nasparlu, naparlu was explained to mean "an
unserious type, whom no one can trust; a man who likes to fight; a dandy",
but also "something pretty, interesting, stylish", with the Romanian masc.
adjective suffix -iu, here used as a substantive. Romanian nasfarliu was also
considered a synonym for nasl, and nasl the antonym to mit. A number
of Romani informants said that nasfarliu derives from nasfal, meaning
"sick". Cf. Juilland (1952: 169).
a parad
Romani phaad (part.) "split, ragged", phaavl (tr. vb. 3. sg.) "to make
something burst, to split, to wear out" etc., cf. also Romani phadol (vb.
pass. intr.) "to burst, to split, to explode" etc.
Romanian a paradi (vb.) "to break, to spoil" or "to smash up a car"; this
verb is formed from the Romani part. (see above), and morphologically
adapted to the fourth Romanian verbal class with an i-ending; Romanian te
paradsc! "I'll beat you up!"; Romanian paradel "break-in, burglary". Cf.
Graur (1934: 178), Juilland (1952: 171).
a pili
Romani pijl (vb. tr. 3. sg.) "to drink", pilo (part.).
Romanian a pili "to drink", cf. am pilt "I got tanked up" and a se pili
"to get tanked up". Derivations are: pilel "boozing", with the fem. abstract
suffix -el, pilangu "alcoholic", with the masc. suffix for Nomina agentis
-giu, in this case with a depreciatory connotation. Graur (1934: 180) and
Juilland (1952: 171) discussed and rejected a Slavic etymon of Romanian a
pili.
pirnda
Romani phirad (adj. fem.) "like a slut", phirvan, phiran "sweetheart, slut".
Graur proposed pirand (part.), to the verb pirav- "to make love", not
mentioned in Boretzky & Igla (1994).
Romanian pirnda "name", "name of a Romani woman". Cf.
discussion in Graur (1934: 181 ff.), Juilland (1952: 171 ff.).
puradl, puradu
Cf. Romani poadchavo "naughty child", poad (part.) "opened wide",
from the verb pofavl (vb. tr. 3. sg.) "to open wide", chavof (subs. dim.)
"small boy, son", to chaa (subs. pl.) "brats, naughty children". According
to an informant with a shift of meaning, so that poado > puradu means
"naughty child".
Romanian puradl, puradu "child", especially "Romani child", "brats".
Some informants quoted Romanian Roma saying am ase puradei "I've got
six children". In a fight, Romanians challenge each other: ia-ti puradii i
pleca! "take your brats and get out!". Juilland (1952: 172) related
puradl to Romani purde "naked".
puriu
Romani phur (adj. masc.), phuri (adj. fem.) "old".
Romanian puriu (adj. masc.) "old" or "old man", pure (adj. fem.) "(my)
old woman". We even encountered: o purie btrn (!) "an old purie".
Romanian purisnca "comic old woman", with a fem. composite diminutive
suffix -nc, here with an ironic touch. Cf. Graur (1934: 183), Juilland
(1952: 172).
a soil
Romani sovl (vb. intr. 3. sg.) "to sleep", sovip, sojip (subs.) "sleep, bed".
Romanian a soili (vb.) "to kip, to take a snooze", as a substantive soi in:
a trage un soi "to kip a little bit, to take a little snooze/ forty winks",
soilela "snooze, nap", with the Romanian fem. abstract suffix -el. Cf.
Graur (1934: 187), Juilland (1952: 174).
ucr, ucr
Romani ukr (invar. adj.; adv.) "beautiful, pleasant".
Romanian ucr, ucr (adj. masc.), ucr (adj. fem.) "good, pretty,
beautiful", as a compliment to women: eti ucr "you are so pretty".
Compare this to the Romani name Sukari "name of a pretty woman". In
contrast, consider the verbal derivations: a ucr (vb.) "to make somebody
angry" and a se ucr (vb. refl.) "to get angry", as in: m ucrsc "I get
angry". Cf. Graur (1934: 188 ff.), Juilland (1952: 174 ff.).
uru
Romani suri, chur "knife".
Romanian uru and ciuru "knife", specifically "knife, worn on the
body, and one used for criminal purposes". This noun is productive and
forms the Romanian verb a ciurui "to perforate, to poke holes into". The
word uriu is well integrated into Romanian and is rarely associated with
Roma. One Romani informant corrected uriu to uri "knife", which is
phonetically and semantically identical with the Romani word. Cf. Graur
(1934: 189; 1936: 200), Juilland (1952: 176 ff.).
a uti
Romani chuvl (vb. tr. 3. sg.) "to put, to put into, to stuff, to throw" etc.,
(pret.) chutja.
Romanian a uti "to pinch, to nick things" (in an elegant manner, as
assured to me by an informant), also "to pinch somebody's girl-friend" as in
-am utt gagc', utel "theft", with the fem. abstract suffix -el;
utitr and ut "thief" as in el e ut de buzunare "he is a pickpocket". Cf.
Graur (1934: 189 ff.; 1936: 200), Juilland (1952: 177).
ticlos
Romani tikalos 6 "an unhappy person, vagabond", perhaps related to
Romani tka "father", cf. Saru (1992: 160).
Romanian ticls "bad, sad, disgusting, false" (adj.), un ticlos (subs.)
"a lazy person, a man with two faces, dishonest person, liar, cheater, bandit,
immoral person", intentii ticlose (adj.) "bad intentions", ticloenie
"falsehood", with the fem. abstract suffix -ie for the formation of fem. nouns
from other nouns, adjectives and verbs which describe properties and
conditions; in a verbal construction a ticloi pe cineva "to cheat, to deceive
somebody", as in te am ticloit. Cf. Graur (1934: 192).
trla, tral
Romani tras 7 (subs.) "fear", trasl (vb. intr.) "to be afraid of",
(perf.) me trasjlem "I was afraid".
Romanian tr l, tral, with the variant -l from the Romanian fem.
abstract suffix -el, means:
1. "fear, anxiety",
2. a psychological condition, as in mi este trl, s fac ceva "something's
keeping me back from doing...", mi-e trla "I'm afraid" and generally "to
be afraid; the fear to do something, to be frightened, to avoid",
3. "embarrassment",
4. "the feeling, to suffer from loneliness".
Cf. Graur (1934: 193; 1936: 200), Juilland (1952: 177 ff.).
a se uch
Romani ustl, ucl (vb. intr. 3. sg.) "to get up, to wake up", (pret.) utlo,
cf. in Graur (1936: 199) Romani ustiav "I go away".
Romanian a se uch (vb. refl.) "to buzz off", Romanian uchta (adj.)
"something crazy", further explained as "his mind has gone away, he
remains with an empty head", uchela "flight", with the fem. abstract
suffix -el. Cf. Graur (1934: 194; 1936: 200), Juilland (1952: 178).
vast
Romani vast "hand", cf. also vastalo (adj.) "long-armed, thievish".
Romanian vast "hand", but also "paw, pocket" as in a bag vstul "to
paw, to steal", strictly speaking "to grasp", and in ia vstul de aici! "hands
off!". In earlier sources (Graur 1936: 200) documented as "punch in the
chin, push, knock". Cf. Graur (1934: 194), Juilland (1952: 179).
zbanghiu
Romani bango "bent, crooked, slanting" etc., cf. Tiktin & Miron (1986-
1990, III: 917), DEX (1975: 1040).
Romanian zbanghiu (adj.) with the Romanian prefix z- and masc.
adjective suffix -iu:
1. "squinting, cross-eyed",
2. "with a sick eye",
3. "a man, who beats his wife",
4. "an unreliable fellow".
Cf. eti zbanghu? "are you nuts?" Cf. Graur (1934: 195), Juilland (1952:
179).
zurlu
Romani zural (adj.) "strong, powerful, violent".
Romanian zurliu (adj. masc.) "with the head in the clouds; crazy; in a
happy mood", but also "naughty, impertinent", "impetuous, wild",
"something noisy, barmy, retarded", with the adjective suffix -iu, resp. fem.
-ie. Cf. dar criz e cam zurle (adj. fem.) "it's a horrible crisis", o femeie
zurlie "a crazy woman". Cf. Graur (1934: 195; 1936: 200).
5. Results and conclusions

In formal terms, the examples show a number of patterns for grammatical
adoption of Romani items. Romani nouns and adjectives are usually
assigned Romanian derivational suffixes. Romani endings are reinterpreted,
for example -o > -u, as in ginglu. A vowel suffix - is added
to Romani nouns which end in a consonant: bft < baxt. Verbs are usually
integrated into the fourth conjugation, primarily on the basis of Romani
participles and preterite forms, and seldom on the basis of the present root.
Most lexemes encountered in the context of this investigation were
humorous, familiar, or vulgar. It is to this domain of vulgar expressions
that Weinreich (1964: 58) attributed a need for synonyms, it being "an
onomastic low-pressure area". Indeed we find that many of the semantic
fields which Weinreich mentions as candidates for lexical renewal are
represented in our corpus, such as talk, hit, sleep, and sexual vocabulary.
Sexual taboos generally lead to strategies of avoiding terms that are
comprehensible to all, hence the need for (secret) loanwords. A comparable
domain is represented by expressions for stealing: Originally, such words
were borrowed to conceal criminal activities, but eventually they entered
general slang usage and are now an integral part of the Romanian colloquial
language, i.e. their secret character has been lost.
Romanians reacted strongly, and often humorously or with astonishment,
to a large part of the lexemes. They then gave exhaustive explanations of
the lexemes, with examples describing the context, to clarify to me the use
and the register of style, or the level of formality, of these words. But
Romanian Roma who had no knowledge of Romani were quite relaxed in
their reaction. They explained the meaning of the expressions with only a
single word, often without any stylistic explanation, and their definition of
the word was frequently very close to the original meaning in Romani. It
was evident that the Romanians knew relatively few lexemes, usually with a
semantically strong and special connotation. The Romanian Roma, on the
other hand, although without any knowledge of Romani, were familiar with
many lexemes, but attributed semantically neutral meanings to them.
Judging by their behavior in the interviews, the two groups, Romanians
and Romanian-speaking Roma with no knowledge of Romani, may be
considered as representing two stages in the semantic development of
Romani-origin loans in Romanian. The first, the Romanian group, has
knowledge of selected items which have shifted quite significantly from a
semantic point of view. The second, the Roma, show traces of a mixed
lexical system, where Romanian-speakers still had direct access to an active
Romani lexicon and could draw on items in their original Romani meaning.
Although Romani borrowings are recognized and known to various
degrees, the interviews have shown that none of the interviewees in either
of the groups, including the Romanian Roma who consider Romani to
be their dominant language, understood all words mentioned by Graur
(1934) and Juilland (1952). This raises the question of the authenticity of at
least some parts of the material used by the authors from Veselia or other
newspapers such as Pardon or Dimineața, which furnished a part of the
sources for Graur and Juilland.
Attempts to conduct a statistical analysis on the basis of elicited data
show that knowledge of the elicited lexemes differs most strongly among
women from both ethnic groups, Romanians and Romanian Roma, while
the degree of knowledge of lexemes among men of both groups is nearly
the same. This means that Romanian women generally tend to know the
fewest Romani loanwords. Romanian Romani women, on the other hand,
know the most lexemes. This can indicate a stronger language conservatism
among women, but it could also mean that while Romani women are still
strongly embedded in a traditional cultural context, Romanian women have
little access to those domains on the fringe of Romanian society where
taboo-related items, including Romani loans, are used.
Experience shows that it is the intellectuals and students in Romania
who have the most natural and positive relationship to the Roma. An
interesting phenomenon can be observed in the language of school students,
particularly, as far as we can tell, among pupils in Bucharest since the early
1980s. Those informants who applied a new interpretation to the
expression be (see above) were all pupils or students between the ages of
sixteen and twenty-six, who had attended school in Bucharest (with the
exception of one informant who went to school in Ploiești, sixty kilometers
to the north). Other examples of typical expressions used among pupils are
bft "success" (for encouragement before examinations), barosn "big fat
man or woman", and also in a figurative sense "important person", and
gagu "a guy, a dude" (among the youths). A central position in the
language of the pupils is occupied by the antonymic pair nasl "stupid,
bad", and mito "good, marvellous, super, pretty".
To conclude, it can be stated that the Romanian lexical items of Romani
origin dealt with in this study adhere to the criteria for various non-literary
varieties, such as Colloquial Style, Slang, or Jargon. However, what
Domaschnev (1987: 314) mentions as features of Argot - to conceal
meanings from the outside world and serve as a means of identification for
insiders - has disappeared. It remains to be seen whether the Slang
component - a large variety of synonyms (such as expressions for "to steal")
- might not disappear as well, giving way to full integration of the items in
question as 'neutral' expressions in general Romanian colloquial speech.
NOTES
1 It exists also in Bulgarian as the invariable neuter noun gde, denoting female and
male persons. With the definite neuter article -to: gdeto. It means "lover, girl-/boyfriend,
dear". None of the Bulgarians whom I asked knew about the origin of the word.
2 Romanian gen "(eye)lash" is not related to this.
3 One informant said that there are other words used by the young Romanians for
"money": mardi (<Romani), bitri (<Romani), mangti (<Romani), marafti (<Turk.).
4
Lupu (1972) mentions mitocr "crafty customer, smart lad".
5 C f . Lupu (1972) a bgpe nasoale "to slander; to rabbit on, to blether", afipe nasoale
"to be enough to drive you up the wall". But Lupu saw only a possible connection to nasl.
6 As suggested by aRomaniinformant from the control group.
' It seems that the Romani word is borrowed from an Iranian language, but Romanian
adapted it from Romani.
REFERENCES
Bochmann, Klaus (1980) Die Herausbildung soziolinguistischer Betrachtungsweisen
in der rumänischen Sprachwissenschaft. In: Bochmann, K.
(ed.) Soziolinguistische Aspekte der rumänischen Sprache. Leipzig:
Enzyklopädie. 9-34.
Boretzky, Norbert & Birgit Igla (1994) Wörterbuch Romani-Deutsch-
Englisch für den südosteuropäischen Raum. Wiesbaden: Harrassowitz.
DEX (1975) Dicționarul explicativ al limbii române [Explanatory Dictionary
of the Romanian Language]. Bucharest: Editura Academiei
R.S.R.
DEX-S (1988) Supliment la Dicționarul explicativ al limbii române.
Bucharest: Editura Academiei R.S.R.
Domaschnev, Anatolij (1987) Umgangssprache/Slang/Jargon. In: Ammon,
Ulrich, Norbert Dittmar & Klaus Mattheier (eds.) Sociolinguistics.
Berlin/New York: De Gruyter. 308-315.
Graur, Alexandru (1934) Les mots tsiganes en roumain. Bulletin
Linguistique Romane II, 108-200.
Graur, Alexandru (1936) Notes sur les mots tsiganes en roumain. Bulletin
Linguistique Romane IV, 196-200.
Graur, Alexandru (1937) Notes sur quelques mots d'argot. Bulletin
Linguistique Romane V, 222-225.
Graur, Alexandru (1960) Studii de lingvistică generală. Bucharest: Editura
Academiei R.S.R.
Graur, Alexandru, Lucia Wald & S. Stati (eds.) (1971) Tratat de lingvistică
generală. Bucharest: Editura Academiei R.S.R.
Ivănescu, Gheorghe (1980) Istoria limbii române. Iași: Editura Junimea.
Juilland, Alphonse (1952) Le vocabulaire argotique roumain d'origine
tsigane. In: Juilland, Alphonse (ed.) Cahiers Sextil Pușcariu, Vol. I.
Roma: Dacia. 151-181.
Juilland, Alphonse (1953) Les études d'argot roumain. In: Juilland, A. (ed.)
Cahiers Sextil Pușcariu, Vol. II. Roma: Dacia. 431-439.
Lupu, Coman (1972) Observații asupra argoului studențesc. Limbă și
literatură, 349-351.
Matras, Yaron (1991) Zur Rekonstruktion des jüdisch-deutschen
Wortschatzes in den Mundarten ehemaliger 'Judendörfer' in
Südwestdeutschland. Zeitschrift für Dialektologie und Linguistik 58:3,
267-293.
Rosetti, Alexandru (1978) Istoria limbii române. De la origini pînă în
secolul al XVII-lea. Bucharest: Editura Științifică și Enciclopedică.
Sarău, Gheorghe (1992) Mic dicționar rom-român. Bucharest: Editura
Kriterion.
Schroeder, Klaus-Henning (1989) Etymologie und Geschichte des
Wortschatzes. In: Holtus, Günter, Michael Metzeltin & Christian
Schmitt (eds.) Lexikon der romanistischen Linguistik III: Rumänisch.
Tübingen: Max Niemeyer. 347-357.
Sfîrlea, Lidia (1989) Sprache und Generationen. In: Holtus, Günter,
Michael Metzeltin & Christian Schmitt (eds.) Lexikon der
romanistischen Linguistik III: Rumänisch. Tübingen: Max Niemeyer.
197-208.
Steinke, Klaus (1989) Sondersprachen. In: Holtus, Günter, Michael
Metzeltin & Christian Schmitt (eds.) Lexikon der romanistischen
Linguistik III: Rumänisch. Tübingen: Max Niemeyer. 225-229.
Tiktin, Hariton & Paul Miron (1986-1990) Rumänisch-deutsches Wörterbuch.
I-III. Wiesbaden: Harrassowitz.
Vascenco, Victor (1991) Rumänische Wörter unbekannten Ursprungs unter
quantitativem und semasiologischem Aspekt. Zeitschrift für Balkanologie
27/2, 179-189.
Vasiliu, Alexandru (1933/34a) Din argoul nostru. Grai și suflet VI-VII, 95-
131.
Vasiliu, Alexandru (1933/34b) Glose la cîteva expresii din argou. Grai și
suflet VI-VII, 309-312.
Weinreich, Uriel (1964) Languages in Contact. The Hague: Mouton.
ROMANI STANDARDIZATION AND STATUS
IN THE REPUBLIC OF MACEDONIA
VICTOR A. FRIEDMAN
University of Chicago
0 Introduction
Romani is one of the few widely-spoken languages of Europe for which
basic issues of standardization (orthography, dialectal base, etc.) are in the
process of resolution. In Haugen's (1966:16-26) terms, Romani is at the
stage of selection of a norm. Because the Roms are a transnational
people, problems of language planning are additionally complicated by the
fact that they are being confronted in the frameworks of various and varying
state mechanisms. For the Roms of the Republic of Macedonia, issues of
identity maintenance and sociopolitical integration must be viewed in the
context of an educational policy that has included multilingualism for the past
half century in a state that has only recently achieved independence and is
surrounded by overt and covert threats to its integrity. This paper will
examine a specific event in the current efforts to standardize Romani, namely
a conference held in Skopje, Macedonia on November 20-21, 1992, and the
document that resulted from it. The document itself will be presented together
with commentary on its context and significance.
1. Macedonia, standardization efforts, and Romani

Macedonia has served as the site for a number of important events in the
standardization of three languages of the region, e.g. the Macedonian
codification conferences of Skopje in 1944-45, the Albanian Alphabet Conference
of Bitola (Manastir) in 1908, and the publication of the Romani
grammar by Jusuf and Kepeski (1980). For Macedonian, the results of the
first codification conference of November-December 1944 were not accepted
by the government for a variety of reasons, and attempts to rehabilitate it in
the early 1990s were based on party politics, but the basic rules of
orthography and morphology were established in May, 1945 (see Friedman
1993). In the case of Albanian, later events in Albania and Kosovo -

particularly the literary language unification of 1968-72 - have surpassed the
1908 Conference in overall effect, although the Alphabet Congress itself is
enshrined as a crucial step in the establishment of modern standard Albanian
(see Skendi 1967:366-90, Byron 1985). In the case of Romani, Jusuf and
Kepeski (1980) can be compared to Pulevski (1880), which was an attempt at
a Macedonian grammar that is seminal in its signaling of ethnic and linguistic
consciousness but not sufficiently elaborated to serve as a codification. The
language resolutions adopted at the Fourth World Romani Congress held in
Warsaw in 1990 (Cortiade et al. 1991) may become the watershed event in
the normativization of Standard Romani, but it is too early to judge.
In the past decade or so, there has been an upsurge in activity related to
the use of Romani as a means or subject of education. This is especially true
in Eastern Europe, where these efforts have occurred in the context of the
region's general sociopolitical upheaval that has its origins in the early
eighties and broke the threshold of stability at the end of that decade. One of
the results of these events has been a new focus on the situation of Romani as
a language of literacy in Macedonia, where the Roms are a constitutionally
recognized nationality and constitute 3% of the population (60,000) according
to the 1991 census (Velkovska 1991), although the actual number is
undoubtedly higher - probably between 6% and 10% - the discrepancy in
figures being due to various factors including unaccounted for classifications
and the use of religion (Islam) as a possible definer of ethnic identity.
The manuscript of Jusuf and Kepeski (1980) was ready for publication
in 1973, the same year as the second meeting of the Language Commission
of the World Romani Congress, but publication was delayed for seven years.
The alphabet that developed out of the second Language Commission meeting
was published at approximately the same time in Kenrick (1981). In the
following decade, little progress was made in Macedonia in bringing Romani
to a level similar to that of minority languages such as Albanian and Turkish.
There had been sporadic attempts at Romani language instruction at the
elementary school level, but these classes did not lead to the establishment of
a regular pedagogical program. In 1991, TV programming in Romani, and
also Vlach (Arumanian), whose speakers accounted for 0.3% of the
population, i.e. about 6,000, according to the 1981 census, was begun in
Skopje: fifteen minutes of news on Tuesday and Wednesday, respectively.
By mid-1992, Romani and Arumanian each had 25 minutes of TV time on
Wednesdays. There is also Romani radio programming elsewhere in

Macedonia, e.g. Tetovo.
2. The Macedonian Romani Standardization Conference of 1992
The November 1992 conference which is the focus of this article was
sponsored by the Ministry of Education of the Republic of Macedonia and the
Philological Faculty of the University of Skopje for the purpose of reaching
an agreement concerning the introduction of Romani as a course of study in
Macedonian schools. The conference was attended by a number of Macedonian
Roms active in Romani intellectual life, including Šaip Jusuf, Trajko
Petrovski, Gjuneš Mustafa, Šaip Isen, Ramo Rušidovski, Tahir Nuhi, Iliaz
Zendel, and others. Also present were Donald Kenrick and myself as well as
members of the Philological Faculty of Skopje University and the Macedonian
Academy of Sciences, most notably Olivera Jašar-Nasteva and Liljana
Minova-Gjurkova as well as Živko Cvetkovski, head of the Macedonian
Department. It is important to emphasize that a distinction is made between
course of study (Macedonian nastaven predmet) and language of instruction
(nastaven jazik).
The ultimate goal of the Roms present at this conference was not the
establishment of Romani as a language of instruction in a parallel education
system (nastaven jazik) but rather the teaching of Romani as a subject in
elementary schools and pedagogical academies (nastaven predmet), with a
view to preparing a cadre of teachers and ultimately a lectureship and
Department of Romani at the University of Skopje. One of the explicit goals
of Romani politics in Macedonia is the establishment of such a Department,
but a qualified cadre of faculty has yet to be trained. It is worth noting that
some Roms in Macedonia have been under pressure to assimilate to Albanian
or Turkish language - for which government-funded parallel education
systems exist in Macedonia - on the basis of shared religion, i.e. Islam, a
situation that is also occurring among Macedonian Muslims. The Macedonian
government has thus begun to support the preservation of Romani ethnic and
linguistic identity not only in connection with Article 48 of the Republic's
constitution, which guarantees minority language rights, but also in order to
reduce challenges from Albanian and Turkish.
The Romani-identified political party in Macedonia, the Party for the
Complete Emancipation of the Roms of Macedonia (Romani Partija Saste
Emancipacijake e Romengiri tari Makedonija, Macedonian Partija za Celosna
Emancipacija na Romite na Makedonija or PCER) had at this time one

representative in parliament, Faik Abdi. During 1992, a second party, the
Democratic Progressive Party of the Roms, split from PCER over various
issues, including questions of language standardization, dialectal compromise
and the place of Romani in educational institutions (Nova Makedonija
21.X.1992:4). It was in this context that the November 1992 conference took
place. Although representatives of different sides of Romani politics were
present, the conference itself was kept apart from partisan considerations.
Nonetheless, the document that resulted from these deliberations, reproduced
here in English translation with additional commentary in square brackets,
was agreed upon by representatives of the various political currents as well as
by the intellectuals that produced it. The document addresses a number of
general and specific issues in Romani language standardization and should be
viewed in the context of Jusuf and Kepeski (1980), Kenrick (1981), and
Cortiade et al. (1991). As indicated above, both Jusuf and Kenrick were
present at the conference. Moreover, both Jusuf and Kenrick participated in
the deliberations of the Language Commission at the Fourth World Romani
Congress, at which Cortiade et al. (1991) was discussed and signed. Jusuf
was a signatory to that document, but Kenrick was not. Mention should also
be made here of Hancock (1975, 1993), which, while important for the
history of Romani standardization, did not have a direct bearing on the 1992
conference. The former had been superseded by subsequent publications and
events while the latter had not yet appeared.
3. The 1992 Romani Standardization Document: text and commentary
GENERAL PRINCIPLES
'1. This codification is for the Romani language as a course of study in the
Republic of Macedonia. This codification is viewed as a necessary step toward
the international Romani literary language and not in competition with
it.'
This statement was intended to address Cortiade et al. (1991). The
Romani participants in the conference felt that the situation in Macedonia
required a regional standard for use in Macedonian elementary schools, with
a view to study of the international standard later. See also comments on the
Alphabet.
'2. In view of the fact that the majority of Roms in the Republic of Macedonia
use the Arlija dialect, this dialect shall serve as the basis of the Romani
literary language in the Republic of Macedonia, but with certain grammatical,
phonological, and especially lexical additions (and modifications) from all the
Romani dialects of the Republic of Macedonia such as Džambaz, Burgudži,
Gurbet, and others.'
Throughout the history of Romani standardization efforts in Macedonia,
Arlija has served as the basis, but, as indicated above, this question had become
a politically divisive issue. This compromise was satisfactory to all
present at the conference.
ALPHABET
'The Romani alphabet in the Republic of Macedonia consists of the following
letters in Latin transcription. The corresponding Macedonian orthography is
used for Cyrillic.
Aa, Bb, Cc, Čč, Čh/čh, Dd, Dž/dž, Ee, Ff, Gg, Hh, Ii, Jj, Kk, Kh/kh, Ll,
Mm, Nn, Oo, Pp, Ph/ph, Rr, Ss, Šš, Tt, Th/th, Uu, Vv, Zz, Žž'
The corresponding Cyrillic alphabet would be the following (the order
follows that of the Latin alphabet):
Most of the differences between this alphabet and those proposed in
previous works are treated below. Three salient issues not addressed
otherwise are the indication of unpredictable stress by means of a grave
accent, the use of the acute instead of the hacek (Romani čiriklo) to mark
palatals, and the use of a special symbol ʒ for the voiced palatal affricate
(fricative in some dialects), all in Cortiade et al. (1991). The issue of marking
stress is not addressed here. Apparently the participants did not feel it was an
issue. The use of the hacek instead of the acute for strident palatals is well
established both in East European orthographies and in linguistic
transcriptions, as well as in many Romani publications, including both Jusuf
and Kepeski (1980) and Kenrick (1981). Despite the fact that a number of
publications have appeared in the orthography of Cortiade et al. (1991), the
participants in the conference preferred to follow the type of practice currently
in use in Macedonia and elsewhere (e.g. Petrovski 1992, Hübschmannová et al.
1991). The same attitude dictated the use of Dž instead of ʒ. The fact that the
acute is used for mellow palatals in East European languages that are in
contact with Romani, e.g. Serbo-Croatian and Polish, was also felt to favor
the use of the hacek for strident palatals for Romani in Macedonia.
The situation can be compared to that of the Albanian alphabet congress
of 1908. The crucial decision of that congress was the adoption of the
principle that Albanian would be written in a Latin alphabet rather than Arabic
or Greek, although the two major Latin alphabets then in use - one based on
the principle of one letter per sound the other using digraphs - were both
endorsed. Eventually a single alphabet became official. Similarly, while most
Roms agree that the alphabet used for Romani should be Latin (the mention
of Cyrillic is simply for contexts where transliteration of individual items
might be desirable in Macedonia, as opposed to the exclusive use of Cyrillic
in Malikov 1992), there is not yet a general consensus concerning the details
of orthography. See especially points 4 through 8 below.
COMMENTARY
'In some Romani dialects, the uvular fricative /x/ is distinguished from the
glottal aspirate /h/ and/or the rolled /r/ is distinguished from another related
type of sonorant, but in view of the fact that such distinctions are not made in
the Arlija dialect, special letters [for these distinctions] have not been
introduced into the alphabet. Such pronunciations are permitted as literary for
those speakers who have such distinctions in their native dialects.'
These are two major contested issues. Jusuf and Kepeski (1980),
Kenrick (1981) and Cortiade et al. (1991) all prescribe a graphic distinction
for /x/ and /h/, but Jusuf and Kepeski (1980) fail to make the distinction in
practice, using both <x> and <h> in the same roots, e.g. xiv, hiv "hole", xor
"depth" but horadaripe "deepening", an illustration of the problem that would
be encountered by speakers of dialects without the distinction. Given that
minimal pairs are extremely rare and that the sounds themselves are in free
variation in dialects where they are not phonemic, the majority of Roms at the
meeting felt that a single grapheme should be used. In the case of the two
types of /r/, Cortiade et al. (1991) makes the distinction facultative while both
Kenrick (1981) and Jusuf and Kepeski (1980) give only one /r/ in their
standard orthographies.
BASIC ORTHOGRAPHIC, MORPHOLOGICAL AND MORPHOPHONOLOGICAL RULES
'1. There is no special sign for the so-called dark vowel (schwa) in the
alphabet because the vowel is very rare, marginal, or entirely absent in most
Romani dialects. In the rare instances of schwa in the Arlija dialect, the
corresponding form in Džambaz or some other Romani dialect with a
different vowel will be taken as the literary norm, e.g. instead of Arlija
vərdon ["wagon"] Džambaz vurdon is accepted.'
Schwa is of foreign origin in Romani and only occurs in dialects
influenced by contact with languages where schwa is phonemic. Its
occurrence is generally limited to borrowings from those languages. Thus,
for example, in Jusuf and Kepeski's (1980) vocabulary of 2,292 entries,
only 31 items, representing at most 21 roots, contain schwa. Although Jusuf
and Kepeski (1980) use the sign <> for schwa, Kenrick (1981) and
Cortiade et al. (1991) both exclude it as dialectal. In view of the fact that
many occurrences of schwa in one dialect correspond to some other vowel in
another dialect, this was viewed as a good opportunity for expanding the
vocabulary of standardized Macedonian Romani beyond the limits of the
Arlija dialect.
'2. Where there is aspiration in the root of a word, it will always be written,
e.g. jakh ["eye"].'
Although some dialects have deaspiration of underlying aspirates in
some positions, the morphophonemic principle of representing the
underlying morphophoneme in spelling, which is common to
many of the languages of Eastern Europe, was adopted.
'3. Automatic devoicing is not spelled at the end of a word, e.g. dad
["father"].'
Same as point 2 above.
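The morphophonemic principle behind points 2 and 3 can be pictured as a one-way mapping: the spelling keeps the underlying consonant, and the automatic rule is applied only in pronunciation. The Python fragment below is a minimal sketch of that idea for final devoicing. The function name and the particular set of voiced/voiceless pairs are assumptions made here for illustration; they are not taken from the 1992 document.

```python
# Illustrative sketch only. The consonant pairs are an assumed inventory,
# not a list given in the 1992 document.
FINAL_DEVOICING = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s"}


def surface_form(spelling: str) -> str:
    """Return a rough pronunciation in which a word-final voiced
    obstruent is devoiced. The spelling itself is never changed."""
    if spelling and spelling[-1] in FINAL_DEVOICING:
        return spelling[:-1] + FINAL_DEVOICING[spelling[-1]]
    return spelling


# The word is spelled with the underlying final d (point 3);
# the devoiced consonant appears only in the derived pronunciation.
print(surface_form("dad"))
```

Because the mapping runs from spelling to speech and not the other way round, speakers of dialects with and without the automatic rule can share a single written form.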
'4. Where an underlying dental or velar stop or sonorant occurs before a
front vowel or jot, i.e. t, d, k, g, l, n plus i, e, j, the underlying consonant is
used in spelling, e.g. buti ["work"], kerdjum ["I did"], geljum ["I went"], lil
["letter"], pani ["water"].'
This is an area of both considerable and salient dialectal variation and
morphophonemic alternation in Romani. Underlying or historical dental
and/or velar stops in these positions can be pronounced as palatals and/or
with affricated or fricativized articulation in various dialects of Macedonia and
elsewhere (see Ventcel' and Čerenkov 1976 for details), e.g. Arlija buti,
Džambaz buki, Burgudži buci, Gurbet buci; singular buti/plural buka, etc.
Similarly, /l/ and /n/ can become palatals or be lost, e.g. Arlija pani but Džambaz
pai (< *pani). Jusuf and Kepeski (1980) show considerable variation, e.g.
writing both <k> and <kj>, <l> and <lj>, etc. before front vowels in the
same lexical items at different occurrences. Cortiade et al. (1991) articulates
this same principle for velars, but has special graphic symbols for alternating
dentals and velars in their function as case markers (also called postpositions,
see Friedman 1991), viz. θ and q, respectively. Thus in the orthography of
Cortiade et al. (1991) the same morphophonemic alternations have different
spellings, while the same graphic symbols have different pronunciations, as
illustrated in the following table:
                    Cortiade        1992 Macedonian   dialectal
                    et al. (1991)   Conference        pronunciations
Rom (loc. sg.)      Romesθe         Romeste           [romeste] [romesce]
Rom (loc. pl.)      Romenθe         Romende           [romende] [romende]
Rom (abl. sg.)      Romesθar        Romestar          [romestar]
Rom (abl. pl.)      Romenθar        Romendar          [romendar]
Rom (dat. sg.)      Romesqe         Romeske           [romesce] [romeske]
Rom (dat. pl.)      Romenqe         Romenge           [romende] [romenge]
done (m.sg.pt.)     kerdo           kerdo             [cerdo] [kerdo]
done (pl. pt.)      kerde           kerde             [cerde] [kerde]
There has also been confusion in prepositions and adverbs, e.g. and-o
"in the" (Cortiade et al. 1991) but and-o "in the" anddro "inside" (Sarău
1992).
It is important to emphasize that in the various dialects of Romani the
same phonological changes that affect the dental case endings also affect the
dental participial marker, and, similarly, the same processes affect velars
before front vowels in both roots and grammatical endings in those dialects
with fronting of velars. To this can be added the fact that voicing is
distinctive. The Roms present at the 1992 meeting were unanimous in their
decision to follow morphophonemic practice and spelling using underlying
consonants. Cf. comments on the alphabet above.
'5. In writing the first person singular aorist the final consonant is preserved
according to the root, e.g. kerdo ["done"] > kerdjum ["I did"].'
This is a specific example of point 4, but at the same time it specifies the
Arlija form of the first singular aorist, which can also be -em or -om, with or
without jotation, and is a salient dialectal feature, i.e. one which is taken by
speakers themselves as indicative of dialectal affiliation. In Cortiade et al.
(1991), a hacek is used over a vowel that follows a consonant that is jotated
in some dialects but not in others (except in case suffixes, see point 4), e.g.
kerdǒm.
'6. Where there is jotation, it is written with j ; the letter i is only written as a
vowel:
Romni, Romnie, Romnja, Romnjatar, etc.
["nom. sg., voc. sg., nom. pl., abl. pl."]'
Jusuf and Kepeski (1980) are inconsistent in writing <i> for <j> in final
position in some words, e.g. saj/sai "it is possible", muj/mui "mouth".
Similarly, there has been variation in the spelling of feminine obliques under
the influence of the nominative e.g. abl. Romniatar/Romnjatar for
[romnjatar]. Given that /i/ can contrast with /j/ as indicated in the vocative
singular and nominative plural forms of Romni "(Romani) woman, wife"
cited here, it was agreed that there was a basis for a consistent distinction.
See also point 5 above.
'7. In suffixes where Džambaz and other dialects preserve an older s which
has been lost in Arlija, s is written: devles, devlesa ["god" acc. sg., instr.
sg.], manges, mangesa, mangas, mangasa ["want", 2 sg. short/long, 1 pl.
short/long].'
Jusuf and Kepeski (1980) do not address this issue directly but rather
mix forms with and without original /s/ throughout the work. In Cortiade et
al. (1991) this problem is addressed in the instrumental case but not
elsewhere. Thus, for that suffix there is a special grapheme, viz. ç, but no
prescription for other positions where /s/ alternates with /j/, ∅, etc. Hence
devleça, but mangesa or mangea. This was another area of important
compromise for the Roms present at the conference. The fact that the forms
with /s/ are older while those without /s/ represent dialect-specific innovations
gave greater authority to the principle of adopting the compromise.
'8. The instrumental case is always written with s, e.g. mansa ["with me"],
Romensar ["with the Roms"].'
Although /s/ > /c/ after /n/ in the instrumental, the alternation is automatic
and so the underlying /s/ is kept in spelling. Moreover, in Arlija and some
other dialects the instrumental plural has a variant in /-r/. Although this is not
etymological, the Romani participants included it.
'9. The personal pronouns are the following:
me, tu, vov, voj [1 sg., 2 sg., 3 sg. m., 3 sg. f.]
amen, tumen, von or ola [1 pl., 2 pl., 3 pl.]'
There is considerable variation of form in the third person personal

pronouns and in the possessive pronouns. The 1992 conference accepted
variation in the third plural personal pronoun but not elsewhere.
'10. The possessive pronouns are the following:
mo, to, po [1 sg., 2 sg., reflexive]
amaro, tumaro [1 pl., 2 pl.]
leske, lake, lenge [3 sg. m., sg. f., pl.]'
See point 9 above. The singular possessive pronouns show the most
variation and phonological change. By adopting these forms, those at the
conference felt they were choosing those forms which were simplest and
most transparent and most easily learned by speakers of other dialects.
'11. The definite articles are the following:
nominative masculine singular: o
nominative feminine singular: i
all others: e
o Rom, i Romni, e Roma, e Romnja, e Romeske, e Romnjake, e Romenge,
etc.'
These are the rules for Džambaz rather than Arlija, which has o in the
nominative plural. It thus represents a significant dialectal compromise. It
also gives equal weight to marked masculine and feminine forms rather than
marking the feminine nominative i as opposed to the masculine and plural
nominative o as is done in Arlija.
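The article rule in point 11 is compact enough to be stated as a small decision procedure. The following Python fragment is only an illustrative sketch of that rule: the function name and the string labels for case, number, and gender are assumptions introduced here, while the article forms and the example nouns are those quoted in the document.

```python
def definite_article(case: str, number: str, gender: str) -> str:
    """Select the definite article following point 11:
    nominative singular masculine o, nominative singular feminine i,
    and e everywhere else. The labels "nom", "sg", "m" are assumed
    conventions of this sketch; any other gender label is read as feminine."""
    if case == "nom" and number == "sg":
        return "o" if gender == "m" else "i"
    return "e"


# Example nouns taken from point 11; the grammatical labels are supplied here.
examples = [
    ("nom", "sg", "m", "Rom"),
    ("nom", "sg", "f", "Romni"),
    ("nom", "pl", "m", "Roma"),
    ("nom", "pl", "f", "Romnja"),
    ("dat", "sg", "m", "Romeske"),
    ("dat", "sg", "f", "Romnjake"),
    ("dat", "pl", "m", "Romenge"),
]

for case, number, gender, noun in examples:
    print(definite_article(case, number, gender), noun)
```

Under the Arlija pattern described in the commentary, the nominative plural would instead take o, so only the first branch of the function would need to change.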
'12. The comparative is formed with the prefix po- or the suffix -eder. The
superlative is formed with maj-.'
At this stage it was felt that the most essential points had been covered
and that a basis for further work and elaboration had been established.
Points 12 through 15 were added as examples of the type of elaboration
that would need to be covered in the course of standardization.
'13. The names of the days of the week are the following: kurko, palkurko,
dujtodi, tritodi, startodi, panctodi, savato ["Sunday, Monday, etc."]
14. The names of the seasons of the year are the following: anglonilaj, nilaj,
palonilaj, iven ["spring, summer, etc."]
15. The names of the months are the international ones.'
4. Conclusion
Haugen (1966:16-26) defines four stages in the development of language
planning and standardization: 1) selection of norm, 2) codification of form, 3)
elaboration of function, and 4) acceptance by the community (cf. also Ismajli
1991). As was indicated at the beginning of this article, the Romani

documents under discussion here all represent the first stage. They are
proposals, initiatives concerning linguistic form, but their implementation is
still in process. When compared with early documents in the histories of
other languages spoken in Macedonia (cf. Xhuvani 1905/1980, Koneski
1950, Peco et al. 1972), the 1992 document under consideration here can be
seen to be of the same basic type. In addition to addressing essential issues
such as dialectal base and alphabetical form, features of phonology and
morphology that are most basic, problematic and/or salient to speakers
themselves are those for which norms are specified. Language contact is
generally studied in terms of lexical and structural borrowing, but
normativization can also be seen as a contact process, especially in a context
such as that of Romani in Macedonia. The elaboration of a Romani standard
in Macedonia is taking place not only in the environment of the international
Romani movement, but also in contact with other standard languages that
have themselves only recently achieved elaboration and acceptance, which are
in fact on-going processes. Just as Lunt (1951), by encouraging Macedonian
grammarians to use the third singular present as the citation form of the verb,
had a significant impact on the standardization of conjugational types, so too,
documents such as that produced by the 1992 Skopje conference have the
potential to influence the on-going process of Romani normativization in such
a way that the lessons of other languages can be applied and modified where
appropriate.
REFERENCES
Byron, Janet (1985) An overview of language planning achievements among
the Albanians of Yugoslavia. International Journal of the Sociology of
Language 52, 59-92.
Cortiade, M. et al. (1991) I alfabta e standardone Rromane chibaqiri,
Dcizia "I Rromani Alfabta". Informaciaqoro Lil e Rromane Uniaqoro
1-2, 7-8.
Friedman, Victor A. (1985) Problems in the codification of a Standard Romani
Literary Language. In: Grumet, Joanne (ed.) Papers from the
Fourth and Fifth Annual Meetings: Gypsy Lore Society, North American
Chapter. New York: Gypsy Lore Society. 56-75.
Friedman, Victor A. (1991) Romani nominal inflection: cases or postpositions?
Problemy opisu gramatycznego języków słowiańskich
(=Studia gramatyczne, vol. 11). Warsaw: Polish Academy of Sciences.
57-64.
Friedman, Victor A. (1993) The First Philological Conference for the
Establishment of the Macedonian Alphabet and the Macedonian Literary
Language: its precedents and consequences. In: Fishman, Joshua (ed.)
The Earliest Stage of Language Planning: The "First Congress" Phenomenon.
Berlin: Mouton de Gruyter. 159-80.
Hancock, Ian (1975) Problems in the creation of a standard dialect of Romanés.
(Social Sciences Research Council Working Papers in Sociolinguistics
No. 25.) Arlington, VA: ERIC.
Hancock, Ian (1993) The emergence of a Union dialect of North American
Vlax Romani, and its implications for an international standard. International
Journal of the Sociology of Language 99, 91-104.
Haugen, Einar (1966) Language Planning and Language Conflict: The Case
of Norwegian. Cambridge, MA: Harvard University Press.
Hübschmannová, Milena, et al. (1991) Romsko-český a česko-romský
kapesní slovník. Prague: Státní pedagogické nakladatelství.
Ismajli, Rexhep (1991) Mbi normën gjuhësore. In: Ismajli, Rexhep. Gjuhë
dhe etni. Prishtina: Rilindja. 303-326.
Jusuf, Šaip and Krume Kepeski (1980) Romani gramatika - Romska gramatika.
Skopje: Naša Kniga.
Kenrick, Donald (1981) Romano alfabeto. Loli Phabaj 1, 3-4.
Koneski, Blaže (1950) Za donesuvanjeto na makedonskata azbuka i pravopis.
Makedonski jazik 1:5, 99-105.
Lunt, Horace (1951) Morfologijata na makedonskiot glagol. Makedonski
jazik 2:6, 123-31.
Malikov, Jašar (1992) Cigansko-bălgarski rečnik. Sofia: Fondacija "Otvoreno
obštestvo".
Peco, Asim et al. (eds.) (1972) Bečki književni dogovor. In: Peco, Asim et
al. (eds.) Srpskohrvatski jezik. Beograd: Interpres. 40-41.
Petrovski, Trajko (1992) O Siljan Štrko. Skopje: Detska radost.
Pulevski, Gj. (1880) Slavjano-naseljenski-makedonska slognica rečovska.
Sofia: Ugrin Džikov.
Sarău, Gheorghe (1992) Mic dicționar rom-român. Bucharest: Kriterion.
Skendi, Stavro (1967) The Albanian National Awakening. Princeton:
Princeton University Press.
Velkovska, Vera, ed. (1991) Broj i struktura na naselenieto vo Republika
Makedonija po opštini i nacionalna pripadnost: Sostojba 31.03.1991
godina. Skopje: Republički zavod za statistika.
Ventcel', T. V. and L. N. Čerenkov (1976) Dialekty ciganskogo jazyka. In:
Konrad, N. I. (gen. ed.) Jazyki Azii i Afriki I. Moscow: Nauka. 283-
332.
Xhuvani, Aleksandër (1905/1980) Për themelimin të nji gjuhë letrare. In:
Xhuvani, Aleksandër. Vepra. Tirana: Akademia e Shkencave e
Shqipërisë. 3-7. (Originally published under the pseudonym Dok Sula
in the periodical Albania 9:8, 162)
TRIAL AND ERROR IN WRITTEN ROMANI
ON THE PAGES OF ROMANI PERIODICALS
MILENA HÜBSCHMANNOVÁ
Charles University, Prague
0. Introduction
Romani is the language of a minority subjected to genocidal and
ethnocidal aggression for centuries, a dispersed minority sub-divided into
jati-like (caste-like) subgroups which, like the Indian jatis, maintain
relations of social distance. Only the intensive as well as extensive
pressure of the mainstream gadžikane societies brings Romani jatis
together, at least at the level of ideology, as expressed by sayings shared
by all of them: Rom Romeha - gadžo gadžeha (Rom with Rom - gadžo with
gadžo) or Sem Roma sam! (We are Roma!).
Under what political and social conditions did Slovak Romani in the
Czech and Slovak Republics start to develop as a literary language? I would
like to give a short survey of this process and to show how Romani
"behaves" on the pages of periodicals edited by Roma.
The forms of ethnonymic terms will be used on the basis of Romani
grammar: Rom (sg.), Roma (pl.), Romani (as adjective; also stands for the
Romani language), romipen - romhood, Romani tradition, culture etc. In
Slovak Romani the suffixes -iben, -ipen, -ben, -pen are highly productive,
and they help derive nouns not only from verbs, adjectives, numerals and
adverbs but also from substantives: kher - kheriben (house - dwelling), lav -
laviben (word - way of talking, way of using words), phral - phral'ipen
(brother - brotherhood) etc. In contrast to romanipen used by Mirga (1987),
the Slovak Roma use the expression romipen. For a non-Rom the term
gadžo (sg.), gadže (pl.) is applied; gadžikano (adjective). In agreement with
Hancock (1993) the terms "internal/external difference" between Romani
dialects are used. The term "calque", more common in French or Czech
linguistics, is applied in correspondence with the definition of the Random
House English Dictionary (1973): "... (translation) resulting from bilingual
interference in which a syntactic structure of the borrowed construction is
maintained but its morphemes are replaced by those of the native language".
1. Romani groups and dialects in the Czech and Slovak Republics

Following their own distinctions, the Roma in former Czechoslovakia
(ca. 600 000 - about 400 000 in Slovakia, 200 000 in the Czech part) fall
into five groups. The differences between them are based on a complex of
inter-linked historical, political, socio-professional, cultural and linguistic
specifics.
Romani groups resident in the Czech lands (Bohemia and Moravia)
before World War II were the Czech Roma and the Sinti (German Roma) -
approximately 8000 people in 1939. In March 1939, Bohemia and Moravia
became a protectorate of Nazi Germany, and under its racist laws nearly all
the Czech Roma and Sinti were exterminated in concentration camps.
In May 1945, only about 600 Romani survivors still lived in the Czech part
of the reunited Czechoslovakia.
Only a few families of Czech Roma live in the Czech Republic today,
scattered among the gadže. They no longer speak Romani. Czech Romani
was described by Puchmayer as early as 1821. Lípa (1965) classified
Czech and Slovak Romani as one dialect (československá cikánština).
However, the Czech Roma and the Slovak Roma do not consider each other
to belong to the same group. Most of the Czech Roma were itinerant. Their
traditional professions were horse-dealing and knife-grinding; some were
kettle-smiths.
The Sinti shared the tragic fate of the Czech Roma. Few Sinti families
still live in Czech or Moravian towns (Brno, Beroun, Olomouc etc.). The
Sinti dialect is spoken by the older generations. Sinti, in contrast to other
Romani groups, keep their language secret. The term Laleri (lalo - mute) is
also used for the Czech Sinti. It is a calque of the Czech expression for a
German - Němec, derived from němý - mute. Many Sinti in the Czech
Republic have relatives in Germany (Weinrich, Lagrin etc.), and they keep
in contact with them. Sinti dialects from Hameln, Köln, and Hildesheim were
recently described by Holzinger (1993). The author states that Laleri
(the Czech Sinti dialect) is close to the German Sinti language.
Until 1945, the Slovak Roma, Ungrika Roma and Vlachi (Vlaxi) lived
only in Slovakia. After World War II, they started to migrate to the
industrially developed Czech and Moravian towns in search of jobs and better
housing. Most of the ca. 200 000 Roma living in the Czech Republic today
are post-war migrants from Slovakia and/or their descendants.
The great majority of Roma in Slovakia (the Slovak and Ungrika
Roma) have been settled on the outskirts of villages for three or four
centuries. They provided the peasants with blacksmith and musical services;
well-digging, adobe production, basket weaving and seasonal field-work were
also the labour domain of Roma (Hübschmannová 1984). Specific traditional
production and services of Roma became an integral part of the economy of
semi-feudal, agrarian Slovakia. This is perhaps the most important reason
why during World War II, when Slovakia was a pro-Nazi but formally
independent state (March 1939 - May 1945), Roma were not annihilated as
a group. Though sent to forced-labour camps, evicted from villages to
deserted places and not allowed to enter big towns, they survived. The
estimated number of Roma in Slovakia in 1945 amounted to 100 000.
The Slovak Roma are the most numerous group in the Slovak as well as
in the Czech Republic (65-70 % of the total Romani population). The older
generations used the attribute slovenska/ slovaika (Slovak) to distinguish
themselves from the Ungrika Roma, Vlachi, and possibly also other groups
who were not amare Roma (our Roma). Slovak Roma who were born in the
Czech part - many of them have never visited Slovakia - do not call
themselves slovenska any more. Today it is: My sme Romové-Češi. (We are
Roma-Czechs.) It is quite interesting to note that the attribute serbika
(Serbian), which was quite common among the Slovak Roma four or five
generations ago, has not fallen into oblivion completely, though several
centuries divide the Slovak (serbika!) Roma from their pre-Slovak
transitory home-land - Serbia and/or Croatia. Rumungre is an appellation
used for the Slovak Roma as well as for the Ungrika Roma by Vlachi,
though on grounds of etymology it should only be used for the latter: Rom
Ungro - Hungarian Rom.
The Humenné variety (a town in Eastern Slovakia) of Slovak Romani has
been described by Lípa (1963). The nominalisation system of Slovak Romani
was analysed by Hübschmannová (1984). The Romani-Czech/Czech-Romani
dictionary (Hübschmannová et al. 1991) is also based on (East)
Slovak varieties of Romani.
One regional variety of Slovak Romani is spoken in Southern Poland.
However, here its speakers are called Carpathian or Highland Roma
(Bartosz 1981) or Bergitska Roma (Mirga, personal communication). They
share with the Slovak Roma not only the language but also similar
traditional folklore (Kopernicki 1925) and common surnames (Mirga,
Mizikar, Pešta etc.).
In the east, Slovak Romani extends beyond the border of Ukraine to
the region of Užhorod and Mukačevo. In the west, West-Slovak Romani is (or
rather was) spoken by Roma of South-east Moravia who survived World
War II.
Ungrika (Hungarian) Roma have been settled for several centuries in
South-Slovak regions also inhabited by the Hungarian minority. Their
number amounts to 10-15 % of the total Romani population. Hungarian
Romani differs from other dialects spoken in former Czechoslovakia not only
in external features (many loan words from Hungarian) but in internal
features as well: the suffix -ahi for the imperfect tense, the causative verb
(petovel - to be baked; pekel - to bake; pekavel - to have somebody bake),
the clitics lo, li, le (Holzinger 1993 uses this term in describing Sinti)
etc. Hungarian Romani was also spoken in Northern Hungary, in Burgenland
(Austria), and in Slovenia (Halwachs 1994). In all these countries it is now
nearly extinct. In former Czechoslovakia, though decaying, it is perhaps
still best preserved.
The Vlachi (Vlaxi) represent 10-15 % of the total Romani population
in the Czech as well as in the Slovak Republic. Their language is a set of
lovari varieties; although their traditional profession was horse-dealing
(ló - Hung. horse), they do not call themselves either lovari or Vlachi but
just Rom(a). The Vlachi were sedentarised by force on the basis of Law 74
of 1958. Until then they were itinerant. Romani is the first language in
probably all the Vlachi families. It is spoken by all generations at home as
well as in public.
2. Historical background: Assimilation policy under the communist government
2.1. The first wave of assimilation (1958-1969)
At a session dedicated to the "Gypsy question" on April 8th, 1958, the
Central Committee of the Communist Party of the CSSR labelled the Roma
as a "socially and culturally backward population with a special way of
life". In the resolution it is further declared: "It is necessary to reject all the
artificial efforts of certain cultural activists, who would like to create a
written language on the basis of Gypsy dialects. If their endeavour succeeded,
the re-education of the Gypsies would be hampered ... and their primitive way
of life would be conserved, as they would become even more isolated from
other members of the working class."
At this session the policy of strict assimilation was adopted and it was
implemented till November 1989, with a short interval during the "Prague
spring".
2.2. First efforts at ethnic emancipation (August 1969 - March 1973)

As a result of the "Prague spring" or Dubček's policy of "socialism with
a human face", Svaz Cikánů-Romů (SCR), the Union of Roma, was founded
in August 1969. The SCR published a bulletin called Romano l'il. The first
two issues were written in Czech, but from the third issue onwards Romani
texts began to appear. Andrej Pešta (born 1921 in Spišská Nová Ves,
Eastern Slovakia), a locksmith and poet, was responsible for the Romani
column. He was also a member of the Linguistic Commission of the SCR,
which developed an orthography for Slovak Romani (the SCR
orthography).
Why Slovak Romani? The Slovak Roma were the most numerous. At that
time they were not yet afflicted by language assimilation as the younger
generations are today. Besides, all the Roma who in the seventies "exploded
into writing" were using Slovak Romani (Bartoloměj Daniel, František
Demeter, Tera Fabiánová, Andrej Giňa, Ilona Lacková, Andrej Pešta).
The SCR orthography, slightly modified, is still used today in most of
the Romani publications, in the Romani-Czech/Czech-Romani dictionary
(Hübschmannová et al. 1991), and in newspaper articles written or corrected
by those who had a chance to get acquainted with it. The main principles
were published in Romano l'il nevo (Hübschmannová 1993). A summary of
the essentials of the SCR orthography is presented in section 4.
2.3. The new wave of assimilation and further attempts at ethnic emancipation (1973-1989)
When in March 1973 the Union of Gypsies-Roma was banned, a new
wave of assimilation swept away all ethnic emancipation efforts. Romani
poems piling up in the editing offices of Romano l'il remained unpublished
until 1979, when the Cultural House of Praha 8 consented to publish them
as "auxiliary material for cultural activists". The bilingual booklet Romane
gil'a was issued in 200 copies.
From the beginning of the eighties the echo of Gorbachev's
"perestroika" was alleviating the pressures of the assimilation policy. Romani
resounded in songs sung on public stages by folklore groups, some of which
still perform today (Khamoro and Perumos in Prague, Čercheň in
Rokycany etc.). The current chairman of the Romani political party ROI,
Dr. Emil Ščuka, founded an amateur theatre "Romen" in the West
Bohemian town Sokolov. His first play, Amaro drom, was written in Czech,
with some dialogues in Romani. The second play, a dramatised tale, was
entirely in Romani. In 1984, the central Cultural House of Praha published
another bilingual booklet of Romani proverbs called Lačho lav sar maro.
The next bilingual publication (1987), So hin pro svetos jekhbuter?, was a
booklet of Romani riddles and anecdotes. Both the proverbs and riddles
were collected with the cooperation of students who attended the Romani
course in the School of Languages in Prague (1976-1991). Neither Lačho
lav sar maro (200 copies) nor So hin pro svetos jekhbuter? (400 copies)
was for sale. Not many copies reached the hands of Roma. But both
books are a source for Romani texts regularly published in Romani
periodicals today. In all these publications, Slovak Romani and the SCR
spelling were used.
3. Changes after the "velvet revolution" of November 1989

3.1. Explosion of anti-Gypsyism
November 1989 was a political and social earthquake. As on many
occasions in the previous course of history, Roma became a scapegoat for all
the aggression provoked by general helplessness in solving serious
problems brought to the surface by the "velvet revolution". Unskilled Roma can
hardly find jobs in the free market economy - especially when
profit-chasing entrepreneurs are seized by general anti-Gypsyism. Even the
honest well-to-do Romani businessmen, whose private enterprise has become
legalised, are not spared from the attacks of skinheads, often applauded by
the gadžikano public. However, in accordance with the Romani proverb Sar
upre, avka the tele, sar tele, avka the upre - "up the same as down, down the
same as up" - the painful transition from the communist system to
democracy is accompanied not only by "downs", but by "ups" as well.
3.2. The status of "národnost" (national minority) and its implications

The main "up" which the Roma have achieved is the political status of
národnost with the right to form their own political and cultural
organizations and to develop their own language and culture.
3.2.1. Extra-linguistic obstacles in introducing Romani to new social functions
Within the new political situation, authorities would tolerate the use of
Romani in all spheres of public life: at school, on TV and on the radio, as
well as in the press. However, there are many reasons why it is impossible
to introduce Romani immediately into a new set of social functions and
spheres of communication. Slovak Romani, which in the period of the SCR
was the primary means of communication, is being forgotten by the
youngest generation. Many school-children possess only a passive
knowledge of it. (This does not hold for Slovakia, where at least in the Gypsy
settlements Romani is still the first language.) Unfortunately, the decay of
Romani does not mean that Roma have mastered Czech. The mother-tongue
of many Roma has become a non-standardized ethnolect of Czech in which
the phonetic, grammatical, semantic and stylistic structure of Romani is
calqued. The gradual dispersion of the Romani communities, increasing
contacts with the gadže in the neighbourhood, at work or at school, the
decay of traditional oral folklore along with the strong assimilatory policy
have caused a cultural and linguistic vacuum rather than the intended
"transition of Gypsies to higher Czech culture". Besides, the Roma were
driven to develop schizophrenic attitudes towards their romipen. Many
parents protest against the idea of introducing Romani into school. They
consider it a humiliation and discrimination against their own children.
When in Chanov, the Romani quarter of Most, I addressed a young woman
in Romani, she retorted in Czech: "Gypsy is not spoken here! Take care,
there are children present!". I believe it would not be difficult to change this
attitude if good teachers who know Romani in practice as well as in theory
and good text-books were available. Both are lacking at present.
In 1992, a regular program "Romale" (broadcast at 1 o'clock at night!)
was introduced on TV and another one on the radio. Romani is not used
much here either.
In February 1992, 34 Romani organizations were registered by the
Ministry of Internal Affairs. All of them, except one, declared in their program
their wish to foster Romani. Nevertheless, Czech or Slovak is used in
public meetings.
Extra-linguistic limitations on the use of Romani correlate with
intra-linguistic ones. Romani lacks expressions for many objects, concepts and
institutions which until recently were totally irrelevant to the Roma.
Whenever themes lying outside the range of everyday communication are
discussed, bilingual speakers shift into a non-Romani language.
3.2.2. Steps taken by Romani organizations to foster Romani

The Association of Romani writers (founded in November 1990)
organized two weekend seminars (September 1991, September 1992) for its
members to familiarize themselves with the main principles of the SCR
orthography. A similar seminar was organized by the weekly Romano l'il nevo
(Prešov, Slovakia). Editors of the weekly Romano kurko (Brno) were
acquainted with the SCR spelling in a two-day seminar.
Romani organizations tend to concentrate on political activities
responding to social problems and racial attacks. Fostering culture is
manifested in showy activities, like beauty queen contests, rather than in
painstaking, unobtrusive, systematic work which will bear fruit in the
future. It would be
foolish to expect that the Roma can gain experience in political and cultural
management overnight. If they could, they would be an exception among all
nations.
Credit for fostering written Romani belongs mostly to Romani
periodicals and book publications (see Appendix). Although in the beginning
Romani periodicals were publishing most items in Czech or Slovak, Romani
[...] Slovak Romani which prevails as it was in the period of the SCR.
Hungarian Romani dominates over other dialects in the monthly Roma
(Bratislava), the chief editor of which, Dezider Banga, is a Hungarian Rom.
He is an excellent poet, who to date has published seven collections of
poetry written in Slovak. Only lately has he started to translate his Slovak
poetry into Romani. Thanks to József Ravasz, a Vlach poet, some items in
the Vlach dialect are published in Roma.
Written Romani in periodicals is developing by trial and error rather
than by planned language management. Some authors have spontaneously
learned the SCR orthography, some translate the phonetic structure of their
regional variety into the graphic form as they 'feel' it. What is the reaction of
the readers? We do not know exactly, but letters appreciating Romani texts
and more and more contributions in Romani (especially poems) indicate
that written Romani - as uncodified as it may be - is gaining a solid basis for
further development.
4. Main principles of the SCR orthography

The SCR orthography is based on the Czech/ Slovak alphabet through
which Romani children are taught to read and write at school.
4.1. Comparison of Romani (above) and Czech (below) sets of graphemes
a  -  b  c  č  čh  d  ď  dz  dž  e  -    f  g  h  ch  i  -  j
a  á  b  c  č  -   d  ď  -   -   e  é/ě  f  g  h  ch  i  í  j

k  kh  l  l'  m  n  ň  o  -  p  ph  -  r  -  s  š  t  ť  th  u  -
k  -   l  -   m  n  ň  o  ó  p  -   q  r  ř  s  š  t  ť  -   u  ú/ů

v  -  -  -  z  ž
v  x  y  ý  z  ž
In Romani the diacritic for prolonging the vowel (čárka - acute accent: ´)
is not used since vocalic length is not distinctive as it is in Czech (plat -
"salary"; plát - "plate"). The acute accent is applied only to mark shortened
forms of the future or imperfect tense: kerava / kerá (I shall do), keravas /
kerás (I was doing).
Palatalisation of d, l, n, t is marked only by a čiriklo (ˇ) (term coined by
M. Courtiade), while in Czech it alternates with ě / i. y in Czech:
a) is a historical element; b) denotes non-palatalised d, n, t. In Romani it has
no function and therefore it was not included in the alphabet.
          palatalised         non-palatalised
Romani    ďi, l'i, ňi, ťi     di, li, ni, ti
Czech     di, --, ni, ti      dy, --, ny, ty
In Romani, as in Czech, morphophonetic spelling is used. Voiced and
aspirated consonants are pronounced as voiceless and non-aspirated at the
end of the word and in voiceless positions. However, the voiced and
aspirated consonants are written if they occur in the whole paradigm of the
word: dad (pron. dat) - dadestar, dadoro; rat - ratestar, rata;
jakh (pron. jak) - jakha, jakhori; pek! - pekel, pekav.
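The spelling principle just described can be illustrated with a minimal sketch. The function name and the consonant table are assumptions made for illustration only: the pairs d/t and kh/k are attested in the examples above, the remaining pairs are extrapolated by analogy, and neutralisation in "voiceless positions" inside a word is not modelled.

```python
# Hypothetical illustration of morphophonemic spelling: the written form keeps
# the underlying voiced or aspirated consonant, while the word-final
# pronunciation is voiceless and non-aspirated.

# Assumed mapping; only d -> t and kh -> k are attested in the text's examples.
FINAL_NEUTRALISATION = {
    "kh": "k", "ph": "p", "th": "t",  # aspirated -> non-aspirated
    "b": "p", "d": "t", "g": "k",     # voiced -> voiceless
}


def pronounce_final(written: str) -> str:
    """Return an approximate pronunciation with the final consonant neutralised."""
    # Try two-letter graphemes before single letters.
    for grapheme in sorted(FINAL_NEUTRALISATION, key=len, reverse=True):
        if written.endswith(grapheme):
            return written[: -len(grapheme)] + FINAL_NEUTRALISATION[grapheme]
    return written


for word in ("dad", "jakh", "rat"):
    print(word, "->", pronounce_final(word))
```

The sketch works in one direction only, from spelling to pronunciation. Going the other way requires knowledge of the paradigm (dadestar, jakha), which is exactly why the orthography writes the underlying consonant.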
x, which in many Romani alphabets stands for the German or Czech ch
(machen, Charakter), is omitted and ch is written instead (chaben, chip,
charo).
Where, why and how to level certain distinctive features of regional
varieties of Slovak Romani for the sake of its standardization has been
discussed elsewhere (Hübschmannová 1993). The subject is too extensive to
be dealt with in this context.
5. Calquing and structural changes due to interference of Czech

As far as interference is concerned I shall concentrate only on calques in
some grammatical categories, where the shift from an older form towards a
calque is most apparent. Older forms are those which are still preserved in
the idiolects of (older) Roma who live in traditional Romani communities
and communicate mostly in their native language. Older forms in contrast to
calques - or structural changes - prevail in the texts of tales which I
collected in the period 1956 - 1976.
Some general observations: Interference in grammatical categories as
well as on the semantic and stylistic levels is not consistent. It varies first of
all in accordance with the author's language proficiency. But even the same
author in the same article often uses both the older form and the calque.
Older forms are better preserved in complex constructions like idioms,
metaphors, especially if these occur in language rituals accompanying
traditional cultural situations (for examples, see 5.3.).
Translations from Czech/ Slovak are most overloaded with calques.
Many texts are published bilingually. The translation is usually done by the
author himself. Only a few authors are able to shift sentence by sentence
from the structure of one language to the other. The translations into Czech
are as 'clumsy' as those from Czech into Romani.
Innumerable calques are to be found in non-traditional genres (political
reports, feuilletons, commentaries etc.) also dealing with non-traditional
themes. Poems, tales and narrations in general preserve more genuine
language (older forms).
Recurring types of calques may indicate structural changes in Romani.
However, the new structure is not fixed. It coexists with the older
forms. It is more apparent in written than in spoken language.
The main types of interference (or structural changes) are: shift of
gender (5.1.), case agreement in adjectives (5.2.), loss of the modal
negation ma (5.3.), use of pes as an intransitive marker in all persons (5.4.).
5.1. Shift of Gender

e diz > o diz ("castle"; zámek, m.), o vodi > e vodi ("soul"; duše, f.). In
RLN 70, p. 6 both forms are used: sar bivodakeri, f. ("as without soul"),
but in a complex idiomatic construction the masculine is retained: cirdl'a
pharo vodi (lit. "he drew heavy soul", "he sighed").
5.2. Shift from oblique to case agreement (esp. in accusative constructions)

In my data on Romani tales I did not come across this calque at all. It is
very rare in traditional Romani communities. In the idiolect of Roma who
live in Czech towns, dispersed among gadže, case agreement does occur,
though less often than on the pages of the Romani press: zaachol ciknen
naroden ("he stands up for small nations"; RG 7, p. 7). In the very next
sentence both forms are used: sar sakone avres manušes ("/they should have
right/ like any other person").
5.3. Shift from ma (neg. imperative particle) to na (neg. indicative particle)

In Czech and Slovak only one particle is used: ne-. Ma is retained in
the constructions ma dara! ("don't be afraid!"), ma ladža! ("don't be
ashamed!"). Both are part of a culturally fixed expression by which the
guest is invited to eat: cha, ma dara, amen sam žuže Roma ("eat, don't be
afraid, we are clean Roma", i.e. not degeša, eating horse or dog meat). Cha,
ma ladža ("eat, do not be shy") is said when the guest politely refuses the
offered meal at least twice before he starts to eat. See alternative use in RLN
74, p. 5: ma maren, ma maren ("do not beat"; in a poem using this sentence
from a traditional song), but na chochav tut ("don't deceive yourself").
5.4. Shift from the personal to the reflexive pronoun in verbal constructions
In Czech the reflexive pronoun se, svůj is used where in older Romani
personal pronouns were applied. Alternative use of the older form and the
calque occurs in RLN 74, p. 5: bisterav man u dav andal peste avripeskere
vakeribena (lit. "I forget myself and I speak out my narrations"); RLN 70, p.
3: chude pes andre lačhi buti (lit. "get yourself into a good job"). But on the
same page: arakhl'om man andro foros ("I found myself in the town").
There is not sufficient space here to list all of the grammatical categories
in which calquing occurs. I have mentioned only the most conspicuous
ones.
Many Roma do not recognize calques as a foreign element. On the other
hand loan words are more widely felt as improper in the žuži čhib ("pure
language"). Žuži čhib has traditionally been praised in contrast to the
pagherdi čhib ("corrupted language"). Žuži implies phurikaňi ("ancient",
"traditional"), čačikaňi ("real", "genuine" Romani). In a traditional popular
game sar pes phenel? ("how do you say?"), knowledge of old Romani
words is tested. In the periodicals, especially in Romano L'il Nevo, a
tendency to replace borrowings by žuže romane lava is very obvious. Most
of the žuže lava are neologisms.
6. Neologisms - coinage of new expressions

'Purification' of Romani through new word formation is also happening
by trial and error. Some neologisms are formed with sophisticated linguistic
feeling by means of the Romani nominalisation system; some merely seem
funny and clumsy. Structurally, many newly coined expressions are just
calques of Czech/Slovak words. Sometimes international words or fully
domesticated borrowings are replaced by opaque hybrids. However, the same
happens in all newly born literary languages. Out of twelve expressions by
which the Czech revivalists, at the beginning of the last century, tried to
replace German durchsichtig ("transparent"), only one (a calque from German
which in turn is a calque from Latin) remained: průhledný.
I cannot deal with this interesting subject in detail here, but I give at
least some examples of neologisms which can be found in Romani
periodicals.
6.1. Words taken over from other Romani dialects

rukh (from Sinti) / stromos (strom - "tree"), luma (Vlach Romani) /
svetos (svět - "world"), luludi (Vlach Romani) / kvitkos (kvítek - "flower"),
lilkerel, kerel, l'ila (?) / pisinel (psát - "write") etc.
Note: luma is a borrowing from Romanian, luludi from Greek, but
Slovak Roma are not aware of this; not being Czech or Slovak, these words
sound native to them.
6.2. Words taken from Hindi

čitro / obrazis (obraz - "painting"), lekhado / spisovatel'is (spisovatel -
"writer"), kastputl'i / babika (babika - "puppet").
Note: čitro became very common. In conversations, I heard it even in
the form čitrak, which proves that it is becoming 'domicilised'. Similarly,
kastputl'i, which is difficult to pronounce, has changed to kastupl'i in the
speech and writing of Bartoloměj Daniel (a historian).
6.3. Expressions formed by various means of the Romani nominalisation system
6.3.1. Composition
averthema / cudina (cizina - "foreign countries"): aver - "other", them -
"country"; avrirodipen / viskumos (výzkum - "research"): avri - "out", rodel
- "seek"; avrithoďi / vistava (výstava - "exhibition"): avri - "out", "off",
thovel - "put"; angledžando / prorokos (prorok - "prophet"): angle - "in
front of", "in advance", "before", džanel - "know"; barobar / pomikos
(pomník - "monument"): baro - "big", bar - "stone".
6.3.2. Past participle in nominalisation function

sikhado / vdcos (vědec - "scientist"): sikhavel - "teach"; sikhl'ardo /
ucitel'is (učitel - "teacher"): sikhl'arel - "teach" (W. Slov. Romani); gil'ado,
gil'ardo / spevakos (zpěvák - "singer"): gil'avel - "sing"; genďi / kika
(knížka - "book"): genel - "read"; zailo / zajatcos (zajatec - "prisoner of
war"): (za)lel, past t. (za)il'om - "capture", etc.
6.3.3.1. Derivation by -iben, -ipen, -ben, -pen

šeripen, šeral'ipen / vedenis (vedení - "leadership"): šero - "head",
šeralo - "with big head", "leader"; rajipen, rajariben / vlada (vláda -
"government"): raj - "gentleman", rajarel - "to behave like a boss", "to
govern"; socialno izdrapen / socialno otrasos, sokos (sociální otřes, šok -
"social shock"): izdral - "shiver", "shake"; prastaben, prastapen / ?
(vyhoštění - "banishment"): prastal - "run" (note: this may be a semantic
extension of the archaic term prastapen - "excommunication from the
Romani community").
6.3.3.2. Derivation by -uno, -utno

gil'uti / basicka (básnička - "poem"): gil'i - "song", gil'avel - "sing";
sikhl'uno, sikhl'utno / uceikos (učedník - "apprentice"): sikhl'ol - "learn";
genuino / tatel'is (Slov. ctatel - "reader"): genel - "read", etc.
7. Conclusion
Out of the four Romani dialects still spoken in the Czech and Slovak
Republics (Czech Romani is extinct), Slovak Romani is the most
elaborated, though in contrast to Vlachi Romani it is being forgotten by the
youngest generation. The first attempts at written Romani were published in
Romano L'il, the bulletin of the Union of Gypsies - Roma (1969-1973). The
linguistic commission of the Union developed an orthography for Slovak
Romani (SCR orthography). The assimilation policy of the communist
government not only had a negative effect on the knowledge and use of
Romani but also influenced attitudes of Roma towards their language.
Spontaneous decay of the language was caused by the disintegration of the
traditional Romani community and the increase of contacts with gadže and
gadžikane languages (through mass media). However, after the "velvet
revolution" (November 1989), when Roma gained the political status of a
"national minority" (národnost), and when their ethnic emancipation
endeavour again started to develop quite vigorously, especially at the
political level, some attention was also paid to Romani. Six periodicals
edited by various Romani organisations dedicate more and more space to
items written in Romani. Written Romani is developing without language
management, by trial and error. In Romani texts, especially in translations,
interference of Czech/Slovak is very apparent. However, calques are used
along with older language categories; structural changes are not fixed.
Neologisms are formed by different devices: on the basis of the
nominalisation system of Romani, by replacing Czech/Slovak loan words by
native (or seemingly native) terms from other Romani dialects, or by
implanting Hindi words into Romani.
Appendix 1
Periodicals published by various Roma organisations in the Slovak and Czech Republics
(name; periodicity of issue; name of editing organisation; place of publication; date of first
publication; main language used; other languages used; approximate estimate of amount of
Romani texts; Romani dialects used: Slovak Romani = R Sl, Hungarian Romani = R Hung,
Vlach Romani = R Vx, other dialects specified; chief editor)
ROMA: monthly; Romani kultura; Bratislava; Nov. 1990; Slovak; Hungarian (Czech);
10-20 % R; R Hung, R Vx, R Sl; Dezider Banga M.A. (Hung. Rom. poet, former teacher).
ROMIPEN: irregular; Združenie inteligencie Romov na Slovensku; Bratislava; 1991;
Slovak, Hungarian; in first number no Romani texts at all, now about 15 %; R Hung, R Sl;
Imriku Polk.
ROMANO LIL NEVO: weekly; Association JEKHETANE; Prešov; Sept. 1990 (with
intervals); Slovak (Hungarian, Czech); 30-40 % R; R Sl; Anna Koptová (professional Rom
journalist, ex MP), Dana Silanová (Slovak poetess).
AMARO LAV: monthly; Rompress; Brno; Jan. 1991 (under name LACHO LAV since
Jan. 1990); Czech (Slovak); 20-25 % R; R Sl; Jan Horváth (Slovak Rom, poet, former
worker).
ROMANO KURKO: weekly; Rompress; Brno; Sept. 1991; Czech (Slovak); in the
beginning no R, now approx. 20 %; R Sl (other dialects: Kalderaš, Russian R); Jan Horváth
(cf. AL).
ROMANO GENDALOS: monthly (irregular); Romaňi čhib - Association of Roma
writers and painters; Praha; Dec. 1991; Czech (Slovak); 50 % R; Margita Reiznerová
(Slovak Romni poetess, former worker).
Appendix 2
List of Romani book publications
Romské písně - Romane gil'a, gendi romana poeziatar (Romani songs, a book of Romani
poetry), The Cultural House Praha 8 1979, 57 p.; bilingual, R Sl, SCR orth., 200
copies.
Dobré slovo je jako chleba - Lacho lav sar maro (A good word is like bread; collection of
Romani proverbs), The Cultural House of the Capital Praha 1984, 110 p.; bilingual, R
Sl, SCR orth., 200 copies.
Čeho je na světě nejvíc? - So hin pro svetos jekhbuter? (What is most numerous in the
world?; collection of Romani riddles and anecdotes), The Cultural House of the
Capital Praha 1987, 110 p.; bilingual, R Sl, SCR orth., 500 copies.
Kale ruži (Black roses; a selection of Romani poems and short stories by 11 authors), The
Regional Cultural House in Hradec Králové 1990, 119 p.; bilingual, R Sl, SCR orth.,
2000 copies.
Demeter, Gejza; Mule maskar amende (The spirits of the dead among us), Romani chib,
Praha 1992, 16 p.; R Sl, SCR orth., 1000 copies.
Reiznerová, Margita; Kal'i (Kal'i), Romani chib, Praha 1992, 20 p.; R Sl, SCR orth., 1000
copies.
Rusenko, Arnošt; Trin pheňa (Three sisters), Romani chib, Praha 1992, 13 p.; R Sl, SCR
orth., 1000 copies.
Ferková, Ilona; Mosarda peske o dživipen anglo love (She spoilt her life for money),
Romani chib, Praha 1992, 18 p.; bilingual, R Sl, SCR orth., 1000 copies.
ervek, Stefan; Romani chajori the beng (A Romani girl and the devil), Romani chib,
Praha 1992, 16 p.; R Sl, SCR orth., 1000 copies.
Goďaver lava phure Romendar - Moudrá slova starých Romů (Wise words of old Roma;
Romani proverbs), Apeiron, Praha 1992, 45 p.; bilingual, R Sl, SCR orth., 2000 copies.
Giňa, Andrej; Bijav - Svatba (The Wedding), Apeiron and The Cultural Union of Romani
Citizens, Praha 1992, 64 p.; bilingual, R Sl, SCR orth., 2000 copies.
Fabiánová, Tera; Čavargoš - Tulák (A vagrant), Apeiron, Praha 1992, 68 p.; bilingual,
R Sl, SCR orth., 2000 copies.
Lacková, Elena; Rómske rozprávky - Romane paromisa (Romani tales), Východoslovenské
vydavateľstvo, Košice 1992, 90 p.; bilingual, R Sl, the author's own spelling.
Fabiánová, Tera; Sar me phiravas andre škola - Jak jsem chodila do školy (How I used to
go to school), ÚDO České Budějovice and The Association of Roma in Moravia, Brno
1992, 23 p.; bilingual, R Sl, SCR orth.
Ravasz, József; Jileskero kheroro - Domček v srdci - Szívházikó (The house in the heart;
short tales), Romani kultura, Bratislava 1992, 44 p.; trilingual, R Vlax, Slovak,
Hungarian, author's own spelling.
Bůh mluví ke svým dětem - O Del vakerel ke peskere chave (God speaks to his children;
stories from the Bible), Česká biblická společnost and Romani daj, Praha 1992, 183 p.;
bilingual: stories from the Old Testament translated by Vlado Oláh into Prešov-Slovak
Romani, stories from the New Testament translated by Gejza Demeter into
Humenné-Slovak Romani, SCR orth.
CONTRIBUTORS
Peter Bakker
Dept of General Linguistics
University of Amsterdam
Spuistraat 210
1012 VT Amsterdam
The Netherlands

Norbert Boretzky
Sprachwissenschaftliches Institut
Ruhr-Universität Bochum
Universitätsstraße 150
44780 Bochum
Germany

Vit Bubenik
Dept of Linguistics
Memorial University of Newfoundland
St John's, Newfoundland A1B 3X9
Canada

Victor A. Friedman
Dept of Slavic Languages
University of Chicago
1130 E. 59th Street
Chicago, Ill. 60637
U.S.A.

Anthony P. Grant
Dept of Modern Languages
University of Bradford
BD7 1DP
England

Ian Hancock
Dept of Linguistics
University of Texas
Austin, Texas 78723
U.S.A.

Milena Hübschmannová
Indologický ústav, Filosofická fakulta
Universita Karlova
Celetná 20
11000 Praha 1
Czech Republic

Corinna Leschber
Kilstetter Str. 26
14167 Berlin
Germany

Yaron Matras
Germanisches Seminar
Universität Hamburg
Von-Melle-Park 6
20146 Hamburg
Germany
