(Empirical Approaches To Language Typology 38) Yaron Matras, Jeanette Sakel-Grammatical Borrowing in Cross-Linguistic Perspective-Mouton de Gruyter (2007)

Grammatical Borrowing in Cross-Linguistic Perspective
Empirical Approaches
to Language Typology
38
Editors
Georg Bossong
Bernard Comrie
Yaron Matras
Mouton de Gruyter
Berlin · New York
Grammatical Borrowing
in Cross-Linguistic Perspective
Edited by
Yaron Matras
Jeanette Sakel
Mouton de Gruyter
Berlin · New York
Mouton de Gruyter (formerly Mouton, The Hague)
is a Division of Walter de Gruyter GmbH & Co. KG, Berlin.
Printed on acid-free paper which falls within the guidelines of the
ANSI to ensure permanence and durability.
Library of Congress Cataloging-in-Publication Data
Grammatical borrowing in cross-linguistic perspective / edited by Yaron
Matras, Jeanette Sakel.
p. cm. – (Empirical approaches to language typology ; 38)
Includes bibliographical references and index.
ISBN 978-3-11-019628-3 (cloth : alk. paper)
1. Language and languages – Foreign elements. 2. Grammar, Comparative
and general. I. Matras, Yaron, 1963– II. Sakel, Jeanette, 1973–
P324.G73 2007
410–dc22
2007042917
Bibliographic information published by the Deutsche Nationalbibliothek

The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie;
detailed bibliographic data is available in the Internet at http://dnb.d-nb.de.
ISBN 978-3-11-019628-3
ISSN 0933-761X
© Copyright 2007 by Walter de Gruyter GmbH & Co. KG, D-10785 Berlin.
All rights reserved, including those of translation into foreign languages. No part of this book
may be reproduced or transmitted in any form or by any means, electronic or mechanical,
including photocopy, recording or any information storage and retrieval system, without per-
mission in writing from the publisher.
Printed in Germany.
Contents
List of contributors
Introduction
  Yaron Matras and Jeanette Sakel
Types of loan: Matter and pattern
  Jeanette Sakel
The borrowability of structural categories
  Yaron Matras
Grammatical borrowing in Tasawaq
  Maarten Kossmann
Grammatical borrowing in K’abeena
  Joachim Crass
Grammatical borrowing in Likpe (Sɛkpɛlé)
  Felix K. Ameka
Grammatical borrowing in Katanga Swahili
  Vincent A. de Rooij
Grammatical borrowing in Khuzistani Arabic
  Yaron Matras and Maryam Shabibi
Grammatical borrowing in Domari
  Yaron Matras
Grammatical borrowing in Kurdish (Northern Group)
  Geoffrey Haig
Arabic grammatical borrowing in Western Neo-Aramaic
  Werner Arnold
Grammatical borrowing in North-eastern Neo-Aramaic
  Geoffrey Khan
Grammatical borrowing in Macedonian Turkish
  Yaron Matras and Şirin Tufan
Grammatical borrowing in Kildin Saami
  Michael Rießler
Grammatical borrowing in Yiddish
  Gertrud Reershemius
Grammatical borrowing in Hungarian Rumungro
  Viktor Elšík
Grammatical borrowing in Manange
  Kristine A. Hildebrandt
Grammatical borrowing in Indonesian
  Uri Tadmor
Grammatical borrowing in Biak
  Wilco van den Heuvel
Sino-Vietnamese grammatical borrowing: An overview
  Mark J. Alves
Recent grammatical borrowing into an Australian Aboriginal language: The case of Jaminjung and Kriol
  Eva Schultze-Berndt
Grammatical borrowing in Rapanui
  Steven Roger Fischer
Grammatical borrowing in Nahuatl
  Una Canger and Anne Jensen
Grammatical borrowing in Yaqui
  Zarina Estrada Fernández and Lilián Guerrero
The case of Otomi: A contribution to grammatical borrowing in cross-linguistic perspective
  Ewald Hekking and Dik Bakker
Grammatical borrowing in Purepecha
  Claudine Chamoreau
Grammatical borrowing in Imbabura Quichua (Ecuador)
  Jorge Gómez-Rendón
Grammatical borrowing in Paraguayan Guaraní
Grammatical borrowing in Hup
  Patience Epps
Mosetén borrowing from Spanish
  Jeanette Sakel
Index of subjects
Index of authors
List of contributors
Mark Alves
  Montgomery College
Felix Ameka
  Leiden University
Werner Arnold
  University of Heidelberg
Dik Bakker
  University of Lancaster
Una Canger
  University of Copenhagen
Claudine Chamoreau
  CELIA (CNRS-IRD-INALCO-PARIS VII)/CIESAS-Mexico
Joachim Crass
  University of Mainz
Viktor Elšík
  Charles University, Prague
Patience Epps
  University of Texas at Austin
Zarina Estrada Fernández
  Universidad de Sonora
Steven Roger Fischer
  Auckland, NZ
Jorge Gómez-Rendón
  University of Amsterdam
Lilián Guerrero
  Universidad de Sonora
Geoffrey Haig
  University of Kiel
Ewald Hekking
  Universidad de Querétaro
Wilco van den Heuvel
  University of Manchester
Kristine Hildebrandt
  University of Manchester
Anne Jensen
  University of Copenhagen
Geoffrey Khan
  University of Cambridge
Maarten Kossmann
  Leiden University
Yaron Matras
  University of Manchester
Gertrud Reershemius
  Aston University, Birmingham
Michael Rießler
  Humboldt-Universität zu Berlin
Vincent de Rooij
  Amsterdam School for Social Science Research, University of Amsterdam
Jeanette Sakel
  University of the West of England
Eva Schultze-Berndt
  University of Manchester
Maryam Shabibi
  University of Manchester
Uri Tadmor
  MPI for Evolutionary Anthropology/Jakarta field station
Şirin Tufan
  University of Manchester
Introduction
Yaron Matras and Jeanette Sakel
1. Borrowing in cross-linguistic perspective¹
Like any metaphor, the term “borrowing” has its drawbacks. We have decided
to ignore possible reservations about the term, both in the title of this collec-
tion and in the advice on the use of terminology which we have given to the
contributors. Whether the “borrowed” substance is perceived as belonging
or as alien, whether its source is described as a “donor” and the language
into which it is integrated as the “recipient”, “copier”, or “replica”, seems
immaterial as long as clarity prevails as to the kind of phenomena that we
are addressing when talking about contact-induced change. We use the term
“borrowing” as a cover-term for the adoption of a structural feature into a
language as a result of some level of bilingualism in the history of the rele-
vant speech community.
This collection is about the structural effects of language contact. We have
asked each contributor (or pair of contributors) to focus on the diachronic im-
pact that language contact has had on the structure of a particular language.
Accompanying these descriptions are comments on societal multilingualism,
the roles that are assigned to various languages in the community, patterns
of language mixing, and issues of language policy and language education,
which are dealt with in relation to each case study in the introductory sections
of each chapter. The purpose of the compilation is to be able to compare the
effects of different kinds of contact on different kinds of languages, and so to
advance our understanding of universal effects of language contact.
2. Sampling in contact linguistics
Linguistic typology tries to make generalizations about human languages.
For this purpose, typologists rely on sampling methods. Language samples
make it possible to make generalizations without studying each and every
individual language, which would be a costly and time-consuming endeavour.
Since Greenberg (1966) it has been accepted that samples should try to
reflect at least the present-day diversity of languages in order to be truly
representative of human language. Most researchers have therefore made
an effort to avoid areal or genetic biases when compiling a sample, though
basic typological parameters and extralinguistic factors have played less of
a role (cf. Comrie 1981, Stassen 1985, Dryer 1989, Rijkhoff et al. 1993).
At times samples have been used by a group of studies in a coordinated
fashion, to study the distribution of several different phenomena across the
same set of languages (this was partly the case in the EUROTYP project
and in the World Atlas of Language Structures; see Haspelmath, Dryer, Gil
and Comrie 2005). Nevertheless, it is fair to say that on the whole linguistic
samples have been used in order to study a particular structural phenomenon
or structural category.
The cross-linguistic study of structural borrowing is a challenge at a different
level. First, borrowing can affect many different categories. A comprehensive,
comparative study of borrowing must therefore take into account
both the “horizontal” diversity of sample languages, and the “vertical” diver-
sity of structural categories on which contact can have an impact. This already
makes the sampling of borrowing phenomena a much more complex task
than the comparative study of any particular domain of linguistic structure.
Second, borrowing is a historical dimension, which can only be identified and
assessed if diachronic information on the relevant language(s) is available.
This factor seriously disadvantages the consideration of entire areas of the
world from which we lack secure and reliable information on linguistic dia-
chrony, and so in effect it counteracts the need to maintain areal diversity in a
representative sample. Finally, there is general agreement that the outcomes
of language contact (or, to be more precise: of widespread bilingualism in a
community) depend not just on structural factors, but to a great extent on ex-
tralinguistic factors. The duration and intensity of cultural contact, the roles
and status of the participating languages, the degree of institutional support
awarded to languages in various stages of their history (e.g. the presence of
literacy or use in the public, acrolectal domain), and speakers’ attitudes to-
ward their own and their neighbours’ forms of speech – all these play a vital
role in determining the direction of change and so in shaping the structural
outcome of language contact constellations. In order to investigate the univer-
sal possibilities of contact-induced change, one needs to take sociolinguistic
factors into account. The ideal sample for the investigation of contact is there-
fore one that is, like other samples, stratified to take into account various
language-genetic groupings, structural types, and regions of the world; but in
addition it must also be informed about diachronic depth and allow the author
or user to control factors that are external to language.
Given these difficulties of sampling it is not surprising that most attempts
to make generalizations about contact-induced change have been based on
casual observations, rather than on systematic comparative studies. This is
true of Moravcsik’s (1978) discussion of borrowing universals even within the
context of the Greenbergian project, as well as of Thomason and Kaufman’s
(1988) frequently-cited borrowing scale. Some generalizations about borrowing
have been proposed with reference to a case study of a single
contact situation (cf. Haugen 1950, van Hout and Muysken 1994, Ross 2001,
Field 2002), while some have concentrated on identifying counter-examples
to generalizations proposed by others (cf. Campbell 1993; cf. also Thomason
2001). To the extent that samples have been used in contact linguistics, they
have tended to control one of the key factors in the contact situation, such
as the donor or the recipient language, or even the type of category affected.
Stolz and Stolz (1996, 1997), for example, discuss the borrowing of Spanish
function words into a diverse set of languages in Central America and the
Pacific. Johanson (2002) discusses the contact behaviour of Turkic languages,
and Matras (2002) and Elšík and Matras (2006) evaluate structural borrowing
from a diverse set of European contact languages in the dialects of Romani.
It is noteworthy that in these samples the extra-linguistic parameters are also
kept constant. Thus, Spanish is the colonial language in the Pacific and in
Central America, Romani is an oral language of dispersed, socially marginalized,
bilingual ethnic minorities, and Persian has played a similar role in the
history of various Turkic languages.
A wealth of data for comparison from various contact situations can be
found in a number of collections devoted to case studies of language contact
(e.g. Gilbers, Nerbonne, and Schaeken 2000, Aikhenvald and Dixon 2001,
Matras, McMahon and Vincent 2006). Aikhenvald and Dixon (2006) especially
contains contributions that cover a wide range of languages, regions,
and contact phenomena. These are accompanied by important summary ob-
servations on general factors and constraints that operate in language con-
tact situations (cf. Aikhenvald 2006). Put together, these and other excellent
contributions to the study of language contact have taken us a significant step
forward toward a typology of contact-induced language change.
Still missing, however, from the body of work produced in recent years
is an attempt at a systematic comparison of the behaviour of grammatical
categories across a sample of languages in contact. Of interest is the ques-
tion whether some grammatical categories are universally more susceptible
to contact-induced change than others. A further question is whether there
is any recurring correlation between the borrowing of structures in one cat-
egory, and those belonging to another. Both these issues can be expressed in
terms of hierarchies of borrowing. These in turn may contain either implica-
tional statements (if X is borrowed, then Y is also borrowed), or just plain
frequency statements (X is borrowed more frequently in the sample than Y,
and hence it can be said to be more prone to borrowing than Y). Equally of
interest is the correlation between a category and the type of contact-related
change that is more likely to affect it: a shift in meaning or in the distribu-
tion of existing structures (which we term “pattern replication” below), or
the actual adoption of a structure from another language for circulation in
the recipient system (“matter replication”). Finally, we are interested in the
interaction between the contact behaviour of a category, and other factors that
condition the nature of the contact situation, including both language-internal
features (such as the typological parameters of the languages involved) and
extra-linguistic features (such as the type of bilingualism and the roles played
by the respective languages in various domains of communicative interac-
tion). The purpose of this collection is to facilitate a discussion of questions
of this kind, and to provide information on the basis of which these questions
can be addressed.
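The two kinds of hierarchy statement mentioned above can be made concrete with a minimal sketch. The Python fragment below is an illustration only: the language labels, category names, and borrowing sets are invented placeholders rather than findings from the sample, and the function names are hypothetical. It shows how an implicational statement (if X is borrowed, then Y is also borrowed) and a frequency statement (X is borrowed more often than Y) could each be checked against a table of languages and borrowed categories.

```python
from collections import Counter

# Hypothetical toy data: language -> set of categories showing borrowing.
# These values are invented for illustration and are NOT results from the sample.
SAMPLE = {
    "Lang A": {"connectors", "focus particles", "numerals"},
    "Lang B": {"connectors", "focus particles"},
    "Lang C": {"connectors"},
    "Lang D": set(),
}


def implication_holds(sample, x, y):
    """Implicational statement: every language that borrows X also borrows Y."""
    return all(y in cats for cats in sample.values() if x in cats)


def counterexamples(sample, x, y):
    """Languages that borrow X without borrowing Y (violations of X -> Y)."""
    return sorted(lang for lang, cats in sample.items()
                  if x in cats and y not in cats)


def frequency_ranking(sample):
    """Frequency statement: categories ordered by number of borrowing languages."""
    counts = Counter(cat for cats in sample.values() for cat in cats)
    # Sort by descending count, then alphabetically for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    print(implication_holds(SAMPLE, "numerals", "connectors"))
    print(counterexamples(SAMPLE, "connectors", "numerals"))
    print(frequency_ranking(SAMPLE))
```

An implicational statement is falsified by a single counterexample, whereas a frequency statement only ranks categories within the given sample; the sketch keeps the two checks separate for that reason.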
3. The data compilation tool
A major difficulty in sampling for the purposes of contact linguistic studies
is the accessibility of relevant information. Unlike synchronic structural facts
about language, the description of borrowing requires diachronic information.
Even if such information is available in principle for some languages, it
is not always the case that it is included in grammatical descriptions. Grammar
books do not generally tend to highlight borrowings at all. Identifying
borrowed structures requires a high degree of expertise and specialization in
the language and its history, and familiarity, at the very least, with the
languages with which it has been in contact. Relevant extralinguistic information
is often missing from grammatical descriptions, too. For these reasons,
it is hardly feasible for a lone researcher to survey published descriptions
of various languages in order to compile a representative sample corpus of
grammatical borrowing. Sampling in this field is best achieved through team
effort, with experts contributing first-hand information on contact-induced
phenomena.
Underlying the team effort on which this volume is based is a uniform
questionnaire, formatted as a user-friendly database (in FilemakerPro 6–7).
The aim of the questionnaire was to obtain a representative and comparable
sample of data on contact-induced change in a variety of languages. For this
purpose it was distributed to the contributors as a detailed reference grid. The
questionnaire can of course continue to constitute a description standard for
language contact phenomena, serving as a checklist for information to be
covered in an exhaustive description of borrowing into any given language.
The questionnaire opens with information on relevant metadata (source
of information, affiliation to sub-samples, date of input, and so on), and
continues to cover extralinguistic information about the language and the speech
community. The remaining chapters cover all principal domains of structure:
Phonology, Typology (a characterization of principal typological traits),
Nominal structures, Verbal structures, Other parts of speech (e.g. quantifiers,
indefinites, phasal adverbs, discourse markers and connectors), Constituent
order, Syntax (clause combining), and Lexicon (general information on the
presence of lexical loans in various semantic domains, as well as specific
questions on expressions of time and space). Using the “Layout” function
in FilemakerPro, each chapter is displayed on a separate page, accessible
through a menu box (Figure 1).
Figure 1. Information page of the Language Convergence database (entry: Domari)
showing open chapter menu in the top left corner
Figure 2. Encoding the sociolinguistic situation (Mosetén)
Individual records, each representing a language in contact, can be tagged
for different kinds of contact constellations or sub-samples. We can distin-
guish, for instance, borrowing situations where just two languages are in con-
tact, from observations on a more widespread regional distribution of shared
phenomena, or “linguistic areas”. Special attention is given to the coding
of a series of extra-linguistic indicators (Figure 2), allowing the user to as-
sess the correlations between sociolinguistic factors and the contact behav-
iour of a language. Depth of contact is taken into account by distinguishing,
where applicable, several different layers of contact (see also Matras 1998):
The Current contact language (i.e. the object of widespread bilingualism),
a Recent contact language (that may still be spoken by an older generation
of speakers), and an Old contact language that has made an impact on the
language in the past, but has little or no contemporary role in the speech
community (Figure 3).
Figure 3. Coding of contact languages (Manange)
In order to be able to investigate the precise effects of contact on structural
compositions, a distinction is maintained throughout the questionnaire be-
tween the replication of linguistic matter (MAT) consisting of actual phono-
logical segments, and the replication of patterns (PAT), which pertains to the
semantic and grammatical meaning and the distribution of a construction or
structure (see Matras and Sakel 2007). This distinction is encoded alongside
every relevant description of a contact phenomenon in an individual category
(Figure 4).
The advantages of working with the questionnaire database are obvious:
while the checklist ensures uniform and comprehensive coverage of the same
phenomena, and so comparability throughout the sample, the database allows
users to filter and query the results, to retrieve examples of the various kinds
of contact phenomena, and to view correlations among the data sets (see e.g.
Figure 5).
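The coding scheme described in this section (one record per language, up to three layers of contact language, and a MAT/PAT tag on each described phenomenon) can be pictured with a minimal sketch. The Python fragment below is not the schema of the FilemakerPro database: all class, field, and function names are hypothetical, and the records are invented placeholders. Only the category codes (O_PA_co for connectors, O_nu for numerals and quantifiers, O_P_pp for personal pronouns) are taken from the caption of Figure 5.

```python
from dataclasses import dataclass, field
from enum import Flag, auto
from typing import Dict, List, Optional


class Replication(Flag):
    """Type of replication coded for a contact phenomenon."""
    MAT = auto()  # replication of linguistic matter (actual phonological segments)
    PAT = auto()  # replication of patterns (meaning and distribution)


@dataclass
class LanguageRecord:
    """One record = one language in contact (field names are hypothetical)."""
    name: str
    current_contact: Optional[str] = None  # object of widespread bilingualism
    recent_contact: Optional[str] = None   # may still be spoken by older speakers
    old_contact: Optional[str] = None      # past impact, little or no current role
    # category code -> coded replication type(s); absent key = nothing recorded
    phenomena: Dict[str, Replication] = field(default_factory=dict)


def languages_with(records: List[LanguageRecord], category: str,
                   kind: Replication) -> List[str]:
    """Names of records whose coding for `category` includes `kind`."""
    return [r.name for r in records
            if kind in r.phenomena.get(category, Replication(0))]


# Invented placeholder records; NOT data from the volume.
RECORDS = [
    LanguageRecord("Lang A", current_contact="Lang X",
                   phenomena={"O_PA_co": Replication.MAT,
                              "O_nu": Replication.MAT | Replication.PAT}),
    LanguageRecord("Lang B", current_contact="Lang Y", old_contact="Lang Z",
                   phenomena={"O_PA_co": Replication.PAT}),
]

if __name__ == "__main__":
    print(languages_with(RECORDS, "O_PA_co", Replication.MAT))
    print(languages_with(RECORDS, "O_nu", Replication.PAT))
```

Using a flag type lets a single phenomenon be tagged as MAT, PAT, or both, which matches the description of the two replication types as being encoded alongside each phenomenon rather than as mutually exclusive choices.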
Figure 4. Encoding MAT (matter) and PAT (pattern) replications (Domari)
A sample comparison – piloting just two languages, Kelderash Romani
and Mosetén – was already presented, based on the database questionnaire,
in Sakel and Matras (2007). The present collection features a case-by-case
assessment of borrowing. A preliminary assessment of some salient, common
patterns is provided in the two evaluation chapters, by Matras and by Sakel.
Our intention at this point is to continue to expand the data sample, and
eventually to make the data accessible to users online.
4. Coverage of phenomena and languages
In assessing the diachronic impact of contact, many of the contributors faced
the dilemma of how to distinguish the ongoing effects of current bilingualism
within a speech community from structural changes that may be regarded as
permanent. The distinction is often referred to as one between “borrowing”
and “code-switching”, but since there is widespread agreement that alternational
code-switching is unlikely to fossilize into a permanent pattern (see
Backus 2003, 2005), it is in fact the distinction between insertional switches
and borrowing that requires clarification. Our recommendation to the contributors
was generally to regard as borrowings those cases where a speaker’s
selection of a particular structure is not facilitated by a choice between two
alternatives, and is not triggered by the speech situation or by the topic of
conversation, nor by a need for a special conversational effect. We thus consider
as borrowings all structures that appear with a certain regularity and are
used by different speakers in a range of different situations. Problem cases
are otherwise highlighted in the individual chapters.
Figure 5. Comparing results on the presence/absence of contact phenomena for the
following categories: Connectors (O_PA_co), Focus particles (O_PA_fo), Numerals
and quantifiers (O_nu), and Personal pronouns (O_P_pp)
Even when drawing on first-hand data and the expertise of field linguists,
gaps in the coverage are inevitable; in our case this is due mainly to limita-
tions of space, resources, and the constraints of the production schedule. The
sample nevertheless grants representation to most areas of the world: Saharan
and sub-Saharan Africa (Tasawaq, K’abeena, Likpe, Katanga Swahili), the
Middle East (Khuzistani Arabic, Domari, Kurmanji-Kurdish, Western and
North-eastern Neo-Aramaic), the Balkans (Macedonian Turkish), Europe
(Kildin Saami, Yiddish, Hungarian Rumungro), the Himalaya (Manange),
Southeast Asia (Indonesian, Biak), East Asia (Vietnamese), Australia (Jaminjung),
the Pacific (Rapanui), Central America (Nahuatl, Yaqui, Otomi, Purepecha),
and South America (Imbabura Quichua, Guaraní, Hup, Mosetén). It
contains languages with a tradition of native literacy (Vietnamese, Indone-
sian, Arabic, Turkish, Swahili, and arguably also Neo-Aramaic) and others
without one; languages of ethnic minorities, those that are or were majority
languages, and regional languages in post-colonial settings; languages with
a single contemporary contact language as well as those spoken in either a
multilingual setting or a linguistic area.
Lacking representation in our sample are languages of North America,
Central Asia, and Siberia. Pidgins, creoles, and mixed languages were not
considered either, since their borrowing characteristics are potentially different
in principle from those of other languages that are not themselves the
product of recent contacts (but see Thomason 1997, Matras and Bakker 2003).
The present volume appears alongside a two-part publication devoted to lexical
borrowing, which is the result of the Loanword Typology project based
at the Max Planck Institute for Evolutionary Anthropology in Leipzig. We
hope that together, both publications will help shed new light on an ancient
process that will have been as instrumental in shaping the development of
the world’s languages as human contact has been in shaping general human
cultural experience.
Note
1. We gratefully acknowledge support from the Arts and Humanities Research
Council for a three-year research project on “Language Convergence and Linguistic
Areas” (award number RG/AN4725/APN16320), during which the questionnaire
and database tools used for this collection were developed, as well as
support from the School of Languages, Linguistics and Cultures for a workshop
on “Grammatical Borrowing in Cross-Linguistic Perspective” in Manchester in
September 2005 and for the production of the present volume. We also wish to
thank Georg Bossong for helpful comments, Peter Kahrel for typesetting and
copy-editing the manuscript, and Ursula Kleinhenz for her support during the
production process.
References
Aikhenvald, Alexandra Y., and R. M. W. Dixon (eds.)
2001 Areal Diffusion and Genetic Inheritance. Oxford: Oxford University
Press.
2006 Grammars in Contact: A Cross-Linguistic Typology. Oxford: Oxford
University Press.
Aikhenvald, Alexandra Y.
2006 Grammars in contact: A cross-linguistic perspective. In: Alexandra
Y. Aikhenvald and R. M. W. Dixon (eds.), Grammars in Contact:
A Cross-Linguistic Typology, 1–66. Oxford: Oxford University Press.
Backus, Ad
2003 Can a mixed language be conventionalized alternational codeswitching?
In: Y. Matras and P. Bakker (eds.), The Mixed Language Debate:
Theoretical and Empirical Advances, 237–270. Berlin: Mouton de Gruyter.
2005 Codeswitching and language change: One thing leads to another?
International Journal of Bilingualism 9 (3/4): 307–340.
Campbell, Lyle
1993 On proposed universals of grammatical borrowing. In: H. Aertsen
and R. Jeffers (eds.), Historical Linguistics 1989: Papers from the 9th
International Conference on Historical Linguistics, 91–109. Amsterdam:
John Benjamins.
Comrie, Bernard
1989 Language Universals and Linguistic Typology: Syntax and Morphology.
Oxford: Blackwell.
Dryer, Matthew S.
1989 Large linguistic areas and language sampling. Studies in Language 13:
257–292.
Elšík, Viktor, and Yaron Matras
2006 Markedness and Language Change: The Romani Sample. Berlin:
Mouton de Gruyter.
Field, Fredric
2002 Linguistic Borrowing in Bilingual Contexts. Amsterdam: Benjamins.
Gilbers, D. G., J. Nerbonne, and J. Schaeken (eds.)
2000 Languages in Contact. Amsterdam: Rodopi.
Greenberg, Joseph H. (ed.)
1966 Universals of Language. Cambridge, MA: MIT Press.
Haugen, Einar
1950 The analysis of linguistic borrowing. Language 26 (2): 210–231.
Haspelmath, Martin, Matthew Dryer, David Gil, and Bernard Comrie (eds.)
2005 The World Atlas of Language Structures. Oxford: Oxford University
Press.
Johanson, Lars
2002 Structural Factors in Turkic Language Contacts. London: Curzon.
Matras, Yaron
1998 Utterance modifiers and universals of grammatical borrowing.
Linguistics 36 (2): 281–331.
2002 Romani: A Linguistic Introduction. Cambridge: Cambridge University
Press.
Matras, Yaron, and Peter Bakker (eds.)
2003 The Mixed Language Debate. Berlin: Mouton de Gruyter.
Matras, Yaron, and Jeanette Sakel
2007 Investigating the mechanisms of pattern replication in language
convergence. Studies in Language 31 (4): 829–865.
Matras, Yaron, April McMahon, and Nigel Vincent (eds.)
2006 Linguistic Areas: Convergence in Historical and Typological
Perspective. Houndmills: Palgrave.
Moravcsik, Edith
1978 Language contact. In: Joseph H. Greenberg, Charles A. Ferguson, and
Edith A. Moravcsik (eds.), Universals of Human Language, Volume 1,
93–122. Stanford: Stanford University Press.
Rijkhoff, Jan, Dik Bakker, Kees Hengeveld, and Peter Kahrel
1993 A method of language sampling. Studies in Language 17: 169–203.
Ross, Malcolm
2001 Contact-induced change in Oceanic languages in north-west Melanesia.
In: Alexandra Aikhenvald and R. M. W. Dixon (eds.), Areal Diffusion
and Genetic Inheritance: Problems in Comparative Linguistics,
134–166. Oxford: Oxford University Press.
Sakel, Jeanette, and Yaron Matras
2007 Modelling contact-induced change in grammar. In: Thomas Stolz et
al. (eds.), Aspects of Language Contact: New Theoretical, Methodological
and Empirical Findings With Special Focus on Romanisation
Processes. Berlin: Mouton de Gruyter.
Stassen, Leon
1985 Comparison and Universal Grammar. Oxford: Blackwell.
Stolz, Christel, and Thomas Stolz
1996 Funktionswortentlehnung in Mesoamerika, Spanisch–Amerindischer
Sprachkontakt. In Sprachtypologie und Universalienforschung 49 (1):
86–123.
1997 Universelle Hispanismen? Von Manila über Lima bis Mexiko und
zurück: Muster bei der Entlehnung spanischer Funktionswörter in die
indigenen Sprachen Amerikas und Austronesiens. Orbis 39 (1) [1996–
1997]: 1–77.
Thomason, Sarah G. (ed.)
1997 Contact Languages: A Wider Perspective. Amsterdam: Benjamins.
Thomason, Sarah G.
2001 Language Contact: An Introduction. Edinburgh: Edinburgh University
Press.
Thomason, Sarah, and Terrence Kaufman
1988 Language Contact, Creolization and Genetic Linguistics. Berkeley:
University of California Press.
van Hout, Roeland, and Pieter Muysken
1994 Modelling lexical borrowability. Language Variation and Change 6:
39–62.
Weinreich, Uriel
1966 Languages in Contact: Findings and Problems. The Hague: Mouton
(first publ. 1953).
Types of loan: Matter and pattern
Jeanette Sakel
1. Introduction¹
A central concern of contact linguistics has long been to categorize the ways
in which elements are borrowed from one language into another. For this pur-
pose Matras and Sakel (2004) introduced the terms matter (MAT) and pattern
(PAT) in the questionnaire on which the sample of contact situations in this
book is based (cf. also subsequent publications on the issue, such as Matras
and Sakel 2007). In the present chapter I will re-visit the definition of MAT
and PAT, as well as address what this distinction could mean in phonology.
I will furthermore give an overview of the overall distribution of MAT/PAT
in the languages of the sample in order to address the validity of a MAT/PAT
distinction in the categorization of contact situations.
2. Definitions
MAT and PAT denote the two basic ways in which elements can be borrowed
from one language into another. We speak of MAT-borrowing when morpho-
logical material and its phonological shape from one language is replicated
in another language. PAT describes the case where only the patterns of the
other language are replicated, i.e. the organization, distribution and mapping
of grammatical or semantic meaning, while the form itself is not borrowed. In
many cases of MAT-borrowing, also the function of the borrowed element is
taken over, that is MAT and PAT are combined.2 In other instances, MAT and/
or PAT are borrowed, but deviate considerably in their form or function from
their original source. In some categories, making a distinction between MAT
and PAT does not make much sense. For example, word-order changes will
invariably be PAT. In other areas, such as phonology, the MAT/PAT distinc-
tion applies only in a restricted way, as MAT and PAT are primarily defined
as functioning above the morpheme level. The concept behind MAT and PAT
is well-grounded in the literature, but only rarely figures in the categoriza-
tion of contact situations. One exception is Heath (1984: 367), who bases
his approach to language contact on this opposition, distinguishing between
“direct transfer of forms from the other language” and “structural conver-
gence”. Other approaches to language contact mention similar distinctions,
often with very different terminology, such as Haugen’s (1950) “importation”
for outright borrowing and “substitution” for loanshifts or calques. Likewise,
Weinreich ([1953] 1966: 7) speaks about “transfer of elements” and “inter-
ference without outright transfer”, which re-appears in Weinreich’s distinc-
tion between “source” and “recipient language” for MAT-borrowings, and
“model” and “replica language” for PAT-borrowings. The distinction is men-
tioned in many subsequent approaches to language contact, such as Gołąb’s
(1956, 1959) “form” versus “substance”, Johanson’s (1992) “global copy-
ing” and “partial copying”, Nau’s (1995) “material borrowing” and “loan-
meaning, loan-translation” and Treffers-Daller and Mougeon’s (2005: 95)
“borrowing, code-switching” versus “transfer”. Many other studies of lan-
guage contact pay only minor attention to this matter, such as Thomason and
Kaufman (1988), who do not include this classification in their influential
borrowing scale (1988: 74).
Some approaches to contact focus almost exclusively on either MAT or
PAT. In this manner, approaches to substrate influence and contact-induced
grammaticalization (Siegel 1997; Keesing 1991; Heine and Kuteva 2005)
focus mainly on PAT-type loans. Likewise, linguistic areas have often been
described as zones in which PAT-borrowing appears. On the other hand, much
of the early literature on code-switching primarily addresses contact phe-
nomena of the MAT type, while more recently PAT has been integrated into
frameworks of code-switching (cf. Savić 1995, Bolonyai 1998 and Myers-Scotton
2002: 21–22). There are reasons why some approaches seem to favor
either MAT or PAT: when studying linguistic areas or substrate influence,
focus is often on PAT because a major part of the loans in these situations
are of this type and indeed areas and substrate influence are often defined as
displaying mainly pattern-loans. On the other hand, when studying situations
of code-switching, MAT is often very prominent.
3. Integration of MAT and PAT loans
Let us now look at how grammatical MAT/PAT-loans are integrated into the
recipient language. In many cases of MAT or PAT-borrowing, not the entire
function or form is taken over, but the borrowed elements differ from their
original source. Take for example the way in which Domari copies Arabic
aspect marking (Matras, this volume); Domari borrows Arabic auxiliaries as
MAT, but it does not simultaneously make use of a subjunctive verb form as in
Arabic because it has its own subjunctive. Hence, not the entire construction
of aspect marking is taken over, but only parts of it. In Otomi the functions
of a borrowed form are extended: the shortened form ko from Spanish como
is used in a number of constructions, also expressing ‘made of’, a function it
does not have in Spanish (Hekking and Bakker, this volume). Hence, when
MAT-elements are borrowed, their functions are not necessarily the same as
in the source language; sometimes only parts of the function are borrowed,
sometimes the functions are extended and the loans are rarely mere copies of
those of their counterparts in the source language. Also the forms of loans are
frequently adjusted, for example by phonological integration of MAT-loans
into the recipient language, which in some cases makes them difficult to iden-
tify as loans without a careful analysis. In the same way, PAT-loans inherently
involve a process of grammaticalization, often leading to patterns that differ
from those in the source language (cf. Heine and Kuteva 2005). PAT-borrowing
is facilitated by a pivot common to both languages (Matras and Sakel 2007).
Such cases of PAT-adjustment in the sample include the shift in grammatical
meaning of native Yiddish elements to correspond to Slavic aspectual mark-
ers (Reershemius, this volume).
4. MAT and PAT in phonology
How could the distinction between MAT and PAT-loans be employed in phon-
ology? In most contact situations, MAT-loans involve phonological changes
that can go in two directions:
(1) MAT-borrowed element is phonologically integrated into the recipient
language; e.g. Mosetén ishkweera for Spanish escuela ‘school’.
(2) MAT-borrowed element is not integrated and may introduce new phon-
emes into the recipient language; e.g. in Jaminjung, where loan-pho-
nemes have risen to phoneme status within the language and are now
used with native words (Schultze-Berndt, this volume).
Only the latter strategy, (2), involves phonological borrowing in which elem-
ents from the source language are transferred into the recipient language. One
could argue that loans in (2) are MAT-loans if they introduce a new phone,
and PAT-loans if they introduce a new phoneme. The loss of certain phono-
logical distinctions would possibly be a case of PAT, since no new material
is introduced, and a change in the stress patterns could be counted as PAT
as well, since it involves the overall patterning rather than actual forms. A
change in tone, on the other hand, could be MAT-borrowing if it occurred
in isolated loanwords and PAT if it affected the whole language. A cautious
attempt to classify the phonological contact phenomena found in the sample
is the following:
– Borrowing of individual phonemes that are also used in native elements
is found in Vietnamese, Indonesian, Jaminjung and Paraguayan Guaraní.
This type of loan would probably best be classified as MAT, since it in-
volves borrowing of phonological material without a major disruption of
the phonological system.
– Borrowing of a series of distinctions is common. K’abeena, Rumungro
and Kildin Saami have borrowed a palatal series. K’abeena has further-
more borrowed ejectives, while Tasawaq gained pharyngealization and
Vietnamese had a retroflex series introduced by contact. These would be
MAT-loans in the sense that elements are borrowed and PAT-loans in the
sense that a whole series of phonemes within the phonological system is
affected by the contact.
Adaptation of stress, syllable structure, prosody or tone systems would
classify as PAT-borrowing in phonology since they affect the system, rather than
individual elements. Stress patterns are adjusted in Western Neo-Aramaic,
Yiddish, Hup, Manange, North-eastern Neo-Aramaic and Paraguayan Guar-
aní. The syllable structure is adapted due to contact in Yiddish and Indone-
sian, allowing consonant clusters. Prosody is affected in Hup and Jaminjung,
while tone has undergone contact-induced changes in Vietnamese and Man-
ange. Hence, MAT and PAT could be used to classify contact phenomena in
the phonology, even though the MAT/PAT distinction is most useful
for loans from the morpheme level upwards.3
5. The MAT/PAT distinction in grammar: findings from the sample
We will discuss the data from the sample by looking at the following cases:
(1) situations with overall MAT-borrowing; (2) situations with overall PAT-
borrowing; and (3) hierarchical relations between contact languages. When
talking of “overall” contact phenomena, we deliberately do not quantify loans
in contact situations to avoid sampling problems, but rather only look at cases
where the overwhelming majority of the loans are of either type.
5.1. Situations with overall MAT-borrowing
In the sample, Jaminjung, Biak and Vietnamese stand out in having predomi-
nantly MAT-loans.
Jaminjung is in contact with Kriol, a creole which is likely to have been
influenced by the native languages of the area (cf. Schultze-Berndt, this vol-
ume). The patterns of Kriol are in many ways similar to those of Jaminjung
possibly due to substrate influence from the languages of the area on Kriol
during its development. For this reason, Schultze-Berndt chose not to include
PAT-loans in her overview to avoid circularity and in this particular case the
explanation for the predominance of MAT-loans is hence merely methodo-
logical.
Biak, on the other hand, seems to have a preference for MAT-loans. Biak
word order is highly fixed, which means that in many cases MAT-loans are
easier to integrate than a change in word order or re-modelling of existing
material (cf. van den Heuvel, this volume). For example, due to Indonesian
pressure, negation is expressed by a sentence-initial adverb. A native element
in Biak could have been adjusted as a PAT-loan, but the pressure against the
change in word order this would involve was higher than that of borrowing
new material.
Vietnamese has had substantial contact influence from Chinese. While
most other situations involve bilingualism and language contact in the spoken
language, Chinese influence on Vietnamese has predominantly been through
written materials. The types of loans encountered are mostly MAT. This can
be attributed to the type of contact situation, as without oral bilingualism pat-
tern copying is difficult, and MAT-loans prevail. This rather trivial fact makes
sense in the current discussion since it is directly related to the types of loans
encountered. MAT-loans can appear even in cases of monolingualism, such
as in the Muran language Pirahã (not discussed further in this volume), which
has exclusively MAT-loans from Portuguese due to very rudimentary bilin-
gualism.
Concluding, there are both structural reasons for MAT-loans, as in Biak,
where PAT is not preferred as it would lead to changes of a highly fixed word
order, as well as reasons found in the type of contact situation linked to the
degree of oral bilingualism, as in the case of Vietnamese.
5.2. Situations with overall PAT-borrowing
Likewise, a number of languages in the sample show an overall majority of
PAT-borrowing. These are K’abeena, Hup, Macedonian Turkish and Khuzistani
Arabic, most of which are also part of well-established linguistic areas.
K’abeena is discussed for its participation in the Gurage linguistic area
(cf. Crass, this volume) and exhibits areal patterns of varying sources. Not
surprisingly, most of the contact phenomena found in K’abeena are PAT-loans.
K’abeena has a number of MAT-loans as well: apart from lexical elements,
it has borrowed a few markers of temporal and causal clauses, discourse
markers and interjections. These belong to the category of function words,
comprising conjunctions, discourse markers and other elements detached
from the main proposition of the clause (cf. Matras 1998).
Hup is part of the Vaupés area in the North-Western Amazon. Epps (this
volume) focuses on the contact situation between two of the languages from
this area, Hup and Tukano. Hup borrows a vast number of PAT-features from
Tukano, while there are only a few MAT-loans. The reasons for this are strong
cultural restrictions in the Vaupés region against MAT-loans. These can none-
theless be overridden, as is the case in MAT-loans from Tukano, including
an adverbial particle ‘until’, a negative emphasis marker and a disjunction
marker. Again, these belong to the category of function words and the only
MAT-loans outside this category in Hup are numerals.4
Macedonian Turkish is spoken in the Balkan linguistic area and
displays a large number of areal features that can be established when
comparing the language to its neighbouring Balkan languages and Standard
Turkish. While most loans are PAT, some of the discourse markers, as well as the
phrasal conjunctions ama ‘or’ and i ‘and’ are MAT-loans from Macedonian.
Matras and Tufan (this volume) conclude for Macedonian Turkish
that MAT structures are probably new additions since the change in
hierarchical structures between the languages, while PAT-borrowing must have
occurred earlier as a means of “utterance-organizing strategies”.
A similar picture emerges in Khuzistani Arabic, which has a majority
of PAT-loans, and again the only MAT-loans are found in categories such
as discourse markers, fillers, tags and focus particles where they have an
interaction-qualifying function (cf. Matras and Shabibi, this volume). The
reason why Khuzistani Arabic has many PAT-loans could be its status in the
contact situation, which is similar to that of Macedonian Turkish in that both
languages are spoken across the border from areas where they are majority
languages. The variants discussed here are dominated, but the speakers are
frequently exposed to the standard languages, for example through television
and visits. Speakers may have a wish to keep the two languages apart
as they attribute equally high status to them in the different contexts of use.
This restricts MAT-loans and leads to mainly PAT-loans, while in the category
of function words the two systems collide, very similar to what we found in
Hup and K’abeena. The fact that MAT-loans of function words are frequent in
these languages appears to confirm Matras’s (1998) hypothesis that bilingual
speakers will have greater difficulties keeping their linguistic systems apart
around such markers than for other domains of functions.
5.3. Relations between languages and MAT/PAT borrowing
Let us now look at the data from another perspective, namely whether the
relations between the languages in contact have an impact on the types of
loans encountered.
In the sample, most borrowing appears from hierarchically higher – or
dominant – languages into lower, dominated languages. Dominance is here
broadly defined: a language is dominant when used for administration, as a
lingua franca, and when it has to be learnt by the speakers of the dominated
language, which in turn is usually not used for any of the above or which is
used in less official environments. A language can be dominant in one con-
tact situation, while dominated in another. For example, Katanga Swahili is
dominant in being the lingua franca of the region but dominated by French,
which is the official and administrative language of the country. Apart from
one-to-one borrowing situations, some of the languages in the sample belong
to well-established linguistic areas. Table 1 summarizes the contact situations
treated by authors of the chapters. Many of the languages in the sample are
also part of linguistic areas, but where this is not immediately discussed in
the chapters I left it out of the current discussion. X marks the main focus of
the contribution to this volume, Y marks another contact situation which is
not the main focus of the chapter.
Table 1. Types of contact situation and hierarchical status in contact situation

                                        dominated        dominant   areas
                                        (treated here)
1  Indonesian, Katanga Swahili          X                Y
2  K’abeena                                                         X
3  Biak, Domari, Imbabura Quichua,      X
   Khuzistani Arabic, Kurmanji
   Kurdish, Manange, Mosetén,
   North-eastern Neo-Aramaic, Otomi,
   Paraguayan Guaraní, Purepecha,
   Rapanui, Saami, Tasawaq, Western
   Neo-Aramaic, Vietnamese, Yaqui,
   Yiddish
4  Hup, Jaminjung, Likpe, Nahuatl,      X                           Y
   Rumungro, Macedonian Turkish

We can usually see layers of different types of contact in languages
that belong to more than one contact situation. This is visible in the first
group of languages in Table 1, i.e. languages that are dominant in some
situations but dominated in others. In this way, Katanga Swahili has only
a few MAT-loans, most of which are French numbers and discourse markers.
The loans from the substrate languages are PAT, apart from three MAT
noun-class markers. If we expected substrate influence to lead only to
PAT-borrowing, this would be exceptional. We can see reasons for why these
markers have been borrowed in this way by looking at the system. De Rooij
(this volume) shows that many of the Swahili noun class markers correspond
in form and function to those of the substrate languages. The differences –
in particular the three markers in question – were borrowed to assimilate
Katanga Swahili’s system of noun classification to that of the substrate lan-
guages. We are therefore not dealing with a mere MAT-loan of three markers
in isolation, but with a general adjustment of two systems that are already
largely identical. In the same way we find MAT-loans from current substrate
languages in Indonesian. Indonesian is the dominant language in contact
with other languages spoken in Indonesia, such as Javanese but it has also
been dominated by a range of other languages, in particular Sanskrit, Chi-
nese and some European languages. The loans from the dominant languages
are both MAT and PAT and have, for example, led to a number of changes
in the sound system of Indonesian. The influence from substrate languages
is, astonishingly, likewise MAT and PAT. MAT-loans from Javanese include
collective particles, a third-person plural pronoun, some interrogative mark-
ers, a focus marker and modal particles. If Javanese was a substrate language
at the time of borrowing, these MAT-loans would be exceptional in the light
of general preconceptions that substrate-influence is mainly PAT. However,
we can explain these MAT-loans from Javanese: although it is a substrate
language today, it had a different status in the past, described by Tadmor as
a “quasi-symbiosis” between Indonesian and Javanese. The MAT-loans may
have entered Indonesian during this time.
The only language in the second group, K’abeena, has been discussed in
the previous section for its prevalence of PAT-loans. When looking at the few
attested MAT-loans, these are also from languages that take part in the area:
(Ethio-) Semitic lexical elements are used as adverbial clause markers (time,
reason, cause), an adjectival suffix is borrowed from Amharic and a number
of interjections have the same forms as in other languages of the area. Apart
from the adjective suffix these fall under the category of function words. Even
though this is only one example, it shows that MAT-loans in the category of
function words do appear in linguistic areas.5
The majority of the languages in the sample are in a contact situation
with a dominant language. These are listed as group 3 in Table 1. In these
languages both MAT and PAT loans are common in most categories, while
there is heavy borrowing of function words, which are overwhelmingly bor-
rowed as MAT. The contact phenomena encountered in these languages are
in general very similar, independent of whether the languages are spoken by
small minorities or whether they are major national languages. In this way,
the major South American indigenous languages Paraguayan Guaraní and
Imbabura Quichua (cf. Rendon, this volume) borrow heavily from Spanish
and have comparable contact phenomena to the small indigenous languages
Mosetén (cf. Sakel, this volume) and Purepecha (cf. Chamoreau, this vol-
ume). Stolz and Stolz (1996, 1997) come to similar conclusions comparing
a range of languages in contact with Spanish. They argue that the similarities
in contact phenomena are found within the same categories. Take the follow-
ing example: most languages in contact with Spanish borrow an element out
of the category of temporal adverbial clause markers, such as hasta ‘until’,
desde ‘from’, cuando ‘when’, but which ones of these markers are borrowed,
and which functions they assume in the recipient language, is language-spe-
cific (cf. Stolz and Stolz 1996, cf. also Sakel’s 2007 description for Mosetén).
Similarities in contact phenomena have also been attested for languages in
contact with Russian as the dominant language. Rießler (this volume) dis-
cusses how Russian contact phenomena in Kildin Saami resemble those in
other contact situations with Russian as the dominant language, referring to
Majtinskaja (19789). The reasons for this have to do with universals of bor-
rowing, rather than individual languages. Comparing the contact phenomena
in the sample, we find that similar principles apply for all contact situations
of the same type, independent of the source language. Indeed, when compar-
ing the situations where Spanish is dominant to those where Russian or other
languages are dominant, elements within the same categories are taken over,
and often these are MAT-loans of function words. This also includes dom-
inant languages that are not Indo-European, such as Arabic in contact with
Western Neo-Aramaic. These situations are similar due to general principles
of contact, rather than language-specific ones.6 The build-up of the system and
availability of certain structures in Spanish and Russian may nonetheless play
a role in a few cases, since they have similar typological profiles.
The final group of languages in Table 1 comprises Hup, Jaminjung, Likpe,
Nahuatl, Otomi, Rumungro and Macedonian Turkish. For these languages
the authors of the individual chapters discuss different layers of contact in
both one-to-one borrowing situations and linguistic areas. The general results
are that the languages display PAT-loans from the areas they are in, as well
as MAT/PAT loans from their particular one-to-one borrowing situations. In
some cases area and borrowing situations overlap, as in the case of Hup in
contact with Tukano, both of which are part of the Vaupés area. In other cases,
there are clear differences between the contact influence from the area and the
borrowing situation, as in Nahuatl, which is part of the Meso-American area
(as proposed by Campbell et al. 1986), as well as being part of a one-to-one
borrowing situation with Spanish. Comparing the contact phenomena in the
latter case between a one-to-one borrowing situation and an area, these are
very different: the areal phenomena are predominantly PAT, while Spanish
loans are very similar to other situations in which indigenous languages of the
Americas are in contact with Spanish as the dominant language and include
many MAT-loans.
Concluding, MAT-loans appear in dominated languages in one-to-one
borrowing situations, but PAT-loans are also frequent in these cases. While
linguistic areas and situations of substrate influence display mainly PAT-loans,
MAT-loans are very frequent in the category of function words and indeed ap-
pear in all languages of the sample (cf. also Matras’s chapter on borrowability
of categories, this volume). This suggests that function words are borrowed
easily and relatively early on in contact situations. We can see this in layers of
contact, for example, in the variant of North-eastern Neo-Aramaic (cf. Khan,
this volume) whose speakers have all immigrated to Israel. The original con-
tact situation with Kurdish as the dominant language has been replaced by the
new situation with Hebrew as the new dominant language. Indeed, the loans
from Hebrew are primarily found in the lexicon and function words. More
evidence for this comes from Romani (Elšík, this volume, cf. also Elšík and
Matras 2006). Also other sample languages show early borrowing of function
words: all of the sample languages show some degree of MAT-borrowing,
even in those languages where all or most other loans are PAT, such as Hup
and K’abeena.7 In many cases MAT-loans of function words have been on a
long journey, being borrowed from one language into another. For example,
some Kurdish discourse markers and connectors are of Arabic origin and
have entered the language through contact with Turkish (Haig, this volume).
Likewise, some Hup function words of Portuguese origin have most likely
entered the language via Tukano (Epps, this volume).
6. Conclusion
We have shown that the distinction between MAT and PAT is very useful for
classifying contact phenomena. For contact theory in general this means that
the distinction between MAT and PAT-borrowing should be included in at-
tempts to categorize contact situations.
We can conclude the following for the regularities behind MAT/PAT bor-
rowing: We have found that hierarchical relations between languages have an
impact on the types of loans encountered. For example Arabic is a dominant
language in some contact situations and a dominated language in others: it is
dominant in contact with Domari and Western Neo-Aramaic, while Khuzistani
Arabic is dominated by Persian. In the first case, the dominated languages
incorporate many MAT-loans from Arabic, while in the second case, MAT-
loans are from Persian.8 Hence, the direction of types of contact phenomena
(in this case MAT-loans) depends on the hierarchy between the languages.
Furthermore, when a dominant language has high status, MAT-loans into a
dominated language are often easily accepted. For example many MAT-loans
from Spanish appear in Purepecha, Nahuatl and Yaqui, for which Spanish is a
highly dominant and high-status language. In the same way, MAT-loans can
be shunned from contact situations, such as in areas with social constraints
against matter-replication in the Vaupés, as discussed for Hup above, and
also in other areas such as parts of Papua New Guinea (Ross 1996). Also the
degree of bilingualism plays a role in the way elements are borrowed. With-
out bilingualism, patterns are usually not copied and MAT is only borrowed
in a restricted sense. The type of contact influence is likewise closely related
to degree of (oral) bilingualism. For example Vietnamese has experienced
massive contact influence from Chinese, though this was mainly through
written materials and rarely through oral transmission involving bilingualism,
leading to a majority of MAT-loans.
All languages of the sample display MAT-loans of function words, even
those that have a majority of PAT-loans. This suggests that reasons for bor-
rowing function words as MAT are stronger than other constraints in contact
situations. Such reasons include the detached nature of function words, which
makes them easily borrowable, as well as their function as discourse structur-
ing devices (cf. Matras 1998). It could furthermore point at a general change
of contact situations across the world towards situations with one dominant
language that is used in administration, television and general communica-
tion in a highly mobile, globalized world. This does not mean that dominant
languages did not exist in the past, cf. the influence of Chinese on Vietnamese
or, for instance, Sanskrit influence on Indonesian. In those cases, however,
contact prevailed in some communities (among the elite, for example),
while in other communities or at other times contact was less intensive. In
today’s globalized world, however, some languages, often former colonizers’
languages, have impact on entire speech communities. Television, school-
ing, trade, mobility, indigenous organizations, health, developmental organ-
izations and other ways of communicating with the “outside” world have
led to the rise of these already highly dominant, major languages through
increased bilingualism. Many contact situations in the sample show this ten-
dency. For example in Hup, area-internal restrictions such as taboos against
MAT-borrowing have been overridden by overwhelming influence of Tukano
and eventually Portuguese as administrative languages (cf. also discussion by
Aikhenvald 2002).
Notes
1. I would like to thank Yaron Matras and Kristine Hildebrandt for comments on
earlier versions of this chapter.
2. MAT-borrowing without any PAT will not be discussed in this chapter as it is
very rare and mainly occurs in the lexicon; i.e. usually MAT is taken over with at
least part of its original PAT. An example of MAT-only borrowing in the lexicon
is the noun handy ‘mobile phone’ in German, which does not have this meaning
in the source language English.
3. Yaron Matras made the useful comment that just like the shape of morphemes
can combine with a certain meaning and appear in a certain organization pat-
tern, so can a phone acquire meaningfulness as a phoneme, and combine with
certain patterns of prosody, tone, or permissible combinations of sounds. Rather
than define a perfect match for MAT and PAT in phonology, we can simply re-
main conscious of the layered structure of phonological representation and the
fact that contact-related change may affect one level without affecting another.
4. Numerals are very frequently borrowed as MAT in many contact situations.
5. One could speculate that the reasons for this could be temporal dominance of
the source language.
6. These principles are anchored in the role of categories in processing discourse,
and the way that operating in a bilingual setting influences language processing;
see contribution by Matras.
7. We find similar loans in Indonesian, where some function words are taken from
dominant languages in the history, e.g. Sanskrit, Arabic, Creole Portuguese and
Persian. Other function words in Indonesian were borrowed as content words
and then grammaticalized, such as ‘and’ from the Sanskrit word for ‘company’.
8. There are mainly PAT-loans in Khuzistani Arabic for reasons discussed above.
Also, Arabic is still a dominant language in some religious contexts in Iran,
which skews the picture somewhat.
References
Aikhenvald, Alexandra Y.
2002 Language contact in Amazonia. Oxford: Oxford University Press.
Bolonyai, Agnes
1998 In-between languages: Language shift/maintenance in childhood
bilingualism. International Journal of Bilingualism 2 (1): 21–43.
Campbell, Lyle, Terrence Kaufman, and Thomas C. Smith-Stark
1986 Meso-America as a linguistic area. Language 62 (3): 530–570.
Elšík, Viktor, and Yaron Matras
2006 Markedness and Language Change: The Romani sample. Berlin:
Mouton de Gruyter.
Gołąb, Zbigniew
1956 The concept of isogrammatism. Biuletyn Polskiego Towarzystwa
Językoznawczego 15: 1–12.
1959 Some Arumanian–Macedonian isogrammatisms and the social
background of their development. Word 15 (3): 415–435.
Haugen, Einar
1950 The analysis of linguistic borrowing. Language 26 (2): 210–231.
Heath, Jeffrey
1984 Language contact and language change. Annual Review of
Anthropology 13: 367–384.
Heine, Bernd, and Tania Kuteva
2005 Language Contact and Grammatical Change. Cambridge: Cambridge
University Press.
Johanson, Lars
1992 Strukturelle Faktoren in türkischen Sprachkontakten. Stuttgart: Franz
Steiner Verlag.
Keesing, Roger M.
1991 Substrates, calquing and grammaticalization in Melanesian Pidgin. In:
Elizabeth C. Traugott and Bernd Heine (eds.), Approaches to Gram-
maticalization Vol. 1: Focus on theoretical and methodological issues,
315342. Amsterdam: John Benjamins.
Majtinskaja, K. E.
1978–9 Zaimstvovannye elementy, ispol’zuemye v finno-ugorskih jazykah pri
obrazovanii form naklonenij [Borrowed elements, used in inflectional
forms in Finno-Ugric languages]. Études Finno-Ougriennes 15: 227–
231.
Matras, Yaron
1998 Utterance modifiers and universals of grammatical borrowing.
Linguistics 36 (2): 281–331.
2000 Fusion and the cognitive basis for bilingual discourse markers.
International Journal of Bilingualism 4 (4): 505–528.
Myers-Scotton, Carol
2002 Contact Linguistics: Bilingual Encounters and Grammatical Out-
comes. Oxford: Oxford University Press.
Nau, Nicole
1995 Möglichkeiten und Mechanismen kontaktbewegten Sprachwandels –
unter besonderer Berücksichtigung des Finnischen. Munich: Lincom.
Ross, Malcolm D.
1988 Proto Oceanic and the Austronesian Languages of Western Melanesia.
Canberra: Pacific Linguistics.
Sakel, Jeanette, and Yaron Matras
2004 Database of Convergence and Borrowing. Manchester: University of
Manchester.
Sakel, Jeanette
2007 Language contact between Spanish and Mosetén: A study of grammatical
integration. International Journal of Bilingualism 11 (1): 25–53.
Savić, Jelena M.
1995 Structural convergence and language change: Evidence from Serbian–
English code-switching. Language in Society 24: 475–492.
Siegel, Jeff
1997 Mixing, leveling, and pidgin/creole development. In: Arthur K. Spears
and Donald Winford (eds.), The Structure and Status of Pidgins and
Creoles, 111–149. Amsterdam: John Benjamins.

Stolz, Christel, and Thomas Stolz
1996 Funktionswortentlehnung in Mesoamerika, Spanisch–Amerindischer
Sprachkontakt. Sprachtypologie und Universalienforschung 49 (1):
86–123.
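1997 Universelle Hispanismen? Von Manila über Lima bis Mexiko und zurück:
Muster bei der Entlehnung spanischer Funktionswörter in die
indigenen Sprachen Amerikas und Austronesiens. Orbis 39 (1) [1996–
1997]: 1–77.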
Treffers-Daller, Jeanine, and Raymond Mougeon
2005 The role of transfer in language variation and change: Evidence from
contact varieties of French. Bilingualism: Language and Cognition 8:
9398.
Weinreich, Uriel
1966 Languages in Contact: Findings and Problems. The Hague: Mouton
(first publ. 1953).
The borrowability of structural categories
Yaron Matras
1. Introduction
The question of the “borrowability” of categories has often been equated with
the presence or absence of constraints that rule out the borrowing of cer-
tain kinds of structures (cf. Campbell 1993, Thomason 2001 and elsewhere).
I use the term here in a different sense. Borrowability is taken to mean the
likelihood of a structural category to be affected by contact-induced change
of some kind or other (whether matter- or pattern-replication; see Matras
and Sakel 2007). From a strictly structure-oriented point of view, one might
interpret this as the “ease” with which a category can be re-shaped through
contact. I am not quite happy with this formulation, either, since it leaves
open the source of the process and its motivation. Nor is the issue resolved
by re-stating the obvious, namely by claiming that there is a link between
the sociolinguistic norms of a speech community, the intensity of cultural
contacts, and the outcomes of structural processes of change (cf. Thomason
2001, following Thomason and Kaufman 1988); for such a statement does
not account for the fact that the borrowing of some categories requires more
intensive contact than that of others. In other words, it fails to explain the
hierarchical relationship between individual positions on the borrowing con-
tinuum.
When we speak of “ease” of borrowing, we are referring implicitly at
least to the communicative behaviour of speakers in a bilingual setting and
to changes in that behaviour that have a long-lasting effect on the shape of
the language that they use. What interests us in this connection is the likeli-
hood that, in respect of a particular structure which serves a particular func-
tion in language processing, speakers might give up the separation of two
sub-components within their linguistic repertoire – the two “languages” –
and begin to employ the structure in question regardless of the choice of
language. Bilingual speakers of English and German, for example, take for
granted that the concepts computer, download, and internet are com-
mon to both sets of communicative interactions in which they normally en-
gage: those where the chosen language of conversation is English, and those
where it is German. Bilingual speakers of Domari and Arabic (see Matras,
this volume) are fully at ease with the fact that the entire system of clause
combining and connectors is shared by their two languages; a shift in the
interaction setting will lead them to switch into another “language”,1 and
this will affect the selection of various structures – vocabulary, inflections,
anaphora and deixis, and so on – but it will not affect strategies of clause-
combining, which remain the same in all settings (i.e. for both languages).
And for speakers of Macedonian and the local Turkish dialects spoken in
that country (see Matras and Tufan, this volume) the way of organizing in-
formation in copula sentences is identical regardless of the language that is
being spoken. This, in essence, is the core of the diachronic process that we
call “borrowing”. With “borrowability”, then, we mean the likelihood that
speakers will give up the separation between their “languages” – the mental
demarcation line that divides their overall repertoire of linguistic structures
– in respect of a particular function-bearing structure (a “category”).2
2. Borrowing hierarchies
Essentially two kinds of generalizations have been proposed concerning the
borrowing of grammatical categories. Those of the first kind relate to the
frequency with which a category may be affected by contact-induced change.
Generalizations of the second kind suggest an implicational relationship be-
tween the borrowing of individual categories: the borrowing of one category
is understood to be a pre-condition for the borrowing of another.
The majority of observations on grammatical borrowing belong to the first
group (cf. Haugen 1950, Heath 1984, Thomason and Kaufman 1988, van
Hout and Muysken 1994, Stolz and Stolz 1996 and 1997, Winford 2003,
Aikhenvald 2006). Some statements are based on casual impressions only,
while others report the results of counting exercises performed on a cor-
pus. An issue that merits attention is the distinction between the counting
of tokens and the counting of types (cf. also van Hout and Muysken 1994:
42–43): can we consider nouns to be more borrowable than, for instance,
conditional particles, simply because nominal tokens occur in a corpus more
frequently than conditional particles? Surely, token frequency will tell us how
often a borrowed form is used, but it will not necessarily reveal how likely it
is to be borrowed. Counting types, in turn, raises problems of its own: can
we conclude on the basis of type-frequency that adjectives, for instance, are
easier to borrow than conditional particles, considering that the first
constitute an open class of nearly unlimited types (for the purposes of any
practical comparison), while a language is likely to have just a very restricted
inventory of conditional particles (if indeed more than one)? Such issues make
it difficult to compare frequency-based hierarchies drawn from conversational
corpora.3
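The difference between the two counting methods can be illustrated with a minimal sketch in Python. The corpus below is entirely hypothetical: the word forms, the category labels, and the function name `borrowed_counts` are invented for illustration and are not drawn from any of the studies cited. It merely shows how token counts and type counts of borrowed items can diverge for the same category.

```python
from collections import Counter

# Hypothetical toy corpus of (word form, category, is_borrowed) triples.
# All items are invented for illustration only.
corpus = [
    ("komputa", "noun", True),
    ("komputa", "noun", True),
    ("komputa", "noun", True),
    ("haus", "noun", False),
    ("intanet", "noun", True),
    ("si", "conditional", True),
    ("big", "adjective", False),
]


def borrowed_counts(corpus):
    """Return (token_counts, type_counts) of borrowed items per category."""
    token_counts = Counter()
    types_seen = {}
    for form, category, is_borrowed in corpus:
        if not is_borrowed:
            continue
        # Every occurrence of a borrowed form counts as one token.
        token_counts[category] += 1
        # Each distinct borrowed form counts once as a type.
        types_seen.setdefault(category, set()).add(form)
    type_counts = {cat: len(forms) for cat, forms in types_seen.items()}
    return token_counts, type_counts


tokens, types = borrowed_counts(corpus)
```

In this toy corpus the noun category yields four borrowed tokens but only two borrowed types, while the conditional particle yields one of each. Neither figure takes into account the size of the class from which the items are drawn, which is the comparability problem raised above.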
This kind of dilemma does not present itself when comparing the gram-
matical (and lexical) systems of different languages in a sample (whether a
structured one, or a casual one). Here, we are interested in the number of lan-
guages within the sample in which category X has been re-shaped as a result
of contact. The more languages show borrowing affecting a certain category,
the higher the frequency of borrowing for that particular category in the sam-
ple. We might then say that this category is “more likely” to be borrowed,
relative to other categories.
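A sample-based frequency count of this kind might be sketched as follows. The language labels, the category names, and the helper `category_frequency` are hypothetical placeholders and do not represent the sample languages of this volume.

```python
# Hypothetical sample: language label -> categories reshaped by contact.
# The labels and values are invented for illustration only.
sample = {
    "lang_A": {"connectors", "phonology"},
    "lang_B": {"connectors"},
    "lang_C": {"connectors", "case"},
    "lang_D": set(),
}


def category_frequency(sample):
    """Count, for each category, the number of languages in which it is affected.

    Returns (category, count) pairs sorted by descending count,
    with ties broken alphabetically.
    """
    counts = {}
    for categories in sample.values():
        for category in categories:
            counts[category] = counts.get(category, 0) + 1
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


ranking = category_frequency(sample)
```

Here each language contributes at most one count per category, so the ranking reflects how widely a category is affected across the sample rather than how often borrowed forms occur in any one corpus.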
Implicational hierarchies entail frequency hierarchies,4 but go a signifi-
cant step further in suggesting a constraint on the occurrence of borrowing
with any lower-ranking category. The usual format of the statement is: Y is
not borrowed unless X is borrowed as well (cf. Moravcsik 1978, Stolz 1996,
Matras 1998 and 2002, Field 2002, Elšík and Matras 2006). The postula-
tion of implicational borrowing hierarchies thus goes beyond the assump-
tion that categories have different susceptibility to contact-induced change. It
suggests that the process of contact-induced change follows, to some extent
at least, a predictable pathway, with one stage leading as a pre-requisite to
another. Moravcsik (1978) had attempted in this way to link the borrowing
of non-nouns to nouns (“no non-nouns are borrowed unless nouns are also
borrowed”), inflectional to derivational morphology (“no inflectional morph-
ology is borrowed unless derivational morphology is also borrowed”), and so
on, resulting in a web of inter-dependencies among various structural types
(cf. Field 2002 for a statement on agglutinative > inflectional morphology).
Matras (1998, 2002) and Elšík and Matras (2006) concentrate their observa-
tions on inter-dependencies of what they consider “values” of the same cat-
egory, i.e. members of a shared structural paradigm. Matras (1998) postulates
a borrowing hierarchy ‘but’ > ‘or’ > ‘and’ with respect to coordinating con-
junctions, and Matras (2002) and Elšík and Matras (2006) suggest ‘necessity’
> ‘ability’ > ‘volition’ with respect to expressions of modality, and many
more.5
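The format "Y is not borrowed unless X is borrowed as well" lends itself to a simple check against a sample. The following sketch assumes the hierarchy ‘but’ > ‘or’ > ‘and’ cited above; the language labels and their borrowing profiles are invented, and `violations` is a hypothetical helper name.

```python
# Hierarchy taken from the text: 'but' > 'or' > 'and'.
HIERARCHY = ["but", "or", "and"]


def violations(hierarchy, borrowed):
    """Return (lower, higher) pairs in which a lower-ranking value is
    borrowed although a higher-ranking value is not."""
    found = []
    for i, higher in enumerate(hierarchy):
        for lower in hierarchy[i + 1:]:
            if lower in borrowed and higher not in borrowed:
                found.append((lower, higher))
    return found


# Hypothetical borrowing profiles, invented for illustration only.
sample = {
    "lang_A": {"but"},
    "lang_B": {"but", "or"},
    "lang_C": {"but", "or", "and"},
    "lang_D": {"and"},
}

report = {lang: violations(HIERARCHY, borrowed)
          for lang, borrowed in sample.items()}
```

A language such as the invented `lang_D`, which borrows ‘and’ without ‘but’ or ‘or’, would count as a counterexample to the hierarchy, whereas the other three profiles are consistent with it.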
Frequency-based hierarchies and implicational hierarchies may comple-
ment one another. Stolz and Stolz (1996) and Ross (2001), both relying on
frequency-oriented observations rather than strict implicational hierarchies,
conclude that contact-induced change begins at the level of the organization
of discourse, proceeds to the organization of the paragraph, utterance, and
sentence levels, and only then reaches the levels of the phrase and word.
The postulation of this kind of implicational hierarchy rests in such cases on
accumulated observations of a series of frequency hierarchies.
Borrowing hierarchies thus provide us with an opportunity to gain insights
into the factors that prompt speakers to allow their language systems to con-
verge around a particular structure. Explanations of borrowing generally take
one of three directions: (1) The degree of borrowing is related to the intensity
of exposure to the contact language, (2) The outcome of language contact is a
product of the structural similarities and differences (congruence) among the
languages concerned, and (3) Borrowability is a product of inherent seman-
tic-pragmatic or structural properties of the affected categories. Issues such
as prestige and domain-specialization of the languages typically fall under
type 1, while conjectures about functional “gaps” as motivating factors fall
under type 2.
Our interest in the present context is in explanations of type 3. This in-
terest derives from the realization that structures and paradigm values often
behave in an asymmetric manner when it comes to contact-related change.
Under the “prestige” or “intensity of contact” effect there is no a priori reason
why ‘but’ should be more vulnerable and prone to borrowing than ‘and’; in
many cases structural congruence does not provide an answer to this hier-
archical relationship, either (cf. Matras 1998). Where category or paradigm
values consistently show unequal or asymmetrical behaviour in contact situ-
ations, factors promoting borrowing must be sought in the inherent proper-
ties that they possess. In trying to explain the borrowability of categories we
must therefore return to our initial assumption that ease of borrowing reflects
the ease with which speakers are willing to give up the separation of two
“language systems” and allow them to converge or to fuse around a particu-
lar linguistic function. The question that we ask is therefore: What is it that
makes one category (or category value) a more attractive candidate for “sys-
tem conflation” than another?
Elšík and Matras (2006: 370ff.), following Matras (1998), argue that bor-
rowing is motivated by cognitive pressure on the speaker to reduce the men-
tal processing load by allowing the structural manifestation of certain men-
tal processing operations in the two languages to merge. The need to do so
arises especially around operations that gauge the presentation of propos-
itional content to hearer expectations, for example connectivity and modality.
In these domains, merger of the structures targets in the first instance those
conceptual domains where the speaker’s epistemic authority is in question,
and the potential for tension at the interaction level is therefore greatest. This
occurs for example around the expression of condition, contrast,
participant-external force, or other, more general conceptual complexity. This accounts
for the direction of the hierarchy, which prioritizes category values such as
“contrast” and “external force”.
We should at this stage clarify the notion of “category”. One of the fac-
tors impeding straightforward comparability among hierarchies postulated in
the literature is the vagueness with which category labels are used. Some re-
searchers have, for example, identified a category of “function words”; others
speak of “particles”, and others still make reference to a class of “pronouns”.
As “function words” we can classify anything from interjections and fillers
on to definite articles and demonstratives – categories that show enormous
variation in respect of their contact behaviour. “Particles” can include mark-
ers of modality, connectivity, aspect, and more, while “pronouns” are used
with reference to such functionally diverse entities as anaphoric or third-
person pronouns, indefinites, and participant deixis.
If we suspect that there is a link between category status and borrowing,
then we must assume that “category” represents a functional notion, rather
than just a constituent slot or a wholesale cover-term. Categories are under-
stood here to be operational devices that trigger mental processing activities
in communicative interaction: nouns name objects, interjections direct atten-
tion to emotive evaluations of the speech situation, connectors establish links
between the processing of individual propositions, word order serves as a
map to organize information at the utterance level, and so on. Their concrete
representation in a given language is through a structural form, which may
or may not have cross-linguistic equivalents. Our agenda is to accommodate
the borrowing behaviour of categories within an explanatory model, one that
accounts for the link between the processing function which the category
triggers, and the degree to which speakers allow its structural representation
to converge or fuse among the two (or more) components of their linguistic
repertoire.
To be sure, different explanatory models may be appropriate for different
structural components of language. There is no doubt that the borrowing of
institutional terminology from a language that is dominant in the public or
acrolectal domain is not a result of mere tension at the level of processing the
speech interaction, but rather an attempt to extend the referential world of the
“dominant” language into interactions in which the “minority” or “weaker”
language is used. The borrowing of phonemes may, in turn, be simply instru-
mental in serving the authentic integration of loanwords without “distorting”
them, by adjusting the phoneme system to accommodate them. Nevertheless,
these too are conversation-functional factors that motivate speakers to allow
their “linguistic systems” to conflate around certain structures and categories.
Finally, we must briefly comment on the relevance of exceptions to pos-
tulated hierarchies. In a discussion that focuses on absolute constraints on
borrowing and sets out to test their validity, the discovery of counterexam-
ples can have a sensational effect in dismantling earlier claims. We aim here
at taking what Aikhenvald (2006: 26) describes as the more “positive” route:
understanding a variety of factors and preferences that facilitate structural
diffusion among languages. In this spirit, it would be naïve and counter-
productive to ignore tendencies that are followed by a substantial group of
languages within the sample only because they are not followed by all, or
indeed because they might be contradicted in one or two instances. Where
there appears to be a motivation behind trends, one that is beyond pure coin-
cidence, then these trends deserve our attention. Quite often, it is the coun-
terexample that can be explained as resulting from a local, language-particu-
lar constraint that impedes the realization of common patterns in a particular
instance.
The following sections mirror the organization of the language-oriented
chapters in this collection, and are devoted to an evaluative overview of se-
lected patterns arising from the discussion of the 27 sample languages.
3. Phonology
Phonology in particular is an area in which borrowings are traditionally
considered to fill so-called “structural gaps”, facilitated especially when
borrowing does not entail changes to the actual phonemic system but merely to
allophonic distribution (cf. Winford 2003: 55–56). The notion of “gap” is
vague, given that languages have long been considered in descriptive linguis-
tics to constitute autonomous, functional systems. We should therefore per-
haps amend the definition to focus on bilingual speakers’ quest for harmony
among the two (or more) systems that constitute their linguistic repertoire;
absence of harmony as a result of absence of a phoneme in one of those sys-
tems is presumably what is meant by a “gap”. There is a functional motivation
favouring consistency in the types and points of articulation as well as the
distribution rules of allophonic variation, regardless of the speech situation
in which language users find themselves, and hence pressure toward conver-
gence of the two phonological “systems”. At the same time, social norms and
awareness of identity and loyalty toward the group associated with the home
language will counteract levelling. The process of phonological borrowing is
the outcome of compromises between these two pressures.
Our sample shows three different types of change: (1) Incorporation of
phonemes from a contact language in loanwords, (2) adjustment in the ar-
ticulation of a phoneme following the model of the contact language, and
(3) incorporation of a borrowed phoneme into the system of inherited words
(substituting an inherited phoneme in some words, though not necessarily in
the system as a whole). Changes of the second type may lead to simplifica-
tion of the system, or to its enrichment through new distinctions, or they may
simply alter the nature of certain phonemes, leaving the complexity of the
system as a whole intact. In the sample, changes of the first type typically add
to the phoneme inventory, as do in most cases changes of the third type. On
the whole, then, our observation is that language contact in the cases under
scrutiny here typically leads to an enrichment of the phonological system.
Another general observation is that contact-related change is more likely
to affect consonants than vowels; indeed, we may even be able to postulate
an implicational hierarchy of contact-related change:
(1) adoption of new consonants > adoption of new vowels
The reason behind this hierarchy is, however, likely to be rather trivial: It is
a product of the fact that consonant inventories are generally larger, and so
the potential for lack of overlap between consonant systems in contact is
higher, resulting in greater pressure to adjust the consonant system. In fact,
the hierarchy under (1) need not at all suggest that contact-induced change in
phonology begins with consonants, and it is not impossible that vowels are as
prone, or even more prone to change in situations where there is no significant
difference among the languages in the inventory of consonants.
Almost all languages in the sample incorporate loanwords along with at
least some of their original phonemes, which are new to the recipient system.
Examples are Macedonian Turkish /ts/ with Macedonian loans, the Vietnamese
sounds /ʆ/, /f/, /v/, and /z/ with Chinese loans, the Domari pharyngeals /ħ/
and /ʕ/ used in Arabic loans, and Imbabura Quichua /b/, /d/, /g/, /ʋ/, /ʒ/, and
vowels /e/ and /o/ in Spanish loans. This indicates that for speakers of the lan-
guages in question, the integration of lexical loans in an “authentic” manner,
i.e. one that closely replicates their original use in the contact language, takes
precedence over the preservation of the coherent phoneme structures of the
recipient language. The system of the recipient language is adjusted in order
to accommodate loans in an unmodified form. Of the sample languages, only
Mosetén and Biak appear not to extend their phoneme system to accommodate
loanwords, while the overlap between the phoneme systems of Kriol and
Jaminjung does not require Jaminjung speakers to make any special effort in
order to accommodate Kriol loans.
The second type of process is the convergence of articulation modes and
positions, which is often, as Winford (2006) suggests, a process affecting
allophonic variation. There are numerous cases involving the introduction of
allophonic variation, among them the interchange of /dʒ/ and /ʒ/ in Domari based
on the Arabic model, of /q/ and /ɣ/ in Khuzistani Arabic based on the Persian
model, and of /l/ and /r/ in certain positions in North-eastern Neo-Aramaic,
based on the Kurdish model. In some cases, variation leads to a shift in articu-
lation, as in the weakening of /h/ in Macedonian Turkish or the replacement
of /h/ by /ɦ/ in some varieties of Yiddish, of pretone /o/ by /a/ in others, and
a more general shift in this language from dental /l/ to velarized /ɫ/.
From a language user’s perspective, all these are instances of harmoniza-
tion of articulatory patterns, aiming to ease the burden of having to maintain
complete separation of two distinct systems in different settings of conver-
sational interaction. In some cases, phonemic distinctions in the recipient
language are even given up in order to enable harmonization. Some varieties
of Purepecha for instance lose the opposition between retroflex /ɽ/ and flap
/r/, as well as between central /ï/ and front /i/, resulting in a vowel system that
matches that of Spanish. In other cases the system becomes more complex, as
with the introduction of pharyngealized consonants into Domari from Arabic
and into Tasawaq from Tuareg, of long vowels into Rumungro from Hungar-
ian, and of palatalization of stops into Kildin Saami and Yiddish from Slavic
contact languages.
Informal observations lead us to believe that prosody is a domain of phon-
ology that is particularly prone to contact. This can be the result of two inter-
connected factors. The first is the peripheral role that prosody has in convey-
ing meaning, and the fact that it is a form of expression of emotive modes,
operating at the speech act and utterance level, rather than the word level. This
allows speakers to mentally disconnect prosody more easily from the matter
or shape of words associated with a particular language, making it prone to
change and modification in contact situations. The second factor may be the
proven neurophysiological separation between prosody and other aspects of
speech production, making prosody more difficult to control. Both factors
may contribute to the fact that foreign “accents” are most persistent in the
area of prosody.
We have in our sample little data on prosody, however. Several pairs of
languages are reported to share prosodic features with their current contact
languages, confirming the above hypothesis: They include Domari, Nahuatl,
Rapanui, Rumungro, and Indonesian, and to some extent at least also Kurd-
ish, some varieties of Yiddish, Hup, and Kildin Saami. This is already more
than can be said about any other area of phonology, where we find borrowings
and convergent tendencies, but no wholesale convergence.6 Hence, we might
carefully postulate the following frequency-based hierarchy of likelihood of
complete convergence in the phonological system:
(2) prosodic features > segmental phonological features
Once again we need to emphasize that this hierarchy does not suggest that
segmental phonological features are unlikely to be borrowed unless prosodic
features are also borrowed; it merely reflects the tendencies toward full-scale
convergence of the systems. In fact, it does not seem possible at this stage to
point to any position within the phonological system (e.g. certain articula-
tory modes or positions, marked features, etc.) as being particularly prone to
contact-induced change. It seems that the details of phonological change are
entirely a product of the relations among the two systems – or congruence –
and any statistics of change are likely to simply reflect the mere likelihood of
the two phoneme systems in contact to share certain phonemes, and to differ
with respect to others. The one additional generalization that we can make is
that the borrowing of phonemes begins with the borrowing of lexical items
that contain them:
(3) phonological features in loanwords > independent phonological features
Concluding this section, it seems that there are two alternative strategies
that multilinguals can pursue in respect of phonology, taking for granted that
language contact will lead at least to a transfer of lexical items from one lan-
guage to another. The first is to maintain the complete integrity of the recipi-
ent language system by adjusting the phonology of any borrowed word to
match that of the recipient system. It would appear that this strategy would
be facilitated by widespread monolingualism in the recipient language, and
the confinement of bilingualism to just a small or peripheral group of inter-
mediaries. It is also possible that this strategy can be maintained for a while
in situations where widespread bilingualism is a relatively new phenomenon,
or where speakers of the recipient language have no need to appear to have
native-like or even good command of the contact (donor) language. In our
sample, only Mosetén and Biak appear to adopt this kind of strategy.
The alternative, which is the route taken by most of our sample lan-
guages, is to maintain the authenticity of donor language items by adjusting
the phonological system of the recipient language to accommodate phono-
logical features of the donor language. This would seem to be facilitated by
widespread bilingualism and the need for speakers of the recipient language
to gain the approval of the donor language community. Authenticity in the
pronunciation of loanwords is a token of the social value attributed to the
donor language, and is emblematic of the social immersion into the donor
language culture. The result is the incorporation of phonological features
from the donor language into the recipient language. These will at first ac-
company loanwords from the donor language. With increased multilingual-
ism and the need to operate regularly in two linguistic environments, it is
advantageous for speakers of the recipient language – we assume that this
is normally the language that occupies the weaker socio-political position,
or a minority language – to allow major components of their phonologic-
al system to converge with that of the dominant, donor language, and so to
rid themselves of the burden to maintain a separation of their two speech
modes. This tendency of course competes with the sense of loyalty toward
the group-language, which may favour maintaining an old system and limit
innovation. The outcome is often a compromise in the form of an adjustment
of certain aspects of the phonological system in favour of common patterns,
or the adoption of some features but not others from the donor language, as
we saw above.
4. Typology
A number of languages show signs of movement between morphological
types: There are changes from polysynthetic to less polysynthetic structuring
in Nahuatl and Imbabura Quichua, Otomi, and Guaraní, from an agglutina-
tive type to a more isolating type in Indonesian, and from an agglutinative
to a more analytic type in Purepecha. An increase in reliance on reduplica-
tion is found in (agglutinative) Likpe under the influence of (isolating) Ewe.
On the other hand there is some acquisition of agglutination in Hup, and in
some traits perhaps also in Kurmanji Kurdish as well as in Rumungro: In
Kurmanji, the agglutination of case markers benefits from the presence of
inherited enclitic case markers, which historically form part of a
circumposition.7 In Rumungro, the adoption of agglutinative prefixes is a by-product
of the almost wholesale adoption of indefinite markers, superlative markers,
and a few other morpheme classes that are high on the “relevance” scale and
so easily borrowable on functional grounds.
It is noteworthy that none of these developments seems to follow any pre-
dictable structural path, and the only common denominator is an accommo-
dation to the patterns of a socially dominant contact language. In all cases,
the drift begins in individual constructions such as adjective comparison or
case marking, and it is yet to be seen whether it will continue to spread.
Frequencies and the evaluation of general trends are not applicable to our
sample in the domain of typology, as some language pairs happen to belong
to similar types (consider Domari and Arabic, Vietnamese and Chinese, Yid-
dish and Slavic languages), while others, among them those named above,
show typological clashes with their contact languages. It is interesting that
Macedonian Turkish maintains strong morphological agglutination despite
considerable re-structuring in the domain of clause organization; indeed, the
loss of converbal morphology in favour of grammaticalized conjunctions
might be considered a small, yet not insignificant step in the direction of
a typological drift. On this basis, we might conclude that typological drift
begins at the clause level. In Khuzistani Arabic, the reinterpretation of the
construct state ending and definite article as equivalents of the Persian ezāfe
attribution marker does not constitute a drift in morphological type as such,
but it does expose the path taken toward morphological re-analysis, here too
in a possessive construction at the phrase level.
Perhaps one of the more outstanding typological shifts reported in the
sample is the change in alignment in North-eastern Neo-Aramaic, modelled
on the Kurdish ergative construction. An interesting aspect of the construc-
tion is that it does not mirror the complete ergative formation – in the ab-
sence of, for example, nominal case in Aramaic – but chooses instead only
a number of pivotal features which it reconstructs with inherited means (see
also Matras and Sakel 2007). Alignment is contact-sensitive elsewhere, too:
We find an expansion of ergativity in Urban Manange in contact with Nepa-
li, and the incipient loss of ergativity in Kurmanji in contact with Turkish.
In conclusion, although we do not have cases in our sample that display
far-reaching changes in overall morphological typology, it is evident that
morphological type is not immune from contact-induced change.
It would seem fair to state, at least cautiously, that there is by and large an
opportunist motivation for typological drift: It is subordinate to pressure
toward convergence in a particular salient construction, such as the possessive
construction, clause linking, or indefiniteness marking, or else it is triggered
by accidental similarities in the shape and position of markers with similar
meaning.
5. Nominal structures
This is a diverse and complex domain, containing many different
sub-categories, and it is perhaps not a surprise that only two languages in the sample,
Jaminjung and Biak, are reported to show no contact influence at all under
this heading. A prominent sub-domain of nominal structure is case, but here
it is noteworthy that no borrowing of bound case markers is attested in the
sample. The closest evidence of contact influence on case markers is the reli-
ance in some contact varieties of Kurmanji on postposed markers as enclitics
rather than as components of circumpositions. There are also indications of
meaning extension of case markers, such as the ablative becoming genitive
under Dutch influence in Indonesian, or the loss of a distinction between
comitative and instrumental in Imbabura Quichua.
Adpositions on the other hand do show matter-replication; attested cases
include Indonesian sama ‘with’ and guna ‘for the purpose of’ from Sanskrit,
Spanish de in Guaraní, numerous Arabic prepositions in Domari, Spanish por
and para in Purepecha, and Tuareg ámmàs ‘inside’ and àláqqàm ‘behind’ in
Tasawaq. The preposition indicating ‘between’ is the most frequently bor-
rowed, examples being Indonesian antara from Sanskrit, Spanish entre in
Guaraní and other languages, and Arabic bēn in Domari. This gives some
vague evidence in support of the hierarchy proposed by Elšík and Matras
(2006) for the borrowing of local relations expressions in Romani dialects:
(4) peripheral local relations > core local relations
“Core” relations (‘in’, ‘at’, ‘on’) are borrowed less frequently than “peripher-
al” relations (‘between’, ‘around’, ‘opposite’), and this finds some support in
the appearance of ‘between’ as the most frequent borrowing in the sample.
Developments affecting gender marking include a shift in unmarked gen-
der from feminine to masculine in Mosetén, the loss of neuter gender in NE
Yiddish (as in the contact languages Lithuanian and Latvian), and the incipi-
ent system of nominal classifiers in Hup (classifying inanimates by shape,
and animates by gender), adopted from Tukano. Definitely the most extensive
development in this domain is the borrowing of Chinese classifiers into Vietnamese.
There are, in addition, some marginal phenomena such as the loss of
gender in pronouns (in Rumungro as well as in North-eastern Neo-Aramaic).
Our sample gives us the impression that gender in the narrow sense (of a
two- or three-gender system) is more stable in contact situations than more
differentiated classifier systems, where influence might be more extensive.
Nominal possession is a domain in which contact phenomena are fairly
widespread. The most common change to possessive constructions is a modi-
fication of word order, often drawing on existing flexibility and enhancing the
frequency of a more peripheral pattern to match that of the contact language.
Examples of contact phenomena in possession are found in Domari, Mace-
donian Turkish, Rumungro, Khuzistani Arabic, Guaraní, and North-eastern
Neo-Aramaic, concerning mainly the order, and in some cases the distribu-
tion of morphs and their meaning; incipient cliticization of the postposed pro-
nominal possessor in Kurmanji; frequent postpositive possessors in Rapanui;
use of a preposition bearing the same meaning as in the contact language in
Indonesian; and use of a borrowed preposition in Guaraní.
Definiteness is known for its areal diffusion, but in the sample we have
only a few cases of contact developments in this domain. Rapanui illustrates
that re-arrangement of definiteness rules may occur when definite markers
exist in both systems prior to contact. Khuzistani Arabic shows selective re-
treat of definiteness marking in some constructions, in contact with a lan-
guage with no overt marking of definiteness. Interestingly, North-eastern
Neo-Aramaic borrows the Kurdish definite article -ak-.
Nominal morphology is most frequently replicated in the case of plural
markers, which are often maintained in loan nouns, either as productive
markers of plurality, as in Yiddish, Rumungro, and Tasawaq, or in conjunc-
tion with a native expression of plurality, as in Domari and Quichua. In Likpe
we find replication of a specific plurality marking pattern. Plural markers
may be said to occupy a position in between derivation and inflection mark-
ers. On the one hand, they are potentially linked to the expression of plural
agreement elsewhere in the sentence, and so they operate at the level of the
sentence rather than just the word. On the other hand, they indicate clear se-
mantic opposition to singulars at the word level. Morphological plural mark-
ing thus meets the criteria for semantic transparency which is so often noted
as a factor facilitating morphological borrowing (Moravcsik 1978, Matras
1998, Field 2002, Winford 2003). The direct borrowing of derivational mor-
phemes is attested throughout the sample. Macedonian Turkish, Yiddish, Qui-
chua, Purepecha, and Rumungro all borrow diminutive suffixes (with Kildin
Saami re-organizing its diminutive derivation based on a Russian model), and
Quichua and Yiddish borrow agentive suffixes (Quichua -dur from Spanish,
Yiddish -nik from Russian).
The overall pattern leaves us with a picture that is not incongruent with
that reported on in contact linguistics so far. The most widespread changes are
in the possessive construction. They affect the nominal phrase at the syntac-
tic or morphosyntactic level, having to do primarily with the position of the
possessor and possessed object, and partly with the arrangement of posses-
sive morphology. This is in line with predictions that phrase-level borrowing
will be more intense than word-level borrowing. Borrowing of bound mark-
ers favours in particular plural markers, diminutive and agentive derivational
markers, and classifiers (but not gender markers), confirming that semantic
transparency facilitates borrowing. Adpositions are more borrowable than
bound case markers (borrowing of which is not attested in the sample), with
‘between’ being the most borrowable in our sample, confirming the tendency
of borrowing to favour peripheral relations, and so for the process of conver-
gence to begin with remote, cognitively less accessible or conceptually more
complex domains. In other domains, such as the distribution of case, defin-
iteness, or gender assignment, languages may develop similarities, often by
extending or limiting distributional rules. However, bound case and gender
markers remain on the whole among the most stable features in the nominal
domain, resisting especially direct replication of matter.
6. Verbal structures
Little attention has been granted in the literature to borrowing of features be-
longing to the domain of verbs (on the borrowing of lexical verbs see below);
reports on the borrowing of TMA markers are quite rare. It is useful to con-
sider the categories one by one. In the domain of tense, we see contact-in-
duced similarities in the organization of the future tense in several languages:
It is lacking in Domari and Arabic, it is suffixed in Hup and Tukanoan, and it
shows a similar periphrastic structure in Kildin Saami and Russian. To this
we might add the similar organization of the prospective aspect in K’abeena
and Amharic.
Contact phenomena appear to be somewhat more frequent in aspect and
aktionsart, where we find matter replication as well as shared patterns. Domari
uses an Arabic habitual auxiliary kān, and Nahuatl introduces a progressive
based on the Spanish model. We find aspectual use of the borrowed completive
ya from Spanish in Guaraní, and sudah from Sanskrit in Indonesian, as
well as similar expressions of experiential perfect in K’abeena and Amharic,
while Likpe adopts a periphrastic present progressive similar to Ewe, and
North-eastern Neo-Aramaic adopts a present progressive particle-turned-
prefix based on a Kurdish model.8 Yiddish re-organizes its verbal prefixes to
replicate so-called Slavic Aspect distinctions (essentially, grammaticalized
aktionsart), and Rumungro shows iteratives that are calqued on Hungarian.
Mood is similarly somewhat richer in contact developments, though dif-
ferences in the structural organization of mood and modality make an exhaus-
tive comparison somewhat difficult. Noteworthy are similarities in the use of
the subjunctive in North-eastern Neo-Aramaic and Kurdish, and Domari and
Arabic (cf. Matras and Sakel 2007: 843). Vietnamese prohibitive and con-
ditional markers are borrowed from Chinese, while Kurmanji borrows the
Turkish conditional marker -ise. Modality shows the most widespread con-
tact phenomena, especially as regards matter replication. Almost half of the
sample languages show matter replication of modality markers, such as Turk-
ish gerek ‘must’ in Kurmanji, Arabic lāzim ‘must’ in Domari, Spanish tiene
que ‘must’ in Rapanui, Arabic mungkin ‘can’ in Indonesian, Spanish pudi
‘can’ in Quichua, Malay harus ‘must’ in Biak, and more. The most common
are markers of obligation (i.e. expressing external forces), followed in turn
by necessity, possibility, ability, and desire. This hierarchy is almost always
implicational, the only exception being Domari (which borrows markers for
all meanings except ability):9
(5) obligation > necessity > possibility > ability > desire
The hierarchy proceeds from the most intensive external force, to the most
participant-internal dimension. It is identical to the hierarchy identified by
Elšík and Matras (2006) for the borrowing of modality markers in Rom-
ani dialects: necessity > ability > (inability) > volition. The more abstract
theme in this hierarchy might be described as the degree of “speaker control”,
low speaker control correlating with high borrowability.
As far as Domari is concerned, its minor deviation from the implicational
hierarchy can be explained by the fact that all the modality markers that it
borrows from Arabic are impersonal expressions, or non-verbs. Even bidd-
‘want’, is a nominal, and its person inflection in Arabic follows the paradigm
reserved for nominals, i.e. “my-wish” etc. (and this inflection is carried over
into Domari as well). The Arabic expression for ability, however, ‘a-qdar-,
is an inflected verb, and although Arabic verbs are borrowable in Domari,
it competes with an inherited Domari verb sak-, which prevails. Thus, it is the
formal inconsistency in the system of the donor language which in this case
imposes a constraint that breaks the hierarchy.
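
What it means for a hierarchy such as (5) to be implicational, and where Domari departs from it, can be made concrete with a small sketch. The Python below is an illustration only and not part of the study's method: the function name find_violations and the encoding of Domari's borrowings as a set are assumptions introduced here, based solely on the statement above that Domari borrows markers for all meanings except ability.

```python
# Hierarchy (5), ordered from most to least borrowable category.
MODALITY_HIERARCHY = ["obligation", "necessity", "possibility", "ability", "desire"]

# Assumed encoding of the Domari case as described in the text:
# markers are borrowed for every meaning except ability.
DOMARI_BORROWED = {"obligation", "necessity", "possibility", "desire"}


def find_violations(hierarchy, borrowed):
    """Return categories that are borrowed although a higher-ranked one is not."""
    violations = []
    gap_seen = False  # becomes True once a higher-ranked category is not borrowed
    for category in hierarchy:
        if category in borrowed:
            if gap_seen:
                violations.append(category)
        else:
            gap_seen = True
    return violations


print(find_violations(MODALITY_HIERARCHY, DOMARI_BORROWED))
```

Under these assumptions the function flags ‘desire’, since it is borrowed while the higher-ranked ‘ability’ is not; a language that fully conforms to the hierarchy yields an empty list.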
Returning now to a general view of TMA and modality, we have seen the
high density of (matter) borrowing in the domain of modality, in some cases
also in mood, frequent matter and pattern replications in the area of aspect
and aktionsart, and few cases of pattern replication in tense, all involving the
future. This picture lends itself to an interpretation in terms of the hierarchy
in (6), which depicts the likelihood of the respective categories to be affected
by contact:
(6) modality > aspect/aktionsart > future tense > (other tenses)
By and large, this hierarchy reflects both frequency, and implicational rela-
tionships. There is one case in the sample – Kildin Saami – where there ap-
pears to be contact influence in the arrangement of tense, but not in modality.
Yiddish might be considered a case for contact influence on aktionsart, but
it similarly lacks borrowed modal verbs from its Slavic contact languages,
though to some extent this might be explained by the presence in this domain
of Hebrew loans acquired through literary tradition.
The internal rationale of the hierarchy – which once again resembles the
findings for Romani dialects (cf. Matras 2002) – leads us to postulate again
that external circumstances that limit the degree of speaker control – mood
and modality in general – are the most contact-sensitive. They are followed by
a qualification of the internal structure of the event – aspect and aktionsart –
these too being beyond the immediate control of the speaker. Only then do
we find contact influence in tense, the most intimate relationship between the
event and the speaker’s own perspective, though it is noteworthy that in our
sample this is limited to the future tense, which identifies the event as being
least stable and secure from the speaker’s perspective. The overall theme is
therefore once again the speaker’s epistemic authority; its absence or weak-
ening correlates with high borrowability.
Existential and possession verbs are affected by contact in several of the
sample languages. In Domari, the Arabic copula is adopted in its function
as a periphrastic expression of the habitual aspect, and it co-exists with the
Domari enclitic copula. But Rapanui uses Tahitian and Spanish forms as copula,
and both Indonesian and North-eastern Neo-Aramaic are reported to
have developed copula forms through imitation of forms in their contact lan-
guages. The borrowing of ‘have’ is on the whole marginal. Spanish tengo is
employed in Rapanui, while in some other languages, such as Hup, we find
calquing of constructions.
Contact phenomena in the area of voice and valency are almost exclusive-
ly pattern-oriented, and usually involve an increase in frequency distribution
of an existing option: analytic reflexives in Quichua, the periphrastic passive
in Purepecha, a causative marker derived from the verb ‘to do’ in Manange
(copying the function of the Nepali affix -aau), and productive use of the
inherited morphological causative in Rumungro. Vietnamese is exceptional
in directly borrowing reflexive and passive markers from Chinese. Recent
contact-induced grammaticalizations lead to the emergence of causative, pas-
sive, and reflexive markers in Hup, and to a reflexive in Likpe. It is difficult
to further sub-divide this domain into category components, or to draw clear-
cut connections with other components of the verb, other than to say that
derivation generally appears much more contact-susceptible than tense, and
perhaps at the same level as aspect. Both are in a sense statements about the
internal organization of events, not directly connected to the speaker’s pos-
ition, but with no explicit evaluative statement concerning the truth-relevance
or factuality of the event (as in mood or modality), either.
Moravcsik (1975) had drawn attention to the frequent use of incorporation
strategies to accommodate borrowed verbs, and this typological discussion
of verb borrowability has recently been revived (cf. Wichmann and Wohlge-
muth, forthc.). There appears to be a near-consensus view that the borrowing
of verbs is not, of course, impossible, but made more cumbersome in some
languages due to the widespread tendency of verbs to be morphologically
more complex (see Winford 2003: 52). In our sample, direct borrowing of
verbs, without any formal adaptation, is found in Vietnamese, where there
is no morphology in either the recipient or the donor language Chinese; in
Likpe, where isolating Ewe contributes verb roots into an agglutinative struc-
ture; but also in many other languages, including Tasawaq, Quichua, Otomi,
Guaraní, Hup, and K’abeena. There is thus obviously no universal constraint
on the integration of borrowed verbs.10
Nonetheless, several languages in the sample prefer to apply an overt,
morphological accommodation strategy when incorporating verbs of foreign
origin into the lexicon. A favourite strategy is the use of so-called “light”
verbs. Macedonian Turkish, Kurmanji, Domari, and Khuzistani Arabic be-
long to a larger isogloss covering the Caucasian–Mideastern–South Asian
area, where mainly two light verbs are used, each combining with a root or
nominal form of the borrowed verb. The distinction between the two light
verbs is usually one of valency, and they usually derive from or are identical
to the lexical verbs for ‘to make/ to do’ and ‘to be/ to become’. In Domari, the
carrier verbs are semi-grammaticalized, and full forms -kar- ‘to do’ and -hr-
‘to become’ co-exist with the abbreviated integration markers -k- and -Ø-. A
somewhat similar strategy is found in Mosetén, where one of the verb adap-
tation markers is also a valency-augment, in Yiddish, where Hebrew-derived
verbs are accompanied by zayn ‘to be’, and in Jaminjung, where loans have
coverb status and are always used in combination with a native inflected verb.
A series of other languages use a verbalizing augment which is otherwise
employed to derive verbs from non-verbs as an integration marker: Mosetén,
Nahuatl, Indonesian, Guaraní, Biak, Purepecha, Manange, Rumungro, and
Yaqui. The borrowed verb itself usually appears in either the root form, an in-
finitive form, or an unmarked inflected form, quite often – for Spanish verbs,
especially – the third-person singular present.
While no constraint on the borrowability of verbs can be upheld, it is
nevertheless evident that a large number of languages require greater gram-
matical effort in integrating verbs than for the integration of nouns. The bor-
rowing hierarchy
(7) nouns > verbs
expresses the grammatical “ease” or simplicity with which elements belonging
to these two word classes can be integrated. Why is it that verbs require
greater morphological integration effort, and what does this greater effort
represent? On the one hand we find a pair of morphologically isolating lan-
guages like Vietnamese and Chinese, with no morphological complexity sur-
rounding verbs in either the recipient or the donor language, and where verbs
are integrated in a straightforward manner, just like nouns and other parts
of speech. It is difficult, however, to attribute the need for explicit loan-verb
adaptation markers even in other languages to the morphological complexity
of verbs alone. Recall that most languages tend to integrate simple forms of
the verb, such as the root, the infinitive, or an unmarked form. There does not
seem to be, in those languages, any difficulty in stripping the target verb to its
bare lexical essentials, before transposing it into the host morphology.11
I suggest instead that the difficulty lies in the conceptual complexity of
the verb, and the fact that when borrowed and integrated, the verb is expected
to perform two operations: The first is to serve as a referential lexical item –
a content word, not dissimilar to a noun, adjective, or descriptive adverb. The
second is to initiate the predication and so to serve as the principal anchor
point for the entire proposition of the utterance. This latter function constitutes
its “verbness”. It appears that borrowing of verbs is motivated by a
similar need for modifying the inventory of lexical-referential expressions as
the borrowing of nouns (and no doubt various specific semantic motivations
could be postulated for groups of lexical content words). Speakers thus allow
the lexical component of the verb to “cross” the mental demarcation bound-
ary between languages, i.e. they license themselves to employ the same ac-
tion/event signifier in any speech interaction. The bare lexical stem, however,
is not always sufficient in order to assume the role of predication-initiator. A
great number of languages therefore require this additional, crucial function
to be explicitly marked out in the verbal expression; in other words, they need
to transform the strictly “lexical” depiction of an action/event into a predi-
cate. This is achieved through explicit marking of its “verbness”.
A more detailed study is required in order to ascertain the conditions under
which languages require some form of loan-verb integration. A first, banal
observation is that a pre-requisite for the employment of loan-verb adapta-
tion markers is the availability in the recipient language of a morphological
procedure to derive verbs from non-verbs. Whereas an isolating language like
Vietnamese may rely to a considerable extent on the pragmatics of morpheme
juxtaposition as a way of (indirectly) marking out word classes, flectional lan-
guages will often require an additional means of identifying derived verbs.
This, of course, only explains a part of the story. It is clear that the simi-
larities in loan-verb adaptation strategies found among the languages of the
Caucasian–Mideastern–South Asian area are as much areally motivated (i.e.
through contact and imitation among the languages) as they are function-
ally motivated by the respective morphological structures of these diverse
languages. Moreover, not all languages that possess verb derivation strat-
egies employ them with loan-verbs. And finally, languages are known to have
changed their loan-verb adaptation strategy over time, without adopting any
significant changes to their morphological typology. Thus, Romani appears to
have shared the “light verb”, valency-marking strategy of loan-verb adapta-
tion during its early, Byzantine period, with languages of the Caucasus–Ana-
tolia–South Asia “area” (see Matras 2002). It then transferred the function of
marking out loan-verbs to a set of Greek-derived aspectual markers. Finally,
in some contemporary dialects of Romani, such as Sinti, Kaale, and Vlax
(spoken in or around Germany, Finland, and Romania/Hungary, respective-
ly), loan-verb integration markers are being reduced altogether. For the time
being, our principal conclusion can be that the hierarchy depicted under (7)
applies for those cases where integration of a lexical item requires morpho-
logical support through derivational means.
7. Other parts of speech
The present section gives an overview of those elements often summarized
together as “function words” or “unbound grammatical lexemes”. Lumping
them together in one section is a matter of convenience, and follows, as did
the previous sections, the structure of the Language Convergence and Bor-
rowing questionnaire, which forms the basis of the descriptive chapter. By
discussing them under a shared heading, I am not suggesting at this stage that
they share properties that motivate borrowing, nor that their behaviour in con-
tact situations should, for any reason, be uniform or even similar at all.
7.1. Numerals
Numerals, in fact, are often considered low on the borrowing scale. This may
derive from an assumption that all languages have some form of quantifica-
tion. Although it is now known that not all languages possess systems for
counting discrete entities, it is not necessarily the clash of systems of quanti-
fication that provides the motivation for the borrowing of numerals. Several
types of borrowing involving numerals can be identified. Pattern replication
appears in some languages: A decimal system is reported to have been adopt-
ed as a result of contact in Mosetén, Indonesian adopts a Javanese tag-lexeme
indicating “teens” and re-organizes its earlier system of numeral juxtapos-
ition above 10 accordingly, Hup adopts the Tukano quintenary system for
numerals between 5 and 20, and combination lexemes (‘ten-and-one’) replace
single lexemes (‘eleven’) in some varieties of Kurmanji, replicating the
Turkish arrangement. This rather small group suggests the following implica-
tional hierarchy representing the likelihood of pattern-replication in numer-
als, which is yet to be confirmed by a larger sample:
(8) over 10 > below 10
More than two-thirds of the sample languages show some form of direct
matter-replication of numerals. This includes most of the languages that re-
organize their pattern of numerals, and which often employ borrowed numer-
als alongside the re-modelled “internal” or inherited system. In some cases,
numeral replication is subject to sociolinguistic constraints, with contact-lan-
guage numerals used as the preferred system for formal purposes such as cit-
ing dates and addresses and performing even simple mathematical tasks such
as counting (as opposed to the casual use of numerals as attributes), in transactions
involving money, in names for coins or banknotes, or in some cases
in the citation of grades. Such contextual splits are described for the use of
Turkish numerals in Kurmanji, for the use of Hokkien numerals in colloquial
Indonesian, for dates in Rumungro, and for Chinese numerals in Vietnamese,
leading us to postulate a sociolinguistic hierarchy for the likelihood of use of
borrowed numerals:
(9) more formal contexts > less formal contexts
This hierarchy reflects the fact that numerals enter languages through the
dominance of the second language in formal and business transactions, and
through education and other forms of institutional discourse. In many of the
sample languages, especially those in post-colonial contexts, knowledge of
the indigenous system of numerals is reported to be in decline, and the young-
er generation shows a clear preference for borrowed numerals. The adoption
of borrowings in such situations clearly favours higher numerals over lower
numerals, allowing us to postulate the following implicational hierarchy for
the borrowing of cardinal numerals:
(10) higher numerals 1000, 100 > above 20 > above 10 > above 5 >
below 5
This hierarchy appears related to some of the hierarchies postulated above,
where borrowing is facilitated around conceptual complexity and inaccessi-
bility. At the same time, higher frequency in casual language use of lower nu-
merals is clearly a factor supporting the retention of native forms. Languages
for which speakers are reported to be using native forms for lower numerals
under 5 – either primarily, or alongside borrowings – but mainly or exclusive-
ly borrowings for numerals above 5 include Domari, some Kurdish speakers,
Jaminjung (which has no native numerals above 3), Tasawaq, Otomi, Guaraní,
Purepecha, Yaqui, Kildin Saami, and we might add Rumungro, where 7–9 are
Greek loans into Early Romani, and 6 has been argued to be an early loan
from Dardic (Indo-Iranian frontier languages). For some languages, such as
Mosetén, Quichua, Nahuatl, and Biak, the cut-off point tends to be 10, while
Hup shows a split at 20. Higher numerals show an independent susceptibility
to borrowing. Tasawaq for instance borrows its lower numerals from Arabic,
but its word for 100 from Tuareg; Rumungro generally shows Greek borrow-
ings, but 1000 is Hungarian; and Vietnamese uses a Chinese word for 10,000.
Somewhat paradoxically, “0” ranks closer to the higher numerals 100, 1000,
and so on: the only K’abeena borrowing is zeeruta, from Italian via Amharic,
while in Rumungro it derives, like the higher numerals, from Hungarian. This
is not surprising, and shows that cognitive complexity in the counting system
operates in respect of the ability to easily identify and appreciate a quantity.
This is hindered the greater the quantity, but it is similarly hindered in the
absence of any quantity at all. An additional factor that no doubt plays a role
is the relative formality of the term “zero”, which is associated with math-
ematical and other formal notations and transactions, but not with everyday,
casual expression of “nothingness”.
Sample languages that do not borrow numerals are Macedonian Turk-
ish, Khuzistani Arabic, Yiddish, and Manange (though incipient influence
of Nepali on the numeral system is reported). In all but the latter, we can at-
tribute the stability of numerals to a firm tradition of native-language educa-
tion, media, and literacy, if not widespread among all speakers, then at least
firmly anchored in the community and its history. This confirms once again
that the borrowing of numerals is motivated not necessarily by “gaps” in the
system of counting, but by a much more general accommodation to the lan-
guage of formal institutions and the public domain in the way of conceptual-
izing and expressing formal transactions surrounding quantification.
The hierarchies presented in (9) and (10) are fully in line with the obser-
vations described for Romani dialects by Matras (2002) and by Elšík and
Matras (2006), which lends support to their validity as universal indicators.
A tentative case can be made for the following hierarchy of the likelihood of
borrowing of ordinal numbers:
(11) lower ordinals > higher ordinals
This hierarchy is presented by Elšík and Matras for Romani. In the present
sample it is confirmed by Kildin Saami and Rumungro, which use borrowed
ordinals for ‘first’,12 and Western Neo-Aramaic, which uses Arabic ordinals
for ‘110’, while Domari, Otomi and Purepecha generally rely on borrowed
ordinals. Note that English (not part of the sample) is an exception to the hier-
archy, having borrowed second but not first from Romance.
The ordinal ‘first’ is often a separate word, quite often suppletive to the rest
of the ordinal paradigm. In some languages, this is also true of ‘second’. This
structural conspicuousness could be a factor promoting borrowing. In the
assessment of Elšík and Matras (2006) the high borrowability of lower ordi-
nals is a direct factor of this universal tendency to prioritize the ordinal ‘first’
through lexical suppletion, which in turn is an expression of its cognitive
saliency. Borrowing therefore simply follows the same path; in other words,
the search for a renewable (=suppletive) item as an outstanding marker of the
pragmatic saliency of “firstness” exploits the bilingual situation in recruiting
an item from the contact language. We might therefore, in respect of ordinal
numbers, postulate the following hierarchy of borrowability
(12) exclusivity > inclusivity
where “exclusivity” is taken to mean the separation of a single concept, entity,
topic or state of affairs from a larger set – here in relation to the order of
prominence, such as temporal sequence, or the attention granted to the ob-
ject. Conceptually, this hierarchy is well in line with the contact-susceptibil-
ity of such properties as condition (see above), privative ‘without, instead of,
except for’ (cf. Elšík and Matras 2006), contrast, phasal change (‘already’),
restriction (‘only’), and the superlative (see below), all of which denote a
broken chain of expectations, singling out or delimiting one entity from a
presuppositional set.
7.2. Pronominal forms
The borrowing behaviour of so-called “pronouns” illustrates how limited a
wholesale structural approach to category borrowing can be, and how it is the
functionality of categories that motivates borrowing. Only Indonesian shows
borrowing of personal pronouns, from Sanskrit and Javanese, into a system
of highly differentiated, lexicalized terms of address and reference. Other
contact-developments in pronouns are limited to the organization of the system
of reference: Imbabura Quichua is reported to have developed a polite
form of the second-person pronoun kikin on the basis of Spanish Usted, and
in some heavily Hispanicized varieties of Guaraní, the inherited distinction
between inclusive and exclusive is dissolved. Borrowing of other deictic and
anaphoric forms is limited. Spanish la is used as an anaphor in Guaraní, the
Arabic resumptive pronoun iyyā- is used in Domari in relative clauses, and
Rumungro borrows the Hungarian deictic prefixes am- and ugyan- which are
combined with Romani deictic stems.
Reflexive pronouns are borrowed in Tasawaq, Western Neo-Aramaic,
and Rumungro. Yiddish shows an extension of reflexivity based on a Slavic
model, and Indonesian calques a reflexive apparently on a Sanskrit model.
Reciprocal pronouns are borrowed in Domari and Western Neo-Aramaic,
and Rumungro calques a Hungarian model. What motivates the borrowing of
reflexive and reciprocal forms? Unlike straightforward deictic and anaphor-
ic reference devices, reflexives and reciprocals may be said to constitute an
extension of the derivational system of the verb, contributing to the layout of
actors involved in and affected by an event. They are thus part of a construc-
tion that revolves around the verb’s “actionality”.
Interrogatives are borrowed into several languages. Those that stand out
as more highly borrowable are the interrogatives for quantity (‘how much’),
borrowed into Domari, Otomi, and Manange, and time (‘when’), borrowed
into Domari, Indonesian, Quichua, and Rumungro. Borrowing also affects indefinites.
Domari, Otomi, and Rumungro borrow all or almost all of their indefinite
expressions, while Tasawaq and Purepecha borrow time indefinites,
Guaraní the indefinites for person and thing, and Yiddish borrows indefinite
markers. Although there is no direct, predictable link to other categories, bor-
rowing in the domains of interrogatives and indefinites appears to be a more
“advanced” stage of borrowing among Other Parts of Speech.
7.3. Connectors/conjunctions
The grammatical category that is by far the most susceptible to borrowing
is that of connectors (see already Matras 1998). All languages in the sample
borrow connectors, and the general picture confirms the implicational hier-
archy postulated as universal in Matras (1998), and confirmed by Elšík and
Matras (2006) for Romani (and recently by Stolz 2007 for a number of lan-
guages in contact with Italian):
(13) but > or > and
Sample languages that borrow all three connectors include Domari, Mosetén,
Nahuatl, Kurmanji, Rapanui, Indonesian, Quichua, Otomi, Guaraní, Kildin
Saami, and Western Neo-Aramaic; languages that borrow only ‘but’ and ‘or’
are Tasawaq, Purepecha, Vietnamese, Rumungro, K’abeena, and Likpe. No
languages borrow ‘and’ without also borrowing ‘but’ and ‘or’.
There are, however, a few languages that deviate slightly from the expect-
ed pattern. Macedonian Turkish borrows Macedonian i ‘and’ as well as ili ‘or’
and a ‘or, whereas’, but retains Turkish ama ‘but’; however, the latter is iden-
tical to Macedonian ama, which is a Turkish borrowing (cf. Matras 2004).
Jaminjung uses the borrowed contrastive marker Kriol ani ‘only’ alongside
its native bugu, while only borrowed forms are used for addition and disjunc-
tion; but this is due to the absence of any native connectors for addition or
disjunction to compete with the borrowings. In Manange, Nepali ani ‘and
then’ can also be used for clause coordination (cf. Stolz’s 2007 discussion of
Italian allora). Neither of these cases necessarily contradicts the hierarchy
in (13). Biak, however, is reported to use the Indonesian disjunction marker
atau ‘or’ and less frequently the addition marker dan ‘and’, but no mention is
made of a borrowed contrastive marker. Hup borrows ou ‘or’ from Tukano.
The form is ultimately of Portuguese origin, and is reported to have diffused widely
in the area. Aikhenvald (2002), too, reports on Tariana ou ‘or’, with no bor-
rowing of other Portuguese connectors. Similarly, Yaqui appears to borrow
only Spanish o ‘or’. The fact that counterexamples can be found does not
invalidate the overall observation that contrast is a semantic-pragmatic fea-
ture that facilitates borrowing, nor of course that clause-combining is an op-
erational domain that is prone to contact-related change. Most likely, certain
constraints of a structural and perhaps also a cultural nature (conventions on
structuring discourse and expressing overt contrast) override the universal
tendency in some cases. Noteworthy is the cluster of Amazonian languages
within which Portuguese ou diffuses, often via secondary sources only. Bor-
rowing in the domain of coordinating conjunctions is missing only in Yiddish
and Khuzistani Arabic.
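
The implicational reading of (13) can be made explicit with a short sketch. The Python below is purely illustrative and is not part of the survey's method: the function name is_consistent and the small dictionary of languages are assumptions introduced here, with the borrowed connectors coded from the reports cited above. A language counts as consistent if every connector it borrows is preceded on the hierarchy only by connectors it also borrows.

```python
# Hierarchy (13), ordered from most to least borrowable.
HIERARCHY = ("but", "or", "and")


def is_consistent(borrowed, hierarchy=HIERARCHY):
    """Return True if the borrowed set is a left-to-right prefix of the hierarchy.

    In other words, borrowing any connector implies borrowing every
    connector that stands to its left on the hierarchy.
    """
    gap_seen = False
    for connector in hierarchy:
        if connector in borrowed:
            if gap_seen:
                # A lower-ranked item is borrowed although a higher one is not.
                return False
        else:
            gap_seen = True
    return True


# Illustrative subset only (hypothetical coding of the descriptions in the text).
reported = {
    "Domari": {"but", "or", "and"},
    "Tasawaq": {"but", "or"},
    "Hup": {"or"},
    "Yaqui": {"or"},
}

for language, borrowed in reported.items():
    status = "consistent" if is_consistent(borrowed) else "deviates"
    print(f"{language}: {status}")
```

On this coding Domari and Tasawaq conform to (13), while Hup and Yaqui, which are reported to borrow only ‘or’, are flagged as deviations, matching the counterexamples discussed above.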
Subordinating conjunctions are similarly a frequent target of borrowings.
Among the complementizers, borrowing is almost entirely restricted to those
that introduce factual clauses, which are borrowed in Domari, Khuzistani
Arabic, Rumungro, Western Neo-Aramaic, and Likpe. Although modality
has been shown to be contact-prone, at the level of the organization of the
complex sentence factual complements show greater event independence and
so greater effort is needed in order to process the connection between the two
clauses (see already Givón 1990, Dixon 1995, 2006). Factuality at this level is
thus quite in line, as a factor promoting borrowing, with contrast, limitation,
exemption and other properties that we have so far encountered at the top of
borrowability hierarchies. Also well in line with these tendencies is the high
presence, among borrowed conjunctions that introduce adverbial clauses, of
those that mark concessive relations (borrowed in Yiddish, Tasawaq, Indonesian,
Quichua, Guaraní, Domari, and Likpe), causal relations (Domari,
Mosetén, Nahuatl, Kurmanji, Rapanui, Jaminjung, Tasawaq, and numerous
others that calque causal subordinators), and purpose (e.g. Domari, Mosetén,
Nahuatl). High on the borrowing scale are also conditional subordinators
(borrowed in Domari, Mosetén, Indonesian, Quichua, and Guaraní), while
the borrowing of temporal subordinators is often linked to that of conjunc-
tions expressing purpose and cause. We can therefore postulate the following
tentative borrowing hierarchies:
(14) concessive, conditional, causal, purpose > other subordinators

(15) factual complementizers > non-factual complementizers
Concession, of course, is tightly linked to contrast and unexpectedness. Condition
is an expression of modality. Cause and purpose are both efforts to link
independent events, as are factual complementizers, while cause constitutes
in addition an explicit argumentation effort at the interactional level. The hier-
archies (14)–(15) thus supply us with a series of semantic-pragmatic proper-
ties that are borrowing-prone.
7.4. Particles
Not many languages in the sample borrow phasal adverbs, but those that do
show a clear implicational hierarchy
(16) yet, already > still > (no longer)
confirming that observed in Matras (1998) for Romani, as well as in van der
Auwera (1998) for a sample of European languages. Rumungro and
Domari show borrowings in all positions; Jaminjung borrows ‘yet’ and Guaraní
borrows only ‘already’, while Otomi also has ‘still’. The semantic opposition
involved is one of change vs. continuation, in the first instance. While ‘no
longer’ essentially expresses change, its position on the hierarchy is partly
influenced by its tendency to be composed of several structural elements.
It is therefore the first two positions on the (left of the) hierarchy that are the
most meaningful, and which continue the theme of contact-susceptibility of
contrast and discontinuity of pre-suppositional expectations. Another particle
that shows frequent borrowing is ‘again’ (Domari, Mosetén, Kurdish, Jamin-
jung, Indonesian, Otomi), expressing an unexpected repetition of events.
Half the languages in the sample borrow focus particles, giving the impli-
cational hierarchy
(17) only > too > (even)

once again in line with observations on Romani and other languages in Matras
(1998). In fact, the particles ‘only’ and ‘too’ usually go together, but ‘only’
can be considered higher on the hierarchy, since Indonesian and Western
Neo-Aramaic have borrowed ‘only’ but not ‘too’. Khuzistani Arabic borrows
Persian ‘too’ (hem), but since Persian ‘only’ is itself an Arabic loan (faqat) it
is not identifiable as a borrowing in Khuzistani Arabic. The hierarchy in (17)
indicates that restriction facilitates borrowing, while the proneness of focus
particles (and indeed of phasal adverbs and repetition adverbs) to borrowing
shows the vulnerability of the system of processing states of affairs and atti-
tudes that are high on the relevance scale and that assess information in direct
relation to existing hearer-sided presuppositions and expectations.
Fully consistent with this observation is the overwhelming tendency of
the languages in the sample to borrow discourse markers (once again, cf.
Matras 1998). There are only two languages that do not show borrowing of
discourse markers: Biak and Vietnamese. There is no obvious explanation for
the absence of borrowed discourse markers in these languages, except per-
haps the fact that using native intonation and modal particles is considered a
central characteristic of talking Biak and so an important identity marker,13
and that Chinese influence on Vietnamese was transmitted to a considerable
extent via the formal and literary language, rather than via oral discourse. We
also find less extensive borrowing of discourse markers in languages with a
tradition of native literacy: Yiddish, Khuzistani Arabic, Macedonian Turkish,
and Indonesian. Limited borrowing in this domain is also typical of Manange
and Hup, while Domari, Mosetén, Jaminjung, Guaraní, and Purepecha show
the most extensive use of borrowed discourse markers. It is noteworthy that
Tasawaq has its fillers and discourse markers from Hausa, a contemporary
“pragmatically dominant” language (cf. Matras 1998). On the whole, the fol-
lowing hierarchy (from Matras 1998) of borrowability, both frequency-based
and at least in most cases implicational, could be upheld:
(18) discourse markers > other particles
Question and answer particles must also be considered in this connection.
The former are not a universal phenomenon anyway, and it is not surprising
that they are limited, in our case, to Macedonian Turkish, which borrows
its question particle from Albanian, interestingly replacing a native Turkish
question particle. The borrowing of the positive answer particle ‘yes’ is more
common. It is often employed as a turn-taking particle rather than just as a
signal of agreement with content, and it is perhaps for this reason that it is more
vulnerable than its negative counterpart ‘no’. Rumungro and Mosetén bor-
row ‘yes’ (from Hungarian and Spanish respectively), while both ‘yes’ and
‘no’ are borrowed from Tuareg in Tasawaq, from Arabic in Domari, and from
Spanish in Guaraní. The small sub-sample suggests an implicational hier-
archy for the borrowing of answer particles which agrees with that postulated
for Romani dialects by Elšík and Matras (2006: 343):
(19) positive > negative
7.5. Grammatical vocabulary
A notable gap in the borrowing inventory surrounds expressions of place
deixis, which in our sample appears fully resistant to borrowing. Borrowed
expressions of place are largely limited to the place indefinites ‘nowhere’ and
‘anywhere’, in which the borrowed component may either be the entire ex-
pression or just the indefinite marker (Domari, Otomi, Rumungro). The Yid-
dish presentative ot from Slavic is used in pointing only, and although a deixis
of sorts, it is arguably more a verbalized gesture and hence more closely re-
lated to discourse markers than a member of a deictic paradigm.
Borrowed time expressions encompass both indefinites (‘always’, ‘never’)
and deixis (‘now’, ‘then’). Here, both classes are subject to borrowing and their
behaviour appears to be linked in the following implicational hierarchy:
(20) always > never > now, then
In line with the high frequency of borrowing around indefinites, ‘always’ is
the most frequently borrowed, followed by ‘never’. The relevance properties
of indefinites – as operators that process hearer-sided presuppositions – explain
their higher position on the hierarchy (at least, this is well in line with
other hierarchies discussed so far). The deictic expressions ‘now’ and ‘then’
are usually linked, and therefore occupy a single position on the hierarchy;
occasionally they are borrowed independently of one another: Tasawaq bor-
rows just ‘now’ from Tuareg, while Rumungro borrows only ‘then’ from Hun-
garian. Note that the Romani sample (Matras 2002) shows a clear hierarchy
which favours the borrowing of ‘then’, beginning most often in its sequential
rather than remote-deictic function, while ‘now’ is rarely borrowed.
Terms for days of the week are generally borrowed from the language of
education, or the formal-official language: We find borrowings in Nahuatl,
Jaminjung, Indonesian (ultimately from Arabic), Quichua, Otomi, Guaraní,
Biak, Purepecha, Hup (from Portuguese), Rumungro, K’abeena, Kildin
Saami, and Western Neo-Aramaic. Borrowing of expressions for times of
day (‘morning’, ‘noon’, etc.) is usually linked to days of the week, the sample
showing an implicational hierarchy:
(21) days of week > times of day
This hierarchy can be nicely connected to the role of institutional administration
and the language of commerce: Between the two categories, it is days of
the week which tend to be the property of the public domain, governing the
schedule of activities of individuals in relation to institutions, more so than
times of the day, which have a greater autonomous role within the private do-
main. Once again we find that languages that do not have a strong tradition of
relying on external languages for affairs of the public domain – Yiddish, Ma-
cedonian Turkish, Khuzistani Arabic – show no borrowings in these fields.
Manange once again occupies a similar position, showing here too resistance
toward borrowing and behaving much like a language with a tradition of lit-
eracy and institutional discourse.
For adjective comparison, our sample clearly confirms the hierarchy pos-
tulated by Elšík and Matras (2006) for Romani dialects (omitting the least
borrowable value “positive”, which is the default form of the adjective and is
not usually accompanied by any overt derivational marker):
(22) superlative > comparative
Formal means of constructing both the superlative and the comparative are
borrowed in Domari, Otomi, and Vietnamese, as well as Western Neo-Ar-
amaic. Languages in which borrowing is limited to the superlative include
Indonesian, with a Javanese particle paling, Rumungro, which borrows the
Hungarian superlative prefix leg, and Kildin Saami, which has the Russian
superlative marker same. Yiddish, also confirming the hierarchy, shows an
interesting case of hierarchical distribution of matter- and pattern-replication:
While the comparative shows pattern-replication – greyser fun mir ‘bigger
from me’ – the superlative shows matter borrowing, replicating the Russian
marker same: same groys ‘biggest’.
8. Constituent order and syntax
Contact-induced change in word order is generally not common in our sample.
Evidence for change in basic word order is found in Rapanui, with a
tendency toward change in Quichua and Otomi and sometimes in Khuzistani
Arabic (here we are dealing mainly with relaxation of discourse constraints).
It is plausible that in their earlier history Hup, Rumungro, and Domari showed
different word-order patterns, though concrete evidence is lacking. The most
common change in word order appears to affect possessive constructions.
Examples are Domari, Macedonian Turkish, Rapanui, Quichua, Rumungro,
and Likpe (see above, under Nominal structures). This is understandable,
given the fact that a change in the position of possessor and possessed does
not affect the position of the verb and so it leaves the organization of the
predication intact. The position of adjectives is affected by contact in Domari,
sometimes in Quichua, and in urban Manange.
Relative clauses change their position relative to the head in Macedo-
nian Turkish and in Nahuatl, as well as, arguably, in Domari, if we compare
the language with attested Indo-Aryan languages. The position of the cop-
ula appears more vulnerable to change than the position of lexical verbs, as
seen in Macedonian Turkish, Rumungro (tendency only), and North-eastern
Neo-Aramaic. From this, we might propose the following tentative hierarchy,
based partly on frequency and partly on pure prediction, for the likelihood of
word order to be affected by contact-related change:
(23) nominal constituents (possessor, adjective) > copula predications >
verbal predications
Note that the hierarchy is sensitive to the presence of a lexical verb as initiator
of the predication – a factor which impedes borrowing. Thus the most bor-
rowable are structures that do not involve a full predication, or at least not a
verbal one; these are followed by non-lexical predications, while predications
that contain full lexical verbs appear last.
In the area of clause structure, one of the frequent changes observed is
the emergence of copula clauses: Otomi borrows the Spanish copula ta, Hup
uses a possible Tukano loan as a copula, and Indonesian creates copulas on
a Sanskrit model. This makes sense if one considers the non-universality of
(present-tense) copula predications, and the fact that a clash of systems, and
so pressure toward convergence, is more likely to occur here than in other
clause types.
As far as syntax-relevant grammatical vocabulary is concerned, negation
particles are borrowed in Domari, Quichua, Guaraní, Biak, and Hup. Connectors
and conjunctions were dealt with above (under Other Parts of Speech), and it
was seen that most languages in the sample show some kind of contact influ-
ence on clause combining strategies. This is true especially of coordination,
where almost all languages are affected, some by mere borrowing of connec-
tors, others by changing connectivity strategy, as in the case of Rapanui (from
serialization to connectors), or Macedonian Turkish (from converbs to con-
nectors introducing finite clauses). A new type of adverbial subordination is
attested only in Macedonian Turkish, while relative clauses are re-structured
in Macedonian Turkish, Domari, Yiddish (to some extent), and Tasawaq.
Like connectors, relative particles appear to be borrowing-prone, and we find
loans in Nahuatl, Rapanui, Domari, Otomi, Guaraní, Indonesian, and Kildin
Saami.
On the whole, then, the sample languages do not offer an extreme wealth
of data on contact-induced change in clause or sentence structure. In particu-
lar, colonial languages and languages of administration do not seem to have
the same effect on syntactic structures as they may have on other domains of structure,
in particular grammatical vocabulary. Those cases where we do find far-reaching
changes in syntactic typology tend to be languages in a prolonged
situation of stable multilingualism, as in the case of Macedonian Turkish or
Domari, confirming Thomason and Kaufman’s (1988) prediction on the link
between prolonged and intense cultural contact, and significant typological
disruption.
9. Lexicon
All languages borrow lexical elements. Based on authors’ responses to the
questionnaire, we can make a statement about the likelihood of a certain
word-class element to be affected by contact. This hierarchy is not implica-
tional, as there is no evidence to suggest that borrowing in a lower-ranking
category necessarily entails borrowing in higher-ranking categories. Rather,
it is based on frequency; (24) shows how frequently selected word classes
occur among the list of word classes affected by contact in the sample:
(24) nouns, conjunctions > verbs > discourse markers > adjectives >
interjections > adverbs > other particles, adpositions > numerals >
pronouns > derivational affixes > inflectional affixes
In some salient features, this hierarchy resembles other hierarchies proposed
in the literature.14 Thus, nouns appear at the top of the list, unbound
grammatical vocabulary is rather high on the list, and bound morphology is
low, with derivational morphology outranking inflectional morphology. Note
however some differences to the hierarchy proposed by Muysken (1981)
and others: Conjunctions and discourse markers occupy a high position,
and outrank some of the lexical categories. Numerals outrank pronouns and
derivational morphology in this particular sample, much of it composed of
languages in contact with a colonial language. No differentiation is made be-
tween coordinating and subordinating conjunctions in our discussion, but it
is clear that the position even of subordinating conjunctions is far higher on
the hierarchy than that assigned to it in Muysken’s data.
We must, however, treat the meaningfulness of such a full hierarchy with
caution. We have seen evidence that the presence of numerals on the list can
be biased by the type of contact situations that are selected, and the presence
or absence of a tradition of literacy in the recipient languages. The presence
of pronouns on the list is largely a product of the function of terms of address
and their degree of lexicalization, or alternatively of the clash of systems
that distinguish exclusive/inclusive reference, and those that don’t. These
and more are coincidental circumstances that influence the contact situation
around particular categories, and which will promote or demote certain cat-
egories on the hierarchy, depending on the degree of presence of languages in
the sample that answer to certain sociolinguistic and structural criteria. Thus,
the more reliable hierarchies are those that provide a picture of the suscepti-
bility of category values to borrowing, while the comparison among categor-
ies is not entirely free of arbitrary factors.
10. Concluding remarks
Noteworthy is the extent of borrowing across the different languages of the
sample. If we take as an indicator 36 prominent categories representing vari-
ous aspects of structure – from phonology, through to morphology, unbound
grammatical vocabulary, lexicon, and syntax15 – and assign scores to lan-
guages based on the number of categories that show some kind of contact in-
fluence, then the scores range from 31 (Rumungro, Guaraní), to just 6 (Yaqui)
and 7 (Biak) (see Table 1).
Table 1 shows the overall “borrowing score” for each language. It also
shows the borrowing score among 11 categories representing Other Parts of
Speech (OPS),16 an indicator of mostly unbound grammatical lexemes, and
the proportion of OPS among the categories affected by contact. The majority
of languages range within a 0.05 distance from the proportional 0.305 share
of OPS among the category total, indicating that there is, on the whole, a ra-
ther predictable share of OPS among borrowed categories. Note however that
some languages show a disproportionately high level of borrowing of OPS:
Yaqui, Biak, Likpe, Mosetén, and Jaminjung score between 0.4 and 0.5, meaning that
OPS account for up to 50 percent of borrowed categories. Note that all these
languages tend to have overall borrowing scores of between 6 and 15, that is, on
the lower side of the borrowing range. This might be interpreted as indirect

Table 1. So-called “borrowing scores”, and the proportion of Other Parts of Speech
among categories affected by contact.

Language               Total score   OPS score   OPS/Total ratio
Sampled                36            11          0.305
Yaqui                  6             3           0.500
Biak                   7             3           0.428
Manange                10            3           0.333
K’abeena               10            3           0.333
Likpe                  10            4           0.400
Mosetén                11            5           0.455
Mac. Turkish           12            3           0.250
Rapanui                13            3           0.230
Khuz. Arabic           13            4           0.307
Kildin Saami           15            5           0.333
Jaminjung              15            7           0.467
Vietnamese             17            6           0.352
Nahuatl                18            5           0.278
Tasawaq                18            7           0.389
Purepecha              19            7           0.368
Western Neo-Aramaic    20            6           0.300
Hup                    21            5           0.238
Kurmanji Kurdish       21            6           0.286
Yiddish                22            7           0.318
Domari                 24            9           0.375
Otomi                  25            9           0.360
Quichua                26            8           0.307
Indonesian             26            9           0.346
Guaraní                31            10          0.323
Rumungro               31            10          0.323

evidence that borrowing begins with OPS, before continuing to other cat-
egories. The lowest OPS scores, for Rapanui, Hup, and Macedonian Turkish
(0.230.25), represent languages that undergo structural changes as a result
of contact but in which borrowed OPS are under-represented. These figures
generally confirm the predictions and observations that unbound grammatical
morphemes are high on the borrowability scale compared to other categories;
though they also allow us to conclude that contact influence is rarely limited
to them, and that it is not impossible for a language to display even a certain
amount of resistance toward borrowing of OPS.
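
The proportions discussed here follow from simple arithmetic on Table 1. The sketch below is an illustration in Python rather than the procedure actually used for the survey; the names SCORES, BASELINE, and ops_ratio are assumptions introduced for the example, and the ratios are recomputed from the Total and OPS columns rather than copied from the printed ratio column.

```python
# (total score, OPS score) per language, as listed in Table 1.
SCORES = {
    "Yaqui": (6, 3), "Biak": (7, 3), "Manange": (10, 3),
    "K’abeena": (10, 3), "Likpe": (10, 4), "Mosetén": (11, 5),
    "Mac. Turkish": (12, 3), "Rapanui": (13, 3), "Khuz. Arabic": (13, 4),
    "Kildin Saami": (15, 5), "Jaminjung": (15, 7), "Vietnamese": (17, 6),
    "Nahuatl": (18, 5), "Tasawaq": (18, 7), "Purepecha": (19, 7),
    "Western Neo-Aramaic": (20, 6), "Hup": (21, 5),
    "Kurmanji Kurdish": (21, 6), "Yiddish": (22, 7), "Domari": (24, 9),
    "Otomi": (25, 9), "Quichua": (26, 8), "Indonesian": (26, 9),
    "Guaraní": (31, 10), "Rumungro": (31, 10),
}

# Share of OPS categories among all sampled categories (11 of 36).
BASELINE = 11 / 36


def ops_ratio(total, ops):
    """Proportion of OPS categories among all contact-affected categories."""
    return ops / total


ratios = {lang: ops_ratio(total, ops) for lang, (total, ops) in SCORES.items()}

# Languages with a disproportionately high or low share of borrowed OPS.
high = [lang for lang, r in ratios.items() if r >= 0.4]
low = [lang for lang, r in ratios.items() if r <= 0.25]
# Languages within 0.05 of the proportional share.
near = [lang for lang, r in ratios.items() if abs(r - BASELINE) <= 0.05]

print("high:", high)
print("low:", low)
print(f"near baseline: {len(near)} of {len(SCORES)}")
```

Run on these figures, the high group comes out as Yaqui, Biak, Likpe, Mosetén, and Jaminjung and the low group as Macedonian Turkish, Rapanui, and Hup, in line with the discussion above.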
It is also interesting to note that a number of categories occupy an entirely
peripheral position in the borrowing behaviour of languages in this sample.
They include bound case markers, bound tense markers, bound person mark-
ers as well as in most cases unbound person markers (deixis and anaphora;
exceptions being reciprocal and reflexive pronouns, and “lexicalized” pro-
nouns as in Indonesian), demonstratives and expressions of place deixis. In
the following discussion it will hopefully become clear that the absence of
borrowing in these domains is not taken to mean that constraints exclude
them from being borrowed. Rather, our focus is on those categories that do
show a more salient and frequent tendency to be affected by borrowing, and
our agenda is to explain why speakers are motivated to borrow forms and
structures in those categories. The absence of borrowing among other cat-
egories may be left to be interpreted as just that: the absence of any particular
motivation to converge the two systems around these particular categories.
Typological features are neither excluded nor even rarely affected by bor-
rowing in the sample. A number of languages undergo considerable typo-
logical convergence: Macedonian Turkish, Domari, North-eastern Neo-
Aramaic, and Rumungro. Ongoing shifts in morphological typology can be
detected in a number of other languages, too. Although statistically, unbound
grammatical morphemes are more likely to be borrowed than typological
features, there is no direct interdependency between any specific value or
category that falls within these respective groups of structures. Implicational
hierarchies of the kind postulated above only apply among the values of the
same category. But since the likelihood of borrowing is different for different
categories, there may be quasi-implicational relationships across categories
in different structural domains. Thus, since connectors are frequently bor-
rowed, but re-structuring within the TMA domain is rare, we might expect
a language that shows contact-induced re-structuring in the domain of TMA
to show borrowed connectors as well. Such an expectation is based on the
higher borrowing frequency of connectors.
Despite the lack of any direct functional link between the borrowing of
connectors and the restructuring of TMA categories (or any other example of
contact influence), the challenging question remains why certain categories
are more susceptible than others to change in situations of language contact.
It is here that borrowing hierarchies, especially the implicational relations
among paradigm values of the same structural category, can shed some light.
The fact that borrowing within such categories often follows a non-arbitrary,
predictable course suggests that semantic-pragmatic features that distinguish
among category values participate in motivating borrowing. The relations
among borrowed values can thus help us illuminate the motivation behind
borrowing, and so help us make sense of the different degrees of susceptibil-
ity of categories to the borrowing process.
Let us, for this purpose, review the hierarchies. A first set of hierarchies
might be grouped together based on a general notion of frequency, referential
meaning, and usage context of the borrowed structural material. This group
is rather diverse. The more frequent adoption of new consonants over new
vowels (1) is conditioned by the mere diversity of consonants and the fact that
they tend to outnumber vowels in each of the sample languages. The greater
likelihood that phonemes be adopted as part of loanwords than as independent
phonological features (3) is similarly a practical issue relating to the need
to accommodate loanwords.
Borrowing as a utilitarian enrichment of means of expression also be-
longs here. The greater likelihood of borrowing of nouns over other parts of
speech (as expressed in 24) is a product of the likelihood of nouns to express
new concepts and to name objects and institutions (see already Weinreich
1953: 37). The high borrowability of lower ordinals (12) is connected to the
fact that they mark exclusivity by assigning lower figures a special lexical
item rather than a grammatical derivational procedure, with borrowing sup-
plementing the procedure of lexical creativity. Certain usage contexts may
favour borrowing, if there is a close association with the contact language
in certain domains. Thus, borrowed numerals are more likely to be used in
formal than informal contexts (9), higher numerals as well as mathematical
“zero” are more likely candidates for borrowing, being reserved primarily to
more formal-institutional contexts (10), and names of days of the week are
more likely to be borrowed than times of the day (21).
An additional theme, which groups together another bundle of hierar-
chies, may be defined as accessibility, cognitive complexity, and expected-
ness. Low accessibility and/or high complexity correlate with the borrowing
susceptibility of peripheral as opposed to core local relations (4), of higher
numerals (10) (reinforcing the usage-based motivation cited above), of independent
(factual) embedded events over dependent (non-factual) ones (15),
and of linked independent events (purpose clauses, causal clauses) over
linked dependent ones (adverbial subordinations) (14). Low expectedness
can be brought in connection with some of those, and in particular with the
contact-susceptible properties of contrast (13), concessive subordination
(14), phasal change (16), restrictive focus (17), and superlative (22). In all of
these cases, the speaker’s assertive authority is potentially reduced as a result
of the speaker venturing into propositional domains involving a degree of
uncertainty or unexpectedness.
Not unrelated are the properties around which external circumstances re-
duce the speaker’s confidence and control even more overtly. These include
the high borrowing susceptibility of conditionality over other subordinations
(14), of participant-external modality over participant-internal modals (5),
of modality itself over aspect and tense, as well as of aspect/aktionsart (the
internal structure of the event, independent of the speaker’s perspective) over
tense, and of the future over other tenses (6), as well as of indefinites (which
rely on a presuppositional domain) over deictics (which rely on the speaker’s
own orientation perspective)17 (20).
Finally, we find a set of hierarchies that operate at the level of the interac-
tion, where those structures are more borrowing-prone that are more tightly
connected to the emotive level of the discourse or speech act rather than
to the content level of the word or phrase. Such is the case with prosodic
features over segmental phonological features (2), discourse markers over
other particles (18), connectors over other parts of speech (as in 24), causal
argumentation over other forms of subordination (14), and even the posi-
tive answer particle over the general (i.e. also content-bound) negative par-
ticle (19).
What do these three themes – accessibility/expectedness, external de-
pendency, and interaction-level operations – have in common, and why are
they especially susceptible to contact-related change? In order to answer this
question, we must return to our hypothesis about what constitutes “borrow-
ing” in the first place (cf. Section 1). Borrowing, we had said, is a strategic
compromise which bilinguals adopted in conversation and which has be-
come socially acceptable. Social acceptability is a pre-condition for change,
since language is the collective, socio-cultural product and asset of a com-
munity. But there is no reason to assume that social attitudes should in any
way prejudice contrastive connectors over additive connectors, or temporal
indefinites over time deixis. The reason for the hierarchical arrangement of
categories in respect of their borrowing susceptibility has to do with the first
part of our definition of borrowing, namely the part that describes borrowing
as a strategic compromise adopted in conversation. It is here that speakers
are naturally inclined to prioritize when handling the control mechanism that
selects certain (“language-particular”) structures in certain sets of interac-
tions.
Maintaining the demarcation boundary between repertoire components
(or “languages”) is a burden on the mental processing of language in con-
versation, and yet it is a social requirement. Compromise is sought when
the tension assumes its most extreme forms: when the burden of controlling
the language selection mechanism coincides with other sources of tension
in the interaction itself. Such tension emerges when the speaker’s assertive
authority is at stake and a special effort is needed in order to win over the
hearer’s confidence: When expressing unexpected chains of arguments, when
contradicting or challenging presuppositions, when assuming responsibility
for propositional content that lies beyond the domain of secure knowledge,
or when directly intervening with hearer-sided processing by monitoring and
directing turns and speech acts (e.g. through prosody or discourse markers).
Since the conversational tension around such processing tasks cannot itself
be reduced, bilingual speakers’ only alternative is to eliminate the need to
distinguish between sub-components of their linguistic repertoire – or “lan-
guages” – and to unify the structures that trigger the appropriate processing
operations. The result is a fusion of the two systems of structures around the
relevant functions (see Matras 1998).
The trigger for borrowing around these kinds of structures – those that
cluster around the themes defined above as “accessibility”, “expectedness”,
“interaction-level”, and so on – is thus neither social acceptability, nor pres-
tige, nor gaps in the recipient language. Rather, it is the need to reduce the
cognitive load when handling a complex linguistic repertoire. Social accept-
ability is merely an accompanying condition for spontaneous innovations to
become anchored in the long-term speech behaviour of the community.
In this respect, the susceptibility of a great number of grammatical cat-
egories to borrowing is pre-determined by their language processing func-
tion, and therefore universal. One of the most striking findings of the present
investigation is the fact that so many hierarchies that were identified for the
cross-linguistic sample presented in this volume were a perfect or near-perfect
match to those identified by Matras (2002) and by Elšík and Matras
(2006) for the sample of Romani dialects in contact with a variety of differ-
ent languages, and, to the extent that material for comparison was available,
also with hierarchies proposed for other samples. This clearly supports their
universal predictive power. Moreover, the fact that a sample consisting of
multiple recipient languages shows virtually the same results as a sample
with a controlled recipient language (Romani) suggests that the structure of
the recipient language plays only a secondary, perhaps even just a peripheral
role in determining preferences of borrowing. The primary role is played by
the functionality of the categories and the extent of bilingual pressure, i.e.
the extent to which bilinguals need to make frequent decisions on language
choice.
Note that this sharpens the focus of what Thomason and Kaufman (1988)
had referred to somewhat more bluntly as the intensity of cultural contact,
helping us to move toward a more specific characterization of relevant pat-
terns of communicative interaction. To be sure, the structures of the languages
involved, especially the recipient language, may play a certain role in the
borrowing process. But this role must be seen primarily as an imposition of
constraints on what is essentially a universal process, motivated by cognitive
features of language processing. Such constraints might include the presence
of a competing structure on one side of the paradigm (as with the Jaminjung
contrastive marker); or the availability of literacy as a factor strengthening the
coherence of the recipient system and thus reinforcing demarcation boundar-
ies and helping to resist borrowing; or indeed the presence of social attitudes
that block language mixing. On the other hand, the fact that languages like
Malay employ a plethora of lexical means, and not just deictic and anaphoric
expressions, to refer to participants creates a motivation for renewal of this
inventory of expressions and so also for borrowing in the domain of (so-
called) personal pronouns, which is not typically found in languages that rely
on participant deixis and anaphora.
The principal conclusion that must be drawn from the above observations
is that different borrowing motivations apply to different functional categor-
ies. With some, the motivation is lexical enrichment. With others, it is the fu-
sion of elements of formal discourse with the language that dominates formal
discourse, while in a series of categories the motivation is a reduction in the
tension surrounding certain language processing tasks. Though neither gaps
nor social prestige are primary motivators for borrowing, both are indirectly
involved, as the process of “borrowing” can be defined as a license to speak-
ers to dismantle the mental demarcation boundaries that separate their indi-
vidual “languages” and, around a particular selection of categories, to make
full use of their entire repertoire of linguistic structures and forms irrespec-
tive of the setting of the communicative interaction.
Acknowledgement
This chapter was written during a research visit to the Research Centre for Linguistic
Typology at La Trobe University, Melbourne, made possible through a Distinguished
Fellowship at the Institute for Advanced Study, La Trobe University, and an Inter-
national Linkage Fellowship of the Australian Research Council.
Notes
1. I use the term “language” in quotes in this context since it is not obvious that
multilingual speakers process language in the form of separate systems; it is
safer to assume that multilingual speakers have an overall repertoire of linguistic
forms, to which constraints are attached concerning the situations and conver-
sational constellations in which those forms may be used, for various purposes.
The notion of a separation of “linguistic systems” on the part of the language
user is therefore somewhat of an abstraction.
2. We assume that borrowing always begins with at least some degree of bilingual-
ism, however rudimentary, or at least with an exposure to settings of communi-
cative interaction that require the selection of a separate inventory of forms and
structures. Once a certain behaviour pattern is adopted by those speakers who
interact in a variety of settings – and so have access to two (or more) “languages”
– new forms and structures may diffuse into the speech patterns of monolinguals
as well, or may survive the historical decline of widespread bilingualism. This
latter process may strengthen our ability to identify borrowings, but it is not a
pre-requisite for borrowing.
3. On the problem of establishing “borrowability” on the basis of corpus frequency,
see already Weinreich’s (1953: 35–36) critical remarks.
4. Since the occurrence in a corpus of a low-ranking category presupposes that of
the higher-ranking category, occurrences of the higher-ranking category will
always outnumber those of the lower one.
5. Throughout I use the notation “greater than” (>) to denote the value that is more
likely to be affected by contact induced change (in a frequency-based hierarchy),
and which in an implicational hierarchy constitutes a pre-requisite for the bor-
rowing of any item specified to the right of it and marked “lesser than”.
6. The one exception being Kriol, which has a phonological system that is very
similar to Jaminjung.
7. In both Romani and Domari, genetically related material (deriving from Indo-
Iranian postposed adverbial specifiers) undergoes a similar development toward
agglutinative case markers (cf. Matras 2002).
8. The model is in fact areal, and is also shared by Persian and Western Armenian,
and to some extent by Levantine Arabic as well.
9. Rapanui uses Spanish tiene que which seems to express both necessity and obli-
gation; Imbabura Quichua has pudi- “can”, which could well cover both ability
and possibility.
10. We also have no evidence to uphold the (frequency) hierarchy proposed by
Wichmann and Wohlgemuth (forthc.) (note that prominence of strategies is ar-
ranged from left to right): light verbs < indirect insertions < direct insertion
< paradigm transfer. But we have no grounds on which to challenge this hier-
archy, either.
11. See Bakker (1997), however, on constraints that prevent the isolation of the
Algonkian verb to a bare stem, in the context of Cree/French contact (albeit in
connection with the formation of the mixed language Michif, not with borrow-
ing in the conventional sense).
12. Like all Romani dialects, Rumungro too uses Greek derivation markers to form
ordinals from cardinals, but the hierarchy applies to the borrowing of ordinal
word forms.
13. Wilco van den Heuvel, p.c.
14. Compare with the integrated hierarchy presented by Muysken (1981), repeated by
Winford (2003: 51): nouns > adjectives > verbs > prepositions > coordinating
conjunctions > quantifiers > determiners > free pronouns > clitic pronouns >
subordinating conjunctions.
15. The full list is: Consonants, vowels, morphological typology, alignment type,
local relations, classifiers/gender, possession, plurality, definiteness, diminution/
augmentation, nominalization, case marking, tense categories, tense marking,
aspect categories, aspect marking, aktionsart categories, aktionsart marking,
mood categories, modal verbs, voice and valency, numerals, personal pronouns,
demonstratives, indefinites, interrogatives, connectors, subordinating conjunc-
tions, phasal adverbs, focus particles, discourse markers, time deixis, adjective
comparison, constituent order, syntax, basic cultural vocabulary.
16. These are: Numerals, personal pronouns, demonstratives, indefinites, interroga-
tives, connectors, subordinating conjunctions, phasal adverbs, focus particles,
discourse markers, time deixis, adjective comparison.
17. More precisely, indefinites can be said to engage the hearer more actively in
supplementing an imaginary knowledge domain in which the missing context
can be situated: consider an indefinite expression such as ‘anywhere’, where it
is up to the hearer to construct an image of possible locations that satisfy vague
contextual criteria. With deixis, on the other hand, the speaker is confident that
speaker and hearer share a very particular perspective. Thus, ‘here’ leaves no
room for ambiguity, or for hearer-sided creativity.
References
Aikhenvald, Alexandra Y.
2002 Language Contact in Amazonia. Oxford: Oxford University Press.
2006 Grammars in contact: A cross-linguistic perspective. In: Alexandra
Y. Aikhenvald and R. M. W. Dixon (eds.), Grammars in Contact.
A Cross-Linguistic Typology, 1–66. Oxford: Oxford University Press.
Aikhenvald, Alexandra Y. and R. M. W. Dixon (eds.)
2001 Areal Diffusion and Genetic Inheritance. Oxford: Oxford University
Press.
2006 Grammars in Contact. A Cross-linguistic Typology. Oxford: Oxford
University Press.
Bakker, Peter
1997 A Language of Our Own: The Genesis of Michif, the Mixed Cree–French
Language of the Canadian Métis. Oxford: Oxford University Press.
Campbell, Lyle
1993 On proposed universals of grammatical borrowing. In: Henk Aertsen
and Robert Jeffers (eds.), Historical Linguistics 1989: Papers from
the 9th International Conference on Historical Linguistics, 91–109.
Amsterdam: John Benjamins.
Dixon, R. M. W.
1995 Complement clauses and complementation strategies. In: F. R. Palmer
(ed.), Grammar and Meaning: Essays in Honour of Sir John Lyons,
175–220. Cambridge: Cambridge University Press.
2006 Complement clause types and complementation strategies in typological
perspective. In: R. M. W. Dixon and Alexandra Aikhenvald
(eds.), Complementation. A Cross-Linguistic Typology, 1–47. Oxford:
Oxford University Press.
Mouton de Gruyter.
Field, Fredric
2002 Linguistic Borrowing in Bilingual Contexts. Amsterdam: Benjamins.
Givón, T.
1990 Syntax. A Functional-Typological Introduction. Amsterdam: Benja-
mins.
Haugen, Einar
1950 The analysis of linguistic borrowing. Language 26 (2): 210–231.
Heath, Jeffrey
1984 Language contact and language change. Annual Review of Anthropology
13: 367–384.
Matras, Yaron
guistics 36 (2): 281–331.
Press.
2004 Layers of convergent syntax in Macedonian Turkish. Mediterranean
Language Review 15: 63–86.
Matras, Yaron and Jeanette Sakel
2007 Investigating the mechanisms of pattern-replication in language con-
Moravcsik, Edith
1975 Verb borrowing. Wiener Linguistische Gazette 8: 3–30.
1978 Language contact. In: Joseph H. Greenberg, Charles A. Ferguson and
Edith A. Moravcsik (eds.), Universals of Human Language, Vol. 1:
93–122. Stanford: Stanford University Press.
Muysken, Pieter
1981 Halfway between Quechua and Spanish: The case for relexification. In:
Arnold Highfield and Albert Valdman (eds.), Historicity and Variation
in Creole Studies, 52–78. Ann Arbor: Karoma.
Ross, Malcolm
2001 Contact-induced change in Oceanic languages in north-west Melanesia.
In: Alexandra Aikhenvald and R. M. W. Dixon (eds.), Areal Diffusion
and Genetic Inheritance: Problems in Comparative Linguistics,
134–166. Oxford: Oxford University Press.
Stolz, Thomas
1996 Grammatical Hispanisms in Amerindian and Austronesian languages.
The other kind of Transpacific isoglosses. Amerindia 21: 137–160.
2007 Allora. On the recurrence of function-word borrowing in contact situations
with Italian as donor language. In: Jochen Rehbein, Christiane
Hohenstein and Lukas Pietsch (eds.), Connectivity in grammar and
discourse, 75–99. Amsterdam: Benjamins.
1996 Funktionswortentlehnung in Mesoamerika, Spanisch-Amerindischer
Sprachkontakt. Sprachtypologie und Universalienforschung 49 (1):
86–123.
indigenen Sprachen Amerikas und Austronesiens. Orbis 39 (1): 1–77.
Thomason, Sarah G.
2001 Language Contact: An Introduction. Edinburgh: Edinburgh Univer-
sity Press.
Thomason, Sarah G., and Terrence Kaufman

University of California Press
van der Auwera, Johan
1998 Phasal adverbials in the languages of Europe. In: Johan van der Auwera
with Dónall P. Ó Baoill (ed.), Adverbial Constructions in the Languages
of Europe, 25–145. Berlin: Mouton de Gruyter.
van Hout, Roeland, and Pieter Muysken
1994 Modelling lexical borrowability. Language Variation and Change 6:
39–62.
Weinreich, Uriel
1953 [1968] Languages in contact. The Hague: Mouton.
Wichmann, Søren, and Jan Wohlgemuth
Forthc. Loan verbs in a typological perspective. In: Thomas Stolz, Dik Bakker
and Rosa Salas Palomo (eds.), Aspects of Language Contact. Berlin/
New York: Mouton de Gruyter.
Winford, Donald
2003 An Introduction to Contact Linguistics. Oxford: Blackwell.
Grammatical borrowing in Tasawaq
Maarten Kossmann
1. Background1
Tasawaq (tásàwàq) is the main language of the date palm oasis of In-Gall
(íngàl), about 100 km west of Agadez in the desert of Niger (Western Africa).
It is sometimes stated that Tasawaq is also spoken in Tegidda-n-Tesemt, an im-
portant salt extraction site in the region. This is true, but only in the sense that
salt exploitation is seasonal labor (Bernus and Bernus 1972: 23, 30), and that
in the rest of the year the great majority of salt miners stay in their homes in
In-Gall. The number of speakers is unknown, but probably lies between 2,000
and 10,000. Since 1991, Tasawaq has been recognized as an official language
of Niger (Sidibé 2002: 186). This status has no practical consequences.
Tasawaq is a Northern Songhay language. Songhay is a close-knit language
group, which is commonly regarded to be part of the Nilo-Saharan language
phylum (e.g. Bender 1997). Other affiliations have been proposed, and it con-
stitutes one of the better candidates in Africa for an isolated family.
All Northern Songhay languages have been heavily affected by language
contact. In the case of Tasawaq, the main language of influence is southern
Tuareg (also called Tamajeq), the language of the main nomadic group in the
desert around In-Gall. Tuareg is a Berber language belonging to the Afroasi-
atic language phylum. Languages of minor influence on Tasawaq are Arabic
(Semitic, Afroasiatic), the language of religion and in earlier times also of
long-distance trade, and Hausa (Chadic, Afroasiatic), the lingua franca in this
part of Niger, and the language of the only urban center in the region, Agadez.
Fulfulde, spoken by another nomadic group in the desert around In-Gall, does
not seem to have had any influence on Tasawaq, either grammatically or
lexically.
According to speakers of the language, most Tasawaq speakers also speak
Tuareg and Hausa (Sidibé 2002: 194). On the other hand, non-native speakers
of Tasawaq seem to be extremely rare.
Tuareg has influenced Tasawaq, both at the lexical and the grammatical
level. Lexical influence includes the introduction of many items which are
considered to be ‘basic’ according to most researchers who have an opinion on
this, e.g. body-part terms such as ‘finger’, ‘heart’, ‘tongue’, ‘knee’ and ‘tooth’.
Northern Songhay languages in general, and Tasawaq in particular, have
been analyzed as “mixed” languages, with Songhay and Tuareg components
(Nicolaï 1990, Wolff and Alidou 2001). This analysis is based on three arguments:
first, the high degree of lexical influence; second, the existence of separate
systems for elements of different origin in some sub-systems of the morphology
of Tasawaq; and finally, a number of important structural features which would
have a Tuareg background, e.g. the genitive adposition Hǹ and SVO word order.
None of these features is decisive for ranking the language among the “mixed”
languages. For example,
Northern Berber languages such as Riffian Berber – which nobody would call
“mixed” in the sense of Michif or Ma’á – have similar percentages of loan-
words in basic lexicon and have separate morphological sub-systems on the
basis of etymological origin. The remaining part of the argument, the structural
features, is not decisive, as this type of structural borrowing is attested
elsewhere in non-mixed languages, and also because the adduced evidence
for borrowing is not always very strong. Therefore, I will consider Tasawaq
a Songhay language which was strongly influenced by Tuareg, rather than a
mixture of the two languages for which no basic language can be identified.
2. Phonology
The most conspicuous phonological influence of Tuareg on Tasawaq is the
introduction of pharyngealized consonants, both in Tuareg and in Songhay
lexemes. In Proto-Songhay pharyngealization was certainly not present. In
Songhay lexemes, pharyngealized consonants mainly occur in the vicinity of
a and o; in lexemes borrowed from Tuareg, there are no restrictions on their
occurrence. It should be noted, however, that according to Robert Nicolaï
(p.c.), who worked with a number of different informants, in most Tasawaq
idiolects (among others the one described by Nicolaï 1979), consonantal pha-
ryngealization is absent.
Further influence of Tuareg and other languages mainly concerns the dis-
tribution of certain consonants. Thus, in the Songhay part of the lexicon q
and γ are in near complementary distribution, q being found in word-initial
position before o and in word final position after a and o, while γ appears
elsewhere. Due to the introduction of especially Arabic lexemes, in which
both sounds are phonemic, this distribution has become blurred. Another
conspicuous effect of language contact is the change in relative frequency of
certain types of consonant cluster. In the Songhay part, most consonant clus-
ters start with a nasal or r. Other clusters are possible, but rare. The influx of
foreign lexemes from languages which are less restrictive in their clustering
procedures has raised the relative frequency of the other consonant clusters
considerably.
There exist interesting similarities between Tasawaq and Hausa phon-
ology. In the first place, they share a phonological rule by which the short mid
vowels e and o are lowered to a in all contexts except before pause (cf. New-
man 2000: 399). The underlying vowel quality reappears in contexts where
the vowel in question is lengthened, e.g. Tasawaq:
(1) γáy dàr ‘I stretched out’    γáy dààr-á ‘I stretched it out’
    γáy dáb ‘I closed’           γáy dééb-à ‘I closed it’
    γáy dàs ‘I touched’          γáy dòòs-á ‘I touched it’
In the second place, Hausa and Tasawaq have similar tonal features. Both languages
have three tones: High, Low and Falling. Unlike other Songhay languages
such as Zarma, but like Hausa, Tasawaq has no Rising tone.
Moreover, both languages have a strong dislike for all-Low
contours in polysyllabic words. In Hausa, such contours are rare and mainly
occur in loanwords (Newman 2000: 606). In Tasawaq, underlying all-Low
contours (which are frequent) are automatically changed to sequences with
an initial Falling tone in polysyllabic words. Because of phonotactic restrictions
on the occurrence of Falling tones, this Falling tone is simplified to a High
tone in certain syllable types. The basic all-Low contour reappears in bound syntactic
contexts, e.g. in nouns when they are followed by a numeral:
(2) bângù ‘well’
    bàngù hínká ‘two wells’
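The alternation illustrated in (2) can be restated procedurally. The following Python sketch is only an illustration under stated assumptions: tones are written as the symbols "H", "L" and "F", one per syllable; the function name surface_tones and the flag falling_ok are hypothetical. Since the text does not specify which syllable types exclude a Falling tone, that condition is supplied by hand rather than computed.

```python
def surface_tones(underlying, bound=False, falling_ok=True):
    """Map an underlying tone sequence to its surface form.

    underlying -- list of "H", "L" or "F", one symbol per syllable
    bound      -- True in bound syntactic contexts (e.g. before a numeral)
    falling_ok -- hypothetical flag: may the first syllable carry a Falling tone?
    """
    tones = list(underlying)
    all_low = all(t == "L" for t in tones)
    # Only polysyllabic all-Low words outside bound contexts are affected.
    if len(tones) < 2 or not all_low or bound:
        return tones
    # Initial Falling tone, simplified to High where Falling is excluded.
    tones[0] = "F" if falling_ok else "H"
    return tones


print(surface_tones(["L", "L"]))                    # cf. bângù 'well'
print(surface_tones(["L", "L"], bound=True))        # cf. bàngù hínká 'two wells'
print(surface_tones(["L", "L"], falling_ok=False))  # cf. dábdè, underlying dàbdè
```

The sketch deliberately treats the phonotactic restriction as an input, so it makes no claim about which syllable shapes trigger the simplification.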
It is difficult to decide whether these similarities between Hausa and Tasawaq
phonology are due to Hausa influence on the language, or whether they constitute
independent developments.
3. Noun morphology
In its noun morphology, Tasawaq makes a strict distinction between native
elements and foreign elements. This is most conspicuous in plural formation.
Native Songhay words have no lexical plurals; plurality is obligatorily
marked by means of an element H-yo, which occurs in noun-phrase final
position.2 This is illustrated in the following examples:
(3) a. dábdè ‘piece of clothing’ (underlying form: dàbdè)
       clothing
    b. dàbdá-yo ‘clothes’
       clothes-pl
    c. dàbdè sídày-yo ‘red clothes’
       clothes red-pl
With borrowed words, on the other hand, plurality is marked on the noun ra-
ther than on the noun phrase. These plurals cannot immediately be followed
by the suffix H-yo. This suffix reappears when the noun is not the last constitu-
ent of the noun phrase, e.g.
(4) a. tákàrdè ‘(sheet of) paper’ (< Tuareg)
       paper
    b. sìkárdààwà n ‘sheets of paper’ (< Tuareg)
       papers
    c. sìkárdààwà n sídày-yo ‘red sheets (of paper)’
       papers red-pl
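The placement of H-yo described above can be sketched as a small procedure. This is a minimal Python illustration, not an analysis: the function name plural_np and its parameters are hypothetical, word forms are copied as printed in (3), (4) and (8), and the tonal effect of the floating High tone and stem alternations such as dàbdè ~ dàbdá-yo are deliberately not modelled.

```python
def plural_np(noun_sg, modifiers=(), lexical_plural=None):
    """Return a plural noun phrase as a string.

    noun_sg        -- singular form of the head noun
    modifiers      -- words that follow the noun inside the noun phrase
    lexical_plural -- Tuareg-style plural stored with the noun,
                      or None for nouns without a lexical plural
    """
    head = noun_sg if lexical_plural is None else lexical_plural
    words = [head, *modifiers]
    # A lexical plural cannot be immediately followed by -yo, so the
    # marker is omitted only when such a plural closes the phrase.
    if lexical_plural is None or modifiers:
        words[-1] += "-yo"
    return " ".join(words)


print(plural_np("dàbdè", ["sídày"]))                      # cf. (3c)
print(plural_np("tákàrdè", lexical_plural="sìkárdààwà n"))  # cf. (4b)
print(plural_np("tákàrdè", ["sídày"], "sìkárdààwà n"))      # cf. (4c)
```

The single condition on the last word is enough to cover both noun classes, which reflects the statement that H-yo is a noun-phrase final marker rather than a noun suffix.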
The plural formations found in borrowed elements reflect Tuareg morphology.
They can be of two types. In the first type, there are both changes in
the initial syllable of the noun, and changes elsewhere – either vowel changes
in the stem or suffixes. This is illustrated by the following examples:
(5) singular  plural                singular  plural
    àsáágù    ìsúúgà    < Tuareg  əsegu     isuga     ‘comb’
    táágày    sígàyà n  < Tuareg  tagăyt    šigăyyen  ‘palm frond’
The second type is characterized by the suffix -(t)à n. As in Tuareg, this type is
found with a small set of Tuareg nouns, but it is above all the canonical
way of integrating loanwords from languages other than Tuareg, e.g.
(6) singular  plural
    àládày    àládààyà n   < Hausa àládèè ‘pig’
    àlkìtáb   àlkìtáábà n  < Arabic al-kitaab ‘book’
The etymological split between a class of nouns of Songhay origin, which
exhibits no lexical number marking, and a class of borrowed nouns with
singular–plural distinctions is relatively strong. Only eleven Songhay nouns
were found that receive plurals according to the Tuareg fashion of treating
loanwords (i.e. by suffixing -(t)à n), e.g.
(7) singular  plural
    gwánsì    gwánsìtàn  < Songhay ‘snake’
A certain number of borrowings from Arabic and Hausa are treated as if they
were Songhay nouns, e.g.
(8) singular   plural
    àssàbí     àssàbí-yo     < Arabic ṣabiiy ‘child’
    bàṛqòòní   bàṛqòòní-yo   < Hausa bàrkòònóó ‘pepper’
Another interesting influence from Tuareg is found in the marking of natural
gender. In the Songhay part of the language, the only non-phrasal way of expressing
natural gender is by means of suppletive stems. This is only found
in a few cases; many items which would be gendered in other languages are
left unmarked for gender in Tasawaq. Examples:
(9) àlzírày ‘male or female in-law’ < Songhay
    ízè ‘son, daughter’ < Songhay
    báàbà ‘father’ náànà ‘mother’ < Songhay
    áàrù ‘man’ wây ‘woman’ < Songhay
    báynà ‘male slave’ ṭààmú ‘female slave’ < Songhay
Tuareg has a derivative gender system, in which masculine denotes male ani-
mates and larger (in relation to the corresponding feminine form) inanimates,
while feminine denotes female animates and smaller inanimates. In Tasawaq,
this system is reflected in a consistent manner in borrowings denoting ani-
mates. The size difference with inanimate concrete nouns, on the other hand,
is only reflected in a few lexicalized pairs, and cannot be considered a feature
of Tasawaq grammar. The natural gender difference is illustrated by the fol-
lowing pairs:
(10) àbóóbàz ‘male cousin’ tàbóóbàz ‘female cousin’ < Tuareg
     àgéélìm ‘male orphan’ tàgéélìm ‘female orphan’ < Tuareg
     ááràb ‘Arab man’ tááràb ‘Arab woman’ < Tuareg
4. Adjectives
Songhay languages have a special class of words, which are only used as
nominal modifiers. I will refer to them as “adjectives”. Many of them are de-
rived from verbs by special morphological devices. Tuareg, on the other hand,
has no class of adjectives. The type of modification carried out by adjectives
in Songhay is provided by relative clauses in Tuareg. In subject relatives,
Tuareg uses a special form of the verb, the so-called participle. With stative
verbs, which are very common in Tuareg adjective-like relative clauses, the
participle (m.sg.) is marked by a suffix -ăn.
Tasawaq has retained the Songhay system of modification with adjectives,
most of which are morphologically derived from verbs. There are several
regular formations of adjectives. Adjectives based on monosyllabic verbs (al-
ways with the shape CVC) are formed by lengthening the (underlying) vowel
of the verb and adding a suffix -o. The lexical tone is replaced by a L-H tone
pattern, e.g.
(11) nàq ‘to press’ nààqó ‘pressed’
     fár ‘to open’ fèèró ‘opened’
     qwáṣ ‘to cut’ qòòsó ‘cut (adj.)’
     zìr ‘to wipe’ zììró ‘wiped’
A similar morphological device is found with disyllabic verbs ending in a
vowel. In these verbs, the final vowel is replaced by -o, and the lexical
tone pattern is replaced by a L-H contour, e.g.
(12) síírí ‘to be crooked’ sììró ‘crooked’
     fúmbú ‘to stink’ fùmbó ‘stinking’
     qwárnò ‘to be warm’ qwàrnó ‘warm’
These two related morphological devices reflect Songhay patterns. All verbs
to which they apply have a Songhay origin.
The other verb types have a different morphology, which consists of the
suffixing of -à n, under some circumstances accompanied by vowel lengthen-
ing in the preceding syllable. The lexical tone pattern of the verb is retained.
Verbs of this class include both original Songhay verbs and verbs borrowed
from Tuareg, e.g.
(13) bààráy ‘to change’ bààráyà n ‘changed’ < Songhay
     kàkáy ‘to build’ kàkááyà n ‘built’ < Songhay
     fáṛàṭ ‘to sweep’ fáṛàṭà n ‘swept’ < Tuareg
     yízmàm ‘to squeeze’ yízmàmà n ‘squeezed’ < Tuareg
     gílìllìt ‘to be round’ gìlíllìtà n ‘round’ < Tuareg
     fùsús ‘to be light’ fùsúúsà n ‘light’ < Tuareg
The suffix -à n reflects the masculine singular suffix of the Tuareg ‘participle’
(subject relative verb form). It should be noted that Songhay verbs which
have the required structure for this type of adjective formation are quite rare,
while on the other hand all Tuareg loan-verbs belong to this class. Thus, al-
though the distribution of adjectival formations is ruled by structure and not
by etymology, there is a clear etymological split between the class of -o
final adjectives (all of which have a Songhay background) and the -à n final
adjectives, most of which are based on Tuareg verbs.
5. Verbs
The verb in Tasawaq is entirely Songhay in its structure. Unlike in Tuareg,
mood, aspect and negation are not marked in the verb stem. As in other Songhay
languages, these categories are expressed by portmanteau morphemes,
which immediately precede the verb. The positive Perfective is unmarked.
(14) γá b-sì ṭàkááfùṛ.
     1sg imperfective-speak French
     ‘I speak French.’
Unlike in other Northern Songhay languages, the subject is not obligatorily
expressed by a pronoun when a lexical subject is present. The Tasawaq construction
probably reflects Proto-Songhay at this point, e.g.
(15) sáy í-n nààná-yo sìní …
     well 3pl-of mother-pl say
     ‘And their mothers said …’
Phrases with a lexical subject followed by a subject pronoun are quite fre-
quent, e.g.
(16) àžéémùr à ṇáṣ
     calf 3sg be.fat
     ‘The calf is fat.’
This may represent influence from Tuareg, which has obligatory subject inflection,
or Hausa, where the subject is obligatorily expressed in a portmanteau
morpheme combining pronominal and aspectual information.
Tuareg verbs are borrowed according to two strategies. In about half of the
borrowed verbs, a form without any reflex of Tuareg person–number affixes
is used. This class includes all verbs with more than two syllables as well as
a number of disyllabic verbs, e.g.
(17) gílìllìt ‘to be round’ < Tuareg gələllət ‘be round’
     (Short Imperfective)3
In the other half, the Tuareg 3sg:m prefix y(ə)- is taken over as y(i)- preceding
the verb form. This is only found in disyllabic verbs. In Tasawaq the initial y
does not refer to person, but is part of the verb stem, e.g.
(18) γáy yízmàm       < Tuareg  y-əẓmăm
     1sg press                  3sg:m-press:perfective
     ‘I pressed.’               ‘He pressed.’
In Tuareg, aspect is marked by different vowel patterns in the verb stem. This
provides us with the opportunity of deciding which Tuareg aspectual form
was used as the basis of the Tasawaq verb. With verbs without the y- prefix,
this turns out to be a difficult question, as all kinds of vowel patterns are found
and no specific form can be discerned as the basis of borrowing. With y- ini-
tial verbs, on the other hand, one finds an interesting distribution (see Koss-
mann fc.). Tasawaq y- initial verbs which refer to actions reflect the (Positive)
Perfective aspectual forms of Tuareg. They all share similar vowel patterns
(mainly i ~ a, reflecting the Tuareg Perfective scheme ə ~ ă) and a High–Low
tone pattern (reflecting Tuareg penultimate stress), e.g.
(19) yígmàm ‘to chew tobacco’ < Tuareg ˈy-əgmăm 3sg:m Perfective
     yílmàq ‘to swim’ < Tuareg ˈy-əlmăγ 3sg:m Perfective
     yínḍàb ‘to shoot’ < Tuareg ˈy-ənḍăb 3sg:m Perfective
Tasawaq stative y-initial verbs reflect another Tuareg aspectual form, the Re-
sultative. This is shown by the vowel pattern (mainly i ~ a, reflecting the Tua-
reg Resultative scheme ə ~ a), and by the Low–High tone pattern, reflecting
Tuareg final stress, e.g.
(20) yìgdá ‘to be right’ < Tuareg yəgˈda 3sg:m Resultative
     yìggád ‘to be shy’ < Tuareg yəgˈgad 3sg:m Resultative
     yìláz ‘to be ugly’ < Tuareg yəˈlaz 3sg:m Resultative
Note that in Tuareg every verb may appear in any of the aspectual
stems. In Tasawaq, this aspectual difference has been lexicalized as a differ-
ence in tone class.
Both Tuareg and Songhay use affixation for verb derivation. In Tasawaq
only one Songhay-based verb derivation exists, the causative suffix -ǹdá, for
example,
(21) káání ‘to sleep’ káán-ìndá ‘to put to sleep’ (< Songhay)
In the course of borrowing, many originally derived Tuareg verbs have been
borrowed into Tasawaq. In a few cases both a derived verb and an underived
verb have been borrowed. This is relatively rare, and far from systematic. Tasa-
waq does not have the intricate interaction between underived Songhay-based
verbs and derived Tuareg-based verbs as found in some other Northern Song-
hay languages, such as Tadaksahak (Christiansen and Christiansen 2002).
6. Verbal nouns
The morphology of Tasawaq verbal noun formation is split according to the
etymological origin of the verb. Most Songhay verbs derive their verbal noun
either by zero derivation (in one specific class with tonal change), or by the
suffixation of -yo (tone uncertain), e.g.
(22) verb   verbal noun                          origin
     ḅáq    ḅáq       ‘to break / the breaking’  Songhay
     ḍà n   ḍâ n      ‘to sing / song’           Songhay
     bá n   bá n-yo   ‘to finish / end’          Songhay
Tuareg verbs are taken over together with the corresponding Tuareg verbal
noun. The great deal of variation and irregularity in Tuareg verbal noun for-
mation is reflected in Tasawaq, e.g.
(23) verb       verbal noun                              origin
     fáṛàṭ      àfáṛàṭ      ‘to sweep / sweeping’         Tuareg
     kírìnzìt   àkírìnzì    ‘to claw / clawing’           Tuareg
     yìlkám     àlákàm      ‘to follow / following’       Tuareg
     zìrgín     tàzárgàn    ‘to be dirty / dirtiness’     Tuareg
7. Other word classes
The only obvious influence of Tuareg on the pronominal system of Tasawaq
is the introduction of the reflexive element ímà n ‘self’ (< Tuareg iṃan ‘self,
spirit’), constructed with a genitival phrase (‘his self’), e.g.
(24) γá b-ṣíṛìnkìṭ γá nn ímà n-sí.
     1sg imperfective-comb 1sg of self-to
     ‘I am combing myself.’
The corresponding constructions in Tuareg and in Songhay are similar, both
using a genitival phrase. Songhay languages normally have a noun meaning
‘head’ where Tasawaq uses ímà n.
Numerals 4 to 10, as well as the decades, are borrowed from Arabic; the
numerals 100 and 1000 are borrowings from Tuareg. There is an interesting
syntactic difference between the cardinal numerals from 1 to 19 and the car-
dinal numerals from 20 onward. In the first group, the numeral follows the
head noun, thereby reflecting Songhay syntactic patterns, e.g.
(25) bàngù hínká ‘two wells’

bàngù sábàγà ‘seven wells’
In the second group, the numeral precedes the head noun and is linked to it
by means of a particle ᴴǹ, e.g.
(26) γàssìrín in ṭáṛṛày ‘twenty roads’

xàmsín ìn bàngù ‘fifty wells’
téémàdá n bàngù ‘a hundred wells’
This closely reflects the Tuareg pattern, in which the genitive preposition n
occurs between the higher numeral and the noun.
Different from Tuareg practice, in both constructions the quantified noun
has the singular form (thus ṭáṛṛày ‘road’ rather than ṭáṛṛààyàⁿ ‘roads’).
For most other quantifiers no certain loan origin could be discerned, the
main exception being àlkúl ‘every’, which was borrowed from Arabic.
Simple postpositions all derive from Songhay. A possible exception is the
genitival adposition ᴴn, which in Tasawaq relates a possessor (in first
position) to a possessed (in last position), e.g.
(27) hááwí n gí
cow of grease
‘grease of a cow’
Other Songhay languages mostly use Possessor–Possessed constructions
without a linking adposition. As Tuareg has a genitival preposition n, it is
generally assumed that this was borrowed into Tasawaq (as well as into the
other Northern Songhay languages). In the opinion of the present writer, this
is far from evident. In the first place, the Tuareg construction has the inverse
structure, i.e. Possessed–Possessor. In the second place, the tone pattern of
the Tasawaq adposition remains unexplained if one considers it a borrowing.
I consider a Songhay origin (maybe somehow related to the old Songhay
genitival pronoun wánè ‘that of’) a serious alternative.
Many spatial relationships are expressed by means of complexes contain-
ing a preposition and a nominal element. This nominal element is often a bor-
rowing from Tuareg, e.g.
(28) kùsú nn ámmàs < Tuareg aṃṃas ‘the inside’

pot of inside
‘inside the pot’
A number of coordinating and subordinating particles have been borrowed.
These include most coordinators: kó ‘or’ (< Hausa), mé ‘or’ (< Tuareg), àmmá
‘but’ (< Hausa < Arabic); the preposition ǹdá ‘and, with’ (mainly used in NP
coordination), however, has a Songhay background. Subordinating particles
less commonly have a foreign origin; I have noted wàlá ‘even if’ (< Tuareg
< Arabic) and the compound tún gá ‘because’ (from Hausa tún ‘because’ and
Songhay gá ‘on’).
8. Syntax
All Northern Songhay languages have relatively strict S–Aux–V–O word
order, while most Songhay languages have an alternation of S–Aux–O–V
word order with less frequent S–Aux–V–O word order, mainly found with
low-transitivity two-place verbs (Heath 1999: 161). Because of this differ-
ence in word order it has sometimes been claimed that Northern Songhay
changed its word order under the influence of Tuareg. This claim is difficult
to substantiate. In the first place, Tuareg is a VSO language, so the influence
would have to be indirect. In the second place, S–Aux–V–O is also attested
in Songhay outside the Northern group, in Timbuktu Songhay (Koyra Chiini;
Heath 1999b).
One part of syntax where Tuareg influence is conspicuous is the forma-
tion of relative clauses. In Songhay languages outside Northern Songhay,
relative clauses are marked by means of a relative particle kaŋ. This marker is
quite similar to a relative pronoun: in adpositional relatives the postposition
follows kaŋ, and in most Songhay languages there is no (other) pronominal
reference to the head noun in the relative clause. In Tuareg, a different system
is found. The relative clause is marked by the use of certain verbal forms (the
‘participle’ with subject relatives) and certain syntactic features (esp. clitic
fronting). There is no pronominal reference to the head of the relative clause,
except with subject relatives: the ‘participle’ is marked for number and gen-
der. There exists a three-way opposition between relative clauses neutral to
definiteness, overtly definite relative clauses, and overtly indefinite relative
clauses (Galand 1974). The neutral relative clauses have no overt linking to
the head noun, while the definite and indefinite relatives use different pro-
nominal markers as a linking device.
In Tasawaq there exist two relative constructions, a construction without
relative linker, and a construction which uses a pronominal element à-γó,
lit. ‘this one’. In the construction without linker, the head noun may bear the
normal deictic clitic -γo, ‘this’ when it is definite; when the head noun is in-
definite, no elements come in between the noun and the relative clause. The
head of an à-γó relativization is always indefinite. The following examples of
subject relatives illustrate this:
(29) ààrù-γó [gáw àssáγàl né bí] M à sí.

man-proximal [work work here yesterday] PN 3sg be
‘The man [that worked here yesterday] is M.’ (definite head; no relative
linker)
(30) γáy gùn(á) bàrá-fó [b-gáw àssáγàl].

1sg see man-one [imperfective-work work]
‘I saw a man [who was working].’ (indefinite head; no relative linker)
(31) áfàẓò hún-kàt [àγó fìrízì].

reed grow-ventive [relative green]
‘Green reed grew.’ (indefinite head; relative linker)
The constructions and differentiations are quite similar to Tuareg patterns.

Tuareg patterns are also reflected in the way Tasawaq treats extraction
from a postpositional phrase (for Tuareg cf. Heath 2005: 633ff.). Both Tuareg
and Tasawaq in such cases put the adposition (without pronominal reference)
on the left edge of the relative clause. Depending on the type of relative
clause, this adposition follows a relative particle or stands alone, for
example,
(32) γá b-níⁿ àššááhì [kúná súúkàr à ssí].
1sg imperfective-drink tea [on sugar 3sg be.not]
‘I drink tea without sugar.’ (Lit. ‘I drink tea [on which there is no
sugar]’)
(33) γáy gùn(á) tùgúzì-fóó [àγó gá àssàbí bárà].

1sg see tree-one [relative on child be]
‘I saw a tree [on which there was a child].’
9. Conclusion
Tasawaq is strongly influenced by Tuareg and, to a lesser extent, Arabic and
Hausa, both in its matter and in its patterns. In lexical borrowing, one finds an
interesting feature, which consists of the introduction of foreign morphology
through borrowing. Thus in nominal plural formation and in verbal noun forma-
tion, elements of different etymological origin have completely different mor-
phologies, reflecting their original morphology. Something similar has prob-
ably been the background of the distribution of two adjective formations, one
using Songhay morphology, the other using borrowed Tuareg morphology.
Pattern borrowing is also frequent. As the reconstruction of Songhay syntax
is a difficult matter, it is often hard to decide whether certain patterns
derive from Songhay (with changes due to internal factors), or whether they
have been taken over from Tuareg.
Notes
1. All Tasawaq data in this chapter were collected by the author during fieldwork
in Niger in Fall 2003. The fieldwork was carried out with one single informant,
Mrs. Ibrahim, born Nana Mariama Aweïssou, a school teacher in her twenties,
originally from In-Gall, now resident in Agadez. I wish to thank her for her time
and patience. Mrs. Ibrahim has fluent command of Tasawaq, Hausa and French,
but does not speak Tuareg. Her idiolect is unusual, it seems, in a number of
respects, esp. the presence of pharyngealized consonants, which most speakers
of Tasawaq seem to lack (Robert Nicolaï, p.c.). The chapter was written in
the framework of the NWO (Netherlands Organization for Scientific Research)
research project “Tuareg and the Central Sahelian Languages. A History of
Language Contact”. Other sources on Tasawaq are Alidou (1988), Nicolaï (1979,
1979–84, 1980), Wolff and Alidou (2001). I thank Robert Nicolaï, who gave
me the opportunity to listen to his Tasawaq recordings, which helped me a lot in
clarifying a number of questions. Transcription follows general practice in the
field of African linguistics, with the exception of the following. Nasalization of a
vowel is indicated by a superscript n following the vowel. Nasalization is in most
cases an allophone of the nasal n, but is unsystematic in word-final position.
Pharyngealization is indicated by a subscript dot. Superscript H indicates a float-
ing High tone. In Tuareg transcriptions ă indicates a short low central vowel.
Tone is consistently marked; absence of tone marking reflects the author’s un-
certainty about the transcription.
2. The tone of the final syllable is difficult to hear, and has been left unmarked in
the examples. The application of the preceding floating High tone, on the other
hand, is well established.
3. For ease of reference, I adopt the terminology in Heath (2005) for the different
aspectual stems. Other researchers on Tuareg use different terms.
References
Alidou, Ousseïna
1988 Tasawaq d’In-Gall. Esquisse linguistique d’une langue dite “mixte”.
Mémoire d’Études et de Recherches sous la direction de Prof. Dr.
Ekkehard Wolff, Université de Niamey.
Bender, M. Lionel
1997 The Nilo-Saharan Languages. A Comparative Essay. Munich: Lin-
com.
Bernus, Edmond, and Suzanne Bernus
1972 Du sel et des dattes. Introduction à l’étude de la communauté d’In
Gall et de Tegidda-n-tesemt. Études nigériennes 31. Niamey: Centre
Nigérien de Recherches en Sciences Humaines.
Christiansen, Niels, and Regula Christiansen
2002 Some verb morphology features of Tadaksahak. SIL Electronic Working
Papers. (http://www.sil.org/silewp/abstract.asp?ref=2002005).
Galand, Lionel
1974 Défini, indéfini, non-défini: Les supports de détermination en touareg.
Bulletin de la Société de Linguistique de Paris 69 (1): 205–224.
Heath, Jeffrey
1999a A Grammar of Koyraboro (Koroboro) Senni. Cologne: Rüdiger
Köppe.
1999b A Grammar of Koyra Chiini. Berlin/New York: Mouton de Gruyter.
2005 Tamasheq (Tuareg of Mali). Berlin/New York: Mouton de Gruyter.
Kossmann, Maarten
Forthc. The borrowing of aspect as lexical tone: y-initial Tuareg verbs in Tasa-
waq (Northern Songhay). To appear in Studies in African Linguistics.
Newman, Paul
2000 The Hausa Language: An Encyclopedic Reference Grammar. New
Haven/London: Yale University Press.
Nicolaï, Robert
1979 Le songhay septentrional (études phonématiques). Bulletin de l’IFAN,
41, série B; part 1: 304–370; 539–567; 829–866.
1979–84 Sur la phonologie des langues “mixtes” du songhay septentrional.
Comptes rendus du GLECS, 24–28: 395–412.
1980 Le songhay septentrional (études prosodiques). In: Itinérances … en
pays peul et ailleurs. Mélanges réunis à la mémoire de Pierre François
Lacroix, I, 261–289. Paris: Société des Africanistes.
1990 Parentés linguistiques (à propos du songhay). Paris: Éditions du
CNRS.
Prasse, Karl-G., Ghoubeïd Alojaly, and Ghabdouane Mohamed
2003 Dictionnaire touareg-français (Niger). Copenhagen: Museum Tuscu-
lanum Press.
Sidibé, Alimata
2002 Analyse critique de quelques opinions sur l’idiome des isawaghan: Le
tasawaq. Mu ƙara sani. Revue de l’Institut de Recherches en Sciences
Humaines (Université Abdou Moumouni, Niamey) 10/12: 185–197.
Wolff, H. Ekkehard, and Ousseïna Alidou
2001 On the non-linear ancestry of Tasawaq (Niger). Or: how “mixed” can
a language be? In: Derek Nurse (ed.), Historical Language Contact
in Africa, 523574. (Special volume of Sprache und Geschichte in
Afrika 16/17).
Grammatical borrowing in K’abeena
Joachim Crass
1. Background
K’abeena is a Highland East Cushitic (HEC) language spoken by some 35,000
people living in and around the town of Wolkite. This small town is situated
some 160 km southwest of Addis Ababa, the capital of Ethiopia, at the
northwestern edge of the Gurage region. The Gurage region, an area settled by
people speaking South-Ethiosemitic languages, is surrounded by areas that are
inhabited by people speaking Cushitic languages. These Cushitic languages are
– apart from K’abeena – Alaaba, Hadiyya, Libido, and Oromo. The languages most
closely related to K’abeena are Alaaba, Kambaata and T’imbaaro. These
four languages form one subgroup of HEC. The other four subgroups are (1)
Burji, (2) Gedeo, (3) Sidaama, and (4) Hadiyya and Libido.
K’abeena is a member of the Ethiopian linguistic area (ELA). It is in con-
tact mainly with the South-Ethiosemitic languages Amharic, Chaha, Ezha,
Muher, Wolane and with the Lowland East Cushitic language Oromo. Oromo
is spoken to the north, Chaha to the south, Ezha to the southeast and Muher
and Wolane to the east of the K’abeena speaking area. However, this is only
an approximation. In fact, the main town Wolkite and the surrounding rural
area are multi-ethnic, i.e. no clear boundaries can be drawn between the
settlement areas of the various groups. Almost all K’abeena speak Amharic as a second
language, which is the lingua franca of Ethiopian towns (Meyer and Richter
2003) and of the Gurage region. The other contact languages are known by in-
dividuals who live or grew up in the respective contact area. Many K’abeena
speak three or four languages.
The South-Ethiosemitic languages belong to different subgroups of South-
West-Semitic. Amharic belongs to one branch of Transversal South Ethiopic,
Wolane to the second branch of Transversal South Ethiopic, called East
Gurage. Chaha, Ezha and Muher belong to different groups of Outer South
Ethiopic (Faber 1997: 6). HEC and the Gurage languages form a sub-area of the
ELA, which we call the HEC-Gurage sub-area. The existence of this sub-area
was proposed by Zaborski (1991), who calls it the Gurage-Sidamo sub-area.
Unfortunately, Zaborski does not discuss its features (for recent discussion cf.
Bisang 2006, Crass and Bisang 2004).
The K’abeena reached their present settlement area in the second half of the
nineteenth century, after splitting off from the Alaaba (Braukämper 1980).
Culturally the K’abeena do not differ considerably from the people speak-
ing Ethiosemitic languages. The staple food is Ensete edulis, known as “false
banana”. Whereas the K’abeena are exclusively Muslims, the Ethiosemitic
peoples are mainly but not exclusively either Christians or Muslims.
This chapter deals with features found in K’abeena and other languages
of the area. However, in most of the cases it is difficult to define the source
of these features. Several features represent grammaticalization processes or
pattern-replication. However, this does not mean that these features should
be excluded as contact-induced features. Especially in the case of rare or un-
attested grammaticalizations, contact-induced change is one possible way of
explaining the similarities (cf. Bisang 1996, Heine 1994, Heine and Kuteva
2003). Furthermore it is not possible to decide whether these features belong
to the ELA or to the HEC-Gurage sub-area of the ELA. The other languages
dealt with in this chapter are Amharic, Libido, Oromo, Muher, Wolane, Gumär
(Outer South Ethiopic) and Zay (East Gurage). Gumär and Zay are not direct
contact languages of K’abeena but they are members of the HEC-Gurage sub-
area. The examples to illustrate the features are all taken from K’abeena. The
data of all languages is an outcome of research conducted by the author of this
chapter and by Ronny Meyer.1 The research was initiated by Crass in under-
taking a comparison between K’abeena and Amharic. It later was extended to
other languages of the HEC-Gurage sub-area. Crass provided data on Libido,
Meyer provided data on Gumär, Muher, Wolane, Zay, and Oromo. The find-
ings of this research will be published in Crass and Meyer (2007).
2. Phonology
Within the sound system of K’abeena, /p/ was most probably introduced
through contact with Ethiosemitic languages via loanwords, to which this
marginal phoneme seems to be restricted. The same is true for [ä] which I do not
consider to be a phoneme in K’abeena; rather, it is a phonetic variant of /a/. The
distribution is not easy to formulate (for a discussion cf. Crass 2005: 25f.). A
phonological feature of the ELA is the presence of ejectives (for extensive dis-
cussion cf. Crass 2002). K’abeena possesses four ejectives, namely the ejec-
tive plosives /p’/, /t’/ and /k’/ and the ejective affricate /c’/. For Proto-HEC
only /k’/ is reconstructed by Hudson (1989: 11). On the basis of this finding,
one can suppose that ejectives were introduced into the consonant inventory
in the course of time, possibly due to contact. However, there is no evidence that
ejectives were introduced in K’abeena in recent times. If Hudson’s reconstruction
holds true, the introduction of ejectives must be very old.
Palatalization of alveolar consonants to post-alveolar or palatal consonants
as a morphophonological process is another areal feature, which might have
spread due to contact. In K’abeena the following palatalizations occur: alveo-
lar plosive /t/ to post-alveolar affricate /c/, alveolar plosive /d/ to post-alveolar
affricate /j/, alveolar ejective plosive /t’/ to post-alveolar ejective affricate /c’/,
alveolar fricative /s/ to post-alveolar fricative /sh/, alveolar fricative /z/ to post-
alveolar fricative /zh/, and the alveolar nasal /n/ to the palatal nasal /ñ/. In Am-
haric, in addition, the alveolar lateral /l/ is palatalized to the approximant /y/.
The ablative case marker, which itself is not considered to be a result of lan-
guage contact, is used with verbs to form ‘since’-temporal and real condi-
tional clauses in all languages except Oromo (for the grammaticalization of
an ablative case marker to a ‘since’-temporal clause marker, see Haspelmath
1997: 66ff., Heine and Kuteva 2002: 35). In Muher and Gumär the use of the
ablative case marker is restricted to ‘since’-temporal clauses. In K’abeena
the ablative case marker is the suffix -VVcci. Example (1) includes the func-
tion as ablative case marker and as marker for ‘since’-temporal clauses, ex-
ample (2) the function as real conditional clause marker.
(1) jarman-íicci ’ameeccoomm-íicci kabare.

Germany-abl come.prv.1s-abl today.gen
’agana saaminta ’ikko.
month.acc week.acc be.prv.3s.m
‘It is five weeks since I came from Germany.’
(2) c’aata k’ama’yoomm-íicci ’óssuti ’affaa-’e-ba.

khat.acc chew.prv.1s-abl sleep.nom hold.ipv.3s.f-1s-neg
‘If I chew khat I cannot sleep.’
The past tense marker is used in K’abeena and in all other investigated
languages of the area to mark the apodosis of irreal or counterfactual conditional
clauses. This is remarkable because past markers mostly occur in the protasis
(Fleischman 1989: f.). In K’abeena the past tense marker is the suffix -kk’i.
Example (3) shows the function as past tense marker, examples (4) and (5) the
function as a marker of the apodosis.
(3) ’anni bokkóoni ’agáre-’e yiye-he

father.gen house.loc wait.imp.s-1s say.cnv.1s-2s
’oro’yoommi-kk’i-ba-indo?
go.prv.1s-pst-neg-q
‘Didn’t I leave by saying to you: “Wait in the house of my father!”?’
(4) beréta t’eenoo ’ubbo-ba-’ikkáani t’aafaa

yesterday rain.nom fall.prv.3s.m-neg-cnd Tef.acc
’udunnáammi-kk’i.
thresh.ipv.1s-pst
‘If it had not rained yesterday we would have threshed Tef.’
When no adverb indicates tense, a past or a non-past interpretation is
possible.
(5) t’eenoo ’ubbo-ba-ikkáani t’uma-ha-kk’i.

rain.nom fall.prv.3s.m-neg-cnd good-cop-pst
‘It would have been good if it did not rain.’
‘It would be good if it did not rain.’
The prospective aspect (cf. Comrie 1976: 64f.) is marked in K’abeena by a
verbal noun in the dative (cf. example (6)) and in Ethiosemitic languages by
a verbal noun with possessive suffixes. In both cases the verbal noun is fol-
lowed by a copula. In Ethiosemitic languages two morpheme orders occur.
Either the copula precedes the possessive suffix (e.g. Wolane) or it follows
the possessive suffix (e.g. Muher).
(6) ’áni timhirtíta shuuliiháa-ti.

1s.nom study.acc finish.vn.dat-cop
‘I am about to finish my studies.’
The prospective aspect is formally distinguished from the intentional. The latter
is expressed with a subordinate clause followed by a copula. In K’abeena, the
predicate of the subordinate clause is a converb expressing purpose (example
(7)). In Ethiosemitic languages the non-independent imperfective is marked
either with a purpose or a locative marker.
(7) hokkoppaati ’intotáa-ti.

afternoon.snack.acc eat.cnv.purp.1p-cop
‘We intend to eat our afternoon snack.’
A formal distinction between prospective aspect and intentional is found in
K’abeena, Amharic, Gumär, Muher, Wolane, and Zay but not in Libido and
Oromo. In the latter two languages the prospective aspect and the intentional
are not differentiated morphosyntactically. Libido uses a converb plus a cop-
ula for both categories, i.e. the same construction used in K’abeena only for
the prospective aspect. In Oromo, however, a construction consisting of a ver-
bal noun in the dative or a verbal noun marked with a possessive morpheme
followed by a copula is used.
The experiential perfect is expressed by a construction including the re-
spective verb ‘know’ in the main clause and its complement expressed by a
converb. The latter expresses the event experienced by the subject.
(8) ’ameerikáani ’orootéeni kasseenta-’i? ’ee, ameerikáani

America.loc go.cnv.2p know.prv.2p-q yes America.loc
’oróo’ni kansóommi.
go.cnv.1p know.prv.1p
‘Have you ever been in America?’ ‘Yes, we have [the experience to
have] been in America.’
All languages except Oromo lack a verb ‘have’. To express possession the
respective verb ‘to exist, to be present’ is used. The possessor is marked with
the dative case, the possessum with the nominative.
(9) kii bíkku c’úulu yóo-’e.

2s.gen size child.nom exist.3-1s.obj
‘I have a child of the same age/size as yours.’
This construction is used in all languages except Libido to express obligation,
too. In this case the subject is a verbal noun. Libido cannot make use of this
construction, because it does not have object agreement markers.
(10) ’oró’u yóo-’e.

go.vn.nom exist.3-1s.obj
‘I have to go.’
Converbs occur in many languages of the ELA. Converbs typically
cross-reference the subject. According to Tosco,
the presence of the converb in most Ethiopic Semitic languages is probably
the result of old Cushitic influence … Within Cushitic, the converb is typ-
ically found in the languages of the highlands … It is likewise found in the
Omotic languages of the highlands, … and can therefore be considered a
genuine areal feature. (Tosco 2000: 345)
The number of converbs differs considerably in the languages. Whereas in
Ethiosemitic languages generally one or two converbs occur, the number of
converbs in HEC languages is much higher. In K’abeena two mainly sequen-
tial converbs, one progressive converb, one negative converb, one affirmative
and one negative purpose converb occur.
Furthermore, K’abeena marks objects on the verb according to the pri-
mary/secondary object pattern, i.e. in the case of ditransitive verbs the re-
cipient is marked on the verb, not the theme. This pattern is attested also for
Ethiosemitic languages.
(Ethio-)Semitic verbs are integrated into K’abeena on the basis of the 3s.m
of the perfective, which is the citation form of verbs in Semitic languages.
This is remarkable because the 3s.m is the citation form only according to
scientific tradition, not in the language use. Normally, the citation form is the
verbal noun. However, the 3s.m seems to have a similar function in the lan-
guage use. The final vowel in (Ethio-)Semitic verbs, which marks the 3s.m,
is replaced by the suffix -u which marks verbal nouns, the citation form of
verbs in K’abeena. For example, in the Amharic verb kättäbä ‘he vaccinated’
the final vowel -ä is replaced by the suffix -u, so that the citation form in
K’abeena is kattabu ‘vaccinate’. Other examples are K’AB ’anabbabu ‘read’
from AMH anäbbäbä, K’AB tamaaru ‘learn’ from AMH tämarä, K’AB t’aafu
‘write’ from AMH ts’afä or t’afä. When the second radical of a triconsonantal
verb is not geminated in K’abeena, the verb is definitely not borrowed
from Amharic, because in this language the second consonant is always gem-
inated in the perfective. The source can only be one of the Gurage languages.
However, in these languages, the situation is not homogeneous. Consonant
gemination vs. non-gemination is a feature which yields different verb classes.
Furthermore, in some cases, languages can be excluded as source, because
a sound shift took place. An example is the K’abeena verb faradu ‘judge’. In
Dobbi, Muher, Mäsqan and Soddo (like in Amharic) the verb is färrädä. In
Chaha the cognate verb is fänädä, in Ezha fännädä and in Endegegn, Silt’e,
Wolane and Zay färädä (Leslau 1979: 241). Since Wolane is the only one of these
that is a contact language of K’abeena, the verb was most probably borrowed from Wolane.
The source of verbs of the religious domain is often unclear. They may be
borrowed either directly from Arabic or via Amharic or another contact lan-
guage. An example is saggadu ‘pray’, borrowed from Amharic säggädä or
another language with gemination of this verb in the perfective. Here Arabic
can be excluded as the direct source language because the verb is realized as
sadžada, i.e. without gemination of the second consonant, which in addition
is a palatal affricate and not a velar plosive. In other cases the source is not
clear. The K’abeena verbs katabu ‘write’ and k’ara’u ‘read’ might be borrowed
directly from the Arabic verbs kataba and qara’a. The two verbs katabu
and k’ara’u are used mainly in religious contexts or by religious people, the
verbs t’aafu ‘write’ and ’anabbabu ‘read’, borrowed from Amharic, in non-
religious contexts.
Complementizers are grammaticalized out of a similative marker, which
is not considered to be the result of language contact. Primarily they mark
complement clauses (example (11)) but they may be used to mark affirmative
and negative purpose clauses as well (examples (12) and (13)). This feature
is found in K’abeena, Libido, Oromo, Amharic, Gumär, Muher, Wolane,
and Zay. However, in none of these languages is it the main means of marking
purpose clauses.
(11) moggó-gga híilu ríccu yoo-ba.

theft-simil bad thing.nom exist.3-neg
‘There is nothing being as bad as theft.’
(12) shukúru ga’áta ’ameetánu-gga dagáammi.

Shukur.nom tomorrow come.ipv.3s.m-simil know.ipv.1s
‘I know that Shukur will come tomorrow.’
(13) ’áti k’ama’áanti-gga k’aawwáanka-si buuru

2s.nom drink.ipv.2s-simil coffee.loc-3s.m butter.acc
wartó-si.
put.prv.3s.f-3s.m
‘She put butter into the coffee in order that you may drink it.’
According to Heine and Kuteva (2002: 91) the “directionality proposed here
[i.e. the grammaticalization of a complementizer to a marker of purpose
clauses, J.C.] has not yet been established beyond reasonable doubt. More
data to substantiate this hypothesis are required”. The fact that the
grammaticalization of a complementizer to a marker of purpose clauses occurs in the
languages of the HEC-Gurage sub-area shows that this grammaticalization
is more frequent than has been assumed. However, its overall rarity
makes it reasonable to consider its occurrence in this area to be due to
language contact.
The (Ethio-) Semitic noun wäk’t/wak’t ‘time’ is used in K’abeena as the
head of a relative clause to form a temporal clause.
(14) kesa da’iyoommi wak’ti tassh yiyo-’e.

2s.acc meet.prv.1s.rel time.acc ideo say.prv.3s.m-1s.obj
‘I was happy when I met you.’
The Ethio-Semitic noun məknəyat ‘reason’ is used as the head of a relative
clause to form a causal clause.
(15) ’ámru dunkiyo-’i mikiññaati daggoom-ba.

Amru.nom be.late.prv.3s.m-rel reason.acc know.prv.1s-neg
‘I don’t know why Amru was late.’
The (Ethio-)Semitic noun säbäb/sabab ‘reason’ is used as the head of a relative
clause to form a causal clause. The head noun of the relative clause is the
subject of a complement clause.
(16) c’úulu-’i harafáta ’ameetu’náani

child.nom-1s.pss eid.al.adha come.neg.cnv.3s.m
fakk’oo sabábati ma ’ikkó-gga
stay.away.prv.3s.m.rel reason.nom what.acc be.prv.3s.f-cmpl
zaaññiyóommi.
not.know.prv.1s
‘I don’t know what was the reason why my child did not come to the
Eid al-Adha.’
In one case the Amharic denominal derivational suffix -äñña producing adjec-
tives (and nomina agentis) is borrowed in K’abeena. Suffixed to the K’abeena
verb ’it-u ‘eat’ it yields ’itañña ‘glutton, stodger’.
K’abeena and most of the other languages extensively use ideophones.
These ideophones are verbalized by the respective verb ‘say’. Partly, the in-
ventory of ideophones is identical in form and meaning in K’abeena and its
contact languages. In the following, I list a few ideophones of K’abeena and
Amharic, including the respective citation form of the verb ‘say’ which is in
K’abeena the verbal noun yu and in Amharic – according to scientific trad-
ition, not in the language use – the 3s.m of the perfective alä. K’AB ’illimm yu,
AMH əlləmm alä ‘disappear suddenly, vanish, disappear from sight’; K’AB
bogg yu, AMH bogg alä ‘flare, blaze, appear suddenly (light)’; K’AB sillimm
yu, AMH səlləmm alä ‘fall into a swoon, be in a coma’, K’AB will yu, AMH
wəll alä ‘desire or crave something (food, drink, smoking), have a momentary
strong desire for something’ (the English translations are from Kane 1990).
6. Constituent order
SOV word order is an areal feature of the ELA. Most scholars consider Ethi-
osemitic languages to have shifted their word order because of intensive con-
tact with Cushitic, especially central Cushitic languages.
7. Syntax
Another feature widespread in the languages of the area is that
copulas differ in main and subordinate clauses. In K’abeena two types
of main clause copulas occur (cf. Crass 2003). One type is gender inflected,
namely -ha for masculine and -ta for feminine. The other type of copula,
namely -ti, is uninflected. In subordinate clauses (relative clauses, comple-
ment clauses, adverbial clauses) the fully inflected verb ’ihu ‘be’ occurs.
(17) ’ísu rosisaanco-ha.

3s.m teacher.acc-cop.m
‘He is a teacher.’
(18) gu’mára halaale ’ikko-’i riccu hasaawwiyye!

always true be.prv.3s.m-rel thing.acc talk.imp.p
‘Always tell the truth!’ [lit. ‘Always tell things which are true!’]
8. Lexicon
The ELA is characterized by several interesting types of lexicalizations.

Hayward (1991) distinguishes three categories of lexicalizations, which he
exemplifies with data on Amharic, Oromo and Gamo, an Omotic language
cluster spoken in South West Ethiopia. According to Hayward (1991: 140)
these lexicalizations reinforce “the very real cultural unity of Ethiopia” (cf.
Hayward 2000). The three categories are (1) single-sense lexicalizations, (2)
lexicalizations with two or more distinct senses and (3) lexicalizations involv-
ing similar derivations. The first category comprises “single-sense lexicaliza-
tions of typically indigenous concepts”, the second category lexicalizations
“showing inter-linguistic matching across the three languages” and the third
category lexicalizations with a “similar (parallel) ‘derivational pathway’.” To
the first category belong mainly nouns such as lexical items for seasons of
the year, categories of terrain, categories of dung/excrement, super-categories
for birds, types of borrowing and skin colour classification of people of the
region. Furthermore, this category includes the suppletive imperative of the
verb ‘to come’ (Ferguson’s feature G17) and particles with the meaning ‘Take
this!’ which have no obvious etymological relationship to a verb. The second
category, lexicalizations with two or more distinct senses, consists
predominantly of verbs and some nouns. For example, the respective verbs with
the basic meaning ‘hold, catch’ have the secondary meaning ‘start, begin’, and
the respective verbs with the basic meaning ‘play’ have the secondary meaning
‘chat’. The third category includes verbal derivations, compound verbs, i.e.
ideophones verbalized by the respective verb ‘say’, possessive constructions
including two NPs, and idiomatic expressions. Examples of verbal derivations
are the causative of the respective verb ‘want’ having the meaning
‘need’, the causative of the respective verb ‘enter’ having the meaning ‘marry’
and the causative of the respective verb ‘pass the night’ having the meaning
‘administer’. Compound verbs are ‘become silent’, ‘hurry up’ and ‘jump up
suddenly’. Possessive constructions including two NPs have a literal
meaning and a metaphorical meaning. Examples are ‘son of man/people’ with
the meaning ‘mankind, human being’ and ‘land of man/people’ with the
meaning ‘foreign country’. Idiomatic expressions include ‘regain/recover control,
take courage’, composed of the noun ‘heart’ and the verb ‘return (intr.)’,
and ‘catch cold’, in which the noun ‘cold’ is the subject and the experiencer
is the object of the verb ‘catch’.
K’abeena shares at least some of these lexicalizations. I deal only with
some lexicalizations of Hayward’s second and third categories, which I consider
especially interesting. The verb ’afu has the basic meaning ‘hold, catch’
and the secondary meaning ‘begin, start’; the verb ’alapp’u ‘play’ has the
secondary meaning ‘chat’. Hayward’s examples of verbal derivations as part of
the third category are also attested in K’abeena. The causative of the verb hasu
‘want’ is hasisu ‘need, be necessary’, the causative of the verb ’a’yu ‘enter’
is ’a’isu ‘marry’ and the causative of the verb garu ‘pass the night’ is gasshu
‘administer’.
Most of the borrowed nouns denote cultural goods of different kinds
(food, clothes, topics related to Islam); some are abstract nouns.
Examples of abstract nouns are K’AB suusi from AMH sus ‘mania, passion,
rage’ (cf. German Sucht), K’AB ’umuri from ARAB ʿum(u)r ‘age’, K’AB keerti /
hayraati from ARAB khayr ‘good, benefit’, and K’AB haali from ARAB
ḥaal ‘situation’ (Crass 2005: 52–60). Examples of nouns that mainly reflect
Muslim cultural heritage are K’AB maskiida ‘mosque’ from AMH mäsgid, K’AB
sheet’aani ‘devil’ from AMH säyt’an or ARAB shayt’an, K’AB t’uuri ‘punishment
by God which befalls a wrongdoer’ from AMH t’ur, and K’AB kitaaba
‘book’ from ARAB kitaab. Examples from the domain of vegetation are K’AB
ababa ‘flower’ from AMH abäba and K’AB baarzaafi ‘eucalyptus tree’ from AMH
bahər zaf (AMH bahər zaf literally means ‘tree of the sea/ocean’; the lexeme
bahər in compounds often designates the foreign origin of an item, Kane 1990:
855). An example from geography is K’AB alamíta ‘world’ from AMH aläm.
Examples of tools are K’AB akaafa ‘shovel’ from AMH akafa, K’AB billaawa
‘(kind of) knife’ from AMH billawa, and K’AB faasa ‘axe’ from AMH fas.
An example of a mineral is K’AB work’a ‘gold’ from AMH wärk’. Examples
of body parts (and related abstract nouns) are K’AB angooli ‘brain’ from
AMH ang(w)äli [ango:li], K’AB k’albi ‘heart, mind, intelligence, reason’
from AMH k’älb, K’AB nafséeta/nabséeta/nafsíta/nabsíta (-éeta and -íta are
different inflection classes of the noun) ‘soul, life’ from AMH näfs, and K’AB sabri
‘patience’ from ARAB sabr. A further example of a cultural good
is K’AB birati ‘metal’ from AMH bərät. Furthermore, there are contact
phenomena in areal cultural vocabulary, especially in relation to the Ensete
plant (cf. Crass and Meyer 2005).
Other word classes affected are interjections, e.g. K’AB ciff, AMH cəff, an
interjection to chase away cats, and fillers, e.g. K’AB bal, AMH bäl ‘well’. In the case
of interjections to call or chase away animals, a remarkable correspondence is
attested in many languages of the area.
An interesting case is the K’abeena abstract noun ma-ricc-oomáta ‘es-
sence, nature’. It has the following structure: The question word ma ‘what’ is
followed by the noun ricc-u ‘thing’ yielding “what-thing”. To this the abstract
noun suffix -oomáta is added. In Amharic, the structure of the respective ab-
stract noun mənənnät is similar. Here the question word mən ‘what’ is derived
by a suffix for abstract nouns, namely -(ə)nnät.
Greetings are expressed by identical patterns, namely by questions. In
order to greet somebody in the morning, one says in K’abeena garee gal-ti
‘Did you pass the night well?’, i.e. garee ‘well’ precedes the verb gal-u ‘pass
the night’, inflected for the second-person singular. The equivalent expression
in Amharic is dähna addär-k, i.e. dähna ‘well’ precedes addär-ä ‘he passed
the night’, in this example inflected for the second-person singular mascu-
line. In the evening one asks in K’abeena garee hos-si ‘Did you pass the day
well?’. The equivalent in Amharic is dähna wal-k.
9. Conclusion
K’abeena shares many phonological, grammatical and lexical features with
other languages spoken in the highlands of Ethiopia. This fact makes it possible
to refer to this area as the ELA. The core area comprises the languages of the
highlands of Ethiopia. The more distant a given language is from this
core area, the fewer of these features it has. However, the areal status of
individual features is not generally accepted (for discussion cf. Crass 2002, Crass
and Bisang 2004, Tosco 2000, Zaborski 1991). Furthermore, the existence of
the ELA is denied by Tosco (2000). However, since only a relatively small
number of languages are described adequately, these findings must be con-
sidered preliminary.
Abbreviations
AMH    Amharic                    cnd    conditional
acc    accusative                 cnv    converb
abl    ablative                   cop    copula
ARAB   Arabic                     ELA    Ethiopian linguistic area
cmpl   complementizer             f      feminine
gen    genitive                   prv    perfective
HEC    Highland East Cushitic     pss    possessive
ideo   ideophone                  pst    past
imp    imperative                 purp   purpose
ipv    imperfective               q      question
K’AB   K’abeena                   rel    relative clause
loc    locative                   s      singular
m      masculine                  simil  similative
neg    negative                   vn     verbal noun
nom    nominative                 1      first person
obj    object agreement           2      second person
p      plural                     3      third person
Note
1. This research was undertaken in the scope of the Collaborative Research Center
295 Cultural and linguistic contacts: Processes of change in North Eastern
Africa and West Asia (Sonderforschungsbereich 295 Kulturelle und sprachliche
Kontakte: Prozesse des Wandels in historischen Spannungsfeldern Nordostafri-
kas/Westasiens).
References
Bisang, Walter
1996 Areal typology and grammaticalization: Processes of grammatical-
ization based on nouns and verbs in east and mainland south east Asian
languages. Studies in Language 20 (3): 519–597.
2006 Linguistic areas, language contact and typology: Some implications
from the case of Ethiopia as a linguistic area. In: Yaron Matras, April
McMahon and Nigel Vincent (eds.), Linguistic areas: Convergence in
historical and typological perspective, 75–98. Houndmills: Palgrave
Macmillan.
Braukämper, Ulrich
1980 Geschichte der Hadiya Süd-Äthiopiens (Studien zur Kulturkunde 50).
Wiesbaden: Steiner.
Crass, Joachim
2002 Ejectives and pharyngeal fricatives: Two features of the Ethiopian
language area. In: Baye Yimam, Richard Pankhurst, David Chapple,
Yonas Admasu, Alula Pankhurst, and Birhanu Teferra (eds.), Ethiopian
studies at the end of the second millennium. Proceedings of the 14th
International Conference of Ethiopian Studies, 6–11 Nov. 2000, 1679–1691.
Addis Ababa. Addis Ababa: Institute of Ethiopian Studies.
2003 The copulas of K’abeena: Form, function, and origin. Afrika und
Übersee 86: 23–42.
2005 Das K’abeena. Deskriptive Grammatik einer hochlandostkuschi-
tischen Sprache (Cushitic Language Studies 23). Köln: Köppe.
Crass, Joachim, and Walter Bisang
2004 Einige Bemerkungen zum äthiopischen Sprachbund und ihre Relevanz
für die Areallinguistik. In: Walter Bisang, Thomas Bierschenk, Detlev
Kreikenbom, and Ursula Verhoeven (eds.), Kultur, Sprache, Kontakt,
169198. Würzburg: Ergon.
Crass, Joachim, and Ronny Meyer
2005 Die Komplexität kultureller und sprachlicher Kontakte am Beispiel der
Nomenklatur zur Ensete-Pflanze. In: Walter Bisang, Detlev Kreiken-
bom, and Ursula Verhoeven (eds.) Prozesse des Wandels in historischen
Spannungsfeldern Nordostafrikas/Westasiens. Akten zum 2. Symposium
des SFB 295, 15.10.–17.10.2001, 411–427. Würzburg: Ergon.
2007 Ethiopia. In: Bernd Heine and Derek Nurse (eds.), A Linguistic
Geography of Africa, 228–249. Cambridge: Cambridge University Press.
Faber, Alice
1997 The genetic subgrouping of the Semitic languages. In: Robert Hetzron
(ed.), The Semitic Languages, 3–15. London, New York: Routledge.
Fleischman, Suzanne
1989 Temporal distance: A basic linguistic metaphor. Studies in Language
13: 150.
Haspelmath, Martin
1997 From Space to Time: Temporal Adverbials in the World’s Languages.
München, Newcastle: Lincom.
Hayward, Richard
1991 À propos patterns of lexicalization in the Ethiopian language area. In:
D. Mendel and U. Claudi (eds.), Ägypten im afro-asiatischen Kontext.
Aufsätze zur Archäologie, Geschichte und Sprache eines unbegrenzten
Raumes. Gedenkschrift Peter Behrens, 139–156. (Afrikanistische
Arbeitspapiere. Sondernummer 1991). Köln: Institut für Afrikanistik.
2000 Is there a metric for convergence? In: Colin Renfrew, April McMahon,
and Larry Trask (eds.), Time Depth in Historical Linguistics Vol.1,
621640. Cambridge: McDonald Institute for Archeological Re-
search.
Heine, Bernd
1994 Areal influence on grammaticalization. In: Martin Pütz (ed.), Language
contact and language conflict, 55–68. Amsterdam/Philadelphia:
John Benjamins.
Heine, Bernd, and Tania Kuteva

2002 World Lexicon of Grammaticalization. Cambridge: Cambridge Uni-
versity Press.
2003 On contact-induced grammaticalization. Studies in Language 27 (3):
529572.
Kane, Thomas Leiper
1990 Amharic-English dictionary. Wiesbaden: Harrassowitz.
Leslau, Wolf
1979 Etymological Dictionary of Gurage (Ethiopic). Volume III. Etymo-
logical Section. Wiesbaden: Harrassowitz.
Meyer, Ronny, and Renate Richter
2003 Language Use in Ethiopia from a Network Perspective (Research in
African Studies 7). Frankfurt: Lang.
Tosco, Mauro
2000 Is there an “Ethiopian language area”? Anthropological Linguistics 42
(3): 329365.
Zaborski, Andrzej
1991 Ethiopian language subareas. In: Stanislaw Pilaszewicz and Eugeniusz
Rzewuski (eds.), Unwritten Testimonies of the African Past. Proceed-
ings of the International Symposium Held in Ojrzanów N. Warsaw on
7–8 November 1989, 123–134 (Orientalia Varsoviensia 2). Warsaw:
Wydawnictwa Uniwersytetu Warszawskiego.
Grammatical borrowing in Likpe (Sɛkpɛlé)
Felix K. Ameka
1. Background
Sɛkpɛlé is the auto-denomination of the language spoken in the area known
as Likpe, which lies to the east and north-east of Hohoe (the district capital and
an Ewe (Gbe) speaking town) and extends as far as the Togo border in the northern
part of the Volta Region of Ghana (see Map 1). There are approximately 23,000
residents in the area who speak the language (District Assembly Representa-
tive 1998 figures). A small percentage, living in the two modern migrant vil-
lages, speaks the language as a second language. If we add the other native
speakers in the diaspora, there may well be more than 30,000 speakers of the
language today.
Sɛkpɛlé is one of the fourteen languages most recently characterized as
Ghana–Togo–Mountain (GTM) languages (Ring 1995) that were first recog-
nized as a group and referred to as Togorestsprachen by Struck (1912) and,
in English, as “Togo Remnant languages”,1 for instance by Westermann and
Bryan (1952: 96). Their genetic classification and cultural history have re-
mained controversial (see e.g. Nugent 1997, 2005). The distinctive typologic-
al features of these languages that separate them from the surrounding lan-
guages such as Ewe and Akan include (i) the active noun class system similar
to that of the Bantu languages which they have inherited, undoubtedly, from
Proto-Niger-Congo, (ii) head marking at the clause level through subject
cross-referencing on the verb, and (iii) their highly agglutinative nature, especially
in their derivational verb morphology. Some scholars wonder whether the
group of languages is not a socio-cultural or a typological grouping masquer-
ading as a genetic unit (Blench 2001, Maddieson 1998). Nevertheless, they
are classified as belonging to Kwa (Niger-Congo). It is difficult, however, to
establish the Ghana-Togo-Mountain (GTM) languages as a group in relation
to Kwa. Heine (1968) subclassified the GTM languages into Ka-Togo and
Na-Togo subgroups. The current view is that the two subgroups branch out
individually from Proto-Kwa as in Figure 1 (Williamson and Blench 2000).
Sɛkpɛlé is a Na-Togo language and has two major dialect divisions, name-
ly, Sɛkpɛlé and Sekwa.2 Table 1 shows the dialects and the villages where
they are spoken.
[Figure 1: tree diagram, not reproduced here. Node and language labels in the
diagram, in order of appearance: Ega; Avikam; Alladian; Ajukru; Abidji; Abbey;
Attié; Potou; Ebrie; Mbatto; Krobu; Potou-Tano; Abure, Eotilé; West Tano; Akan;
Tano; Nzema-Ahanta; Central Tano; Anyi, Baule, Anufɔ; Bia; Efutu-Awutu; Guan;
South; Larteh-Cherepong-Anum; Northern Guang; Ga; Proto-Kwa; Dangme;
Lelemi-Lefana; Akpafu-Lolobi; Likpe, Santrokofi; Na-Togo; Logba; Basila, Adele;
Avatime; Nyangbo-Tafi; Ka-Togo; Kposo, Ahlo, Bowiri; Kebu, Animere; Ewe; Gbe;
Gen, Aja; Fon-Phla-Phera.]
Figure 1. Proto-Kwa (from Williamson and Blench 2000: 29)
Table 1. Sɛkpɛlé dialects and their geographical distribution
Language Sɛkpɛlé
Dialects Sekwá Sɛkpɛlé
Sub-dialects L2 communities Situnkpa Semate Sela
Villages Bakwa Alavanyo Avedzime Mate Bala
Todome Wudome Agbozume Abrani Kukurantumi
Nkwanta Koforidua
Map 1. Likpe Traditional area (© Likpe Local Committee)
A large part of the Likpe-speaking population is multilingual in Sɛkpɛlé,
Ewe and Akan, not to mention (Ghanaian) English (cf. Ring 1981). Ewe, the
dominant lingua franca, co-exists with Likpe in all spheres of life: in the
home, in the community, in the church, in the market, at school and at public
ceremonies such as funerals and marriages. Akan is less prominent, although
it is the lingua franca in a neighboring community.
Likpe was first written in about 1933 using the Ewe orthography (see Dogli
1933). Writing resumed only in 1987, with the new wave of literacy and
language development under the auspices of the Ghana Institute of Linguistics,
Literacy and Bible Translation (GILLBT). Today there are pamphlets
of stories, proverbs and literacy materials as well as a pamphlet containing
a translation of the letters of Paul (New Testament). There are also cassettes
containing some drama and text in Likpe. Likpe is used partially as a medium of
instruction and is taught outside the regular curriculum in some villages after
school hours to both Primary and Junior Secondary School pupils.
The focus of the present chapter is on the grammatical changes in Sɛkpɛlé
due to the influence of Ewe, the dominant lingua franca, and to areal convergence,
that is, the impact of the neighbouring languages on Likpe grammar.
Likpe seems to have innovated constructions such as the present progressive
and an Undergoer Voice construction due to Ewe influence. At the same time,
several patterns are found in Likpe due to pressures of areal adaptation;
for instance, postpositions and, arguably, serial verb constructions have
emerged in the language through this mechanism. Grammatical items such as
intensifiers, particles and interjections are likewise shared
among the neighbouring languages, but Likpe has borrowed some connectives
and a complementizer directly from Ewe (see also Ameka 2007).
2. Typology
Likpe is an agglutinative language with some head marking at the clause level
and dependent marking in the NP (except for qualifiers), properties presuma-
bly inherited from Proto Niger-Congo. Features of nouns, including class and
number, are marked by prefixes on nouns. Ewe, on the other hand, is predomi-
nantly isolating with agglutinative features. Plural marking in Ewe is by an
enclitic which attaches to the element in the NP that occurs before the Intensi-
fier. Likpe marks plural number for a small set of nouns by a suffix. The use of
such a structure in Sɛkpɛlé is due to its contact with Ewe (see Section 3).
The functional load of the morphological process of Reduplication seems
to have increased in Likpe due to its contact with Ewe. Some adjectival modi-
fiers in Sɛkpɛlé are formed by verbal reduplication similar to the pattern found
in Ewe. For instance,
(1) a. Reduplication of a property verb to form a qualifier

ná ‘become.black, dirty’ ná-ná ‘dirty’
bù ‘become.wet, rotten’ bùbù ‘wet, rotten’
b. Reduplication of action verbs to form result-state qualifiers
là ‘cut’ là-là ‘torn’
f ‘to arrive newly’ f-f ‘new’
Gerund formation of verb and complement NP structures involves the reduplication
of the verb with the NP preposed to it. Such a process is available
in Ewe and could have influenced the process. In fact it appears that
the gerund involving reduplication is in competition with the ‘older’ form of
gerund formation involving the use of the class marker for deverbal nominals.
Compare both types of gerund formation shown in (2):
(2) a. bi-sí-tk-tk Compare Ewe te-ʃa-ʃa

cm-yam-red-be.on yam-red-plant
‘yam planting’ ‘yam planting’
b. bi-sí bu-tkə
cm-yam cm-be.on
‘yam planting’
The gerunds formed by reduplicating the verb part seem to be entering the
language through translation, especially of texts mediated through Ewe.
Likpe noun stems tend to participate in a sg/pl class-pairing system for number
marking. Some kin terms do not have plural counterpart classes. Kin terms
belonging to ego’s parents’ generation and above, as well as proper names, are
suffixed with the form -mə́ ‘pl’ to signal their plurality. This form is
heterosemically related to the third-person plural pronoun.
(3) a. Kofi kú Áma-mə́ ə-diə.

name com name-pl scr-quarrel
‘Kofi and Ama and co quarrelled.’
b. ambe ‘mother’ ambe-mə́ ‘mother-pl’
éwú ‘grandmother’ éwú-mə́ ‘grandmother-pl’
This Likpe structure is a replication of the Ewe pattern where plural number
is marked by a clitic =wó ‘pl’, which is attached to the last element in the NP
before the intensifier, and which is also in a heterosemic relation to the ‘3pl’
pronominal form wó. The use of the Ewe form as an associative plural, that is,
N=wó means ‘N and co’, e.g. Áma=wó [Ama=pl] ‘Ama and co’, could have
served as the model for the copy.
4. Verbal structure
Likpe, like its closest genetic relatives, marks tense, aspect and mood categor-
ies by prefixes on the verb (including for example past progressive). How-
ever, it has developed a present progressive periphrastic construction similar
to the one found in Ewe which has the form: Subject -l ‘hold’ (NP) Gerund.
(4a) below is a clause in the aorist with no segmental marker for the category,
while (4b) is an instantiation of the progressive construction in relation to the
state of affairs represented also in (4a) (see Ameka 2002).
(4) a. o-té ka-m.

3sg-sell cm-rice
‘She sold rice.’
b. ɔ-l ka-m bo-té.
3sg-hold cm-rice cm-sell
‘She is selling rice.’
Compare the Ewe equivalent of (4b) in (4c) in terms of the structure and order
of the elements. The phonetic similarity between the operator verb in the two
languages could have facilitated the adaptation of the structure into Likpe.
(4) c. Ewe
é-le mlu dzráx-ḿ.
3sg-be.at:pres rice sell-prog
‘She is selling rice.’
An operator for the expression of necessity has also been adopted into the lan-
guage. This may be due to areal rather than Ewe specific influence. Moreover,
the construction is one of the structures in which Likpe uses a complementizer
that is borrowed from Ewe, namely bə́ ‘quot’, as illustrated in (5).
(5) é-hiɛ̃´ bə́ u-tsyi wə ú-su u-bíkə.

impers-need quot 3sg-carry 3sg 3sg-go 3sg-bury
‘It was necessary that he (Skunk) should take her (his mother) to go
and bury.’
There is no special marking of verbs that are borrowed. They are only phono-
logically adapted; for example, the verb form that is used in example (5) above
has the form hia ‘need’ in both Ewe and Akan, but is adapted in Sɛkpɛlé as
hiɛ ‘need’.
Likpe uses several patterns for marking voice/valency that conform to
areal patterns: the reflexive is formed by the use of a pronoun plus a gram-
maticalized form of the word for ‘body’, i.e. əsúə (similar to the structure one
gets in Akan, but not in Ewe). The use of 3pl to express general subjects and
impersonal passive meanings is also an areal pattern.
There is an Undergoer Voice construction expressed periphrastically where
the Undergoer-like argument is linked to the subject position of the clause
and of an operator verb nɔ ‘hear’.3 The operator verb takes a nominalized verb
complement with the Actor-like argument, if expressed, functioning as its
first object (and like the Goal argument in a double object construction). The
semantics of the construction belongs to the semantic space of the so-called
‘potential passives’. There is a similar periphrastic construction in Ewe but
its operator verb is a modal verb nyá, grammaticalized from the verb ‘know’.
The Actor-like argument in the main event, if expressed, is marked as a dative
(experiential) object in the Ewe construction, as illustrated in (6).
(6) Ewe
nynu-a nyá kp-ná (ná-m).
woman-def mod see-hab dat-1sg
Lit: ‘The woman is see-able (to me).’
i.e. ‘The woman is beautiful (to me).’
(7) Likpe
u-sió -m á-nɔ (mɛ) bó-be.
cm-woman agr-det scr-hear 1sg cm-look
Lit: ‘The woman hears (me) looking.’
i.e. ‘The woman is beautiful (to me).’
(8) Likpe
n-t á-nɔ bú-nə í-tə́ be-tsyúé.
cm-alcohol scr-hear cm-drink 3sg:impers-give cm-some
Lit: ‘Alcohol hears drinking give some.’
i.e. ‘Alcohol-drinking is enjoyable to some (people)’,
‘Some people like drinking alcohol.’
One argument in support of the Ewe construction potentially influencing

the Likpe construction comes from the periphrastic nature of the construction
in Likpe. One would have expected an affixal marking of such a meaning on
the verb in Sɛkpɛlé. Secondly, if the Actor is expressed it takes the form of a
Goal argument similar to the dative marking in Ewe. Thirdly, there is a variation
in the expression of the Actor-like argument in the Sɛkpɛlé construction
which replicates the Ewe pattern, as illustrated in (8) (see Ameka 2005a for
further details on the construction in both Ewe and Likpe).
Likpe has borrowed a few grammatical items from Ewe and also makes use
of several forms that are found throughout the Lower Volta Basin area, i.e.,
the area in which Kwa languages are spoken. It has borrowed the contrast
connector gaké ‘but’ from Ewe and adapted it as kaké ‘but’
in the Sɛkpɛlé dialect since, unlike Sekwa, the Sɛkpɛlé dialect has no
[−anterior] [+voice] consonants.
Languages in the area tend to have two or three disjunction markers, one
of which tends to be used in interrogative contexts. Likpe has two disjunc-
tion markers: nyé ‘this or that, it does not matter which’ and lee ‘this or that,
I don’t know which’. It appears the latter form is influenced by one of the
disjunction markers in Ewe, the form lóó ‘this or that, I don’t know which’.
Because of their ignorative feature they tend to be used in interrogative con-
texts and can link phrases or clauses. A hint that this may be the case is that
Tuwuli, another GTM language, has a cognate disjunctive marker nye ‘or’
used in general contexts while in interrogative contexts the other form mbɔe
is used (Harley 2005). Likpe seems thus to have adapted the Ewe form for
use in interrogative contexts.
Likpe has also borrowed a complementizer bə́ ‘comp, quot’ from Ewe
which it uses in addition to its own complementizer ŋkə ‘quot’. Sometimes
the Ewe and the Likpe complementizers are doubled. Compare the use of
both forms in similar contexts in (9) and (10) both taken from a Likpe settle-
ment history narrative.
(9) sé ɔfu kɔdzó -m le-te ŋkə məə-tsyá

when name name agr-det dep-know quot 3pl-too
a-slé eto be-tídi bé-ni ko ŋkə oo, atúu …
cm-church poss cmpl-person 3pl-cop int quot interj welcome
‘When Ofu Kwadzo got to know that they too were church people, he
said oo welcome, (he and them will work together).’
(10) nyã bəə-b bə-tu m nyã bə́ə oo ka-sɔ kpé.

and 3pl-come 3pl-meet 3pl and quot interj cm-land be.in
‘And they came to meet them and they said “oh there is land”.’
While the relativizer itself is not borrowed, Likpe seems to be developing
the use of the definiteness determiner as an optional relative-clause-final
marker. This is an areal pattern. The interesting thing is that in the other lan-
guages like Ewe, Akan, Ga etc. the same form that is used in this relative
clause context is used to mark all other background information constituents
such as left-dislocated topic NPs and preposed adverbial clauses such as con-
ditionals and temporals. In Likpe, as we shall see below, there are different
forms for marking these, one of which is a borrowed form from Ewe.
(11) kɔsídá kó l-yɛ (mə́)

week agr dep-pass det
‘the week which passed, the past week’
In addition, two adverbial connectors are borrowed from Ewe: álé bé ‘so
that’ (example 12) and tógb bé ‘although’.
(12) álé-bé ŋko ni kasé min-yi ba-kpɛlé eto akokosa n.

thus-quot this cop how 1sg-know cmpl-Likpe poss history emph
‘So this is how I know the history of Likpe (to be).’
Various intensifying words (focus particles) found in Likpe are forms

that have diffused through the West African littoral area. However, the Likpe
forms resemble more closely the forms used in Ewe than the others. Thus
words such as ko ‘only, just’, boŋ ‘rather’, tsyá ‘also’ are shared with Ewe.
Similarly, interjections such as ahã̂ ‘now I know’ and fillers like oo, an
utterance-initial vocative particle, o …, answer particles like ee ‘yes’, o ‘no’;

agreement-signalling particles like yoo ‘OK’ and the palatal click with nasal
release, both as an agreement marker and signalling continuation in the sense
of ‘I understand’, are all shared by various languages in the area. Additionally,
utterance-final particles for expressing attitudinal meanings, e.g. ló ‘I advise
you’, are widespread in the area. Some of these forms give evidence of convergence
among the languages in the area, and their occurrence in Likpe can
be attributed to areal adaptation.
A form ma- used for ‘privative’ derivations in Ewe occurs in some
derived words in Likpe (see example 13). But it is unclear whether it is copied
from Ewe or retained from some ancestor language, since other GTM languages,
e.g. Tuwuli (Harley 2005), seem to have similar forms. (But of course
they could also have taken it from Ewe.)
(13) a. Likpe
lə-fə n bə-n-sí ko kasɔ́-ma-nɔ-ma-nɔ
cm-time agr 3pl:hab-lig-sit int under-priv-hear-priv-hear
ə-b-lu-f m bə-tsyú l ntí.
scr-vent-leave-dir 3pl cmpl-neighbour loc midst
‘The time they stayed there then misunderstanding emerged
among their neighbours.’
b. Ewe
nú-gɔme-ma-se-ma-se
thing-under-priv-hear-priv-hear
‘misunderstanding’
Likpe has also adopted the background information marking particle lá

‘tp’ from Ewe (although it is an item that is consciously recognized and ed-
ited out of texts).
(14) kasé mi-nɔ nyã ní b bó ba-kpɛlé lá

how 1sg-hear 3sg cop quot 1pl cmpl-Likpe tp
bo kə-síə-kɔ fefe ka búu-siə ní atébubu.
1pl cm-sit-place last agr 1pl:past-sit cop name
‘How I heard it is that we the Likpe people, our last place of settle-
ment where we stayed was Atebubu.’
The background information particle in Ewe is in a heterosemic relation to

the definiteness marker and is used at the end of left dislocated NPs, connec-
tors as well as preposed adverbial phrases and clauses and relative clauses. In
Likpe, the Ewe form lá occurs in all these contexts. The Likpe indigenous
form of marking background information in these contexts is to lengthen the
phrase-final vowel.
7. Syntax
One of the features of Likpe grammar that could have emerged due to areal
pressure from the surrounding languages like Ewe and Akan is verb seriali-
zation in a single clause. Dimmendaal (2001: 386) claims that the spread of
serial verb constructions (SVC) to the GTM languages could account for the
reduction in verb derivational morphology that is in progress in these languages
(cf. Hyman 2004). Perhaps an indication that this may be so is that
SVCs in Sɛkpɛlé share some features with Akan SVCs and other
features with Ewe SVCs, while still other features are common to all
three languages. For instance, the shared subject argument is expressed with
each verb in the SVC in both Likpe and Akan, but expressed only once in an
Ewe SVC. Negation, on the other hand, is expressed only once in an SVC in
both Likpe and Ewe but recapitulated with each verb in Akan. For all the lan-
guages the verbs in an SVC should be marked for aspect and modality values
that are semantically compatible (see Ameka 2005b).
Likpe uses locative verbs in the expression of predicative possession. The
verbs are kpé ‘be.in’, t ‘be.at’ and tk ‘be.on’. These are used to express both
general location and ‘have’ possession (see Ameka 2007b for a discussion
of their use and semantics). With each of these verbs there are two constitu-
ent orders for the possessive use: one in which the Possessor is linked to the
Subject position and the possessed phrase linked to the object function. The
second order involving Figure–Ground reversal has the possessed linked to
the subject position and the possessor to the post-verbal object position. The
two orders are illustrated for the verb kpé ‘be.in’ below.
(15) a. Possessor–verb–possessed
o-kpé a-fokpá
3sg-be.in cmpl-footwear
‘He has shoes.’
b. Possessed–verb–possessor
a-fokpá kpé wə
cmpl-footwear be.in 3sg
‘He has shoes.’
It is possible that the second order in (15b) is due to Ewe influence. Ewe uses
a general locative verb le ‘be.at:pres’ in a periphrastic construction to express
predicative possession: ‘have’. The structure has the form possessed–verb–
possessor NP + possessive postposition si ‘hand’. This Ewe structure is the
only order possible for expressing predicative possession in Ewe using the
locative schema.
8. Lexicon
A number of constructions that one finds in Likpe grammar are based on se-
mantic formulas that are available in other languages in the area. For instance,
a verb–noun collocation which literally translates as ‘see/look way/road’ is
used to express the idea of ‘hope’ as illustrated for Likpe in (16).
(16) ó-be ku-sú ŋkəə mba uuka-wuuns-ko é-bu-b

3sg-look cm-way quot those 3sg:phab-help-assoc 3pl:fut-come
ba-wuuns-ko w u-bik wo ambé -m.
3pl-help-assoc 3sg 3sg-bury 3sg mother agr-det
‘He hoped that those who he used to help would come to help him
bury his mother.’
Similarly various verb–verb collocations in serial verb constructions form

semantic formulas for the expression of particular meanings in the serializing
languages. One such formula is for the expression of the sense of ‘believe’. It
is composed of two verbs the first of which is invariably a ‘receive/get’ verb
and the second an ingestion or imbibing verb such as ‘eat’ or ‘hear’. Likpe
makes use of the ‘receive hear’ pattern which is the same pattern that we find
in Ewe. The Likpe form is exemplified in (17).
(17) n-fo n-nɔ míə yɔɔ-lkɛ.

1sg-get 1sg-hear 1sg:quot 3sg:fut-be.good
‘I believe it will be good.’
A formula for expressing the notion of ‘begin’ relates to making contact with
the bottom of the situation that is begun. Such expressions occur in several
languages in the area.
9. Conclusion
Various changes have taken place and continue to take place in Likpe gram-
mar due to its contact with Ewe on the one hand, and due to pressures of areal
adaptation, on the other. The present progressive construction, plural marking
on kinship terms, the Undergoer Voice construction, and the constituent order
for predicative possession are constructional patterns that have been directly
replicated from Ewe. Patterns involving serial verb constructions and post-
positions, intensifiers and various semantic formulas occur in Likpe due to
their presence in the convergence area. Furthermore, Likpe has borrowed
lexical as well as grammatical items from Ewe replacing indigenous terms
in some cases. It is significant that the grammatical items that have been
borrowed have discourse structuring or organizational functions such as a
contrast marking conjunction, a disjunction marker, and reason expressing
connectives. Nouns and verbs are also borrowed. Borrowed verbs do not re-
ceive any special treatment. Nouns borrowed into Likpe are integrated on the
basis of form and meaning into the noun class system. Thus a noun like agbeli
‘cassava’ which in Ewe is made up of a prefix a- and the stem and which is
singular or collective is borrowed into Likpe and analysed as a plural noun to
fit the a- plural class and a singular form is formed as le-gbeli ‘one tuber of
cassava’. It is in this domain that the borrowing of nouns has an effect on the
grammar of Sɛkpɛlé.
Abbreviations
agr agreement marker
assoc associative verb extension
cm noun class marker
com comitative
comp complementizer
cop copula
dat dative (preposition)
dem demonstrative
dep pragmatically dependent subject cross-reference
det determiner
dir directional
emph emphatic particle
fut future
impers impersonal
int intensifier
interj interjection
lig ligature
loc locative
neg negative marker
past past tense
phab past habitual
pl plural
poss possessive marker
pot potential
pres present
priv privative
prog progressive
q propositional question
red reduplicative
scr subject cross-reference marker
sg singular
tp background topic marker
Notes
1. They have also been referred to as “Central Togo” (Dakubu and Ford 1988).
2. The findings reported here are based on my own field investigations that I have
been conducting in various Likpe communities intermittently over the past dec-
ade or so. I am very grateful to my consultants, especially E. K. Okyerefo,
Justina Owusu, Cephas Somevi, Comfort Atsyor and the late A. K. Avadu, for
patiently teaching me their language.
3. I use the terms Actor and Undergoer following their usage in Role and Reference
Grammar (RRG); see Van Valin and La Polla (1997: 141–147). To characterize
the objects in a double object construction, I use the terms “Theme” and
“Goal”. The latter is a cover role for the object with the semantic relation of
recipient, beneficiary, maleficiary etc. or the dative more generally.
References
Ameka, Felix K.
2002 The progressive aspect in Likpe: Implications for aspect and word
order in Kwa. In: Felix K. Ameka and E. Kweku Osam (eds.), New
directions in Ghanaian Linguistics, 85–111. Accra: Black Mask.
2005a “The woman is seeable” and “The woman perceives seeing”: Under-
goer voice constructions in Ewe and Likpe. In: M. E. Kropp Daku-
bu and E. Kweku Osam (eds.), Studies in the Languages of the Volta
Basin, Volume 3, 4362. Legon: Department of Linguistics, Univer-
sity of Ghana.
2005b Multiverb constructions on the West African littoral: Microvariation
and areal typology. In: M. Vulchanova and T. A. Åfarli (eds.), Grammar
and Beyond: Essays in Honour of Lars Hellan, 15–42. Oslo:
Novus Press.
2007a Grammars in contact in the Volta Basin (West Africa): On contact in-
duced grammatical change in Likpe. In: A. Y. Aikhenvald and R. M. W.
Dixon (eds) Grammars in Contact: A Cross-Linguistic Typology, 114–
142. Oxford: Oxford University Press.
2007b The coding of topological relations in verbs: The case of Likpe (Sɛkpɛlé).
In: Felix K. Ameka and Stephen C. Levinson (eds.), The Typology and
Semantics of Locative Predication: Posturals, Positionals and Other
Beasts, 10651103 (Linguistics 45 (5/6), special issue).
Blench, Roger M.
2001 Comparative Central Togo: What have we learnt since Heine? Paper
presented at the 32nd Annual Conference on African Linguistics: Berkeley,
23–25 March 2001.
Dakubu, M. E. Kropp, and Kevin C. Ford
1988 The Central-Togo languages. In: M. E. Kropp Dakubu (ed.), The languages
of Ghana, 119–154. London: Kegan Paul.
Dimmendaal, Gerrit J.
2001 Areal diffusion and genetic inheritance: An African perspective. In:
A. Y. Aikhenvald and R. M. W. Dixon (eds.), Areal diffusion and genetic
inheritance, 358–392. Oxford: Oxford University Press.
Dogli, A. (Rev.)
1933 Likpe Catechism No 1 and 2. Keta: Catholic Printing Office
Heine, Bernd
1968 Die Verbreitung und Gliederung der Togorestsprachen. Berlin: Diet-
rich Reimer.
Harley, Matthew
2005 A descriptive grammar of Tuwuli: A Kwa language of Ghana. Ph.D.
thesis, SOAS.
Hyman, Larry
2004 How to become a Kwa verb. Journal of West African Languages 30
(2): 6988.
Maddieson, Ian
1998 Collapsing vowel harmony and doubly articulated fricatives: Two
myths about the phonology of Avatime. In: Ian Maddieson and Thomas
Hinnebusch (eds.), Language History and Linguistic Description in
Africa, 155166. Trenton: Africa World Press.
Nugent, Paul
1997 Myths of Origin and Origins of Myth: Politics and the Uses of History
in Ghana’s Volta Region. Berlin: Das Arabische Buch.
2005 A regional melting pot: The Ewe and their neighbours in the Ghana-
Togo borderlands. In: Benjamin Lawrence (ed.), The Ewe of Togo and
Benin, 2945. Accra: Woeli.
Ring, Andrew J.
1981 Ewe as a Second Language: A Sociolinguistic Survey of Ghana’s Cen-
tral Volta Region. Legon: Institute of African Studies.
1995 Lɛlɛmi tone. Papers from GILLBT’s seminar week 30 January–3 February
1995, Tamale, 1995: 16–26. Tamale: GILLBT Press.
Struck, R.
1912 Einige Sudan-Wortstämme. Zeitschrift für Kolonialsprachen 2: 233–
253, 309323.
Van Valin, Robert D. Jr, and Randy J. La Polla
1997 Syntax: Structure, Meaning and Function. Cambridge: Cambridge
University Press
Westermann, Dietrich, and Margaret A. Bryan
1952 Languages of West Africa. Handbook of African Languages, Vol-
ume 2. London: Oxford University Press, for the International African
Institute.
Williamson, Kay, and Roger Blench
2000 Niger-Congo. In: Bernd Heine and Derek Nurse (eds.), African Languages:
An Introduction, 11–42. Cambridge: Cambridge University
Press.
Grammatical borrowing in Katanga Swahili
Vincent A. de Rooij
1. Background1
Katanga Swahili, also known as Shaba Swahili, is a contact variety of Swahili
spoken in the urban centers of Southern Katanga, DR Congo. Katanga Swahili
resulted from contact between speakers of closely related Bantu languages.2
Despite the strong structural similarities of these languages, the phonology,
TMA system, and morpho-syntax of Katanga Swahili have been restructured
considerably. Elsewhere (de Rooij 1997), I argued that, apart from adult sec-
ond language learning, nativization, i.e. first language acquisition by locally
born children, may have played a role in the genesis and development of
Katanga Swahili. I also argued that selective simplification (Kapanga 1993)
cannot have been the sole restructuring process since even features which are
characteristic of both the ‘lexifier’ language, the East African Coast variety of
Swahili, and the genetically related ad/substrate languages, such as agglutina-
tive verbal morphology and the noun-class agreement system, have been lost
to varying degrees in Katanga Swahili.
Katanga Swahili is widely used as a first language in Southern Katanga
and is spoken by an estimated number of at least 2 million people.3 In the
cities of the Southern Katanga Copperbelt, multi-lingualism is widespread.
Apart from Katanga Swahili, many speak French, Congo’s official language,
and some ‘ethnic’ language (Kabamba 1979). Katanga Swahili is used in
all informal settings: it is used in the domestic sphere but also in informal
public settings (public transport, markets, shops). Informal notes and letters
are often written in Swahili but since Katanga Swahili has no standard orth-
ography, writing is often done in idiosyncratic ways (cf. Blommaert 1999,
2004; Fabian 1990). Books and newspapers in Swahili are widely available
but the variety used in these publications is very similar to the Standard Swa-
hili of Tanzania and is almost like a foreign language to speakers of Katanga
Swahili.
According to Fabian (1986) Katanga Swahili was well established as a
language distinct from the East African variety of Swahili by 1940. Katanga
Swahili in its present form, however, cannot be taken as representing the
language as spoken around 1940, although there are probably no dramatic
differences between the two stages. Between 1940 and the present day, the
urban centers of Southern Katanga and Elisabethville (the colonial name of
present-day Lubumbashi) in particular, have absorbed large numbers of mi-
grants, mainly from the neighbouring Kasai provinces (Fetter 1976: 173–
176). The massive influx of Luba-Kasai, Songye, Kanyok and Kete speakers
from Kasai, all of whom have/had to learn Swahili, does/did of course have
an impact on the language. It should also be noted that the majority of these
migrants’ children acquire/d Katanga Swahili as their first language instead
of their parents’ native language. Therefore, Katanga Swahili as it is spoken
today is to be seen as the outcome of processes of second language learning
and nativization that have been going on since the 1940s.
2. Phonology
Katanga Swahili has a symmetrical five-vowel system (see Table 1), also
found in ad/substrate languages. Just like East Coast Swahili (ECS), it has
neither phonemic vowel length nor lexical tone, and, with one exception, no
grammatical tone, all of which are prominent in the ad/substrate languages.
Phonetic values of /e/ and /o/ range, depending on the environments
they occur in, from [e] to [ɛ] and from [o] to [ɔ] respectively.
Table 1. Katanga Swahili vowels

             front   back
close        i       u
close-mid    e       o
open             a

Table 2. Katanga Swahili consonants

                         bilabial  labio-dental  alveolar     post-alveolar  palatal  velar  glottal
plosives                 p b                     t <d>                                k <g>
nasals                   m                       n
fricatives                         f <v>         s <z>                                       <h>
affricates                                                    tʃ <dʒ>
trills                                           <r>
tap
lateral approximant                              l
glides                   w                                                   j
prenasalized consonants  mp mb     mf mv         nt nd ns nz  nʃ ndʒ                  nk ng
Note: <c> indicates an ECS phoneme with weak phonemic status in Katanga Swahili

East Coast Swahili dental and velar fricatives do not occur in Katanga
Swahili. In East Coast Swahili, they are found exclusively in Arabic borrowings.
Many of these words denote Islamic concepts and therefore play no role
in Katanga, where Islamic influence is virtually non-existent. In the few generally
used words of Arabic origin, the dental fricatives /ð/ and /θ/ become /z/ and
/s/ respectively. The velar fricative /ɣ/ occurs in Katanga Swahili as /k/. The ECS
glottal fricative has also been lost. Several other ECS phonemes have not
been fully retained in Katanga Swahili either, most probably due to ad/substrate
influence from Bemba and Luba-Kasai, where they do not function as
phonemes (cf. Kashoki 1968 and Burssens 1939). In Luba-Kasai and Bemba
we do not find the voiced palatal affricate /ʤ/, the voiced velar plosive /g/, the
glottal fricative /h/, or the alveolar vibrant /r/. In Katanga Swahili, these are commonly
replaced by /j/, /k/, ø, and /l/ respectively. Furthermore, in Bemba we do not
find the voiced alveolar plosive /d/, the voiced labio-dental fricative /v/, or the
alveolar fricative /z/. These are often replaced by /l/, /f/, and /s/ respectively.
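The substitution patterns just described can be summarized as a small lookup table. The following Python sketch is illustrative only: the dictionary names, the `adapt` function, and the pre-segmented input are assumptions introduced here, not part of the original analysis. Because the text describes these replacements as common rather than categorical, the output shows what the rules predict, not attested Katanga Swahili forms.

```python
# ECS -> Katanga Swahili segment substitutions, as stated in the text.
# All names below are hypothetical and chosen for illustration only.
ARABIC_LOAN_FRICATIVES = {"ð": "z", "θ": "s", "ɣ": "k"}
# Segments absent from both Luba-Kasai and Bemba ("" means the segment is dropped).
SHARED_GAPS = {"dʒ": "j", "g": "k", "h": "", "r": "l"}
# Segments additionally absent from Bemba.
BEMBA_GAPS = {"d": "l", "v": "f", "z": "s"}


def adapt(segments, include_bemba=False):
    """Apply the substitutions once to a list of phoneme symbols.

    Each segment is looked up a single time, so the output of one rule
    (e.g. ð -> z) is not fed into another rule (z -> s).
    """
    rules = {**ARABIC_LOAN_FRICATIVES, **SHARED_GAPS}
    if include_bemba:
        rules.update(BEMBA_GAPS)
    adapted = []
    for segment in segments:
        replacement = rules.get(segment, segment)
        if replacement:  # skip segments that are dropped
            adapted.append(replacement)
    return adapted


# /h/ is dropped and /r/ is realized as /l/ in this predicted output.
print("".join(adapt(list("habari"))))  # abali
```

The Bemba-specific replacements are kept behind a flag because the text attributes them to Bemba alone and describes them as occurring "often" rather than "commonly".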
/s/, /z/, /t/ are often palatalized, especially by speakers with a Luba-Kasai
background, to [ʃ], [ʒ], and [c~tʃ] respectively when followed by /i/. Furthermore,
the ECS preverbal tense affix -li- is normally pronounced as [ri] or [ɾi].
Katanga Swahili has a range of prenasalized consonants that are also found in
substrate languages (cf. Bostoen 1997: 91). The status of some of these prena-
salized consonants as phonemes remains to be settled (de Rooij 1997: 335).
It seems clear that contact with ad/substrate languages has resulted in a
phonological system that has drifted away from the East Coast Swahili sys-
tem and has become more similar to the systems of ad/substrate languages.
Phonemes that do not occur in ad/substrate languages have been lost, partly
or completely. The clearest example of this process is /h/ which is almost cat-
egorically left unarticulated. Other phonemes have blended into one, where
the phoneme that is absent in one or more ad/substrate languages has been
lost or weakened. An example is the blending of East Coast Swahili velar
plosives /g/ and /k/ into /k/.
Nominal structures in Katanga Swahili are more analytic than those in ECS and
ad/substrate languages. Katanga Swahili has retained the typically Bantu noun-
class agreement system, but has done so in a reduced and simplified form.
Table 3 lists the noun class prefixes of ECS and Katanga Swahili. Noun
classes 1 through 10 are arranged pairwise where the even numbered class
prefix denotes plurals and the odd numbered class prefix singulars (e.g. class
7 ki-tabu ‘book’ versus class 8 bi-tabu ‘books’). The number of noun class
prefixes in Katanga Swahili has increased in comparison to ECS: although it
has lost one (Cl. 10 collapses with Cl. 6) it has added three (Cl. 11, 12, 13)
which have been borrowed from Luba-Kasaï and Bemba. The differences in
morphophonemic shapes of noun class prefixes 1, 2, 3, 5, 8, 11, and 14 can
also be attributed to ad/substrate influence.
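The pairing of singular and plural classes can be sketched in a few lines of Python. This is a minimal illustration, not part of the original analysis: the function and dictionary names are assumptions, and only the class pairs 1/2, 3/4 and 7/8, whose ECS and Katanga Swahili prefixes are listed in Table 3, are included.

```python
# Noun class prefixes for classes 1/2, 3/4 and 7/8, following Table 3.
# The data layout and function names are hypothetical.
PREFIXES = {
    "ECS":     {1: "m",  2: "wa", 3: "m",  4: "mi", 7: "ki", 8: "vi"},
    "Katanga": {1: "mu", 2: "ba", 3: "mu", 4: "mi", 7: "ki", 8: "bi"},
}


def inflect(stem, noun_class, variety="Katanga"):
    """Attach the prefix of the given noun class to a stem."""
    return f"{PREFIXES[variety][noun_class]}-{stem}"


def pluralize(stem, singular_class, variety="Katanga"):
    """Odd-numbered classes are singular; the following even class is its plural."""
    if singular_class % 2 == 0:
        raise ValueError("expected an odd-numbered (singular) class")
    return inflect(stem, singular_class + 1, variety)


print(inflect("tabu", 7))           # ki-tabu
print(pluralize("tabu", 7))         # bi-tabu
print(pluralize("tabu", 7, "ECS"))  # vi-tabu
```

Classes 9/10 are left out on purpose: in Katanga Swahili class 10 collapses with class 6, so the simple "next even class" rule would not hold there.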
Table 3. Noun class prefixes in Katanga Swahili, ECS, Luba-Kasaï and Bemba (some
variants of prefixes not shown for reasons of clarity)

Noun class   ECS          Katanga Swahili   Luba-Kasaï   Bemba
1            m-           mu-               mu-          mu-
2            wa-          ba-               ba-          ba-
3            m-           mu-               mu-          mu-
4            mi-          mi-               mi-          mi-
5            ji-/ri-/∅-                     di-          li-
6            ma-          ma-               ma-          ma-
7            ki-          ki-               t-           t-
8            vi-          bi-               bi-          fi-
9            N-/∅-        N-                n-           iN-
10           N-                             n-           iN-
10 → 6                    ma-
11           u-           lu-               lu-          lu-
12                        ka-               ka-          ka-
13                        tu-               tu-          tu-
14           u-           bu-               bu-          bu-
15           ku-          ku-               ku-          ku-
16           pa-          pa-               ɸa-          pa-
17           ku-          ku-               ku-          ku-
18           mu-          mu-               mu-          mu-
It should be noted that infinitives are morphologically marked as nouns by
noun class 15 prefix ku-. The locative classes 16, 17, and 18 have a different
status than they have in ECS. In Katanga Swahili, pa-, ku-, and mu- occur
as pre-prefixes and may function as prepositions, as shown in (1), where mu
is followed by an NP consisting of a demonstrative (ile) and a plural noun
(mashiku ‘days’).
(1) (…) mais u-na-kufwa mu ile ma-shiku.

but you-tma-die loc dem 6-day
‘(…) but you will die during that period.’ (Félicien/VDK1: 8/38)
In ECS, on the other hand, locative phrases are formed by suffixing a general
locative affix -ni to a noun, while pa-, ku-, and mu- can only be affixed to
noun modifiers. The use of locative prefixes as pre-prefixes in Katanga Swa-
hili must be attributed to ad/substrate influence, since it occurs in all Central
Bantu languages (Grégoire 1975: 17). The semantics of Katanga Swahili pa-,
ku-, and mu- is the same as in the ad/substrate languages and ECS where,
roughly speaking, pa- expresses a general locative meaning (at), ku- ex-
presses direction (toward), and mu- expresses being inside of (in).
Noun–Adjective agreement has been simplified radically: most adjectives
have only one generalized form that is used with nouns from different
classes. Agreement is marked most strongly in classes 1, 2, 7, 8, 12, 13, 14,
especially in the plural classes 2 and 8 among these, but ultimately depends
on the strength of the generalized form. This phenomenon does not occur in
neighboring languages and can, therefore, not be the result of borrowing.
Reduction and simplification is also found in subject and object agree-
ment on the verb. In Katanga Swahili object concord markers co-indexing
non-human objects are very seldom used, except for classes 7/8. The generalized,
but not categorical (Bostoen 1997: 106), use of i- as a subject marker
in classes 3 through 10 is striking. This restructuring cannot be explained
by invoking ad/substrate influence, because the ad/substrate languages make
use of the same agreement system as ECS, where markers have roughly the
same morphophonemic shape as the prefixes of the nouns they refer to. The
use of marker i- seems to correlate strongly with the feature [−human]: it
does not occur with nouns belonging to classes 1/2, denoting human beings
while its use is favored in all other classes except classes 7/8 and 11 through
14. Noun class prefixes 11 through 14 stand apart from the others in that
they are used productively to derive nouns with very specific meanings (e.g.
diminutives).
Katanga Swahili has the preverbal TMA affixes that are typical of Bantu languages.
According to Schicho (1988, 1990), the ECS preverbal tense affixes
that have survived in Katanga Swahili have lost much of their meaning as
realizations of tense and may in many cases be regarded as dummy elements
that have to be realized for morpho-syntactic reasons. The most
frequently used preverbal tense affixes in Katanga Swahili are: -na- mark-
ing present tense, -li- or -ri- marking past tense, and -ta- marking future
tense. According to Schicho (1988: 568), in a narrative sequence time refer-
ence needs to be marked only once by a tense affix on a verb, by sentence-
initial adverbs, or may even be left unexpressed if time reference can easily
be inferred from contextual information. Schicho claims that the following
distinctions provide the basis for what he calls the Aspect-dominated TMA
system of Katanga Swahili:
1. [+anterior]
(including [perfect/resultative])
2. [−anterior, −posterior]
3. [posterior/irrealis]
4. [progressive]
(including [habitual], [intensive], [durative], [iterative])
TMA is often expressed by auxiliary verbs. [+anterior] with perfective/resultative
aspect can be expressed by using -toka (mu-) ‘leave, quit (loc)’, and
-isha ‘finish’, also shortened to -sha as shown in (2). -isha is also used in this
way in ECS but -toka (mu-) is not.
(2) mi-na-isha ku-pakala vernis.

I-tma-finish inf-apply varnish
‘I have already varnished it.’ (Schicho 1988: 569)
[posterior/irrealis] can be expressed by -tafuta ‘look for’ and by -enda ‘go’
as in (3).
(3) (h)a-ba-ta-enda ku-ra nani, ku-ra nkuku.

neg-they-tma-go inf-eat filler inf-eat chicken
‘they will not eat ehm, eat chicken.’ (Michel/M1: 5/09)
[progressive] with habitual, intensive, durative, or iterative aspects can be expressed
by the verb -anza ‘start, begin’ as in (4) or by the copulative element
-ko- used as a preverbal affix as shown in (5).
(4) a-na-anza ku-fan(ya) ma-bêtise.

he-tma-start inf-do 6-stupidity
‘He started (and went on) doing foolish things.’ (Dédé m./M1: 7/08)
(5) disons be-ko-na< be-ko-na-soigner eh?

let’s say they-cop-tma< they-cop-tma-care for tag
‘Let’s say they provide medical care, don’t they?’ (Félicien/VDK1: 7/17)
In ECS -isha is used to mark perfective/resultative aspect but the use of -anza
‘start’ to mark habitual/durative aspect and -enda ‘go’ or -tafuta ‘look for’ to
express inchoative aspect can be connected to semantically similar verbs in
Luba-Kasai (Bostoen 1997: 121).
In ECS and the ad/substrate languages, negation is expressed by a verbal
prefix. In Katanga Swahili this strategy is still used with one important dif-
ference: in most cases, a sentence-final second negation element is added (cf.
Schicho 1992: 84). In (6) we find (h)apana, originally a negative locative
copula in ECS, but nowadays used almost exclusively as a negative answer
to a yes/no question.
(6) mais zamani (h)a-ba-ku-anz-ak-e ku-rima vile

but long ago neg-they-tma-start-int-fin inf-cultivate thus
(h)apana.
neg
‘But long ago they didn’t cultivate (the fields) like that.’ (Papa Tshib-
angu/VT3: 1/04)
Periphrastic constructions also occur, however, making use of Swahili
verbs with an inherently negative meaning, as in (7) and (8). This use of negative
verbs does not occur in ECS but has been attested in the following ad/
substrate languages: Bemba, Luba-Katanga, Luba-Kasai, Lala, and Lamba
(Kamba Muzenga 1981: 56).
(7) kama u-na-kosa kw-enda ku-bar hii.

if you-tma-refuse inf-go loc-bar dem
‘If you don’t go to that bar.’ (Schicho 1990: 478)
(8) a-na-acha ku-tumika.

he-tma-discontinue inf-work
‘He doesn’t work any more/longer.’ (Schicho 1990: 478)
Another loss of verbal morphology in favor of periphrastic constructions is
evident in relativization. In ECS, two relativization strategies occur: a relativizer
may be infixed or suffixed on the verbal complex, or a relative pronoun
with a relative concord suffixed to it may be used as in (9). This construction
alternates with the one in (10) where the relative concord is infixed in the
verbal complex:
(9) mtu amba-ye a-li-mw-ona simba a-na-ogopa sana.

man pron-rel he-past-it-see lion he-prs-fear much
‘The man who saw the lion is very frightened.’ (Vitale 1981: 90)
(10) mtu a-li-ye-mw-ona simba a-na-ogopa sana.

man he-past-rel-it-see lion he-prs-fear much
‘The man who saw the lion is very frightened.’ (Vitale 1981: 90)
In Katanga Swahili we find a demonstrative, agreeing with the noun it modifies,
which functions as a relativizer. It may be placed before or after the
noun. Examples are given in (11)–(12).
(11) u-ko d’accord na (h)ii mambo e-ko-na-sema?
you-cop in.agreement with dem things she-cop-tma-say?
‘Do you agree with the things she’s saying?’ (Dédé m./M1: 2/10)
(12) (h)aba ba-filles tu-ko-na-ona ba-na-zunguluka (h)umu.
dem 2-girls we-cop-tma-see they-tma-walk around here
‘these girls we see walking around here.’ (Fidélie/M1: 4/10)
The development of relativizers from deictic elements is a wide-spread phe-
nomenon in contact-induced languages (Romaine 1988: 250). In the case of
Katanga Swahili, this common trend has undoubtedly been reinforced by ad/
substrate pressure from Bemba (de Rooij 1997: 330).
Numerals in dates and years are mostly in French. Discourse markers, includ-
ing items that are traditionally classified as conjunctions, constitute another
group of items that occur almost exclusively in French (De Rooij 2000). Fre-
quently occurring are bon as a marker of topic development and transition,
non as a quotation marker, mais as a marker of contrast, puisque and parce
que as markers of causal relation and alors, donc and et puis as markers of
conclusion and succession. Examples of donc, et puis, are given in (13); mais
occurs here as an element initiating a switch to French. In (14), mais is used
repeatedly in a monolingual Swahili fragment.
(13) njo eh Mungu, ni Mungu w-a richesse,

top eh God cop God 1-conn wealth
‘God, is a god of riches, (0.5)
(0.5) ni Mungu w-a or, ni Mungu

cop God 1-conn gold cop God
is a God of gold, is a God
w-a argent. (0.5) ↓donc (h)ii richesse

1-conn silver (0.5) so dem wealth
of silver. (0.5) therefore, all this riches
yote (h)ii i-na-tu-appartenir shi

all dem it-tma-us-belong to we
belongs to us, we who are
ba-toto yake. (1.0) et puis, Mungu

2-children poss.3sg (1.0) furthermore God
His children. (1.0) furthermore, God
a-shi-na Mungu w-a bu-chafu mais

he-neg-cop God 1-conn filth but
is not a God of squalor mais (but)
c’est un Dieu de la propreté.

it’s det God of det cleanliness
it’s a God of cleanliness.’ (Fidélie/DM13) (De Rooij 2000: 461
[glosses and translation slightly adapted, VDR])
(14) njo ku-sema: lu-fu ya bamalaika: c’est que

top inf-say 11-death conn 2-angel it.means.that
‘It means that it was the death of angels: it means that
mu-ntu: a-ju-e ma-neno: mais a-na-kufa tuu

1-man he-know-neg 6-word but he-tma-die just/merely
a human being who does not understand what is happening but still
bule. mm? mu-ntu yee hapana ku-jua ma-neno:

for.nothing tag 1-man he neg inf-know 6-word
he dies a senseless death. see? a man who does not understand what
mais yee a-na-kufa: bule.

but he he-tma-die for.nothing
is happening but he dies a senseless death.’ (Fabian 2000: #5) [trans-
lation slightly adapted, VDR])
Question words, demonstratives, connectors, and free pronouns all remain
Swahili. De Rooij (2000) suggests that the quasi-wholesale borrowing
of French markers has pragmatic and syntactic explanations. The
pragmatic explanation holds that French markers stand out better in Swahili
environments and, hence, are more salient and effective as discourse-structuring
devices. Using French markers in Swahili is, furthermore, not a problem
from a syntactic perspective, since markers are hardly, if at all, integrated into the
morphosyntactic frame of sentences.
6. Conclusion
It is clear that features of several Bantu languages have been borrowed into
Katanga Swahili: auxiliary verbs to express aspect, locative noun prefixes
(that are similar to prepositions), negative verbs to express negation, and an
analytic relativization strategy. It is not always possible to identify one lan-
guage as the source language because of the strong structural similarities
between potential source languages. It is clear, however, that those elements
that have been borrowed into the language can all be described as analytic.
The change from an agglutinative morphosyntax to a more analytic structure
is all the more striking since all of the languages involved are related Bantu
languages, all of them agglutinative in structure. This particular change may
be explained as a result of strategies promoting semantic transparency in
first language acquisition and second language learning under less than ideal
learning conditions.
The other major instance of borrowing, the use of French discourse mark-
ers, fits a general pattern in language contact, well-documented in many stud-
ies (see e.g. the contributions to Maschler, ed. 2000).
Abbreviations and transcription conventions
(0.5) pause (in seconds)
↓ sharp low-falling pitch contour
2,6-[noun] class 2 or 6 noun prefix
cop copula
dem demonstrative
fin final vowel of verb form
inf infinitive
int intensive aspect
loc locative
neg negation
past past tense
prs present tense
rel relativizer
tag tag
tma tense/mood/aspect
top topicalizer
Notes
1. This chapter is based in part on de Rooij 1995, 1996, 1997. I am grateful to those
who commented on these earlier publications and to the editors of this volume
for their comments. All usual disclaimers apply.
Unless indicated otherwise, all examples in this chapter were selected from the
author’s fieldwork data. Fieldwork was carried out in Lubumbashi, DR Congo, in
1991 (June–October) and 1992 (June–December) with grants from the Institute
for Functional Research into Language and Language Use (IFOTT) and the
Netherlands Foundation for the Advancement of Tropical Research (WOTRO).
The financial support of both institutions is hereby gratefully acknowledged.
2. In 1971, Katanga was renamed Shaba as part of Mobutu’s policy of ‘zaïrianization’.
After Mobutu’s removal from power in 1997, this renaming was
undone and the province was given back its former name, Katanga.
3. Due to the fact that nearly all research on Swahili as spoken in Katanga has been
carried out in Lubumbashi, the capital of Katanga, some authors refer to the lan-
guage as ‘Lubumbashi Swahili’ (see e.g. Gysels 1992, Polomé 1968, Schicho
1982). How and to what degree Swahili as spoken on the Copperbelt resembles
varieties of Swahili spoken elsewhere in Katanga remains an open question.
References
Blommaert, Jan
1999 Reconstructing the sociolinguistic image of Africa: Grassroots writing
in Shaba (Congo). Text 19: 175–200.
2004 Writing as a problem: African grassroots writing, economies of literacy,
and globalization. Language in Society 33 (5): 643–671.
Bostoen, Koen
c. 1997 Het Shaba-Swahili: Geschiedenis en bronnen [Shaba Swahili: History
and sources]. MA Thesis, Ghent University.
Burssens, Amaat
1939 Tonologische Schets van het Tshiluba (Kasayi, Belgisch Kongo). Ant-
werpen: De Sikkel.
Fabian, Johannes
1986 Language and Colonial Power: The Appropriation of Swahili in the
Former Belgian Congo, 18801938. Cambridge: Cambridge Univer-
sity Press.
1990 History from Below: The ‘Vocabulaire of Elisabethville’ by André Yav
(Texts, translations, and interpretive essay). Amsterdam: John Ben-
jamins.
2000 The history of Zaire as told and painted by Tshibumba Kanda Matulu
in conversation with Johannes Fabian, Third Session, Part 2. Archives
of Popular Swahili 2 (7). http://www2.fmg.uva.nl/lpca/aps/tshibum-
ba3b.html (3 August 2007).
Fetter, Bruce
1976 The Creation of Elisabethville 1910–1940. Stanford: Hoover Institution
Press.
Grégoire, Claire
1975 Les locatifs en bantu. (Annales du Musée de l’Afrique Centrale, Série
in-8º, Sciences Humaines, no 83.) Tervuren: Musée de l’Afrique Cen-
trale.
Gysels, Marjolein
1992 French in urban Lubumbashi Swahili: Codeswitching, borrowing, or
both? Journal of Multilingual and Multicultural Development 13 (1/2):
4155.
Kabamba, Mbikay
1979 Stratigraphie des langues et communications à Lubumbashi. Problèmes
sociaux zaïrois 124–125: 47–74.
Kamba Muzenga
1981 Les formes verbales négatives dans les langues bantoues. (Annales
du Musée de l’Afrique Centrale, Sciences Humaines 106.) Tervuren:
Musée de l’Afrique Centrale.
Kapanga, André Mwamba

1993 Shaba Swahili and the processes of linguistic contact. In: Francis Byrne
and John Holm (eds.), Atlantic Meets Pacific: A Global View of Pidginization
and Creolization, 441–458. Amsterdam: John Benjamins.
Kashoki, Mubanga E.
1968 A Phonemic Analysis of Bemba: A Presentation of Bemba Syllable
Structure, Phonemic Contrasts and Their Distribution. (Zambian Pa-
pers 3.) Manchester: Manchester University Press.
Maschler, Yael (ed.)
2000 Discourse Markers in Bilingual Conversation. Special issue of Inter-
national Journal of Bilingualism 4 (4).
Möhlig, Wilhelm J. G.
1981 Die Bantusprachen im engeren Sinn. In: Bernd Heine, Thilo C. Schadeberg,
and Ekkehard Wolff (eds.), Die Sprachen Afrikas, 77–116. Hamburg:
Helmut Buske Verlag.
Polomé, Edgar C.
1968 Lubumbashi Swahili. Journal of African Languages 7 (1): 14–25.
Romaine, Suzanne
1988 Pidgin and Creole Languages. London: Longman.
De Rooij, Vincent A.
1995 Shaba Swahili. In: Jacques Arends, Pieter Muysken, and Norval Smith
(eds.), Pidgins and Creoles: An Introduction, 179–190. Amsterdam/
Philadelphia: John Benjamins.
1996 Cohesion through Contrast: Discourse Structure in Shaba Swahili/
French Conversations. Amsterdam: Ifott.
1997 Shaba Swahili: Partial creolization due to second language learning and
substrate pressure. In: Arthur K. Spears and Donald Winford (eds.),
The Structure and Status of Pidgins and Creoles, 309–339. Amsterdam/
Philadelphia: John Benjamins.
2000 French discourse markers in Shaba Swahili conversations. International
Journal of Bilingualism 4 (4): 447–467.
Schicho, Walter
1982 Syntax des Swahili von Lubumbashi. Vienna: Afro-Pub.
1988 Tense vs. aspect in Sango and Swahili of Lubumbashi. In: Moham-
mad Ali Jazayery and Werner Winter (eds.), Languages and Cultures:
Studies in Honor of Edgar C. Polomé, 565–579. Berlin: Mouton de
Gruyter.
1990 AUX, Creole und Swahili von Lubumbashi. Zeitschrift für Phonetik,
Sprachwissenschaft und Kommunikationsforschung 43 (4): 476–483.
1992 Non-acceptance and negation in the Swahili of Lubumbashi. African
Languages and Cultures 5 (1): 75–89.
Vitale, Anthony J.
1981 Swahili Syntax. Dordrecht: Foris.
Grammatical borrowing in Khuzistani Arabic
Yaron Matras and Maryam Shabibi
1. Background
The Khuzistani dialect of Arabic (henceforth Kh. Arabic) is spoken natively
by over 3 million people, who constitute roughly 7 percent of the population
of the province of Khuzistan in south-western Iran. The dialect is the east-
ernmost representative of the continuum of Mesopotamian dialects of Arab-
ic, which cover the river lands of southern Iraq in the west (Ingham 1982:
14). Arab settlement in the area is believed to go back to the beginning of
the Christian era. In the centuries following the advent of Islam, the Arabic
language enjoyed the status of the literary language of religion, scholarship,
and administration, as well as being the primary language of everyday com-
munication in the province. This changed with the coming to power of Reza
Shah Pahlavi in 1926 and the introduction of an intensive campaign favouring
Persian as the only official state language. The policy included the settlement
of a Persian-speaking population in the province.
Unlike other dialects of Arabic, Kh. Arabic has not attracted much atten-
tion within the linguistic community. Ingham (1997) devotes a chapter to the
dialect, focusing, however, on an introductory discussion of phonology and
vocabulary only; and Shabibi (2006) provides an overview of the structures
of the dialect along with an analysis of contact-induced developments in mor-
phosyntax.
Persian is now the only language of education, local media and newspa-
pers, administration, and most urban commerce in the province of Khuzistan.
Arabic is the language of the family and Arabic-speaking neighbourhoods,
though even as an informal language it is now in decline, and Persian is the pre-
ferred language of the younger generation born since the 1970s. All educated
adult speakers of Arabic are bilingual, and Arabic monolingualism is limited
to the uneducated older generation, and to the older generation in rural com-
munities. Arabic literacy is limited by and large to reading the Qur’an, and
to a very basic level of instruction in Modern Standard Arabic, though even
most educated Arabs have no active command of Modern Standard Arabic.
There is, however, considerable exposure to Arabic-language satellite media,
and so to the broadcast (oral) version of Modern Standard Arabic. In some
cases, Khuzistani Arabic speakers are able to read modern Arabic by drawing
on their exposure to these media, combined with their basic familiarity with
the Arabic script and with Classical Arabic (Qur’an).
2. Phonology
The only apparent phonological contact phenomenon in Kh. Arabic is the
interchange of /ɣ/ and /q/, in words such as /ɣarīb, qarīb/ ‘close’ (cf. Modern
Standard Arabic /qarīb/ ‘close’, /ɣarīb/ ‘strange’). This matches the realiza-
tion in Persian of etymological /q/ as /ɣ/. Phonemes that are otherwise absent
from the Arabic system, most notably /p/, are retained in Persian loanwords:
panjara ‘window’,
3. Morphological typology
A major change under Persian influence is the levelling of the status of at-
tributes. In Arabic, adjectival attributes follow the head noun, and agree with
the head noun in gender, number, as well as in definiteness:
(1) Standard Arabic (and other dialects)

a. walad kabīr
boy big.m
‘a big boy’
b. l-walad l-kabīr
def-boy def-big.m
‘the big boy’
Nominal attributes, by contrast, are conjoined by means of the attributive
Iḍāfa-construction, whereby only the dependent (genitive) noun is overtly
marked for definiteness:
(2) Standard Arabic (and other dialects)

walad l-mudīr
boy def-director
‘the director’s son’
In Persian, both types of attributes are treated in the same way: The attribute
(whether adjectival or nominal) follows the head, and an attributive particle
(the Ezāfe marker) mediates between the two:
(3) Persian
a. pesar-e bozorg
boy-ez big
‘the big boy’
b. pesar-e modīr
boy-ez director
‘the director’s son’
c. xūne-ye sefīd
house-ez white
‘the white house’
d. moʔallem-e madrese
teacher-ez school
‘the school teacher’
The pattern in Kh. Arabic matches the Persian arrangement (note that, as
in other dialects of Arabic, the definite article l- assimilates to dental conson-
ants, resulting in gemination of that consonant):
(4) Khuzistani Arabic

a. walad č-čibīr
boy def-big.m
‘the big boy’
b. walad l-modīr
boy def-director
‘the director’s son’
c. bīәt l-abyaḍ
house def-white
‘the white house’
d. moallәm-at l-madrәsa
teacher-f.cons def-school
‘the school teacher’
Note that in the adjectival attributive construction in (4a and 4c) overt def-
initeness agreement between noun and adjective is lacking, just like in the
genitive attribute construction in (4b and 4d). Based on the Persian model,
Kh. Arabic has reanalysed the definite article in such constructions as a mark-
er of attribution, which matches the Persian (definite) Ezāfe marker -(y)e. Its
distribution now resembles that of the Persian Ezāfe attributive marker: It ap-
pears, like Persian -(y)e, between the two constituents of the attribution, and
it is used to link both adjectival and nominal attributes.
Further evidence that the functions of the Persian construction are mapped
onto Arabic structures is provided by the position of the feminine Construct
State or Iḍāfa-marker -at, seen in (4d) in a position that is not untypical of
Arabic as a whole. In Arabic, the Construct State marker (still recognisable in
the vernaculars only in the feminine singular) is reserved for nominal attribu-
tion, as in (4d). But in Kh. Arabic we find it in adjectival attributive construc-
tions as well, as in (5a–b); it even attaches directly to adjectives, as in aly-at
‘high.f’ in (5b):
(5) Khuzistani Arabic

a. jazīr-at l-xaḍra
island-f.cons def-green
‘the green island’
b. ṭōf-at aly-at l-bīәt
wall-f.cons high-f.cons def-house
‘the high wall of the house’
This matches again the distribution of the Persian Ezāfe marker -(y)e (6):
(6) Persian
a. jazīre-ye sabz
island-ez green
‘the green island’
b. dīvār-e boland-e xūne
wall-ez tall-ez house
‘the high wall of the house’
Note that in the ‘mixed’ type, in (5b), involving both an adjectival attribute
(‘high wall’) and a genitive attribute (‘wall of the house’), the first (adjectival)
attribution relies exclusively on the Construct State marker, while the second
(nominal) relies on the combination of the Construct State marker with the
following definite article. In fact, Kh. Arabic allows for variation in such
cases, and the Construct State marker may be accompanied by a definite art-
icle in both positions. Consider example (7), where the nouns are masculine,
and there is no option of using an overt Construct State marker:
(7) a. Khuzistani Arabic

walad č-čibīr l-modīr
boy def-big def-director
‘the director’s big/eldest son’
b. Standard Arabic (and other dialects)
walad l-mudīr l-kabīr
boy def-director def-big
c. Persian
pesar-e bozorg-e modīr
boy-ez big-ez director
The crucial aspect of the Kh. Arabic construction is (1) to have a marker of
attribution mediating between the head and its attribute, (2) to place the at-
tribute in a position immediately following its head, and (3) to avoid any overt
marking of definiteness in the adjectival attribution. In all this, Kh. Arabic
copies precisely the Persian attributive construction. Contrasting with Per-
sian, it retains a distinct marking of attribution with feminine singulars, but
allows this marking to assimilate into the generic function of the attributive
marker. The outcome of the process is (1) the loss of the distinction between
nominal and adjectival attribution, (2) the loss of overt marking of definite-
ness in attributive constructions, (3) a change in the word order in complex
(‘mixed’) attributive constructions (as in 5b and 7a), and, finally, (4) gender
variation in the marking of the attributive construction, with optional use of
the definite article to accompany the Construct State in feminine singulars in
complex attributions.
The most notable contact-induced change in Kh. Arabic nominal structures is
the status of the Iḍāfa-construction alluded to above. The replication of a con-
struction type that is similar to the Persian Ezāfe leads, as discussed above,
to the abandonment of definiteness agreement. The decline of overt definite-
ness marking can also be observed in other constructions in the language,
notably in the absence of a definite article with definite head nouns of rela-
tive clauses. This too follows a Persian model (where definiteness generally
remains unmarked):
(8) Khuzistani Arabic

mara lli šift-ū-ha xābar-at.
woman rel saw-2pl.m-3sg.f called-3sg.f
‘The woman that you saw called.’
In the derivation of verbs, the tendency to paraphrase inchoative and causative
verbs drawing on an analytic construction rather than on derivational
morphology, although found in other dialects of Arabic, appears to be re-
inforced by Persian. Thus we find:
(9) Khuzistani Arabic
š-šijra z-zɣīr-a šwayye šwayy tṣīr čibīr-a.
def-tree def-small-f little little become.3sg.f big-f
‘The small tree gradually grows.’
Loan-verbs appear to be limited to the replication of Persian compound
verbs consisting of a nominal stem (masdar) and a verbalizing element or
‘light verb’ (Persian kardan ‘to do’ or šodan ‘to become’). The nominal stem,
often itself an Arabic loan into Persian, is replicated directly in Kh. Arabic,
while a corresponding native light verb ṣaww- ‘to do’ is employed for Persian
kardan, and ṣār- ‘to become’ for Persian šodan, thus: Persian taaɣīb-eš kard
‘he followed him’ (follow-3sg did.3sg) is rendered ṣawwā-h taɣīb
(did.3sg-3sg follow).
An additional change to the verb system, brought about through Persian
influence, concerns the tense system. Persian has both a simple past tense,
which is expressed by the person-inflected past stem of the verb, and a com-
posite past tense, which consists of a past participle and an auxiliary. The
auxiliary, based on the existential verb, may inflect for person as well as
tense; the present-tense auxiliary is used to form the perfect, the past-tense
auxiliary forms the pluperfect. Arabic, by contrast, has only one, simple past
tense, though combinations of the past-tense existential verb with the lexic-
al verb (usually in the imperfect or present-future) are also possible, usually
expressing habitual aspect or conditional mood. Kh. Arabic copies the Persian
composite past tense, drawing on inherited resources. The only available
participle form in Arabic is the present participle, which inflects for gender
and number (but not for person), and it is this form that serves as the basis for
the composite past tense in Kh. Arabic. Since the Arabic existential verb does not
have a present-tense form, the only available auxiliary is a past-tense auxil-
iary; the construction thus matches the Persian pluperfect:
(10) a. Khuzistani Arabic

mәn rәħ-әt lә-l-bīet, huwwa mā-rāyәħ čān.
when went-1sg to-def-home he neg-going.sg.m was.3sg.m
‘When I went home he had not gone away.’
b. Persian
vaɣti raft-am xūne, ūn na-rafte būd.
when went-1sg home he neg-gone was.3sg.m
‘When I went home he had not gone away.’
(11) a. Khuzistani Arabic

mәn gabul šāyfat-ha čәnәt.
from past seeing.sg.f-3sg.f was.1sg
‘I had seen her before.’
b. Persian
az ɣabl ūn-o dīde būd-am.
from past 3sg-acc seen was-1sg
‘I had seen her before.’
A series of Persian discourse markers, fillers, tags, and focus particles are
used in Kh. Arabic. Most of these elements are well integrated into Kh. Arab-
ic and are not perceived by speakers as foreign. The category that is most
obviously influenced by Persian is that of discourse markers with a primar-
ily interaction-qualifying rather than syntactic-semantic function: xō/xōb/xōš
‘well’, xōlāse ‘in sum’, albate ‘of course’, hič ‘at all, altogether’, ham ‘indeed,
well’:
(12) xōb w-hāy sabab ham l-laði gabal čān …

dm and-this reason dm rel once was
‘Well, and the reason that indeed once existed for this …’
(13) xolāse hīč mā-rəħ-na madrasa.

dm dm neg-went-1pl school
‘After all, we didn’t go to school at all.’
(14) albate čān-an ham b-ðīč z-zamān banāt č-čān-an

dm were-3pl.f dm in-that def-time girls rel-were-3pl.f
yarħ-an.
go-3pl.f
‘Of course there were indeed girls at the time who used to go [to
school].’
These are accompanied by Persian-derived focus particles: ham ‘too’ and ham
… ham ‘both … and’.
(15) ðīč ənd-ha θnīən frūx ana ham ənd-ī θnīən.
that.f poss-3sg.f two children I too poss-1sg two
‘She has two children, and I have two children, too.’
(16) ham ana w ham alī rəħ-na l-əl-pārk.

both I and also Ali went-1pl to-def-park
‘Both Ali and I went to the park.’
(17) umm-ī ham ɣəsl-at lə-mmāīn ham naḍḍəf-at

mother-1sg both washed-3sg.f def-dishes and cleaned-3sg.f
l-bīət.
def-house
‘My mother [both] washed the dishes and cleaned the house.’
Optional, occurring in variation alongside various Arabic-derived counter-

parts such as ħatta ‘even’ or lākin ‘but’, is the contrastive correlative balke
‘but [… also]’:
(18) huwwa mū bass bāhūš balke šujjā ham.

he neg only clever but brave too
‘He is not only clever but also brave.’
Further Persian borrowings that are generalized in Kh. Arabic are the concessive
subordinating conjunctions agarče and bā īnke, both ‘although/even
though’, and the factual complementizer ke ‘that’:
(19) huwwa rāħ lwaħda l-əl-pārk agarče umm-a

he went.3sg.m alone to-def-park although mother-3sg.m
gall-at l-a lā-yrūħ.
said-3sg.f to-3sg.m neg-go.3sg.m
‘He went to the park alone, even though his mother told him not to go.’
(20) rayyāl-na bə-l-yōm xəṭab, bā īnke θaləθtaš sana

man-1pl in-def-day proposed.3sg.m although thirteen year
umr-ī sawwūm rāhnamāī ubū-y qəbal b-ī
age-1sg third secondary school, father-1sg accepted for-3sg.m
‘When my husband proposed, although I was [just] thirteen years
old, third year of secondary school, my father agreed.’
(21) tədr-īn ke rayl-əč ala kəl-šī čaððab.

know-2sg.f comp husband-2sg.f on everything lied.3sg.m
‘You know that your husband lied about everything.’
The latter, the Persian complementizer and relativizer ke, does not appear
in non-factual (subjunctive) complements, where instead we find the Arabic
(historical) relativizer l-laði or illi, which also continues to cover the func-
tion of a relativizer. Nonetheless, occasionally Persian ke is also found in the
position of the relativizer:
(22) əbən uxū ɣāzī ke huwwa w mart-a hnā …

son brother Ghazi rel he and wife-3sg.m here
‘Ghazi’s nephew, who is here with his wife …’
From this we might assume a gradual process of convergence in steps, as
follows: in stage 1, the Persian model of having an identical marker for com-
plement clauses and relative clauses (ke) is copied into Kh. Arabic, with the
effect of generalizing the relativizer l-laði/illi (at the expense of the historical
Arabic complementizer ’inn-) to cover the function of complementizer. The
result is a convergence of patterns between the two languages. In stage 2, the
actual Persian marker ke is adopted into Kh. Arabic in factual complement
clauses, as seen in (21). The result is a split within Kh. Arabic between factual
and non-factual complements, whereas the same marker is used in both lan-
guages to introduce factual complements. Finally, in stage 3, the beginnings
of which are attested in the contemporary language, Persian ke infiltrates
Kh. Arabic relative clauses as well, as seen in (22).
One change in constituent order has already been mentioned above, in Section 3:
it concerns the shift in ‘complex’ attributive constructions, away from
the Arabic norm, which allows an adjectival modifier to be separated from its
head (by a nominal modifier of the complex noun phrase), toward the Per-
sian-type constituent order, whereby each attribute must immediately follow
its head. We repeat example (7) here:
(7) a. Khuzistani Arabic

walad č-čibīr l-modīr
boy def-big def-director
‘the director’s big/eldest son.’
b. Standard Arabic (and other dialects)
walad l-mudīr l-kabīr
boy def-director def-big
‘the director’s big/eldest son’ (also: ‘the big director’s son’)
c. Persian:
pesar-e bozorg-e modīr
boy-ez big-ez director
A further issue related to the order of constituents in Kh. Arabic concerns
the position of the copula-auxiliary /čān/, which, in the composite past tense
(pluperfect), follows the lexical verb: mā-rāyәħ čān ‘he had not gone away’
(Persian: na-rafte būd) (see examples 10 and 11).
Noteworthy is also the flexible position of the causal conjunction čīe ‘be-
cause’. Like its Persian counterpart čon, it can also occupy the final position
in the adverbial clause expressing cause:
(23) a. Khuzistani Arabic

līeš mā-reħ-tī l-әl-madrasa?
why neg-went-2sg.f to-def-school
čān edd-ī xuṭṭār čīe.
was.3sg with-1sg guests because
‘Why didn’t you go to school?’
‘Because I had guests.’
b. Persian
čerā be madrese na-raft-ī?
why to school neg-went-2sg
mehmūn dāšt-am čon.
guest had-1sg because
‘Why didn’t you go to school?’
‘Because I had guests.’
Finally, we must consider what appears to be the beginning of a shift in
word order, extending the contexts in which Object–Verb order is favoured
to comply more frequently with the Persian type. Object–Verb order in Arab-
ic is generally highly marked and is employed as a means to topicalize the
direct object. Kh. Arabic makes use of such strategies, which include – un-
like Persian, where OV prevails – the pronominal resumption of the object
in a position following the lexical verb. Nevertheless, such constructions in
Kh. Arabic do not necessarily express the topicalization of the object:
(24) lə-bnayya d-dār naḍḍəf-at-ha.

def-little.girl def-room cleaned-3sg.f-3sg.f
‘The little girl cleaned [it] the room.’
(25) haðan xālāt-ī līsāns-hən kaẓẓ-ann-a.

these aunts-1sg degree-3pl.f gained-3pl.f-3sg.m
‘My aunts received [it] their degree.’
8. Lexicon
The presence of numerous Persian lexical borrowings is a distinguishing feature
of Kh. Arabic, setting it apart from other neighbouring dialects of Arabic.
Nevertheless, there is considerable sociolinguistic stratification in the use of
Persian vocabulary among different groups of speakers (cf. Shabibi 1998). As
the principal language of the public sphere, Persian supplies numerous lexic-
al items in the domains of trade, institutions, tools, and other aspects of public
and technical life (e.g. xarīd-o-furūš ‘trade’, pīč guštī ‘screwdriver’, lebās šū’ī
‘washing machine’, etc.). In everyday vocabulary, Persian idioms are common-
ly calqued in Kh. Arabic, facilitated by the fact that those idioms themselves
are often based on Arabic loan vocabulary in Persian, and so even more easily
replicable in Kh. Arabic: Consider Kh. Arabic wāyәd mamnūn, lit. ‘very grate-
ful’, in the sense of ‘thank you very much’, based on Persian xeyli mamnūn, or
Kh. Arabic yarreti zaħma, lit. ‘you have taken trouble [on my behalf]’, also
an expression of gratitude, from Persian zahmat kešīdī. Here, the fact that the
languages already share a large part of their vocabulary (as a result of earlier,
historical influence of Arabic on Persian), makes replication of lexical Matter
redundant, and promotes in turn replication of idiomatic Patterns surrounding
a pivotal word in the idiom that is already shared by both languages.
9. Conclusion
Matter replication of Persian material is found in Kh. Arabic primarily in the
domain of lexical vocabulary, and in part in grammatical vocabulary, cover-
ing discourse markers that operate strictly on the interaction level (i.e. not
conjunctions), focus particles, a correlative particle, a complementizer and
relative particle, and concessive subordinating conjunctions. Pattern repli-
cation is most notable in the emerging change of constraints on word order
(extension of marked word-order patterns), the favouring of analytic con-
structions and emergence of a new analytic past tense (pluperfect), and the
reduction of overt marking of definiteness. Perhaps the most remarkable con-
tact-induced change, one which strongly affects the typology of attribution in
the language, is the identification of Kh. Arabic grammatical morphemes in
attributive constructions – the Construct State marker (visible in the feminine
singular only) and the definite article that appears between head and attribute
– with the Persian attributive particle, and the consequent merger of two his-
torically distinct attributive constructions – adjectival and nominal – into a
single type, replicating the state of affairs in Persian.
Abbreviations
acc accusative
comp complementizer
cons construct state marker
cop copula
def definite article
dm discourse marker
ez Persian Ezāfe attributive marker
f feminine
indef indefinite article
m masculine
neg negation
past simple past (perfective)
pl plural
poss possessive expression
rel relative particle
sg singular
References
Ingham, Bruce
1982 North East Arabian Dialects. London: Kegan Paul International.
1997 Arabian Diversions: Studies on the Dialects of Arabia. Reading: Ith-
aca Press.
Shabibi, Maryam
1998 Variation in the use of Persian loan words among Iranian Arabic speak-
ers. Unpublished Msc. Dissertation. University of Edinburgh.
2006 Contact-induced grammatical changes in Khuzistani Arabic. Ph.D.
thesis. University of Manchester.
Grammatical borrowing in Domari
Yaron Matras
1. Background
Domari (also Domi, sometimes also Qurbati) is the Indo-Aryan language spo-
ken by a population of commercial nomads in the Middle East. The language
retains some archaic Indo-Aryan features, such as the Middle Indo-Aryan
present-tense conjugation, but also shows some radical re-structuring in the
past-tense conjugation and in syntactic typology. In this respect it resem-
bles Romani, also an Indo-Aryan diaspora language of originally peripatetic
groups. The self-appellations rom and dom are also related, both deriving from
Indic ḍom. However, some isoglosses separating the two languages appear to
be rather ancient, and it is highly unlikely that they both split from the same
ancestor language after leaving India.
The present chapter deals with the only dialect that has been extensively
documented – that of the Palestinian Doms of Jerusalem (see Matras 1999;
Macalister 1914). The language is confined strictly to oral use within the fam-
ily and, to a limited extent, with members of other Dom communities. The pre-
cise history and date of arrival of the group in the region remain unknown. The
language shows a layer of Kurdish influence, and the community has a sense
of affinity to another, Arabic-speaking population of commercial nomads who
are referred to as “Kurds”. Arabic has been the principal contact language
for many centuries. The Jerusalem community began shifting to Arabic in the
1960s, and individuals who were raised since this period are largely monolin-
gual in Arabic, with only passive exposure to Domari. It is estimated that out
of a total number of between 600–1000 community members, only around
10 percent speak Domari fluently; the language is thus endangered or even
moribund. Whatever information is available suggests that this is also the situ-
ation at least in urban Dom communities elsewhere in the Middle East.
2. Phonology
The Domari sound system strongly resembles that of its contact language, Pal-
estinian Arabic, though it is not always obvious that this is due to borrowing
or convergence. All consonants with the exception of /p/, /v/ (interchangeable
with /w/), /tš/ and /g/ are shared with Arabic. Uvular /q/ appears also in the
pre-Arabic component and may be the outcome of Iranian (Kurdish) influ-
ence (e.g. qišṭoṭa, alongside kišṭoṭa ‘small’). The glottal stop /ʔ/ is distinctive
only in the postposed negation marker -éʔ:
(1) n-mang-am-éʔ
neg-want-1sg-neg
‘I don’t want.’
The pharyngeals [ħ] and [ʕ] appear to be restricted to Arabic-derived lexical
loans, but pharyngealization of dentals /t, d, s, z/ is transferred to the
pre-Arabic component as well: [wɑ:ṭ] ‘stone’, [ḍɑnḍ] ‘tooth’. Consonant gemination
also appears independently of the Arabic component: [tilla] ‘big’. Under
Arabic influence, the affricates /tš, dž/ are being reduced to sibilants /š, ž/.
In the case of the voiced affricate, a similar change has fairly recently taken
place in the Arabic dialect as well, and some variation is still observable.
All vowel sounds with the exception of [ɔ] and [ʌ], both rather infrequent,
are shared with Arabic. As in Arabic, there is variation in the realization of
the short vowel phonemes /i/ [i, , ɪ] /a/ [a, æ, ɑ] and /u/ [u, ʉ, ʊ], and of long
/ā/ [a:, æ:, ɑ:], with back vowels preferred in the vicinity of uvular and
pharyngealized consonants, as in [ṭɑ:ṭ] ‘Arab’, but [tat] ‘heat’. As in Arabic, the
vowels /u/ and /i/ are often interchangeable: džuwir/džiwir ‘woman’ (also a
feature of Kurdish). Prothetic and epenthetic vocalization around initial con-
sonant clusters can also be regarded as a general regional phenomenon. Pros-
ody and intonation are largely shared with Arabic.
As a New Indo-Aryan language, Domari will have undergone a re-structuring
of its past tense formation leading presumably at some stage to the emergence
of split morphological ergativity. Evidence to this effect is the use of the ori-
ginally oblique form of the pronoun for ama ‘I’, and the construction of the
past-tense conjugation, based on the attachment of what once were oblique
personal clitics (kard-o-m ‘I did’ < *karda-o-me ‘done-by-me’) (cf. Matras
2002: 145–151). The language is, however, no longer ergative, though this
might be attributed to the general drift away from ergativity in northwestern
Indo-Aryan frontier languages as well as in Iranian, and there is no con-
crete evidence linking it with the influence of Arabic, which is not ergative,
either.
The overall similarity in word-order rules results in a similar positioning of
nominal objects in the sentence. The most extensive Arabic influence on nom-
inal structures is in the domain of local relations, more specifically the almost
wholesale borrowing of Arabic prepositions. We can assume that Domari
lacked prepositions altogether before contact. Local relations expressions
that are not borrowed are expressed either by case suffixes, or by genitive-
possessive location expressions which consist of a location adverb inflected
for (oblique) possession and the Locative case, preceding a head in the Ab-
lative. There is only a very small set of such inherited location adverbs, all
expressing strict spatial relations: mandža ‘in’, bara ‘out’, paš ‘behind’, agir
‘in front’, atun ‘above’, axār ‘below’, and čanč- ‘next to’:
(2) čanč-is-ma kury-a-kī

next.to-3sg.obl-loc house-obl.f-abl
‘next to the house’ (lit. ‘in its-side from-the-house’).
Arabic-derived prepositions may be integrated into this format, as long as the

meaning is stative:
(3) žamb-is-ma lāč-a-ki

next.to-3sg.obl-loc girl-obl.f-abl
‘next to the girl’ (< Arabic žamb ‘next to’).
Temporal and more specified spatial relations are generally expressed through
Arabic prepositions, the nouns appearing in the Ablative case (serving as a
general prepositional case):
(4) baʕd wars-ak-ki

after year-indef-abl
‘after a year’ (< Arabic baʕd ‘after’)
The Arabic prepositions maʕ ‘with’, min ‘from’, la ‘to’, fī ‘in’ and ʕind ‘at’
compete with the synthetic cases Associative, Ablative, Dative, and Locative
respectively. Occasional doubling of case may be observed, e.g. Arabic fī ‘in’
+ Locative case -ma (see example 6), though in general these prepositions
too trigger the default Ablative/Prepositional case on the noun (example 5):
(5) maʕ bɔy-im-ki

with father-1sg.obl-abl
‘with my father’ (< Arabic maʕ ‘with’)
(6) fī šare-ma
in street-loc
‘on the street’ (< Arabic fī ‘in’)
Non-Arabic expressions are preferred only with pronominal clitics:
(7) a. ab-us-ke
for-3sg-ben
‘for him’
b. minšān zirt-an-ki
for child-obl.pl-abl
‘for the children’ (< Arabic minšān ‘for’)
(8) a. nkī-man
at-1pl
‘by us/at ours’
b. ʕind wud-as-ki
at old-obl.m-abl
‘at the old man’s’ (< Arabic ʕind ‘at’)
Domari has two formats for the possessive construction:
(9) a. bɔy-im kuri

father-1sg.obl house
b. kury-os bɔy-im-ki
house-3sg.nom father-1sg.obl-abl
‘my father’s house’
The second, (9b), in the order head–determiner, is by far the more widespread,
and contrasts with the normal Indo-Aryan (including Romani) determiner–
head construction. It matches however the Iranian type (cf. Kurdish mal-a
bav-ê min ‘house-det father-det me’) as well as the Arabic type bēt abū-y
‘house father-1sg’, and is matched even more closely by the (less frequent)
Arabic construction bēt-ō la-ʔabū-y ‘house-3sg to-father-1sg’.
As in Arabic, citation forms of many inalienable nouns must include pos-
sessive marking, thus bɔy-om ‘my father’ for ‘father’, cf. Arabic abū-y. Arabic
plural suffixes are often retained with Arabic nouns, and usually doubled by
a Domari plural ending: Arabic mislim-īn ‘Muslim-s’, Domari mislim-īn-e.
Finally, the overuse of the Domari demonstratives aha/ihi/ehe with nouns,
and a slight erosion of their deictic focusing quality, as in ‘and this man went
into this house, to fetch this jar of water’, resembles the tendency in Arabic
discourse toward generalization of the reduced demonstrative hā- (< hāda/
hādi/hadōl), which tends to accompany the Arabic definite article in similar
contexts (Domari has no definite article).
Domari differs from Arabic in its structure of past tenses, but the two lan-
guages share a distinction between the present indicative and subjunctive
(Domari lah-ami ‘I see’, lah-am ‘that I see’; Arabic b-ā-šūf ‘I see’, ā-šūf
‘that I see’), as well as the absence of an explicitly structured future tense.
For aspectual distinctions, Domari relies directly on Arabic-derived auxil-
iaries expressing habitual, inceptive, and iterative aspect, which retain their
Arabic tense and person inflection. Unlike in Arabic, however, these auxil-
iaries are followed by an indicative, not subjunctive, form of the main lexic-
al verb:
(10) kunt aw-ami.

was.1sg come-1sg
‘I used to come.’
(11) ṣārat mangišk-ari.

began.3sg.f beg-3sg
‘She began to beg.’
(12) baqēt kamk-ami.

continued.1sg work-1sg
‘I continued to work.’
The conditional is formed by combining the Arabic auxiliary kān with the Domari
anterior past:
(13) a. Domari
law ēr-om xužoti kān laher-d-om-s-a.
if come.past-1sg yesterday cond see-past-1sg-3sg-ant
b. Arabic
law až-īt mbāreħ kān šuft-ō.
if came-1sg yesterday cond saw.1sg-3sg
‘If I had come yesterday I would have seen him.’
Modal expressions, with the exception of sak- ‘to be able to’, are all bor-
rowed from Arabic. The modal expression for obligation and necessity lāzim,
and that for possibility, mumkin, are impersonal. The expression for desire/
intention, bidd-, retains its Arabic person inflection, and is followed by the
Domari subjunctive.
Arabic loan-verbs are integrated into Domari by incorporating the isolated
stem of the subjunctive form – e.g. -štrī- from a-štrī ‘that I buy’, -īš- from
a-īš ‘that I live’ – into the inflected loan-verb integration markers -k- from
-kar- ‘to do’, for transitives, and -h(r)- ‘to (have) become’, for intransitives:
štrī-k-ami ‘I buy’, īš-hr-omi ‘I live’.
Domari and Arabic both lack a verb ‘to have’. Although contact influence
will not have been the source of the absence of ‘to have’, the specific Domari
possessive expression wāšī-m ‘with-me, at-mine’ for ‘I have’ (rather than a
construction of the type ‘to-me there-is’, as in other Indo-Aryan languages)
does resemble Arabic ʕind-ī ‘at-mine’. There are also similarities in the organization
of existential predications. While Arabic has nominal predications in
the present tense and lacks a present-tense copula, Domari, in contrast with
its overall SVO structure, retains an enclitic copula, which, in the past tense,
is modified by the (non-clitic, non-final) Arabic-derived copula kān:
(14) a. Domari
ama mišta-hr-omi.
I ill-cop-1sg
b. Arabic
ʔana ayyān.
I ill
‘I am ill’
(15) a. Domari
ama kunt mišta-hr-om-a.
I was.1sg ill-cop-1sg-ant
b. Arabic
ʔana kunt ayyān.
I was.1sg ill
‘I was ill.’
Like Arabic (but also Kurdish), Domari lacks infinitives in modal construc-
tions, and employs a present-subjunctive form of the embedded verb in-
stead:
(16) bidd-ī dža-m kury-a-ta

want-1sg go-1sg.subj house-obl-dat
‘I want to go home’
Domari shows a massive amount of Arabic loans in its grammatical vocabulary.
All numerals above ‘5’, with the exception of ‘10’ and ‘100’, are derived
from Arabic (e.g. sitte zirt-ēni ‘six children-pred’), and some speakers
also use Arabic numerals for ‘3, 4, 5’. The quantifiers akam ‘few’ and kull
‘all’ (alongside Kurdish-derived giš) are from Arabic, as are most indefinite
pronouns, with the exception of ek-ak lit. ‘one’ for ‘somebody, anybody’.
Most Arabic indefinite pronouns show only a shallow level of grammatical-
ization, and are similar in structure to the nouns from which they derive (e.g.
ħāža ‘thing, anything’, wāħad ‘one, anyone’, maħall ‘place, somewhere’).
They are integrated into Domari by adding the Domari indefinite article -ak:
maħal-ak ‘somewhere, anywhere, nowhere’, ħāža-k ‘something, anything,
nothing’. Other indefinites, such as dāʔiman ‘always’, are borrowed direct-
ly. Expressions for the days of the week, dates, and usually also seasons are
Arabic-derived, with times of the day showing a mixture of Turkish (ṣabaħtan
‘morning’), Arabic (zuhur ‘midday’), and Indic (arāti ‘night’).
Domari borrows the Arabic interrogatives ʔayy ‘which?’, qaddēš ‘how
much?’, and waqtēš ‘when?’ (though not from Jerusalem Arabic; they ap-
pear to have been adopted prior to settlement in the city, from Beduin or rural
dialects). Further pronouns borrowed from Arabic include the reciprocal
baḍ (laherde baḍ ‘they saw one another’), and the third-person resumptive
pronoun iyyā-, which retains Arabic-derived gender and number inflection
(mana illi torim iyyā-h ‘the bread that you gave me [it]’; see below, on rela-
tive clauses).
All connectors (coordinating and subordinating conjunctions), discourse
markers, interjections, particles, fillers, and tags in Domari are Arabic, and
assume the same position in the utterance as they do in Arabic. They include
ū ‘and’, fa ‘and so’, bass/lākin ‘but’, wila ‘or’, wala ‘nor’, iza / law ‘if’, lamma
‘when’, qabel mā ‘before’, bad mā ‘after’, badēn ‘and then’, yanī ‘that is’,
the phasal adverbs (e.g. lissa ‘still, yet, no longer’) and focus particles (e.g.
bass ‘only’, kamān ‘too’), the particles ʔa ‘yes’ and la ‘no’, and more. The
factual complementizer inn- carries Arabic inflection and, like in Arabic, may
either be impersonal, with a default third person masculine singular marker,
or agree with the subject of the complement clause:
(17) ama sin-d-om inn-o/inn-ak atu īšhr-ori hinēn.

I hear-past-1sg comp-3sg.m/comp-2sg you live.2sg here
‘I heard that you live here.’
While place deictics remain Indic, many temporal deictic expressions are
borrowed from Arabic, including hallaʔ ‘now’ and badēn ‘later’.
In the adjective, comparative and superlative forms are borrowed whole-
sale from Arabic, including their lexical forms, rendering all adjectives in the
language (except those whose positive forms are also Arabic-derived) supple-
tive:
(18) a. tilla–akbar–akbar wāhed

‘big bigger biggest’
b. qišṭoṭa–azɣar–azɣar wāhed
‘small smaller smallest’
This can be explained by the motivation to borrow a comparative/superlative
procedure, but the inability to segment the Arabic comparative/superlative
into analysable morphemes (cf. Arabic kbīr–akbar ‘big–bigger’).
Constituent-order rules in Domari are on the whole fully compatible with
Arabic. Both languages have flexible word order, with a tendency toward
SV(O) in isolated, categorical sentences, and with other patterns as options.
Since most Indo-Aryan and Iranian languages are SOV, this can be consid-
ered a clear case of convergence with Arabic. The only exception appears to
be in present-tense copula clauses, discussed above (examples 14–15). The
more frequent Domari possessive construction shows the order possessed–
possessor noun (example 9b), matching that of Arabic.
The original position of the Domari adjective was in front of the noun (see
Macalister 1914). Adjective-noun constructions are still encountered (19a),
but they are greatly outnumbered in all contexts by an alternative construc-
tion in which the noun is followed by the adjective, to which a non-verbal
predication marker is attached (19b):
(19) a. ēr-ī qišṭoṭ-ī šōnī.

came-3sg.f little-f girl
‘A little girl came.’
b. ēr-ī šōnī qišṭoṭ-ik.
came-3sg.f girl little-pred.f
‘A little girl came [A girl came, being little].’
The order of (19b) – originally a marked construction – matches the
noun–adjective order found in Arabic.
The position of local relation expressions is mixed. While adverbial constructions
based largely on inherited material may have the modifier in a position
following the noun (example 20), Arabic-derived prepositions occupy
the same pre-nominal position as in Arabic (example 21):
(20) kury-is-ma bara

house-3sg.obl-loc outside
‘outside the house’
(21) min kury-a-ki

from house-obl-abl
‘from the house’
8. Syntax
In both languages, non-verbal predications allow subject and predicate to
appear in adjacent positions, which might be interpreted as ongoing convergence
on the part of Domari with the structure of Arabic nominal clauses:
(22) a. Domari
ama mišta-hr-omi.
I ill-cop-1sg
b. Arabic
ʔana ayyān.
I ill
‘I am ill’
(23) a. Domari
wuda bizzot-ēk.
old.m poor-pred.m
b. Arabic
l-xityār miskīn
the-old.m poor.m
‘The old man is poor.’
Another sign of convergence in non-verbal predications is the fact that, while
Domari retains Indo-Aryan negation particles (na/n) elsewhere, in non-verbal
predications it adopts Arabic mišš:
(24) pandžī mišš bizzot-ēk.

3sg neg poor-pred.m
‘He is not poor.’
The Arabic verbal negator mā is used for the negation of modal and auxil-
iary verbs borrowed from Arabic, as well as in the vicinity of Arabic-derived
prepositions and particles:
(25) warik-ar-a mlāy-ēk minšan mā džan-ad-is.

wear-3sg-ant veil-pred.m so.that neg know-3pl.subj-3sg
‘She used to wear a veil so that one would not recognise her.’
Clause combining rules in the two languages are by and large identical,
Domari drawing entirely on the pool of Arabic connectors and subordinating
conjunctions:
(26) qabel mā dža-m xałłaṣk-ed-om kam-as.

before comp go-1sg.subj finish-past-1sg work-obl
‘Before I left I finished my work.’
(27) iza wars-ari, n-aw-am-eʔ.

if rain-3sg neg-come-1sg-neg
‘If it rains, I shall not come.’
(28) na kil-d-om bara liʔann-hā wars-ari.

neg go.out-past-1sg out because-3sg.f rain-3sg
‘I did not go out because it is raining.’
(29) ū daʔiman/ yaʕnī/ kunt ama kury-a-m-ēk wala.

and always that.is was.1sg I house-obl-loc-pred.f and.not
kil-šami wala aw-ami.
exit-1sg and.not come-1sg
‘And I was always/ I mean/ at home, not going out nor coming.’
(30) kānū lamma qayišk-ad-a kunt wēšt-am-a wāšī-san.

was.3pl when cook-3pl-ant was.1sg sit-1sg-ant with-3pl
‘When they were cooking I used to sit with them.’
As in Arabic, there is no infinitive in Domari, and the verbs of modal clauses
normally appear in the subjunctive (see example 16). Finally, relative clauses
in Domari take over the Arabic structure, including both the uninflected rela-
tivizer illi, and the presence of an Arabic-derived resumptive pronoun with
third person head nouns, which agrees with the head noun in gender and
number, retaining Arabic inflection:
(31) mana illi to-r-im iyyā-h

bread rel gave-2sg-1sg res-3sg
‘the bread you gave me’
(32) ple illi to-r-im iyyā-hum

money rel gave-2sg-1sg res-3pl
‘the money(pl) you gave me’
9. Lexicon
All semantic domains of the Domari lexicon adopt Arabic loans. The only
limitations on lexical borrowing are in the domain of grammatical vocabu-
lary, more specifically referential and deictic pronouns, and place deictics,
both of which categories seem uninfluenced by borrowings. Alongside Arabic
loans, Domari has also retained some Kurdish lexicon (zara ‘boy’, Kurdish
zaro), as well as Turkish lexical loans (qapi ‘door’, Turkish kapı).
10. Conclusion
It would be useful at this stage to remind the reader that under “borrow-
ings” and “loans” we understand those (Arabic-derived) forms in Domari, for
which the language has no inherited alternative; they are thus distinguished
from ad hoc switches or mixing patterns. This said, the extent of Arabic
borrowing into Domari can be described as nothing less than massive. It is
indeed easier to point out those domains in grammar in which borrowing is
not found; even for those, exceptions or some hedging of another kind can
usually be found: There is no borrowing of case inflection (but Arabic has no
synthetic case markers), of synthetic tense marking (though Arabic-derived
modality and aspect auxiliaries retain their Arabic tense inflection), of person
marking on verbs, prepositions, or nouns (though Arabic-derived modality
and aspect auxiliaries retain their Arabic person inflection, and Arabic per-
son agreement markers may appear with Arabic-derived complementizers
such as inn- ‘that’ or liʔann- ‘because’, as well as on the resumptive pronoun
iyyā-). There is also no borrowing of definite articles (which exist in Arabic
but not in Domari, but may occasionally accompany Arabic nouns in Domari
discourse), of personal pronouns, or of demonstratives. These domains thus
appear as “resistant to borrowing” – at least in the history of Domari so far;
but given the extent of grammatical borrowing in the language, we may have
a tentative indication of those domains of grammar which the forces of
contact-induced change may find more difficult to infiltrate. Already the presence
of (at least some) Arabic-derived items in the Domari set of lower numerals,
in verbal negation, and in existential constructions, puts Domari on the
extreme side of the continuum for grammatical borrowing.
A remarkable feature of Domari–Arabic contact is the reliance on the
borrowing of actual linguistic matter, or MAT-borrowing. While in some
domains this is the obvious choice, it is not at all self-evident that Domari
should use Arabic-derived prepositions, inflected aspectual auxiliaries, or
even subordinating conjunctions. The absence of language-internal gram-
maticalization processes to replicate the Arabic model (pattern replication
or PAT-borrowing) in these domains indicates considerable flexibility within
the speech community; it appears to allow itself to shift and re-define the
demarcation boundaries between the two separate sets of forms, rules and
constructions – the “internal code” Domari, and the “external code” Arabic –
which together constitute the speakers’ linguistic repertoire, and to maintain a
boundary that is almost symbolic, drawing only on a limited amount of every-
day vocabulary items, deictic and anaphoric reference tools, and the structur-
ing of tense and of person agreement, as the almost exclusive components
of the linguistic instrument used to flag and negotiate in-group identity. The
other linguistic-mental processing operations, most notably those associated
with discourse and utterance organization and clause combining, rely entire-
ly on Arabic structures; for these operations, the two codes are inseparable,
having undergone “fusion” (cf. Matras 1998). In this respect, the absence of
PAT-borrowing in a series of grammatical domains might be interpreted as a
kind of “weak resistance” against the collapse of cross-linguistic demarcation
boundaries, or perhaps as “full acceptance” of fusion.
One outstanding domain that relies on pattern replication is the formation
of non-verbal predications. The presence of nominal sentences in Arabic,
but not in Domari, is a major typological difference between the languages.
Here, Domari accommodates by replicating at least one principal feature of
the Arabic nominal sentence, namely the placement of Subject and Predicate
in adjacent positions, not separated by a verb. The verbal element in Domari
then follows the predicate; somewhat ironically, this is also the only construc-
tion type in which Domari resists full accommodation to Arabic word-order
rules, maintaining a verbal copula in enclitic position. As discussed above,
in the past tense this difference too is minimized, once again by resorting to
MAT-borrowing of Arabic copula forms.
Abbreviations
abl    ablative
ant    anterior
ben    benefactive
comp   complementizer
cond   conditional
cop    copula
dat    dative
det    (possessive) determiner
f      feminine
indef  indefinite article
loc    locative
m      masculine
neg    negation
nom    nominative
obl    oblique
past   simple past (perfective)
pl     plural
pred   (non-verbal) predication marker
res    resumptive pronoun
sg     singular
subj   subjunctive
References
Macalister, R. A. S.
1914 The Language of the Nawar or Zutt, the Nomad Smiths of Palestine.
(Gypsy Lore Society Monographs 3). London: Edinburgh University
Press.
Matras, Yaron
1998 Utterance modifiers and universals of grammatical borrowing.
Linguistics 36: 281–331.
1999 The state of present-day Domari in Jerusalem. Mediterranean
Language Review 11: 1–58.
2002 Romani: A Linguistic Introduction. Cambridge: University Press.
Grammatical borrowing in Kurdish
(Northern Group)
Geoffrey Haig
1. Background
Kurdish is the cover term for a bundle of closely related west Iranian lan-
guages, spoken across a large area of the Middle East centering at the in-
tersection of the Turkish, Iranian and Iraqi national borders. The number of
speakers is variously estimated at between 20 and 40 million. Traditionally,
three major dialect clusters are identified: The Northern Group, often re-
ferred to as Kurmanji (also spelt Kurmanjî, Kurmanci, Kurmancî); the Cen-
tral Group, often referred to as Soranî; and the Southern Group. In terms of
numbers of speakers, the Northern Group is the largest, encompassing all
the Kurds of Turkey1 and Syria, plus the northernmost Kurds of Iraq (Zakho,
Dohuk), Kurds of west Iran around Lake Urmia, plus outliers in Azerbai-
jan, Armenia, and Georgia. The Central Group includes most of the Kurds
of Iraq around the cities of Suleimania, Kirkûk, and Erbil, plus speakers in
Iran around the cities of Sanandaj and Kermanshah. While the distinction
between Northern and Central Group Kurdish is not controversial, the exact
demarcation of the Southern Group remains hotly disputed, but I will not
enter these issues here. This chapter is concerned solely with the Northern
Group of Kurdish.
Speakers of the Northern Group have maintained long-standing relations
with speakers of many languages. Alongside the national languages such
as Arabic, Armenian, Azerbaijani, Georgian, Persian, Turkish and Russian,
there has been contact with numerous minority languages, for example var-
ieties of Eastern Neoaramaic, some indigenous languages of the Caucasus,
Turkoman, varieties of Romani (see for example the Chapter on Domari), to
name but a few. Obviously it is not possible to cover the full range of contact
situations and outcomes in the space of this chapter. Instead I will be focuss-
ing on the Kurdish of Central Anatolia, and restricting the analysis to the
impact of the (now) major contact language, Turkish. The areas considered
are Muş, Erzurum and Tunceli, where contact with Turkish has tradition-
ally been fairly strong, and where the number of other languages involved
is somewhat less than in many parts of the Kurdish speech zone. Of the
three different local varieties considered, the Erzurum and Muş ones appear
close enough to be identified by their respective speakers as “my dialect”, but
the Tunceli variety shows some distinct features which are, to my knowledge,
not found elsewhere. In the interests of brevity, I will refer to these varieties
collectively as Central Anatolian Kurdish (CAK), although the term is not an
established one in Kurdology. In assessing the impact of Turkish, I have taken
the Kurdish of Zakho and Dohuk (North Iraq), often referred to as Bahdinî
(in assorted spellings), as a benchmark for a less Turkish-influenced brand of
Kurdish, against which Turkish influence on Central Anatolian Kurdish can
be gauged. In addition to my own data, I draw on the results of other pub-
lications on Kurdish–Turkish language contact (Dorleijn 1996, Bulut 2002,
2005, 2006, Matras 2002, Haig 2001, 2006, forthc.). Two final caveats need
to be mentioned before we proceed to the data. First, Armenian must have
been a considerable influence on CAK up to the beginning of the twentieth
century, but it is unfortunately not possible to address the issue of Armenian
influence here. Second, comparisons with Turkish are generally drawn on
the basis of colloquial standard Turkish. But in fact, local varieties of Turk-
ish from the area differ in many respects from the standard. However, these
dialects represent “a rather neglected spot on the Turkological map.” (Bulut
2002: 51), so one often has little choice but to fall back on the convenient fic-
tion of the colloquial standard. Nevertheless, for the local variety of Turkish
from Erzurum at least, a reasonably reliable source is available, Gemalmaz
(1995), which I refer to at some points.
The sociolinguistics of Kurdish in Turkey is extremely complex, variegat-
ed, and poorly described. Prior to the founding of the Turkish Republic in
1923, relations between the two speech communities were not marked by any
great prestige asymmetry. In fact, in the partly autonomous regions of Anatolia,
Kurdish enjoyed considerable prestige as the language of many powerful
landowners and religious leaders, and was learned as a second language
and used as a lingua franca by speakers of many other speech communities.
However, as a result of the nationalist currents accompanying the founding
of the Turkish Republic, the status of Kurdish deteriorated rapidly, and the
language has been officially non-existent for much of the Republic’s history
(see Haig 2003, 2004). The advent of compulsory schooling, military serv-
ice, and the intrusion of mass-media to the most isolated parts of Kurdistan
have led to large-scale language shift, and a drastic reduction in the number
of children acquiring Kurdish fully as an L1. The recent changes in the wake
of EU-fueled reforms, while of considerable symbolic importance, have done
little to reverse these trends. Unfortunately, there is as yet no serious empirical
research on these ongoing developments so I am obliged to draw on the
personal observations of speakers I have worked with over the past years in
assessing the situation. The speakers who provided the narratives from which
most of this data has been taken are all Kurdish native speakers, between 50
and 75 years old (two males, two females). All except one (a woman in her
seventies, from Erzurum) are also fluent speakers of Turkish.
2. Phonology
The segmental phonology of Turkish consists to a large extent of cross-linguistically
unmarked elements, most of which are present in the phoneme
inventories of the neighbouring languages anyway. It is thus difficult
to pinpoint phonological influence of Turkish on Kurdish. The best candi-
date for contact influence in the vowels is the partially systematic use of
fronted rounded vowels /y/ and /ø/ in some varieties of Kurdish (see Bulut
2005: 225–226 for further discussion). There is little evidence for the transfer
of Turkish vowel harmony into Kurdish. Syllable structure in Kurmanji is
somewhat less constrained than in Turkish, but vowel epenthesis (rather than
consonant deletion) is the usual strategy for breaking up consonant clusters,
as it is in Turkish. As in Turkish, hiatus is generally avoided.
Kurdish phonology does show features that bespeak contact influence,
but not very obviously from Turkish. For example, Kurdish in Central Anato-
lia exhibits a three-way distinction among the stops – between voiced, voice-
less aspirated, and voiceless non-aspirated – giving rise to a Caucasus-style
three-way stop distinction. There is some disagreement whether the relevant
phonetic parameter is ejective vs. non-ejective (according to some Soviet
authors) or voice–onset time (MacKenzie 1961; Kahn 1976), or both (Jas-
trow 1977). There is also disagreement on its origin; I consider contact influ-
ence from Armenian the most likely source. A second notable characteristic
of Central Anatolian Kurdish, at least of Muş and Erzurum, is the widespread
presence of pharyngealized segments, not only as expected in borrowings
from Arabic (e.g. dæʢwat ‘wedding celebration’), but also in words that his-
torically do not have pharyngeal segments, for example mʢær ‘snake’, tʢæl
‘bitter’, mʢæhinə ‘mare’. In a sense, the pharyngeals are extraneous to the
basic phonology: they are restricted to individual lexical items, they play no
part in morphology, their functional load is very limited, and there is con-
siderable cross-speaker and cross-dialect variability in the extent of their
presence.
Central Anatolian Kurdish is both prefixing and suffixing, with a small meas-
ure of Ablaut, restricted to the past stem formation of a closed set of irregu-
lar verbs. Turkish is exclusively suffixing, with practically no stem-vowel
alternations. Morphological alignment in Kurdish is split ergative (ergative
with past tenses of transitive verbs), thus contrasting with Turkish, which is
accusative throughout. There is evidence that ergativity in Kurdish is disap-
pearing, but this need not necessarily be linked to contact influence (as sug-
gested by Dorleijn 1996), as similar tendencies can be observed throughout
Iranian (Haig 2007).
A likely instance of contact influence is the complete loss of clitic pro-
nouns in the Northern Group. Such mobile pronominal clitics are a highly
salient feature of most West Iranian languages, including Persian. They can
be traced back to the clitic genitive/dative pronouns of Old Iranian, and are
thus a deeply rooted genetic trait of Iranian. Crucially, such clitics are mobile
(special clitics in the sense of Zwicky 1977). For example, in Central Kurd-
ish, personal pronouns in oblique functions frequently cliticize and move to
other constituents:
(1) nân=mân lagal bi-xô!

food=1pl.clitic with imp-eat:pres:2s
‘Eat a meal with us!’
The first-person plural clitic pronoun mân is syntactically a complement of
the preposition lagal, but it is phonologically hosted by the noun nân. As
mentioned, mobile pronominal clitics are one of the most salient features of
West Iranian. But they are completely lacking in the Northern Group of Kurd-
ish, as they are in Zazaki, likewise spoken in Central Anatolia. The presence
or absence of such mobile pronominal clitics appears to follow an areal dis-
tribution, having crossed, for example, genetic boundaries in the case of var-
ieties of Turkic spoken in Iraq under heavy contact pressure from Iranian and
Semitic languages with pronominal clitics (Bulut 2005: 227–228). It seems
that there is an areal isogloss in the Iranian languages between languages
with mobile pronominal clitics, and those that lack them, with non-clitic lan-
guages concentrated in the west (Northern Group Kurdish and Zazaki) and
north (languages of the Caspian, Don Stilo, p.c.). It is not unreasonable to
link this to Turkic influence; the Turkish of Turkey and Azerbaijan has lacked
such mobile pronominal clitics throughout its attested history, cf. the Anato-
lian Kurdish and Turkish versions of (1):
(2) bi me va nan bi-xwe!

circ 1pl:obl circ food imp-eat:pres:2s
(3) bizim-le yemek ye!

1pl:gen-com food eat:imp.2s
Otherwise, contact influence on the morphology is difficult to ascertain, and
there is no attested borrowing of actual bound morphology (but see the
conditional clitic below), unless attached to a Turkish base.
The structure of the noun phrase in Kurdish differs from that of Turkish in
most respects. Turkish is consistently head-final, and marks case relations
through a single phrase-final affix or postposition. Nouns lack gender, and
there are no agreement phenomena between elements of the noun phrase.
Kurdish on the other hand is largely head-initial, while dependents to the
head are linked to it via the Ezāfe-particle, an unstressed vowel, sensitive to
the gender and number of the head.
There is a binary case distinction between the unmarked Direct and the
marked Oblique case in the Northern Group of Kurdish. The Oblique has
different realizations, depending on the gender of the noun, the presence or
absence of determiners, and is subject to some lexical exceptions (e.g. Ablaut
in a small set of masculine nouns). In Central Anatolian Kurdish, these rules
appear to have become simplified, so that in most environments the Oblique
is expressed through a single suffix [-i:],2 and Ablaut for case marking is
completely absent. The Oblique case marker thus ends up superficially re-
sembling the Turkish Accusative suffix, particularly that of the local Turkish
dialects, where the Accusative suffix tends towards a unified [-i] realization
rather than the standard Turkish realization which varies according to the
laws of vowel harmony.
Other case relations (Instrumental, Benefactive, Comitative, local rela-
tions etc.) are expressed through adpositions. Old Iranian was predominantly
prepositional (this is certainly true for Old Persian, pace Harris and Campbell
1995: 140), and the Southern and Central Groups of Kurdish are also clearly
overwhelmingly prepositional. However, there are rather interesting signs of
a contact-induced shift in the patterning of adpositions. In addition to the
basic inventory of prepositions, the Northern Group also has a number of
circumpositions, for example:
(4) Benefactive / Indirect Object: ji te ra ‘for you’

Locative: di dinyayê da ‘in the world’
While circumpositions are common to many West Iranian languages, in some
varieties of Central Anatolian Kurdish, the prepositional part of these
circumpositions is omitted, leaving the sole marker of case a postposed phrasal
clitic, or postposition:3
(5) Benefactive / Indirect Object: te ra ‘for you’

Locative: dinyayê da ‘in the world’
The Tunceli “postposition” -ra is the functional equivalent of the Turkish da-
tive suffix, while the Tunceli locative -da covers most of the ground that the
Turkish locative -da/-de does.4 Note that the expression of locative is thus
actually phonetically almost identical in the two languages, though this is a
coincidental outcome of the process outlined above, rather than actual bor-
rowing of matter. Thus Central Anatolian Kurdish has actually acquired two
postpositions via restructuring of its indigenous circumpositions. Given the
lack of such forms in other varieties of the Northern Group, there seems little
doubt that we are dealing with Turkish influence.
Central Anatolian Kurdish has also actually borrowed two Turkish post-
positions, göre and ait, but it does not use them as postpositions:
Tu. postposition    Ku. preposition or noun
X-dat göre          (li) gora X               ‘according to X’
X-dat ait           aitê X                    ‘belonging to X’
A dubious candidate for matter borrowing of a postposition is Turkish sonra,

an adverb meaning ‘after, later’, but also used as a postposition. In Central
Anatolian Kurdish a word şûnda is used in an almost identical manner, for
example as a postposition, where the NP it governs requires the Kurdish
preposition ji ‘from’:
(6) ji wê roj-ê şûnda

from that:obl day-obl after
‘from that day on’, ‘after that day’
This calques the Turkish expression (same meaning):
(7) o gün-den sonra

that day-abl after
However, it is by no means certain that şûnda really is borrowed from Turkish
(claimed for example by Chyet 2003); the phonological differences are
difficult to account for, and some authors consider it to be related to the native
noun şûn ‘place’. This may therefore be a case of a conspiracy of phonologic-
al similarity and functional parallels leading to identity of usage across the
two languages, rather than straightforward borrowing.
Possessive constructions in Central Anatolian Kurdish are Ezāfe construc-
tions, with the possessor following the possessed, as in:
(8) heval-ek-î min

friend-indef-iz 1s:obl
‘a friend of mine’
This contrasts with the head-final construction in Turkish (same meaning):
(9) ben-im bir arkadaş-ım

1s-gen a friend-poss1s
No variety of the Northern Group known to me shows clear evidence for
contact-induced change in the possessive NP. For predicative expressions of
possession, however, there are notable similarities. Both languages lack a lex-
ical ‘have’ verb. For predicative expressions of possession, a possessive con-
struction such as those just shown is combined with an existential predicate:
(10) heval-ek-î min he-ye.

friend-indef-iz 1s:obl existent-cop(3s)
(11) ben-im bir arkadaş-ım var.

1s-gen a friend-poss1s existent-cop(3s)
‘I have a friend.’
While this could be mere chance similarity, it is notable that the Zakho and
Dohuk dialects of the Northern Group, with much less exposure to Turkish
influence, regularly use a different possessive construction, using a fronted
oblique Possessor (own fieldwork, see also MacKenzie (1962: 320) for fur-
ther examples):
(12) te kalem he-ye?

2s:obl pen existent-cop(3s)
‘Do you have a pen?’
This type of construction, historically certainly older than (10), is now entire-
ly lacking in Central Anatolian Kurdish, and was never present in Turkish.
4.1. Comparative forms of adjectives
Historically Kurdish has an adjectival suffix for the comparative degree, as in
dirêj-tir ‘long-er’ and mezin-tir/meztir ‘larg-er’, etc. The Central Group also
has a suffix for the superlative degree, but this is absent in Anatolian Kurd-
ish. Instead, Superlative is expressed periphrastically, for example through
a partitive construction (of-them the larger) etc. In Turkish, comparative is
expressed through the particle daha, and superlative is expressed through the
particle en, both combining with the basic form of the adjective:
(13) uzun, daha uzun, en uzun ‘long, longer, longest’, etc.
In Central Anatolian Kurdish, the Turkish comparative and the superlative
particles are frequently borrowed, and in my data, the native comparative suffix
can then be omitted (though according to Dorleijn (1996) and Bulut (in
print), there is variation in this regard):
(14) daha mezin, en mezin, etc. ‘larger, the largest’
In both languages, the standard of comparison is expressed through an ablative
marker (preposition ji in Kurdish, case marker in Turkish). However, the
comparative suffix is usually retained when the standard of comparison is
explicitly mentioned, see Dorleijn (1996: 53). Thus at least for the generation
of speakers I have gathered data from, the original comparative suffix is still
available as an option, certainly for adjectives of high frequency.
On the face of it, the verb system of Kurdish and that of Turkish are very
different. Turkish verbs display a rich range of suffixes and enclitics distin-
guishing voice, tense, evidentiality, negation, interrogative, mood, and as-
pect, while person marking constitutes the final layer of suffixation. Kurdish
has a system of two stems, a present and a past stem, which combine with
prefixes and suffixes distinguishing two moods, and with the past stem, an
aspect distinction. Future tense in Kurdish is expressed through a clausal
clitic plus a subjunctive form of the verb. Passive and causative are expressed
via auxiliary verbs only. But the most striking distinction between the two
verb systems is that while Turkish possesses a rich inventory of productive
non-finite forms (infinitives, participles, converbs, verbal nouns), Kurdish has
practically no non-finite verb forms. The sole exceptions are the so-called
infinitive (actually just a conventionalized citation form, with low frequency
and productivity in syntax), and a secondary participle (see below).
There are nevertheless traces of contact influence from Turkish on the
Central Anatolian verb system. The first is the use of a tense form based on a
participle in [i:] plus the clitic copula person endings, to express evidentiality.
There is some variation in the extent that this is systematically applied, but
certainly for the Tunceli dialect which I investigated, forms such as hat-i-ye
‘come-part-cop(3s)’ are clearly intended to express an evidential, an unwit-
nessed past, corresponding to the Turkish form gel-miş ‘s/he came (but I did
not see it)’. Thus what is borrowed is a semantic category, expressed using the
forms felt by speakers to come closest to the relevant expression units in the
donor language.
A straightforward case of matter borrowing in the grammar of verbs is the
Turkish clause-final clitic conditional marker with the forms =se/=sa (only
the former form is borrowed), used to mark the protasis of a conditional
clause. An example from Tunceli showing the use of this form is the follow-
ing (see Bulut 2006 for further examples and discussion):
(15) eer bapir-ê          min    ew-na  ne-girt-ine     cem
     if  grandfather-iz:m 1s:obl dem-pl neg-take:pst-pl to
     xa=sa     ew-na  di-mir-in.
     refl=cond dem-pl prog-die:pst-pl
     ‘If my grandfather had not taken them in they would have died.’
174 Geoffrey Haig
5.1. Borrowing of verbs
Kurdish shares with most Iranian languages a lack of productive morphological
derivation of new verbs. Thus there is no morphology that would be
functionally comparable to, say, English -ize or -ify (as in priorit-ize and
beaut-ify etc.), so other means are necessary to create new verbs.5 In Kurdish,
as in most Iranian and Indo-Aryan languages, new verbs are created using the
light verb strategy: a non-verbal element (noun, adjective, adverb, or even a
phrase of some sort) is combined with the light verbs kirin ‘do’, or bûn ‘be,
become’ (a couple of other lexical verbs also operate as light verbs, but are
ignored here). The resulting expressions are tightly bound, compound-like
units that behave in some, though not all respects, like a simple verb form
(see Haig 2002 for details). This strategy is ideally suited for incorporating
borrowed items, be they nouns, verbs or anything else in the donor language.
For example, Arabic verbal nouns are incorporated into Kurdish in this fash-
ion (e.g. fehm kirin ‘understanding do’=‘understand’, from an Arabic mas-
dar, or qayîl bûn ‘saying be’=‘to be in agreement, consent’, from an Arabic
participle).
A striking regularity in Kurdish is that many of the items borrowed into
such light verb constructions are Turkish -mIş verb forms. In Turkish, the
-mIş suffix forms both participles with perfective meaning, and finite verb
forms with evidential meaning. In the latter case, the -mIş suffix is followed
by person agreement. It seems likely that the participles are the point of de-
parture for the borrowed forms, which never have any additional morphology.
The borrowing of Turkic -mIş verb forms into Iranian languages is a well-
established phenomenon, found also for example in Tadjik (through contact
with Uzbek). In Kurdish, such forms are found throughout Anatolia, and are
very common in the texts collected by Le Coq and Lerch from the nineteenth
century – see Bulut (2006) for extensive discussion. However, they are con-
siderably less widespread in the Kurdish of Zakho and Dohuk, corresponding
to the lesser degree of Turkish influence in these areas. In Turkish itself the
vowel in the -mIş suffix has four possible realizations, according to the rules
of vowel harmony. In Kurdish, however, it is generally realized with
the same short vowel, somewhere between [i] and [ö] (simply transcribed
with /i/ in the examples). The suffix has not to my knowledge been extended
to non-Turkish stems. Examples of borrowed verbs attested in my Central
Anatolian Kurdish texts are the following (see Bulut (2006) and Dorleijn
(1996: 4952) for further discussion and examples):
(16) annamiş kirin ‘understand’; düşünmiş kirin ‘think’; nişanlanmiş
     bûn ‘get engaged’; başlamiş kirin ‘begin’; sevinmiş bûn ‘be happy’;
     tanışmiş bûn ‘know, get to know’; sokmiş bûn ‘be inserted’;
     dayanmiş bûn ‘endure’; dinlemiş kirin ‘listen’
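The contrast between the four harmonic variants of the Turkish suffix and the single invariant vowel of the Kurdish loans can be restated procedurally. The following Python sketch is purely illustrative and is not part of the original analysis: the function names `turkish_mis` and `kurdish_light_verb` are hypothetical, the harmony table assumes the eight plain vowels of standard Turkish orthography, and stem-internal adaptations seen in the data (such as annamiş for Turkish anlamış) are deliberately not modelled.

```python
# Turkish four-way vowel harmony for the -mIş suffix: the suffix vowel
# agrees in backness and rounding with the last vowel of the stem.
I_HARMONY = {
    "a": "ı", "ı": "ı",
    "e": "i", "i": "i",
    "o": "u", "u": "u",
    "ö": "ü", "ü": "ü",
}


def turkish_mis(stem: str) -> str:
    """Return the standard Turkish -mIş form of a verb stem."""
    for ch in reversed(stem):
        if ch in I_HARMONY:
            return stem + "m" + I_HARMONY[ch] + "ş"
    raise ValueError(f"no vowel found in stem: {stem!r}")


def kurdish_light_verb(stem: str, light_verb: str) -> str:
    """Return the Kurdish light verb construction built on a Turkish stem.

    The suffix vowel is invariant (transcribed /i/), regardless of the
    vowels of the stem, and no further morphology is added to the loan.
    """
    if light_verb not in ("kirin", "bûn"):
        raise ValueError("only the light verbs kirin and bûn are modelled")
    return f"{stem}miş {light_verb}"


if __name__ == "__main__":
    for stem, lv in [("düşün", "kirin"), ("başla", "kirin"), ("tanış", "bûn")]:
        print(turkish_mis(stem), "->", kurdish_light_verb(stem, lv))
```

For a stem such as düşün, the sketch yields the harmonic form with ü for Turkish but the invariant form düşünmiş kirin for Kurdish, which matches the attested item in (16).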
It needs to be stressed that the -mIş verb forms used in Kurdish do not have the
same semantics as the Turkish verb forms on which they are apparently based:
they have lost their perfective participial sense, and certainly do not have any
sense of evidentiality. In effect, they are a tense-neutral kind of action nom-
inal. While some are simply centuries-old, established borrowings, acquired
as part of the Kurdish lexicon during L1 acquisition, it appears that bilingual
speakers command a rule which permits them to access the Turkish reper-
toire for additional verbs, yet to deploy them in Kurdish with the appropriate
change in meaning.
In addition to the well-known borrowing of -mIş verb forms, Central Ana-
tolian Kurdish has also borrowed Turkish bare verb stems. The ones attested in
my data are:
(17) karış ‘mixing up’ (used as noun); say kirin ‘count’; şaş kirin ‘make
an error’, ‘be confused’;6 inan kirin ‘believe’; bekle kirin ‘wait’; kapat
kirin ‘close’
Another area of matter borrowing in the verbal system is that of impersonal
expressions for obligation and necessity. For example gerek (in Kurdish reduced to
gere) and lazim ‘necessary’, the latter originally from Arabic, are commonly
borrowed. In standard Turkish at least, such expressions are combined with
nominalized clauses (lit. my-going is-necessary). Kurdish of course lacks such
nominalizations (compare the lack of non-finite verb forms discussed above),
so the syntax of such impersonal particles looks different: they are put clause-
initially and followed by a finite clause in the subjunctive (lit. is-necessary I
go). Interestingly, this syntactic pattern is in fact a feature shared throughout
the area, and is also found in local varieties of Turkish (Bulut 2005: 229).
As discussed in the preceding sections, words of all major classes are borrowed
(nouns, verbs, adjectives), as are numerals, adverbs, connectors, and discourse
particles. No borrowing is attested of personal pronouns, of demonstratives,
of interrogatives, or of negation particles.
Numerals up to 20 are generally Kurdish, but there is great variation
among different speakers, and even with individual speakers, the choice of
language for numerals will vary according to speech situation and context.
Dates will very often be given in Turkish. Pronouns (personal, possessive
and indefinite) and demonstratives are generally Kurdish throughout, though
some are shared anyway (e.g. her ‘each, every’ in both languages). Turkish
uses a subordinator ki, originally of Iranian origin, while Kurdish uses an
etymologically related subordinator ke, ku, and sometimes ki (though in a
somewhat different set of contexts to the Turkish one), so that in practice it
may be difficult to distinguish borrowing from native use.
Postpositions were briefly discussed above; Turkish has no prepositions
which could be borrowed, but at least two Turkish relational nouns, orta
‘middle’ and yan ‘side’, are borrowed and in some dialects of Kurdish used as
prepositions meaning ‘between’ and ‘beside’ respectively. A further source of
borrowed elements concerns the vocabulary of time (much of it originally of
Arabic provenience): Turkish zaman, vakit ‘time’, ara ‘interval, space’ (Tu.
o arada ‘at that period’, calqued in Central Anatolian Kurdish as wê arê da),
sonra (as an adverb) ‘after, later’.
The largest group of borrowings concerns discourse markers and conjunc-
tions of various types. The following Turkish words (some of Arabic origin)
are used in spoken Anatolian Kurdish to varying degrees (given in standard
Turkish orthography):
(18) ama ‘but’, fakat ‘but’ (more strongly contrastive), artık ‘still, yet,
     already’, yani ‘I mean, you know’, çünkü ‘because’, demek ‘that means’
     (Kurdified to demêga), sırf ‘only’, yalnız ‘just, only’ (usually with
     metathesis of the liquid and nasal)
However, Kurdish retains a distinct set of such words which most speak-
ers are able to use: lê ‘but’, êdî (same meaning as Tu. artık), bes ‘but, well,
enough’, loma ‘because’ (borrowed from Arabic). They are therefore not re-
placed wholesale by Turkish elements, but Turkish elements are used along-
side them, with the frequency and distribution of the Turkish elements vary-
ing heavily from speaker to speaker, and speech situation to speech situation.
The heaviest concentration I have seen is in the Tunceli data, where the speak-
er’s narratives are regularly interspersed with Turkish discourse markers, ad-
verbs, and discourse-regulating phrases such as:
(19) ondan sonra ‘after that’; ister istemez ‘whether one likes it or not’; en
sonunda ‘finally’; tekrar ‘again’; yoksa ‘or else’; işte (general filler)
Connectors and subordinating conjunctions are only very sparingly used in
Turkish, and many of those that are used are borrowings from other languages.
In Central Anatolian Kurdish, virtually no connectors are used, simple
juxtaposition being the favoured means of clause combining (see Section 8
for further discussion).
Both Turkish and Kurdish share a (largely) verb-final constituent order in the
clause, but elsewhere, constituent order diverges. For the NP, see Section 4;
in subordination, it is generally the main clause that precedes the subordinate
clause in Kurdish, while the opposite is found in (Standard) Turkish. How-
ever, in the spoken vernacular both languages prefer asyndetic sequences, so
the differences are less noticeable. Generally, there is little violation of the
Kurdish constituent-order rules, except when entire Turkish phrases are in-
serted, which can mostly be interpreted as code-switching.
However, the Tunceli dialect shows a striking change of word order in
a minor construction, which appears to be clearly influenced by the corres-
ponding Turkish construction. As it is one of the few fairly clear instances of
contact-induced change in constituent order in Kurdish, it is worth looking at
a little closer. In the interest of clarity, I will restrict myself to the past tenses,
where the changes can be observed most clearly. In Turkish, there is no lex-
ical copula verb for static senses of ‘be’ in main clauses. Rather, we find a
clitic tense marker, in the next example realized as =ti, to which a person
agreement suffix is added (here zero for the third person singular):
(20) Murat mühendis=ti.

Murat engineer=pst(3s)
‘Murat was (an) engineer.’
For processual senses (‘become’), however, an inflected form of the lexical
verb olmak is required:
(21) Murat mühendis ol-du.

Murat engineer become-pst(3s)
‘Murat became an engineer.’
In the Northern Group of Kurdish, on the other hand, the same lexical verb
(bûn) is used in both static and processual senses. The semantic difference is
normally indicated through a change of word order. With the static sense, the
verb follows the copula complement:
(22) Murat mezin bû.

Murat big be:pst(3s)
‘Murat was big/tall.’
In the processual sense, the verb precedes the copula complement. Compare
(23) with (22):
(23) Murat bû mezin

Murat be:pst(3s) big
‘Murat became big/grew up.’
Thus most varieties of the Northern Group mark the semantic distinction be-
tween static and processual senses of the copula through a change in word
order, whereas Turkish marks the distinction through the opposition full verb
vs. clitic. In the Tunceli dialect of Central Anatolian Kurdish, however, we
find the semantic opposition marked in the same manner as in Turkish. With
static senses of bûn, the initial [b-] of the verb is elided and the verb cliticizes
to the copula complement:
(24) Gund-ê ma pir rind=û.

Village-of us very fine=be:pst(3s)
‘Our village was very beautiful.’
With processual senses of bûn, however, the full form of the verb with initial
[b-] is required, but unlike in other varieties of the Northern Group, it follows
the copula complement:
(25) kili darmadaxin bû ew gund-da.

everything haywire be:pst(3s) that village-loc
‘Everything went haywire, in that village.’
Note that two changes must have happened to bring Tunceli into line with
Turkish in this respect: a fairly natural phonological change (lenition of [b]
to [w]/zero), together with cliticization, which affected the sentence-final,
static sense of bû. The originally non-sentence-final, processual bû, on the
other hand, was not affected by the phonological change, but it did undergo a
word-order shift, from before the copula complement to after it.
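The two systems compared in examples (22) to (25) can be summarized as a small decision procedure. The Python sketch below is only an illustration under simplifying assumptions: it covers third person singular past forms alone, the function name `past_copula_clause` and the variety labels "northern" and "tunceli" are hypothetical, and no claim is made about other tenses or persons.

```python
def past_copula_clause(subject: str, complement: str,
                       sense: str, variety: str) -> str:
    """Build a 3sg past copula clause for the given sense and variety.

    sense:   "static" ('was') or "processual" ('became')
    variety: "northern" (most of the Northern Group) or "tunceli"
    """
    if sense not in ("static", "processual"):
        raise ValueError(f"unknown sense: {sense!r}")

    if variety == "northern":
        # Word order alone signals the distinction.
        if sense == "static":
            return f"{subject} {complement} bû"
        return f"{subject} bû {complement}"

    if variety == "tunceli":
        # Clitic versus full verb signals the distinction, as in Turkish;
        # both forms follow the copula complement.
        if sense == "static":
            return f"{subject} {complement}=û"
        return f"{subject} {complement} bû"

    raise ValueError(f"unknown variety: {variety!r}")


if __name__ == "__main__":
    print(past_copula_clause("Murat", "mezin", "static", "northern"))
    print(past_copula_clause("Murat", "mezin", "processual", "northern"))
    print(past_copula_clause("Gund-ê ma", "pir rind", "static", "tunceli"))
    print(past_copula_clause("kili", "darmadaxin", "processual", "tunceli"))
```

Note that the string "complement bû" is static in the Northern Group at large but processual in Tunceli, which is precisely the reinterpretation described above.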
8. Clause combining and complex clauses
Standard Turkish has non-finite verb forms for clause combining (converbs),
for complementization (participles, infinitives), and for relative clause for-
mation (participles). Kurdish, on the other hand, has practically no non-finite
verb forms, so the languages differ radically as far as the available inventory
of forms is concerned. Nevertheless, in actual usage the differences are to
some extent levelled out. As Bulut (2002, 2005) points out, Turkish dialects
from eastern Anatolia make more sparing use of converbs than the standard
vernacular (see below on coordination).7 Investigation of the spoken Turkish
provided in Gemalmaz (1995) shows that the kind of nominalizations used in
the standard (written) language for complementation and for relative clauses
are extremely scarce, but are nevertheless possible. At the same time, the texts of
Gemalmaz (1995) show frequent use of non-finite adverbial clauses consist-
ing of a verbal noun plus a semantically bleached noun or adverb, formally
corresponding to standard Turkish git-tiğ-im zaman go-ptcpl-poss1s time
‘when I go/went’, or git-tik-ten sonra go-ptcpl-abl after ‘after (someone)
went’ (see also Menz 2002). There is nothing corresponding to this type of
adverbial clause in Kurdish; rather, a temporal adverb (e.g. gava ‘the time
(that)’=‘when’) occurs clause-initially.
However, in the domain of clause coordination, Turkish and Kurdish
show one striking similarity: coordination of sequential clauses is achieved
in both languages through simple unmarked juxtaposition of finite clauses.
Both local varieties of Turkish, and the Kurdish of the region, scarcely use
an and-type conjunction for clause combining.8 Other varieties of both lan-
guages do have such a form (standard Turkish ve, Kurdish û). Thus in the
most basic type of loose clause combining, there is a striking unity in the
neighbouring languages, but it does not extend greatly into the expressions of
more syntacticized inter-clausal relations (adverbial, subordination, relative
clauses etc.).
9. Conclusion
Despite centuries of coexistence of Turkish and Kurdish speakers in Anatolia,
the core grammars of Central Anatolian Kurdish and Turkish have remained
quite distinct: constituent order in the NP, inflectional morphology, gender
system, alignment in past tenses, means of subordination. As a rough indica-
tion of Turkish influence, it is instructive to compare the grammar of CAK
with the Bahdinî dialect of Kurdish spoken in North Iraq. The major gram-
matical differences are the following (some were discussed in the chapter,
others were not):
a. Loss of non-canonical subjects with predicates of possession, desire,
necessity (cf. example (12) above);
b. loss of gender distinctions in some environments (not discussed above);
c. emergence of a unified plural marker (not discussed above);
d. increase in the number and frequency of circumpositions and postpos-
itions;
e. increase in text frequency of the use of pre-verbal adpositional arguments
(not discussed above).
The changes generally involve a loss of constructional variants, or changes
in the frequency of constructional variants, rather than the introduction of
completely new structures, either through matter or pattern borrowing. Taken
individually, they are examples of commonplace language changes that read-
ily occur in the absence of contact pressure. Yet it is a simple fact that all the
changes noted result in a grammar that is significantly closer to that of
Turkish. It did not have to be that way, and it certainly seems more than
coincidence. What we have then is the cumulative effect of small changes, each of
which serves to push the entire grammar a little further in a certain direction.
This type of gradual, cumulative change may be typical for the type of
long-standing coexistence on more or less equal footing that characterized
Turkish–Kurdish language contacts up to the beginning of the twentieth
century. For example, code-switching and early Kurdish–Turkish bilingualism
may well have been quite unusual among the rural population in the pre-
Republican era (indeed, one of my speakers was still a Kurdish monolingual),
so we might expect the contact outcome to be quite different to that found in,
for example, very small and threatened minority languages surrounded by a
dominant language, where bilingualism has been the norm for an extended
period. The Tunceli data nevertheless show the results of very intensive
borrowing, possibly reflecting the region’s relative proximity to the
Turkish-speaking west, but this needs closer monitoring.
While the present chapter has been based on data from speakers all aged over
50, it seems likely that the speech of the younger generation (under 30), most
of whom have attended Turkish-speaking schools and have had early expo-
sure to Turkish mass media, will differ significantly in the type and extent
of borrowing. However, the current situation is characterized not merely by
an increase in Turkish influence, but also by parallel developments which
render the analysis considerably more complicated: the emergence of Kurd-
ish mass-media, and the resurgence of Kurdish nationalism which has led to
the politicization of language issues, and a backlash of purism against Turk-
ish influence. These factors serve to make the evaluation of Turkish influence
on the speech of younger Kurdish speakers a very complex topic for future
research.
Notes
1. Note that the Northern Group of Kurdish does not encompass the closely-related
Iranian language Zaza(ki), spoken in Central Anatolia and referred to by some
authors as a variety of Kurdish.
2. There is considerable cross-speaker variation here. Some speakers still distin-
guish two forms of the Oblique suffix, phonologically [e:] and [i:], depending
on the gender of the noun.
3. The same phenomenon is found in varieties of Kurdish further North in Azerbai-
jan and the Caucasus generally.
4. The postpositional particles vary as to whether or not they trigger the Oblique
case of their complement NP, but the details are too complex to discuss here;
see Bulut (2006: 101–104) for discussion.
5. Kurdish must at some stage have had productive denominal verbalizing
morphology, as there are verbs in the language obviously derived from borrowed nouns
(e.g. zewicîn ‘to marry’ from Arab. zæwdʒ), but these processes are now
defunct.
6. Widespread in all varieties of the Northern Group.
7. I am very grateful to Christiane Bulut for a number of important observations on
the Turkish dialects of Eastern Anatolia (p.c.).
8. In Kurdish, an enclitic form of û / u is sometimes used in combining NPs: dê=u
bav ‘mother and father’, and it may occasionally cliticize to a verb form (use of
clitic ‘and’ is widespread in colloquial Persian, see Stilo 2004). But not all cases
of a postverbal clitic û have an immediately following second clause, so it is dif-
ficult to ascertain the function of the clitic in these cases.
References
Bulut, Christiane
2002 Evliya Çelebi as a linguist and dialectologist: seventeenth century East
Anatolian and Azeri Turkic dialects. In: N. Tezcan and K. Atlansoy
(eds.), Evliya Çelebi ve Seyahatname, 49–63. Gazimağusa, North
Cyprus: Doğu Akdeniz Üniversitesi.
2005 Zum Kopierverhalten türkischer Übergangsdialekte. In: Walter Bisang,
T. Bierschenk, D. Kreikenborn, and U. Verhoeven (eds.), Prozesse des
Wandels in historischen Spannungsfeldern Nordostafrika/Westasien,
221–233. Würzburg: Ergon.
2006 Turkish elements in spoken Kurmanji. In: Hendrik Boeschoten and
Lars Johanson (eds.), Turkic Languages in Contact. Proceedings of the
Wassenaar Meeting, Feb. 1996, 95–121. Wiesbaden: Harrassowitz.
Chyet, Michael
2003 Kurdish–English Dictionary. New Haven: Yale University Press.
Dorleijn, Margreet
1996 The Decay of Ergativity in Kurmanci. Language Internal Or Contact
Induced? Tilburg: Tilburg University Press.
Gemalmaz, Efrasiyap
1995 Erzurum ili ağızları, II. Cilt: Inceleme, Metinler ve Dizinler. Anka-
ra: Türk Dil Kurumu. [The dialects of the Erzurum region, Volume 2:
Analysis, Texts, Glossaries]
Haig, Geoffrey
2001 Linguistic diffusion in modern East Anatolia: From top to bottom. In:
Alexandra Aikhenvald and R. M. W. Dixon (eds.), Areal Diffusion and
Genetic Inheritance: Problems in Comparative Linguistics, 195–224.
Oxford: Oxford University Press.
2002 Complex predicates in Kurdish: Argument sharing, incorporation, or
what? Sprachtypologie und Universalienforschung/Language Typology
and Universals 55 (1): 25–48.
2003 Sprachenvielfalt und Sprachenpolitik am Rande Europas: die Minder-
heitensprachen der Türkei. In: Dieter Metzing (ed.), Sprachen in Eu-
ropa. Sprachpolitik, Sprachkontakt, Sprachkultur, Sprachentwicklung,
Sprachtypologie, 167186. Bielefeld: Aisthesis.
2004 The invisibilisation of Kurdish: The other side of language planning in
Turkey. In: Stefan Conermann and Geoffrey Haig (eds.), Die Kurden:
Studien zu ihrer Sprache, Kultur und Geschichte, 121–150. Hamburg:
EB-Verlag.
2006 Turkish influence on Kurmanjî: Evidence from the Tunceli dialect.
In: Lars Johanson and Christiane Bulut (eds.), Turkic-Iranian Contact
Areas. Historical and Linguistic Aspects, 279–295. Wiesbaden:
Harrassowitz.
2007 Alignment shift in Iranian languages. A Construction Grammar
Approach. Berlin: Mouton.
Forthc. Spoken Kurdish from the Muş and Erzurum regions. Wiesbaden: Rei-
chert.
Harris, Alice, and Lyle Campbell
1995 Historical Syntax in Cross-Linguistic Perspective. Cambridge: Cam-
bridge University Press.
Jastrow, Otto
1977 Zur Phonologie des Kurdischen in der Türkei. Studien zur Indologie
und Iranistik 3: 84–106.
Kahn, Margaret
1976 Borrowing and Regional Variation in a Phonological Description of
Kurdish. Ann Arbor, Michigan: Phonetics Laboratory of the Univer-
sity of Michigan.
MacKenzie, David
1961 Kurdish dialect studies, Volume I. London: Oxford University Press.
1962 Kurdish dialect studies, Volume II. London: Oxford University Press.
Matras, Yaron
guistics 36 (2): 281–331.
2002 Kurmanjî complementation: Semantic-typological aspects in an areal
perspective. Sprachtypologie und Universalienforschung/Language
Typology and Universals 55 (1): 4963.
Menz, Astrid
2002 The dialects of Erzurum. Some remarks on adverbial clauses. Turkic
languages 6: 199214.
Stilo, Don
2004 Coordination in three Western Iranian languages. Vafsi, Persian and
Gilaki. In: Martin Haspelmath (ed.), Coordinating Constructions,
270–330. Amsterdam: Benjamins.
Zwicky, Arnold
1977 On Clitics. Bloomington: Indiana University Linguistics Club.
Arabic grammatical borrowing
in Western Neo-Aramaic
Werner Arnold
1. Background
The modern Aramaic dialects are the remnants of a wide variety of old and mid-
dle Aramaic dialects that dominated the Middle East in antiquity. The western
variety of Aramaic survived only in three villages in the Qalamūn mountains
in Syria, namely Maʿlūla, Baxʿa and Jubbʿadīn. Although the three villages lie
close to each other, there are remarkable differences in the language, so that
one can speak of three different dialects. In the phonology the dialect of Baxʿa
is more archaic than the two other dialects. Jubbʿadīn is the most progressive
dialect. On the other hand the dialects of Jubbʿadīn and Maʿlūla are in morph-
ology and vocabulary more archaic than that of Baxʿa, where gender distinc-
tion is lost in the plural of verbs, adjectives and pronouns.
The population of Maʿlūla is Christian with a small Muslim minority.
Baxʿa and Jubbʿadīn are purely Muslim villages. There are no significant dif-
ferences in the dialect between Muslims and Christians in Maʿlūla. Today the
language, known as Western Neo-Aramaic (or Maʿlūla Aramaic), is spoken by
a maximum of 10,000 people. Among all Neo-Aramaic languages Western
Neo-Aramaic (WNA) is the only language with a growing number of speak-
ers. The language is a vernacular, not written, and only spoken in everyday
life, within the village and the families. The introduction of a writing
system (with Hebrew characters!) by a teacher of English in Maʿlūla is still in
its early stages and not accepted by everybody in the village. The language of
instruction and religious worship is Arabic. Therefore all inhabitants of the
three villages speak Arabic as a second mother tongue. While in Baxʿa and
Jubbʿadīn an Arabic dialect is spoken which is very similar to the dialect of
the neighbouring villages, the people of Maʿlūla adopted the city dialect of
Damascus not later than in the nineteenth century. The sound changes which
are connected with the adoption of the new dialect did not affect the Aramaic
language of the village, including the vocabulary which was borrowed from
Arabic until that time. The fact that nearly all Arabic loans in Maʿlūla origin-
ate from the period before the change from the rural dialect to the city dialect
of Damascus shows that the contact between the Aramaeans and the Arabs
was intimate during the centuries and that the required Arabic vocabulary
was incorporated into the Aramaic dialect to such a high degree, that after
the change to the dialect of Damascus, only a limited number of words were
borrowed. One can find words mainly from the standard language and only
sporadically words originating in the dialect of Damascus in this last period
of loans.
Arabic has been the only contact language for more than a thousand years;
therefore WNA contains only a few Turkish and Greek loans. The fact that
Aramaic and Arabic have a long common history and that they are very close-
ly related languages that widely correspond in phonology and morphology,
facilitates the mutual adoption and assimilation of loans, but makes the iden-
tification of borrowings from one to the other very difficult.
The present work is based predominantly on my own fieldwork in the three
villages that was carried out in the years 1985–1987. More results of that
research are published in Arnold (1990, 2000, 2002).
2. Phonology
2.1. Consonants
The most noticeable sound shifts which occurred in WNA concern the so-
called Begadkephat consonants. In an earlier stage of the Aramaic language
the phonemes /b/ [b], /g/ [g], /d/ [d], /k/ [k], /p/ [p] and /t/ [t] had the spirant
allophones /v/ [v], /ġ/ [ɣ], /ḏ/ [ð], /x/ [x], /f/ [f] and /ṯ/ [θ], which appear
basically after vowels. In WNA the acoustic difference between spirant and
plosive pronunciation is preserved. However, it is fixed for each single word and
within each root so that the former allophones have become phonemes. The
spirants are preserved with the exception of the voiced labiodental fricative
[v] which shifted to the voiced bilabial plosive [b] most likely under the influ-
ence of Arabic which does not know a phoneme /v/. The old voiced plosives
were devoiced and the old voiceless plosives [k] and [t] were palatalized. The
same sound change that occurred within Aramaic can also be observed in the
majority of the Arabic loans:
(1) *d [d] → t [t]   ḳerta   < qird    ‘monkey’
    *t [t] → č [ʧ]   čamam   < tamām   ‘completely’
    *b [b] → p [p]   xappōza < xabbāz  ‘baker’
The phonemes /ḏ̣/ [ðˤ] and /ž/ [ʒ] (Baxʿa /ǧ/ [ʤ]) are restricted to Arabic
loans:
(2) ḏ̣arfa < ḏ̣arf ‘skin bag for butter’
    žayša, Baxʿa ǧayša < ǧayš ‘army’
We might posit a pronunciation [g] for the Arabic phoneme /ǧ/ [ʤ] for the
earliest time of language contact between Aramaic and Arabic in Syria; in any
case, this Arabic /ǧ/ was treated like Aramaic /g/ and shifted after consonants
to the voiceless consonant [k] and in word initial position and after vowels to
the spirant ġ [ɣ]:
(3) initial position:  ġmōʿča < gamāʿa ‘crowd’
    after vowels:      farraġ < farrag ‘he looked’
    after consonants:  mawkʿa < mawgaʿ ‘pain’
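The positional conditioning of early Arabic /g/ can be written out as a simple rewrite rule. The following Python sketch is a hedged illustration only: the function name `adapt_early_g` and the vowel inventory are assumptions made for the example, the input is a plain transliteration in which the early pronunciation is written g, and the other adaptations visible in (3) (vowel changes, syncope, nominal endings) are intentionally left out, so the output shows the consonant shift alone rather than the attested WNA word.

```python
# Vowel symbols assumed for the transliteration used in this sketch.
VOWELS = set("aiueoāīūēō")


def adapt_early_g(word: str) -> str:
    """Apply the positional treatment of early Arabic /g/ to a word.

    Word-initially and after a vowel, g becomes the spirant ġ;
    after a consonant, g becomes the voiceless plosive k.
    All other segments are copied unchanged.
    """
    out = []
    for i, ch in enumerate(word):
        if ch != "g":
            out.append(ch)
        elif i == 0 or word[i - 1] in VOWELS:
            out.append("ġ")
        else:
            out.append("k")
    return "".join(out)


if __name__ == "__main__":
    for arabic in ["gamāʿa", "farrag", "mawgaʿ"]:
        print(arabic, "->", adapt_early_g(arabic))
```

The three inputs correspond to the three environments in (3): the sketch yields ġ in gamāʿa and farrag and k in mawgaʿ, in line with attested ġmōʿča, farraġ and mawkʿa.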
In all later borrowings Arabic /ǧ/ appears in all positions as ž [ʒ] (in Baxʿa ǧ
[ʤ]). The consonants [d], [g] and the glottal stop ʾ [ʔ] in word-central position
are attested only in very few loans of Arabic and European origin which are
not fully assimilated to the Aramaic system of phonology.
The assimilation of [n] to the following consonant is a very old phenomenon
of the Aramaic language. Under the influence of Arabic, where [n] is normally
not assimilated to the following consonant, younger speakers try to avoid the
assimilation whenever the [n] can be reconstructed from other
derivations of the same root, as in the following example:
(4) nōḥeč ‘he comes out’

yiḥḥuč → yinḥuč ‘he should come out’
2.2. Vowels
The vowel system of WNA is more complicated than the Arabic system. The
Arabic short vowels [i] and [u] appear in WNA loans of Arabic origin as [e]
and [o] in stressed syllables.
The Arabic long vowel ā [a:] appears in Western Neo-Aramaic in general
as ō [o:] and is shortened to [a] in unstressed syllables:
(5) Arabic   Aramaic
    siʿr   > séʿra    ‘price’  (but pl siʿrṓ)
    burǧ   > bórža    ‘tower’  (but pl buržṓ)
    qāḍi   > ḳṓḏ̣ya   ‘judge’  (but pl ḳaḏ̣yṓ)
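The stress-dependent vowel correspondences illustrated in (5) amount to a small lookup. The Python sketch below is an illustration under stated assumptions, not a full account of WNA vowel adaptation: the function name `adapt_vowel` is hypothetical, only the three Arabic vowels mentioned in the text are covered, and stress is supplied by the caller rather than computed.

```python
def adapt_vowel(vowel: str, stressed: bool) -> str:
    """Map an Arabic vowel to its usual WNA reflex in a loanword.

    Long ā appears as ō under stress and is shortened to a otherwise.
    Short i and u appear as e and o under stress and are otherwise
    left unchanged. Any other vowel is returned as it is.
    """
    if vowel == "ā":
        return "ō" if stressed else "a"
    if stressed and vowel in ("i", "u"):
        return {"i": "e", "u": "o"}[vowel]
    return vowel


if __name__ == "__main__":
    # siʿr: stressed in the singular, unstressed in the plural
    print(adapt_vowel("i", stressed=True), adapt_vowel("i", stressed=False))
    # qāḍi: stressed in the singular, unstressed in the plural
    print(adapt_vowel("ā", stressed=True), adapt_vowel("ā", stressed=False))
```

This reproduces the alternations in (5): the stem vowel differs between singular and plural because the plural ending attracts the stress.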
In some loans the Arabic imāla (i-umlaut) ā [a:] → ē [e:] is attested (wētya
< wādi ‘valley’) as in many Arabic dialects in the area (Arnold and Behnstedt
1993: 98105).
2.3. Stress and syllable structure
Aramaic words stress only the final syllable or the penultima. In analogy to
the Arabic geminated stems a vowel a is inserted after the geminated radical
in Baxʿa and Jubbʿadīn, so that the stress is now on the antepenultima:
(6) Baxʿa and Jubbʿadīn záppani ‘I sold’

Maʿlūla zappni
3. Nominal structures (the integration of loans)
Masculine singular nouns of Arabic origin receive the ending -a, feminine
singular nouns the ending -a or -ča. Masculine plural nouns have the ending
-ō(ya), the feminine plural ending is -(y)ōa. After numerals, an enumeration
plural is formed:
(7) Arabic    Aramaic (Maʿlūla)
    qism      ḳesma       ‘part’
              ḳismō       ‘parts’
              iᵊr qism    ‘two parts’
    samaka    samᵊka      ‘fish’ (sg)
              samkōa      ‘fish’ (pl)
              arč samkan  ‘two fish’
Some Arabic loans receive the original Aramaic diminutive endings -ōna (m)
and -(a)nīa (f), which no longer express diminution:
(8) Arabic      Aramaic
    muʿallim    mʿallmōna     ‘teacher’ (m)
    muʿallima   mʿallmanīa    ‘teacher’ (f)
As Arabic and Aramaic are two closely related Semitic languages, they share
most of their nominal patterns. Only a few forms are borrowed from Arabic,
such as miCCaC/muCCaC and maCCūC:
(9) Aramaic < Arabic

mufčḥa < miftāḥ ‘key’
makčūba < maktūb ‘letter’
Gender distinction is lost in Baxʿa and among some speakers in Maʿlūla and
Jubbʿadīn too, in all plural forms of the adjectives and of the independent or
suffixed pronouns. This can be explained only by the influence of the neigh-
bouring Arabic villages, which do not know gender distinction in these forms
at all.1
The two old Semitic tenses perfect and imperfect are preserved2 probably
under the influence of Arabic (Correll 1978: 153). They are used to express
preterite tense and subjunctive exactly as in the Arabic dialects of Syria.
Different from Arabic are the two new tenses, the present tense and the per-
fect that have developed from the old participles. Future tense or optative
is expressed in Maʿlūla and Baxʿa with the auxiliary verb batt- (< Arabic
badd-)3 with Aramaic person inflection and following subjunctive just as in
Arabic:
(10) Aramaic: batt-e yīxul

Arabic: badd-o yākol
will/want-3sg eat-3sg.subj
‘He will/wants to eat.’
To express a progressive action the Arabic prefix ʿam- (< ʿammāl) with present
tense is used:
(11) Aramaic: ʿamšō mōya

Arabic: ʿambišrab mayy
‘He is drinking water.’
Verbs of Arabic origin are treated like Aramaic verbs. They are supplied with
the same inflectional affixes as Aramaic and carry the same object suffixes.
In fact only the radicals of the Arabic root are borrowed and treated as an
Aramaic root. Under the influence of the Arabic dialects in the surrounding
villages Baxʿa has lost gender distinction in all plural forms of the verb. The
masculine form was generally adopted.
The following paradigm gives the preterite tense of the basic stem of the
verb iḏ̣ḥek ‘to laugh’:
(12) Coll. Arabic   Aramaic (Maʿlūla)
     ḏ̣aḥak         iḏ̣ḥek          ‘he laughed’
     ḏ̣aḥak-at      ḏ̣iḥk-a         ‘she laughed’
     ḏ̣aḥak-u       iḏ̣ḥek          ‘they laughed’
     ḏ̣aḥak-t       ḏ̣iḥk-ič        ‘you (m sg) laughed’
     ḏ̣aḥak-ti      ḏ̣iḥk-iš        ‘you (f sg) laughed’
     ḏ̣aḥak-tum     ḏ̣iḥk-ičxun     ‘you (m pl) laughed’
     ḏ̣aḥak-tin     ḏ̣iḥk-ičxen     ‘you (f pl) laughed’
     ḏ̣aḥak-t       ḏ̣iḥk-i         ‘I laughed’
     ḏ̣aḥak-na      ḏ̣iḥk-innaḥ     ‘we laughed’
The Arabic verbal roots are integrated into the WNA system of verbal stems
exactly like Aramaic roots within the traditional Aramaic system of verbal
stems. This is the case with the following Arabic stems:
(13) Arabic loan Aramaic word

I stem iḍḥeḳ ‘he laughed’ išmeʿ ‘he heard’
II stem ḥammel ‘he loaded’ baššel ‘he cooked’
IV stem aġreḳ ‘he fell asleep’ arkeš ‘he woke up’
V stem čḥammal ‘he endured’ čzappan ‘he was sold’
All Arabic stems can be incorporated into the Aramaic stem system. Arabic
stems that do not correspond to an Aramaic stem are converted in the
following way:
(14) Arabic WNA

III stem šāraṭ > šōreṭ ‘to bet’
VI stem tarāfaq > črōfeḳ ‘to accompany’
VII stem infaǧar > inᵊfžar ‘to explode’
VIII stem iftaham > ifᵊčham ‘to understand’
X stem istaqbal > sčaḳbel ‘to accept’
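The stem correspondences in (13) and (14) can be summarised as a small lookup table. The following Python sketch is only an illustration of that summary: the names `DIRECT_STEMS`, `CONVERTED_STEMS` and `integration_route` are hypothetical, and the data are limited to the forms cited in the two examples.

```python
# Arabic stems with a direct counterpart in the WNA stem system, per example (13).
DIRECT_STEMS = {"I", "II", "IV", "V"}

# Arabic stems without an Aramaic counterpart, per example (14):
# stem -> (Arabic form, WNA form, gloss).
CONVERTED_STEMS = {
    "III": ("šāraṭ", "šōreṭ", "to bet"),
    "VI": ("tarāfaq", "črōfeḳ", "to accompany"),
    "VII": ("infaǧar", "inᵊfžar", "to explode"),
    "VIII": ("iftaham", "ifᵊčham", "to understand"),
    "X": ("istaqbal", "sčaḳbel", "to accept"),
}


def integration_route(stem: str) -> str:
    """Return how an Arabic stem enters the WNA system according to (13) and (14)."""
    if stem in DIRECT_STEMS:
        return "direct"
    if stem in CONVERTED_STEMS:
        return "converted"
    # Stems not mentioned in the examples are left unclassified here.
    return "not covered"


if __name__ == "__main__":
    for stem in ("II", "VIII"):
        print(stem, integration_route(stem))
```

A lookup of this kind records only the attested citation forms; it does not model the phonological adaptation that produces them.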
The Arabic VII and VIII stems can also be formed from Aramaic roots to
express the passive of the Aramaic I stem and have, with a few
exceptions, replaced the old Aramaic passive stem Eṯpʿel:
(15) Aramaic fṯḥ I stem ifṯaḥ ‘to open’

VII stem inᵊfṯaḥ ‘to be opened’
Aramaic nġb I stem inġab ‘to steal’
VIII stem inᵊčġab ‘to be stolen’
On the other hand the Aramaic passive stem ččaCCaC (< ettaCCaC) can also
be formed from Arabic roots:
(16) Aramaic wrx Arabic wqf

awrex awḳef ‘to prolong’, ‘to erect’
ččawrax ččawḳaf ‘to be prolonged’, ‘to be erected’
Many borrowings of the Arabic I stem appear in Western Neo-Aramaic in the

IV stem. The reason for this change is unclear, but it should be mentioned that
the IV stem in Western Neo-Aramaic is very productive whereas in the Arab-
ic dialects of Syria the IV stem has disappeared nearly completely:
(17) Arabic I stem bada > Aramaic IV stem abᵊt to begin
All ordinal numerals are borrowed from Arabic and, with the exception
of awwal ‘first’, show the Arabic imāla ā > ē (ṯēni ‘second’, ṯēleṯ ‘third’, rēbeʿ
‘fourth’, etc.).
Western Neo-Aramaic borrowed from Arabic the reflexive ḥōl- (< ḥāl),
which is used beside Aramaic nefš-. Furthermore, the reciprocal baʿḏ̣-
is borrowed from Arabic. All of them are used with the Aramaic pronominal
suffixes.
The dativus ethicus is attested in Western Neo-Aramaic and in the Arabic
dialects of the area and may be a result of language contact:
(18) Aramaic Arabic

ḏmex-le šaʿa nām-lo sāʿa ‘He slept himself for an hour.’
The coordinating conjunctions fa ‘and so’, walla ‘or’ and lakin (also
lakan, lakinni, lakōn; in Jubbʿadīn lačin, ličin, līčin) or bass for ‘but’ are
borrowed from Arabic. The subordinating conjunction _ōb ‘if’ is a calque of
Arabic inkān in which in is translated by Aramaic _ and kān by the Aramaic
equivalent ōb (Spitaler 1938: 117). The Arabic word is also used in the form
nkōn. Other borrowings are iza and law ‘if’, innu (also inne and inni) ‘that’
and lamma ‘when’ (beside Aramaic mi_), illa ‘except’ and others.
For the times of the day only ḏ̣ahwa (< ḍaḥwa) ‘late morning’ and ʿaṣᵊr
‘afternoon’ are of Arabic origin. The days of the week are preserved in Maʿlūla
and Jubbʿadīn, but they are replaced by the Arabic names in Baxʿa.
Adjectives with the Arabic prefix m-, the infix -č- (< Arabic -t-) and the
suffix -ōnay (< -āni) expressing affiliation (amrikōnay ‘American’) were in-
corporated into the Aramaic vocabulary and among young speakers have
sometimes replaced the inherited Aramaic form:
(19) Aramaic adjective: išmeʿ ‘audible’

Young speakers: mašmuʿ (Arabic masmūʿ)
Old Aramaic has no pattern to express elative terms so that WNA was forced
to realize comparative and superlative forms by adoption of the Arabic mor-
phological pattern ʾaCCaC:
(20) Arabic WNA

ṣaġīr izʿur ‘small’
aṣġar minn-o azʿar menn-e ‘smaller than he’
aṣġar wāḥid azʿar aḥḥaḏ ‘the smallest’
6. Syntax
The relative particle ti (also či, Baxʿa ći, all < *dī) is Aramaic but the way of
subordinating relative clauses is fully compatible with Arabic. In syndetic
relative clauses the antecedent of the relative clause is determined and fol-
lowed by the relative particle while asyndetic relative clauses have no relative
particle and the antecedent is indetermined. In old Aramaic such asyndetic
relative clauses are unattested (Correll 1978: 117) and must be considered as
borrowed from Arabic.
(21) a. Asyndetic
wō rōʿya ʿamraʿēl ʿizzōye.
ep-pret shepherd-indet herd-3sg.pres.pm goat-pl-sf.3sg.m
‘There was a shepherd, (who was) herding his goats.’
b. Syndetic
hanna ġamla ti ṭʿille.
dp.3sg.m camel rp carry-perf.3sg.m-sf.3sg.m
‘This camel, which has carried him.’
The coordination of circumstantial clauses (Arabic ḥāl) by means of the con-

junction w- (and) is not attested in old Aramaic but occurs frequently in WNA
and in the modern Arabic dialects of the area (Grotzfeld 1965: 101). Correll
(1978: 147) argues that this construction is borrowed from the Arabic dialects
as no other Aramaic dialect has similar circumstantial clauses.
(22) w hū ʿammallex 4 willa ōle sōblᵊ

cp ip-3sg.m walk-3sg.m.pres-pm lo! come-3sg.m-pret mayor-cs
blōta.
village
‘While he was walking, lo! – the mayor of the village came.’
7. Conclusion
Aramaic and Arabic are two closely related Semitic languages with a long
common history. The fact that morphemes in the Semitic languages normally
consist of three radicals facilitates the mutual adoption and assimilation of
loans, but makes the identification of borrowings from one to the other very
difficult. The Arabic dialects spoken in the surroundings of the WNA speech
island themselves are very much influenced by Aramaic and sometimes pre-
serve Aramaic words which no longer occur in WNA. For many centuries
this type of aramaicized Arabic was the only contact language of WNA, and
Correll (1978) believes that the conservatism of Arabic is the reason for the
archaic structure of WNA in comparison to the eastern Aramaic dialects,

where under the influence of Turkish and Iranian languages the old verbal
system collapsed. WNA has enriched its vocabulary with thousands of words
from Arabic and has incorporated the Arabic system of verbal stems, but the
integration of Arabic borrowings was performed by a total adaptation to the
Aramaic phonological and morphological system. In general, WNA has
preserved its linguistic heritage or has developed independently of Arabic.
Abbreviations
c consonant (radical) perf perfect tense

cp coordination particle pl plural
cs construct state pm progressive modifier
det determined pres present tense
dp demonstrative pronoun pret preterite
ep existential particle rp relative pronoun
f feminine sf suffix
indet indetermined sg singular
ip independent personal pronoun subj subjunctive
m masculine WNA Western Neo-Aramaic
Notes
1. In the inflexion of the verb, gender distinction in plural forms is also lost.
2. This is not the case with eastern Aramaic dialects.
3. In Jubbʿadīn the auxiliary verb bēl- of unclear origin is used.
References
Arnold, Werner
1990 Das Neuwestaramäische. V. Grammatik (Semitica Viva 4/V), Wies-
baden.
2000 The Arabic dialects in the Turkish province of Hatay and the Aramaic
dialects in the Syrian mountains of Qalamūn: Two minority languages
compared. In: Jonathan Owens (ed.), Arabic as a Minority Language,
347–370. Berlin, New York.
2002 Zur Geschichte der arabischen Lehnwörter im Neuwestaramäischen.
In: Norbert Nebes (ed.), Neue Beiträge zur Semitistik. Erstes Arbeitstreffen
der Arbeitsgemeinschaft Semitistik in der Deutschen Morgenländischen
Gesellschaft vom 11. bis 13. September 2000 an der
Friedrich-Schiller-Universität Jena (Jenaer Beiträge zum Vorderen
Orient 5), Wiesbaden, 5–11.
Arnold, Werner, and Peter Behnstedt
1993 Arabisch-Aramäische Sprachbeziehungen im Qalamūn (Syrien). Eine
dialektgeographische Untersuchung mit einer wirtschafts- und sozial-
geographischen Einführung von Anton Escher (Semitica Viva 8),
Wiesbaden.
Correll, Christoph
1978 Untersuchungen zur Syntax der neuwestaramäischen Dialekte des
Antilibanon (Maʿlūla, Baḫʿa, Ǧubbʿadīn); mit besonderer Berücksichtigung
der Auswirkungen arabischen Adstrateinflusses; nebst zwei Anhängen
zum neuaramäischen Dialekt von Ǧubbʿadīn (= Abhandlungen
für die Kunde des Morgenlandes XLIV, 4), Wiesbaden.
Grotzfeld, Heinz
1965 Syrisch-arabische Grammatik (Dialekt von Damaskus) (Porta lin-
guarum orientalium, Neue Serie VIII). Wiesbaden.
Spitaler, Anton
1938 Grammatik des neuaramäischen Dialekts von Maʿlūla (Antilibanon).
(Abhandlungen für die Kunde des Morgenlandes XV, 4). Leipzig.
Grammatical borrowing in
North-eastern Neo-Aramaic
Geoffrey Khan
1. Background
Aramaic, which was one of the major Semitic languages in the pre-Islamic
Middle East, still survives today in various vernacular dialects. These Neo-
Aramaic dialects can be divided into four main subfamilies, which include
(1) the Western group spoken in Maʿlūla and various other villages in the
region of Damascus, (2) the Ṭuroyo group, spoken in Ṭūr ʿAbdīn in south-
eastern Turkey and in the village of Mlaḥsō in southern Turkey, (3) Mandaic,
spoken in the city of Ahwāz, Iran, and the surrounding region, and (4) the
north-eastern group.
North-Eastern Neo-Aramaic (NENA)1 contains a far greater diversity of
dialects than any of the other groups. These were spoken across a wide area
encompassing northern Iraq, north-western Iran, south-eastern Turkey,
Armenia and Georgia. A large proportion of the speakers of these dialects, however,
have been displaced from their original places of residence due to political
events during the twentieth century and now live in a diaspora of émigré com-
munities in various parts of the world. On account of this, many of the dialects
are now facing extinction. The NENA group includes dialects spoken by Jews
and Christians. The Jewish dialects in all cases differ from the Christian dia-
lects, even where the Jews and Christians lived in the same town or region.
There are considerable differences, for example, between the Jewish dialect
and Christian dialect in the towns of Urmi, Salamas, Sanandaj and Sulemani-
yya, in which the two communities lived side by side. In other geographical
areas, such as Zakho and the surrounding region, the differences between the
dialects of the two communities are of a lesser degree.2 These dialectal cleavages
between confessional communities appear to have evolved not only
through social divisions but also through different migration histories. Where
Jewish and Christian communities existed side by side in towns such as those
mentioned above, in some cases it can be established that the settlement of
the Jews in the town was earlier than that of the Christians, who were more
recent immigrants from the villages in the surrounding countryside.
The NENA dialects exhibit extensive grammatical borrowing from the

non-Semitic languages with which they have been in contact for many cen-
turies. The main source of borrowing is Kurdish, an Iranian language that was
spoken in numerous dialects across the NENA region. Some features have
been borrowed also from Arabic and Turkic languages. Grammatical borrow-
ing is very diverse across the NENA group and much of it still remains un-
studied. For this reason I shall here take as a case study one particular dialect
of the group that exhibits widespread borrowing and has been fully described,
viz. the Jewish dialect of Sulemaniyya (north-eastern Iraq).3
There was a large Aramaic speaking Jewish community in Sulemaniyya
since the foundation of the town in 1784 by Ibrāhīm Pāshā Bābān. The ma-
jority of the Jews came from the village of Qaradax, situated twenty miles to
the south. Between 1950 and 1952 the entire community, which consisted of
approximately 500 households, migrated to Israel. Today the dialect is still
spoken by only a few elderly immigrants in Israel and is likely to be totally
extinct within the next few years. The Muslims of Sulemaniyya are almost
entirely Kurdish speaking, with a small minority of inhabitants whose first
language is Turkmen. The Jews of the town spoke the local Kurdish dialect in
order to communicate with the Muslim inhabitants.
The source of the grammatical borrowing in the Aramaic dialect can in
some cases be identified as the Sulemaniyya Kurdish dialect. In other cases,
however, it appears to have arisen by contact with other Kurdish dialects or
other Iranian languages in more remote regions. Sometimes the source is
Arabic or a Turkic language. Some of these features may have been trans-
ferred to the Sulemaniyya Jewish Aramaic dialect indirectly through other
Jewish Aramaic dialects. A few elements have been borrowed also from Heb-
rew, the traditional language of Jewish education in the community.
2. Phonology
A number of phonological changes in the Aramaic dialect appear to be
contact-induced. In the Aramaic dialect, for example, the original interdentals
*ṯ and *ḏ shift to the lateral l:4
(1) *beṯa > bela ‘house’

*ʾiḏa > ʾila ‘hand’
This shift is found in all Jewish NENA dialects that were spoken in communities
east of the Zab river (the so-called ‘trans-Zab group’).5 The shift of *ṯ to
the lateral /l/ seems to have been preceded historically by a shift to the voiced
stop /d/. This is shown by the fact that in some dialects in which the shift to
/l/ has taken place the reflex of *ṯ has remained as /d/ in some words. For
instance, in the Jewish dialects of Urmi, Ruwanduz and Rustaqa the root ‘to
come’ is ʾdy (< *ʾṯy), e.g. ʾidyele ‘he came’. The process resulting in the lateral
/l/ described above, therefore, can be regarded as resulting from the weakening
of the articulation of /d/. Interdentals are absent in Sulemaniyya Kurdish
and post-vocalic /d/ is weakened to an interdental approximant /đ/ (MacKenzie 1961:
3, 8) but there is no shift of dentals to the lateral l. Such a shift, however, can
be identified in the Mukri dialect of Kurdish spoken in north-western Iran
(Kapeliuk 1997).
The phoneme /w/ often has a labio-dental realization [v] in the Aramaic
dialect but the corresponding phoneme in the local Kurdish is always realized
as a bilabial. The labio-dental realization is a development found in Farsi and
also in the Aramaic dialects of western Iran.
In the Aramaic dialect an /l/ in the environment of pharyngalized conson-
ants sometimes shifts to /r/:
(2) parṭixwa < palṭixwa ‘We used to go out.’
This has been found only in the speech of women informants. A parallel
phonological shift is attested in the Kurdish speech of some women from
Sulemaniyya and regularly in the Kurdish dialects spoken in Arbel, Koy San-
jaq and Ruwanduz (MacKenzie 1961: 4, 28).
After a vowel, /d/ and /z/ may freely alternate with each other. This process
appears to be restricted to specific lexical items and is best treated as an
alternation of phonemes rather than as allophonic alternation:
(3) ʾidyo ~ ʾizyo ‘today’

qadome ~ qazome ‘tomorrow’
guda ~ guza ‘wall’
In the speech of the closely related Aramaic dialect of the Jews of Ḥalabja
the three-way alternation d ~ ḏ ~ z is sometimes heard, with the intermediate
interdental:
(4) xădir ~ xăḏir ~ xăzir ‘he becomes’

guda ~ guḏa ~ guza ‘wall’
The weakening of the stop /d/ to an interdental /ḏ/ or sibilant /z/ may have
been stimulated by contact with the Kurdish dialects of the region, in which
the articulation of /d/ is weakened to an interdental approximant in postvocalic
position. The shift of etymological /z/ to /d/ would, therefore, have to be
regarded as a back-formation. It is also relevant to note that the alternation d
~ ḏ ~ z is found in the Jewish Persian dialects of western Iran.6
The stress patterns of the Aramaic dialect are similar to those that are
found in the Sulemaniyya Kurdish dialect. In both dialects stress is generally
placed at the end of a word. Exceptions to this tend to be found in the same
categories of words:
(5) Vocative
Aramaic táta ‘father!’
Kurdish mā́mwastā ‘teacher!’
Past verbs
Aramaic híyen ‘They came.’
Kurdish hátin ‘They came.’
Stress groups combining two or more words
Aramaic tré-yome ‘two days’
Kurdish dé-rož ‘two days’
The Jewish Aramaic dialect distinguishes two genders (masculine and
feminine). This contrasts with the Kurdish dialect of Sulemaniyya, which makes
no morphological distinction in gender. Under the influence of the Kurdish
dialect, however, the original gender distinction in the third-person singular
pronoun of the Aramaic dialect has been lost:
(6) Aramaic Kurdish

ʾo aw ‘he/she/it’
The demonstrative adjectives in Aramaic have also lost the distinction be-
tween singular and plural in imitation of Kurdish:
(7) Aramaic Kurdish

ʾay am ‘this/these’
ʾo aw ‘that/those’
Many nouns have been borrowed by the Aramaic dialect from Kurdish. Since
the Kurdish dialect makes no gender distinctions, they have been assigned a
gender in the Aramaic dialect. The grammatical gender of all loanwords that
refer to human beings corresponds to the sex of the referent. The majority of
loanwords that refer to inanimate objects or small animals are construed as
feminine in gender, e.g. šāx (f.) ‘mountain’, qali (f.) ‘carpet’, jiji (f.) ‘hedgehog’.
The same applies to loans from Arabic which have entered the dialect
through Kurdish, e.g. kĭteb (f.) ‘book’, majlis (f.) ‘meeting’. There is a sizeable
residue of inanimate loans that are construed as masculine in gender.
The gender assignment of these appears to have a semantic basis, in that most
of the nouns in question denote either (i) a long, thin entity, e.g. qamīš (m.)
‘cane’, top (m.) ‘gun’, or (ii) a collective or non-solid entity, e.g. xaḷūz (m.)
‘coal’, čay (m.) ‘tea’.
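The gender assignment tendencies just described can be restated as a short decision procedure. The Python sketch below is an illustration only: the class `Loan`, its feature flags and the function `assign_gender` are hypothetical names, and the semantic features are assumed to be supplied by hand. Since the text speaks of a majority pattern and of "most" masculine nouns, the function models a tendency, not an exceptionless rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Loan:
    form: str
    human: bool = False
    referent_sex: Optional[str] = None     # "m" or "f"; only relevant for humans
    long_thin: bool = False                # e.g. 'cane', 'gun'
    collective_or_non_solid: bool = False  # e.g. 'coal', 'tea'


def assign_gender(loan: Loan) -> str:
    """Return the expected Aramaic gender ("m" or "f") of a borrowed noun."""
    if loan.human:
        if loan.referent_sex not in ("m", "f"):
            raise ValueError("human referents need referent_sex 'm' or 'f'")
        return loan.referent_sex
    # Masculine residue: long, thin entities and collective or non-solid entities.
    if loan.long_thin or loan.collective_or_non_solid:
        return "m"
    # Default for inanimate objects and small animals.
    return "f"


if __name__ == "__main__":
    for noun in (Loan("šāx"), Loan("qamīš", long_thin=True),
                 Loan("čay", collective_or_non_solid=True)):
        print(noun.form, assign_gender(noun))
```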
Many borrowed nouns are adapted to Aramaic morphology by adding an
Aramaic nominal ending. In most cases the Aramaic masculine ending -a is
added, irrespective of the gender assignment:
(8) boqa (f.) ‘frog’ < Kurdish boq

lăša (m.) ‘body’ < Kurdish laš
In a few cases the Aramaic feminine ending -ta is added to Kurdish loans that
have been assigned feminine gender:
(9) masita (f.) ‘fish’ < Kurd. masi

dargušta (f.) < Kurd. darguš
The Aramaic dialect uses a definite article suffix -ăke that has been borrowed
from Kurdish. In the local Sulemaniyya Kurdish dialect the form of the art-
icle is -aka. The form -ăke is likely to have its origin in a form of the particle
with an oblique case marker -aka-y, which is found in Kurdish dialects lying
to the north and north-east of Sulemaniyya (MacKenzie 1961: 57–58). Furthermore,
the morphological behaviour of the particle in the Aramaic dialect
is different from that of the particle in Kurdish. In the Aramaic dialect it can-
not take any further suffixes. In Kurdish, by contrast, it may take plural (-ān)
and pronominal suffixes:
(10) Aramaic Kurdish

barux-ăke dost-aka ‘the friend’
barux-awal-ăke dost-akān ‘the friends’
barux-an dost-aka-mān ‘our friend’
Possessive constructions in which a head noun is qualified by a determiner

noun are normally formed by juxtaposing the two. The Aramaic genitive par-
ticle d-, which regularly occurs in such constructions in some NENA dialects,
is rarely used. The construction without the d- corresponds to a function-
ally equivalent construction in Sulemaniyya Kurdish whereby a head noun
is linked to a following determiner noun by means of a compound vowel -a
(MacKenzie 1961: 64):
(11) Aramaic Kurdish

brona mălik kur̄-a pāšā ‘the son of the king’
The final -a of the Aramaic noun is the general nominal ending, but in this
environment it is identified with the Kurdish compound vowel -a. The Iranian
Ezāfe particle is occasionally used in genitive constructions when the head
noun is a loanword:
(12) maktab-i hulaye ‘school of the Jews’
In both the Aramaic and Kurdish dialects the copula verb ‘to be’ is expressed
by an enclitic on the predicate that is inflected for person and number like a
verb. Although there are signs of the emergence of such an enclitic in earlier
Eastern Aramaic, its full development and acquisition of verbal inflection are
apparently due to the influence of Kurdish:
(13) Aramaic Kurdish

ʾo jwan-ya aw jwān-a ‘She is beautiful’
ʾat jwan-yat to jwān-ī ‘You are beautiful’
The simple past tense of the Aramaic dialect has acquired an ergative type of
inflection by contact with Kurdish. The intransitive verbs have active subject
inflection, which resembles the inflection of present tense verbs, whereas
transitive verbs have a passive type of inflection, with the verb agreeing with
the patient of the action and the agent expressed by an oblique agentive pro-
nominal affix that does not serve as the grammatical subject of the verb. The
oblique pronominal affix is identical to the pronominal object affix of the
present form of the verb. These structures correspond to what is found in the
Sulemaniyya Kurdish dialect:
(14) Second person

Aramaic
Simple past intransitive Present
mīl-et ‘You (ms.) died’ mel-et ‘You (ms.) die’
mīl-at ‘You (fs.) died’ mel-at ‘You (fs.) die’
mīl-etun ‘You (pl.) died’ mel-etun ‘You (pl.) die’
Simple past transitive Present + pronominal object
qṭil-lox ‘You (ms.) killed him’ qăṭil-lox ‘He kills you (ms.)’
qṭil-lax ‘You (fs.) killed him’ qăṭil-lax ‘He kills you (fs.)’
qṭil-laxun ‘You (pl.) killed him’ qăṭil-laxun ‘He kills you (pl.)’
Kurdish
Simple past intransitive Present
mird-ī ‘You (s.) died’ a-mir-ī ‘You (s.) die’
mird-in ‘You (pl.) died’ a-mir-in ‘You (pl.) die’
Simple past transitive Present + pronominal object
kušt-it ‘You (s.) killed him’ a-t-kužē ‘He kills you (s.)’
kušt-tān ‘You (pl.) killed him’ a-tān-kužē ‘He kills you (pl.)’
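The split shown in the Aramaic half of (14) can be sketched in a few lines of Python. This is a minimal illustration restricted to the second-person forms cited in the table; the names `SUBJECT_SUFFIX`, `OBLIQUE_SUFFIX`, `simple_past` and `present_with_object` are hypothetical. The point it makes is that the agent of a transitive simple past is marked by the same oblique suffix that marks the object of a present verb, whereas the subject of an intransitive simple past takes the same suffix as a present-tense subject.

```python
# Subject suffixes: present verbs and intransitive simple past, per (14).
SUBJECT_SUFFIX = {"2ms": "et", "2fs": "at", "2pl": "etun"}

# Oblique suffixes: object of the present verb and agent of the
# transitive simple past, per (14).
OBLIQUE_SUFFIX = {"2ms": "lox", "2fs": "lax", "2pl": "laxun"}


def simple_past(past_stem: str, person: str, transitive: bool) -> str:
    """Inflect a simple past form for a second-person subject or agent."""
    suffixes = OBLIQUE_SUFFIX if transitive else SUBJECT_SUFFIX
    return f"{past_stem}-{suffixes[person]}"


def present_with_object(present_stem: str, obj: str) -> str:
    """Attach a second-person object suffix to a 3ms present form."""
    return f"{present_stem}-{OBLIQUE_SUFFIX[obj]}"


if __name__ == "__main__":
    print(simple_past("mīl", "2ms", transitive=False))  # 'You (ms.) died'
    print(simple_past("qṭil", "2ms", transitive=True))  # 'You (ms.) killed him'
    print(present_with_object("qăṭil", "2ms"))          # 'He kills you (ms.)'
```

The transitive past form leaves the third-person masculine singular patient unmarked, as in the glosses of (14).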
In addition to this simple past tense, consisting of a past stem and inflectional
endings, the Aramaic dialect also has a compound perfect consisting of a pas-
sive participle and an enclitic form of the verb ‘to be’. The same two types of
past conjugation are found in the Kurdish dialect:
(15) Aramaic Kurdish

Simple past: mīl ‘He died’ mird ‘He died’
Compound past: mila-y ‘He has died’ mirduw-a ‘He has died’
Although there are clear contact-induced resemblances between the Aramaic

and Kurdish dialects, a closer look reveals several differences. In the Jewish
Aramaic dialect, for example, the agentive pronominal suffix on the simple
past transitive verb is bonded to the verb. In Kurdish, on the other hand, this
agentive suffix is moveable and is generally attached to a word that occurs
earlier in the clause. The transitive predication ‘the man killed the dog’, for
example, is expressed in the two dialects as follows:
(16) Aramaic
gor-ăke kalb-ăke qṭil-le.
man-the dog-the killed-past:3ms-3ms:obl.
Kurdish
pyāw-aka sag-aka-y kušt.
man-the dog-the-3ms:obl killed-past:3ms
The compound past tense in the Kurdish dialect has an ergative form of inflec-
tion whereas the corresponding conjugation in the Jewish Aramaic dialect has
an active type of inflection in both the intransitive and transitive verbs:
(17) Aramaic
gor-ăke mila-y.
man-the died-cop:3ms
‘The man has died.’
gor-ăke kalb-ăke qiṭl-u-y.
man:the dogs.the killed-3pl:obl-cop:3ms
‘The man has killed the dogs’
Kurdish
pyāw-aka mirduw-a.
man-the died-cop:3ms
‘The man has died.’
pyāw-aka sag-akān-ī kuštuw-in.
man:the dogs.the-3ms:obl. killed-cop:3pl
‘The man has killed the dogs.’
It is worth noting that in a few Jewish NENA dialects in neighbouring west-

ern Iran, the compound past verbal form is ergative in inflection. This is the
case, for example, in the Jewish dialects of Sanandaj and Kerend, e.g.
(18) Jewish Sanandaj

gor-ăke kalb-ăke qiṭl-en.
man:the dogs.the killed-cop:3pl
‘The man has killed the dogs.’
The Kurdish dialect of Sulemaniyya does not, in fact, have a pure ergative
system regarding case-marking and verb agreement in either of the past tens-
es, since it is lacking an ergative case marker on the agent noun. The agent is
not marked with an oblique case but is extraposed and resumed by an oblique
pronominal element. This feature is shared by the Jewish Aramaic dialect of
Sulemaniyya, but the Aramaic dialect has moved even further away from the
ergative system than the Kurdish one. In the Aramaic dialect not only is the
compound past conjugation inflected actively but even the simple past con-
jugation is beginning to be treated as an active form. One reflection of this
is that the patient of the action is sometimes given oblique inflection. This is
regularly the case, for example, when the object is a first- or second-person
pronoun:
(19) qṭil-le ʾillox.

killed-3ms:obl 2ms:obl.
‘He killed you.’
One linguistic innovation that is found in the verbal system of the Jewish
Aramaic dialect is the expression of the present progressive by an infinitive
combined with an enclitic form of the verb ‘to be’:
(20) šaqola-yet.
taking-cop:2ms
‘You are taking.’
This appears to have developed from a construction in which the infinitive is

combined with the locative preposition b-, which is still preserved in some
NENA dialects, e.g. Christian Urmi:
(21) bi-šqala-vit.
in-taking-cop:2ms
‘You are taking.’
No direct parallels to this infinitive construction are found in Sulemaniyya

Kurdish, or indeed in other Kurdish dialects but there are close parallels to
the Aramaic construction in languages spoken further north, such as Turkish,
Eastern Armenian (Chyet 1995: 246) and some Iranian dialects belonging
to the Tati group spoken in north west Iran (e.g. Chali, cf. Yar-Shater 1969:
225), e.g. Turkish almak-ta-sın ‘You are taking’ (taking-loc-cop:2S.); Chali
xordan-u-ind ‘They are eating’ (eating-loc-cop.3pl).
A large number of other grammatical words have been borrowed from Kurd-
ish. These include a variety of particles and adverbials, including subordinat-
ing clausal conjunctions such as ʾagar ‘if’, taku ‘in order that’, nakun ‘lest’ and
the relative particle ga, connectives such as ʾinja ‘then’, ham ‘also’, yan ‘or’,
čunga ‘because’, modal particles such as the volitive particles ba and mar ex-
pressing deontic modality in verbs, the phasal particle heštan ‘still’, ‘yet’, the
comparative particle biš ‘more’, the negative modifier of nouns hič ‘no’, and a
variety of adverbials, e.g. čannakaw ‘suddenly’, dubara ‘again’, har ‘always’.
The Kurdish post-verbal particle awa (after a vowel wa) is widely used in
the Aramaic dialect after Aramaic verbs. It often expresses the sense of ‘re-
turning back’, ‘restoring’, ‘repetition’ or ‘completion’:
(22) hiye-wa.
come:past:3ms-particle
‘He came back.’
(23) qadome der-awa.

tomorrow return:fut:3ms-particle
‘He will return tomorrow.’
The Kurdish inclusive particle -iš has been borrowed. In the Aramaic dialect,
however, the particle is not as integrated into the morphology as it is in Kurd-
ish. This is reflected by the fact that it is always an external affix in word-final
position in the Aramaic dialect, whereas in Kurdish it precedes a pronominal
suffix (MacKenzie 1961: 128):
(24) Aramaic Kurdish

nošew-iš xō-š-i ‘also himself’
Some particles in the Aramaic dialect are borrowed from Kurdish but do not
correspond exactly to what is found in Sulemaniyya Kurdish. On some occa-
sions we seem to be dealing with doublets. This applies, for example, to the par-
ticles gal ‘with’ and laga ‘at the home of’ in the Aramaic dialect, both of which
appear to be related to the particle lagaḷ in the Sulemaniyya Kurdish dialect.
A particle of Aramaic origin may imitate the use of a Kurdish particle.
This applies, for example, to the Aramaic particle k- which is attached to
present base verbs in the indicative mood. In the present state of the dialect
the particle has, in fact, been lost by phonetic attrition, but is still preserved
in some verbs. The etymology of the particle is clearly Aramaic (cf. Babylo-
nian Talmudic Aramaic qā- < qāʾem ‘rising up’), but its use is likely to be an
imitation of the use of preverbal present indicative particles in Kurdish; in
Sulemaniyya Kurdish this particle is a-.
6. Syntax
The basic word order of the Jewish Aramaic dialect is SOV when the verbal
arguments are free-standing nominals, which corresponds to that of the Kurd-
ish dialect (see examples 16 and 17).
Various borrowed subordinating particles are used to introduce subordin-
ate clauses in the Aramaic dialect.
The Kurdish subordinating particle ka-, which is used in the Kurdish dia-
lect of Sulemaniyya and the surrounding region (MacKenzie 1961: 131), has
been borrowed by the Aramaic dialect (pronounced either ka- or ga-) to in-
troduce various types of subordinate clause. Attributive relative clauses are
often introduced by this Kurdish particle. The native Aramaic relative particle
d- has been completely lost in relative clauses after head nouns. As in Kurd-
ish, the relative clause follows its head:
(25) ʾo-baxta ka-xăzitta ga-doka šwawt-i-ya.

that-woman who-see:2ms in-there neighbour-1s:obl-cop:3fs
‘The woman whom you see there is my neighbour.’
The Kurdish particle ka- is sometimes used in the Aramaic dialect as a subor-
dinating temporal conjunction with the sense of ‘when’:
(26) ga-wīš-wa, ʾanten-wa-le.

when-dried:3ms-pastp took:3pl:imp-pastp-3ms:obl
‘When it had dried, they took it.’ (R:40)
The particle ka- is also used as a complementizer to introduce factive comple-

ment clauses, mainly after the verb ‘to know’:
(27) kăyen-wa ga-ʾo brata ʾil-d-o-brona

know:imp:3pl-pastp comp-that(dem) girl objm-that(dem)-boy
gba.
loves:imp:3fs
‘They would know that the girl loves the boy.’
The protasis of conditional constructions in Aramaic is generally introduced

by the Kurdish particle ʾagar ‘if’:
(28) ʾagar la-gbitta,ʾana kun-naw ta-naš xet.

if neg-love:imp:3ms:subj-3fs obl to-person other
‘If you do not love her, I shall give her to some other person.’ (R:165)
Temporal ‘when’ clauses are frequently introduced by the Kurdish nominal

waxta ‘time’:
(29) waxta ʾana menna, ʾay tre-brone ruwwe, băxew la-koli

time I die-1ms dem two-sons big care neg-do:fut:3pl
l-yal-ax.
objm-children-2fs.obl
‘When I die, these two older boys will not look after your children.’
(R:17)
The Arabic particle ḥatta, or its variant form hatta with a laryngeal, is used
as a clausal conjunction, generally with the sense of ‘until’ and introduces
an event that marks the endpoint of the activity or situation denoted by the
main clause:
(30) rĭqa-le bari ba-ṣiwa ḥatta

ran:past-3ms.obl after-1s.obl with-stick until
di-le-lli.
hit:past-3ms.obl-1s.obl
‘He ran after me with a stick until he beat me.’
Syntactically subordinate clauses that have a concessive sense are generally

introduced by the Hebrew particle afillu:
(31) ʾafillu ʾo barux-i-ye, ʾana yaridew

although he friend-1s.obl-cop.3ms. I help-3ms.obl
la-kunna.
neg-do:fut.1ms
‘Although he is my friend, I shall not help him.’
7. Lexicon
There is a high degree of lexical borrowing in the Aramaic dialect, especially in the

category of nouns. On the basis of a sample corpus of material, the broad
assessment of the proportion of loanwords in the lexicon of the various cat-
egories of words is as follows: nouns, 67%, adjectives, 48%, particles, 53%,
verbs, 15%. The vast majority come from Kurdish. A few originate in other
languages, such as Arabic and Turkish, though most of these are likely to
have been transmitted through Kurdish. The Arabic loans sometimes exhibit
features that are distinctive of Arabic lexemes in Kurdish, such as the pronun-
ciation of the tā’ marbūṭa as -at (e.g. ḥukmat ‘government’, sa‘at ‘hour’) and
the use of broken plurals with a singular sense (e.g. tujār ‘merchant’). The
Aramaic dialect exhibits also many calques of Kurdish expressions.
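The proportions reported above can be ranked to make the relative borrowability of the word classes explicit. The Python sketch below uses only the four percentages given in the text; the names `LOAN_SHARE` and `rank_by_borrowing` are hypothetical.

```python
# Share of loanwords per word class (per cent), as reported for the sample corpus.
LOAN_SHARE = {"nouns": 67, "adjectives": 48, "particles": 53, "verbs": 15}


def rank_by_borrowing(shares: dict) -> list:
    """Return word classes ordered from most to least borrowed."""
    return sorted(shares, key=shares.get, reverse=True)


if __name__ == "__main__":
    for word_class in rank_by_borrowing(LOAN_SHARE):
        print(f"{word_class}: {LOAN_SHARE[word_class]}%")
```

On these figures the ranking is nouns, particles, adjectives, verbs, with verbs well below the other three classes.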
Verbs constitute the category most resistant to borrowing, as is commonly
the case in language contact situations. The greater facility with which nouns
are loaned is clearly demonstrated by the so-called ‘phrasal verbs’, which
consist of an inflected verbal form combined with a noun or particle as its
complement. These are nearly all based on a Kurdish model. In most cases,
however, only the noun or particle complement is a direct loan of a Kurdish
word. The verb is an Aramaic calque of the Kurdish:

(32) Aramaic        Kurdish
     rek lpl        rek kawtin        ‘to agree’
     swal ʾwl       swal kirdin       ‘to beg’
     sayr ʾwl       sayr kirdin       ‘to look at’
     gargurke ʾwl   gargurke kirdin   ‘to crawl’
     xafad ʾxl      xafat xuwardin    ‘to be distressed’
     gorane ʾmr     gorani wittin     ‘to sing’
     wa dmy         wā zānīn          ‘to think’
Borrowed verbs, by contrast, are inflected fully with the Aramaic verbal in-
flection. The existence of a rich inflectional morphology in verbs is no doubt
one reason why the verbal section of the lexicon has been more resistant
than nouns to borrowing. One of the few verbs in the Aramaic dialect that
have been loaned from Kurdish is dyy (< Kurd. dān). This is used in various
phrasal verbs that are based on Kurdish models, e.g.

(33) Aramaic      Kurdish
     bāz dyy      bāz dān      ‘to jump’
     čirike dyy   čirike dān   ‘to shout’
     čapḷe dyy    čapḷe dān    ‘to clap’, ‘applaud’
The Kurdish verb dān has a wide range of meanings, including ‘to give’, ‘to
hit’, ‘to put’. The corresponding Aramaic form, dyy, regularly occurs in the
Aramaic phrasal verbs that have Kurdish models with dān. The distribution
of dyy in the Aramaic dialect outside of this context, however, is not as wide
as that of Kurdish dān. When used independently of phrasal expressions, the
verb dyy in Aramaic most commonly has the sense of ‘to hit’. It is not used
in the meaning of ‘to give’, which is one of the basic senses of the Kurdish
source word dān. The Aramaic dialect retains the native verb to express this
meaning (present indicative: kul, past: hiwle) and so has resisted a complete
lexical transfer.
The verb borrowed from Kurdish in some cases has taken on the full basic
meaning of the Aramaic equivalent but nevertheless the dialect retains the
Aramaic verb and uses it in a slightly different meaning. This applies, for ex-
ample, to the fate of the Aramaic verb prx. In earlier Aramaic *prḥ had the
sense of ‘to fly’ and, indeed, this sense is still retained by the verb in some
NENA dialects. In the Jewish dialect of Sulemaniyya, however, the meaning
‘to fly’ is expressed by the Kurdish loanword fry (< Kurd. firīn). The native
verb prx, nevertheless, is still retained in the sense of ‘to jump over’, which is
connected conceptually with the notion of ‘flying’.
On some occasions the contact with Kurdish has not brought about a direct
borrowing of lexical items but rather has given rise to the development of
a phonetic resemblance between morphological forms in the two languages.
In the Aramaic dialect, for example, the verb ‘to come’ (< *ʾy) undergoes
irregular phonetic contraction and loses its original middle radical * com-
pletely. As a result, the base of the present conjugation of the verb resembles
phonetically the corresponding Kurdish verb:
(34) Aramaic
k-e (3ms indicative) he (3ms subjunctive)
k-en (3pl indicative) hen (3pl subjunctive)
Kurdish
e (3s indicative) b-e (3s subjunctive)
en (3pl indicative) b-en (3pl subjunctive)
One may perhaps include here the irregular loss of the final /m/ in the word
ʾidyo ‘today’ (< *ʾid-yom) under the influence of the vocalic pattern of the
Kurdish equivalent imro.
Another case of phonetic convergence is the Aramaic form gbe, which
functions as an invariable verb form expressing necessity. The form is derived
from the Aramaic verb ‘to want’. Its function is identical to that of the Kurd-
ish invariable verb abe (MacKenzie 1961: 106), which it resembles phonet-
ically, e.g.

(35) gbe hezex      abe biçīn      ‘We must go’
Some lexical items in the Aramaic dialect are non-Semitic loanwords yet do
not correspond to what is found in the local Kurdish dialect of Sulemaniyya.
This is the case, for example, with the two basic kinship terms in the Jewish
Aramaic dialect tata ‘father’ and lala ‘maternal uncle’. They are not found
in the local Kurdish dialect, but parallels can be found in several other lan-
guages of the region, the nearest being Hawrami, spoken in the Hawraman
mountains. Similarly a loanword in the Aramaic dialect may be identified in
the Kurdish dialect of Sulemaniyya but with a different meaning, whereas it
is used with the same meaning as is found in the Aramaic dialect in a more re-
mote linguistic source. This applies to the word baba, which, in the Aramaic
dialect of Sulemaniyya, usually means ‘grandfather’ rather than ‘father’. In
the Kurdish dialect of Sulemaniyya the cognate word has the sense of ‘father’
(bāb, bāw), but in Hawrami it is used with the sense of ‘grandfather’ as in the
Aramaic dialect:
(36) Jewish Sul. Ar.   Sulemaniyya Kurdish   Hawrami
     tata              bāw, bāwk             tata      ‘father’
     lala              xal                   lalo      ‘maternal uncle’
     baba              ba-pīr                baba      ‘grandfather’
On some occasions a Kurdish loanword in the Aramaic dialect has a different

range of meaning from what it has in the source language. The word čolăka,
for example, is used in the Aramaic dialect with the meaning of ‘bird’. In
Kurdish it has the specific meaning of ‘sparrow’, the general term for ‘bird’
being mal.7 The loanword mayīn is used in the Aramaic dialect as a general
word for ‘horse’, whereas in the Kurdish source language it refers specific-
ally to a ‘mare’.
Most surviving speakers of the Aramaic dialect who have been resident
in Israel since the 1950s use Hebrew words in their Aramaic speech. A large
proportion of these words are taken from Modern Hebrew. Particularly com-
mon are the connective particles ʾaz ‘then’ and ʾaval ‘but’. As is the case
with lexical transfer from Kurdish, Hebrew verbs are not so freely borrowed.
Speakers prefer to form verbal phrases containing a Hebrew nominal element
and an Aramaic verb:
(37) mazmīn koli-wa-le.

invite-heb.participle do-imp.3pl-pastp.3ms.obl
‘They would invite him.’
Certain Hebrew words that occur in the informants’ speech, however, existed
in the Aramaic dialect before the immigration of the Jewish community to
Israel. These can usually be distinguished by phonetic features that are not
characteristic of Modern Hebrew. In such words, for example, vocalic šewa
is pronounced /ă/, as in băraxa ‘blessing’, nădaba ‘charity’, băli ‘without’.
Consonantal gemination is pronounced, as in ʾafillu ‘even if’, keʾillu ‘as if’,
sukka ‘booth’. Beth is pronounced as a stop where it is a fricative in Modern
Hebrew, as in tob ‘good’, kabod ‘respect’. In some cases a Hebrew word has
undergone a phonetic process under the influence of Kurdish, which dem-
onstrates that it is a heritage from the Aramaic dialect as it was spoken in
Kurdistan, e.g. mira < mila ‘circumcision’.
Abbreviations
comp   complementizer    obj     object
cop    copula            objm    object marker
dem    demonstrative     obl     oblique
fut    future            pastp   past particle
imp    imperfective      pres    present
loc    locative          subj    subject
neg    negator
Notes
1. The term was coined by Hoberman (1988: 557) to replace ‘Eastern Neo-Aramaic’
of earlier classifications (cf. Socin 1882: v; Duval 1896: 125; Tsereteli 1977,
1978). This was necessary in order to distinguish the north-eastern dialects from
modern Mandaic, which is as distant typologically from them as the western
Neo-Aramaic dialects.
2. Cf. Hopkins (1993: 65).
3. The dialect is described in Khan (2004). All the data in this chapter are taken
from this description, which is based on extensive fieldwork undertaken in Israel
with Jewish immigrants from Sulemaniyya.
4. The transcription follows that adopted in Khan (2004). Vowel length is largely
predictable from the syllable structure. As a general rule, in open syllables a
vowel is long and in closed syllables it is short. Vowels following this principle
are not marked by diacritics. The breve and macron signs are used only when the
vowel is short in an open syllable or long in a closed syllable respectively.
5. For the main characteristics of the trans-Zab group, see Mutzafi (2004: 9–10).
6. I am grateful to Don Stilo (personal communication) for drawing my attention
to these phenomena in the Kurdish and Jewish Iranian dialects.
7. This is similar to the semantic relationship between Arabic ʿuṣfūr ‘sparrow’ and
Hebrew ṣippor ‘bird’, which are cognates.
References
Chyet, M. L.
1995 Neo-Aramaic and Kurdish: An interdisciplinary consideration of their
influence on each other. Israel Oriental Studies 15: 219–249.
Duval, R.
1896 Notice sur les dialectes néo-araméens. Mémoires de la société de
linguistique de Paris 9: 125–135.
Hoberman, R. D.
1988 The history of the Modern Aramaic pronouns and pronominal suffixes.
Journal of the American Oriental Society 104: 221–231.
Hopkins, S.
1993 The Jews of Kurdistan in Eretz Israel and their language. Peʿamim.
Studies in Oriental Jewry 56: 50–74.
Khan, G.
2004 The Jewish Neo-Aramaic Dialect of Sulemaniyya and Halabja. Leiden:
Brill.
MacKenzie, D. N.
1961 Kurdish Dialect Studies, Volume 1. London: Oxford University Press.
Mutzafi, H.
2004 The Jewish Neo-Aramaic Dialect of Koy Sanjaq. Wiesbaden: Harras-
sowitz.
Socin, A.
1882 Die neu-aramäischen Dialekte von Urmia bis Mosul. Texte und
Übersetzungen. Tübingen: Laupp.
Tsereteli, K. G.
1977 Zur Frage der Klassifikation der neuaramäischen Dialekte. Zeitschrift
der Deutschen Morgenländischen Gesellschaft 127: 244–253.
1978 The Modern Assyrian Language. Moscow: Nauka.
Yar-shater, E.
1969 A Grammar of Southern Tati dialects. The Hague: Mouton.
Grammatical borrowing in Macedonian Turkish
Yaron Matras and Şirin Tufan
1. Background
The variety described here is representative of the Turkish dialects spoken in

the Republic of Macedonia, especially those in the west of the country, and
to a considerable extent also of Rumelian or Balkan Turkish as a whole (cf.
e.g. Matras 1998, 2004; Friedman 2003). The Balkan or Rumelian dialects of
Turkish descend directly from Ottoman Turkish and are generally considered
mutually comprehensible with Standard Turkish (henceforth ‘Tk.’); there are
even direct historical links with Anatolian Turkish (cf. Caferoğlu 1964). We
draw here primarily on data from the dialect of Gostivar, a city in the western
part of the Republic of Macedonia – henceforth GT for ‘Gostivar Turkish’
(for a comprehensive description see Tufan 2007).1
Turkish is the native language of the Turkish ethnic minority in the various
Balkan countries. It is the first language of many Muslim Romani commu-
nities, and it is also spoken by some Albanians, Macedonians, and other eth-
nicities as a second or third language.2 As the official language of the Ottoman
Empire, Turkish was a lingua franca and the language of administration and
trade in the Balkans for more than half a millennium (between the fourteenth
and early twentieth centuries). With the final collapse of the Ottoman Empire
(1912), Turkish became a minority language. In Macedonia, it was not until
the 1950s that its status became regulated and Turkish-language education,
cultural institutions, and media received state backing. The form of Turkish
taught at school was Standard Turkish, while the vernacular continued to be
used in the private domain. Turkish speakers in the region are generally bi-
and often trilingual, speaking, in western Macedonia, alongside the state lan-
guage, also Albanian. Over the past century, and especially since the 1950s,
the importance of the state language and its relevance to career progression,
education, and mobility has grown immensely, and this is reflected in the
amount and the nature of Macedonian lexicon that has found its way into the
local varieties of Turkish. In today’s Republic of Macedonia, Turkish speak-
ers have direct contact with Standard Turkish not only through schooling, but
also through satellite television and the internet, which are present in almost
every Turkish household. Successive waves of emigration to Turkey in recent
decades have further fortified personal ties with Turkey, and visits to Turkey
are frequent, resulting in even greater exposure to Standard and Anatolian
Turkish.
2. Phonology
Among the consonants, we find the dental-alveolar affricate /ts/, which has
its source in Macedonian and Albanian. It is found not only in loanwords
(Albanian-derived tsapo ‘goat’, Macedonian-derived tsevka ‘pipe’) and in
borrowed affixes (Macedonian feminine-agentive -itsa), but it is also trans-
ferred occasionally into native Turkic words: tsıs ‘shut up’ (cf. Tk. sus). Ini-
tial consonant clusters are permitted in GT which do not appear in Tk.: GT
(also Macedonian and Albanian) Stambol ‘Istanbul’, Tk. İstanbul. There are,
on the other hand, also cases of simplification. The surrounding non-Turkic
languages simplify Turkish geminates in Turkish borrowings (cf. Friedman
2003: 58), and this trend is also found in GT: /dükan/ ‘shop’, Tk. /dükkan/
‘shop’; /akıli/ ‘clever’, Tk. /akıllı/. As in the neighbouring languages, there is
a weakening of /h/, though the origins of this development in Western Rume-
lian Turkish are thought to be in the features carried by immigrants from
northeast Anatolia (Németh 1956: 21): GT /ayvan/ ‘animal’, Tk. /hayvan/;
GT /paali/ ‘expensive’, Tk. /pahalı/; GT /saba/ ‘morning’, Tk. /sabah/. Recent
contact with Standard Turkish appears to have triggered the re-introduction
of /h/, and variation is commonly found, especially in grammatical function
words such as /em, hem/ ‘and’, /er, her/ ‘every’, or /ep, hep/ ‘all’.
In line with the absence of vowel-length distinctions in both Macedonian
and Albanian, there is a tendency in GT to shorten ‘double’ or ‘lengthened’
vowels, which appear in Turkish in loans of Persian and Arabic origin: thus
/galiba/ ‘probably’ (Tk. /gaːliba/), /hala/ ‘yet’ (Tk. /haːla/). The loss of /ö/ –
which does not exist in the contact languages – may also be a contact-induced
phenomenon. In GT historical /ö/ is usually realized as /ü/ or as /o/: GT ürenci
‘student’, Tk. öğrenci; GT dort ‘four’, Tk. dört.
The feminine derivational markers -ka and -(i)tsa are borrowed from Mace-
donian, and are productive with Turkish word stems: arkadaş ‘friend’ (gen-
der-neutral, and by default masculine), arkadaş-ka ‘female friend’; koyşi
‘neighbour’, koyşi-ka ‘female neighbour’; yalanci ‘liar’, yalanci-tsa ‘female

liar’. The suffix -(i)tsa is further extended to denote a female affiliated with an
identified male, thus: dayo ‘maternal uncle’, day-tsa ‘maternal uncle’s wife’;
Muzafer-itsa ‘Muzaffer’s wife’. The extended distribution of the inherited di-
minutive suffix -çe appears to be influenced by the presence of a similar form
in the neighbouring languages: kış-çe ‘little girl’, Macedonian devoj-če.
The case of the dependent in possessive constructions is also affected by
contact. The possessor often appears in the ablative case, still accompanied,
as in Tk., by possessive inflection on the object of possession (the head),
while in Tk. the possessor appears in the genitive:
(1) a. Gostivar Turkish

kıskardeş-i güvegi-den
sister-3sg.poss groom-abl
‘the groom’s sister’
b. Standard Turkish
damad-ın kız kardeş-i
groom-gen sister-3sg.poss
The construction seems to copy the prepositional marking of the possessor in

Macedonian, which appears either in the ablative or dative:
(2) Macedonian
a. sestra-ta na zet-ot
sister-def to groom-def
b. od zet-ot sestra-ta
from groom-def sister-def
The copula in GT appears, like in Macedonian, as an independent verb, and

not, as in Tk., in an enclitic form. Since this concerns issues of constituent
order, the position of the copula will be discussed further in Section 6. A
characteristic feature of the verb in Rumelian Turkish is the loss of the modal
infinitive, and the reduction of converbal forms in general. As a strategy of
clause linkage, this issue is discussed in Section 7 on ‘Syntax’.
Gostivar Turkish continues the general Turkish pattern of forming new

verbs by incorporating lexical nouns from the contact language, and integrat-
ing them with a light verb which differentiates valency. Both et- ‘do’ and
yap- ‘make’ are employed with transitives, and ol- ‘become’ with intransitives:
yaparsın komparatsiya ‘you compare’, privatizir oldi ‘it was privatized’.
Idiomatic structures are often copied as loan-blends, involving Matter
replication of a Macedonian noun, accompanied by a translation of the
Macedonian verb: rutina alayim ‘I shall get into the habit’, lit. ‘I shall take a
routine’, Macedonian da zemam rutina.
A number of conjunctions and particles are borrowed from Macedonian and

Albanian. Matras (2004) notes for the Turkish dialects of eastern Macedonia
that the Macedonian additive conjunction i is regularly used when conjoin-
ing phrases, while Turkish ve is limited to conjoining constituents (as in ex-
ample 3). Note that the adversative conjunction ama is identical in Turkish
and Macedonian, Macedonian having borrowed it from Turkish. The Slavic
contrastive-addition marker a indicates opposition between two phrases:
(3) İlk-okul-i ve orta-okul-i bitır-dı-m Türkçe

first-school-acc and middle-school-acc finish-past-1sg Turkish
dil-ın-de, a fakulted-i bitır-dı-m
language-poss-loc and/however university-acc finish-past-1sg
Makedonce dil-ın-de.
Macedonian language-poss-loc
‘I finished primary and secondary school in Turkish, but university in
Macedonian.’
Another use of a is for disjunction:
(4) Amerika a Alman yatırım-i dır.

America or German investment-poss is
‘It is an American or German investment.’
It is possible that this function results from a blend between the Macedonian
contrastive-additive a, and the Albanian-derived question particle a, which is
also borrowed into GT:
(5) A git-tı-n Stambol-a?

q go-past-2sg Istanbul-dat
‘Have you been/ did you go to Istanbul?’
The Albanian requestive particle lu(te)m is also borrowed:
(6) Gel benım-le lum.

come me-inst req
‘Please come with me.’
Subordinating conjunctions are mainly grammaticalized interrogatives and

thus of Turkish origin, but the presence of kose ‘as if’ seems to indicate a
contamination of Macedonian kako ‘as if’ and Albanian kinse ‘as if’, possibly
reinforced by the similarity to the Turkish conditional verbal augment -se.
Possibly, an Albanian model sepse ‘because’ is also behind the use of se as a
subordinator of cause (‘because’).
Although on the whole still an SOV language, flexibility of word order in

Turkish is exploited in GT to extend pragmatically restricted variants to wider
contexts, thereby increasing harmony between GT and its contact languages
in the organization of utterance structures. Word order shift has acquired dif-
ferent degrees of stability with different constructions.
In the possessive construction, the order head–modifier has become the
preferred order in GT, mirroring the order in the Macedonian and Albanian
constructions:

(7) a. Gostivar Turkish
ruba-lar-i damad-ın
clothes-pl-3sg.poss groom-gen
b. Macedonian
ališta-ta na zet-ot
clothes-def to groom-def
c. Albanian
teshat e dhandrit
clothes att groom
d. Standard Turkish
damad-ın eşya-lar-ı
groom-gen clothes-pl-3sg.poss
‘the groom’s clothes’
The object of comparison is expressed in GT with the help of a preposition

neka ‘like’, grammaticalized from the interrogative ne kadar ‘how much’,
copying the Macedonian preposition kolku ‘as much’. It is positioned, as in
Macedonian, between the attribute and the object of comparison:

(8) a. Gostivar Turkish
güzel neka Meryem
beautiful like Meryem
b. Macedonian
ubava kolku Merjem
beautiful like Meryem
c. Standard Turkish
Meryem kadar güzel
Meryem as.much beautiful
‘as beautiful as Meryem’
This is the only obvious indication of a shift, in any construction, from the
postpositional structure of Turkish, to prepositions.
In verb phrases, the most stable case of word-order convergence with the
neighbouring languages concerns the position of the copula. Whereas the
Turkish copula is enclitic, GT tends to preserve a more conservative inde-
pendent copula stem in i-, which, however, occupies the position between the
subject and the predicate noun, as in the contact languages:

(9) a. Gostivar Turkish
Sen (i)-sın küçük bir kış-çe.
you cop-2sg small indef girl-dim
b. Macedonian
Ti si edno malo devoj-če.
you cop.2sg indef small girl-dim
c. Albanian
Ti je nji vajz ë vogël.
you cop.2sg indef girl att small
d. Standard Turkish
Sen küçük bir kız-sın.
you small indef girl2sg
‘You are a small girl.’
This is the general rule in the copula construction, irrespective of the word
class or case of the predicate (e.g. adjective, locative noun, etc.):
(10) Siz i-dı-nız ev-de.

you cop-past-2pl house-loc
‘You were at home.’
In other constructions, deviation from verb-final order is much less pragmat-

ically marked, and much more frequent, than in colloquial Tk., indicating a
drive toward harmonization of the utterance planning procedures with those
of the contact languages. Consider the following sentences, in which direct
and indirect objects follow the verb without any inference of de-focusing or
de-topicalization (which would be the reading accompanying such construc-
tions in Tk.):
(11) Ben gür-dü-m korkuli rüya.

I see-past-1sg scary dream
‘I saw a scary dream.’
(12) Ben ver-dı-m bikate ekmek sizın dort tene

I give-past-1sg little bread 2pl.poss four item
beygir-ınız-e.
horse-2pl.poss-dat
‘I gave your four horses some bread.’
(13) Onlar gid-ecek-ler dügün-e benım-le.

they go-fut-3.pl wedding-dat me-inst
‘They will go with me to the wedding.’
The default position for objects that constitute new topical information in lex-
ical predications remains, however, the pre-verbal position:
(14) Parlament-ın-de var-dır iki dil.

parliament-3sg-loc exist-cop.3sg two language
Arnaut-lar Arnautçe konuş-ur.
Albanian-pl Albanian speak-aor
Makedon-lar Makedonce konuş-ur.
Macedonian-pl Macedonian speak-aor
Azınlık-lar Makedonce konuş-ur.
minority-pl Macedonian speak-aor
Bir tek Arnaut-lar Arnautçe konuş-ur.
one only Albanian-pl Albanian speak-aor
‘In Parliament, there are two languages.
The Albanians speak Albanian.
The Macedonians speak Macedonian.
The minorities speak Macedonian.
Only the Albanians speak Albanian.’
7. Syntax
One of the most remarkable changes to have affected Rumelian Turkish, and
a characteristic feature of this group of Turkish dialects, is the adoption
of clause combining strategies that are similar to those employed in the sur-
rounding Indo-European languages. Essentially, these are based on the juxta-
position of finite clauses, linked through independent semantic markers that
introduce the subordinate clause (subordinating conjunctions). This system
replaces almost entirely the Turkic system of converbs and nominal embed-
ding.
Modal complements are not introduced by a conjunction, but make use of
the historical optative, which, now expressing dependency on the main verb,
serves as a subjunctive, with the complement clause generally following the
main clause (see also Matras 1998, 2004):

(15) a. Gostivar Turkish
Yarın ist-er-ım oyna-(ya)-im dügün-de.
tomorrow want-aor-1sg play-subj.1sg wedding-loc
b. Macedonian
Utre saka-m da igra-m na svadba-ta.
tomorrow want-1sg comp play-1sg at wedding-def
c. Albanian
Nesër dua të luj në darsëm.
tomorrow want.1sg comp play.1sg in wedding
d. Standard Turkish
Yarın düğün-de oyna-mak isti-yor-um.
tomorrow wedding-loc play-inf want-prog-1sg
‘I want to dance at the wedding tomorrow.’
The finite embedded predicate in the subjunctive replaces the historical Turk-
ish infinitive. The same type of construction is used in manipulation clauses
(modal complements with different subjects):
(16) Daa çok sev-er-ım anlat-ır-sın kimse.

more much like-aor-1.sg tell-aor-subj.3sg somebody
‘I prefer somebody to narrate it [to me].’
Factual or epistemic complements, which in Tk. may be expressed through

either finite clauses, or nominalizations, always appear as postposed finite
clauses, introduced by the subordinator ki, which is also common in Tk.:
(17) Hised-ıl-mes ki vardır sonbaar.

feel-pass-neg.aor comp exist.cop.3sg autumn
‘It does not feel like autumn.’
In this manner, GT aligns itself with the other Balkan languages also in re-
spect of the distinction between factual and non-factual complements. While
the other languages have complementizers that specialize for factual/epistemic
and non-factual/subjunctive (e.g. Macedonian deka vs. da, Greek oti vs. na,
Bulgarian če vs. da, Romani kaj vs. te, and so on), in GT the opposition is
expressed by using the inflected subjunctive on the verb in modal comple-
ments, and the ki complementizer (and indicative mood) in epistemic com-
plements.
Relative clauses also undergo re-structuring in Rumelian Turkish. Like the
other Rumelian Turkish dialects, GT shows a relativizer ne, derived from the
interrogative ‘what’, which mediates between the head noun and the finite,
postposed relative clause (see Matras 1998, 2004). This replaces both the
Turkish gerundial relative clause, and its finite counterpart in ki. The forma-
tion once again matches that of the principal contact language Macedonian,
where the relativizer is equally derived from the interrogative ‘what’:

(18) a. Gostivar Turkish
O kış-çe ne gel-di biz-de şimdi yaşa-r
that girl-dim rel come-past 1pl-loc now live-aor.3sg
Stambol-da.
Istanbul-loc
b. Macedonian
Devoj-če-to što dojde kaj nas sega živee vo Istanbul.
girl-dim-def rel came at us now live.3sg in Istanbul
c. Standard Turkish
Biz-e gel-en kız şimdi İstanbul-da yaşı-yor.
1pl.dat come-ger girl now Istanbul-loc live-prog.3sg
‘The girl that came to (visit) us now lives in Istanbul.’
Like relative clauses, embedded clauses in GT are finite, usually postposed to

the main clause, and introduced by an interrogative, functioning as a conjunc-
tion; Turkish-type nominalizations of embedded propositions are not found.
Adverbial clauses show a mixed pattern in relation to convergence tenden-
cies. One type of adverbial clause shows an overwhelming tendency to copy
the Indo-European subordination type: postposed finite subordinate clauses
introduced by a conjunction. To this end, a series of grammaticalization
processes take place giving rise to new subordinating conjunctions. The semantic
relations involved in clause combinations of this type are those of time (intro-
duced by açin ‘when’ in GT, or by ne zaman ‘when’ in other dialects of Mace-
donian Turkish), location (introduced by nerde ne ‘where’ < lit. ‘where what’,
cf. Macedonian kade što lit. ‘where what’), reason (introduced by niçin ‘be-
cause’ < ‘what-for’, cf. Macedonian zošto lit. ‘for-what’), manner (introduced
by kose ‘as if’, possibly a contamination of Macedonian kako ‘how’, Albanian
kinse ‘as if’, and Turkish -se ‘if’; see above), and comparison (introduced by
neka ne ‘as much as’ < ne kadar ne ‘how much what’, cf. Macedonian kolku
što). Purpose clauses and final clauses are equally finite, and show the verb of
the subordinated clauses in the subjunctive. They are introduced respectively
by the complementizer ki, directly reinforcing the subjunctive (cf. Macedo-
nian prepositional reinforcer za da), and the conjunction çaki ‘until’.
A second type of clause linkage remains largely unaffected by contact-
induced restructuring. This involves conditional clauses (‘If I pass my exam
my dad will buy me a bicycle’), and concessive clauses (‘Although I want to
go to Antalya, I won’t be able to go’). Both are marked by the conditional
marker -se (on its own for conditional clauses, with addition of de or hem
‘too’ for concessive clauses), which is added to a finite subordinated clause.
Thus, where Turkish already operates with finite subordinations, there ap-
pears to be no motivation to re-organize the structure of clause linking.
8. Lexicon
Despite multilingualism in the region where GT is spoken, lexical borrow-

ing is predominantly from Macedonian, reflecting the growing importance
of the state language in the past two to three generations especially. Lex-
ical borrowing from Macedonian naturally affects in the first instance se-
mantic areas belonging to the public domain, such as names for institutions
(teatar ‘theatre’, fakultet ‘faculty’, univerzitet ‘university’, vodovod ‘water
board’, kanalizatsiya ‘infrastructure’, militsiya ‘police’, armiya ‘army’, or-
dinatsiya ‘dental surgery’, klinika ‘clinic’, autobuska stanitsa ‘bus station’),
terms for practitioners and professionals (elektriçar ‘electrician’, stomatolog
‘dentist’, sestra ‘nurse’, apotekarka ‘pharmacist’, direktor ‘director’, student
‘student’, privatnik ‘having business in the private sector’), academic subjects
and professions (meditsina ‘medicine’, farmatsiya ‘pharmacy’, stomatologi-
ya ‘dentistry’, ispit ‘exam’, praktiçno ‘practical exam’, poen ‘mark’, matura
‘graduation’), construction and technology (kuyna ‘kitchen’, patos ‘flooring’,
şifunyer ‘cabinet’, parno ‘central heating’, radiator ‘radiator’, garaja ‘ga-
rage’, satelitska ‘satellite’, kaseta ‘cassette’, elektrika ‘electricity’), as well as
miscellaneous domains (maçka ‘cat’, şatka ‘duck’, sok ‘fruit juice’, zvuçnost
‘sound’, spetsiyalizatsiya ‘specialization’, tragediya ‘tragedy’, etc.).
9. Conclusion
It is interesting to note once again that Turkish has only been a minority lan-
guage in Macedonia for some three to four generations now. The fact that
Matter borrowing is limited to a rather small number of discourse particles
and conjunctions, may be a reflection of this recent retreat of Turkish from
public life, and its replacement, to a considerable degree, by Macedonian.
The lexicon, of course, reflects the recent dominance of Macedonian-speak-
ing society in the public domain, employment, technology, and so on. Never-
theless, the restructuring of clause combining strategies based largely on a
Macedonian model constitutes a radical departure from the Turkic syntactic
type, and it is most certainly much older than the retreat of Turkish as the
language of the public domain. Rather, the changes in this domain reflect
centuries-old multilingualism. It appears that in daily communication, speakers
were under pressure to organize complex utterances in a compatible way
across the various languages that constituted their linguistic repertoire.
What is essentially an economy-driven motivation – reducing multiple
pattern types across the linguistic repertoire to just one – might be understood
as a harmonization of utterance-organization strategies (Matras 2004). The
two areas that are most obviously affected are clause combining strategies,
and to a somewhat lesser extent, word order. With the former, it is the pack-
aging of supplementary information through finite subordinations that pre-
vails, and to this end a series of grammaticalization processes are triggered,
exploiting elements of the inherited lexicon, often following the Macedonian
model (‘replica grammaticalization’ in the terms of Heine and Kuteva 2005).
The latter, word order, involves harmonization of strategies of mapping in-
formation status at the level of the linear organization of the utterance. Here,
some constructions, such as possessive noun phrases and existential (copula)
predications, appear more vulnerable to the pressure toward harmonization
than others. Nevertheless, even word order in the basic verb phrase shows
a partial relaxation of the pragmatic constraints on the appearance of post-
verbal objects. This in turn provides an extended scope to employ such con-
structions, which resemble the word order rules of the contact language.
We may speculate that it was possible for pattern-replication of this type
to emerge in the vernacular language long before Turkish retreated as the offi-
cial language of the public sphere: it exploited constructions that pre-existed,
to some extent at least, in colloquial usage, such as semi-embedded finite
optative constructions (see discussion in Matras 1998 and 2004), or finite
subordinations introduced with ki, or pragmatically-marked constructions
involving de-topicalization of direct and indirect objects (in post-verbal pos-
ition). Pattern replication was thus a kind of compromise, allowing speakers
to maintain language loyalty while assisting the levelling of certain language
processing strategies within the multilingual repertoire. We suggest that this
latter aspect is a crucial component of the history of linguistic areas, for
which the Balkans have long served as a prototype example.
Abbreviations
abl ablative att attributive marker

acc accusative cop copula
aor aorist dat dative
def definite article past past tense marker

dim diminutive pl plural
fut future poss possessive
gen genitive prog progressive
indef indefinite marker q interrogative particle
inf infinitive rel relative particle
inst instrumental req requestive
loc locative sg singular
neg negation subj subjunctive
Notes
1. Examples are taken from Tufan’s fieldwork in Gostivar; observations are based
partly on fieldwork data collected by Matras among speakers from Stip.
2. Figures or even estimates of numbers of speakers in the entire region are diffi-
cult to obtain. Ethnic Turks in the Republic of Macedonia itself number around
70,000.
References
Caferoğlu, Ahmet
1964 Anadolu ve Rumeli ağızları ünlü değişmeleri. TDAYB. 133.
Friedman, Victor A.
2003 Turkish in Macedonia and Beyond. Wiesbaden: Harrassowitz.
Heine, Bernd, and Tania Kuteva
2005 Language Contact and Grammatical Change. Cambridge: Cambridge
University Press.
Matras, Yaron
1998 Convergent development, grammaticalization, and the problem of
‘mutual isomorphism’. In: Winfried Boeder, Christoph Schroeder, and
Karl-Heinz Wagner (eds.), Sprache in Raum und Zeit, 89–103. Tübingen:
Narr.
2004 Layers of convergent syntax in Macedonian Turkish. Mediterranean
Language Review 15: 63–86.
Németh, Gyula
1956 Zur Einteilung der türkischen Mundarten Bulgariens. Sofia: Bulgarische
Akademie der Wissenschaften.
Tufan, Şirin
2007 Language convergence in Gostivar Turkish (Macedonia). PhD thesis,
University of Manchester.
Grammatical borrowing in Kildin Saami
Michael Rießler
1. General
The present chapter deals with borrowings in Kildin Saami.1 The principal
contact language for Kildin is Russian, and has been so at least since the
end of the Middle Ages. There are no grammatical borrowings detected from
any languages other than Russian. However, many of the contact phenom-
ena dealt with may also be traced in the other Kola Saami languages Akkala,
Skolt, and Ter, which are facing the same contact-linguistic environment or
have even been exposed to stronger assimilation pressure. But since Kildin
is the best documented and the most accessible of the Kola Saami languages
this investigation will be restricted to observations on this language.
Most of the data used for this investigation have been taken from existing
descriptions of Kildin. Examples given without reference come from my own
field notes and have been cross-checked with native speakers.
The orthographic representation of examples follows the standards set in the
dictionary written by Kuruč, Afanasjeva, and Mečkina (1985).
1.1. Linguistic background
Saami is a branch of the Uralic language family. All Saami languages are
fairly similar in grammatical structure and lexicon. They form a dialect chain
stretching from central and northern Scandinavia to the eastern tip of the
Kola Peninsula. Kildin belongs to the group of East Saami languages. The
other subgroups of Saami are Central Saami (spoken in the northern parts of
Finland, Sweden and Norway) and South Saami (spoken in Central Scandi-
navia), each of which includes several languages.
One characteristic of the phonology of most Saami languages is the occurrence
of preaspirated voiceless stops and affricates [ʰp, ʰt, ʰk, ʰʦ, ʰʧ].
Negation in Saami is expressed by means of an inflected negation auxiliary
followed by the non-finite main verb in a special connegative form. Phrase
structure in Saami is for the most part head-final, including the predominant
occurrence of postpositions instead of prepositions and strict head-finality in
noun phrases with noun, adjective, and pronoun modifiers. Relative clauses,
however, follow the noun they modify. In the verb phrase a shift from SOV
to SVO word order seems to be taking place. The change in the order of
verb and direct object as well as the introduction of prepositions and relative
clause constructions are probably contact-induced. However, these changes
go back to Common Saami tendencies and are consequently not dealt with
in the present investigation.
One Kola Saami characteristic – as compared to the western Saami lan-
guages – is the relatively large consonant inventory, which is mostly due to
the fact that almost all consonants have a phonologically distinct palatal-
ized counterpart. As for the nasal and lateral dentals /n/ and /l/ there is an
opposition not only to the respective palatalized phonemes, but also to the
nasal and lateral palatals /ɲ/ and /ʎ/; consider the minimal triples mānn
/maːnː/ ‘moon; month’ – mannҍ /manː/ ‘egg’ – mann' /maɲː/ ‘daughter-in-law’
and pāll /paːlː/ ‘ball’ – māll' /maːlː/ ‘juice’ – māll'j /maːʎː/ ‘rust’. The
existence of a phonological opposition between palatal and palatalized con-
sonants seems to be uncommon cross-linguistically (cf. Stadnik 2002: 31,
elsewhere).
1.2. Sociolinguistics and geography
Kildin is currently spoken on the Kola Peninsula in the northwesternmost
part of the Russian Federation by no more than 700 people. The language is
endangered due to language shift to Russian and is hardly ever heard in public
life nowadays. Only elderly Saami use their mother tongue in conversation
with family members, relatives or friends. Among the younger generation,
there is a strong decline in active language competence due to the lack of a
vibrant speech community and the lack of any social motivation for learning
and using Saami (an overview of the current socio-political situation of the
Saami in Russia is given in Scheller 2006).
The integration of the Kola Saami into the Russian Empire, their adapta-
tion to Russian culture, and their conversion to Orthodox Christianity began
as early as the fifteenth century. Nevertheless, despite the longstanding as-
similation pressure, the territorial communities of the Saami were able to
preserve their social, economic and cultural identity – at least in the central
and northern areas – until the end of the nineteenth century. By the first half
of the twentieth century, the Saami culture was on the verge of destruction.
The tapping of mineral resources and the military armament of the region
were connected with an immense influx of manpower from Russia and other
republics of the Soviet Union. The dissolution of the traditional Saami com-
munities – as the result of forced integration of Saami reindeer herders into
large new agricultural co-operatives – and the resettlement of the Saami for
socio-political, economic and military reasons led to a dispersion of the ori-
ginal speech communities. The former compact Saami settlements and co-
herent local speech communities were replaced by mixed communities of
Saami speaking different local varieties, together with non-Saami (above all
Komi and Russians). As a result, within a few decades, the indigenous Saami
people became a tiny, scattered minority without any great influence on political
decisions (for an overview of Kola Saami history and further references
see Kulonen et al. 2005: 261–265).
1.3. Contact-induced change in Saami
Along with the changes in Kola Saami culture and society, we witness a
number of contact-induced linguistic changes. These changes – almost exclu-
sively of Russian origin – concern nearly all domains of the grammar but are
especially strong in the lexicon. Still, contact-induced change in Kola Saami
has not yet been the subject of a systematic investigation. Certain contact-
induced features of Russian origin are mentioned in works on Kola Saami,
for example in Kert's (1971) grammar of Kildin. Russian influence on Kola
Saami is also the subject of a published conference abstract by Klaus (1977)
dealing with borrowed adverbs and the use of Russian numerals in Saami
speech. The most extensive list of contact-induced features in Kildin
is found in a paper by Kert (1994). But even here it is mostly lexical borrow-
ings which are dealt with. Only a few grammatical features are listed briefly
by the author, such as borrowed phonemes in loanwords, the borrowed su-
perlative particle, and borrowed function words (Kert 1994: 112). All these
features are dealt with in more detail in the respective sections below.
Another phonological feature which Kert believes to be borrowed from
Russian is palatalization (Kert 1994: 111). This idea is shared by Stadnik
(2002: 34, 165; without reference to Kert). Neither of the two authors, however,
offers any explanation of the contact-linguistic mechanisms behind the
proposed development. Rather than attributing it to borrowing, I find it much
more reasonable to assume that Kildin Saami palatalization was triggered by
another, probably language-internal, development, namely the apocope of the
reduced stem-final vowels.
2. Phonology
Kildin has borrowed some phonemes along with Russian loanwords. In most
cases, however, the phonological distinctiveness of these phonemes is weak
since they only occur in loanwords and there are almost no real minimal pairs
available. Consider, for example, the innovative voicing opposition in word-
initial plosives, sibilants, and the labio-dental fricative, i.e. (original) /p, t, k,
ʃ, s, v/ – (innovative) /b, d, g, ʒ, z, f/, cf. purrk /purːk/ ‘lower part of the
reindeer's antlers’ – būrka /buːrka/ <Ru búrka ‘leather boot’ or šūrr /ʃuːrː/ ‘big’
– žoarr /ʒɒːrː/ <Ru žara ‘heat’. Also the occurrence of the voiceless velar
fricative in word-initial position is restricted to loanwords, as for example in
xoz'enҍ /xozen/ <Ru xoz'áin ‘host’. Another example of a borrowed phono-
logical feature is the introduction of word-initial consonant clusters, which
do not otherwise occur in Kildin unless in loanwords such as, for example,
the word floht [flɔʰtː] <Ru flot ‘fleet’.2
Besides the examples mentioned, Kildin phonology does not seem to ex-
hibit any other borrowed features.
3.1. New domains of illative case
In Kildin the illative case has expanded its domain to impersonal construc-
tions such as in example (1). This use of illative is clearly influenced by the
equivalent use of a dative construction in Russian (Szabó 1984: 36–37).
(1) a. Russian
Tebe ne nado znat′.
2sg.dat neg is_necessary.3sg to_know
b. Kildin Saami
Tonnҍe e=be tīdtҍe.
2sg:ill.sg neg=is_necessary.3sg to_know
‘You don't need to know.’
The equivalent construction in other Saami languages would normally be one

with a main verb and the subject in nominative case (e.g. NSa dárbbahit ‘to
need, to have use of’). However, this use of illative in Kildin is probably not
a very recent innovation since similar examples are found in all Kola Saami
languages (according to Itkonen's (1958) dialect dictionary of the Kola Saami

languages). Another, though archaic, impersonal verb in Kildin, given by
Itkonen (1958: 33) as glk (~gɛlk, galk) 3sg.prs ‘need, shall’ (cf. NSa galgat
inf ‘to shall, to have to’) requires either illative or genitive. Since similar im-
personal constructions – with either genitive or partitive – occur in Finnish
it cannot be ruled out that their use in the Kola Saami languages was originally
calqued, based on the Finnish model. Russian influence could then have
caused the replacement of an original genitive or partitive (depending on the
verb) in these impersonal constructions. Note also that the impersonal verbs
bedt and ebe (e=be ← ejj bedt [neg:3sg.prs is_necessary]) are borrowed
from Finnish (or Karelian) pitää ‘to need’ which is used in a construction
with a partitive subject.
In a construction expressing the age of someone or something in years,
illative case is obligatory as well (2a). This expression is also clearly mod-
eled on the analogous usage of dative in Russian (2b) and is not equivalent
to the construction used in other Saami languages, cf. the North Saami ex-
ample (2c).
(2) a. Kildin Saami

mɨnnҍe l'ēv kōll′m gke.
1sg:ill be.3pl three year-part
b. Russian
mne tri god-a.
1sg:dat three year(m)-gen.m.sg
lit. ‘to me are three years.’
c. North Saami
mun lea-n golbma jagi.
1sg be-1sg:prs three year\gen.sg
‘I am 3 years old.’
3.2. Diminutives and augmentatives
A case of pattern-borrowing in the nominal morphology of Kildin can be

found in the adoption of a second diminutive form. The two diminutives of
Kildin are formed by means of the suffixes -a and -enč. Whereas the first di-
minutive suffix (3b) simply marks diminution (without expressing the speak-
er's attitude to the noun in question), the second diminutive (3c) has rather
expressive semantics and could be characterized as a complimentary form.
(3) Diminutive and complimentary in Kildin

a. koabp' ‘ditch’
b. koab'-a ‘small ditch’
ditch-dim1
c. koab'-enč ‘dimple’
ditch-dim2
The model of the differentiated diminutive seems to have been borrowed from
Russian, where graded diminutive forms are quite frequent. The system of
graded diminutive (and augmentative) forms with single and combined suffixes
is on the whole very typical of Russian as well as of other Slavic languages.
In (4b) the feminine diminutive suffix -ka is attached to the noun jáma
‘ditch’, resulting in a form with the “simple” diminutive reading ‘small ditch’.
(4c) constitutes a diminutive form as well, although with the rather expressive
meaning ‘nice small ditch, dimple’.
(4) Diminutive and complimentary in Russian

a. jáma ‘ditch’
ditch(f)
b. jám-ka ‘small ditch’
ditch-dim1:f
c. jám-oč-ka ‘nice small ditch; dimple’ (complimentary)
ditch-dim2-dim1:f
The existence of a second diminutive (with complimentary semantics) in Kil-

din is clearly innovative since the other Saami languages only have the one
simple diminutive form. The complimentary (=dim2) suffix -(e)nč of Kildin
goes back to the Proto-Saami diminutive suffix *[-ńʒ́e] as does the diminutive
suffix -š / -ž(a)- of Central and South Saami (Korhonen 1981: 320), cf. NSa
mánná ‘child’, mánáš ‘little child’, mánážat ‘little children’.
The Kildin Saami innovative diminutive suffix -a, following the weak con-
sonant stem, is identical to the weak flexional stem of the original diminutive
suffix -enč, occurring, e.g., in the genitive or locative of the diminutive noun
(koab'-enč ‘dimple’ – koab'-a [ditch-compl\gen:sg] ‘of the dimple’ – koab'-
a-s't [ditch-compl-loc:sg] ‘in the dimple’). The diminutive koab'a dim ‘small
ditch’ is thus homophonous with the genitive of the second diminutive koab'a
compl\gen:sg ‘of the dimple’. Obviously speakers of Kildin generalized the
genitive (or weak stem) form of the original diminutive noun as the unmarked
diminutive form at some point. At the same time the meaning of the original
diminutive form of -enč has been shifted to that of a complimentary form.

Russian influence did not necessarily trigger the first step in the develop-
ment of the new diminutive form. On the contrary, giving up the original
diminutive suffix -enč in favor of -a might just be the result of the overall
tendency towards phonological reduction and loss of suffix codas in Kildin.
Historically, -a does not belong to the suffix but is a reflex of the original sec-
ond syllable vowel, cf. NSa niibi ‘knife’ – niipa-š dim and KSa njjp ‘knife’
– njp-a dim – njp-enč compl.
The archaic diminutive suffix -enč and the innovative suffix -a had prob-
ably been in free variation for a while (and might still be in the speech of some
Kildin speakers) before the first one acquired a secondary meaning. But in any
case, the reinterpretation of the original diminutive as complimentary must
be the result of Russian influence.
In Russian (and other Slavic languages), it is not only diminutives which
have graded semantics, but augmentatives do as well.
(5) Gradated augmentatives in Russian

a. dóm ‘house’
house
b. dóm-išče ‘large house’
house-aug
c. dóm-iško ‘worthless/bad (large) house’ (pejorative)
house-aug.dim
Kildin behaves analogously here as well. Kildin has an augmentative suffix,

although I find that derived augmentative nouns are used less frequently than
diminutives in Kildin. The most common augmentative suffix in Kildin is
-p'ihk /-pikÃ/ [-pik]. A second augmentative suffix with a more expressive,
pejorative meaning is -p'agka /-pagÃa/. End'ukovskij (1937: 139) also gives
pejorative forms on -p'ikenč ?/-pigenʧ/, which is obviously a combination of
an augmentative and a diminutive suffix. My language consultants do not use
this form. However, I find it tempting to believe that -p'agka is a secondary
development out of an original -pikenč and I suggest the following development
of the suffix (exemplified by the noun pēr̥̥rt ‘house’):
(6) a. aug pēr̥t-p'ihk [house-aug]

b. pejor1 pēr̥t-p'igk-enč [house-aug-dim1]
c. pejor2 pēr̥t-p'agk-a [house-aug-dim2]
The weak consonant stem in (6a) pēr̥t- (nom.sg pēr̥̥rt) is triggered by the suffixation
of aug. Regular stem alternation rules would also account for the
suffixal consonant gradation in (6b) -p'igk (← -p'ihk) as well as for the suffixal
umlaut caused by -a dim2 in (6c) -p'agk (← -p'igk).
The etymology of the augmentative suffix -p'ihk, however, remains obscure.
There is no augmentative in western Saami languages or even in Fennic, and
the Slavic augmentative suffixes have different shapes and thus cannot be the
source either. Saami speakers have suggested to me that the second pejorative
suffix may have been borrowed from the Russian word b'aka, which means
‘excrement’ (‘doo-doo’) in child language. I find this rather unlikely. But even
if b'aka has been borrowed from Russian, this would not explain the origin of
-p'ihk. In Komi there is a noun pik meaning ‘trouble, misfortune’ which can
also be used as an interjection with the meaning ‘Tough luck! Too bad!’ (Lytkin
1961: 537–538; cf. Ru bedá with the same meaning). Another possible
source (according to Jurij Kusmenko p.c.) could be Scandinavian bäck ‘tar’.
Even though the lexical source of the suffix remains unknown, the ex-
istence of an augmentative in Kildin and its derivation into a pejorative by
means of extending it with a diminutive suffix is obviously due to Russian
influence.
4. Verbal structures: analytic future tense
Kildin seems to be grammaticalizing an analytical future tense form with the

auxiliary all'ke ‘to start, to become’. Future tense is not known to occur in
other Saami languages.
(7) Analytic future tense in Kildin (Kuruč 1985: 558)

mjj all'k'-ep lōgk-e.
1pl fut-1pl read-inf
‘We will read.’
The construction in Kildin is obviously modeled on the Russian future tense

with the future auxiliary budet'. The latter construction is clearly grammat-
icalized since the verb budet' has no lexical meaning.
(8) Analytic future tense in Russian

my bud-em čita-t'.
1pl fut-1pl read-inf
‘We will read.’
According to Kuruč (1985: 558) all'ke is used as future auxiliary without any
lexical meaning. This is true at least for the speech of younger Kildin Saami,
who according to my own observations tend to use all'ke as future marker
similar to the Russian construction with budet'. However, the lexical use of
all'ke is also attested, as in the following example taken from a fairy tale
recorded in 1975 (Sammallahti 1998: 148).3
(9) Sjj ell'k-enҍ kānnҍc kānҍc-es' =že šoabš-e.

3pl start-3pl.prt friend friend\gen-poss:3sg top love-inf
‘They fell in love (with each other).’
5.1. Pronouns: Negative indefinites
In the Saami languages negative indefinites are usually derived from inter-
rogatives by means of a suffix (or particle), cf. NSa guhte-ge [who-neg]
‘nobody’, gosa-ge [where-neg] ‘nowhere’, etc.4 Kildin exhibits a different
construction with a negation prefix, borrowed from the Russian negative par-
ticle ni (~ né); consider the examples in (10).
(10) Negative indefinites with ni in Russian and Kildin

     Ru       KSa        gloss        translation
a.   niktó    ni-k’ē     neg-who?     ‘nobody’
b.   nigdé    ni-kas’t   neg-where?   ‘nowhere’
c.   ničtó    ni-mī      neg-what?    ‘nothing’
The marker ni- is productively used with all interrogatives in Kildin, which is
obviously the result of calquing the Russian model. Note that the Kildin inter-
rogative pronouns are all inherited from at least Proto-Saami. Exactly as in
the Russian source model, ni- attaches productively to interrogatives not only
in nominative but in other cases as well, for example ni-mēnn [neg-what.
acc:sg] ‘nothing(acc)’, ni-k'ējn [neg-who.com:sg] ‘with nobody’, ni-k'ēnn
[neg-who.gen:sg] ‘nobody's’, etc. Finally, the double-negated construction
in Kildin appears to be calqued from Russian as well.
(11) Kildin Saami

ni-mēnn munn emm ujn
neg-what.acc:sg 1sg neg:1sg.prs see.conn:prs
Russian
ničego ja ne vižu
neg.what.acc:sg 1sg neg see.1sg:prs
‘I don't see anything.’
Note that ni in Russian is not a prefix since it can be separated from the interrogative
when the latter is used with a preposition, as in Ru ni s
kém [neg with instr] ‘with nobody, with no one’. In Kildin, however, nothing
can be inserted between ni- and the interrogative. The obligatory boundness
of ni- to its host and its productive use (even though restricted to the closed
class of interrogatives) suggest that ni- is a prefix in Kildin. This marker thus
constitutes the only prefix in this otherwise exclusively suffixing language.
5.2. Particles and discourse markers
Lexical borrowing may go hand in hand with the introduction of grammatical

structures. This is obviously the case with the discourse-pragmatic structuring
of conversation and narration in Kildin which looks quite similar to that of
Russian due to the overall borrowing of Russian discourse markers like vot
‘well’, nu vot ‘now then’, tak ‘so’, ved' ‘you know’, etc.
Another example of borrowing can be seen in the Russian topic marker
=že (~ =še) ‘but; you know, you see; also’.
(12) Ovvan=že ‘Ivan, you know’

Ovvan top
The use of =že in Kola Saami, however, is clearly not a very recent innovation
since it was already mentioned in the Kola Saami grammar of Halász (1883:
40).
Analogously to Russian, the enclitic =že in Kildin also occurs as part of the
(lexicalized) adverb ndtše ‘also’, cf. Ru tože (← tot že ‘the same’). The first
component of the adverb in Kildin, however, was inherited at the latest from
Proto-Saami (cf. NSa na ‘so, well’ and ná ‘so, this way’).
All discourse markers mentioned above have functions similar to those in
the source language. However, these have replaced original Saami markers
rather than introducing a new structure of text ordering. A case of introducing

a new discourse-pragmatic model along with the borrowed marker is found in
the discursive use of the conjunction a ‘and, but’ <Ru a.
(13) KSa Mēhkal loge knīga. A Jussi?

Ru Mixajl pročital knigu. A J.?
M. read book but J.
‘Michael read the book. And what about Josh?’
5.3. Connectors
Several Russian borrowings have replaced Saami connectors (which in their

turn are often old borrowings from Baltofinnic into Common-Saami or Proto-
Saami), cf. jesli (~ jesle) ‘if’ <Ru jésli ‘if’, patamúšte ‘because’ <Ru potomú
čto ‘because’, štobe (~ štop) ‘so that’ <Ru čtó by ‘so that’, što (~ šte) ‘that’
<Ru čto ‘that’, etc. Consider the following example with the subordinating
conjunction šte (from Kuruč 1985: 267).
(14) KSa Munn tēdta, šte tōnn puadak.

Ru Ja znaju, što ty prijedeš.
‘I know that you come.’
Besides the Common Saami phrasal and clausal coordinator ja ‘and’, which is
a Proto-Saami borrowing from Germanic, cf. Gothic -jah (Sammallahti 1998:
249), the coordinator i ‘and’ <Ru i ‘and’ is often used by Kildin speakers. Other Russian
coordinators in Kildin are ili ‘or’ <Ru ili ‘or’ and ne ‘but’ <Ru no ‘but’. Kildin
also borrowed a relative particle katóre from the Russian relative pronoun
kotóryj m ‘which’.
5.4. Adjectives: Analytic superlative
Comparison in Kildin is usually expressed by means of the suffixes -a comp

and -mus sup; consider the adjective nūrr ‘young’ – nūr-a comp – nūr-a-mus
comp-sup.5 But the superlative of adjectives can also be formed analytically
with the borrowed particle same <Ru sámoj m. In Russian the superlative
marker agrees with the adjective in gender, number and case (the latter only
if the adjective is used attributively). In Kildin, where there is no category
of gender and attributive adjectives do not agree in case, the superlative marker
occurs as an uninflected particle.
(15) a. Synthetic superlative in Kildin

kugk-a-mus čuekas
long-comp-sup road
‘the longest road’
b. Analytic superlative in Kildin
same kugk'-es' čuekas
sup long-attr road
c. Double marked superlative in Kildin
same kugk-a-mus čuekas
sup long-comp-sup road
Whereas the innovative analytic superlative in (15b) seems to be used without

a difference in meaning as compared to the original synthetic superlative in
(15a), the double superlative construction in (15c) is used to further empha-
size the superlative notion ‘longest’.
6. Lexical borrowings
Kildin Saami has borrowed a large amount of Russian vocabulary. Nouns,

verbs, and adjectives have all been borrowed. By far the largest part of loan-
words from Russian belong to modern world concepts; however, Russian
loanwords are found in all other semantic domains as well.
There are also a fair number of function words among the recent lexical
borrowings of Russian origin. Examples of borrowed function words include
the adverb toal'ke /tɒlke/ [tɒlkə] (~ [tɔkə]) <Ru tol'ko [tɔlkɐ] ‘just;
only’, the conjunction i ‘and’ <Ru i ‘and’, and the ordinal numeral p'ērve
‘first’ <Ru pervyj m. ‘first’. The introduction of these function words belongs
to the borderland between lexical and grammatical borrowing (see also Sec-
tion 5 above).
The most salient characteristic of loanwords of the youngest stratum, that
is, words borrowed after approximately 1950, is the low degree of their
phonological integration. Many of the most recent loanwords have been
adopted without any phonological adaptation. The word for ‘car’, mašína
(<Ru mašína ‘car’), serves as an example. The word retains both its origin-
al second-syllable stress (Saami has first-syllable stress as a rule) and its
original syllable structure (trisyllabic word stems are quite rare in Kildin).
The loanword pāss'pe /paːsːpe/ ‘thanks!’ (<Ru spasíbo [spa'sibɐ] ‘thanks’)
on the other hand clearly belongs to an older stratum. The first consonant
/s/ was lost since Saami normally does not allow word-initial consonant
clusters. The word stress has shifted from the second to the first syllable. The
vowel and stem consonant in the first syllable have become long while the
second syllable vowel is apocoped (the palatalization of the cluster /sːp/ is
a remnant of the front vowel /i/ of the contracted second syllable). Finally,
the word final vowel [ɐ] of the Russian source word is further reduced to
schwa.
7. Conclusions
Contact with Russian has resulted in a few grammatical changes in Kildin

Saami. Changes concerning the borrowing of actual linguistic matter (or
MAT-borrowing) are mostly found at the level of discourse-pragmatic text
structuring (discourse markers, coordinators, subordinators). Interestingly,
almost all of the examples of borrowed function words are also found in other
Uralic contact languages of Russian (cf. Majtinskaja 1978–1979).
Apart from that, structural changes due to MAT-borrowing can be found in
the replacement of an original synthetic construction by a new analytic con-
struction in adjectival morphology (the superlative particle) and, for nominal
morphology, in the replacement of an original negative suffix by a borrowed
negative prefix on interrogatives.
Other changes in verbal and noun morphology, such as the grammatical-
ization of an analytic future tense and the introduction of secondary diminu-
tive and augmentative forms are rather due to pattern replication (or PAT-
borrowing), i.e. due to the adaptation of similar models from Russian.
The core areas of Kildin morphology and phonology seem to be relatively re-
sistant to MAT-borrowings: there are no borrowings of synthetic case, tense,
and person markers, or of demonstratives or other pronouns. Furthermore,
the distribution of new phonemes is clearly restricted to the loanwords with
which they were borrowed.
Abbreviations
acc accusative m masculine

attr attributive n neuter
aug augmentative neg negative
com comitative nom nominative
comp comparative NSa North Saami
compl complimentary part partitive
conn connegative pejor pejorative
dat dative pl plural
dim diminutive poss possessive
f feminine prs present
fut future prt preterite
gen genitive PSa Proto-Saami
ill illative Ru Russian
inf infinitive sg singular
instr instrumental sup superlative
KSa Kildin Saami top topic
loc locative
Notes
1. I would like to thank my colleagues K. Hildebrandt, K. Kotcheva, J. Kusmenko,
J. Wilbur as well as the editors of this volume for their helpful comments. The
usual disclaimers apply. My fieldwork on Kola Saami was supported by the
Gesellschaft für bedrohte Sprachen e.V. and the VolkswagenStiftung.
2. Note, however, that the preaspirated articulation of the voiceless stop in floht
[flɔʰtː] is a clear example of the phonological adaptation of the Russian
loanword according to the Saami phonological rules.
3. The stem alternation (e ← a) in the example (9) is the result of regular Umlaut.
4. The gloss neg might, however, be questionable since the morpheme can indeed
be added to other semantic word classes in North Saami and expresses some
kind of emphasis rather than negation in most constructions where it occurs.
5. Note that the iconic comparative marking with sup attaching to comp is the
result of the relatively recent apocope of the original comparative suffix -mp
together with the shift of the original word stem boundary. The comparative suf-
fix -a is part of the stem historically, cf. NSa nuorra ‘young’ – nuora-t comp –
nuora-mus sup. The comparative nūr-amp ‘younger’ can still be found in older
text collections and descriptions of Kola Saami languages but is now regarded
as archaic by speakers of Kildin.
References
End'ukovskij, A. G.
1937 Saamskij (loparskij) jazyk [The Saami (Lappish) language]. (Jazyki i
pis'mennosti narodov Severa). Moscow-Leningrad.
Halász, Ignácz
1883 Orosz-lapp nyelvtani vázlat [An outline of Russian–Lappish gram-
mar]. Nyelvtudományi közlemények 17: 1–45.
Itkonen, Toivo Immanuel
1958 Koltan- ja kuolanlapin sanakirja = Wörterbuch des Kolta- und Kola-
lappischen. (Lexica Societatis Fenno-Ugricae 15.) Helsinki: Suoma-
lais-ugrilainen seura.
Kert, Georgij Martynovič
1971 Saamskij jazyk (kil'dinskij dialekt): Fonetika, morfologija, sintaksis
[The Saami language (Kildin dialect): Phonetics, Morphology, Syn-
tax]. Leningrad: Nauka.
1994 Saamsko-russkie jazykovye kontakty [Saami–Russian linguistic con-
tacts]. In: Pjotr Mefodievič Zajkov (ed.), Pribaltijsko-finskoe jazykoznanie,
99–116. Petrozavodsk: Rossijskaja Akademija Nauk.
Klaus, Väino
1977 O nekotoryx javlenijax vlijanija russkogo jazyka na saamskij jazyk
[On some effects of Russian influence in Saami]. In: Issledovanie fin-
no-ugorskix jazykov i literatur v ix vzaimosv'azax s jazykami i liter-
aturami SSSR. Tezisy dokladov Vsesojuznogo Naučnogo Soveščanija
Finnougrovedov, 27–30 okt. 1977 g., 33. Užgorod.
Korhonen, Mikko
1981 Johdatus lapin kielen historiaan [Introduction to the history of the
Lappish language]. (Suomalaisen Kirjallisuuden seuran toimituksia
370.) Helsinki: Suomalaisen Kirjallisuuden seura.
Kulonen, Ulla-Maija, Irja Seurujärvi-Kari, Risto Pulkkinen, and Johanna Roto
2005 The Saami: A Cultural Encyclopaedia. (Suomalaisen Kirjallisuuden
seuran toimituksia 925.) Helsinki: Suomalaisen Kirjallisuuden seura.
Kuruč, Rimma Dmitrijevna
1985 Kratkij grammatičeskij očerk saamskogo jazyka [A short grammat-
ical sketch of Saami]. In: Rimma Dmitrijevna Kuruč, Nina Jelisejevna
Afanasjeva, and Jekatarina Ivanovna Mečkina (eds.), Saamsko-russkij
slovar' = Saam'-rūšš soagknehk', 529–567. Moskva: Russkij Jazyk.
Kuruč, Rimma Dmitrijevna, Nina Jelisejevna Afanasjeva, and Jekatarina Ivanovna
Mečkina
1985 Saamsko-russkij slovar' = Saam'-rūšš soagknehk' [Saami-Russian
dictionary]. Moskva: Russkij Jazyk.
Lytkin, V. I.
1961 Komia-roča slovar’ = Komi-russkij slovar’ [Komi-Russian dictionary].
Moscow: Gosudarstvennoje izdatel’stvo inostrannyx i nacional’nyx
slovarej.
Majtinskaja, Klara J.
1978–9 Zaimstvovannye elementy, ispol'zuemye v finno-ugorskix jazykax pri
obrazovanii form naklonenij [Borrowed elements, used in inflectional
forms in Finno-Ugric languages]. Etudes finno-ougriennes 15: 227–
231.
Sammallahti, Pekka
1998 The Saami languages: An introduction. Kárášjohka: Davvi Girji.
Scheller, Elisabeth
2006 Die Sprachsituation der Saami in Russland. In: Antje Hornscheidt,
Kristina Kotcheva, Tomas Milosch, and Michael Rießler (eds.), Grenzgänger:
Festschrift zum 65. Geburtstag von Jurij Kusmenko, 280–290.
(Berliner Beiträge zur Skandinavistik 9.) Berlin: Nordeuropa-Institut
der Humboldt-Universität.
Stadnik, Elena
2002 Die Palatalisierung in den Sprachen Europas und Asiens. Eine areal-
typologische Untersuchung. (Tübinger Beiträge zur Linguistik 461.)
Tübingen: Narr.
Szabó, László
1984 The Function of the Inessive–Elative and the Dativ–Illative in Kola
Lappisch. Nordlyd. Tromsø University Working Papers on Language
and Linguistics 8: 4–52.
Grammatical borrowing in Yiddish
Gertrud Reershemius
1. Background
The history and development of Yiddish, a West Germanic language, have been
intertwined with language contact from the very beginning. Three layers of
contact, historical, recent and current, can be distinguished.
1.1. Historical contact
The Yiddish language originates from medieval times and developed through
contact: Jewish speakers of Old High German and later Middle High Ger-
man varieties enriched these vernaculars with a component, mainly in the
lexicon, from Hebrew and Aramaic. These languages of the scriptures and
religious practice served as written and high varieties in a situation of internal
diglossia. The spoken language was written mainly for informal purposes,
such as private letters, memoirs, notes and entertaining or devotional litera-
ture, addressed to those who were unable to read and write in Hebrew, e.g.
women or those who could not afford an extended education. It is not known
whether in early stages the vernacular was identical with the varieties spoken
by non-Jews: a majority of scholars in the field of historical Yiddish linguistics
assume that Yiddish developed on the substrate of an older spoken Jewish
language, either Aramaic or a Romance-based variety (Jacobs 2005: 9–56).1
In their view the language of Ashkenazic Jews has always been distinct from
the varieties spoken around it, and it clearly is the case that the Germanic
component in Yiddish developed into a form distinct from German varieties
(cf. Timm 2005 for the lexicon).
Written sources of the language exist from the late thirteenth century.
These sources are easily recognizable as Jewish since they were written in
Hebrew letters. From the beginning of the sixteenth century, sources show
Yiddish to be clearly distinct from German varieties in lexicon and phonology
(Timm 1987).
1.2. Recent contact
Yiddish in the West (Western Yiddish) remained in contact with spoken German
varieties and the slowly evolving German standard language. However, the
migration of Jews to Slavic-speaking areas in Eastern Europe, driven by
persecution in the West ever since the First Crusade, gave rise to Eastern
Yiddish through contact with Slavic languages and the addition of a Slavic
component, especially in the lexicon, phonology and lexical derivation. Eastern
Yiddish thrived from the eighteenth century onwards, and became a fully-fledged
language able to cover all oral and written linguistic domains of modern life
and to contribute to world-class literature. Western Yiddish, however, was
abandoned by its speakers during the nineteenth century in most parts of
the Western Yiddish language area apart from some remote provinces in the
Southwest and in the Northwest of the German- and Dutch-speaking coun-
tries, where remnants of a Yiddish vernacular could be found well into the
twentieth century in small rural Jewish communities (e.g. Lowenstein 1969).
Currently, the term ‘Yiddish’ refers to Eastern Yiddish, a language spoken by
more than 10 million speakers until the Second World War.2 Recent language
contact refers to the formative contact with Slavic and, to a certain extent,
with Baltic languages, as well as to contact with both Standard German and
spoken German varieties which took place over the centuries and which in-
fluenced both languages. Yiddish–German contact is an extremely complex
issue due to the languages’ close structural relatedness and the various pos-
sible modes of contact. Therefore, an analysis cannot be attempted in the
framework of this chapter.
1.3. Current contact
Current contact, ongoing since the beginning of the twentieth century, involves
Yiddish and a number of languages such as English, Israeli Hebrew, Spanish,
and Flemish. After the Holocaust – approximately 5 million of its victims were
speakers of Yiddish – and anti-semitic persecution in Eastern Europe dur-
ing the 1950s and 1960s, a large majority of surviving Eastern European
Jews emigrated to the United States, Israel, South America or Western Eu-
rope. Most emigrants did not retain their language, although in some places
Yiddish-speaking enclaves developed, consisting of strictly orthodox Jewish
communities, which preserved the diglossic set-up of traditional Jewish so-
ciety to a certain extent. Currently, the number of Yiddish-speakers seems to
be growing again and is estimated to be around a million worldwide (Jacobs
2005: 3). More precise figures are not available, mainly because of the
orthodox communities’ reluctance to take part in surveys.
The present chapter must restrict itself to recent contact because
sociolinguistic data on, and analyses of, the highly diverse settings of current
contact are not yet sufficient to draw conclusions (cf. Isaacs 1999a, 1999b). The
chapter therefore focuses on contact with Slavic languages, which led to the
acquisition of the Slavic component in Yiddish.3
2. Phonology
The phoneme inventory of Yiddish has been influenced by recent language
contact, although apart from a general tendency for palatal dental plosives to
become fricatives (/c/ + /dz/ and /č/ + /dž/) the changes are usually restricted
to certain geographic parts of the historic Yiddish language area:
a. In North-Eastern Yiddish, new phonemes evolved: /t’/, /d’/, /s’/, /n’/, /l’/.
b. In the east of Ukraine /h/ was replaced by /ɦ/, whereas it was dropped in
the west. In some areas of Ukraine /h/ and /ɦ/ form oppositions.
c. /l/ changed generally to /ł/, in eastern areas to /w/.
Voicing might have undergone contact-induced change, as Weinreich de-
scribes:
the transformation of the consonantal tense/lax opposition into a voiceless/voiced
one. … In Yiddish, the Slavic environment has had a still further impact:
it was probably western Ukrainian … and Southern Belorussian … that
served as a model for the distinguishing of voice even at the end of words. …
final distinctive voice in Yiddish is found further north and east than in the
coterritorial Slavic and Baltic languages, and the interaction of internal Yid-
dish structural causes with Slavic influences must be considered. (Wein-
reich 1958: 374)
As for vowels, in Belorussia /o/ changed to /a/ in pretonic position, as for
example in xalile–xolile ‘God forbid’ (Weinreich 1958: 372). Although the
distinction between long and short vowels has been lost in most dialects
of Yiddish, this loss is not considered to be due to contact with Slavic languages
(Weinreich 1958: 374).
Two further areas of possible influence by recent language contact might
be stress and consonantal clusters. A number of non-phonemic secondary
stresses have been eliminated, and initial consonantal clustering, as for ex-
ample in Dlile ‘Delilah’ or pgam ‘defect’, has been extended under the influ-
ence of Slavic languages (Weinreich 1958: 377).
3. Nominal structures
The following nominal structures may have evolved or been influenced due
to contact: the possessive, word order of the predicate, gender, diminutives
and derivational suffixes.
3.1. Possessive
The genitive case has disappeared in Yiddish. Instead, a periphrastic construction
with the preposition fun ‘of’ is used: dos tixl fun der mame ‘mother’s
headscarf’. At the same time, a possessive form was retained for persons, e.g.
majn bobes bobe ‘my grandmother’s grandmother’, in which the Germanic
neuter and masculine genitive suffix -s was also applied to feminine nouns. In
some cases the definite article is added in the dative: dem suns froj ‘the son’s
wife’.4 Lötzsch (1974: 451–452) claims that the development of the Yiddish
possessive could have been influenced by the flexibility of word order in the
Slavic languages, particularly Russian. He admits, however, a number of dif-
ferences between the Yiddish and the Russian possessive constructions.
3.2. Word order of the predicate
An inflected adjective in the predicate conveys a distinct meaning:
ix bin krank ‘I am sick (now)’ but ix bin a kranker ‘I am a sick man’
(Weinreich 1958: 383). Weinreich claims that this distinction follows the example
of long and short forms of adjectives in Russian. Another feature concerns
the positioning of the adjectives after the noun they determine, following the
example of Slavic languages: “In Yiddish adjectives may be placed after the
noun they determine, usually with an affective connotation: dos land dos
farbotene ‘the forbidden country’, more solemn than dos farbotene land; di
toxter majne ‘my daughter’, with a touch of disparagement absent in majn
toxter.” (Weinreich 1958: 382).
3.3. Gender
Northeastern Yiddish lost the neuter gender, as did two of its contact languages,
Lithuanian and Latvian (see Eggers 1998: 346–347). In the case of
internationalisms, borrowings from more than one language usually in the
semantic areas of academic, scientific or technological discourse, gender is
applied in accordance with Slavic languages, e.g. univerzitet (m) ‘university’,
komitet (m) ‘committee’, zignal (m) ‘signal’.
3.4. Diminutives
The diminutive suffixes -inke, -enyu and -ke, as for example in muminke ‘dear/little
aunt’, tatenyu ‘dear/little father’ or Avromke ‘dear/little Abraham’ (here with
convergence with a Germanic-component suffix -ke), are borrowed from Slavic
languages (see Weinreich 1980: 531).
Yiddish has two degrees of diminutives in most nouns (-l and -ele, cf.
hun ‘hen’ – hindl ‘hen / little hen’ – hindele ‘little hen’), which follows the
example of Slavic languages, e.g. Polish. Although the suffixes used are Ger-
manic, the German language(s) do not have two degrees of diminutives.
Furthermore, the Northeastern Yiddish dialect, which no longer has the neuter
gender, has “adopted the un-Germanic but Slavic rule that diminutives
have the same gender as the base form” (Weinreich 1958: 380).
3.5. Derivational suffixes
A number of affixes were borrowed from Slavic languages:5
-še and -ke to create feminine nouns, e.g. gubernatorše ‘the governor’s wife’ or lererke ‘teacher’
-ák, -l’ák, -ńák, -áč to create pejorative forms of certain nouns, e.g. cvujak ‘hypocrite’, paskudniak ‘vicious person’ or jungač ‘thug’
-ák to describe somebody’s origin, e.g. litvak ‘someone from Lithuania’
-éc to indicate strength, e.g. boxeréc ‘strong bloke’
-čik, -eši to create endearment forms, e.g. Avromčik ‘dear Abraham’, mameši ‘dear mother’
-(e)ńu, -inke to create endearment forms in the vocative (see under 3.4)
-ixe to create nouns for female animals, e.g. lejbixe ‘lioness’
-arńe to name localities, e.g. xasid-arńe ‘gathering place of khassidim’
-ńik (m) and -ńice (f) as personifiers, e.g. tšajnik ‘teapot’ or ejšesišnice ‘adulteress’
-úk to create pejorative expressions for professions, e.g. šusteruk ‘shoemaker’ (pej.)
4. Verbal structures
Yiddish makes wide use of its set of Germanic prefixes and applies them,
following the example of Slavic verbs, as loanblends (e.g. farčepen ‘to provoke’,
where the stem is of Slavic origin and the prefix Germanic), as loan
translations (e.g. opbrengn ‘to bring back’, which is modelled on Polish
odnieść), or “… in verbs where the Yiddish prefix is applied productively
without reference to a Slavic model” (Weinreich 1958: 381). Loan-verbs are
integrated directly into the morphological marking of Yiddish verbs. The verb
is usually borrowed in the unmarked inflected form.
The following structures have been influenced by contact with Slavic languages:
aspect/Aktionsart, modal particles and verbal derivational patterns.
4.1. Aspect/Aktionsart
The significant development of lexical aspect (through prefixation) as well as
grammatical aspect in Yiddish (semelfactive, e.g. a kuk ton ‘to have a quick look’,
and iterative, e.g. ix flejk gejn ‘I (always) go’) is very likely to have been triggered
by contact with Slavic languages and their complex inventory of grammatical
aspect. It does not, however, copy the Slavic structures but “represents
a further development of a Germanic system” (for a detailed discussion see
Aronson 1985: 185; Albert and Meijering 2001).
4.2. Modal particles
The following adverbs with mainly modal functions were borrowed from
Slavic languages: “e.g. až to emphasize great quantity, het to emphasize
distance, na to accompany giving, ot to accompany pointing, take ‘indeed’,
xoč(be) at least (cf. Polish choćby), jakoš ‘somehow’, jakbe ‘as if’ (cf. jakby).
Note also the peculiar adverbs male (+interrogative) ‘no matter (what, who,
etc.)’ and same (+superlative) ‘very’ (e.g. der same grester ‘the very big-
gest’).” (Weinreich 1958: 390).
As for verbal derivational patterns, the Slavic affix -ke can function as a
verbalizer of interjections, e.g. bom-ke-n ‘to say “bom”’ (Weinreich 1958:
378).
5. Other parts of speech
In the areas of pronouns, particles and discourse markers, and adjectives and
adverbs, Yiddish shows signs of contact with Slavic languages.
5.1. Pronouns
The deictic particle ot is borrowed from Slavic and serves as a demonstrative
intensifier in Yiddish in combination with the der, di, dos-paradigm: e.g. ot
der jingl ‘this very boy’.
The indefinite pronoun abi is borrowed from Slavic:
(1) Zej hobn do   tsu ton    nit mit  abi-vemen.
    you have here to  do.inf not with pron-who.dat
    ‘You are not dealing with anybody here.’
The indefinite pronoun ljade is borrowed from Slavic:
(2) Di  eltere bojes zajnen ojx  grejt a ljade tog tsu antlojfn.
    the older  boys  are    also ready a pron  day to  run off.inf
    ‘The older boys are ready to run off any day.’
The construction -(s’)nit -iz follows a similar construction in Slavic languages:
(3) Zi  hot gor      kejn kojex    nit fartraxtn zix
    she has intensif no   strength not think.inf refl
    iber  velxe-nit-iz     injonim.6
    about which-not-be.3sg issues.
    ‘She hasn’t got the strength to think about any issues.’
Interrogative pronouns: the construction ver ... ver is formed according to the
example of Russian:
(4) Guthartsike balebostes hobn zi  ba   zix  gehaltn,
    kindhearted housewives have her with refl keep.participle
    ver  a xojdeš, ver  lenger.7
    pron a month,  pron longer.
    ‘Kindhearted housewives kept her with them, some for a month,
    some for longer.’
The intensifying enclitic -že ‘then’ after the interrogative pronoun vos, e.g.
vos-že, is borrowed from Slavic:
(5) Vos-že        est     du?
    what-intensif eat:2sg you?
    ‘What is it you are eating?’
An increased use of reflexivity can be observed in Yiddish for two reasons:
(a) a verb which was not etymologically reflexive acquires reflexivity because
the Slavic equivalent is reflexive, e.g. endikn zix ‘to come to an end’
from Pol. kończyć się (example from Eggers 1998: 312); (b) as in Polish, the
reflexive pronoun zix can be used as a “solitude marker”, e.g. ix gej zix ‘I am
walking along by myself’ (Katz 1987: 125).
5.2. Particles and discourse markers
A number of connectors, expressing addition and contrast, have been borrowed
from Slavic: “Yiddish has … the conjunctions to ‘in that case’, tsi
‘whether’ (Belor., dial. Pol. cy), (ńi)xaj ‘let…’, xoč ‘although’, (a)xibe ‘un-
less’, i…i ‘both… and’ (dial. even i ‘and’, in enumerations), (bodáj…) abí-
‘(let…) so long as’; …jak…abí- ‘be… as it may, but’-, dial. pokevanen ‘until’.”
(Weinreich 1958: 390). Among these connectors, tsi ‘whether’ functions as a
subordinating conjunction, xoč ‘although’ as a concessive clause marker.
The multifunctional focus particle ot is borrowed from Slavic languages
in the meaning of ‘just’, e.g. ot derzelbiger ‘just the one’. It is also used as a
place deictic meaning ‘here’ or ‘there’ (in pointing).
In the field of interjections, nu ‘well?’, ‘come on!’ is of Slavic origin.
Regionally, further discourse markers borrowed from various Eastern Euro-
pean contact languages may have been in use, e.g. in code-switching or semi-
habitual code-switching.
5.3. Adjectives and adverbs
The affixes -ske and -(ev)ate are borrowed from Slavic languages as adjectivizers.
The infixes -ičk-, -ink-, -en-, originating from Slavic, do not change the par-
ticular part of speech, but add a certain semantic feature and can be used to
modify adjectives.
Comparative forms in Yiddish show similarities to Polish and/or Russian
constructions: Yidd.: greser fun mir; Pol.: większy ode mnie – ‘taller than
me’. Yidd.: er iz mer enlex geven afn general vi af a jidn; Russ.: on byl bolee
poxož na – ‘he was more like a general than a Jew’ (Eggers 1998: 317). One
of three ways to form the superlative is an analytical construction with the
borrowed particle same + positive: majn same belibter frajnt ‘my most be-
loved friend’ (Lötzsch 1974: 455).
6. Constituent order
It is unclear whether constituent order in Yiddish has been influenced by recent
contact with Slavic or Baltic languages. The features which might have
developed due to contact are the following:
6.1. Possessor–possessed order
Yiddish allows for the positioning of the possessive pronoun after the noun as
an emphatic way of addressing a person, e.g. bruder majner! – ‘my brother!’,
which is also possible in Slavic languages and is more common than in Ger-
manic, where it appears very rarely and only in highly stylized forms, e.g.
German: Vater unser! ‘Our Father!’
6.2. Adjective–noun order
The adjective can follow the head noun (marked form), e.g. dos land dos
farbotene – ‘the forbidden land’ (Eggers 1998: 313). Some scholars, for
example Eggers (1998), claim that this construction is copying the generally
freer word order of Slavic languages. It needs to be taken into account, however,
that spoken German varieties (e.g. Bavarian) have this construction as
well, e.g. for emphasis, as in der Bub der narrische – ‘the foolish
boy’. The construction could therefore equally well be a development within
the Germanic component.
Additionally, Yiddish does not share the German “finite-verb-last” rule
for subordinate clauses. Some scholars claim that this is due to contact with
Slavic languages (e.g. Weinreich 1958: 383 or Eggers 1998: 313, without
going into detailed discussion of this point). Ebert (1998) shows that this
constituent order in German subordinate clauses is a fairly recent phenomenon,
since it only became the norm by the end of the sixteenth century. He comes
to the conclusion that it did not originate in spoken language but in the written
varieties used by chanceries. Yiddish word order may reflect structures
common in older, spoken varieties of German (Reershemius 2005). Yiddish
word order in subordinate clauses could thus have developed from within the
Germanic component, but independently of the emerging German standard
language. Current developments in spoken German underline this point,
since subordinate clauses introduced by weil – ‘because’ and obwohl – ‘although’
tend no longer to follow the “finite-verb-last” rule of Standard German.
7. Syntax
In the fields of negation, coordination, and embedding, recent language contact
needs to be considered as a formative factor.
7.1. Negation
Double negation (or poly-negation) in Yiddish is often mentioned as a phenomenon
which developed under the influence of Slavic languages. Double
negation, although not allowed in Standard German, is however a common
structure in older variants of German, as well as in many modern spoken Ger-
man varieties. Therefore, double negation in Yiddish can be considered as a
structure developed from within the Germanic component (see e.g. Eggers
1998: 315–316). The fact that Slavic contact languages contain the structure
as well might have supported its becoming established in Yiddish. Weinreich
(1958: 383) draws attention to another construction related to negation which
can be traced back to the influence of Slavic languages: after verbs expressing
fear, the subordinated clause is negated, e.g. ix hob mojre er zol nit kumen –
‘I am afraid he might come’, which is clearly uncommon in Standard German
as well as in spoken and (known) older varieties of German.
7.2. Coordination and adverbial clauses
A number of coordinating conjunctions are borrowed from Slavic languages
(see Section 5). The concessive subordinator xoč ‘although’ is borrowed from
Slavic.
7.3. Relative clauses
In Yiddish, interrogative pronouns function as relative pronouns, as in Slavic
languages. It is unclear, though, whether Yiddish follows the example of
Slavic here, since Standard German and spoken German varieties also use
interrogatives as relative pronouns in addition to the deictic paradigm der, die,
das. Word order in relative clauses follows Slavic examples:
(6) A dire fun fir  tsimern, ejnem fun velxe    er hot ajngeštimt
    a flat of  four rooms,   one   of  rel pron he has agree.participle
    tsu fardingn mir.8
    to  rent out me.
    ‘A flat of four rooms, one of which he agreed to rent out to me.’
As in some spoken Slavic varieties, Yiddish adds an anaphoric pronoun to
the relative pronoun vos in order to supply information about gender, number
and case, e.g. ... verter, vos er alajn hot šojn in zej ništ geglejbt ... – ‘words
in which he himself did not believe’ (example from Lötzsch 1974: 458). In
generalizing relative clauses, Yiddish also follows structures from Slavic lan-
guages, especially Russian, e.g. ... un vi kurts di švajgendike rege zol noxdem
nit zajn ... ‘however short the period of silence afterwards might be’ (ex-
ample from Lötzsch 1974: 458).
8. Lexicon
Lexical borrowing is probably the area where Yiddish has been influenced
most through recent language contact with Slavic, and to a certain extent
Baltic languages. Bin-Nun (1973) estimates that the Yiddish lexicon contains
approximately 10 to 15 percent of words borrowed from Slavic languages.
Nouns, verbs, adjectives, adverbs, pronouns, adpositions, conjunctions, dis-
course markers, interjections, particles and derivational affixes have been
borrowed into the semantic and communicative domains of religion, fauna
and flora, trade, geography, kinship terms, tools and body parts. Weinreich
(1958: 386–387) also lists parts of the house, household items, family, clothes
and food. The deictic particle ot can be used as a spatial expression for ‘here’
(in pointing) and ‘there’ (in pointing).
9. Conclusion
Recent language contact has led to a high degree of lexical borrowing in Yid-
dish from the Slavic languages. Yiddish phonology was clearly influenced
by Slavic, although some features remained regional. In the area of morpho-
syntax, many structures may or may not have evolved through contact. Closer
analysis shows that they could just as well be the result of internal develop-
ments in the Germanic component, which might, however, have been trig-
gered by contact. The discussion of these features underlines the fact that
it is insufficient to compare Yiddish only with Standard German. Spoken
German varieties, which existed and exist to a certain extent independently
of the German standard language, as well as older varieties, need to be
taken into account because they sometimes provide alternative explanations
for structures in Yiddish which have been put down to contact. Contact clearly
is not the only reason for the distinct development of the Yiddish language.
Grammaticalization processes within the Germanic component, which could
develop fairly unrestrictedly because of the absence of a standardized written
language for centuries, also need to be taken into account (cf. Reershemius
1997). By far the most productive area of contact with Slavic languages has
been word derivation, which has led to a unique blend of Slavic and Ger-
manic elements in Yiddish.
Abbreviations
Belor.      Belorussian
dat         dative
dial.       dialectal
f           feminine
inf         infinitive
intensif    intensifier
LCAAJ       Language and Culture Atlas of Ashkenazic Jewry
m           masculine
NE Yiddish  Northeastern Yiddish
nom         nominative
pej.        pejorative
Pol.        Polish
pron        pronoun
refl        reflexive
rel pron    relative pronoun
sg          singular
Yidd.       Yiddish
Notes
1. Paul Wexler’s (1991) hypothesis that Yiddish is a relexified variant of Sorbian
has been rejected by most scholars in the field; see e.g. Eggers (1998) and the
responses to Wexler’s article in International Journal of the Sociology of Language
91 (1991).
2. According to Bin-Nun (1973: 85–90), Yiddish was used by eight million speakers
worldwide in the 1930s, of whom six million were based in Eastern Europe.
The LCAAJ, vol. 1 (1992: 10) estimates that shortly before the Second World
War Yiddish had 10.5 million speakers, with seven million living in Eastern Europe.
3. The chapter is based on a corpus of spoken Yiddish in Israel. The data was col-
lected in 1988/89 and will be accessible in digitized form through the Phono-
gramme Archive (Austrian Academy of Arts and Sciences) in Vienna.
4. Examples taken from Weissberg (1988: 127).
5. Examples from Weinreich (1958) and Eggers (1998).
6. Examples from Lötzsch (1974: 456–457).
7. Example from Lötzsch (1974: 456).
8. Example from Lötzsch (1974: 458).
References
Albert, Ruth, and Henk D. Meijering
2001 Hat das Jiddische ein Aspektsystem? In: Sprache und Text in Theorie
und Empirie. Beiträge zur germanistischen Sprachwissenschaft. Festschrift
für Wolfgang Brandt, 28–40. Stuttgart: Steiner.
Aronson, Howard I.
1985 On Aspect in Yiddish. General Linguistics 25: 171–188.
Bin-Nun, Jechiel
1973 Jiddisch und die deutschen Mundarten. Unter besonderer Berücksich-
tigung des ostgalizischen Jiddisch. Tübingen: Moor.
Ebert, Rolf
1998 Verbstellungswandel bei Jugendlichen, Frauen und Männern im 16.
Jahrhundert. Tübingen: Niemeyer.
Eggers, Eckhard
1998 Sprachwandel und Sprachmischung im Jiddischen. Frankfurt a. Main:
Lang.
Isaacs, Miriam
1999a Haredi, haymish and frim: Yiddish vitality and language choice in a
transnational multilingual community. International Journal of the Sociology
of Language 138: 9–30.
1999b Contentious partners: Yiddish and Hebrew in Haredi Israel. International
Journal of the Sociology of Language 138: 101–121.
Jacobs, Neil
2005 Yiddish. A Linguistic Introduction. Cambridge: Cambridge University
Press.
Katz, Dovid
1987 Grammar of the Yiddish Language. London: Duckworth.
Herzog, Marvin, Vera Baviskar, Ulrike Kiefer, Robert Neumann, Wolfgang Putschke,
Andrew Sunshine, and Uriel Weinreich (eds.)
1992–2000 The Language and Culture Atlas of Ashkenazic Jewry (LCAAJ).
Vol. 1 Historical and Theoretical Foundations, vol. 2 Research Tools,
vol. 3 The Eastern Yiddish – Western Yiddish Continuum. Tübingen:
Niemeyer.
Lötzsch, Ronald
1974 Slawische Elemente in der grammatischen Struktur des Jiddischen.
Zeitschrift für Slawistik XIV: 446–459.
Lowenstein, Steven
1969 Results of Atlas Investigations among Jews in Germany. In: Marvin I.
Herzog, Wita Ravid and Uriel Weinreich (eds.), The Field of Yiddish.
Studies in Language, Folklore and Literature (3rd edn.), 16–35. The
Hague: Mouton.
Reershemius, Gertrud
1997 Biographisches Erzählen auf Jiddisch. Tübingen: Niemeyer.
2005 Einige Bemerkungen zur Bewahrung von Merkmalen des älteren Deut-
sch im Jiddischen. In: Holger Briel and Carol Fehringer (eds.), Field
Studies: German Language, Media and Culture. Frankfurt/Main et al:
Lang, 1127.
Timm, Erika
1987 Graphische und phonische Strukturen des Westjiddischen. Tübingen:
Niemeyer.
2005 Historische jiddische Semantik. Die Bibelübersetzungssprache als
Faktor der Auseinanderentwicklung des jiddischen und des deutschen
Wortschatzes. Tübingen: Niemeyer.
Weinreich, Uriel
1958 Yiddish and Colonial German in Eastern Europe. In: American Con-
tributions to the Fourth International Congress of Slavicists, Moscow,
September 1958, 369419. The Hague: Mouton.
Weissberg, Joseph
1988 Jiddisch. Eine Einführung. Frankfurt/Main: Lang.
Wexler, Paul
1991 Yiddish: The fifteenth Slavic Language. A study of partial language
shift from Judeo-Sorbian to German. International Journal of the
Sociology of Language 91: 9–150.
Grammatical borrowing in Hungarian Rumungro
Viktor Elšík
1. Background1
The language under description is a variety of Romani (Indo-Aryan) spoken
by long-settled Roms (Gypsies) of southern Slovakia and northern Hungary,
which is classified as the Northern (non-Vendic) subgroup of the South Cen-
tral group of Romani dialects (cf. Boretzky 1999; Elšík, Hübschmannová,
and Šebková 1999) and usually referred to as “Rumungro” in Romani linguistics.
The variety I chose to describe is one of the few Rumungro varieties
whose speakers are Hungarian bilinguals.2 It is the language of some 1,350
Rom inhabitants of the Hungarian village of Selice (Hungarian Sók, Rom-
ani Šóka) in southwestern Slovakia. In addition, there are about 150 Roms
in the village who speak a different (a North Vlax) dialect of Romani. The
former Roms are referred to as Rumungri (originally ‘Gypsy-Hungarians’)
by the latter group, who are called Pojáki (originally ‘Poles’) by the Ru-
mungri. Both groups use the ethnonym Rom for their own group and both
are called cigányok ‘Gypsies’ by Hungarians, although the Hungarian villa-
gers clearly differentiate between magyar cigányok ‘Hungarian Gypsies’ (i.e.
the Rumungri) and oláh cigányok ‘Romanian Gypsies’ (i.e. the Pojáki). At
present, both Rom groups taken together slightly outnumber the Hungarian
population of the village. Until recently, however, the Hungarians were in a
demographic majority and they remain the socially, economically, and polit-
ically dominant group in the village.
Rumungro is prevalently an oral language; some Rumungri are able to
write letters or text messages in Rumungro but the language is not used for
regular written communication. Nor is it used in mass media or in formal
education. Although Romani in general is an officially recognized language
in Slovakia, there is no recognition of the Rumungro dialect specifically and,
so far, there have been no attempts at its standardization. The Rumungro of
Selice is the language of family and in-group communication among the local
Rumungri and the language of inter-group communication between the Ru-
mungri and the local Pojáki. While the latter learn Rumungro as their second
dialect of Romani (and speak a distinct ethnolect of it), the Rumungri usually
do not learn the dialect of the Pojáki. Many Hungarian villagers understand
Rumungro well, although only a few have some active competence in it and
they are rarely fluent speakers. While all Selice Rumungri born before 1975
or so are native speakers of Rumungro, in some families children are pres-
ently spoken to only in Hungarian or Slovak, and left to acquire some com-
petence in Rumungro in adolescent and adult peer groups, if at all. Thus, Ru-
mungro of Selice is not a safe language, though it is not seriously endangered
yet.
All school-age or older L1 speakers of Selice Rumungro are multilingual.
First of all, they are fluent and highly competent in Hungarian, which they
use especially in their everyday communication with the Hungarian villa-
gers. Some young children may be monolingual in Rumungro, although early
acquisition of Hungarian appears to be the prevailing pattern nowadays. In
addition, most Rumungri are fluent in Slovak, the official and dominant lan-
guage of Slovakia, which they use outside of the village. Also, most have ac-
quired at least passive competence in Czech through their exposure to Czech
mass media and employment-related stays in the Czech part of the former
Czechoslovakia (in the 1960s–1980s almost all families of the Selice Ru-
mungro community spent ten to thirty years there). Though both Hungarian
and Slovak (and to some extent Czech as well) may be classified as current
L2s of Selice Rumungro, it is clear that Hungarian enjoys a special sociolin-
guistic status: inter alia it is the language of the secondary ethnic identity
of the Selice Rumungri, who frequently refer to themselves as “Hungarian”
Roms, accepting the attribute ascribed to them by Hungarians.
As evidenced by lexical borrowings, Rumungro shares with other Rom-
ani dialects previous contact with West Iranian (Persian and/or Kurdish),
Ossetic, Armenian, and especially Greek; the latter language also had an
enormous impact on Romani grammar. On the other hand, most South
Slavic loanwords in Rumungro are dialect-specific within Romani. Some
of them can be identified as Serbian–Croatian or even Ikavian Serbian–
Croatian (Elšík, Hübschmannová, and Šebková 1999). Linguistic contact of
Rumungro with Hungarian is likely to have lasted for at least two centuries.
Widespread multilingualism of the Selice Rumungri in Slovak and Czech did
not develop before the 1920s and 1950s, respectively. While these second-
ary current L2s have contributed only a few marginal loanwords, Hungarian
has exerted, and continues to exert, a strong lexical and grammatical influ-
ence on Rumungro. The present chapter will focus on Rumungro borrowings
from Hungarian, although borrowings from other contact languages, both
pre-Hungarian and “post”-Hungarian (i.e. Slovak and Czech), will also be
discussed.
Hungarian Rumungro 263
2. Phonology
The inventory of Rumungro phonemes is identical to that of Hungarian, with
two exceptions. First, Rumungro retains distinctive aspiration in voiceless
stops and affricates, e.g. čór- [ʧo:r] ‘steal’ vs čhor- [ʧhor] ‘pour’, which is
absent from Hungarian. Second, Hungarian rounded front vowels are usually
replaced with their unrounded counterparts in loanwords, e.g. csütörtökön
[ʧytørtøkøn] > čiterteken [ʧiterteken] ‘on Thursday’, although some speak-
ers now tend to retain them in certain loanwords. Both vowel and conson-
ant inventories of Romani have been enlarged due to contact with Hungar-
ian. Instances of contact-induced phoneme loss are rare: they include the
merger of the voiceless uvular fricative [χ] with the glottal fricative /h/ [h]
and the merger of the palatal lateral [lj] with the palatal approximant /j/ [j],
e.g. *[χaljam] > hájam [hɒ:jam] ‘we ate’. On the other hand, contact with
Hungarian has given rise to several phonemic distinctions and numerous new
phonemes in Rumungro.
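The replacement of Hungarian front rounded vowels in loanwords, mentioned above, can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions rather than a full account: the input is taken to be a broad IPA transcription, only the two short vowel qualities attested in the example are modelled, the function name adapt_loan_vowels is hypothetical, and speakers who retain the rounded vowels are ignored.

```python
# Hypothetical helper: replaces front rounded vowels with their unrounded
# counterparts, as described for most Hungarian loanwords in Rumungro.
# Assumption: the input is a broad IPA string. Only short [y] and [ø], the
# vowels of the attested example, are modelled; no claim is made about the
# treatment of long vowels.
UNROUNDING = str.maketrans({"y": "i", "ø": "e"})


def adapt_loan_vowels(ipa: str) -> str:
    """Return the transcription with [y] -> [i] and [ø] -> [e]."""
    return ipa.translate(UNROUNDING)


# csütörtökön 'on Thursday' > čiterteken
print(adapt_loan_vowels("ʧytørtøkøn"))  # prints ʧiterteken
```

Forms without front rounded vowels pass through unchanged, so the same function can be applied to any transcribed loanword.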
A major contact-induced change has been the development of distinctive
phonological quantity: vowel length, e.g. phirav- [phirav] ‘wear’ vs phírav-
[phi:rav] ‘make [so.] walk’, and consonant gemination, e.g. čuča [ʧuʧa]
‘empty’ (an inflectional form) vs čučča [ʧuʧ:a] ‘breasts’. Both types of quan-
tity have spread to the pre-Hungarian lexical component, although some in-
dividual geminates remain restricted to the Hungarian component. The inven-
tory of vocalic qualities, too, has been enlarged due to contact. Although the
open-mid front vowels – the short /ë/ [æ] and the long /e̋/ [æ:] – are mostly
restricted to Hungarian loanwords, they are phonologically distinct from their
closed-mid counterparts, e.g. dë [dæ] ‘but’ vs de [de] ‘give!’. In addition to
the phonological quantity difference, the long /á/ [ɒ:] is distinguished through
phonetic rounding from the short /a/ [a], as it is in the local Hungarian dialect.
Contact with Hungarian has also triggered the development of a series of pal-
atal consonants from palatalized dentals or palatalized velars, e.g. *[tatjar] >
taťar- [tacar] ‘make warm’, *[kjhil] > ťhil [chil] ‘butter’.
The Hungarian-origin phonemes play an important role in morpho-phono-
logical alternations. In addition, several morpho-phonological rules are bor-
rowed. For example, a morpheme-initial palatal approximant triggers gemi-
nation and a shift to a palatal of a preceding morpheme-final dental stop, as it
does in Hungarian, e.g. kafid-i [kafidi] ‘table’ → {kafid-ja} kafiďď-a [kafiɟ:a]
‘tables’. Rumungro also borrows vowel harmony from Hungarian, although it
remains restricted to a single type of alternation that affects only a few indig-
enous affixes, e.g. farkašš-a [farkaʃ:a] ‘wolves’ vs ke̋mívëšš-ë [kæ:mi:væʃ:æ]
‘bricklayers’, bika-ha [bikaha] ‘with a bull’ vs këčkë-hë [kæʧkæhæ] ‘with a
goat’. Apart from the development of long vowels and geminate consonants,
the syllable structure of the pre-Hungarian component has remained unaf-
fected by contact with Hungarian. On the other hand, there is no adaptation
of Hungarian loanwords in terms of their syllable structure. The distribution
of long vowels in Rumungro suggests that they developed before the
Hungarian-induced general shift of stress to word-initial position, e.g.
*[barvaˈlo] > *[barva:ˈlo] > barválo [ˈbarvɒ:lo] ‘rich’. Intonation patterns are largely
identical to those of the local Hungarian dialect.
3. Typology
The typological profile of Asian (Proto-)Romani was altered rather
significantly already before the arrival of its speakers in Europe. Matras (2002: 196)
argues that, for example, the development of interrogative-based relativizers
or the reduction of non-finite constructions could have taken place in a west-
ern Asian convergence area, i.e. before the contact of Romani with Greek
in Asia Minor. The latter language nevertheless remains the major source of
typological innovations that are shared by Romani as a whole: the develop-
ment of a proclitic definite article, the emergence of prepositions (or a signifi-
cant expansion of their inventory), the shift to a basic predicate–object order,
and more (cf. Matras 1994, 2002: 198–199).
Post-Greek L2s have had a less significant impact on major typologic-
al parameters of Rumungro. While Romani possessed a single prefix in its
Greek period, matter borrowing of several pronominal prefixes from South
Slavic and Hungarian, of a superlative prefix from Hungarian, and a gram-
maticalization of another pronominal prefix due to pattern replication from
Hungarian (see Sections 4 and 6), has increased the number of prefixes in
Rumungro by eight. Outstanding syntactic developments due to contact with
Hungarian include the creation of a class of preverbs (see Section 5), the
“re-introduction” of non-finite subordinate constructions (see Section 8), and
various modifications in word-order patterns (see Section 7).
Nouns are commonly borrowed into Rumungro. Pre-Greek and some (pre-
sumably early) Greek noun loans show full morphological integration and are
structurally indistinguishable from indigenous nouns; they have a so-called
oikoclitic morphology. Some (presumably late) Greek and post-Greek noun
loans, on the other hand, have so-called xenoclitic morphology (Elšík and
Matras 2006: 324–333), which is characterized, above all, by borrowed
nominative inflections, mostly of Greek origin. These inflections were extracted
from lexical loans of nominative noun forms, and extended to later loanwords
as well. For example, the xenoclitic nominative singular feminine suffix -a
was extracted from Greek-origin nouns, e.g. cip-a ‘skin’ < tsip-a, and ex-
tended to nouns borrowed from South Slavic, e.g. péť-a ‘oven’ < Serbian–
Croatian pēć, and Hungarian, e.g. virág-a ‘flower’ < virág. Hungarian does
not contribute any xenoclitic noun inflections.
Like nouns, adjectives, too, are commonly borrowed. The distinction be-
tween xenoclitic and oikoclitic adjective inflection, which is reconstructable
for earlier post-Greek stages of Romani, has been lost in Rumungro due to
internal analogical developments (Elšík and Matras 2006: 329). Borrowed
adjectives now inflect like indigenous adjectives, showing Indo-Aryan in-
flections, e.g. žut-o ‘yellow’ < Serbian–Croatian žut. Unlike earlier adjective
loans, adjectives borrowed from Hungarian contain the overt adaptation suf-
fixes -av- or -n- of South Slavic origin, in addition to the indigenous inflec-
tions, e.g. sirk-av-o ‘grey’ < szürke, kík-n-o ‘blue’ < kék. Although Rumungro
possesses indigenous means to derive manner adverbs from adjectives, most
manner adverbs corresponding to Hungarian-origin adjectives are lexic-
al loanwords from Hungarian rather than internal derivations, e.g. okoššan
‘wisely’ < okosan (cf. okoš-n-o ‘wise’ < okos).
Extraction from lexical borrowings is the source of several derivational
affixes in Rumungro. There are three borrowed noun-deriving affixes that
are productive, in addition to several lexically restricted ones, which will
be left out of the present discussion. First, the South Slavic-origin suffix
-kiň- derives nouns denoting female humans from loanwords of Hungarian
human nouns, e.g. sakáč-kiň-a ‘female cook’ ← sakáč-i ‘cook’ (< szakács
‘cook’). Second, the Hungarian-origin suffix -áš- derives action nouns from
internal verb derivations in -áz- (see Section 5), e.g. ďij-áz-áš-i ‘singing’ ←
ďij-áz-in- ‘to sing’ (← ďíl-i ‘song’), and from a few underived indigenous
verbs, e.g. muk-áš-i ‘divorce, separation’ ← muk- ‘to leave, let’. Finally, the
Hungarian-origin prefix mí- derives nouns denoting artificial objects from
nouns denoting natural objects of the same type, e.g. mí-dand ‘artificial tooth’
← dand ‘tooth’. The prefix has been extracted from loanwords of Hungarian
compounds consisting of the noun mű ‘creation, artificial thing etc.’ and a
body-part noun, e.g. mí-këňek-a ‘artificial elbow’ < mű-könyök.
Extraction of adjective- and adverb-deriving affixes is also attested. The
Greek-origin suffix -(i)k- derives several semantic types of relational
adjectives from nouns, e.g. dévl-ik-o ‘divine’ ← dél ‘God’, ninc-k-o ‘German adj’
← ninc-o ‘German n’ (< Serbian–Croatian nimac), meňassoň-ik-o ‘bridal’
← meňassoň-a ‘bride’ (< Hungarian menyasszony). Ethnic adjectives in -(i)k-
then form manner adverbs by the suffix -a, which is likewise of Greek origin,
e.g. ninc-k-a ‘in (a) German (way)’. The Serbian–Croatian-origin suffix -ast-
is a fully productive means to derive attenuative adjectives, e.g. gull-ast-o
‘rather sweet’ ← gull-o ‘sweet’, míl-n-ast-o ‘rather deep’ ← míl-n-o ‘deep’
(< Hungarian mély). The Hungarian-origin suffix -óš- derives a few active
adjectives from causatives of a class of indigenous verbs, e.g. dara-v-óš-n-o
‘frightening’ ← dara-v- ‘to frighten’ (← dara- ‘to fear’); the obligatory pres-
ence of the adaptation suffix -n- after -óš- shows that the latter has been
extracted from adjectives borrowed from Hungarian. Finally, a complex in-
terplay of matter borrowing from Hungarian and internal re-analysis has re-
sulted in the development of the suffix -šon, which derives manner adverbs
from a class of similative adjectives, e.g. roman-iká-šon ‘in a rather Gypsy
way’ ← roman-ikán-o ‘Gypsy-like’ (← roman-o ‘Gypsy adj’).
Affix extraction, however, does not appear to be responsible for the bor-
rowing of the superlative prefix lëg- from Hungarian, which derives the su-
perlative from the comparative, e.g. lëg-bar-eder ‘the biggest’ ← bar-eder
‘bigger’ (← bár-o ‘big’), lëg-míl-n-eder ‘the deepest’ ← míl-n-eder
‘deeper’ (← míl-n-o ‘deep’ < Hungarian mély). The superlative prefix must have
been borrowed directly, without the mediation of lexical borrowing, since
Rumungro superlative forms are internal derivations rather than borrowings
of Hungarian superlatives, such as leg-mély-ebb ‘the deepest’. The direct
borrowing of the superlative prefix was probably facilitated by the resulting
structural isomorphism between the Rumungro and the Hungarian degree
derivation, viz. derivation of comparatives by suffixation and of superlatives
by further prefixation. Although the Rumungro comparative suffix is pre-
Hungarian (Indo-Aryan or, more likely, Iranian, cf. Matras 2002: 196), dia-
lect comparison with other Romani dialects suggests that the retention of
synthetic comparatives in Rumungro is likely to be due to contact with Hun-
garian.
An unambiguous instance of pattern replication from Hungarian is the
creation of associative plurals in Rumungro human nouns, which are dis-
tinct from their regular plurals, e.g. ke̋mívëš-ingere ‘bricklayer and his work
team’3 vs ke̋mívëšš-ë ‘bricklayers’ (← ke̋mívëš-i ‘bricklayer’); the category,
undocumented in other Romani dialects, replicates an identical distinction in
Hungarian, e.g. kőműves-ék vs kőműves-ek (← kőműves). Probably due to
pattern replication from the genderless Hungarian, Rumungro has lost
feminine derivation with nouns denoting (higher) animals: for instance, the
masculine gra ‘horse’ (originally also *‘stallion’) has lost its inherited feminine
counterpart *gras-n-i ‘mare’ and is now a generic designation of the species.
There are numerous instances in Rumungro of pattern replication concern-
ing the syntax and semantics of case markers. To mention just a couple of
examples: the dative case, which encodes beneficiaries and some recipients,
is now also used to encode certain predicate complements (Rumungro 1a,
Hungarian 1b), and the inferior spatial preposition tal ‘under’ also encodes
temporal telic extent (2a–b). Finally, the Hungarian model has triggered the
grammaticalization of indigenous spatial adverbs into a series of separative
prepositions, cf. the Rumungro preposition anglal (3a), and the Hungarian
postposition elől ‘from the front of’ (3b).
(1) a. Romes-ke man hajovav.
       Gypsy.obl-dat 1sg.acc feel.1sg
    b. Cigány-nak érzem magam.
       Gypsy-dat feel.1sg emph.1sg
       ‘I feel as a Gypsy.’
(2) a. Tal o pándž dí ári sasťíja.
       under def five day outward get.healthy.pfv.3sg
    b. Öt nap alatt gyógyult meg.
       five day under get.healthy.pret pfv
       ‘S/he recovered in five days.’
(3) a. Naššov anglal mre jakha.
       get.lost.imp from.the.front.of 1sg.gen:pl eye.pl
    b. Tűnj a szemem elől.
       get.lost.subj def eye.1sg.poss from.the.front.of
       ‘Get out of my sight!’, lit. ‘Get lost from the front of my eye(s)!’
Verbs are commonly borrowed into Rumungro. Pre-Greek and early Greek
loan-verbs show full morphological integration and are structurally indistin-
guishable from indigenous verbs. Post-Greek loan-verbs, on the other hand,
are marked out by a specific adaptation marker, the Greek-origin suffix -in-,
which is added to an inflectional stem of the source verb, e.g. vič-in- ‘to
shout’ (< Serbian–Croatian vič-), dógoz-in- ‘to work’ (< Hungarian dolgoz-),
and followed by regular indigenous inflections. The suffix was extracted
from lexical borrowings of Greek verbs with the present stem in -in-. Though
none of these have been retained in Rumungro, the suffix has been extended
to those Greek loan-verbs that originally contained a different suffix, e.g.
rum-in- ‘to damage, spoil’ < Greek rim-az- ‘to ravage’. Dialect compari-
son suggests that the suffix -in- was originally specialized for non-perfective
adaptation of some transitive loan-verbs in Romani (Matras 2002: 130). In
Rumungro, however, it has developed into a general, aspect- and valency-
neutral, verb-adaptation marker.4 Nonce loan-verbs from Slovak (or Czech)
show a distinct pattern of morphological adaptation: their infinitive stems
get adapted by the Hungarian-origin adaptation suffix -ál-, in addition to the
regular adaptation suffix -in-, e.g. sledov-ál-in- ‘to observe, follow’ (< Slovak
sledova-).5
The adaptation suffix -in- is absent in passive participles of morphologic-
ally adapted borrowed verbs. Instead, the participles contain the Greek-
origin participle suffix -ime, e.g. rum-ime ‘spoiled’, vič-ime ‘shouted’, téle
dógoz-ime ‘worked away’ (lit. ‘downward worked’), sledov-ál-ime ‘ob-
served, followed’. The suffix was extracted from Greek lexical borrowings
and extended to post-Greek loan-verbs and several indigenous verb classes,
e.g. d-ime ‘given’ (← d- ‘to give’; cf. Elšík and Matras 2006: 331–332).
Another borrowed non-finite marker is the Hungarian-origin infinitive suffix
-ňi (cf. Elšík and Matras 2006: 179), which has been extracted from lexical
borrowings of Hungarian infinitives and extended to a class of non-borrowed
verbs, viz. those derived by the suffix -áz- (see below). Like the participle
suffix -ime, the infinitive suffix -ňi is incompatible with the adaptation suffix
-in-, e.g. ďij-áz-ňi ‘to sing’ (← ďij-áz-in- ‘to sing’, stem). Unlike Hungarian
infinitives, the Rumungro Hungarian-origin infinitives do not allow any nom-
inal inflection (see Section 7 for syntactic details).
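The morphological adaptation of post-Greek loan-verbs described above can be summarized in a small Python sketch. It is an illustration only: the function names and the nonce flag are hypothetical, stems are passed in already segmented (e.g. sledov- rather than the Slovak infinitive stem sledova-), and the regular indigenous inflections that follow the adapted stem are not modelled.

```python
ADAPTATION = "in"    # Greek-origin general verb-adaptation suffix
PARTICIPLE = "ime"   # Greek-origin passive participle suffix
NONCE_EXTRA = "ál"   # Hungarian-origin suffix added in Slovak/Czech nonce loans


def finite_stem(stem: str, nonce: bool = False) -> str:
    """Adapted stem that takes regular inflections, e.g. vič-in-."""
    parts = [stem] + ([NONCE_EXTRA] if nonce else []) + [ADAPTATION]
    return "-".join(parts) + "-"


def passive_participle(stem: str, nonce: bool = False) -> str:
    """Passive participle: -ime appears instead of -in-, e.g. vič-ime."""
    parts = [stem] + ([NONCE_EXTRA] if nonce else []) + [PARTICIPLE]
    return "-".join(parts)


# Segmented stems as given in the text (assumed input format).
print(finite_stem("dógoz"))                      # dógoz-in-
print(finite_stem("sledov", nonce=True))         # sledov-ál-in-
print(passive_participle("sledov", nonce=True))  # sledov-ál-ime
```

The two functions differ only in the final suffix, which mirrors the observation that -in- and -ime do not co-occur.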
Extraction from Hungarian loanwords is also the source of two Rumungro
verb-deriving affixes. The suffix -áz-, in conjunction with the following adap-
tation suffix -in-, is a productive means to derive intransitive verbs from pre-
Hungarian nouns, e.g. paramis-i ‘fairy-tale’ (< Greek) → paramis-áz-in- ‘to
tell fairy-tales’. The second extracted derivational affix is the causative suf-
fix -tat-. Rumungro allows three different structural types of causatives of
Hungarian loan-verbs: lexical borrowing and adaptation of Hungarian
causatives, e.g. dógoz-tat-in- ‘to make [so.] work’ (< dolgoz-tat); internal
derivation from a non-causative loan-verb by an indigenous causative marker, e.g.
dógoz-in-av-; or, most commonly, a combination of both types, which results
in double causative marking, e.g. dógoz-tat-in-av-. This pattern of double
causative marking has also been analogically extended to some classes of in-
herited and internally derived verbs, e.g. ďij-áz-in- ‘to sing’ → ďij-áz-tat-in-av-
‘to make [so.] sing’.
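The three structural types of causatives can be laid out side by side in a short Python sketch. The function name and the dictionary keys are hypothetical labels chosen for illustration; the segmentation follows the examples in the text, and the sketch does not model which verb classes actually allow each type.

```python
def causative_types(stem: str) -> dict:
    """Return the three causative stem shapes for a segmented verb stem."""
    return {
        # Hungarian causative -tat- borrowed with the verb, then adapted by -in-
        "borrowed": f"{stem}-tat-in-",
        # adapted loan-verb plus the indigenous causative marker -av-
        "internal": f"{stem}-in-av-",
        # both markers combined (double causative marking)
        "double": f"{stem}-tat-in-av-",
    }


for label, form in causative_types("dógoz").items():
    print(label, form)
```

Applied to the internally derived stem ďij-áz-, the "double" entry yields ďij-áz-tat-in-av-, the analogically extended form cited above.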
While morphological causatives have been inherited from Indo-Aryan,
dialect comparison within Romani suggests that their retention and produc-
tivity in Rumungro is due to pattern replication from Hungarian (cf. Hüb-
schmannová and Bubeník 1997). Morphological frequentatives, on the other
hand, represent a novel category in Rumungro: the existence of suffixal
frequentatives in Hungarian has triggered the development of the Romani
transitive suffix -ker- ~ -ger- into a valency-neutral frequentative marker in
Rumungro, e.g. ťin- ‘to buy’ → ťin-ger- ‘to buy frequently’. Pattern replica-
tion from Hungarian is also responsible for the creation of a specific class
of preverbs (coverbs, verbal particles), i.e. free adverbial forms that encode
spatial, aktionsart or lexical modification of the verb. Many, though not all,
preverb constructions are exact translations of their Hungarian models, e.g.
téle thov- < le-tesz [downward put] ‘pass (e.g. an exam)’, but Rumungro ánde
sov- [inward sleep] vs Hungarian el+alsz- [away sleep] ‘fall asleep’.6 Most
preverbs arose through grammaticalization of pre-Hungarian spatial adverbs
and many, though not all, of these still retain their spatial functions as well
(cf. also Schrammel 2005). In addition, numerous preverbs are matter bor-
rowings from Hungarian (see Section 6).
In addition to lexical verbs, nouns, adjectives, and manner adverbs,
Rumungro has borrowed numerous function (or less lexical) words from its
different L2s. The modal particle of possibility šaj ‘can’ is likely to be of West
Iranian origin (Matras 2002: 196). Greek is the source of the cardinal numer-
als efta ‘seven’, ofto ‘eight’, ëňňa ‘nine’, and trianda ‘thirty’ and the ordinal
trito ‘third’; the quantifier buka ‘a little, a piece of’; the address particle more
‘hey, man!’; the temporal deictic particle paleg ‘then, after that’ (< ‘again’);
and the temporal adverb táha ‘tomorrow’. Serbian–Croatian provided the
quantifiers dosta ‘enough’, sako ‘every’, and cilo ‘whole’; the distributive
particle po; the optative/permissive particle nek ‘let’, which has also been
grammaticalized into a subordinator (cf. 8e, f); the focus particle ni ‘not
even, neither’ and the related coordinator ni – ni ‘neither – nor’; the negative
pronoun ništa ‘nothing’; and the preverb préku ‘through; across, over’, which
has been grammaticalized within Rumungro from a borrowed spatial adverb
(cf. Section 5).
Most function words have been borrowed from Hungarian, the current L2.
Hungarian is the source of numerals (see below), the quantifier čepo ‘few, lit-
tle; a few, a little’ (< ‘a drop of’), the degree words igën ‘very, very much’
and túl ‘too, too much’, the generic obligative particle musaj ‘one has to’,
numerous preverbs (e.g. át ‘through; across, over’ or sít ‘apart’), and a few
marginal postpositions (e.g. sërint ‘according to’ or fëlé ‘in the direction of’).
Rumungro commonly borrows inflectional forms of Hungarian nominals, in-
cluding pro-words, which function as adverbs in the recipient language, e.g.
aňňira ‘to that extent’ (< sublative of annyi ‘that much’), magátú ‘by oneself’
(< ablative of the reflexive–emphatic pronoun maga), idëgembë ‘abroad’ (<
inessive/illative of idegen ‘foreign country’). Especially temporal adverbs
of this kind are abundant, e.g. akármikor ‘anytime whatsoever’, tavaskor
‘in the spring’ (< temporal case of akármi ‘anything whatsoever’ and tavasz
‘spring’), díbë ‘at noon’, márciušba ‘in March’ (< inessive of dél ‘noon’ and
március ‘March’), serdán ‘on Wednesday’ (< superessive of szerda ‘Wednes-
day’). Borrowing from Hungarian is extensive in discourse-related function
words, such as repetition adverbs (újbú or újra ‘again, anew’), utterance-level
adverbs (talán ‘perhaps’, bistoš ‘certainly’, përsë ‘of course, sure’, bizoň ‘in-
deed’), phasal adverbs (még ‘still’ and má ‘already’), focus particles (iš ‘also,
too’, čak ‘only’, ippën ‘just’, pont ‘exactly’, ëgís ‘entirely’), affirmative an-
swer particles (the regular hát ‘yes’, and the contrary-to-expectation dë ‘but
yes’), interjections (ëhë), fillers (hát), sequential discourse markers (no), and
more. Borrowed coordinators and subordinators are common and will be dis-
cussed in Section 8.
In addition to function words, Rumungro has borrowed several function-
word affixes, only some of which will be discussed here. The Greek-origin
suffix -t- derives regular ordinals from cardinal numerals, e.g. dúj ‘two’ →
dúj-t-o ‘second’. The Hungarian-origin suffix -tú marks separative orienta-
tion in local adverbs and posterior–durative relation in temporal adverbs,
e.g. ánglal ‘in/to the front’ → ánglal-tú ‘from the front’, ídž ‘yesterday’ →
idž-al-tú ‘since yesterday’. The South Slavic-origin prefix ni- and the Hungari-
an-origin prefixes vala-, akár-, and minden- apply to interrogative pro-words,
e.g. káj ‘where’ → ni-kháj ‘nowhere’ (negative), vala-káj ‘somewhere’ (spe-
cific indefinite), akár-káj ‘anywhere whatsoever’ (free-choice), and minden-
káj ‘everywhere’ (universal quantification). The Hungarian-origin prefixes
am- and uďan- apply to deictic pro-words, e.g. asso ‘such’ → am-asso ‘such
like the other’ (deictic contrast) and uďan-asso ‘just such like this/that one’
(deictic identity). All of the pronominal prefixes must have been borrowed
without the mediation of lexical borrowing.
There are also several instances of pattern replication from Hungarian in
function words. The genderless Hungarian is the source of gender neutral-
ization in the nominative of the Rumungro third-person singular pronoun:
the original feminine form ój ‘she’ has replaced the original masculine form
*óv ‘he’, assuming a gender-neutral function ‘s/he’ (cf. H ő ‘s/he’).7 On the
other hand, the development of a distinction between local pro-words of stat-
ive location and direction, e.g. káj ‘where’ vs kija ‘whither’, is likely to have
been modelled on an identical distinction in Hungarian. Due to a complex
interplay of pattern replication and internal re-analysis, the universal-quanti-
fication prefix sa- has developed as an alternative to the borrowed universal-
quantification prefix minden- (see above), e.g. sa-káj ‘everywhere’. Pattern
replication has also been involved in the grammaticalization of the reciprocal
pronoun jékh-ávr- [one-(an)other-] ‘each other’, which is a compound with
the same structure as the Hungarian reciprocal pronoun egy-más. The
encoding of the phasal expression ‘no longer’ as a negation of ‘already’ is clearly
modelled on Hungarian.8 In syntax, adnominal cardinal numerals (optionally
in case of ‘one’) have lost case agreement with their head nouns due to Hun-
garian influence, e.g. dúj (*dúj-e) muršenca [two (*two-obl) man.pl.soc]
‘with two men’.
A final note concerns borrowing of Hungarian numerals. Two types of
loans must be distinguished: morphologically integrated loanwords, which
have no inherited, pre-Hungarian alternative (the cardinals nulla ‘zero’, ëzeri
‘thousand’, and miliomo ‘million’, the ordinal e̋šéno ‘first’, and most frac-
tion numerals), and morphologically unintegrated loanwords, which alter-
nate with inherited numerals. The unintegrated numerals allow or require,
due to Hungarian influence, the singular of some of their head nouns, viz. of
some Hungarian-origin nouns denoting currency units: contrast pándžvárdeš
hallér-ja ‘fifty hellers’ (indigenous numeral, plural noun) with ëtvën hallér-i
‘fifty hellers’ (Hungarian numeral, singular noun) < ötven hallér. Note that
the latter construction is not necessarily a code-switch, as the singular noun
is morphologically adapted in Rumungro. The alternation between inherited
and borrowed expressions also concerns various de-numeral derivations and
compounds, e.g. tritóneste [third.loc.sg.m] or harmadikán (< Hungarian)
‘on the third [day of a month]’, eftavardešberšiko or hëtvënívëšno (< Hungar-
ian) ‘seventy-year-old’.
Linear order of the predicate, its arguments and adverbial adjuncts is flexible
in Romani, being largely determined by pragmatic factors (cf. Matras 1995,
2002: 167174). While syntactic non-configurationality is also characteris-
tic of Rumungro, numerous aspects of Rumungro clause-level order appear
to have been borrowed from Hungarian, likewise a non-configurational lan-
guage. A prominent example is the tendency to position focussed constituents
immediately before the finite verb; this frequently results in clause-final pos-
ition of the copula in non-verbal predications (4; second line).
(4) Odá hi gadžikano sokáši.
    that.m cop.pres.3 nonGypsy(a) habit
    Romano sokáši tista áver hi.
    Gypsy(a) habit sheer other cop.pres.3
    ‘That’s a nonGypsy habit. The Gypsy habit is completely different.’
On the other hand, linear order at the noun phrase level is syntactically
determined in Rumungro: all types of adjectival modifiers, including descrip-
tive adjectives, adnominal possessors, demonstratives, and numerals, always
precede their head nouns. While the modifier–noun order prevails in all
Romani dialects (cf. Matras 2002: 165–167), it has been fully grammaticalized
in Rumungro due to contact with Hungarian. The alternative noun–modifi-
er order is simply ungrammatical, except in cases of afterthought whereby
the postposed modifier is a substantivized apposition. Rumungro exhibits an
etymological split in the order of adpositions: while those borrowed from
Hungarian are postposed to their object noun phrases, adpositions of
pre-Hungarian origin always remain preposed.9 An analogous split occurs with
focus particles translatable as ‘also, too’: the indigenous te is preposed to the
focused element, while the Hungarian-origin iš is postposed.
8. Syntax
A number of clause-level syntactic features that Rumungro shares with
Hungarian are due to a typological or areal similarity between the two languages,
rather than due to immediate borrowing from Hungarian into Rumungro. For
example, both languages have uninflected pre-verbal negators, allow pro-
drop, and use a copula verb in non-verbal predication (though, unlike Hun-
garian, Rumungro does not allow copula deletion in the third-person present
affirmative). Rumungro also shares with Hungarian negative agreement of
the predicate with negative pro-words; this is clearly a post-Greek pattern in
Rumungro, though South Slavic is a more likely source than Hungarian.
The major structural domain of syntactic borrowing from Hungarian into
Rumungro is clause combining and phrase combining. Rumungro borrows
all of its coordinating conjunctions with the exception of conjunctive co-
ordinators, which are pre-Hungarian: plain disjunctive vaď ‘or’, contrastive
disjunctive vaď – vaď ‘either – or’, free-choice alternative ha – ha ‘whether
– or’, and several connectors with adversative and contrastive functions, e.g.
dë ‘but’, azomba ‘however’, mégiš ‘still, even so’, hanem ‘but rather’, and
meg and pëdig ‘but, in turn’ (5). Borrowed adverbial subordinators include
the causal mërt and mivël ‘since, because’ (6), and several non-simultaneous
temporal subordinators: the posterior mire and miëlét ‘before’, the posterior–
durative még ‘until’, and the anterior durative mióta ‘since’ (7).
(5) Dë ón na džan ánglal, hanem téle džan.
    conj 3pl neg go.3pl to.the.front conj downward go.3pl
    ‘But they are not progressing, they are rather sinking.’
(6) Mivël čoháni ssa,
    conj witch cop.3sg.pret
    na tromalahi and-i khangéri te džan.
    neg dare.3sg.rem in-def.f church(f) comp go.3pl.subj=inf
    ‘Since she was a witch, she did not dare to go to the church.’
(7) Mióta džukela hi amen, náne amen mačka.
    conj dog.pl cop.3.pres 1pl.acc cop.neg.3.pres 1pl.acc cat
    ‘Since we have kept dogs, we do not keep a cat.’
Clausal complements of predicates of utterance, propositional attitude,
(acquisition of) knowledge, immediate perception and the like, are
introduced by the Hungarian-origin general subordinator hoď (8a). As in
Hungarian, this subordinator is also employed to introduce several types of
adverbial clauses (8b: reason clause) and, optionally, embedded interrogative
clauses (8c) and embedded polar questions (8d). The latter are – obligatorily,
unless an alternative construction is used – marked by the question enclit-
ic -i, which is also borrowed from Hungarian. The subordinator hoď may
also precede various pre-Hungarian subordinators that introduce embedded
commands and other clausal complements of manipulative predicates (8e),
and purpose clauses (8f). Unlike in Hungarian, however, the subordinator
hoď cannot introduce such clauses by itself.
(8) a. Halíjom, hoď má n- aná le
       understand.pfv.1sg comp already neg bring.1sg.fut 3sg.m.acc
       uppe gódi.
       on brain
       ‘I understood that I will not persuade him any more.’
    b. Daráhi, hoď našlíja o lóvo.
       fear.1sg.rem comp get.lost.pfv.3sg def.m money(m)
       ‘I was afraid that the money had gotten lost.’
    c. Na šunde láčhe, (hoď) ko vičinel taj so.
       neg hear.pfv.3pl well comp who shout.3sg and what
       ‘They did not hear well who was shouting and what.’
    d. Na džanav, (hoď) muká -i man tutar te
       neg know.1sg comp let.1sg.fut -q 1sg.acc 2sg.abl comp
       čumiden.
       kiss.3pl.subj=inf
       ‘I do not know whether I will let you kiss me.’
    e. Phenďa mange, (hoď) khére nek áčhovav.
       say.pfv.3sg 1sg.dat comp at.home opt stay.1sg.subj
       ‘S/he told me to stay at home.’
    f. Site le papaleg uppe alakhes, (hoď) káj nek
       must 3sg.m.acc again upward find.2sg comp where opt
       džanesahi le te phenen.
       know.2sg.rem 3sg.m.acc comp say.3pl.subj=inf
       ‘You have to discover it again, in order to be able to say it.’
Due to pattern replication from West Iranian or Greek, complement
clauses of modal predicates were finite in the early European stages of
Romani: the subordinate verb was introduced by an indigenous non-factual
complementizer and showed subject person–number agreement with the matrix
verb (Matras 2002: 161). Pattern replication from Hungarian has resulted in
a development of a non-finite complement form in Rumungro, through fos-
silization of a frequent finite form of the subordinate verb: the subordinate
verb now invariably shows third plural subjunctive inflections, irrespective
of the person–number of the matrix verb. This non-finite construction, which
may be termed the subjunctive infinitive (or the “new” infinitive, Boretzky
1996), encodes not only clausal complements of modal predicates (cf. 6, 8f)
but also clausal complements of some manipulative verbs (cf. 8d) and tightly
integrated same-subject purpose clauses. The Hungarian-origin infinitive in
-ňi (see Section 5) is used in the same syntactic contexts as the subjunctive
infinitive (9a–c). Like the infinitive in Hungarian, the Hungarian-origin in-
finitive in Rumungro does not allow any complementizer.
(9) a. Kam-áhi dógoz-ňi.
want-1sg.rem work(v)-inf
‘I would like to work.’
b. Muk-j-a l-a ďij-áz-ňi
leave-pfv-3sg 3sg-f.acc song-v-inf
‘S/he let her sing.’
c. Dža-s huhur-áz-ňi.
go-1pl mushroom-v-inf
‘We are going to go and collect mushrooms.’
Pattern replication from Hungarian has also occurred in relative clauses.
Although Selice Rumungro relativizers are formally identical to interroga-
tives, whereas Hungarian relativizers are not, the former partly copy the
“ontological” restrictions of the latter: human head nouns usually select a
person pro-word (‘who’) as a relativizer in Rumungro, while non-human
head nouns mostly select a thing pro-word (‘what’).
9. Lexicon
Out of a much larger inventory of early loanwords into Romani (as attest-
ed in different Romani dialects), Rumungro of Selice retains ca. 20 loan-
words from Iranian languages, ca. 10 loanwords from Armenian, and over 35
loanwords from Greek. In addition, there are over 30 loanwords from South
Slavic, which are mostly not shared with other dialects of Romani. Most of
the pre-Hungarian loanwords are nouns, while verbs and adjectives are less
numerous; only relatively few pre-Hungarian function loanwords have been
retained (see Section 6). While there are a few stable noun loanwords from
the secondary L2s of Selice Rumungro speakers (e.g. pepšo ‘black pepper’
from Czech), and while nonce borrowing of nouns and verbs from these lan-
guages is rather common, by far the most important current source of loan-
words is Hungarian.
Hungarian loanwords include basic vocabulary in domains such as
body parts, bodily functions, kinship, or physical properties (e.g. ‘knee’,
‘to breathe’, ‘son-in-law’, ‘weak’). Unlike some Romani varieties that em-
ploy internal word-formation processes to create a layer of secret vocabu-
lary in certain semantic domains (cf. Matras 2002: 223), Rumungro does not
seem to avoid loanwords (such as čëndéri ‘policeman’) in these domains.
Instances of pattern replication without matter borrowing in complex re-
ferring expressions are exceptional, e.g. sobota-kurko [Saturday-Sunday]
‘weekend’ calquing local Hungarian szombat-vasárnap. An overwhelm-
ing majority of Hungarian compounds are borrowed rather than translated,
e.g. fog-orvoš-i ‘dentist’ < fog-orvos [tooth-doctor], though translations of
lexicalized preverb–verb collocations are common (see Section 5). Some
Hungarian compounds may be decomposed into adjective–noun colloca-
tions, e.g. világ-ik-o háború [world-adj-nom.sg.m war] ‘world war’ < világ-
háború [world-war].
Phraseological idioms are commonly translated from Hungarian; for an
example see (3). As several Hungarian types of greetings and similar expres-
sions are missing in the traditional Rumungro culture, some speakers have
started to fill in the “gap” by using Hungarian expressions, e.g. szia ‘hi; bye’,
jó étvágyat ‘bon appetit’. Some indigenous politeness expressions are used
in wider contexts due to cultural contact. For example, palikerav ‘I thank;
I greet’ is not traditionally used after being served a meal or coffee at home,
but some Rumungri would now use it in this context, as the local Hungar-
ians do.
10. Conclusions
The sociolinguistic situation of all Romani varieties is highly favourable to
contact-induced developments, since almost all Romani speakers are bilin-
gual in the relatively prestigious languages of the dominant “matrix” popula-
tions and since, at the same time, Romani linguistic ideologies are relatively
tolerant of borrowing in most functional domains. Moreover, the long-settled
Roms of the Hungarian regions of Slovakia have developed a strong orienta-
tion towards Hungarian cultural models, which facilitates Hungarian-induced
linguistic changes in the few extant Hungarian Rumungro varieties, including
Rumungro of Selice. This concluding section is an overview of various types
of contact-induced developments that have affected the structure of this par-
ticular Romani variety.
Both matter borrowing and pattern replication are well attested in Ru-
mungro. Lexical matter borrowing, i.e. borrowing of syntactically free sym-
bolic form–function units, is common in Rumungro with all grammatical
classes of content words (verbs, nouns, adjectives, and manner adverbs) and
with most classes of function words. Borrowed adpositions are rare, however,
and there is no matter borrowing of personal pronouns or of the definite art-
icle. Also, only adverbial categories of reflexive, deictic, interrogative, and
indefinite pro-words are lexically borrowed (so not, for example, adnominal
or pronominal demonstratives).
Lexical matter borrowing of paradigmatically related pairs (or sets) of
words may result in what I have termed affix extraction, i.e. indirect or lex-
ical affix borrowing. Note that affix extraction assumes not only an adoption
of an affix within loanwords from a certain L2 and its (potential) paradig-
matic identification, but also its analogical, language-internal, extension to
other etymological compartments within the L1 lexicon. There are numer-
ous instances in Rumungro of lexical matter borrowing of morphologically
complex Hungarian words (e.g. derived frequentative verbs) whose affixes do
not extend to non-Hungarian bases, and which are therefore not considered
to be instances of affix borrowing. Categories whose affixal markers did get
extracted in Rumungro are nevertheless numerous, and include derivational
as well as inflectional categories. Extracted affixes of pre-Hungarian origin
are: nominative noun inflections; a passive participle marker; non-inflection-
al loan-verb and loan-adjective adaptation markers; and markers deriving:
feminine human nouns; relational and attenuative adjectives; ethnic adverbs;
and ordinal numerals. Affixes extracted from Hungarian loanwords include:
an infinitive inflection; a non-inflectional loan-verb adaptation marker; and
markers deriving: action and artificial nouns; active de-verbal adjectives; de-
nominal and causative verbs; similative adverbs; and several unproductive
derivational markers.
The patterns of analogical extension of the extracted affixes to different
etymological compartments of the Rumungro lexicon are rather varied, and
they are not discussed in any detail in this chapter. I should only like to
point out here that several extracted affixes appear to have been “activated”
to apply to loanwords from a chronologically following L2. For example,
the South Slavic-origin suffixes -av-, -n-, and -kiň- apply to Hungarian loan-
words (see Section 4), and the Hungarian-origin suffix -ál- applies to Slovak
and Czech loanwords (see Section 5). Also left out of discussion were the
details of various processes of language-internal re-analysis that are involved
in extraction. For example, the Rumungro de-nominal verb-deriving suffix
-áz- (see Section 5) does not correspond to any allomorph of its Hungarian
source, as its extraction involved a re-analysis of its boundary, e.g. ciga-
rett-áz-in- ‘to smoke cigarettes’ ← cigarett-a ‘cigarette’ (< cigarettá-z ←
cigaretta).
Although affix extraction is the source of a greater part of borrowed Ru-
mungro affixes, there are also affixes the borrowing of which appears not
to have been mediated by lexical matter borrowing. This is the case of the
Hungarian-origin superlative prefix (see Section 4) and of several pronom-
inal prefixes of Hungarian and South Slavic origin (see Section 6). These af-
fixes must have been borrowed directly, “by themselves”, since there are no
paradigmatically related pairs of lexical borrowings that could have served
as a source of their extraction. The process of direct affix borrowing, whose
possibility is sometimes claimed to be in need of demonstration (cf. Winford
2003: 6164), appears to be restrained by certain structural factors. Note
especially that direct affix borrowing only takes place in Rumungro when
the resulting morphological construction is, in effect, a “semicalque” on a
semantically equivalent L2 construction. For example, the Rumungro pro-
word vala-káj ‘somewhere’ consists of a directly borrowed indefiniteness
prefix and an indigenous local interrogative base, which “calques” the local
interrogative base of the Hungarian model vala-hol.
Several types of selective borrowing are attested in Rumungro. First, only
some inflectional forms of Hungarian nominals may be borrowed without
a parallel borrowing of the base forms of these nominals, e.g. the sublative
új-ra ‘again, anew’ (lit. ‘onto a/the new one’) but not the nominative *új
‘new’. Not surprisingly, the borrowed inflectional forms are those that fulfil
adverbial or discourse-related functions. Second, only some allomorphs of
an affix, or alternatives within an affix paradigm, may be borrowed. While
structural factors such as the degree of transparency in the source language
are known to play a role here (Winford 2003: 91–97), sometimes functional
factors are clearly involved as well. For example, Rumungro borrows the
distal deictic-contrast prefix am- from Hungarian (see Section 6) without
borrowing its proximal, equally transparent, counterpart em-. Finally, several
function words are borrowed only in some of their source functions. Some-
times differences in the distribution of the source word appear to be respon-
sible for selective borrowing, as in the case of the Hungarian question clit-
ic (see Section 8). In other instances, however, selective borrowing reveals
functional motivations. For example, the Hungarian coordinators meg and
pëdig ‘and; but, in turn’ have only been borrowed in their adversative uses, in
which meg is postpositive and pëdig prepositive (see Section 8). Their con-
junctive uses, in which meg is prepositive and pëdig postpositive, are unat-
tested in my Selice Rumungro text corpus. Selective borrowing confirms that
borrowing is motivated by functional, as well as structural, factors.
In addition to matter borrowing, Rumungro frequently replicates gram-
matical patterns (constructions and categories) of its L2s without necessarily
borrowing the linguistic matter that encodes these patterns. As discussed in
Section 3, Greek was the major structural model for Romani in this respect.
There are not many constructions in present-day Rumungro whose origin is
South Slavic: the negative agreement with, and the de-interrogative structure
of, negative pro-words are rare examples. Numerous syntactic patterns, on
the other hand, have been modelled on Hungarian, the current L2: the so-
called subjunctive infinitive; the syntactic category of preverbs and many
individual preverb constructions; encoding of various case relations; absence
of case agreement in numeral constructions; negation of phasal adverbs;
ontological restrictions on relativizers; certain pragmatic and syntactic as-
pects of linear constituent order; and more. Pure replication of morphologic-
al constructions is rare, being represented especially by occasional transla-
tions of Hungarian compounds, including the reciprocal pronoun. However,
replication from Hungarian is responsible for the creation or elaboration of
some morphological categories (associative plurals in nouns, frequentatives
in verbs, and orientation in spatial adpositions and pro-words) and for the re-
duction of others (gender in anaphoric pronouns and feminine derivation in
nouns denoting animals). Also, the retention and productivity of some inher-
ited morphological categories (degree in adjectives and causatives in verbs)
are likely to have been motivated by pattern replication from Hungarian.
Matter borrowing and pattern replication frequently go hand in hand,
conspiring, so to speak, to make the L1 more like the L2. To mention some
examples: Hungarian-origin adpositions retain their postpositioning in Ru-
mungro; unintegrated Hungarian numerals tend to retain their property of re-
quiring singular head nouns; the replicated category of preverbs is enhanced
by a few lexically borrowed members; direct affix borrowing results in “sem-
icalques” of the model constructions (as discussed above); new phonemes,
which are first adopted within loanwords, may be later extended to other
lexical compartments, copying to some extent the phonotactics and morpho-
phonological rules of the model language; and so on. However, matter bor-
rowing and pattern replication may also result in competing constructions,
as in the case of the two Rumungro infinitives (see Section 8), one of which
(the Hungarian-origin infinitive) does not allow a complementizer, while the
other (the replicated subjunctive infinitive) requires one.
Abbreviations
1      first person
2      second person
3      third person
abl    ablative
acc    accusative
adj    adjective(-deriving marker)
comp   complementizer
cond   conditional
conj   conjunction
cop    copula
dat    dative
def    definite article or conjugation
emph   reflexive–emphatic pronoun
f      feminine
fut    future
gen    genitive
imp    imperative
inf    infinitive
itr    intransitive
loan   loan-verb adaptation marker
loc    locative
m      masculine
n      noun
neg    negator or negative form
nom    nominative
obl    oblique
opt    optative–permissive particle
pfv    perfective inflection or particle
pl     plural
poss   possessive
pres   present
pret   preterite
q      question particle
rem    remote tense
sg     singular
soc    sociative
subj   subjunctive
tr     transitive
v      verb(-deriving marker)
Notes
1. The chapter is based on my linguistic research on Hungarian Rumungro that was
carried out during short but numerous fieldtrips to Selice, Slovakia, between
1997 and 2007. I wish to thank the late Milena Hübschmannová for introducing
me to the Selice Rom community; Július Lakatoš and Alena Krészová for their
hospitality and native speaker expertise; the Roma Culture Initiative of the
Open Society Institute, Budapest, for their financial support of my Rumungro
research in 2001–2002; and Adéla Gálová for her help with Hungarian example
sentences. The descriptive sources on Hungarian that I have consulted include
Abondolo (1988), Kenesei, Vago, and Fenyvesi (1998), Siptár and Törkenczy
(2000), and Tompa (1968).
2. Although all Rumungro varieties have been influenced by Hungarian, most
Rumungro speakers presently live in ethnically Slovak parts of Slovakia and
are Slovak bilinguals; an overwhelming majority of Rumungro communities in
Hungary and in the Hungarian parts of Slovakia have undergone language shift
to Hungarian.
3. The Rumungro associative plurals are similar in form to nominative plural agree-
ment (Suffixaufnahme) forms of inflectional genitives of the respective nouns,
though they differ from them in some interesting structural details (see Elšík and
Matras 2006: 322323).
4. The Greek-origin suffix *-(V)s-, which appears to have been the marker of perfec-
tive adaptation of all loan-verbs and of non-perfective adaptation of intransitive
loan-verbs (Matras 2002: 130), has acquired novel functions in Rumungro: it is
now an integral part of the suffix -(i)sal-, which serves as a stem extension in sev-
eral valency-changing or aktionsart derivations, e.g. cid-isaj-ov- ‘to stretch itr’
(anticausative) ← cid- ‘to pull’, térň-isaj-ár- ‘to make young’ (factitive) ← térn-o
‘young’, khand-isaj-ov- ‘to stink intensively’ (intensive) ← khand- ‘to stink’.
5. Although Kenesei, Vago, and Fenyvesi (1998: 357–358) describe the Hungarian
suffix -ál- as a de-nominal verb-deriving marker, it is in fact a verb-adapting suf-
fix, which is synchronically distinct from the de-nominal verb-deriving suffix
-(V)l.
5. When preposed to the verb they modify, Hungarian preverbs are orthographic
prefixes. Nevertheless, they are syntactically free elements.
6. However, oblique case forms of the pronoun have remained differentiated for
gender, e.g. the accusative le ‘him’ vs la ‘her’ (cf. Hungarian őt ‘him, her’).
7. The expression of ‘not yet’ as a negation of ‘still’ is congruent with Hungarian,
but is likely to be pre-Hungarian.
8. This contrasts with the contact-induced postpositioning of inherited prepositions
in some Romani dialects influenced by postpositional languages such as Turkish
or Finnish (cf. Matras 2002: 206).
References
Abondolo, Daniel
1988 Hungarian Inflectional Morphology. Budapest: Akadémiai Kiadó.
Boretzky, Norbert
1996 The “new infinitive” in Romani. Journal of the Gypsy Lore Society,
Fifth Series, 6, 151.
1999 Die Gliederung der Zentralen Dialekte und die Beziehungen zwischen
Südlichen Zentralen Dialekten (Romungro) und Südbalkanischen
Romani-Dialekten. In: Halwachs and Menz (1999: 210–276).
Elšík, Viktor, Milena Hübschmannová, and Hana Šebková
1999 The Southern Central (ahi-imperfect) Romani dialects of Slovakia and
northern Hungary. In: Halwachs and Menz (1999: 277–390).
Elšík, Viktor, and Yaron Matras
2006 Markedness and Language Change: The Romani Sample. Berlin:
Mouton de Gruyter.
Halwachs, Dieter W., and Florian Menz (eds.)
1999 Die Sprache der Roma. Perspektiven der Romani-Forschung in Öster-
reich im interdisziplinären und internationalen Kontext. Klagenfurt:
Drava.
Hübschmannová, Milena, and Vít Bubeník
1997 Causatives in Slovak and Hungarian Romani. In: Yaron Matras, Peter
Bakker, and Hristo Kyuchukov (eds.) The Typology and Dialectology
of Romani, 133145. Amsterdam/Philadelphia: John Benjamins.
Kenesei, István, Robert M. Vago, and Anna Fenyvesi
1998 Hungarian. London/New York: Routledge.
Matras, Yaron
1994 Structural balkanisms in Romani. In: Norbert Reiter, Uwe Hinrichs,
and Jiřina van Leeuwen-Turnovcová (eds.), Sprachlicher Standard und
Substandard in Südosteuropa und Osteuropa. Wiesbaden: Harrasso-
witz. 195–210.
1995 Connective (VS) word order in Romani. Sprachtypologie und Univer-
salienforschung 48 (1): 189–203.
2002 Romani: A Linguistic Introduction. Cambridge: Cambridge University
Press.
Schrammel, Barbara
2005 Borrowed verbal particles and prefixes: A comparative approach. In:
Barbara Schrammel, Dieter W. Halwachs, and Gerd Ambrosch (eds.),
General and Applied Romani Linguistics: Proceedings from the 6th
International Conference on Romani Linguistics, 99–113. Munich:
Lincom Europa.
Siptár, Péter, and Miklós Törkenczy
2000 The Phonology of Hungarian. Oxford: Oxford University Press.
Tompa, József
1968 Ungarische Grammatik. The Hague: Mouton.
Winford, Donald
2003 An Introduction to Contact Linguistics. Oxford: Blackwell.
Grammatical borrowing in Manange
Kristine A. Hildebrandt
1. Background
Manange, also known by its endonym ŋjeshaŋ, ŋjeshaŋte, or ŋjaŋmi ‘our lan-
guage/our people,’ is a Bodish language of the Bodic subphylum of Tibeto-
Burman. It is spoken in northern central Nepal, and it is grouped with other
Tamangic (or ‘Gurungic’ or ‘TGTM’) languages, shown in Figure 1 (van
Driem 2001; Bradley 1997; Noonan 2003).1
Manange is spoken by members of a single ethnic group of under 5,000
speakers, located in the northern Manang district. Geographically, Manang
is known as the Inner Himalayan Valley, as it is surrounded to the south, east
and west by the Annapurna mountain range.
Manang is culturally and linguistically heterogeneous, divided into three
ethnic group areas: Gyasumdo to the south, the high elevation Nar valley to the
north, and the upper ŋjeshaŋ valley in the west (Snellgrove 1961). Although
Manange peoples live in all portions of the Manang District, the ŋjeshaŋ val-
ley is considered the traditional area of Manange habitation. Both Gurungs
and Mananges (or Manangis, Manangpas, Manangbas and Manangbhots by
Indic peoples) are the dominant ethno-linguistic groups of Manang.
Tibeto-Burman
  Bodish
    Tibetan Complex (incl. Sherpa)
    Ghale
    “Tamangic”
      Tamang, Gurung, Thakali, Manange, Nar/Phu,
      Seke, Kaike, Gyalsumdo, Chantyal
Figure 1. Genetic affiliation of Manange
In terms of endangerment status, the Manange language can currently be
considered as small but relatively viable, with some prospect for endanger-
ment (using Kincade 1991 as a model). Although the speaker population is
under 5,000, there seems to be continued transmission of Manange to younger
generations (albeit bilingual), combined with some small-scale displacement
via emigration of some generations of speakers from traditional Manang to
urban Kathmandu. Factors contributing to an observed small-scale loss of
Manange include the rise of access to formal education in Nepali, as well as
the general prestige of Nepali in terms of socio-economic advancement. Fac-
tors contributing to retention of the language include positive within-ethnic
group identity and various prestige factors, including the comparative wealth
and social status that Mananges have accrued as entrepreneurs.
The history of language contact in Nepal is complex, and the results of
this long-term inter-mingling of languages have had varying consequences
on typological and genetic features of different Tibeto-Burman languages
located there. Noonan (2003) charts the different types of grammatical bor-
rowing in a number of Tibeto-Burman languages from different sub-phyla.
Of the three main types of contact scenarios, the oldest situation is between
Himalayish languages (including Kiranti, Kham, Magar, Chepang, Newar,
and others), whose speakers have been long-time residents of Nepal. A more
recent type of contact is between speakers of the Tibetan-type languages of
the Bodish sub-group, including Manange (i.e. within-family contact). These
peoples are more recent immigrants to Nepal, having migrated within the
last two millennia, and occupying territories that are in close proximity. A
still more recent, and different, type of contact situation in Nepal has been
between speakers of Tibeto-Burman languages and Indo-European languages
like Nepali. Although Nepali was already well-established in western Nepal,
there has been more recent contact of this third type in eastern and cen-
tral Nepal. Now, as the influence of Nepali (and perhaps other non-Tibeto-
Burman languages) spreads throughout Nepal, cross-family contact is as
likely as (or more likely than) within-family contact.
Despite the rather (geographically) remote location of the Manang vil-
lages, there is evidence that Mananges have been in regular contact with
speakers of other languages (Indic and Sinitic) for a long time. In 1956, David
Snellgrove, of the School of Oriental and African Studies, undertook a six
month journey through Nepal to update map information originating from the
Survey of India and to study Buddhist art and scriptures. He spent some time
in the Manang District, and he was initially intrigued by the lack of surprise
displayed by Mananges when they first encountered him. He also noted that
Manange youths “spoke Nepali willingly and fluently” (1961: 205). Snell-
grove also noticed silks from mainland China and Singapore adorning the
walls of local gompa buildings, suggesting some trade-oriented contact with
other Asian peoples. Snellgrove soon learned that Mananges had significantly
more contact with the world beyond the Nepalese borders than did many
other indigenous groups, holding posts in the Indian Army and having unique
travel rights to Malaysia and Singapore.
In recent generations, it has become commonplace for many Mananges
to migrate temporarily or permanently to the Kathmandu valley, or to lower
elevations within Manang during winter, to benefit from longer growing sea-
sons. In winter, women and children especially stay in low-elevation vil-
lages where Nepali (Indo-European) is spoken, and men may travel to other
regions in Nepal or to India (or beyond) for work (Rogers 2004). Although
some Mananges (and other peoples) do remain in Manang year-round, this
number seems to be declining as the years go by. As a result, for part of the
year, many Mananges are surrounded by, and use, Nepali either in urban
Kathmandu or in other lower elevation villages in Manang.
Another relevant factor for Manange language contact with Nepali is edu-
cation (Hildebrandt 2003, 2006). There is one school in each larger Manang
village nowadays, and instruction is in Nepali. In addition, a number of adults
who live in Manang (traditionally men, but increasingly women too) have
had some education either in Kathmandu or abroad. These opportunities for
formal education have led to frequent and long-term contact with other lan-
guages, like Nepali, Hindi and, increasingly, English.
Recently, Manang has become a tourist hot-spot because the popular “An-
napurna Long Circuit” bisects the district. As a result, a tourist-driven econ-
omy has emerged where wealthy Mananges build elaborate lodges to host
foreign trekkers. Other related tourist-oriented businesses have grown in the
area, including guided tour operations, porter services, and a solar-powered
cyber-cafe. Some aspects of this new economy are grounded in Nepali lan-
guage use (e.g. interaction with tour guides and porters), and so the economic
benefit of speaking Nepali has grown there.
Not all Mananges benefit equally from this new trekking economy. Many
Mananges still live traditional, subsistence-farming lives, usually because
they live in areas that are too far off the main trekking route to benefit from the
tourist industry in the way that more strategically located residents can. These
Mananges claim to use Nepali only sometimes (e.g. with outside visitors).
Another observation is the recent immigration of Tibetans, Lhomis and
Nar-Phus to the Manang villages. They have come to Manang in search of
better economic opportunity; they rent vacated (Manange-owned) houses
and farm the land in a kind of share-cropping situation. Mananges report
that these new residents adopt Manange for local use, or else use Nepali with
them. My own (limited) interaction with these people has been in the Man-
ange language, and not in their traditional languages, nor in Nepali.
This report focuses on the one-way effects in Manange of language con-
tact with Nepali. Although there is preliminary evidence of contact-induced
changes between Manange and other T-B languages (e.g. some lexical bor-
rowing from unknown dialects of Tibetan), the effects of contact with Nepali
are easier to pinpoint and document. Further investigations can reveal the
potential effects of this contact on the Nepali spoken by different segments
of the diverse Manange ethnic community.
2. Phonology
The phonological structure of Manange in many ways typifies that of the Bod-
ish languages: there is no contrastive voicing opposition for obstruents, there
is an alveolar and post-alveolar opposition in consonant place of articulation,
and there is a velar nasal in word-initial position (e.g. ⁴ŋi ‘two’; ¹ŋʌ ‘1.sg’).
Manange also has voiceless plain and aspirated retroflex plosives, which, while
contrastive in word-initial position in basic vocabulary (e.g. ¹ʈu ‘sit/stay’ vs.
⁴ʈu ‘six’), are still marginal in overall lexicon frequency. There is consider-
able evidence that the retroflex is one of the more reliable features of South
Asia as a linguistic area (Masica 2001; Noonan 2003). It has probably entered
into the Bodish languages via contact with Indic languages (which in turn
acquired it from Dravidian).
The most interesting case of contact-induced structural change in Manange
phonology is not obviously borrowing from Nepali, but rather a case of loss
or simplification (likely via analogical leveling). This has been documented in
Hildebrandt (2003, 2004) as a phonetic and phonological merger
of the tone system. The properties of the tone system employed by more lin-
guistically conservative speakers are as follows. All words (both native and old
loans, both mono- and disyllabic) fall into one of four tones, illustrated in (1).
As (1) shows, tone /1/ and /2/ words show low and high pitches, respectively.
The words from the two (falling) contour tones have an additional defining
property in that with tone /3/ words, if the initial consonant is an obstruent,
it is unaspirated, and with tone /4/ words, the initial obstruent is aspirated.
However, this distinction is not retained with sonorant-initial words, which
are found in all four tones without any aspiration or voicing differences (e.g.
the near-minimal set ¹ŋje ‘chew’; ²ŋi ‘seven’; ³ŋje ‘milk’; ⁴ŋje ‘spill’).

(1) Manange tones

Tone  Pitch properties    Initial onset consonant properties  Example
1     Low Level           N/A                                 ʈu ‘sit/stay’
2     High Level          N/A                                 ʈu ‘thread’
3     Very High Falling   Unaspirated if Obstruent            ʈu ‘cereal’
4     Mid-High Falling    Aspirated if Obstruent              ʈu ‘six’
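The onset restrictions summarized in (1) can be restated as a small lookup, offered here only as an illustrative sketch. The names `TONES` and `onset_is_licit` are hypothetical and not part of the source analysis. The sketch encodes just what the table states: tones 3 and 4 constrain the aspiration of obstruent onsets, tones 1 and 2 do not, and sonorant onsets occur in all four tones.

```python
# Illustrative sketch of the conservative (rural) tone system in (1).
# All names are hypothetical; only the table's generalizations are encoded.

TONES = {
    1: {"pitch": "low level", "obstruent_onset": None},
    2: {"pitch": "high level", "obstruent_onset": None},
    3: {"pitch": "very high falling", "obstruent_onset": "unaspirated"},
    4: {"pitch": "mid-high falling", "obstruent_onset": "aspirated"},
}


def onset_is_licit(tone: int, onset_type: str, aspirated: bool = False) -> bool:
    """Return True if the onset is compatible with the tone, per table (1).

    onset_type is "obstruent" or "sonorant".
    """
    if tone not in TONES:
        raise ValueError(f"unknown tone: {tone}")
    if onset_type == "sonorant":
        # Sonorant-initial words occur in all four tones.
        return True
    required = TONES[tone]["obstruent_onset"]
    if required is None:
        # Tones 1 and 2: the table states no onset restriction (N/A).
        return True
    # Tones 3 and 4: aspiration must match the tone's requirement.
    return aspirated == (required == "aspirated")
```

Under these assumptions, an aspirated obstruent onset is accepted for tone 4 and rejected for tone 3, while a sonorant onset is accepted for every tone.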
With Mananges who have grown up in a more intense environment of
Nepali bilingualism (mainly those Mananges who were born/raised in Kath-
mandu), the structure of the tone system shows marked changes. Hildebrandt
(2003) demonstrates that urban speakers show a large-scale phonetic merger
of the two contour tones into a two-way high-low opposition. In addition,
the conceptualization and grouping of words into different melody groups is
considerably fuzzier than for rural Mananges of the same age group.
It is not at all obvious that the altered tone system is simply one symp-
tom of a larger process of language loss (i.e. shift to Nepali). Manange in
the urban environments appears to be maintained in a situation of diglossia,
whereby its place in Manange life is firmly rooted in domestic, private envir-
onments, while Nepali is the language of necessity in public domains. Never-
theless, this structural result (along with others described below) appears to
be a consequence of such a maintenance scenario.
It is also not obvious that urban Manange is borrowing anything from
Nepali phonology. Nepali has no tone, and in fact has a four-way obstruent
voicing distinction (voiceless plain, voiceless aspirated, voiced plain and
voiced aspirated). There is no evidence that urban Mananges are incorporat-
ing obstruent onset voicing into their production of Manange. Rather, lexical
frequency may play a role in determining which words evidence phonetic
pitch merger and in determining the pitch properties of the emergent two-way
system.
There is some evidence of a possibly emergent iambic stress pattern (non-
contrastive) in Manange words, perhaps via contact with Nepali. In Nepali
polysyllabic words, (phonetic) stress is initial if all syllables of the word are of
equal syllable weight (Acharya 1991). If non-initial syllables are of certain
(progressively heavier) weight, then stress falls right-ward, suggesting an
iambic tendency.
In Himalayish T-B languages, stress and tone favor an overall trochaic
(initial) pattern (cf. Bickel 1998; 2003). For example, in Kiranti languages,
main stress is almost always initial. In Bodish languages, the tone feature
of the initial element (syllable, morpheme) is carried across all other bound
units. In Manange, the tone pattern retains this initial/trochaic preference,
and most disyllabic words carry a main stress on the initial syllable. But there
are also some disyllabic words (nouns) that are clearly stressed on the final syllable,
as in example (2), with pitch re-set, vowel amplitude/intensity and duration
as indicators of this stress (cf. Hildebrandt 2003; 2004).
(2) Final Main Stress

Form          Meaning
²nja.ˈta      ‘chain’
²to.ˈsoŋ      ‘now’
³ŋjo.ˈkroŋ    ‘breast’
³to.ˈɾe       ‘grave’
⁴ko.ˈʈe       ‘button’
²tʃep.ˈkjel   ‘vulture’
³ŋo.ˈkɾoŋ     ‘forehead’
I treat the different patterns of case-marking in Manange in this section, even
though it is essentially a topic of argument structure alignment, and appears
in the typology section of the database. Manange is like the other Bodish
(Himalayish) and Indo-European languages of the South Asian linguistic
area in showing some variation of a (split) ergative–absolutive alignment
pattern (cf. Masica’s ‘ergative belt’ 2001: 250251). Split ergativity is actu-
ally reconstructed by DeLancey (1989) back to Proto T-B, so its presence in
Manange is not necessarily attributed to contact with Nepali. However, the
different patterns of ergative case marking in the rural and urban Manange
communities is of interest.
Rural Mananges show a pattern in their speech of split ergativity that
aligns with modality. The A argument of a transitive verb in realis mode (i.e.
perfective and perfective progressive aspects, simple present ‘tense’) hosts
the ergative enclitic, while in irrealis mode (future, immediates, deontics,
etc.) it is absolutive (zero) marked, as shown in examples (3)–(5) (examples
from Hildebrandt 2004: 99–100).
(3) Realis
1mriŋ=tse 2naka 2puŋ 2kol-tsi.
woman=ERG chicken egg boil-perf
‘The woman boiled the egg.’
(4) Irrealis (future)
1mriŋ*=tse 2naka 2puŋ 2kol(-pʌ).
woman*=ERG chicken egg boil(-nom)
‘The woman will boil the egg.’
(5) Irrealis (immediate)
1ŋʌ*=tse 1nʌkju=ri 2prim-pi 1lʌ-tsi
1(sg)*=ERG dog=loc hit/kick-imm do-perf
‘I prepared to/was about to hit/kick the dog.’
With urban Mananges, there is no such split. The A arguments of transitive
verbs host the =tse enclitic, regardless of any aspect or modality
distinctions.
In this sense, the urban ergative–absolutive pattern differs both from rural
Manange and from Nepali, as Nepali has a split-ergative system aligning
with aspect (A arguments of perfective-marked transitive verbs show -ley
ergative marking while those of imperfective verbs show absolutive/zero
marking). It appears then that there is a process of overgeneralization of
ergative case-marking for urban Mananges, whereby it has become a general
marker of transitivity.
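The three marking patterns just described can be summarized schematically. The following Python sketch is purely illustrative: the function name, the variety and category labels, and the grouping of categories into sets are assumptions introduced here for exposition, and the treatment of Nepali is reduced to a simple perfective/imperfective contrast.

```python
# Illustrative summary of A-argument marking on transitive verbs.
# All names and labels below are hypothetical, chosen for this sketch only.

# Rural Manange groups aspect/mood categories into realis and irrealis.
REALIS = {"perfective", "perfective progressive", "simple present"}
IRREALIS = {"future", "immediate", "deontic"}


def a_argument_marking(variety: str, category: str) -> str:
    """Return the marking expected on the A argument of a transitive verb."""
    if variety == "urban Manange":
        # No split: the ergative enclitic appears regardless of category.
        return "=tse"
    if variety == "rural Manange":
        # Split aligned with modality.
        if category in REALIS:
            return "=tse"
        if category in IRREALIS:
            return "zero"
        raise ValueError(f"unclassified category: {category}")
    if variety == "Nepali":
        # Split aligned with aspect (simplified to two values here).
        if category == "perfective":
            return "-ley"
        if category == "imperfective":
            return "zero"
        raise ValueError(f"unclassified category: {category}")
    raise ValueError(f"unknown variety: {variety}")


for variety, category in [
    ("rural Manange", "perfective"),
    ("rural Manange", "future"),
    ("urban Manange", "future"),
    ("Nepali", "imperfective"),
]:
    print(variety, category, a_argument_marking(variety, category))
```

The sketch makes the point of the comparison visible: only the urban variety returns the same marking for every category, which is why the urban pattern cannot be read as a direct copy of the Nepali split.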
A second likely contact pattern with nominal structures is shown by dif-
ferent patterns of NP constituent ordering across the rural and urban Man-
ange communities. This more closely resembles what Matras and Sakel term
‘pattern borrowing’, or the adoption of a strategy or structure from another
language (2007).
Nepali has a separate class of lexical adjectives and they are pre-nominal
in order (e.g. miʈo kana ‘tasty food’). And in fact this is the general pattern
with Indic languages of South Asia.
In Bodish languages, the situation is slightly more complex, as both NA
and AN order are attested. Nar-Phu is strongly NA in order, while Chantyal
and Tamang show AN. It is generally assumed that the pre-nominal order in
Bodish is the newer pattern via contact with Indic languages (Bickel 2001;
Masica 2001; Noonan 2003).
Manange again shows a now-familiar split across speaker communities.
The rural population shows overwhelmingly NA ordering in attributive NPs
(e.g. 1nʌkju 1tjʌ-pʌ dog big-nom ‘(the) big dog’). This pattern is also
attested by Hoshi for a speaker from Prakaa Manang (Hoshi 1986: 212).
Urban Mananges show overwhelmingly AN ordering (e.g. 1tjʌ-pʌ 1nʌkju
big-nom dog) (Hildebrandt 2004; Genetti and Hildebrandt 2004).
The changes to verbal structures in Manange again fit a pattern-borrowing
type, and one of these (the dependency between negation and aspect marking)
is evident mainly in the urban community of Manange speakers.
In most Bodish languages the periphrastic strategy of valency increasing
(causation) is the main (and usually the only) available strategy (Noonan 2003),
shown for Manange in example (6) (example from Hildebrandt 2004: 107).
(6) Manange Periphrastic Causation Strategy
1amʌ=tse 1lʌ-tse 1ŋʌ=tse 1taŋ 1pja-tsi.
mother=erg do-cc 1sg=erg floor clean-perf
‘My mother made me clean the floor.’
In the periphrastic strategy, the first clause contains an ergative-marked A and
the verb 1lʌ ‘do’, which hosts the clause chaining suffix -tse. The matrix clause
carries the aspect (perfective) suffix -tsi. Both rural and urban Mananges use
this strategy, and they also employ another (less productive) causation
strategy that is more morphological in structure, shown in (7) (example from
Hildebrandt 2004: 106):
(7) Manange Morphological Causation Strategy
1ŋʌ=tse 3tʃʌ 1le 1lʌ-tsi.
1.sg=erg tea warm do-perf
‘I made the tea warm/warmed the tea.’
Here, the verb 1lʌ ‘do’ is used in a compound structure and it carries the aspect
affix. This structure is noticeably absent from other Bodish languages (except
for Chantyal, which also shows other structural borrowing from Nepali). In
Nepali causation is signaled only through morphological means (with a suf-
fix -āu):
(8) Nepali Morphological Causation (Acharya 1991: 168)

Subhadrā suśīla-lāī bhāta khũw-āu-chin.
Subhadrā Suśīla-dat rice eat-caus-3sgpres.fem
‘Subhadra makes Susila eat rice’
The second contact-induced change is seen only with the urban Manange
speakers, so they are apparently modeling their pattern of (Manange) verbal
inflection based on that in Nepali. For rural Manange speakers, the mor-
phological coding of aspect on the verb is dependent on negation: negative
marked (prefixed) verbs do not show aspect marking, with the resulting dif-
ference in (9) and (10):
(9) Affirmative2
1ŋʌ=tse 1kola=ri 3ʃitaŋ 1lʌ-tsi.
1.sg=erg child=loc scold do-perf
‘I scolded the child.’
(10) Negative
1ŋʌ=tse 1kola=ri 3ʃitaŋ 1a-lʌ.
1.sg=erg child=loc scold neg-do
‘I did not scold the child.’
Urban Mananges do not observe this dependency, and both negated and
non-negated verbs can host the full range of aspect morphology (e.g. 3ʃitaŋ
1lʌ-tsi and 3ʃitaŋ 1a-lʌ-tsi).
Numerals in Manange follow a base-ten system (2tʃu ‘ten’; 4ŋiʃu two-ten
‘twenty’; 2sumtʃu three-ten ‘thirty’; 4plitʃu four-ten ‘forty’, 4ŋʌtʃu five-ten
‘fifty’, 4ʈuktʃu six-ten ‘sixty’, etc.). Consecutive counting within the
individual bases follows a pattern of addition of single units to the multiple
(e.g. 1tʃukre ten-one ‘eleven’, 1tʃuŋi ten-two ‘twelve’, 1tʃupsẽ ten-three
‘thirteen’ etc.; 4ŋiʃu 4kri two-ten-one ‘twenty-one’, 4ŋiʃu 4ŋi two-ten-two
‘twenty-two’, 4ŋiʃu 2sẽ two-ten-three ‘twenty-three’ etc.).
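The compositional logic of this decimal pattern (multiplier, then ‘ten’, then added unit) can be stated as a short procedure. The Python sketch below is an illustration only: the function name and the restriction to the range 10–99 are assumptions made here, and the function returns morpheme-by-morpheme glosses rather than Manange forms, since the actual forms show alternations that a simple concatenation would not capture.

```python
# Illustrative sketch: builds the gloss of a decimal numeral, not its
# Manange form. The name `decimal_gloss` is hypothetical.

UNIT_GLOSSES = ["", "one", "two", "three", "four",
                "five", "six", "seven", "eight", "nine"]


def decimal_gloss(n: int) -> str:
    """Return a gloss such as 'two-ten-one' for a number from 10 to 99."""
    if not 10 <= n <= 99:
        raise ValueError("this sketch only covers 10 through 99")
    tens, units = divmod(n, 10)
    parts = []
    if tens > 1:
        # Multiples of ten are glossed multiplier-ten ('two-ten' = 20).
        parts.append(UNIT_GLOSSES[tens])
    parts.append("ten")
    if units:
        # Units are added after the multiple ('ten-one' = 11).
        parts.append(UNIT_GLOSSES[units])
    return "-".join(parts)


for number in (10, 11, 13, 20, 21, 23, 60):
    print(number, decimal_gloss(number))
```

The glosses produced for 11, 20 and 21 correspond to the glosses given for 1tʃukre, 4ŋiʃu and 4ŋiʃu 4kri in the text.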
There is evidence from other T-B languages that such a decimal system
in Manange may be a recent innovation, formed under pressure from similar
systems in Indic languages. Tamang (Tamangic), in comparison, has a
semi-complete vigesimal system, and Dzongkha, a Tibetan dialect and the national
language of Bhutan, has a complete vigesimal system to the fourth power of
the base (Mazaudon 2003).
The decimal system in Manange has nonetheless probably been in place for a
while, as the phonotactic alternations between simple and complex numerals
indicate. For example, the numeral ‘three’ in its bare form is 2sẽ, and in a
compounded form ‘thirty’ is realized with a word-medial bilabial nasal coda
(2sumtʃu). A similar situation is found with 4ʈu ‘six’, which is realized as
4ʈuktʃu ‘sixty’ with a velar plosive in word-medial coda position. Coda
consonants are rare in Manange, due to diachronic erosion of syllable-edges
(this diachronic development is frequently attested in many other
Tibeto-Burman languages), and these alternations suggest that the lexicalization
of these numerals in such a decimal structure took place at a stage when final
codas were still present.
Bodish languages are unlike other T-B languages (e.g. Himalayish) in that
they do not possess a numeral classifier system. Manange seems to have
borrowed its single classifier -ta from Nepali. Nepali has a system of two
classifiers: -janā for human count nouns and -ʈa for non-human count nouns
(Acharya 1991: 100). Urban Mananges (optionally) use a segmentally altered
form of the non-human classifier for both human and non-human count nouns:
(11) Classifier
4ŋi-tʰa 1kola
two-class child
‘two children’
4ʃi-tʰa 3pʌle
one-class leg
‘one leg’
In addition to the numerals, one of the interrogative pronouns in Manange
shows matter borrowing (the borrowing of form rather than strategy) from
Nepali: 2puŋ 2kʌtti (egg + many). The loanword 2kʌtti (< Nepali kati ‘few/
some/little bit’) is used in other parts of speech in Manange, for example as
a loan-verb 2kʌtti 1lʌ many do ‘to try’.
6. Clause combining/syntax
There is some evidence of Nepali contact phenomena in Manange clause
combining strategies. In Bodish languages, one productive way of signaling
adverbial clause linkage is via the use of converbal constructions, whereby
one of two verbs is the matrix verb, and the other verb (with a non-finite,
converbal marker) conveys manner information. This is seen in elicited
structures such as (12) (Hoshi 1986: 301) and in narrative structures such as
(13) (Hildebrandt 2004: 136), where there is a dual reading of sequential
actions in which the second action comes about in a causal relationship with
the first. The converb is marked with the clause chaining suffix -tse and the
following (resultative) verb (clause) takes finite aspect marking.3
(12) Elicitation
1juŋ 4tsoŋ 1lʌ-tse 2kje 1kʌ-tsi
stone sell do-cc profit come-perf
‘I sold stones and made a profit.’ (or, ‘Because I sold stones, I made
a profit.’)
(13) Narrative
1u 3ja 2tipal=ko 2ʃʌmlepre 1jʌ-tse 1lʌ-tse
dist yak some=def forget go-cc do-cc
1kim=ko ʌle 1lʌ-tse ʌtse tẽ 1ʈu 1mi.
3.pl=def seq do-cc like.this then stay evid
‘Having forgotten (about their friends), having done this, those yaks
stayed in the valley.’
Converbal structures similar to the ones above are found in abundance in
other Bodish languages.4
The use of another linker 2ta 3pi-na (lit. ‘what say-adv’) in ‘because’
adverbials in Manange appears to be a structural calque of Nepali kina bhane
‘because’ (lit. ‘why say’) (Hildebrandt 2004: 110):
(14) With 2ta 3pi-na
1ŋʌ=tse kristin=ri 2taŋ 1pin-tsi 2ta 3pi-nʌ
1.sg=erg Kristine=loc gift give-perf what say-adv
4nese 1ki 2manaŋ=ri 1jʌ-pʌ-ro.
tomorrow 3.sg Manang=loc go-nom-rep
‘I gave Kristine a gift because (it is said) she will go to Manang
tomorrow.’
Mananges also make use of Nepali word/phrasal and clausal coordinators
ra and ani. In Manange, words and phrases can be coordinated via (Bodish)
tẽ or via Nepali ra:
(15) Coordination with tẽ (Hildebrandt 2004: 109)
1pʌ=ko tẽ 3pje=ko.
husband=def conj wife=def
‘the husband and the wife.’
(16) Borrowing of ra (adapted from Hildebrandt 2004: 78)
1nʌkju ɾʌ 4ʃi 4polpʌ=ɾi 1mo 1mu.
dog conj one frog=indef cop evid
‘There was a dog and a frog.’
Clause coordination can be signaled via clause chaining (-tse) or by
juxtaposing the first clause, along with the final, bare verb before the second
coordinated clause, as (17) shows (unpublished text data):
(17) Bare verb clause coordination

tjʌpʌ 2prĩ 4ŋi ra 1nẽ 1ten-tsi.
yeast put two day alone put/leave-perf
‘I put the yeast in (the mash) and left it alone for two days.’
One additional strategy of clausal coordination found only in conversation
data is with the Nepali clause coordinator ani. Example (18) is taken from
a conversation between two people who are talking about making
home-brewed beer:
(18) Clause coordination with ani
Grandma: 2taŋ=ko 1kʌ-tsi nʌ 1a-tʌ?
         smell=def come-perf or neg-become?
         ‘Did (an alcohol) smell come or not?’
Auntie:  1a-tʌ ani 2pe=ko 3naŋ=ri 2tsaŋ-tsi.
         neg-become and.then beer=def inside=loc fill-perf
         ‘No, it didn’t, and then I put the beer inside (of a pot).’
In (18) the clause coordinator follows a bare verb ‘become’ and precedes a
clause with an inflected verb. Likely the verb is bare because it is negated. In
non-negated clauses, ʌni can conjoin finite clauses. In such cases, this would
be a case of both matter and pattern borrowing. It is matter borrowing in
that the coordinating form from Nepali is used. It is also pattern borrowing
in that the clauses on both sides of the conjunction may be finite, independ-
ent clauses (as in Nepali), whereas with other clausal coordination strategies
in Manange the verb of the first clause is always non-finite (either bare or
marked with -tse converb/clause chaining morphology).
7. Lexicon
In a study on loanwords in Manange, Hildebrandt (2007) has found that
approximately 12 percent of the ‘lexicon’ included in this study shows some
evidence of being borrowed. Of these, the vast majority, approximately 93
percent, come from Nepali, with the remaining loans from English or from other
Tibeto-Burman languages.5 Most Nepali loans are nouns (84%), with a
small number of verbs, property concepts and function words rounding out
the list. Of the nouns, most belong to semantic fields like ‘animals’, ‘clothing/
grooming’ and ‘the modern world’, and (19) lists some examples of these.
(19) Nepali loanwords in Manange6
Manange     Nepali       Meaning
1tuŋ 2ŋ     dumsi        ‘porcupine’
1poro       parewa       ‘pigeon/dove’
1gohi       gohi         ‘crocodile’
kutti       lamkhuʈʈe    ‘mosquito’
golbera     golbheɽa     ‘tomato’
makai       makai        ‘corn’
kʌrila      kaĩɽo        ‘gourd’
2cak        cakk         ‘buttocks’
2ʃakre      shakti       ‘brain/mind’
Borrowed property concepts (both true adjectives and verb-like adjectives)
include those listed in (20):
(20) Borrowed property concepts
alo suntala    ‘potato + orange/fruit’    ‘orange’
2mi 2kʌtti     kati = emphatic            ‘furry’
sitʌri         sittei                     ‘free/no charge’
sita           si:dha                     ‘straight (path)’
Even though verbs constitute a smaller share of Nepali loans, they are
somewhat more interesting structurally, as different strategies are employed
to integrate them into Manange inflectional morphology. Some Nepali
loans take a ‘dummy’ root (2ti), which itself can host aspect and modality
morphology. Examples are shown in (21), where the bold-faced element is
the Nepali loan-verb root.
(21) Borrowed Verbs with 2ti 7

rok 2ti-pʌ rok-nu ‘to forbid’
tʃuk 2ti-pʌ tʃukyau-nu ‘to separate’
kelai 2ti-pʌ kelau-nu ‘to sift/clean’
Other loans from Nepali are the first element in a verbal compound struc-
ture where the second piece is one of a small set of (native) semantically
empty verbs (e.g. 1lʌ ‘do’; 2prĩ ‘hit/put/affect’). The second verb hosts the
aspect/modality morphology:
(22) Borrowed Verbs with 1lʌ or 2prĩ
1hai 1lʌ-pʌ hai aaunu ‘to yawn’
3pu 1lʌ-pʌ puknu ‘to whistle’
poke 2prim-pʌ ‘untie’ ‘to tie’
The compounding process per se is not borrowed from Nepali; in fact, verbal
compounding is quite typical of languages throughout South Asia (and
beyond).
One ongoing question is what the motivation is behind the different Manange
verb-words. One possible answer is that the verb 1lʌ ‘do’ occurs with
loan-verbs that are transitive in Manange (where the A argument is
ergative-marked). The verb ‘do’ does in fact function as a periphrastic causative
marker in Manange (Hildebrandt 2004: 106–107). Along these lines, a number
of the Nepali source verbs in this database have the diphthong portion aau
in the stem. Acharya (1991: 167–168) notes that the derivational morpheme
-aau- in Nepali derives a base verb form into a causative or ergative form (e.g.
bannu ‘to become’ > banaaunu ‘to cause someone or something to become,
to make’). In the Manange loan-verb cases, however, it is not obvious that the
aau marker in the Nepali source forms is functioning as a causative marker,
as these Nepali verbs have no non-causative (i.e. non-aau) counterparts
(i.e. searches through Turner’s dictionary have not revealed semantically
related but non-transitive forms like patnu or kelnu). In addition, some
borrowed verbs (like ‘drown’ and ‘yawn’) are not transitive in Manange, as there
is no ergative-marked A argument. As such, it is currently not clear what
motivates the different verb-words in these constructions. Additional work in
this area of loanword integration could yield clearer patterns and functions of
these verb-words.
8. Conclusion
The relatively long period of (both punctuated and regular) contact between
Nepali and Manange has resulted in a number of structural changes to Man-
ange. Some of these changes can be considered pattern borrowings, whereby
a strategy is modeled on Nepali (e.g. the lack of a negation-finiteness depend-
ency on verbs). Other changes are matter borrowings, where a form from
Nepali is incorporated into Manange (e.g. loanwords, phrasal and clausal
conjunction, the numeral classifier). Still other changes are not clearly bor-
rowing at all, but rather structural loss or pattern (over-) generalization in
the urban community of speakers, perhaps due to infrequent and interrupt-
ed access to Manange in a scenario of asymmetrical bilingual maintenance
(e.g. the tone merger, the lack of a split-ergative pattern). An obvious next
step in the documentation of contact-induced change is to more systemat-
ically note the ways in which the Nepali of both urban and rural Mananges
may be altered. Such cases of Tibeto-Burman substratal influence on Nepali,
while not regularly recorded, have been noted previously (e.g. Genetti 1999;
Bickel 2001).
Abbreviations
Superscript numerals indicate the tone category membership of the adjacent
word.
1      first person
2      second person
3      third person
adv    adverbial
caus   causative
cc     clause chainer
class  classifier
conj   conjunction
cop    copula
dat    dative
def    definite
erg    ergative
evid   evidential
fem    feminine
imm    immediate (irrealis)
indef  indefinite
loc    locative
neg    negative
nom    nominalizer
perf   perfective
pres   present
rep    reported speech
seq    sequential
sg     singular
Notes
1. I would like to thank the Manange community for its ongoing assistance with
my study of their language. I wish to also thank Michael Noonan and
Balthasar Bickel for feedback and advice on this account. All errors are my own.
2. This strategy was not confirmed by me as a regular pattern until a 2004 fieldtrip,
and these examples are from my field notes. The ‘urban’ pattern (negated verbs
inflected for aspect) is also observed by Hoshi (2006) with a speaker from Prakaa
Manang.
3. I have adapted Hoshi’s transcription of segments and tones to fit with other Man-
ange examples.
4. Yet another “Bodish” way to signal causation is via nominalization of the clause
of causation (Hildebrandt 2004: 118):
1ŋʌ=tse 4mwi 3kjʌ=ri 1pim-pʌ 3kjʌ 3kola 3kju-pʌ.
1.sg=erg money 2.sg=loc give-nom 2.sg dress buy-nom
‘Because I gave you money, you will buy a dress.’
5. This percentage is from a database list of approximately 1,100 meanings. This
study stands as a contribution to the Loanword Typology Project, organized by
Martin Haspelmath and Uri Tadmor at the Max Planck Institute for
Evolutionary Anthropology (http://www.eva.mpg.de/lingua/files/lwt.html).
6. Some loans fall into the existing tone system (usually either the low level /1/ or
high level /2/ tone); the absence of a tone numeral on a word means that its
tone status/features are not yet established.
7. The au portion of the Nepali verb is the causative affix. The 2ti morpheme occurs
frequently (but not exclusively) with causative-marked loan-verbs from Nepali.
References
Acharya, Jayaraj
1991 A Descriptive Grammar of Nepali and An Analyzed Corpus. Washing-
ton, DC: Georgetown University Press.
Bickel, Balthasar
1998 Rhythm and feet in Belhare morphology. Rutgers Optimality Archive
Working Paper No. 287.
2001 The Tibeto-Burman substratum of Nepali (Indo-Aryan). Substrate
Workshop, Leipzig, Germany.
2003 Prosodic tautomorphemicity in Sino-Tibetan. In: David R. Bradley,
Randy LaPolla, Boyd Michailovsky and Graham Thurgood (eds.),
Language Variation: Papers on Variation and Change in the
Sinosphere and in the Indosphere in Honour of James A. Matisoff, 89–99.
Bradley, David R.
1997 Tibeto-Burman languages and classification. In: David Bradley (ed.),
Tibeto-Burman Languages of the Himalayas (Papers in Southeast
Asian Linguistics No. 14, Pacific Linguistics Series A86), 171. Can-
berra: Australian National University.
DeLancey, Scott
1989 Verb agreement in Proto-Tibeto-Burman. Bulletin of the School of
Oriental and African Studies 51 (2): 315–333.
Genetti, Carol
1999 Variation in agreement in the Nepali finite verb. In: Yogendra P. Yadava
and Warren W. Glover (eds.), Topics in Nepalese Linguistics, 542–
555. Kalimadi, Kathmandu: Royal Nepal Academy.
Genetti, Carol, and Kristine A. Hildebrandt
2004 The two adjective classes in Manange. In: R. M. W. Dixon and
Alexandra Y. Aikhenvald (eds.), Adjective Classes, 74–96. Oxford: Oxford
University Press.
Hildebrandt, Kristine A.
2003 Manange tone: Scenarios of retention and loss in two communities.
Linguistics, University of California.
2004 A grammar and glossary of the Manange language. In: Carol Genetti
(ed.) Tibeto-Burman Languages of Nepal: Manange and Sherpa, 241.
2006 Maintenance and merger in Manange. Manuscript.
2007 Loanwords in Manange, a Tibeto-Burman Language of Nepal. Manu-
script.
Hoshi, Michiyo
1986 An outline of the Prakaa grammar: A dialect of the Manang language.
187–317. Tokyo: Institute for the Study of Languages and Cultures of
Asia and Africa.
Kincaid, M. Dale
1991 The decline of native languages in Canada. In: R. H. Robins and E. M.
Uhlenbeck (eds.), Endangered Languages, 157–176. Oxford and New
York: Berg.
Masica, Colin P.
2001 The definition and significance of linguistic areas: Methods, pitfalls
and possibilities (with special reference to the validity of South Asia
as a linguistic area). In: Rajendra Singh (ed.), Yearbook of South Asian
Languages and Linguistics, 205–267. London/New Delhi: SAGE.
Mazaudon, Martine
2003 Les principes de construction du nombre dans les langues
Tibéto-birmanes. In: J. François (ed.), La pluralité, 91–119. Paris: Société de
Linguistique de Paris.
Noonan, Michael
2003 Recent language contact in the Nepal Himalaya. In: David Bradley,
Randy LaPolla, Boyd Michailovsky and Graham Thurgood (eds.),
Language Variation: Papers on Variation and Change in the
Sinosphere and in the Indosphere in Honour of James A. Matisoff, 65–88.
Rogers, Clint
2004 Secrets of Manang. Kantipath, Kathmandu: Mandala Publications.
Snellgrove, David L.
1961 Himalayan Pilgrimage: A Study of Tibetan Religion by a Traveller
Through Western Nepal. Boston: Shambhala.
van Driem, George
2001 Languages of the Himalayas: An Ethnolinguistic Handbook of the
Himalayan Region, Containing An Introduction to the Symbiotic The-
ory of Language. Leiden: Brill.
Grammatical borrowing in Indonesian
Uri Tadmor
1. General1
1.1. Classification
Malay-Indonesian is a member of the Malayic subgroup of Western
Malayo-Polynesian, a branch of the Austronesian language family. Malayic languages
are spoken throughout the Malay-Indonesian archipelago, and similar forms
of standard Malay-Indonesian serve as the national languages of Indonesia,
Malaysia, Brunei, and Singapore. The number of speakers of Malay-Indonesian
is nearly 250 million, making it by far the most widely spoken language
in Southeast Asia.
The name “Indonesian” (or Bahasa Indonesia, ‘the Indonesian language’)
refers to standard Malay as used in Indonesia, and also to regional varieties
of the language which have been developing throughout the country in recent
decades.2 It would not be possible to discuss all varieties of Indonesian here,
not only because of space limitations, but also because many of them are
poorly documented. Therefore the discussion will focus mostly on standard
Indonesian, the most widely used variety of Malay-Indonesian. When rele-
vant to the discussion, a few other varieties will be mentioned and specifically
noted as such.3
1.2. Sociolinguistic position
The vast majority of Indonesians know at least some Indonesian, and most use
it on a regular basis. However, the standard language is not acquired as a first
language. Where Indonesian is used as a home language, it is invariably in the
form of a local colloquial variety. In this sense, all speakers of standard Indo-
nesian are at least bilingual or bidialectal.4 Children acquire the standard lan-
guage early on from its use on television and in school. In recent years the Ja-
karta dialect has been making inroads into areas that have previously been the
sole domain of standard Indonesian. It is also used in youth magazines, on the
Internet, in text messaging, and recently in advertisements as well. However,
for any formal purposes standard Indonesian is still the overwhelming choice.
Indonesian is the sole official language of Indonesia, and is used in all
government communication, both oral and written. Practically all published
work (books, newspapers, magazines) is in Indonesian, as are most product
markings and instructions, public signs, and even personal letters. Spoken
Indonesian is widely used as a lingua franca among people who belong to
different ethnolinguistic groups.
1.3. Sources of foreign influence on Malay-Indonesian
The earliest foreign language known to have been in significant contact with
Malay-Indonesian was Sanskrit. The oldest Malay inscriptions (7th c. AD)
are intertwined with Sanskrit texts, and even the Malay sections contain
many Sanskrit loanwords. However, these inscriptions are too few and too
fragmentary to allow any definite conclusions about grammatical borrowing.
Sanskrit continued to be used in the Malay-speaking world for centuries as
a literary and liturgical language, but gradually disappeared from use after
the introduction of Islam. Newer Indo-Aryan languages such as Hindi–Urdu
as well as Dravidian languages such as Tamil, brought to the archipelago by
merchants, missionaries, and immigrants, were also in contact with Malay-
Indonesian, and have left traces in the form of numerous loanwords.
Chinese pilgrims and traders have been visiting Indonesia for well over
a thousand years, and Chinese communities have also existed throughout
the archipelago for many centuries. Various southern Chinese languages are
spoken in Indonesia, and have influenced colloquial varieties of Indonesian,
although in the standard language their influence has been limited and purely
lexical.
Traders from the Near East first arrived in Indonesia during the second
half of the first millennium AD. Eventually the Arabic and Persian languages
were to have a strong impact on Malay-Indonesian. However, this did not take
place until several centuries later, when local inhabitants began converting
to Islam. The influence of Arabic has been especially strong, in the form of
a large number of loanwords. Most did not enter the language from spoken
Arabic, but rather from Arabic literature (as well as from Persian and Indian
literature, where Arabic loanwords abound). Because many religious and
other texts were translated into Malay-Indonesian from Arabic, sometimes
word for word, Arabic has also had some grammatical influence on Malay-
Indonesian.
The earliest Europeans with a substantial presence in Indonesia were the
Portuguese, who first arrived in the first half of the sixteenth century. There
are numerous Portuguese loanwords in Indonesian, most originating from
creolized varieties rather than from metropolitan Portuguese. Some collo-
quial varieties of Malay-Indonesian underwent structural interference from
these creoles, but the structure of the standard language was not affected.
The next Europeans to appear on the scene were the Dutch, who sent their
first expedition in 1598. They eventually came to control all of present-day
Indonesia, and remained until the mid-twentieth century. The use of Dutch in
Indonesia was rather limited and only a fraction of the population ever gained
any fluency in the language. Nevertheless, since Dutch-speaking Indonesians
formed the influential elite, Dutch had a strong impact on the Indonesian
lexicon, and a lesser one on its grammar as well.
Following independence, English quickly replaced Dutch as the most
widely studied European language in Indonesia. Although English instruc-
tion in general has not been very successful, many members of the upper
classes have a good command of the language, which they acquire in elite
schools as well as abroad. Indeed, code-switching between English and In-
donesian has become common among well educated Indonesians. English is
also heard daily on television and in movie theaters, so most Indonesians have
at least some exposure to it.
In addition to coming in contact with foreign languages, Malay-Indone-
sian has been in contact with hundreds of local languages, principally via its
role as a lingua franca throughout the archipelago. The most influential of
these local languages overall was Javanese, which has existed in a state of
quasi-symbiosis with Malay-Indonesian for over a millennium. Today, na-
tive speakers of Javanese form the largest group of speakers of Indonesian.
Another language that has had some influence on Standard Indonesian is
Minangkabau, a Malayic language of western Sumatra. Many Indonesian
authors and educators, especially during the early formative years of modern
standard Indonesian, were native speakers of Minangkabau. Sundanese, spo-
ken in western Java, has had a strong impact on the nearby Jakarta dialect,
and through it on the standard language as well.
It is important to note that many borrowed features and words – especially
the oldest and best-integrated ones – entered the language not via widespread
bilingualism, but rather through written literary languages used by small mi-
norities. Such changes affected the language of the elite first, before slowly
spreading to the general community.
2. Phonology5
The structural domain arguably most affected by contact has been the phon-
ology. Contact-induced phonological changes were introduced principally
via two means. Many words borrowed into Malay-Indonesian contained
sounds and sound combinations previously unknown in the language. Ini-
tially, loanwords were assimilated to the existing phonological structure, but
when borrowing from a particular language was extensive (as was the case
with words from Sanskrit, Arabic, and Dutch), this eventually led to changes
in phonotactics, and even to the introduction of new phonemes. In addition,
the widespread use of Malay-Indonesian as a second language has played
an important role in phonological interference, as speakers transferred fea-
tures from the phonology of their native languages into Malay-Indonesian.
This type of interference was much stronger in Indonesia than in Malaysia,
since the great majority of Indonesians did not speak Malay historically.
The most important language in this category is Javanese.
2.1. Consonants
The inherited phoneme inventory (that is, Indonesian phonemes which
reflect Proto Malayic phonemes) includes 16 consonants: /b, d, ɟ, g, p, t, c, k,
m, n, ɲ, ŋ, l, r, s, h/. Indonesian orthography – which is used in this chapter
for citing Indonesian words – represents these consonants with characters
identical to their IPA equivalents, with three exceptions: the voiced palatal
stop /ɟ/ is spelled j, the palatal nasal /ɲ/ is spelled ny, and the velar nasal /ŋ/
is spelled ng.
The glottal stop was not yet a phoneme in Proto Malayic, and even in
modern Indonesian it is only marginally distinctive. A phonetic glottal stop
may be inserted before word-initial vowels after pause. In addition, in medial
position there are words in which it is clearly phonemic. These are all
loanwords, chiefly from Arabic. After a consonant it occurs, for example, in the
word Jumat [jumʔat] ‘Friday’, in contrast with words such as rumah [rumah]
‘house’; after a vowel, in the word syair [çaʔir] ‘poem’, in contrast with
words such as kain [kain] ‘cloth’.6 In final position it occurs in a few kinship
terms (e.g. bapak [bapaʔ] ‘father’, kakak [kakaʔ] ‘elder sibling’) and
exclamations (tidak [tidaʔ] ‘no!, not’,7 masa [masaʔ] ‘really?!’), which contrast
with the large number of words which end in vowels. In such words the
glottal stop originated in an exclamatory intonation that involved glottalization.
In the case of kinship terms, its origin would be vocative forms, which are
exclamatory by definition.8 Other than in these words, final glottal stop hardly
occurs in Indonesian.9 The phonemicization of the glottal stop is therefore the
result of a combination of internal and external factors.
Two other phones which probably phonemicized under the influence of
borrowing were the glides w and y. In Proto Malayic, w and y never contrast-
ed with u and i, respectively, and are thus best analyzed as underlying vowels
which undergo gliding in certain environments. In early Sanskrit loanwords,
Sanskrit v ([w])10 was represented by Malay b in syllable initial position, e.g.
baca ‘read’ (< vaca ‘speaking’), and by u in other positions, e.g. suara ‘sound,
voice’ (< svara ‘sound’). Sanskrit y was represented by j in initial position,
e.g. jasa ‘meritorious service’ (< yaśas ‘honor’), and by i in other positions,
e.g. setia ‘loyal’ (< satya ‘true, faithful’). In later Sanskrit loanwords, how-
ever, there is no longer an assimilation of semivowels, and they are left intact,
even in syllable-initial position: wanita ‘woman’ (< Sanskrit vanita), bahaya
‘danger’ (< Sanskrit bhaya). This pattern is repeated in loanwords from other
languages as well: the earlier the loan, the greater the chance that the semi-
vowels would be represented by a stop. Initial glides are now found in a few
native words, such as ya ‘yes’ (from ia ‘3sg’) and yang ‘relative pronoun’
(from ia ‘3sg’ + ng ‘ligature’). Malaysian also has yu ‘shark’ (from the earlier
hiu, still used in Indonesian) and yuran ‘fee’ (cf. Indonesian iuran). Unas-
similated semivowels also occur in a large number of later loanwords from
languages other than Sanskrit, especially Arabic.
In addition to phones which phonemicized under the influence of borrow-
ing, modern Indonesian also has several consonant phonemes which were
borrowed outright. All loan phonemes consist of fricatives, which is not sur-
prising, considering that Proto Malayic was very poor in fricatives (the only
fricative phonemes were *h and *s).
The labiodental f is now used by most Indonesians, and certainly forms
part of the phoneme inventory of standard Indonesian. It probably first en-
tered the language via Dutch loanwords, such as famili ‘relatives’ (< familie
‘family’) and filem ‘film, movie’ (< film). This phoneme is also present in
many Arabic loanwords, but initially Arabic f was represented by p as in
pikir ‘think’ (< Arabic fikr ‘thinking, cognition’) and peduli ‘caring’ (< Arabic
fuḍūlī ‘inquisitive, busybody’). In Dutch loanwords, too, f was initially rep-
resented by p; the words for ‘relatives’ and ‘film’ cited above were realized
as [pamili] and [piləm], respectively, when first borrowed. The addition of /f/
to the phoneme inventory also added a new place of articulation. Before its
incorporation, Indonesian had a series of bilabials, but no labiodentals.
The loan phoneme /ç/, spelled sy, first emerged as the equivalent of Arabic
ʃ in loanwords, for example in the words syair ‘poem’ (< Arabic ša‘ir) and
syukur ‘thank (God)’ (< Arabic šukr). Before then, ʃ in Arabic loanwords was
represented by s, as it still is in some well-established older loanwords like
serikat ‘union’ (< Arabic širkat-). Indeed, the words for ‘poem’ and ‘thank
(God)’ just cited are still realized sair and sukur by many speakers. Later,
sy was also used to represent ʃ in English loanwords, e.g. syuting ‘shooting,
filming’, syok ‘shock’.
Another loan phoneme is /z/. Initially it was represented by j in loanwords
from Arabic, for example jiarah ‘to visit a grave or a holy site’ (< Arabic
ziyārah ‘a visit’), jaman ‘time, period’ (< Arabic zamān ‘time’). Such pro-
nunciations persist, but many educated speakers now realize these words as
ziarah and zaman, respectively. In European loanwords, z was initially rep-
resented by s as in bensin ‘gasoline’ (< Dutch benzine), even if the original
orthography with z was maintained as in zébra [sebra] (< Dutch zebra). In-
creasingly, z is retained unassimilated in European loanwords too (now ori-
ginating mostly from English).
The loan phoneme /x/ has had a similar history. Initially, the voiceless
velar fricative x in Arabic loanwords was represented by k, as in kabar ‘news’
(< Arabic xabar) and Kamis ‘Thursday’ (< Arabic [yaum al]-xamīs). Later,
representing x with h became the norm, although the pronunciation k still persists
as well. Some speakers realize Arabic x in loanwords with a spelling pronunciation
– that is, as a sequence of k and h. Finally, people with an Islamic
educational background often preserve the original sound [x]. Thus the word
akhir ‘end’ (< Arabic āxir) is pronounced variously as [akir], [ahir], [akhir],
and [axir]. Dutch also has the phoneme x, and some speakers with a Dutch
educational background (or from groups strongly influenced by Dutch) retain
x in Dutch loanwords in Indonesian, e.g. [spaxeti] ‘spaghetti’, [bioloxi] ‘biol-
ogy’, [texəl] ‘tile’.
Finally, /v/ has been claimed as a (loan) phoneme by some authorities, for
example the official dictionary of Indonesian (KBBI 2002). However, ortho-
graphic v is invariably realized as f or p in Indonesian. (The phoneme /v/ does,
however, occur in English loanwords in Malaysian.)
2.2. Vowels
The Indonesian vowel system includes six vowels: /a, e, i, o, u, ə/. Indonesian
orthography represents these vowels in a straightforward manner, with the
exception of /e/ and /ə/, which are both spelled e. In this chapter, the two are
distinguished: /ə/ is spelled e, and /e/ is spelled é.
The schwa – realized as a mid central vowel in modern Indonesian – is re-
constructed in Proto Malayic only between consonants,11 in sharp contrast to
all other vowels, which occur at word edges as well as before and after other
vowels. This casts strong doubt on its historic status as a phoneme. Moreover,
even its original phonetic nature is not clear (in all probability it was
realized as a short a). Therefore it is possible to analyze schwa in premodern
Malay-Indonesian as phonetically inserted (epenthetic), later undergoing phonemicization
under the influence of Dutch loanwords, in which a mid-central
vowel can occur word-finally, as in halte bis [haltə bəs/bis] ‘bus stop’.12 This
phonemicization has affected native vocabulary as well. Schwa now occurs
at the end of the very common word ke ‘to’ (a clitic, from Proto Malayic
*ka) and also before other vowels (but only at morpheme boundaries), as in
keindahan ‘beauty’ (from indah ‘beautiful’ + the noun-forming circumfix
ke-an).
The process lying behind the emergence of the vowel phonemes é and o is
not fully understood, but it was probably influenced by language contact.
These vowels did not occur (at least not as phonemes) in Proto Malayic,
and almost all occurrences of é and o in inherited Indonesian morphemes
reflect *i and *u, respectively. However, no satisfactory phonological explan-
ation for this phoneme split has been put forward, and counterexamples can
be found for any hypothetical conditioning environment. One possibility is
that the phonemicization was caused by random over-distinction of phonetic
lowering by second-language speakers of Malay, who were transferring into
it the phonemic distinction of their native language.
2.3. Phonotactics
The syllable structure of Indonesian has been profoundly affected by borrowing.
In Proto Malayic, the syllable shape was (C)V(C). There were no true
consonant clusters, i.e. sequences of consonants at syllable edges. The only
permissible sequences of two consonants ran across syllable boundaries, and
even then they were restricted to homorganic nasal clusters13 (for example in
*bantu ‘help’, *tanda ‘sign’, *jumpa ‘meet’) or /r/ followed by another con-
sonant (as in *terbang ‘fly’, *pergi ‘go’). Due to massive lexical borrowing,
Indonesian now allows two or even three consonants in the onset and coda, as
in the following examples: pré.tél ‘dismantle’ (< Javanese prétél), stro.bé.ri
‘strawberry’ (< English strawberry), fals ‘(to sing) off key’ (< Dutch fals),
korps ‘corps’ (< Dutch korps).
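The contrast between the Proto Malayic (C)V(C) template and the clusters found in modern loanwords can be checked mechanically. The following is a minimal sketch under stated assumptions: words are given in lowercase orthography with syllable boundaries already marked by dots, the digraphs ng, ny and sy each count as one consonant, and the names SIMPLE_SYLLABLE and has_cluster are hypothetical.

```python
import re

# Assumed segment classes for this sketch (orthographic, lowercase).
VOWEL = "[aeiouéə]"
CONSONANT = "(?:ng|ny|sy|[bcdfghjklmnpqrstvwxyz])"

# (C)V(C): at most one consonant on either side of a single vowel.
SIMPLE_SYLLABLE = re.compile(f"^{CONSONANT}?{VOWEL}{CONSONANT}?$")


def has_cluster(syllabified: str) -> bool:
    """Return True if any dot-separated syllable violates the (C)V(C) shape."""
    return any(not SIMPLE_SYLLABLE.match(s) for s in syllabified.split("."))
```

Under these assumptions, inherited shapes such as ban.tu and ter.bang pass the template, while pré.tél, stro.bé.ri, fals and korps are flagged as containing onset or coda clusters.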
Other phonotactic constraints of Indonesian have also undergone change
due to contact. As just mentioned, sequences of nasal+oral consonants in
Proto Malayic were homorganic. Loanwords were initially assimilated to
this pattern, as in mungkin ‘maybe’ (< Arabic mumkin) and ingkar ‘repudi-
ate’ (< Arabic inkār). However, due to borrowing, modern Indonesian allows
heterorganic nasal-oral sequences in newer loanwords, as in angpao ‘gift
envelope’ (< Hokkien âŋ-pao) and tanpa ‘without’ (< Javanese tanpa). Non-
homorganic nasal-oral sequences can also be found in newly coined words,
as in the clipped forms amdal ‘environmental impact study’ (from analisa
mengenai dampak lingkungan) and Ménhankam ‘Minister of Defense’ (from
menteri pertahanan dan keamanan).
3. Typology
Historically, Malay-Indonesian was an agglutinative language, a fact still reflected
in the numerous affixes and clitics of the standard language. Colloquial
varieties are on the whole much more isolating, and use far less produc-
tive affixation. Some do not even use the verbal active and passive prefixes
meng- and di-, which are the most commonly used affixes in standard Malay-
Indonesian. With the decreased use of affixation and the increased use of
roots as free morphemes, the distinction between different word classes has
also been eroding. These processes, while most evident in colloquial var-
ieties, have also been affecting standard varieties.
Although internal factors may be partly responsible for these changes,
external factors were also involved. A decrease in morphological complex-
ity is an integral part of the process of pidginization, and indeed pidginized
varieties of Malay-Indonesian exhibit a sharply reduced productive morph-
ology. Such varieties have served as the basis for creolized varieties of Ma-
lay-Indonesian, which in turn have been influencing the standard language.
The fact that Malay has served as a regional lingua franca for over a millen-
nium is also part of the reason why its morphosyntactic structure has lost
some of its complexity, since it involved widespread and often imperfect
second-language acquisition. It should also be noted that the vast majority of
current speakers of Malay-Indonesian have acquired it as a second language
(even if eventually it may have become their dominant language). A process
of language shift has been having similar effects, as increasing numbers of
speakers have been abandoning their ancestral home languages and switch-
ing to Malay-Indonesian as their home language instead. This long historic-
al process received an impetus in the middle of the twentieth century, when
Malay-Indonesian became the national language of newly independent Indo-
nesia and Malaysia. Other changes observed in the history (and prehistory)
of Malay-Indonesian, such as the increased frequency of SVO sentences,
may also be related to this historical process. Contact-induced processes of
simplification in various standard languages, including in standard Malay-
Indonesian, are discussed in detail in McWhorter 2007.
In addition to the general simplifying and markedness-reducing effects
associated with pidginization, use as lingua franca, widespread second lan-
guage acquisition, and language shift, the typology of Indonesian has also
undergone more specific changes due to contact. In an insightful yet largely
overlooked paper,14 Becker and Umar (1980: 97) discussed syntactic change
in Indonesian. They observed ‘a general systemic change in Indonesian which
has been going on for a long time: the change from a focus system of topi-
calization to a subject system of topicalization.’ Several types of change are
mentioned, some of which are discussed in this chapter. Becker and Umar
conclude that ‘[t]he kinds of syntactic changes we are observing in Indone-
sian may be among the most important impositions of the colonial – and neo-
colonial – period [...]’ (1980: 100). This may be an overstatement, but there
is little doubt that the changes they discussed have been taking place, and
that language contact is at least partially responsible for them.
4. Nominal structure
4.1. Case marking and prepositions
Indonesian has overt case marking only in the pronominal system, and on a
very limited scale at that. Classical Malay had three enclitic pronouns which
could roughly be described as having a genitive function: -ku (1sg), -mu (2),
and -nya (3sg). In addition to expressing possession, they were also used
with certain prepositions. These clitics were the counterparts of the free mor-
phemes aku (1sg), engkau (2sg), kamu (2pl), and ia (3sg). The genitive use
of these clitics persists in modern Indonesian: rumah-ku ‘my house’, untuk-ku
‘for me’, rumah-mu ‘your house’, untuk-mu ‘for you’, rumah-nya ‘his/her/its
house’, untuk-nya ‘for him/her/it’. However, these enclitics have also
developed an accusative function, which they did not have in early Classical
Malay: me-lihat-ku ‘to see me’, me-lihat-mu ‘to see you’, me-lihat-nya ‘to
see him/her/it’. The probable source of this change is to be found in literally
translated Arabic religious texts. Arabic has a complete set of pronominal
suffixes, which have both accusative and genitive functions.15 These suffixes
were translated into Malay using pronominal clitics, which had previously
been used only with a genitive function.
An interesting case of genitive marking that owes its development to lan-
guage contact is the so-called possessive -nya construction. The enclitic -nya,
as just seen above, is the oblique counterpart of the third-person pronoun ia/
dia. It also fills various other functions, one of which is marking the head in
a possessive construction. This use is patterned after a similar construction in
Sundanese and Javanese. For discussion and examples, see Section 4.4.
There are no clear examples of prepositions that were borrowed into
Malay-Indonesian as such. However, several words which were borrowed
into Indonesian as content words were later grammaticalized as prepositions.
The colloquial preposition sama ‘with’ is derived from Sanskrit sama ‘equal’.
In early Classical Malay, sama only meant ‘same (as)’. However, in some varieties
of Bazaar Malay, sama, while maintaining its original sense of ‘same’, also
developed into a preposition with a basic comitative function (‘with’), replacing
the Malay comitative preposition dengan, but also filling various other
functions.16 The comitative function of sama was then transferred into the
standard language in the form bersama (with the literary prefix ber-), mean-
ing ‘(together) with’.
The function of some inherited prepositions has been extended under the
influence of similar prepositions in other languages. The benefactive prep-
osition untuk ‘for’ is now also used in certain temporal expressions, such as
untuk selama-lamanya ‘forever and ever’, untuk pertama kalinya ‘for the first
time’. This is probably under the influence of the Dutch benefactive prepos-
ition voor ‘for’, which can also mark temporal clauses, such as voor eeuwig
en altijd ‘for ever and ever’, voor het eerst/voor de eerste keer ‘for the first
time’.
Another preposition whose function has been influenced by Dutch is dari
‘from’. In modern Indonesian the original ablative function of this prep-
osition has been extended to include a genitive function, as in rumah dari
Presiden ‘the President’s house’ and daftar dari kata-kata ‘a list of words’. This extension of
meaning, considered ungrammatical by purists but nevertheless widely used,
is patterned after the Dutch ablative preposition van, which can also express
possession, as in het huis van de President ‘the President’s house’ and een
lijst van woorden ‘a list of words’.
The marking of spatial relations is remarkably free of borrowing. A notable
exception is antara ‘between’, from Sanskrit antara ‘intermediate space
or time’. It has been claimed that pada, a preposition with a variety of func-
tions, is derived from Sanskrit pada ‘foot’ or the related word pāda ‘site, pos-
ition’, but the evidence is not altogether convincing (Gonda 1952: 396).
Finally, the Arabic vocative particle yā has been borrowed into Indonesian
as ya. It was first borrowed as part of fixed expressions such as Ya Allah! ‘My
God!’ (< Arabic yā aḷḷāh) and Ya Robbi! ‘Oh my Lord!’ (< Arabic yā rabbī).
Later, ya came to be used with native words, as in Ya Tuhan! ‘Oh my God!’
and Ya Ampun! ‘Oh my goodness!’
4.2. Sex and gender
There is no gender marking per se in Malay-Indonesian, and even natural
sex distinction was not lexicalized in early Malay, except for a few pairs of
basic kinship terms such as father/mother and husband/wife. However, due
to borrowing, there are now quite a few words which have distinct male and
female forms.
Some Sanskrit loanwords in Indonesian come in pairs, with male forms
ending in -a and female forms ending in -i: putra ‘son’ : putri
‘daughter’, siswa ‘male high school student’ : siswi ‘female high school student’,
saudara ‘male sibling or relative’ : saudari ‘female sibling or relative’,
dewa ‘god’ : dewi ‘goddess’. This pattern has been extended to the native
word pemuda, which originally meant ‘young person’, but now means ‘young
man’, because the final -a was reinterpreted as a male ending. A new word
pemudi ‘young woman’ was created as its female counterpart by false ana-
logy. Another pair of Sanskrit-derived suffixes with distinct male and female
forms is -wan/-man and -wati, which respectively denote male and female
habitual agents, e.g. karyawan ‘(male) worker’ : karyawati ‘(female) worker’,
wartawan ‘(male) journalist’ : wartawati ‘(female) journalist’. The bases of
these words themselves are also derived from Sanskrit, but these suffixes can
be used with non-Sanskrit bases, as in ilmuwan ‘scientist’ (ilmu ‘knowledge,
science’, < Classical Arabic [al]-‘ilmu ‘[the] knowledge’).
Some loanwords from Arabic also distinguish between male and female
forms. These consist mostly of terms related to Islam such as soleh ‘pious
(Muslim man)’ : solehah ‘pious (Muslim woman)’, haji ‘a man who has made
the pilgrimage to Mecca’ : hajjah ‘a woman pilgrim’. Some more examples
are provided in Section 4.3.
In most loanwords with distinct male and female forms, the female form is
not in common use, and the male form in fact serves as the unmarked mem-
ber (which can refer to females as well, especially colloquially). However, in
some common kinship terms borrowed from Dutch, the male–female distinc-
tion is strictly maintained in Indonesian. Such terms include mama ‘mother’
(< Dutch mamma) and papa ‘father’ (< Dutch pappa); tante ‘aunt’ (< Dutch
tante) and om/oom ‘uncle’ (< Dutch oom); oma ‘grandmother’ (< Dutch oma)
and opa ‘grandfather’ (< Dutch opa).
4.3. Number and numerals
In Proto Malayic, the only set of words with formal number distinction
were pronouns: *aku ‘1sg’, *kami ‘1pl’, *kau ‘2sg’, *kamu ‘2pl’, *ia ‘3sg’,
*sida(?) ‘3pl’.17 Other than that, Malay-Indonesian does not distinguish be-
tween singular and plural forms. Collective nouns can be formed by redupli-
cation: anak ‘child/children’ : anak-anak ‘group of children’, rumah ‘house/
houses’ : rumah-rumah ‘group of houses’. There is a trend in academic and
journalistic writing to use reduplicated forms as the equivalents of English
plurals, especially in Malaysia. This reinterpretation of collectives as plurals
is due to the influence of Dutch and English, both of which have morpho-
logical plural forms.
Two particles used to form collectives were borrowed from Javanese. In
formal Indonesian para (< Javanese para) is used to form collective nouns, as
in penonton ‘viewer’ : para penonton ‘the viewers, the audience’, penumpang
‘passenger’ : para penumpang ‘the passengers (of a particular vehicle or trans-
port service)’. In colloquial Jakarta Indonesian (and increasingly in literary
Indonesian as well), another particle, pada, is used to form collective-subject
verbs, as in meréka pergi ‘they went’, meréka pada pergi ‘they went (refer-
ring to a group)’. The word pada (in this sense) was borrowed from Javanese
padha ‘same, equal’, from which the collective-forming function developed
in Javanese before being borrowed into Indonesian.
In some cases plural forms of words were borrowed but used without
number distinction (that is, even when the referent is singular). Examples of
such Arabic loanwords are (satu) huruf 18 ‘letter [of the alphabet]’ (< ḥurūf
‘letters’, the singular form being ḥarf), and (seorang) ulama ‘Islamic scholar’
(< ‘ulamā’ ‘scholars’, the singular form being ‘alīm [which was also borrowed
without number distinction as alim]). Dutch plurals borrowed without
number distinction include (satu) karcis ‘ticket’ (< kaartjes ‘tickets’, the singular
is kaartje), and (seorang) politisi ‘politician’ (< politici ‘politicians’, the
singular is politicus [which was also borrowed without number distinction as
politikus]). Similar examples of English loanwords include (satu) tips ‘tip’,
(seorang) fans ([fens]) ‘fan’. The only cases of plural forms which are actually
used as such (i.e. in opposition to singular forms) are a few Arabic terms
of human reference used by speakers with a strong Islamic background. Examples
are provided in Table 1.
Table 1. Arabic loanwords distinguished for number and gender in Indonesian

                 masc.sg          fem.sg             masc.pl            fem.pl
‘attendee’       hadir            hadirah            hadirin            hadirat
                 (< Ar. ḥāḍīr)    (< Ar. ḥāḍīrah)    (< Ar. ḥāḍīrīn)    (< Ar. ḥāḍīrāt)
‘Muslim(s)’      Muslim           Muslimah           Muslimin           Muslimat
                 (< Ar. muslim)   (< Ar. muslimah)   (< Ar. muslimīn)   (< Ar. muslimāt)
‘believer(s)     mukmin           mukminah           mukminin           mukminat
[of Islam]’      (< Ar. mu’min)   (< Ar. mu’minah)   (< Ar. mu’minīn)   (< Ar. mu’mināt)

Note: In general Indonesian (i.e. not used in an Islamic context), hadir is an adjective meaning
‘present’, while hadirin means ‘audience’.
The numeral system has also been impacted by borrowing. Tiga ‘three’
ultimately derives from Indo-Aryan via a Dravidian language. The Old Malay
word for ‘three’ was tlu, which goes back to Proto Austronesian. The San-
skrit loanword laksa ‘ten thousand’ (< lakṣa ‘a hundred thousand’) is rarely
used in modern Malay-Indonesian, but juta, also borrowed from Sanskrit
(< ayuta ‘ten thousand’), is the only word for ‘million’. The Dutch loanword
milyun ‘million’ (< miljoen) is now obsolete, but milyar ‘billion’ (< miljard)
and trilyun ‘trillion’ (< triljoen) are commonly used with reference to mon-
etary amounts (the intrinsic value of the Indonesian currency is extremely
low). Another numeral borrowed from Dutch – or possibly English – is lusin
‘dozen’, ultimately derived from Dutch dozijn or English dozen via Chinese
Bazaar Malay (where the change d > l is common).19
In Old Malay, numerals between 10 and 20 were expressed by simple
juxtaposition, e.g. sa-puluh dua (lit. ‘one-ten two’) ‘twelve’. In modern
Malay-Indonesian, these numerals are formed with the special element belas
‘-teen’, e.g. dua belas (lit. ‘two-teen’) ‘twelve’. This pattern was borrowed
from Old Javanese, along with the element belas itself (< Old Javanese welas).
A similar pattern was also borrowed from Javanese for numerals between 21
and 29, with the element likur ‘score and...’: se-likur ‘twenty-one’, dua-likur
‘twenty-two’, etc., but these forms are rarely used in modern Malay-Indone-
sian. Finally, some Sanskrit-derived bound numerals are used in compounds,
such as éka- ‘one’ (< Sanskrit eka), dwi- ‘two’ (< Sanskrit dvi), tri- ‘three’
(< Sanskrit tri), catur- ‘four’ (< Sanskrit catur), and panca- ‘five’ (< Sanskrit
paṅca). They occur not only with Sanskrit-derived bases, but also with bases
from other sources, as in dwifungsi ‘dual function’ (of the military; fungsi is
from Dutch) and caturwulan ‘trimester’ (lit. ‘four months’; wulan is from
Javanese).
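The two Javanese-derived numeral patterns described above (belas for the teens, likur for 21 to 29) are regular enough to sketch in a few lines. This is an illustration under stated assumptions rather than a description of actual usage: the function names teen and likur are hypothetical, the unit words other than dua are standard modern Indonesian forms supplied here and not taken from the text, the form sebelas ‘eleven’ is likewise supplied, and no irregular or unattested forms are filtered out.

```python
# Number words for 1-9. Assumption: standard modern Indonesian forms,
# supplied for illustration (only dua 'two' is cited in the chapter).
UNITS = ["satu", "dua", "tiga", "empat", "lima",
         "enam", "tujuh", "delapan", "sembilan"]


def teen(n: int) -> str:
    """Form 11-19 with belas '-teen', e.g. 12 -> 'dua belas'."""
    if not 11 <= n <= 19:
        raise ValueError("teen() covers 11-19 only")
    unit = n - 10
    # 'One' takes the bound form se- rather than satu.
    return "sebelas" if unit == 1 else f"{UNITS[unit - 1]} belas"


def likur(n: int) -> str:
    """Form 21-29 with likur 'score and...', e.g. 22 -> 'dua-likur'."""
    if not 21 <= n <= 29:
        raise ValueError("likur() covers 21-29 only")
    unit = n - 20
    prefix = "se" if unit == 1 else UNITS[unit - 1]
    return f"{prefix}-likur"
```

The likur forms are generated purely by rule; as noted above, they are rarely used in modern Malay-Indonesian.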
4.4. Possession
Possession can be expressed in various ways in Indonesian. In standard
Indonesian (as in earlier forms of Malay), possession is indicated by simple
head-initial juxtaposition, as in rumah ibu ‘mother’s house’ (rumah ‘house’,
ibu ‘mother’). In colloquial Jakarta Indonesian, the common way to indicate
possession is by attaching a third-person pronominal clitic to the head. This
pattern is modeled after similar constructions in Javanese and Sundanese:
(1) Indonesian: rumah-nya bapak
    Javanese:   omah-é    bapak
    Sundanese:  imah-na   bapa
                house-3   father
    ‘father’s house’
This pattern has spread from colloquial to standard Indonesian, but only when
the head is a verb. In such cases, the addition of -nya nominalizes the verb.
(2) pecah-nya pesawat
    break-3   airplane
    ‘the breaking up of the plane’
In some varieties of Bazaar Malay (and in speech forms that developed from
it or were influenced by it), a different possessive construction is used. The
order of the head and possessor is reversed, and punya ‘to have’ is insert-
ed between them: bapak punya rumah ‘father’s house’ (this can also mean
‘father has a house’, depending on the context). In Malacca (which was the
focal point for the dissemination of Bazaar Malay throughout the archipela-
go) this construction may have been patterned after Hokkien, as can be seen
in the following example from Baba Malay (a creole based on Bazaar Malay),
adapted by Baxter (1988: 92) from Lim (1981: 45–52):
(3) Baba Malay: gua punya rumah
                1s + punya + house
    Hokkien:    guà é chhǔ
                1s + é + house
    ‘my house’
Finally, as mentioned in Section 4.1, the ablative preposition dari can also
be used in certain contexts to indicate possession, under the influence of
Dutch van.
5. Verbal structure
Indonesian has only two affixes that can be considered inflectional, but they
occur with very high frequency. These are the active prefix meng- and the
passive prefix di-. Derivational affixes are much more numerous. None of the
basic affixes of standard Indonesian exhibit evidence for borrowing.
Categories such as tense, mood, and aspect are not grammaticized in In-
donesian. Tense is optionally expressed by particles such as akan (to indicate
future actions) and telah (to indicate past actions). Such particles are used
with increasing frequency, even in contexts where they would not be used in
earlier forms of the language, and this may be due to the influence of West-
ern languages in which tense marking is obligatory. This trend is especially
strong in standard Malaysian, where the influence can be clearly traced to
English.
The very common perfective particle sudah developed from a verb mean-
ing ‘to complete, to finish’, which may have been derived from Sanskrit
śuddha ‘cleansed, cleared, acquitted’. Many other particles with modal or
modal-like meanings are also derived from loanwords, e.g. mesti ‘have to,
must’ (< Javanese mesthi ‘inevitable’), perlu ‘need, must’ (< Classical Arabic
(al-)farḍu ‘(the) duty’), bisa ‘can’ (< (dialectal) Javanese bisa ‘can’, possibly
of ultimate Sanskrit origin), réla ‘willing to’ (< Arabic riḍa’ ‘agreement, con-
sent’), mungkin ‘maybe, possible, probably’ (< Arabic mumkin ‘possible’),
pasti ‘certain(ly)’ (< Javanese pesthi ‘predestined fate’), niscaya ‘certainly’
(< Sanskrit niścaya ‘certainly’), suka ‘to like’ (< Sanskrit sukha ‘pleasure’).
6.1. Pronouns
Several Indonesian pronouns are derived from loanwords, although it is not
clear whether the pronominal system as a whole has been affected. The deferential
first-person pronoun saya is derived from Sanskrit sahāya ‘companion’.
It is used side by side with the inherited pronoun aku, now relegated to infor-
mal, intimate, or literary/poetic use. The introduction of honorific distinctions
into the pronominal system may have been due to contact, although it is also
possible this was due to an internal process. The originally plural second-
person pronoun kamu developed a secondary meaning as honorific second-
person pronoun (singular or plural), and this process may have preceded the
grammaticalization of saya. Also, the third-person singular honorific pronoun
beliau does not appear to be a loanword. So it is probable that rather than
introducing a new category, the loanword saya simply filled the empty first-
person slot in an already existing paradigm of honorific pronouns.
The third-person plural pronoun meréka (in earlier Malay, marika) was
borrowed from Old Javanese marika, a distal demonstrative that was also
used as a third-person pronoun. A third-person plural pronoun cannot be re-
constructed in Proto Malayic with certainty. Some Malayic isolects in Borneo
have forms going back to *sida, but this etymon has cognates in some non-
Malayic languages of Borneo, and may have been borrowed from them into
Bornean Malayic. If the slot was empty in prehistoric Malay, it would help
explain why a loanword was used to fill it.
In colloquial varieties of Malay-Indonesian, there are several other bor-
rowed pronouns. In the Jakarta dialect, the general 1sg pronoun is gué, from
Hokkien goá; 2sg is lu, from Hokkien lù. These two pronouns are well known
throughout Indonesia (and in Malaysia too, where they are sometimes heard in
Bazaar Malay spoken with or by ethnic Chinese).20
The English first-person pronoun I and especially the second-person pro-
noun you are sometimes used by Indonesians when addressing foreigners.
They are also used among westernized Indonesians. In urban peninsular
Malay, both I and you are in very common use.
Pronouns of Dutch origin are also used by certain groups of Indone-
sian speakers. These are éke ‘1sg’ (< ikke ‘1sg [child language]’), yéy ‘2sg’
(< Dutch jij ‘2sg’),21 and dése ‘3sg’ (< deze ‘this (one)’). They were originally
borrowed by Dutch-educated Indonesians, and were later learned from
upper-class ladies by male transvestite hairdressers who used them in their
special jargon. From there these pronouns spread to the jargon of trendy
young women, known as Bahasa Gaul (‘language of socializing’). Some-
times they are also used in writing, as in (4), taken from a column in a leading
Indonesian daily.22 In this excerpt, in which the writer is debating with his
friends whether his writing style is critical enough, second-person reference
expressions have been highlighted:
(4) “Ohh . . . jij salah. Kita tuh enggak menganggap lo begitu, Mas.
Kita kan tahu dari dulu Mas memang sukanya gigit-gigitan, kan?”
komentar spontan teman saya.
“Maksud sampean dari dulu saya anjing?” balas saya.
‘“Oh, you’re wrong. We don’t think you’re like that. We’ve known all
along that you like to bite, right?”, commented my friends spontaneously.
“You mean I’ve always been a dog?”, I replied.’
In this short excerpt, four different expressions are used for second-person
reference, all loanwords: jij from Dutch, lo from Hokkien, Mas (lit. ‘elder
brother’) and sampéan from Javanese.
Traditionally, pronouns have been considered to be impervious (or at least
highly resistant) to borrowing. As Thomason and Everett (2001: 301) explain,
because pronouns form a closed set and form a tightly structured system, lin-
guists assumed that borrowing into the set would disrupt the system. (There
may have been another, more prosaic reason for this wrong assumption: pro-
nouns are rarely borrowed in European languages.) However, Thomason and
Everett go on to show (citing copious examples) that “given appropriate social
circumstances, pronouns and even whole pronominal paradigms are readily
borrowed” (2001: 301).
Two social factors may have contributed to widespread pronoun borrow-
ing in Malay-Indonesian (and in other Southeast Asian languages). One is the
tendency to adapt to the speech of one’s interlocutor by using structural fea-
tures and lexical items perceived as belonging to the interlocutor’s language,
including pronouns. This may have been part of the motivation behind the
initial borrowing of gua and lu from Hokkien. Another factor is lexicalized
politeness; some pronouns (such as saya from Sanskrit and sampéan from
Javanese) were initially borrowed as the honorific counterparts of existing
pronouns, while others (such as I and you from English) can be used when
the speaker wishes to avoid having to make a choice between an honorific
pronoun and a derogatory one.
318 Uri Tadmor
6.2. Interrogatives
Some basic interrogatives of Indonesian can be reconstructed in Proto
Malayic, such as apa ‘what?’, siapa ‘who?’, mana ‘which?, where?’. How-
ever, there are no reconstructed forms for ‘when?’ and ‘how?’. The most com-
mon Indonesian word for ‘when?’, kapan, was borrowed from Javanese. The
much earlier loanword bila ‘when?’ (< Sanskrit velā ‘time’) is now used only
in formal or poetic contexts (although it is still in general use in standard
Malaysian and various Malay dialects). In the word bagaimana ‘how?’ (lit.
‘like which’), the element bagai is borrowed from Tamil vakai ‘kind, manner,
method’ (the second element, mana ‘which?, where?’, is Malay).
6.3. Particles and their functions
The borrowing of function words may have repercussions on the grammar,
because they may introduce new categories and distinctions into the lan-
guage. While many Indonesian function words are derived from loanwords,
relatively few were borrowed as such. Most were borrowed as content words,
and later grammaticalized within Malay-Indonesian. Examples of both types,
as well as of changes in the meanings of function words under the influence
of language contact, are discussed below.
6.3.1. Conjunctions
It is not possible to reconstruct coordinating conjunctions such as ‘and’, ‘or’,
and ‘but’ in Proto Malayic. The conjunction dan ‘and’ is probably a con-
tracted form of dengan ‘with’;23 it is commonly used in Classical Malay and
in standard Indonesian, but rarely in colloquial varieties. Another literary
form with a similar meaning is serta ‘and, also, together with’, from Sanskrit
sārtha ‘company’. The word atau ‘or’ (< Sanskrit athavā ‘or, rather’) is used
in both standard and colloquial varieties of Malay-Indonesian. Similarly, tapi
‘but’ (literary tetapi), from Sanskrit tathāpi ‘nonetheless’, is commonly used
in formal and informal varieties of Indonesian. The use of Sanskrit loanwords
to fill such basic functions is indicative of the extent of the influence of San-
skrit on early Malay.
Indonesian 319
6.3.2. Complementizer and relativizer
The commonly used complementizer bahwa is a loanword (< Sanskrit bhāva
‘being, state’). It is possible that Proto Malayic did not use a complementizer
to introduce indirect speech or thought. However, bhāva is not used as a com-
plementizer in Sanskrit, so the development of a complementizer in Malay-
Indonesian is only indirectly due to contact. The literary relativizer nan was
also borrowed, from the closely related language Minangkabau. It occurs in
modern standard Indonesian, but not in early Classical Malay.
6.3.3. Adverbial clause markers
Most adverbial clause markers in Indonesian are borrowed or contain borrowed
elements. These include sementara ‘while (simultaneously)’ (< Sanskrit
sam-anantara ‘immediately contiguous to or following’), waktu ‘when
(referring to the past)’ (< Classical Arabic [al-]waqtu ‘[the] time’), walau(pun)
‘although’ (< Arabic walau ‘and [even] if’), meski(pun) ‘although’ (< Cre-
ole Portuguese maski ‘although’, < Portuguese mas que), karena ‘because’
(< Sanskrit kāraṇa ‘cause’), sebab ‘because’ (< Arabic sabab ‘reason, cause’),
seperti ‘like’ (< Malay se- + Sanskrit prati ‘nominal prefix expressing like-
ness’), guna ‘in order to’ (< Sanskrit guṇa ‘quality, use’), supaya ‘in order to’
(< Malay se- + Sanskrit upāya ‘means’), agar ‘in order to’ (< Persian agar
‘if’), pasal ‘regarding, concerning, about’ (< Arabic faṣl ‘part’), soal ‘regarding,
concerning, about’ (< Arabic su’āl ‘question’). The fact that so many
adverbial clause markers in Indonesian are borrowed or based on loanwords
may be an indication that this category of function words did not exist in early
Malay, and that its presence in modern Indonesian is due to borrowing.
6.3.4. Focus particles
Borrowed focus particles include cuma ‘only’ (< Tamil cummā ‘vaguely,
gratuitously, freely’), saja ‘only, just’ (< Sanskrit sahaja ‘natural’), sama
‘same (as)’ (< Sanskrit sama ‘equal’), persis ‘precisely’ (< Dutch precies
‘precise(ly)’), pas ‘exactly’ (< Dutch pas ‘just now’), saban ‘each, every’
(< Javanese saben ‘each, every’), and colloquially even [ifən] ‘even, even
though’ (< English even).
6.4. Situation-bound expressions
Borrowed greetings include Halo! ‘Hello!’ (< Dutch Halo!), Selamat!
(‘Congratulations!’, and also part of many everyday greetings like Selamat pagi!
‘Good morning!’, < Arabic salāmat-), and Asalamualaikum! ‘Peace be upon
you!’ (traditional Islamic greeting, < Classical Arabic as-salāmu ‘alaikum!).
The fact that Indonesian has many borrowed greetings may be because in
traditional Malay society (and in Southeast Asia in general) there were few
specific situational greetings. Upon meeting an acquaintance, one would say
something like ‘Where are you off to?’, or ‘Where are you coming from?’,
or ‘Have you had lunch yet?’, as indeed many Southeast Asians still do in-
formally. The concept of specific greeting expressions for different times of
day and situations appears to have been borrowed. Other borrowed situa-
tion-bound expressions and interjections include Sori! ‘Sorry!’ (< English
‘Sorry!’), Maaf! ‘Excuse me, forgive me!’ (< Hindi–Urdu māf, ultimately
from Arabic mu‘āf), and Wow! ‘exclamation expressing admiration’ (< English
Wow!). An expression that stands out by not being borrowed is Terima
kasih! ‘Thank you!’, but the common reply Sama-sama! ‘You’re welcome!’
is based on the loanword sama ‘equal’ (< Sanskrit sama).
As mentioned in Section 3, SVO sentences have become increasingly frequent
in Malay-Indonesian. Contact with languages where this is the prevalent
word order (basically English and Dutch) may have been involved in this
process, although the evidence is not conclusive.
In the Bazaar Malay of Jakarta, and consequently in some varieties of
Jakarta Indonesian, demonstratives are preposed rather than postposed (as
they are in standard Malay-Indonesian and in traditional Malay dialects), e.g.
rumah ‘house’, ini rumah ‘this house’ (rather than rumah ini). This may reflect
a Creole Portuguese substrate (cf. kaju ‘house’, iste kaju ‘this house’),
reinforced by Chinese word order (cf. Hokkien chhǔ ‘house’, cí lè chhǔ ‘this
house’). This analysis is supported by the fact that preposed demonstratives
are typical of ethnic Chinese speakers of Jakarta Indonesian.
Some idiomatic expressions of time and frequency appear to reflect a San-
skrit word order. One such expression is pertama kali ‘(for the) first time’
(lit. ‘first time’, rather than the expected kali pertama [which also occurs but
is less usual]). This is probably not a coincidence, since both constituents –
pertama ‘first’ and kali ‘time’ – are of Sanskrit origin. By analogy, in other
expressions containing kali ‘time(s)’, the modifier also precedes the head:
kedua kali ‘second time’, lain kali ‘another time’.
8. Syntax
Most matters pertaining to syntactic borrowing have already been discussed
in previous sections. This section discusses two contact-related phenomena
which have not been treated yet: the emergence of copulas and locative rela-
tive clauses.
8.1. Copulas
Early Malay did not have copulas. A nominal subject and a nominal predi-
cate could be simply juxtaposed, although frequently the topic marker pun
marked the subject. In recent centuries, several copula-like expressions have
emerged, possibly under the influence of Western languages which require
a copula with nominal predicates. The most common are adalah (from ada
‘exist’ + -lah ‘comment marker’) and merupakan (originally a verb meaning
‘to take the form of’). Ialah (from ia ‘3sg’ + -lah ‘comment marker’) also occurs,
but is more common in Malaysian and in earlier Indonesian literature.
The choice of copula depends on idiomaticity, but in (5) all three are permis-
sible:
(5) Kemiskinan adalah/merupakan/ialah tantangan bagi Indonesia.
poverty copula challenge for Indonesia
‘Poverty is a challenge for Indonesia.’
Recently there has been a trend of using copulas with adjectival predicates
as well, especially in Malaysia. This is probably due to the influence of Eng-
lish.
8.2. Locative relative clauses
Locative relative clauses are a relatively recent development in Indonesian,
and are due to the influence of Dutch. Instead of the relativizer yang, such
clauses use a locative interrogative (usually di mana ‘where’, but ke mana
‘whither’ and dari mana ‘whence’ also occur):
(6) rumah di mana dia tinggal
house at which 3sg live
‘the house where he lives’
Purists view such constructions as ungrammatical, and advocate using the
word tempat ‘place’ instead of di mana ‘where’, as in (7). However, this
construction is also without precedent in early Malay, and thus also owes its
emergence to foreign influence, albeit indirect.
(7) rumah tempat dia tinggal
house place 3sg live
‘the house where he lives’
9. Lexicon
Lexical and semantic borrowing (including calquing) are often treated togeth-
er in the literature, but in fact they constitute different phenomena. Lexical
borrowing consists of adopting morphs from another language. This usually
involves adopting meanings as well, but in principle a morph can be
borrowed with a meaning that is different from the source word’s. Thus,
for example, Maori Wīwī ‘French’ is from French ‘oui, oui!’ ‘yes, yes!’, but
there is no indication that it ever meant ‘yes’ in Maori, or that it was used as
a name in French. In other words, only the morph has been borrowed, without
its meaning. Semantic borrowing, on the other hand, consists of a change
of meaning in morphemes which already exist in the language, and does not
involve a transfer of morphs. In fact, semantic borrowing is better viewed as
a type of structural borrowing (since it affects the semantic structure).
9.1. Lexical borrowing
A large number of loanwords can be found in all varieties of Malay-Indonesian
and in all semantic fields. Most older loanwords from Sanskrit and
Arabic were borrowed directly from written sources. Newer loanwords
from European, local, and other languages were usually borrowed via direct
contact with their speakers. English constitutes an intermediate case: some
words were borrowed from speech, while others were borrowed from written
sources. Some examples of loanwords in Indonesian from various sources
are listed below.
Sanskrit: suami ‘husband’, istri ‘wife’, kepala ‘head’, muka ‘face’, kunci ‘key’,
gula ‘sugar’, kerja ‘work’, cuci ‘wash’, pertama ‘first’, semua ‘all’.
Arabic: badan ‘body’, dunia ‘world’, nafas ‘breathe’, lahir ‘be born’, kuat
‘strong’, séhat ‘healthy’, kursi ‘chair’, waktu ‘time’, pikir ‘think’, perlu
‘need’.
Chinese (Hokkien): cat ‘paint’, toko ‘store’, hoki ‘lucky’, téko ‘teapot’, mi
‘noodles’, kécap ‘soy sauce’, giwang ‘earrings’.
Persian: kawin ‘marry’, domba ‘sheep’, anggur ‘grapes, wine’, pinggan ‘dish’,
gandum ‘wheat’, saudagar ‘merchant’.
Portuguese and Portuguese Creole: garpu ‘fork’, kéju ‘cheese’, sepatu ‘shoes’,
jendéla ‘window’, méja ‘table’, roda ‘wheel’, bola ‘ball’, minggu ‘week’,
dansa ‘dance’, séka ‘wipe’.
Dutch: open ‘oven’, sup ‘soup’, handuk ‘towel’, kamar ‘room’, mobil ‘car’,
gelas ‘glass’, duit ‘money’, koran ‘newspaper’, nécis ‘neat’, bor ‘to drill’.
English: koin ‘coin’, bolpoin ‘pen’, strés ‘stressed out’, tivi ‘television’, tikét
‘ticket’, pink ‘pink’, gaun ‘formal dress’, komputer ‘computer’, notes
‘notepad’, flu ‘flu’, stop ‘to stop’, cas ‘to charge’.
Old Javanese: bapak ‘father’, ibu ‘mother’, meréka ‘they’, daging ‘meat’,
rusak ‘damaged’, masuk ‘enter’, murah ‘cheap’, antar ‘bring/take’, ratu
‘queen’, pasti ‘sure’.
9.2. Semantic borrowing
Some Indonesian words of Malay (i.e., inherited) origin have changed their
meaning based on the meanings of similar-sounding words in Javanese and
Sundanese. Often the semantic change is subtle and thus hard to detect, but
sometimes the borrowed meaning can completely replace the original one, as
in the examples in Table 2.

Table 2. Replaced meanings in some Indonesian words

Word      Meaning in Malaysian   Meaning in Indonesian   Meaning in Javanese and Sundanese
sulit     secluded, secret       difficult               difficult
butuh     penis                  need                    need
gampang   illegitimate child     easy                    easy

Table 3. Indonesian expressions calqued from Dutch

Indonesian expression     Literal meaning                             Metaphorical meaning
benang mérah              ‘red thread’                                ‘a common theme’
sayap kiri                ‘left wing’                                 ‘(politically) radical’
sayap kanan               ‘right wing’                                ‘(politically) conservative’
isapan jempol             ‘something sucked out of the thumb’         ‘a made-up story’
luar biasa                ‘out of usual’                              ‘extraordinary’

Dutch expression          Literal meaning                             Metaphorical meaning
rode draad                ‘red thread’                                ‘a common theme’
linker vleugel            ‘left wing’                                 ‘politically radical’
rechter vleugel           ‘right wing’                                ‘politically conservative’
iets uit de duim zuigen   ‘to suck something out of the thumb’        ‘to make up a story’
buitengewoon              ‘out of usual’                              ‘extraordinary’

People often borrow words to represent concepts which are not yet lexi-
calized in their language, but sometimes a word is borrowed even if the lan-
guage already has a word for that concept. In such cases the original word
may become obsolete, but it can also be retained side by side with the newer
loanword. Since it is not economical for a language to have two words with
the same meaning, one of the words may undergo a semantic change. For ex-
ample, Indonesian borrowed the Javanese words sapi ‘cattle’ and tawòn ‘bee’,
but has also retained the inherited Malay words lembu ‘cattle’ and lebah ‘bee’.
The inherited words then underwent semantic narrowing; lembu is now used
for particular kinds of cattle, while lebah is used for the honey bee.
Some Indonesian metaphorical compounds have equivalents in many lan-
guages of Southeast Asia and beyond. In such cases it is difficult to trace their
origin, and there is also the possibility that they arose independently, espe-
cially if their semantics is transparent. However, many compounds that have
opaque compositional semantics can be traced to Dutch. A few of the many
examples of such calqued expressions are provided in Table 3.
10. Conclusion
Malay-Indonesian has undergone extensive interference from other languages.
This was due to long-term contact with foreigners, as well as to its
use as a lingua franca throughout the Malay-Indonesian archipelago. The do-
main most affected by borrowing has been the lexicon, while structural bor-
rowing has been moderate. The impact of borrowing has been particularly
strong on the phonology, significantly expanding the inventory of phonemes
as well as radically changing the syllable structure and phonotactics. There
are also some morphosyntactic contact phenomena, but on a smaller scale. In
particular, the use of enclitic pronouns as direct objects and the emergence of
copulas and locative relative clauses appear to be due to contact.
The main sources of borrowing into Malay-Indonesian were Sanskrit and
Arabic (often via a third language), and more recently some local languages,
Dutch, and English. Borrowing from Sanskrit and Arabic was almost exclu-
sively through writing, while borrowing from other languages was mostly
through speech. These two kinds of contact resulted in different kinds of in-
terference phenomena. Taken together, these contact-induced changes have
resulted in a modern Indonesian that is markedly different from its Malay
predecessors.
Notes
1. The author has taught Indonesian at several institutions, and has been working
as a linguist in Indonesia since 1999.
2. In this chapter ‘Indonesian’ is used as shorthand for ‘standard Indonesian’.
3. Other than standard Indonesian, varieties of Malay-Indonesian mentioned in this
chapter include Old Malay (the language of the oldest Malay inscriptions, espe-
cially those of the seventh century); Classical Malay (the language of the written
literature of the 17th–19th centuries, from which modern standard Indonesian
developed); Malay dialects (regional varieties spoken by ethnic Malays); Bazaar
Malay (pidginized forms of Malay used for inter-ethnic communication); Urban
Peninsular Malay (used among ethnic Malays in Kuala Lumpur and some other
cities of West Malaysia, influenced by Bazaar Malay); and Malaysian (used here
to refer to standard Malay as spoken in Malaysia).
4. In a diglossic situation where speakers use standard Indonesian in more formal
situations and a colloquial variety of Indonesian as a home language, the two can
be said to form the two ends of a continuum.
5. It would be unwieldy to furnish references for each of the many loanwords cited
here. Suffice it to note that most of the Arabic loanwords mentioned here appear
in Jones 1978; most European loanwords appear in Grijns et al. 1983; and most
Sanskrit loanwords appear in de Casparis 1997.
6. The glottal stop in [jumʔat] ‘Friday’ and [çaʔir] ‘prayer’ may be deleted in rapid
speech, but crucially a glottal stop can never be inserted in words such as rumah
‘house’ and kain ‘cloth’.
7. The sense of ‘not’ developed from the original sense ‘no!’.
8. Similar phenomena can be found in other Austronesian languages; see Blust
1979.
9. In many Malay dialects as well as in standard Malaysian a final k is realized as a
glottal stop in inherited vocabulary. However, this is not the case with standard
Indonesian, nor with the most important colloquial variety, Jakarta Indonesian.
In both, final /k/ is realized [k̚].
10. The convention of using v in the transliteration, which I follow here, is based on
the current pronunciation of Sanskrit. Historically, this was a bilabial semivowel
much like English w, and this was probably the sound that was borrowed into
Malay. Otherwise, it would be difficult to explain why v should be represented
by u in non-initial position (see below).
11. The initial schwa reconstructed in words such as *əmpat ‘four’ and *ənam ‘six’
only occurs before sonorants, and is therefore better analyzed as representing
syllabicity: m̩pat, n̩am. Indeed, this is how these words are still realized in most
Malay-Indonesian variants, including the standard ones.
12. Similarly, schwa phonemicized in Malaysian under the influence of loanwords,
but the process there was more complex.
13. The only apparent exception is -ngs-. This can be explained by the phonetic
realization of s in Malay and by Malay phonotactics, but a detailed explanation
would be beyond the scope of this chapter.
14. Embarrassingly the present author also overlooked this paper, only becoming
aware of it just before submitting the final version of this chapter for publica-
tion.
15. The sole exception is 1sg, which has different accusative and genitive forms.
16. The semantic expansion from ‘same’ to include ‘with’ is fairly transparent: two
persons doing the same action together can be viewed as doing it with each
other.
17. It is not clear whether Proto Malayic actually had a 3pl form (see Section 6.1).
18. In this and following examples, satu ‘one’ or seorang ‘one+numeral classifier
for humans’ are added, to demonstrate the fact that they can be used with a sin-
gular meaning.
19. According to Scott Paauw (p.c.), a similar form occurs in Chinese Pidgin Eng-
lish.
20. An example for the popularity of these pronouns in Indonesian comes from the
title of the television series Gue sihir lu! (‘I put a spell on you!’), shown on the
SCTV channel at the time of writing.
21. Sometimes the original Dutch orthography is retained, as in (4).
22. From the column Kurang Tajam (‘not sharp enough’) by Samuel Mulia, Kompas
online edition, 10 December 2006, accessed at http://www.kompas.co.id/ver1/
Kesehatan/0612/10/120132.htm. The original spelling has been retained.
23. This etymology was suggested to me by David Gil.
References
Baxter, Alan N.
1988 A Grammar of Kristang (Malacca Creole Portuguese). (Pacific Lin-
guistics B95.) Canberra: The Australian National University.
Becker, Alton L., and Umar Wirasno
1980 On the nature of syntactic change in Bahasa Indonesia. In: Paz Buena-
ventura Naylor (ed.), Austronesian Studies: Papers from the Second
Eastern Conference on Austronesian Linguistics. (Michigan Papers on
South and Southeast Asia Number 15). Ann Arbor: The University of
Michigan.
Blust, Robert
1979 Proto-Western Malayo-Polynesian Vocatives. Bijdragen tot de taal-,
land- en volkenkunde 135: 205–251.
de Casparis, J. G.
1997 Sanskrit Loan-Words in Indonesian. Published for the Indonesian
Etymological Project as NUSA 41. Jakarta: Universitas Katolik Indo-
nesia Atma Jaya.
Gonda, J.
1952 Sanskrit in Indonesia. The Hague: Oriental Bookshop.
Grijns, C. D., J. W. de Vries, and L. Santa Maria
1983 European Loan-Words in Indonesian. Published for the Indonesian
Etymological Project by the Koninklijk Instituut voor Taal-, Land- en
Volkenkunde, Leiden.
Jones, Russell
1978 Arabic Loan-Words in Indonesian. Published simultaneously by the In-
donesian Etymological Project as Cahier d’Archipel 2, SECMI, Paris.
Produced at the School of Oriental and African Studies, University of
London.
KBBI
2002 Kamus Besar Bahasa Indonesia [The Great Dictionary of the Indone-
sian Language]. Jakarta: Balai Pustaka.
Lim, Sonny
1981 Baba Malay: The language of the ‘Straits-Born’ Chinese. MA thesis,
Monash University, Australia.
McWhorter, John H.
2007 Language Interrupted: Signs of Non-Native Acquisition in Standard
Language Grammars. Oxford: Oxford University Press.
Thomason, Sarah G., and Daniel L. Everett
2001 Pronoun borrowing. Berkeley Linguistics Society: Proceedings of the
Annual Meeting 27: 301–315.
Grammatical borrowing in Biak
Wilco van den Heuvel
1. Background
Biak (in older sources also Numfoors(ch), Mafoors(ch) or Myfoors(ch)) is
an Austronesian language of Papua, the easternmost province of Indonesia.
The language has around 70,000 speakers, most of whom live on the islands
Biak, Supiori, Numfor and the smaller islands around. In addition, the lan-
guage is spoken in a number of settlements and on several islands along the
North coast of the Bird’s Head peninsula, reflecting a long-lasting prominent
position of Biak people in trade throughout the area. The data presented in
this chapter are based on different periods of fieldwork in the village Wardo,
West-Biak.
The long history of contact with both Austronesian and non-Austronesian
languages has led to a number of similarities between Biak and other lan-
guages in the same linguistic area. A brief discussion of Biak as part of the
linguistic area of Eastern Indonesia can be found in Van den Heuvel (2006:
11–12), who also discusses the relation between Biak and the languages of
the Bird’s Head. The present chapter, however, focuses specifically on the
unidirectional influence of local Malay/Indonesian on Biak.
Although Malay has served as a sort of ‘supra-regional’ language through-
out Indonesia from before the arrival of the Portuguese (Adelaar and Prentice
[1996]), the influence of Malay on Biak used to be small. From at least the
beginning of the eighteenth century, Biak people played an important role
in trade, functioning as intermediaries between the West (especially Tidore)
and the people from the Bird’s Head Peninsula (Van den Heuvel 2006: 23,
and the references cited there). Although Malay was used in contacts with
Tidore, Biak people appear to have used their own language in contacts with
the local population, which can be deduced from the relatively high number
of borrowed words in the local languages. After the end of the nineteenth
century, however, with the intensification of Dutch colonial rule over the re-
gion and the conversion of the Biak people to Christianity, the use of Malay
gained ground. Malay was used as the administrative language during Dutch
colonial rule, in education, and by preachers from outside Biak. Since the
1960s, Indonesian, which is a standardized form of Malay, has been the official
language. This has led to a gradually increasing use of Indonesian
(and local Malay) at the expense of Biak. Nowadays, in the main town of the
island (Kota Biak), Indonesian and local Malay (hereafter LM) are the dominant
languages both inside and outside the family. In Kota Biak, the use of
Biak is restricted to communication between older Biak speakers (older than
55 years of age). In the villages that are more remote from town, Biak is used
by younger generations also. In the village Wardo, where most fieldwork was
conducted, the Biak language is used actively by people of approximately 30
years of age and older. For all generations in the village, code-switching be-
tween Biak and Indonesian/LM is common. The corpus is restricted to speak-
ers that are still (relatively) fluent in the language.
Although code-switching between Biak and Indonesian/LM is very com-
mon throughout the island, the present chapter focuses on one-to-one bor-
rowing from Indonesian/LM into Biak, specifically on those cases that can be
qualified as grammatical borrowing. For each of the presented phenomena,
however, it will be discussed briefly whether the phenomenon in question is
more like borrowing or more like code-switching, based on the assumption
that the demarcation between code-switching and borrowing is gradual ra-
ther than absolute (cf. Salmons 1990: 466, Gardner-Chloros 1995). Both for
lexical borrowing and for grammatical borrowing, it is often not possible to
distinguish between borrowing from Indonesian and borrowing from LM,
as the two languages are similar in many respects. In this chapter, therefore,
I will not attempt to distinguish between the two types of borrowing.
As I am not aware of any influence of Indonesian/LM on Biak typology
or Biak nominal structures (except for lexical borrowing), the discussion
below is restricted to its influence in the realm of phonology, verbal structures
and other parts of speech. The chapter focuses on the borrowing of modal
auxiliaries, of the conjunction kalau ‘if’, and on the borrowing of the negative
adverb bukan ‘not’.
2. Phonology
Compared to Indonesian/LM, Biak lacks /l/, /h/, /t/ and /ŋ/, but has an add-
itional voiced labial fricative /β/. In addition, unlike Indonesian/LM, Biak has
a distinction between short and long vowels (the latter indicated by a diacritic
sign on top of the vowel). In spite of these differences, adaptation of Indone-
sian/LM loans to Biak phonology is very rare. During my fieldwork, the only
examples of adaptation were found with some very old people, like kapal
‘ship’ being realized as [kapar] and tahun ‘year’ being realized as [saun].
The non-adaptation of words from Indonesian/LM can be accounted for in
several ways. On the one hand, it can be seen as an argument for their status
as (insertional) code switches rather than loans. It is also possible, however,
to view this phenomenon as proof of a certain convergence of Biak phon-
ology and Indonesian/LM phonology. In that view, speakers of Biak have
one shared phoneme inventory, integrating the Indonesian/LM and the Biak
set of phonemes.
Whereas both Biak and Indonesian/LM have SVO order in the verbal clause,
they differ in the expression of the subject. While in Indonesian the subject is
indicated by either a free pronoun or a noun phrase, Biak expresses the sub-
ject by a prefix or infix on the verb, which is optionally preceded by a coref-
erential appositional noun phrase. This is illustrated in (1), where the noun
phrase rusa nanine ‘this deer’ is optional, and in an appositional relation to
the subject prefix d-.
(1) (Rusaᵢ nan-i-ne) dᵢ-ores.
(deer giv-3sg.spc-this) 3sg-stand
‘(This deer) stood still.’
In fact, the position preceding the verb is a sentential topic position. Whereas
this position is often taken by noun phrases that are coreferent with the sub-
ject, it can also be occupied by other nominal arguments, as is the case in (2)
below. Here the topic position is occupied by a preposed object, while the
canonical object position is taken by a resumptive pronoun.
(2) Insape, ai-knam an-i-ne nko-kar i.
then wood-tree giv-3sg.spc-this 1pl.ex-fell 3sg
‘Then, this tree we cut it.’
Considering contact phenomena, there is a clear difference between the borrowing
of main verbs on the one hand, and the borrowing of auxiliary verbs
on the other hand. Main verbs are integrated into Biak by prefixation with a
verbalizing prefix ve-. The combination of prefix and verb, then, behaves as
a normal verb, in the sense that the newly formed verb has to be combined
with a subject prefix. An example of the use of the verbalizer is given with
ko-ve-putar ‘we turn’ in (3), where the newly formed verb ve-putar ‘vblz-
turn’ combines with the subject prefix ko ‘1pl.inc’.
(3) Ko-ro Yafdas ma, ko-ve-putar ve Ridge.
1pl.inc-loc Yafdas to.here 1pl.inc-vblz-turn to Ridge
‘We come from Yafdas to here, (and) we turn to Ridge.’
The prefix ve- ‘vblz-’ is used not only for the integration of loan-verbs, but
also for the verbalization of both indigenous and exogenous stems belonging
to other parts of speech, as illustrated in the following examples. In (4), the
verbalizer is used for the verbalization of an indigenous numeral, while it is
used in (5) for the verbalization of a borrowed Indonesian noun:
(4) Sko-ve-kyor.
3PC-vblz-three
‘They are three (persons).’
(5) V<y>e-guru.
<3sg>vblz-teacher
‘He is a teacher.’
Whereas borrowed main verbs clearly function as verbs in the recipient lan-
guage, this is much less the case for the borrowed auxiliaries bisa ‘can’ and
harus ‘must’. It should be noted that Biak has only one native auxiliary verb,
the verb ve ‘want’, while it lacks native equivalents for the borrowed verbs
bisa ‘can’ and harus ‘must’.1 The auxiliary verb behaves like a normal verb
in that it combines with a prefix expressing person, number and gender of the
subject. This is illustrated in the following example, where the subject-prefix
ya- ‘1sg’ on the auxiliary verb is coreferential with the subject-prefix ya- ‘1sg’
on the main verb:
(6) Ya-ve ya-ra ve pasar.
1sg-want 1sg-go to market
‘I am about to/want to go to the market.’
More commonly, the auxiliary combines with an “impersonal” 3sg subject
marker, as in the following example, where the main verb ya-ra ‘1sg-go’ is
preceded by i-ve ‘3sg-want’ rather than ya-ve ‘1sg-want’.
(7) Rov i-ne i-ve ya-ra ve amber
night 3sg.spc-this 3sg-want 1sg-go to foreigner
an-i-ra-wa.
giv-3sg.spc-sea-over.there
‘This night I want to go to the foreigner seawards over there.’ [MIax]
When the verb combines with a preverbal topic-NP, the auxiliary can either
precede this NP or intervene between the NP and the verb.2 The auxiliary
shares these structural properties with the historically related modal–aspec-
tual adverb imbe ‘want’, whose function cannot be distinguished from that of
the auxiliary. Consider (8) and (9) below, where the inflected auxiliary i-ve
can be replaced by the modal adverb imbe ‘want’ without any observable
change in meaning. While the auxiliary/adverb precedes the preverbal NP in
(8), it follows it in (9).
(8) Imbe/I-ve [snon-nánki an-i-ne]NP i-bur.
want/3sg-want [male-sky giv-3sg.spc-this] 3sg-leave
‘This man from heaven wanted to leave.’
(9) [Snon-nánki an-i-ne]NP imbe/i-ve i-bur.
[male-sky giv-3sg.spc-this] want/3sg-want 3sg-leave
‘This man from heaven wanted to leave.’
We now return to the borrowings bisa ‘can’ and harus ‘must’, which are aux-
iliary verbs in the donor language. With respect to their position in the clause,
these formatives are similar both to the auxiliary verb and to the modal ad-
verbs. Given, however, that the borrowed formatives cannot be inflected for
number, person and gender of the subject, they should be analyzed as modal
adverbs rather than as auxiliary verbs. Consider (10) and (11) below. The pos-
ition of bisa ‘can’ in (10) is parallel to that of imbe/ive ‘want’ in (8), while the
position of bisa ‘can’ in (11) is parallel to imbe/ive ‘want’ in (9). Both in (10)
and (11), the formative bisa ‘can’ is used in its bare form, not preceded by a
verbalizer or a subject prefix, which makes it similar to the use of the modal
adverb imbe ‘want’ in (8) and (9).
(10) Vape bisa [romá-mkun an-i-ne]NP m<y>ám bavír i.
but can [child-little giv-3sg.spc-this <3sg>see know 3sg
‘(All frogs looked alike,) but the child could recognize it (his own
frog).’
(11) Inkukro sinan mko-i-ne mko-fár fa
because parent 1pl-spc-this 1pl-tell cons
[roma-babo ko-i-ne]NP bisa k-ák-swar epéne.
[child-young 1pl.inc-spc-this can 1pl.inc-also-remember push.tight
‘Because you parents you tell so that we younger children can also
remember (the stories).’
I analyze the phenomenon described here as closer to borrowing than to code-
switching. Although people are aware of the Indonesian/LM origin of these
modals, they are the only available options for the concepts expressed by
them, and are used rather regularly, without observable speaker variation.
There are no indications that the use of modals is triggered by neighboring
elements from L2. Finally, harus ‘must’ is not adapted to Biak phonology,
as it is realized as [harus] despite the absence of the phoneme /h/ in the
indigenous Biak phoneme inventory. This, however, need not be seen as an
argument in favor of code-switching, as non-adaptation is the rule rather than
the exception. As suggested above, Biak speakers employ an integrated
Indonesian/LM phoneme system, so that there is no need to adapt loan phonemes
to the indigenous Biak phoneme inventory.
Within the domain of “other parts of speech”, the discussion focuses on the
use of a borrowed conjunction kalau ‘if’ and the use of a borrowed nega-
tive adverb bukan ‘not’. While the discussion on bukan ‘not’ will be post-
poned until Section 5, this section discusses the use of kalau ‘if’, along with
a number of other, minor, contact phenomena: complementizers, numerals
and reference to concepts like days of the week. The section opens, however,
with a brief discussion on the conjunctions atau ‘or’ and dan ‘and’.
Both the conjunction atau ‘or’ and the conjunction dan ‘and’ take the
same structural position in the clause as their Biak counterparts (orovaido
‘or’ and ma ‘and’ respectively) and cannot be combined with them. While the
use of atau is relatively frequent (46 instances of atau ‘or’ vs. 107 instances
of orovaido ‘or’), dan ‘and’ is used only occasionally (10 instances of dan
‘and’ vs. hundreds of examples of ma ‘and’), in contexts containing a lot of
code switches. This shows that atau ‘or’ is on its way to becoming an estab-
lished loan, while dan ‘and’ should rather be considered a code-switch. For
both conjunctions, there is no observable functional difference between the
Indonesian/LM word and its Biak counterpart. Both the Indonesian/LM
and the Biak conjunctions are used for the connection of two phrases or two
clauses. Whereas the function of dan ‘and’ is quite straightforward, it should
be noted that atau ‘or’ (as well as its Biak counterpart orovaido) is typically
used to introduce an alternative term for (the referent of) the constituent
directly preceding the conjunction. Examples of this use are given in (12) and
(13). In (12) we find the Biak word ruk ‘monkey’ given as an alternative for
the Indonesian kera ‘monkey’.
(12) Ras oser kera atau ruk i-warpu wáwe su-yaw
day one monkey or monkey 3sg-with turtle 3du-pursue
fa su-ker imbyef.
purp 3du-plant banana
‘One day, a monkey or ruk with a turtle ran after each other to (go
and) plant a banana tree.’
In (13) the speaker describes a scene that he is watching, and gives an alterna-
tive term for the referent of ai-mun ‘wood-piece’.
(13) D-ors p<y>an-kar ai-mun=i,
3sg-stand <3sg>touch-break wood-piece=3sg.spc
atau ai-snáw=i.
or wood-branch=3sg.spc
‘He is standing and breaking a piece of wood, or a branch.’
Turning to the borrowed conjunction kalau ‘if’, it should be noted that this
conjunction differs from the other two conjunctions both in its functional
properties and in the fact that its structural properties differ from those of its
Biak counterpart. Structurally, the borrowed conjunction kalau ‘if’ differs
from indigenous ido ‘theme’ in that kalau precedes the constituent
that it has scope over whereas ido follows it. The two conjunctions can
be used either on their own, or in combination.3 Functionally, the two con-
junctions kalau ‘if’ and ido ‘theme’ have a similar pragmatic function, but
slightly different semantics. As for their pragmatic function, both kalau and
ido can be characterized as conjunctions that mark the constituent that they
have scope over as “setting the scene” for the clause to come. This function
is illustrated in (14), where the two conjunctions are used in combination.
Note that the borrowed conjunction kalau precedes the phrase ránsyo ‘sweet
potato’, which then is followed by the indigenous conjunction ido:
(14) Kalau ránsyo ido, bisa k-án vepék i.
if sweet.potato theme can 1pl.inc-eat raw 3sg
‘As for sweet potatoes, we can eat them raw.’
The conjunctions kalau and ido may have scope either over a (series of)
phrase(s), as in (14) above, or over a (series of) clause(s). In the latter case,
the indigenous conjunction ido allows for both a temporal and a conditional
interpretation, while the use of kalau is restricted to conditional contexts.
Two examples of the use of kalau introducing clauses are given in (15) and
(16). Sentence (15) is the introduction of a narrative sketching the hypothetical
situation of the addressee accompanying the speaker on a journey across
the island of Biak.
(15) Kalau Wilco wa-so vo kuy-ék ro bis,
if Wilco 2sg-accompany sim 2du-go.up loc bus
ku-sasyar ro terminal kota (...)
1du.inc-go.out loc terminal town
‘If Wilco you followed and the two of us got on the bus, the two of us
would go out of the bus terminal (…)’
While (15) above is an example of the use of kalau on its own, the following
sentence is another example of the use of kalau and ido in combination.
(16) Kalau nko-na rovean=no va ido,
if 1pl.ex-have food=nonSP.pl not theme
na nk-án vo (...)
3pl.inan 1pl.ex-eat sim
‘If we have no food, that (sago) we eat and (use it to live).’
Examples of the use of bare ido are given in (17) and (18). Whereas (17)
allows for the addition of kalau, this is not possible in (18), as the sentence
cannot have a conditional interpretation.
(17) Na ya-ra vo ya-yar ido, w-óre.
then 1sg-go sim 1sg-late theme 2sg-call
‘When I then go and am late, you call.’
(18) I-fukn kapai nan-ya ido d-óve: (...)
3sg-ask mouse giv-3sg.spc theme 3sg-say
‘When he asked the mouse, it said: (...)’
As for the classification of kalau as either a code-switch or a loan, the fol-
lowing should be noted. First, whereas the use of kalau is optional, it does
not seem to be motivated by “a conscious choice for stylistic effects”. Rather,
the choice of kalau seems to be motivated by (subconscious) structural pressure
from the donor language to express the function of “setting the scene”
by means of a sentence-initial conjunction. Once the speaker has come to the
end of the constituent setting the scene, there is structural pressure from the
indigenous language to mark this setting of the scene again, by using the in-
digenous constituent-final ido. The fact that the use of kalau does not seem to
be triggered by the presence of neighboring L2 elements points towards bor-
rowing. The use of kalau is not very frequent, however, given that it is used
in only 20 cases out of an estimated 300 cases in which it could be used in
addition to ido.4 Although this low frequency does not exclude an analysis of
kalau as a loan, a higher frequency would be a more convincing indication of
its status as a loan. Finally, following the earlier discussion on shared phon-
eme inventories, the absence of phonological integration – kalau being real-
ized as [kalau] whereas Biak has no phoneme /l/ – does not say much either
in favor of or against borrowing.
Turning to another part of speech, the corpus contains a number of ex-
amples of the use of the Indonesian complementizer bahwa, an example of
which is given in (19).
(19) Karena rasul Yohanes i-fawi bahwa jemaat
because apostle Yohanes 3sg-know compl congregation
Efesus se-terima wós Mansern
Ephesus 3pl.an-receive word Lord
v<y>e=d-ya.
<3sg>pos=3sg-3sg.spc
‘For the apostle John he knew that the congregation of Ephesus had
received the Word of the Lord.’
Unlike Indonesian/LM, Biak has no complementizers, but links clauses either
asyndetically, as in (20), or by the use of a conjunction, as illustrated by the
conjunction voi ‘but’ in (21).
(20) M<y>ám randip=s-ya s-an rovean
<3sg>see pig=3pl.an-spc 3pl.an-eat food
v<y>e=na.
<3sg>pos=3pl.inan
‘He saw that the pigs had eaten his food.’
(21) Si-fawi va voi, mankroder an-ya s<y>áe
3pl.an-know not but frog giv-3sg.spc <3sg>go.out
bur fyom an-i.
from vase giv-3sg.spc
‘They did not know, but the frog had gone out of the bottle.’
It should be noted that the use of bahwa ‘compl’ is clearly restricted to texts
that contain quite a number of code switches. This suggests that the use of
bahwa ‘compl’ should be analyzed as an instance of code-switching due to
language loss, rather than as incorporation of a new structure into the Biak
language.
Considering numerals, both Biak and Indonesian/LM (like many other
Austronesian languages) have a decimal system, and it is very unlikely that
one of the two languages has borrowed the system from the other. Borrowing
of individual numerals, however, is very common, especially for numbers
higher than ten.
Finally, the language uses Indonesian/LM words for reference to the days
of the week, as well as for the concept ‘year’. Although some older people
know of other names for days of the week introduced by missionaries, I have
never come across any of these names being used spontaneously.
5. Syntax
Apart from the placement of kalau ‘if’ described above, another evident con-
tact phenomenon in syntax is the placement of the borrowed negator bukan
‘not’ in sentence-initial position, whereas the indigenous Biak negator va
is used sentence-finally. The use of the Biak negator va ‘not’ is illustrated
in (22):
(22) Mansren Yesus i-pok fa v<y>e-farander ko va.
Lord Jesus 3sg-can cons <3sg>vblz-forget 1pl.inc not
‘The Lord Jesus cannot forget us.’
All of the sentences in the corpus that are introduced with borrowed bukan
‘not’ are at the same time closed off with the Biak negator va. An example of
the use of bukan ‘not’ in combination with va ‘not’ is given with the follow-
ing sentence:
(23) Bukan ko-fafyár biasa va.
not 1pl.inc-tell usual not
‘It is not that we just sit normally and tell stories.’
The use of Indonesian/LM bukan ‘not’ and Biak va ‘not’ thus leads to double
marking of negation, which however is interpreted as single negation. Both
in Indonesian/LM and in Biak, the use of bukan ‘not’ serves to contradict a
presumed belief, indicating that the circumstances referred to are not true.
More than the bare use of indigenous va ‘not’, the additional use of borrowed
bukan ‘not’ stresses the counter-expectational or “contrastive” nature of the
negation (cf. Van Minde 1997: 278 on bukang, Macdonald and Darjowidjojo
1967: 160). It is not surprising, then, that in quite a number of instances the
negated expression is linked to a following clause by the indigenous counter-
expectational or contrastive conjunction voi ‘but’. An example of the com-
bined use of bukan ‘not’, va ‘not’ and the contrastive conjunction voi ‘but’
is given in (24). The sentence is part of an exposition about a boy searching
for a runaway frog.
(24) Indya bukan mankroder an-ya is-ya va voi
So not frog giv-3sg.spc 3sg.pred-that not but
kapai nan-i-ne d-éke.
mouse giv-3sg.spc-this 3sg-go.up
‘So not the frog was there, but the mouse came up.’
More than in the case of kalau ‘if’, the use of bukan ‘not’ seems to be restrict-
ed to those speakers that tend to code-switch between Biak and Indonesian.
The reason for the relatively frequent borrowing of the negator compared to
other adverbs is very much comparable to the reasons described for kalau ‘if’
above. Speakers that are used to speaking Indonesian/LM in daily life are guided
by a (subconscious) structural pressure to express contrastive negation by
the use of a sentence-initial adverb. As the Biak negative adverb cannot be
used in this position, speakers make use of an Indonesian loan. As soon as
the speaker has reached the end of the constituent that is negated, structural
pressure from the indigenous language makes him/her use the Biak negator
va ‘not’ in addition.
6. Conclusion
In addition to the lexical borrowing of verbs, this chapter has described sev-
eral contact-related phenomena that can be qualified as instances of gram-
matical borrowing. An overview of borrowed formatives is given in Table 1,
which also compares their functional and structural properties with those of
their Biak counterparts.
Comparing the different cases of grammatical borrowing, the following
can be observed. The borrowing of modals differs from the other cases in
that the modals lack indigenous counterparts fulfilling the same function.
The modals are similar to the conjunction atau ‘or’ in that both occupy the
same syntactic position as indigenous members of the same lexical category.
The borrowing of kalau ‘if’ and bukan ‘not’ (and bahwa ‘compl’, which how-
ever is better considered a code-switch), on the other hand, can be said to
bring along not only a phonological form (MAT), but also a pattern (PAT)
that differs from their Biak counterparts. Comparison of the last two columns
of Table 1 shows that the combination of a borrowed and an indigenous formative
is only acceptable (or even required) when the two occupy different structural
positions in the clause.

Table 1. Borrowed formatives; function and structural position compared to
indigenous counterparts

Formative       Function             Function compared       Structural position    Used in combination
                                     to counterpart          compared to            with or instead of
                                                             counterpart            counterpart
bisa ‘can’ +    modal adverb         [no counterpart         same as modal adverb   [no counterpart
harus ‘must’                         with same function]     imbe ‘want’            with same function]
atau ‘or’       disjunctive          same                    same                   instead
                conjunction
kalau ‘if’      setting the scene;   loan is semantically    different:             combination +
                conditional          more restricted         initial vs. final      instead
bukan ‘not’     negative adverb,     loan is more specific   different:             combination
                implying contrast    (implies contrast)      initial vs. final
Abbreviations
an      animate
compl   complementizer
cons    consecutive
du      dual
ex      exclusive
giv     marker of givenness
inan    inanimate
inc     inclusive
loc     locative
nonSP   non-specific
pc      paucal
pl      plural
pos     marker of possession
pred    predicative
sg      singular
sim     simultaneous
spc     specific
u       filler
vblz    verbalizer
Notes
1. To be more precise, the language has an auxiliary verb pok ‘can’, which
however is used in negative contexts only, as in ya-pok ya-rir aw va
‘1sg-can 1sg-let.go 2sg not’ → ‘I cannot let you go’. For the expression of
“knowing” or “being able”, the language makes use of the verb fawi, as in the
following: i-fawi f<y>arfyáre ‘3sg-know <3sg>tell’ → ‘he can tell stories’.
While the latter sentence was approved in elicitation, the non-elicited
examples in the corpus are restricted to negative sentences, like sifawi
siwasya va ‘3pl.an-know 3pl.an-read not’ → ‘they cannot read’. There seems to
be no alternative for the borrowed Indonesian harus ‘must’.
2. In case the auxiliary precedes the NP, it can only combine with an impersonal
3sg-subject marker, and not with a marker that reflects the person-number and
gender properties of the subject of the main verb.
3. The corpus contains more than 600 sentences containing the conjunction ido.
In 20 of these sentences, the indigenous conjunction ido is paired with the bor-
rowed conjunction kalau. In addition, the corpus contains 10 sentences where
kalau is used on its own, not accompanied by ido.
4. As stated in n. 3, the corpus contains at least 600 sentences containing the
conjunction ido. In only half of these sentences would it be possible to pair
ido with kalau, given the fact that the use of kalau is more restricted than
that of ido.
References
Adelaar, K. Alexander, and D. J. Prentice
1996 Malay: Its history, role and spread. In: Stephen A. Wurm, Peter
Mühlhäusler, and Darrell T. Tryon (eds.), Atlas of Intercultural
Communication in the Pacific, Asia, and the Americas, 673–693. Berlin/New
York: Mouton de Gruyter.
Gardner-Chloros, Penelope
1995 Code-switching in community, regional and national repertoires. In:
Lesley Milroy and Pieter Muysken (eds.), One Speaker Two Languages:
Cross-Disciplinary Perspectives on Code-switching, 68–90.
Cambridge: Cambridge University Press.
Macdonald, R. Ross, and Soenjono Darjowidjojo
1967 Indonesian Reference Grammar. Washington (DC): Georgetown Uni-
versity Press.
Salmons, Joe
1990 Bilingual discourse marking: Code switching, borrowing, and convergence
in some German-American dialects. Linguistics 28: 453–480.
Van den Heuvel, Wilco
2006 Biak: Description of an Austronesian Language of Papua. Utrecht:
LOT.
Van Minde, Don
1997 Malayu Ambong: Phonology, Morphology, Syntax. Leiden: Research
School CNWS, School of Asian, African and Amerindian Studies.
Sino-Vietnamese grammatical borrowing:
An overview
Mark J. Alves
1. Overview1
The purpose of this chapter is to describe the influence of the Chinese
language(s)2 on Vietnamese with a focus on grammatical aspects. Language
contact between Vietnamese and Chinese has led not only to the borrowing of
many thousands of Chinese words and influence on the phonological system
of Vietnamese but also to some changes in Vietnamese grammar. This influ-
ence and borrowing was not, however, sufficient in quantity to move Viet-
namese away from its Southeast Asian typological linguistic template, and
many of the grammatical characteristics typical of varieties of Chinese are
not part of Vietnamese grammar, indicating contact that was substantial but
not overwhelming. Following Thomason
and Kaufman’s (1988: 50) typology of language contact, we can consider
Sino-Vietnamese language contact a case of medium to strong cultural pres-
sure with heavy lexical borrowing and moderate structural influence.
This chapter first looks briefly at the history of language contact that Viet-
namese has had with several languages. It then covers specific aspects of
phonological, morphological, lexical, and grammatical borrowing, with an
emphasis on Chinese linguistic influence. Overall, the grammatical influence
of Chinese is primarily lexical rather than structural, and many of the gram-
matical words of Chinese origin were not originally grammatical in Chinese,
highlighting internal innovation as a source of some of the change rather than
direct influence from Chinese. The rest of this introduction describes the his-
torical and sociolinguistic setting.
Vietnamese is the national language of Vietnam, a country with over 80
million people. Ninety percent of the population is ethnic Kinh, the ethnic
autonym for the Vietnamese.3 As the language of the majority, Vietnamese is, and
has been for the last millennium, since the end of Chinese rule over Vietnam
in the mid-900s CE, the language of administration, religion, education,
and most aspects of daily life, although Chinese literature has nevertheless
remained a constant influence for two thousand years.4 The other 10 percent
of the population in Vietnam speak over 50 different languages belonging to
five different language families: Austroasiatic, Austronesian, Hmong-Mien,
Sino-Tibetan, and Tai-Kadai. In this multi-lingual situation, Vietnamese has
also served as a lingua franca and has spread linguistic elements into lan-
guages of some ethnic minority groups.5
The orthography used in Vietnam has moved through three stages over the
past two millennia. In the first stage, Chinese was the sole writing script from
about 100 BCE to sometime after the tenth century CE.6 In the second stage,
over several centuries after the end of Chinese rule in Vietnam, Chữ Nôm,
a writing system based on Chinese writing techniques7 but representing
colloquial Vietnamese vocabulary, developed and had matured into a full literary
tradition by the early nineteenth century. Finally, in the third stage, with the
arrival of European missionaries in the seventeenth century came the creation
of a romanized orthography to represent the pronunciation of Vietnamese,
the seminal publication being the 1651 ‘Dictionarium Annnamiticum [sic]
Lusitanum, et Latinum’ (a Vietnamese–Portuguese–Latin dictionary) by the
Jesuit missionary Alexandre de Rhodes (reprinted in 1991). However,
it was not until the beginning of the twentieth century that what came to be
called the Quốc Ngữ alphabet completely replaced the use of the Chinese and
Nôm scripts. While this was also the French-colonial era, when the French
administration was encouraging the use of a Western alphabet in Vietnam, the
Vietnamese themselves ultimately adopted this writing system at least in part
under a nationalist movement (DeFrancis 1977).
Overall, for the past two thousand years, the primary influence on Vietnamese
culture and language has come from Chinese. By the time the Chinese
arrived, those living in the Red River Delta region, the presumed ancestors
of the Vietnamese, had a developing civilization with possible influence
from Tai-Kadai groups, though there is little evidence to portray this community
vividly. From the first century BCE on, successive waves of Chinese
entered and settled in Vietnam, marrying into the Vietnamese population
and thereby contributing to both Vietnamese culture and the Vietnamese
gene pool. The early, formative Sino-Vietnamese era during the Han dynasty
set the stage for continued Chinese cultural influence which remains to the
present day, primarily in the area of the lexicon through the coining of modern
terminology, called ‘Sino-neologisms’ in this chapter. In particular, during
the three stages described above, written Chinese has brought
substantial additions to the Vietnamese lexicon in a wide variety of
semantic categories, along with imported culturally specific ideas and
conceptual systems.
2. Phonology
Ultimately, no absolute method exists to determine precisely how Chinese
has influenced Vietnamese phonology and to separate borrowing from natural
internal changes. Nevertheless, considering the quantity of vocabulary bor-
rowed from Chinese into Vietnamese and various similarities in their phono-
logical systems, such a possibility cannot be casually dismissed. At the same
time, it is best to consider also possible language-internal innovations before
assuming that similar phonological aspects are the direct result of borrowing
from Chinese. The aspects of Vietnamese phonology that most likely show
the influence of contact with Chinese are tones and certain classes of con-
sonants.
The system of tones in Vietnamese, with a single register height split, fits
the Chinese model better than that of Tai-Kadai languages, with a three-way
distinction (Haudricourt 1972), supporting the position that Chinese, as op-
posed to Tai, could have influenced the development of the Vietnamese tone
system. However, researchers have yet to agree fully on the precise progression
of the development of Vietnamese tones. Haudricourt’s
(1954) original ground-breaking hypothesis of tonogenesis in Vietnamese
has been questioned and modified (Gage 1985, Diffloth 1990). Some have
suggested that Vietnamese tones are an internal development (Alves 2001).
Thurgood (2002) proposes a model for Vietnamese tonogenesis in which it
is the result of interacting laryngeal features, a model which does not require
language contact to account for it. Based on the archaic four-tone Vietic lan-
guages, such as Ruc and Arem (Nguyễn V. L. 1988), which have much less
lexical influence from Chinese and which have preserved final /-h/ which
corresponds to the hỏi and ngã tones of Vietnamese, it is possible to hypoth-
esize that the borrowing by Vietnamese of numerous Chinese words having
the shǎng shēng 上聲 tone category may indeed have influenced the develop-
ment of the hỏi/ngã tone category in Vietnamese and varieties of the closely
related Mường languages. The strong version of this hypothesis – that Chi-
nese borrowing directly led to the creation of the hỏi/ngã tone category – is
likely to be proven wrong, while the weak version – that Chinese influenced
this tone category to a degree through extensive lexical borrowing of words
with that tone category – may turn out to be more feasible.
In the area of segmental phonology, it is most likely the case that the
retroflex sounds /ʂ/, /ɽ/, and /ʈ/, the labial sounds /f/ and /v/, and the dental
sound /z/ are, at the very least, partially the result of the massive borrowing
of Chinese loanwords and the phonological changes in Chinese during some
periods (in particular, the period during which palatalization was spread-
ing throughout Chinese in the Middle Chinese era) when Chinese vocabu-
lary was being incorporated into the Vietnamese spoken lexicon. Retroflex
sounds are typologically marked among the nearby major language groups
Tai and Mon-Khmer in Southeast Asia, thus supporting the notions that these
sounds were not borrowed from the neighbors of the Vietnamese and that
they are less likely random changes. Numerous instances of non-Chinese
vocabulary which today have initial retroflex sounds are readily found in
the seventeenth-century dictionary by de Rhodes as initial clusters of *tl.8 In
some instances, such sounds may have come from collapsed pre-syllables, as
suggested by the presumed original Mon-Khmer phonological word struc-
ture of earlier stages of Vietnamese, which finally became single retroflex
consonants.
This apparent monosyllabification of Vietnamese and reduction of clusters
into single consonants was most likely due to a combination of natural lin-
guistic tendencies toward the unmarked as well as contact with and massive
lexical borrowing from Chinese. Finally, it is perhaps worth mentioning that
Vietnamese, like varieties of Chinese, has a maximal CVC syllable structure,
with the exception of the glide /w/ in CGVC syllables, also a characteristic
of varieties of Chinese.
More recent adaptations of Western words, mainly French and English,
have brought certain non-native phonemes, such as initial unaspirated /p-/
from French (e.g. pin ‘battery’).9 Such single instances of borrowed sounds
are limited to foreign loans and have not altered the Vietnamese phonological
system.
3. Nouns, measure words, and nominal structures
In the realm of Vietnamese syntactic structure, the only aspect which Chi-
nese most likely influenced is noun phrase structure, specifically regarding
the position of measure words before their semantically selected nouns. This
order is in contrast with the post-nominal position expected for an otherwise
modifier-final language, as well as in contrast with the typology of numer-
ous languages in Southeast Asia (e.g. various Tai languages and Mon-Khmer
languages native to neighboring regions in Southeast Asia west of Vietnam)
which do exhibit such a sequence. Indeed, it is primarily the Mon-Khmer lan-
guages within Vietnamese borders that generally follow the measure–noun
order (Jones 1969), suggesting that this is a contact effect in this region from
China southward, whereas the regions west of Vietnam exhibit a different
typological pattern. Overall, this suggests (but certainly does not prove at
this point) two distinct regions of long-term language contact, with a chain of
influence from Chinese to Vietnamese and then to various ethnic minorities
in Vietnam.
Support for this position of Chinese influence lies in two key points. First
of all, Vietnamese has borrowed at least a dozen classifiers from Chinese and
dozens of other general measure words (Nguyễn 1957 and Alves 2001). The
borrowing of such vocabulary most likely came through both literary and
spoken contact, and the sheer number of loans may have contributed to the
borrowing of the position of the elements as well. Many of these loanwords have
kept their original semantic properties and take the same kinds of nouns as they
do in Chinese, while others have developed new semantic constraints on co-occurring
nouns.10 Table 1 contains a list of Vietnamese classifiers of Chinese origin.
These are more semantically bleached and hence more grammaticalized lex-
ical items than general measure words (e.g. bình 瓶 (píng) ‘bottle of’, bao 包
(bāo) ‘bag of’, and the like, as well as units of measurement, such as length
and weight). A handful of the forms in the table are phonologically nativized
Table 1. Borrowed Chinese classifiers

Vietnamese Chinese Category
bàn 壁 (bì) a unit for flat surfaces (table, hand, foot)
bản 本 (běn) a unit for scripts, reports, compositions
căn (SV gian) 間 (jiān) a unit for houses
chiếc (SV chích) 隻 (zhī) (1) a unit for vehicles cars, boats, planes
(2) a pair of chopsticks
cuốn (SV quyển) 卷 (juǎn) unit for books
đạo 道 (dào) unit for laws, orders, decrees
đỉnh 頂 (dǐng) unit for mountains
đoạn 段 (duàn) unit for sections, paragraphs, passages
đôi 對 (duì) couple of shoes, chopsticks, husband/wife
môn 門 (mén) unit for a subject/field of study
phát 發 (fā) unit for a shot of a firearm, an injection
tòa (SV tọa) 座 (zuò) unit for buildings
vị 位 (wèi) unit for people of high status
viên 員 (yuán) unit for officials
viên 丸 (wán) unit for small, round things (pills, tablets, bullets,
etc.)
348 Mark J. Alves
words, next to which are listed the Sino-Vietnamese (SV) standard readings
of the same etymon.
A particularly significant borrowing is the Sino-Vietnamese generic classifier
cái (most likely the archaic Chinese form 丐 (gài)),11 which has a virtually
article-like function in Vietnamese: when used alone without quantification,
it can indicate definiteness. In this it resembles the semantically
equivalent lexical item in Cantonese, the default classifier 個 (Cantonese goh,
Mandarin gè). With very little doubt, this is a Chinese loanword, though the
timing and means of transmission (i.e., spoken or literary sources, or a
combination) are not clear.
The second point, in contrast with the numerous borrowed measure words,
is that none of the post-nominal modifying elements in Vietnamese, including
stative verbs, relative clauses, possessive elements (e.g. the native possessive
marker của), and demonstratives (e.g. the native demonstrative đó ‘that’),
shows the borrowing of grammatical lexical elements. Consider the following
sample noun phrase, which shows the various post-nominal modifiers
in Vietnamese, all of which precede nouns in varieties of Chinese.
(1) một cái bàn mới đó của tôi
one unit table new that of I
‘that new table of mine’
It is also important to recognize that Chinese has had very little influence
on the numeral system of Vietnamese, in contrast with its substantial influence
on the numeral systems of Tai and, through Tai, of Mon-Khmer languages
(i.e. the borrowing of the prime decimals, starting at 30). All basic numbers
in Vietnamese, which are of Mon-Khmer origin, have consistently maintained
their place in both spoken, colloquial language and written and/or
formal spoken language. The single lexical item expressing ‘ten thousand’
was borrowed twice from Chinese. The older Han dynasty borrowing muôn
is the more colloquial form, while the same word borrowed again during the
Tang dynasty, vạn 萬 (wàn), is the literary form, though in fact, neither is
commonly used in Vietnamese today. Instead, the native vocabulary items
ngình ‘thousand’ and triệu ‘million’ are used. Sino-Vietnamese numerals are
generally restricted to specific semantic functions (e.g. ordinals for grades in
school). The Sino-Vietnamese oridinal ‘fourth’ -tư (SV tứ) 四 (sì) in particu-
lar has a more a general function as the fourth day of the week and fourth
month of the year. Overall, Sino-Vietnamese numerals have a peripheral ra-
ther than primary role in the Vietnamese numeric system.
While the numeral system in Vietnamese has been unaffected by Chinese,
the system of quantification has been more notably influenced. Vietnamese
has accepted some Chinese lexical items of general quantification, including
các 各 (gè) ‘various’ and mỗi 每 (měi) ‘each’. The status of these words as
Chinese is solid as they are listed in Sino-Vietnamese dictionaries and have
very similar semantico-syntactic properties. The term mọi ‘every’ is also
likely a phonologically nativized earlier borrowed form of mỗi.12 The
Vietnamese pronoun system has also received some influence from Chinese, but
not in ways typically seen among varieties of Chinese, indicating both
borrowing and innovation in Vietnamese. The three examples include the bound
prenominal chúng 眾 (zhòng) (third-person plural marking before pronouns,
e.g., chúng nó (plural-3rd singular) ‘they’), the bound ta (SV tha) 他 (tā)
(third-person plural marking after pronouns, e.g., cô ta (miss-plural) ‘those
ladies’), and the unbound y 伊 (yī) (third-person singular). Only the last of
the three is parallel with original Chinese usage;13 the first two are bound
rather than free forms. Indeed, it must be acknowledged that these lack
complete certainty in their origins and can only be considered possible Chinese
loans based on their phonetic and semantic similarity. The process of gram-
maticalization, if it turns out to be so, would need to be identified before a
higher degree of certainty could be reached.
Another important grammatical aspect that ultimately comes from Chi-
nese is the use of terms of address, though their pronominal function in
Vietnamese is an extreme semantico-syntactic extension of their original
usage in Chinese, one which parallels that of other Southeast Asian lan-
guages. Vietnamese pronouns have for the most part been replaced by a
system of familial-based terms of address – many but not all of which come
from Chinese – that indicate age, gender, and degree of formality and polite-
ness, including both standard Sino-Vietnamese (e.g. cô 姑 (gū) ‘miss (more
formal)’) and some nativized Sino-Vietnamese forms (e.g. chị (SV tỉ) 姐
(jiě) ‘miss (less formal)’) (Benedict 1947). Such terms are highly grammat-
icalized; they have complete pronominal referential functions and in fact
have ‘floating’ reference, being able to refer variously to first, second, or
third person in context. While such terms do function as terms of address in
Chinese, they have not gone as far in the direction of complete pronominal
usage as they have in Vietnamese. As in the English translation of sentence (2), in
which ‘Sir’ is an optional addition, in modern Chinese, such terms gener-
ally appear outside the sentence core, whereas in Vietnamese, the term of
address is the subject inside the sentence matrix, not an external, optional
element.
(2) anh đi đâu vậy?

you Sir go where thus
‘Where are you going, Sir?’
4. Verbs
One area of significant Sinitic influence is in the Vietnamese system of
pre-verbs, especially in the systems of negation (see Table 3) and voice. One
post-verbal complement has also been borrowed. A number of negation
words in Vietnamese appear to be grammaticalized words of Chinese origin.
First, it has been posited (Nguyễn P. P. 1996) that the negation word không
is a grammaticalized form of the homophonous Sino-Vietnamese không 空
(kōng) meaning ‘void’. This form has become dominant over native
Vietnamese negation words such as chẳng. Next, two lexical
items, chớ (SV trừ) 除 (chú, meaning ‘exclude’ in Chinese) and đừng
(SV tình) 停 (tíng, meaning ‘stop’ in Chinese), both meaning ‘don’t’ in Viet-
namese, are probably of Chinese origin. Neither form had these semantico-
syntactic properties when borrowed. Note that chớ is also a connector word,
as discussed in Section 5.
The passive-like elements borrowed from Chinese have been discussed
(Matisoff 1991, Alves 2001). The Sino-Vietnamese form được (SV đắc) 得
(dé) shares some of the various functions seen in Chinese, including
passive-like, abilitative, and resultative functions (see Matisoff 1991). In
addition, the two forms do ‘by’ and bị ‘negatively affected by’ show similarly
passive-like functions. While do and bị are unarguably standard
Sino-Vietnamese borrowings of relatively recent date (perhaps early twentieth
century), được is more difficult to account for in terms of its unexpected
phonetic realization (i.e., both tone height and vowel quality). However, con-
sidering its significant phonetic and semantico-syntactic overlap, including
shared patterns of grammaticalization in varieties of Chinese and Vietnam-
ese, it should be included until such time as a stronger alternative explanation
can be provided.
The pre-verb đang is likely from Chinese 當 (dāng), meaning in Chinese
‘at the time of’. In Vietnamese, it is used to indicate an action in progress.
If it is Chinese in origin, it has developed a different syntactic
distribution from that of Chinese (i.e., it appears between a subject and verb
in Vietnamese main clauses but before a subject in dependent clauses in Chi-
nese).
The Sino-Vietnamese word qua (SV quá) 過 (guò) ‘to cross’ has a post-
verbal directional function parallel to that in Chinese, a preposition or adverb-
like function expressing the meaning ‘across’. The etymon is also seen as an
intensifier in Vietnamese, as discussed in Section 5. However, despite the
fact that Vietnamese has a rich variety of post-verbal directional elements,
besides this single term, no other directional terms have been borrowed from
Chinese.
(3) đi qua đường.

go cross street
‘Go across the street.’
5. Adverbs, conjunctions, locative terms and others
For the most part, the grammatical adverbs and conjunctions that have been
borrowed from Chinese, as listed in Table 2, were borrowed already
grammaticalized and have largely kept their original senses with little or no
modification. Nevertheless, it is worth noting that several of the items in
Table 2 are phonologically nativized loans with standard Sino-Vietnamese
counterparts without those developed usages. It must be admitted that more
data (e.g. written records and both syntactic and phonological comparative
research of Vietnamese and Chinese) will have to be provided to verify these
phonological variants as genuine Chinese loanwords.14
Table 2. Borrowed Chinese connective words
Vietnamese Chinese Gloss
bèn (SV tiện) 便 (biàn) and then
chỉ 只 (zhǐ) only
chính 正 (zhèng) just/precisely
chứ (SV trừ) 除 (chú) but not (contrastive)
cùng (SV cộng) 共 (gòng) together
giá (SV giả) 假 (jiǎ) if/supposing
hiện tại 現在 (xiàn zài) currently
hoặc 或 (huò) or
nhưng 仍 (réng) but
sở dĩ 所以 (suǒ yǐ) the reason why…
tại 在 (zài) because
thậm chí 甚至 (shèn zhì) even
tuy nhiên 雖然 (suī rán) however
và (SV hòa) 和 (hé) and
vì (SV vị) 為 (wèi) because
A handful of locative terms have also been borrowed. These include (a)
tại 在 (zài) ‘to be at’ (with a more formal, literary flavor), which is the ori-
ginal meaning in Chinese (and which has developed the additional meaning
‘because’ in Vietnamese), (b) ngoài (SV ngoại) 外 (wài) ‘outside (of)’, (c)
gần (SV cận) 近 (jìn) ‘near (to)’, and (d) bên (SV biên) 邊 (biān) ‘side’.15
Notably, a number of comparative/intensifying words are likely Chinese
loanword candidates that have grammaticalized since entering Vietnamese.
These include (a) bằng (SV bình) 平 (píng) ‘equal to’ (originally ‘level/bal-
anced’), (b) nhất 一 (yī) ‘most’ (originally ‘one’), (c) giống (SV chủng with
the basic meaning ‘type/kind’) 種 (zhǒng) ‘resemble/similar to’, (d) thật (SV
thiệt) 實 (shí) ‘truly’, and (e) quá 過 (guò) ‘extremely’ (with the basic mean-
ing of ‘to pass over’) (Alves 2005). More data, such as examples of these in
ancient Nôm writings, would be required to label these with more certainty.
One item which has maintained its original semantico-syntactic properties as
in Chinese is the form như 如 (rú) ‘as/like’.
The Sino-Vietnamese form tự 自 (zì) means ‘to do by oneself’, though it
has a much more restricted usage than in Chinese, where it has a full reflexive
function.
6. Syntax
Vietnamese is a topic–comment language with right-branching syntactic
structures, like many other languages in Southeast Asia. Grammatical
borrowing from Chinese has been primarily lexical with relatively little evidence
of structural borrowing, the one exception being the position of measure
words in quantified noun phrases, as discussed in Section 3.
7. Lexicon and timing of borrowing
While Vietnamese has definitely borrowed a modest amount of vocabulary
from French and quite probably from Tai groups before contact with the
Chinese,16 neither loanword source has contributed in any way to Vietnamese
grammar or syntax.17 Loanwords from each of these sources number at
most several dozen from French (many of which have fallen out of use in the
post-colonial era) and a few dozen from Tai. These loanwords belong to
rather restricted ranges of semantic domains. Words of Tai origin suggest
contact at a time when agricultural techniques of the Tai peoples were passed
on from the ancestors of the Tai to the ancestors of the Vietnamese.
Loanwords in this category include terms for domesticated livestock such as vịt
(Thai pèt) ‘duck’ and đực (Thai tɯ̀k) ‘young male animal’ and terms related to
farming such as rẫy (Thai rây) ‘dry field’, đồng ruộng (Thai thûŋ) ‘field’, and
mương (Thai mɯaŋ) ‘ditch’ (Nguyễn 1995: 322).18 French loanwords that remain in
Vietnamese today, as opposed to vocabulary that fell out of use in Vietnamese
after the French left Vietnam, are primarily terms for introduced Western
cultural items and accoutrements that were modern at the time of borrowing,
such as áo sơ mi (Fr. chemise) ‘shirt’, xà phòng (Fr. savon) ‘soap’, xe ô tô
(Fr. automobile) ‘automobile’, bơ (Fr. beurre) ‘butter’, pin (Fr. pile)
‘battery’ and pa tê (Fr. pâté) ‘pâté’,19 among others. The European presence did lead to the spread
of an alphabetic writing system in Vietnam, though lexical influence from
French was nevertheless superficial, resulting in the permanent borrowing
of some dozens of words (Barker 1969) and some dozens more which have
fallen out of usage.
More recently, some English loans have entered Vietnamese, but not in
large numbers and without certainty of permanent or even long-term pres-
ence in the Vietnamese lexicon, though Vietnamese dictionaries appear to list
increasing numbers of them. Some examples include mít tinh ‘meeting’, vi
rút ‘virus (either computer- or health-related)’, and in tơ nét ‘internet’. Such
words tend to be related to, but are not restricted to, technology.
In contrast with the loanwords from other sources which were borrowed
in relatively small numbers within relatively short periods of time, Chinese
vocabulary in Vietnamese consists of several thousand words borrowed over
a period of two thousand years.20 As discussed in Section 1, Sinitic
borrowing began possibly as early as 100 BCE, toward the end of
the Western Han dynasty. Words that were borrowed from Chinese in this
era are often considered by the Vietnamese to be Nôm vocabulary, mean-
ing essentially native Vietnamese vocabulary. This perception is due to both
the substantial phonological and in some cases semantic changes over such
an expanse of time and the lack of written records to verify their borrowing.
The second stage of borrowing happened as a result of the spread of the
Chinese rhyming dictionaries throughout East Asia during the Tang dynasty
(618–907). These books brought with them the entirety of literary Chinese,
though this vocabulary was, of course, brought into use in spoken Vietnamese
over time. This borrowing of Chinese through written language continued
throughout the second millennium CE, though spoken language contact was
likely also a factor as trade and cultural intake continued. The third stage of
borrowing came at the beginning of the twentieth century, when the Japanese
were translating Western concepts by utilizing classical Chinese lexical
material. This system of linguistic adaptation spread into Chinese, Korean,
and Vietnamese (Sinh 1993). Over a period of several decades, thousands of
new Sino-neologisms entered Vietnamese, though it was ultimately a mixture
of borrowings from both Japan and China.
One of the more difficult problems in dealing with language contact
over such a lengthy period of time is determining the timing of borrowings. In
some cases, such as the influence of Chinese on the Vietnamese use of the
passive voice, modern written records show that this corresponds with the
timing of large quantities of translations of Western writings into Chinese.
In other cases, phonological evidence is the source of the identification of
timing. For example, Wang Li (1948) posited that giống ‘type’ was a word
borrowed most likely during the Han dynasty, and the same word was again
borrowed as the Sino-Vietnamese (SV) literary reading chủng 種 (zhǒng).
Mei (1970) identified this as part of a pattern of both the palatalization of
initial consonants and the correspondence between the Old Sino-Vietnamese
sắc tone and the Middle Chinese shǎng shēng 上聲 tone category (whereas
Sino-Vietnamese proper, borrowed in the Middle Chinese era, has the hỏi
tone, as in the previous example).
Another systematic phonological correspondence is between literary
Sino-Vietnamese readings having the nặng tone and the nativized cognates
with the huyền tone, such as the standard, literary readings tự 自 (zì) ‘from’,
dụng 用 (yòng) ‘use’, and loại 類 (lèi) ‘type’ versus their nativized, spoken
readings, từ, dùng, and loài respectively. Their otherwise similar
phonetic shape indicates that they are more likely relatively recent
borrowings, perhaps within the past several centuries rather than in the
Han-dynasty/Pre-Tang era. In many other instances, no patterns or written records exist
to assist in verifying approximate dates of the borrowings, which weakens
somewhat the argument for these words as Chinese in origin. Regardless,
with the weight of the lexical evidence and numerous patterns of phonologic-
al correspondences, such words with variant pronunciations must at least be
considered likely loanwords.
As a result of the similarity in general of word formation patterns
throughout China and Southeast Asia, bisyllabic compounds consisting of two
syllables each with distinct meanings cannot be attributed to the direct
influence of Chinese despite the borrowing of significant quantities of Chinese
bisyllabic compounds, such as the Sino-neologisms in the beginning of the
twentieth century. There are few instances of morphological borrowing (e.g.
prefixes or suffixes) since many Chinese compounds have been borrowed
unanalyzed. There are, however, a number of derivational morphemes that
have been borrowed from Chinese, as in Table 3, that have some limited
amount of productivity in Vietnamese and can be used to create new terms
in Vietnamese. The examples here are generally in line with Vietnamese
typology (e.g. negation indicated before the negated element, as is the case in
Vietnamese syntax).
Table 3. Borrowed Chinese derivational word forms
Vietnamese Chinese Gloss
bất 不 (bù) ‘in/un-’
phản 反 (fǎn) ‘anti-’
phi 非 (fēi) ‘non-’
vô 無 (wú) ‘non-’
hoá 化 (huà) ‘-ize’
học 學 (xué) ‘-ology’
In contrast, when a typological pattern in Chinese lexical material does
not match that of Vietnamese, expected modifications are seen. In some
cases, the Vietnamese head–modifier order has been applied to Chinese loan-
words, such as the order of the names of continents. The Chinese term for
‘Europe’ 歐洲 (ōu zhōu) (Europe-continent), Sino-Vietnamese Âu Châu, was
used in its original Chinese order earlier in the twentieth century, but later,
the native-like order Châu Âu (continent-Europe) became dominant, as was
the case with expressions for other continents. In addition, there are some
calques such as xe lửa ‘train’ (vehicle-fire) and máy bay ‘airplane’ (machine-
fly) which also follow the expected Vietnamese order. These two words have
largely replaced the earlier Sino-Vietnamese forms hoả xa 火車 (huǒ chē,
(fire-vehicle)) ‘train’ and phi cơ 飛機 (fēi jī, (fly-machine)) ‘airplane’. While
the terms with Chinese order are not completely out of usage, the forms that
follow the native Vietnamese patterns are dominant.
Overall, lexical borrowing has not influenced the word-formation patterns
seen in Vietnamese, which has retained its typologically Southeast Asian
head–modifier pattern. Also, as mentioned below in Section 8, the reduplicating
patterns common to varieties of Chinese are entirely absent in Vietnamese.
8. Summary
While Chinese has been the primary and most influential donor of vocabulary
to the Vietnamese lexicon, this borrowing has not resulted in relexification or
major restructuring of Vietnamese syntax. There does appear to have been
some influence on Vietnamese phonology and on the position
of measure words in quantified noun phrases, but otherwise, grammatical
borrowing has been largely in the area of grammatical vocabulary. The amount of
grammatical vocabulary is substantial, including dozens of measure words,
two dozen or so words with connective functions, a good handful of grammat-
ical adverbs and preverbs, some locational terms, and a few other numerals
and quantity expressions.
To further understand the Sino-Vietnamese borrowing situation, it is also
important to consider what was not borrowed.
Norman’s (1988: 13) list of Sino-Tibetan comparisons in a comparative wordlist
of six Chinese and non-Chinese Sino-Tibetan languages shows two dozen
basic vocabulary items, none of which have replaced native Vietnamese
words (with the single exception of lạnh (SV lãnh) 冷 (lěng) ‘cold’). Clearly,
however much lexical influence Vietnamese has received, it has not been
relexified and has retained a significant amount of vocabulary of Mon-Khmer
stock. Vietnamese does not even share characteristic vocabulary of the nearby
Chinese Yue languages, such as Cantonese.21 Beyond lexical borrowing, there
is only a minimal amount of syntactic borrowing. Were the Sino-Vietnamese
contact heavy enough, we might expect to see other kinds of grammatical,
structural aspects of Chinese in Vietnamese, but most of the grammatical
typological characteristics common to varieties of Chinese are in fact not
seen in Vietnamese. Notable characteristics of varieties of Chinese include
verbal compounds of ability (verb–得 (dé)–resultative) and direction (not-
ably, the verb plus two-syllable directional terms); the use of reduplication
(A-not-A) in questions; the Chinese style of reduplication of the second syl-
lable in two-syllable terms (ABB); the use of post-positional locative nouns;
and the position of modifiers, possessives and demonstratives before head
nouns. While Vietnamese does make use of a substantial number of sentence
final particles with modal properties to express politeness, assertion, com-
mands, and others, there is no strong evidence suggesting that they have come
from Chinese.22
As can be seen, Sino-Vietnamese borrowing did not lead to the borrow-
ing of characteristic aspects of most varieties of Chinese. Moreover, internal
innovation, rather than borrowing in a direct sense, has led to the
development of grammatical functions of Sino-Vietnamese vocabulary that originally
lacked those specific semantico-syntactic characteristics. Questions remain
about the nature of the contact through which borrowing occurred. Was this
borrowing primarily through spoken contact or from texts that the Vietnamese
elite spread into spoken Vietnamese? In fact, it appears that the borrowing
has been only partially through spoken contact. It seems that, though
bilingualism in Chinese and Vietnamese has clearly played a part in this contact
situation, a significant amount of Chinese was likely transmitted into
Vietnamese from Chinese writings by Vietnamese leaders, scholars, and other
socially influential figures, a situation which best accounts for
why colloquial grammatical elements common to varieties of Chinese did not
enter Vietnamese.
Notes
1. Sources of data regarding borrowing into Vietnamese are mentioned when rele-
vant. Otherwise, the data comes from the author’s research, and any discrepan-
cies are the author’s responsibility.
2. The simplified term ‘Chinese’, as used in this chapter, can refer variously to the
entire group of Sinitic languages, various subgroups within the Sinitic branch
of Sino-Tibetan, or the written language used by all of those groups. Specific
usages of ‘Chinese’ for the arguments in the chapter are noted as needed.
3. The term Kinh is itself Chinese in origin, jīng 京.
4. See Taylor (1983) for a more detailed discussion of the first thousand years of
Sino-Vietnamese contact.
5. The issue of borrowing of Vietnamese elements has not been explored exten-
sively in linguistic literature.
6. No records exist to demonstrate whether or not the Vietnamese had developed
an indigenous writing system before the arrival of the Chinese, as some have
suggested.
7. Most often, as in Chinese, Chữ Nôm characters use a combination of Chinese
radicals, one with a phonetic element and one with a semantic element. See
Nguyễn D. H. (1990) for more discussion.
8. See Maspero (1912), Ferlus (1981, 1992), and Nguyễn (1995) for more discus-
sion on the historical reduction of initial clusters and development of initials in
general in Vietnamese.
9. Final /-p/ is a native sound in Vietnamese. Initial /p-/ was lost most likely due to
changes in Chinese from *p to *f and perhaps partly due to merging with earlier
Vietnamese *β.
10. This is not unlike the variation in semantic properties of the same classifier in
different varieties of Chinese.
11. In Alves (2005), this was linked to the generic Chinese classifier 個 (gè), which is
less likely the source of Vietnamese cái. Nevertheless, the phonetic and semantic
similarities and the dominant use of 個 (gè) in varieties of Chinese do put
into question whether gài 丐 or gè 個 is the source.
12. Another possibility is that it is a retention of the original Mon-Khmer word
meaning ‘one’, which could have grammaticalized, as has the cognate in the
Mon-Khmer language Pacoh (Alves 2005: 76). If this hypothesis is valid, this could
account for the fact that Vietnamese một ‘one’ has an added /t/; it is an instance
of phonological distinction due to semantico-syntactic differences.
13. The Chinese form yī 伊, while not part of all spoken varieties of modern Chinese,
is the standard third-person pronoun in Taiwanese.
14. Not included in the list is the quintessential Vietnamese topic–comment linking
thì, which may have developed from the homophonous Chinese form 時 (shí)
meaning ‘time’. The origins of this form will remain speculative until more ex-
plicit data become available.
15. While it is tempting to include Vietnamese trong ‘inside’ as a nativized form of
Sino-Vietnamese trung 中 ‘inside’, there is a competing Mon-Khmer etymon
(cf. Pacoh kallúng and Khmer knoŋ, with presyllabic telescoping to retroflex /ʈ/
in Vietnamese, as discussed in Section 2).
16. The French borrowings are certain since they are recent, phonologically close
to their source words, and clearly connected with modern cultural innovations.
While Tai-Vietnamese contact appears certain, Tai borrowings can only be de-
termined by a systematic comparison of reconstructions going back two millen-
nia or earlier, making such forms likely but far from certain candidates. Also,
the possibility that Tai borrowings happened after Vietnamese contact with the
Chinese cannot be excluded but is difficult to verify.
17. The general typological similarities between Vietnamese and other Southeast
Asian languages cannot be readily attributed to any other language and can only
be considered an areal effect with no clear, single direction of influence or
borrowing.
18. The modern Tai forms are given for convenience. These forms have been checked
for Proto-Tai forms using the ‘Proto-Tai’o’matic’ lexical database, which con-
tains a compilation of several Proto-Tai reconstructions, at http://crcl.th.net/.
19. See last paragraph in Section 2 on loanword phonology of Western words.
20. Studies on Sino-Vietnamese vocabulary and these layers of vocabulary include
the works of Wang (1948), Ðào (1979), Tryon (1979), and Pulleyblank (1981).
21. The exception is the verb “to see” thấy, most likely related to Chinese 睇 (tì),
Cantonese tái.
22. There are the Vietnamese sentence particles à, which expresses surprise, and
ạ, which expresses politeness. Chinese as well as other languages in the re-
gion have sentence particles with similar unmarked phonological material, es-
sentially eliminating the ability to determine whether borrowing has occurred.
One form of note is the Vietnamese sentence-final particle mà, which suggests
that what is said is something previously asserted and should be known by the
speaker. This is not unlike Mandarin Chinese ma 嘛. Again, however, the phono-
logical material is unmarked and harder to confirm as borrowed material.
References
Alves, Mark J.
2001 What’s so Chinese about Vietnamese? Papers from the Ninth Annual
Meeting of the Southeast Asian Linguistics Society, 221–242. Ed. Graham
W. Thurgood.
2005 Sino-Vietnamese grammatical vocabulary and triggers for
grammaticalization. The 6th Pan-Asiatic International Symposium on
Linguistics. 315–332. Hà Nội: Nhà Xuất Bản Khoa Học Xã Hội (Social
Sciences Publishing House).
Barker, Milton E.
1969 The phonological adaptation of French loanwords in Vietnamese.
Mon-Khmer Studies 3: 138–147.
Benedict, Paul K.
1947 An analysis of Annamese kinship terms. Southwestern Journal of
Anthropology 3: 371–390.
Đào, Duy Anh
1979 Chữ Nôm: nguồn gốc, cấu tạo, diễn biến (Chu Nom: Origins, formation,
and transformations). Hà Nội: Nhà Xuất Bản Khoa Học Xã Hội.
DeFrancis, John
1977 Colonialism and language policy in Viet Nam. New York: Mouton.
De Rhodes, Alexandre
1991 Từ Ðiển Annam-Lusitan-Latinh (Thường Gọi là Từ Ðiển Việt-Bồ-La).
Ho Chi Minh City: Nhà Xuất Bản Khoa Học Xã Hội. (First publ. 1651).
Ferlus, Michel
1981 Sự biến hóa của các âm tắc giữa (obstruentes mediales) trong tiếng Việt
(Changes of medial obstruents in Vietnamese). Ngôn Ngữ Học 1981/2:
1–21.
1992 Histoire abrégée de l’évolution des consonnes initiales du Vietnamien et
du Sino-Vietnamien. Mon-Khmer Studies 20: 111–125.
Gage, William W.
1985 Vietnamese in Mon-Khmer perspective. In: S. Ratanakul, D. Thomas,
and S. Premsrirat (eds.), Southeast Asian Linguistics Presented to
André-G. Haudricourt, 493–524. Bangkok: Mahidol University.
Diffloth, Gérard
1989 Proto-Austroasiatic Creaky Voice. Mon-Khmer Studies 15: 139–154.
1990 Vietnamese as a Mon-Khmer language. Papers from the First Annual
Meeting of the Southeast Asian Linguistics Society, ed. by Martha
Ratliff and Eric Schiller, 125–139.
Haudricourt, André G.
1954 Sur l’origine de la ton de Vietnamien. Journal Asiatique 242: 69–82.
1972 Two-way and three-way splitting of tonal systems in some far-eastern
languages. In: J. G. Harris and R. B. Noss (eds.), Tai Phonetics and
Phonology, 58–86. Bangkok: Central Institute of English Language.
Jones, Robert B.
1969 Classifier constructions in Southeast Asia. Journal of the American
Oriental Society 90 (1): 1–12.
Maspero, Henri
1912 Études sur la phonétique historique de la langue Annamite: Les initiales.
Bulletin de l’École Française d’Extrême-Orient 12: 1–127.
Matisoff, James A.
1991 Areal and universal dimensions of grammaticalization in Lahu. In:
E. Traugott and B. Heine (eds.), Approaches to Grammaticalization,
Volume 1, 383–453. Amsterdam/Philadelphia: John Benjamins.
Mei, Tsu-Lin
1970 Tones and prosody in Middle Chinese and the origin of the rising tone.
Harvard Journal of Asiatic Studies 30: 86–110.
Nguyễn, Đình Hoà
1957 Classifiers in Vietnamese. Word 13 (1): 124–152.
1966 Vietnamese–English Dictionary. Rutland, Vermont: Charles E. Tuttle
Co.
1990 Graphemic borrowing from Chinese: The case of Chữ Nôm, Vietnam’s
demotic script. Bulletin of the Institute of History and Philology,
Academia Sinica 61: 383–432.
Nguyễn, Phú Phong
1996 Negation in Vietnamese and in some of the Viet-Muong Languages.
Pan-Asiatic Linguistics: Proceedings of the Fourth International
Symposium on Languages and Linguistics, 8–10 Jan. 1996, Vol. II: 563–568.
Thailand: Mahidol University at Salaya, Institute of Language
and Culture for Rural Development.
Nguyễn, Tài Cẩn
1995 Giáo trình lịch sử ngữ âm tiếng Việt [Textbook of Vietnamese historical
phonology]. Hà Nội: Nhà Xuất Bản Giáo Dục.
Nguyễn, Tài Cẩn, and Hoàng Dũng
1994 Về các từ gốc Hán cổ tiếng Việt xử lý bằng thuỷ âm tắc bên (lateral
stops) [On the Vietnamese words of Old Chinese origins with lateral
stops]. Ngôn Ngữ 1994 (2): 1–7.
Nguyễn, Văn Lợi
1988 Sự hình thành đối lập đường nét thanh điệu bằng/không bằng trong
các ngôn ngữ Việt-Mường (Trên tư liệu tiếng Arem và Rục) (The features
of the comparative formation of the level/non-level tones in some
Viet-Muong languages (in materials of the Arem and Ruc languages)).
Ngôn Ngữ 2: 3–9.
Norman, Jerry
1988 Chinese. Cambridge University Press.
Pulleyblank, Edwin G.
1981 Some notes on Chinese historical phonology. Bulletin de l’École
Française d’Extrême-Orient 277–288.
Sinh, Vinh
1993 Chinese characters as the medium for transmitting the vocabulary
of modernization from Japan to Vietnam in Early twentieth century.
Asian Pacific Quarterly 25 (1): 1–16.
Taylor, Keith W.
1983 The Birth of Vietnam. Berkeley: University of California Press.
Thomason, Sarah Grey, and Terrence Kaufman
1988 Language Contact: Creolization and Genetic Linguistics. Berkeley
and Los Angeles: University of California Press.
Thurgood, Graham
2002 Vietnamese and tonogenesis: Revising the model and the analysis.
Diachronica 19 (2): 333–363.
Tryon, Ray
1979 Sources of middle Chinese phonology: A prolegomenon to the study
of Vietnamized Chinese. MA thesis, Southern Illinois University.
Wang, Li
1948 Hanyueyu yanjiu (Research on Sino-Vietnamese). Lingnan Xuebao
9.1: 1–96. (Repr. 1958 in the Hanyushi Lunwenji. Beijing Kexue
Chubanshe, 290–406.)
Recent grammatical borrowing into
an Australian Aboriginal language:
The case of Jaminjung and Kriol
Eva Schultze-Berndt
1. Introduction
This chapter deals with the nature and extent of grammatical borrowing into
an Aboriginal language of northern Australia from Kriol (also called Roper
River Kriol, ROP). This is an English-lexified Creole that has arisen relatively
recently out of a colonial contact situation and now functions as a lingua
franca between indigenous people throughout a large area of northern-cen-
tral Australia. The recipient language is a dialect cluster comprising the two
closely related and mutually intelligible varieties Jaminjung and Ngaliwurru
(henceforth Jaminjung for short; DJD). They are the only remaining members
of the Jaminjungan or Yirram subgroup of the Mirndi family, one of the Non-
Pama-Nyungan language families.
All observations are based on my own (ongoing) fieldwork on the lan-
guage since 1993. Although Jaminjung has to be considered a severely en-
dangered language (there are possibly only around 50, mostly elderly, fluent
speakers today), it will be argued that, for the remaining fully competent
speakers, the extent of grammatical borrowing from the dominant language
Kriol is limited to function words, especially in the domain of connectors and
discourse-structuring devices, and thus fits in with the predictions made in
the literature for the accessibility of grammatical morphemes to borrowing.
Lexical borrowing, unsurprisingly, also occurs; the integration of verbal loans
is discussed here in some detail because it exposes an interesting feature of
the recipient language. In contrast, borrowing of phonological features and
of structure (“pattern”) does not occur.
The issue is made more complicated, however, by the many phonological
and grammatical similarities between Kriol and Jaminjung which are argua-
bly not the result of borrowing or “adoption” (in the sense of Johanson 2002)
but rather of substrate influence (“imposition”) on Kriol from languages simi-
lar to Jaminjung, if not Jaminjung itself. Although the degree of substrate
influence in Creole languages continues to be a debated topic, I will assume
substrate influence here if the phenomenon in question is not attested in the
lexifier language, English, but is widespread (in terms of “pattern”, not “mat-
ter”) within the linguistic area of which Jaminjung is a part. Such features in-
clude, for example, the phoneme inventory (Section 4), the absence of a cop-
ula, the existence of general subordinate clauses (“adjoined relative clauses”)
in the functions of both nominal modification and adverbial, the pronominal
system, the semantics of case categories (see Section 5 for an example), and
the semantics of TAM categories (see Section 6 for an example).
A second issue that arises more generally in a study of grammatical bor-
rowing is the question of how to distinguish borrowing from code-switch-
ing. Code-switching to Kriol is common in the everyday language use of the
remaining Jaminjung speakers, since Kriol can be considered the
dominant language even in in-group interaction in terms of its frequency of
use (see also Section 2). While I do not claim to have solved this problem,
and will raise it again in the concluding section, I attempt here to at least
make explicit the criteria according to which certain items have been con-
sidered borrowings rather than switches. One often-cited criterion, that bor-
rowings are used even by monolingual speakers of the recipient language,
is obviously not applicable, since there are no monolingual speakers. On
the other hand, the criterion that only those items should be considered bor-
rowings that have no equivalent in the receiving language seems too strict
to account for the actual patterns observed. The criterion of recognisability
also seems somewhat too strict, since Jaminjung speakers usually recognise
a word as “English”, and point this out in situations which are likely to trig-
ger purist sentiments, such as dictionary production. The main criteria em-
ployed here to identify borrowings as opposed to switches are therefore the
following:
(a) The item is used frequently and across different speakers.

(b) The item regularly occurs in utterances with Jaminjung as a matrix lan-
guage; the criterion for matrix language status, in turn, is the use of a
Jaminjung inflecting verb (see also Section 6.2 on the word class status
of verbs in Jaminjung). Kriol auxiliaries are never found in co-occur-
rence with an inflecting verb in Jaminjung.
(c) The item also occurs as a single-word item in utterances that otherwise
show no influence from or switches into Kriol, i.e. the item does not just
occur as part of whole phrases or clauses in Kriol. (Alternatively, these
instances could of course be regarded as insertional code-switching in
the sense of Muysken 2000).
(d) The item, when it occurs, is completely prosodically integrated (not as-
sociated with hesitation phenomena).
Sections 3 to 8 of this chapter provide a discussion of grammatical borrowings
from Kriol into Jaminjung, contrasted with similarities due to substrate
influences from northern Australian languages in Kriol. They are preceded
by an account of the sociolinguistic situation that brought the two languages
into contact (Section 2), and followed by a brief section on lexical borrowings
(Section 9) and concluding remarks (Section 10).
2. Sociolinguistic background
The traditional country of the Jaminjung and Ngaliwurru people is situated
roughly between the present-day settlements of Victoria River Crossing and
Timber Creek in the Northern Territory. The Jaminjung were hunter-gatherers
with a nomadic life-style; they engaged in trade and maintained ceremoni-
al and inter-marital relationships with members of neighbouring language
groups. The resulting multilingualism has led to linguistic convergence across
different language families similar to that described by Heath (1978) for East
Arnhem Land. Structural convergence manifests itself in the areas of phon-
ology and prosody, marking of grammatical relations, tense–aspect–mood
categories, complex predicate formation, complex sentence formation, the
use of discourse particles, and pragmatic conditioning of word order. A dis-
cussion of these earlier layers of borrowings is however outside the scope of
the present chapter.
The first people of European descent came to the area in 1834, and the es-
tablishment of cattle stations began in the 1880s. There can be no doubt that
the early contact history throughout northern Australia was in many cases
violent, including massacres on the part of the colonizers (Rose 1991), as
well as being accompanied by the spread of previously unknown diseases.
Sooner or later most of the indigenous inhabitants would be forced – by
actual coercion or by the desire for safety – to join the work force of the
cattle stations, as essentially unpaid labour. Almost all older Jaminjung and
Ngaliwurru speakers worked on cattle stations earlier in their lives, as stock-
men, cooks, builders, domestic workers, or police trackers. After the intro-
duction of equal wages, forced labour gave way to unemployment, and land
and power still remain largely in the hands of non-indigenous people.
The genesis and spread of Kriol and the ongoing process of language
shift from the dozens of Aboriginal languages of northern Australia to this
new language can be regarded as a consequence of the disruptions described
above. Kriol originates in the English-based Pidgin used between the first
colonizers and the indigenous inhabitants of the Sydney area (Troy 1993;
Tryon and Charpentier 2004) which subsequently spread inland and north,
and then further to the Pacific Islands. The records of the Pidgin used in the
Northern Territory in the late nineteenth and early twentieth century cited by Harris
(1986) bear a close resemblance to Kriol as it is spoken today, which in turn
has many similarities with Pacific Pidgin and Creole languages such as Tok
Pisin and Bislama.
A stabilization and standardization of the Pidgin in northern Australia
was brought about by the necessity of communication between the increas-
ing numbers of Aboriginal people working on the stations, the (primarily)
English-speaking pastoralists, and the non-English-speaking colonists. Ac-
cording to some authors (Munro 2005: Ch. 2), Kriol emerged, with concomi-
tant substrate influences, mainly as the result of this stabilization. Others,
in particular Harris (1986), assume that creolization occurred abruptly early
in the twentieth century at an Anglican mission at Roper River (close to the
present-day Ngukurr), which provided refuge to the survivors of several language
groups. Harris (1986: 306–312) argues that a peer group of children,
who lived in dormitories and were thus separated from adults for large parts
of the day, needed a common language, and adopted and creolized the exist-
ing Pidgin. World War II and later the collapse of the pastoral industry led to
increasing mobility which favoured the adoption of Kriol as the lingua franca
it is today (Munro 2000).
If Harris’ account of creolization at Roper River is correct but if one allows,
as Munro (2005) does, for substrate influence, seven languages from
four different Non-Pama-Nyungan language families (Marra, Alawa, Warndarrang,
Ngandi, Ngalakgan, Nunggubuyu, and Mangarrayi) are the plausible
substrate languages originally spoken at Roper River Mission (Harris 1986:
230–233). All of these are unrelated (or at least not demonstrably related) to
the Mirndi family which includes Jaminjung. One therefore has to distinguish
substrate influence during the alleged process of creolization, and substrate
influence in other regions of northern Australia to which the new language
subsequently spread. Strictly speaking, it is only in the latter sense that Jam-
injung can be counted as one of the substrate languages. However, due to the
areal convergence already mentioned, Jaminjung shares many of the features
of the above languages. Of the features associated with substrate influence
on Kriol by Munro (2005: Ch. 5), these are e.g. a minimal/augmented pro-
nominal system, a distinction between punctual–perfective and continuous–
imperfective in the past tense, a modal character of the “future” marker and
a system of locative, possessive and instrumental cases.
While Kriol as spoken in Roper River is the best-described variety, the
existence of regional varieties is described by authors such as Hudson (1983),
Sandefur and Harris (1986) and Rhydwen (1993, 1996). Munro (2000) ar-
rives at the conclusion that the differences between them are slight and most-
ly limited to small differences in the phonemic inventory, in phonetics and
prosody, and in the lexicon. This is confirmed by my own comparison of
published descriptions and texts of Roper River Kriol (Sandefur 1979, 1991;
Sandefur and Sandefur 1981, Munro 2000, 2005) with the Kriol spoken by
Jaminjung speakers. However, a more detailed study of the use of grammat-
ical constructions and a comparison with the traditional languages spoken in
the region might well reveal more differences between the varieties than the
fairly superficial comparisons undertaken so far.
As already mentioned, Kriol is now the dominant language for most Jam-
injung speakers, as it is used not only as a lingua franca between members of
different language groups, but also as the in-group language of cross-generation
communication. Moreover, a growing number of first-language speakers
of Kriol are monolingual, or bilingual in Kriol and English. All of the remaining
Jaminjung speakers are bilingual in Kriol, and usually also speak at least one
other indigenous language of the region. Even when the traditional languages
are spoken among members of the older generations, code-switching with
Kriol is very common, and has been for some time (cf. McConvell 1988).
None of the older speakers is literate in either Jaminjung or Kriol, since
both have only ever been used as an oral medium of communication within
the family and the larger community, and neither is used in the media or in
education, at least in the area under consideration. In education and written
communication (the role of which in daily life is, however, limited), English
is used exclusively. Acrolectal Kriol or Aboriginal English are used by many
people, especially younger people, to speak to outsiders, and English is gen-
erally understood, but in the area under consideration, most Kriol speakers’
active command of standard English is limited. Consequently, grammatical
and lexical influence in Jaminjung is quite clearly only from Kriol, not from
English.
In concluding this section, it may be worth pointing out that Kriol is not a
language of prestige and power, in multiple respects. While non-indigenous,
English-speaking people still tend to consider Kriol a degenerate form of
English, indigenous people, especially the older ones, tend to equate Kriol
with English and to perceive the use of the language as a threat not only to
the traditional languages, but also to their identity (see e.g. Schmidt 1990:
113). This has not stopped the considerable spread of Kriol, which now has
several thousands of speakers, as it does, paradoxically, fulfil a function as a
symbol of indigenous identity in contrast to the dominant language, English,
especially among younger people (Schmidt 1990: 111–115, Munro 2000).
In the remaining sections of this chapter, I will discuss the extent of Kriol
borrowings into Jaminjung in the domains of phonology (Section 3), typo-
logical features (Section 4), nominal structures (Section 5), verbal structures
(Section 6), other parts of speech (in particular function words, Section 7),
constituent structure and other syntactic patterns (Section 8) and the lexicon
(Section 9). The findings are summarized in Section 10.
3. Phonology
While the phoneme inventories and phonotactic constraints of Kriol and Jam-
injung are very similar, this is quite certainly not due to borrowing but to sub-
strate influence. For example, like most Aboriginal languages of the area, the
basilectal variety of Kriol lacks fricatives. Thus, Kriol lexemes derived from English
lexemes with the phoneme /s/ have an alveo-palatal stop (e.g. jup
< Engl. soup), which in turn does not exist in English. (Speakers of Kriol may well
adopt an acrolectal register, including a phonological system which is more
similar to English, when interacting with English speakers.) Similar observations
can be made for prosody, although there has not been much research on
the prosody of either Kriol or the traditional Victoria River languages.
4. Typological features
Jaminjung and Kriol have quite distinct typological characteristics, and again,
no contact influence from Kriol to Jaminjung seems to have taken place as
far as these are concerned (but see below on ergativity). As a Creole, Kriol is
mostly isolating, with little derivational morphology and, arguably, no inflec-
tional morphology. TAM categories are marked periphrastically, by auxiliar-
ies such as the past perfective marker bin. The basic constituent order is SVO
(see also Section 8), and core arguments in transitive clauses are distinguished
by word order only, following a nominative–accusative system. Subjects are
always overtly expressed, at least by a pronoun. Oblique arguments and adjuncts
are marked by prepositions (e.g. the comitative preposition gotim and
the locative preposition la). Example (1), which was offered by a speaker as
a translation of (2) below, illustrates these basic characteristics.
(1) thei bin oldei faind-im, thet.. goana, (...)
3pl aux.pst habit find-tr dem goanna
‘They used to find the goanna (the dogs did).’
In contrast, as illustrated in (2), the basic morphological type of Jaminjung
is agglutinating to fusional, and the language has a rich system of verbal
inflections. Bound pronominals and mood markers are prefixed to the verb
stem, whereas tense–aspect marking is achieved by suffixation or stem sup-
pletion. The constituent order, on the sentence level at least, is determined by
pragmatic considerations only; compare the two lines of (2). Arguments and
adjuncts take case markers which probably have to be analysed as enclitics
rather than suffixes. The alignment type is ergative; the absolutive case is un-
marked (and is left unglossed in the examples).
(2) burra-ngayi-na yirrag wirib-di jarl, malajagu (...)
3pl:3sg-see-impf 1pl.excl.obl dog-erg track.down goanna
jarl burra-ngayi-na yirrag, malajagu.
track.down 3pl:3sg-see-impf 1pl.excl.obl goanna
‘They used to track them down for us, the dogs, the goannas (...) they
tracked the goannas down for us.’
As in a number of other Australian languages, ergative marking is “optional”
in that around 30 percent of transitive agents are unmarked. The possibility
that “optionality” of ergative marking is a result of contact, as suggested
for other contact situations e.g. by Schmidt (1985), Bavin and Shopen (1985)
and Meakins and O’Shannessy (to appear), cannot be excluded with certain-
ty, for lack of historical data. On the other hand absence of ergative marking
has a semantic and pragmatic effect similar to that described by McGregor
(1992, 1998) for Warrwa and Gooniyandi, here considered as an internal fea-
ture of the languages.
A further notable feature of Jaminjung and surrounding languages, which
distinguishes them from Kriol, is a part of speech system where inflecting
verbs such as -ngayi ‘see’ in (2) form a closed class, but can be combined
with members of an open class of “uninflecting verbs”, of which jarl in (2) is
an example. This phenomenon will be discussed in more detail in the context
of integration of loan-verbs in Section 6.2.
5. Nominal structures
No influence from Kriol has been discerned in the word order within Jam-
injung noun phrases, which is relatively free. Nominal inflectional morph-
ology is absent from Kriol anyway, and derivational morphology such as
the nominalizing suffix -bala, has not been borrowed into Jaminjung. Very
rarely, Kriol prepositions appear to be “doubling” a Jaminjung case marker;
this is a phenomenon in need of further investigation. With the exception of
ergativity (see Section 4), the semantics of Kriol cases (expressed by prepos-
itions) and Jaminjung cases (expressed by suffixes/enclitics) is very similar,
but this is quite clearly a substrate phenomenon and not due to borrowing.
One example is the use of a general locative case in Jaminjung, shown in (3),
which translates as the general locative preposition la in Kriol, illustrated in
(4). Unlike English spatial prepositions, both Jaminjung -gi and Kriol la are
unspecific about the relationship, e.g. containment or attachment, between
located object and ground object.
(3) durlurl-ma ga-yu, jalbud-gi.
make.noise-cont 3sg-be.prs house-loc
‘He/she is making a noise in the house.’
(4) (...) wan thei meik nois la haus
rel 3pl make noise loc house
‘when they make a noise in the house’
6. Verbal structures and the integration of verbal loans
6.1. Verbal structures
What was said in Section 5 for nominal structures also holds for verbal struc-
tures. No grammatical morphemes associated with the verb in Kriol are bor-
rowed into Jaminjung. Structurally, too, the complex predicate system of
Jaminjung is clearly different from the auxiliary-verb system of Kriol. Semantically,
there are many similarities between the categories expressed by
function words in Kriol and by affixes or stem suppletion in Jaminjung, but
this can be put down to substrate influence, not borrowing. An example is the
expression of habituality by an imperfective suffix in Jaminjung in (2) and by
the function word oldei in Kriol in (1). The examples in (5) and (6) illustrate
the past potential category, formed by a combination of the past imperfective
(here expressed by stem modification) and potential prefix in Jaminjung, and
by a combination of past auxiliary bin and potential auxiliary wanna (< Engl.
want to) in the Kriol translation.
(5) jurruny-ni yirr burru-bila.
hand-erg/instr pull 3pl:3sg-pot:get/handle.impf
‘They were going to pull it with their hands (a cart).’
(6) wal thei bin wanna pul-im intit.
well 3pl aux.pst pot pull-tr tag
‘Well, they were going to pull it, weren’t they?’
6.2. The integration of verbal loans
Jaminjung speakers not only borrow Kriol nominals, but also verbs quite ex-
tensively (see Section 9). Verb integration exhibits some interesting features
due to the nature of the verbal system of Jaminjung, which (like that of sur-
rounding languages, see e.g. Dixon 2001, McGregor 2002, Schultze-Berndt
2003) relies on the combination of two distinct parts of speech. Inflecting or
generic verbs inflect for person and tense–aspect–mood (see also Section 4),
but form a closed class of around 35 members. Most concepts for actions
and states are expressed by members of a distinct part of speech (here called
“uninflecting verb”, but also termed “preverb” or “coverb” in the literature),
which do not inflect and form an open class. Lexical items which function
as verbs in Kriol are integrated into Jaminjung as uninflecting verbs, and,
just like their native equivalents, form complex verbs in combination with a
semantically appropriate inflecting verb. The inflecting verb therefore functions as a verbal
classifier (see Schultze-Berndt 2000 and McGregor 2002 for detailed arguments)
at the same time as functioning as integrator, and as an indicator of
verbness itself. For example, concepts of physical or psychological manipulation
involve the transitive verb -angu, glossed as ‘get/handle’ in (7) (compare
(14) and (23)), and motion concepts involve one of the inflecting verbs
of motion, the most common one being -ijga ‘go’ in (8) (compare (24)).
(7) jalig-di.. lukabta-im bun-ngangu ngidbud-gi \
child-erg look.after-tr 3pl:1sg-get/handle.pst night-loc
‘The children looked after me at night.’
(8) janyungbari-bina yagbali-bina, shift-im yirr-ijga-ny \
another-all place-all shift-tr 1pl.excl-go-pst
‘To another place, we moved over.’
The inflecting verb also expresses valency (since the paradigm of pronominal
affixes distinguishes transitive from intransitive verbs); for example, the un-
inflecting verb shiftim shown in (8) can also combine with the transitive verb
-arra ‘put, cause to be in a location’, to express the meaning of ‘shift/move
something over (tr)’. Formally, this system can be described as periphrastic
marking by a combination of a loan-verb stem with an additional marker.
The form that is borrowed is an uninflected stem (but could hardly be any-
thing else since there are no verbal inflections in Kriol). A small number of
derivational suffixes (the transitivity marker -im and the progressive/continu-
ous suffix -(a)bat) are borrowed with the stem but are not productive in the
recipient language. The strategy of integrating Kriol verbs described here is
extremely frequent and productive.
7. Other parts of speech
The borrowing of “matter” from Kriol into Jaminjung affects lexical items
as well as some function words; bound grammatical morphemes are not bor-
rowed, and neither are pronouns, adpositions and auxiliaries. (It should be
pointed out that the borrowing of auxiliaries has been ruled out by defin-
ition here, since utterances containing a Kriol auxiliary in combination with
Jaminjung lexical items were regarded as essentially Kriol with switching
to Jaminjung, according to the criteria set out in Section 1.) For occurrences
of Kriol pronouns and adpositions, there are examples where the boundary
between borrowing and code-switching is perhaps less clear – for example,
I have found a few examples of isolated Kriol adpositions in a sentence that
was otherwise Jaminjung – but these are very infrequent in the data.
The remainder of this section is devoted to a discussion of borrowed gram-
matical morphemes, including particles functioning as connectors (7.1.),
subordinating conjunctions (7.2.), focus particles (7.3.), negators (7.4.), and
discourse-structuring devices (7.5.), discussed in turn below. All lexical borrowings
are discussed in Section 9, including the equivalents of English adverbs
and numerals, since both in Kriol and in Jaminjung these belong to the
nominal category and not to any minor part of speech.
7.1. Connectors
The borrowed additive particle en (< Engl. and) and the disjunctive particle o
(< Engl. or) may be used both as clause and NP connectors, illustrated only
for en in (9) and (10). Since Jaminjung does not have additive or disjunctive
connectors, these borrowings appear to fill a structural gap. The tradition-
al strategy of juxtaposition continues to be used frequently though, both in
Jaminjung and in Kriol.
(9) yawayi, buji jalig birang thanyung warrng,
yes cond child behind another walk
en ngiyi jungulug jwinging ga-yu, yuno, hai.
and here one swinging 3sg-be.prs you.know high
‘Yes, if another child is walking behind, and here one is swinging,
you know, high, (then the one might get hit by the swing).’
(10) yurrg-bayan burra-rra-nyi yirrag warlarladbari-ni,
talk-cont 3pl:3sg-put-impf 1pl.excl.obl rdp:old.man-erg
en mululurru-ni \
and rdp:old.woman-erg
‘They were talking to us, the old men and old women.’
The Kriol borrowing ani (< Engl. only) is frequently used as a contrastive
connector, i.e. as a translation equivalent of English but, as in (11). In this
case, however, the borrowing does not fill a structural gap, but replaces a cor-
responding Jaminjung contrastive particle yiga, illustrated in (12). In some
instances, the Jaminjung particle bugu ‘just, only’ is also used in a contrastive function,
as in (13), in addition to its more common restrictive function. The semantic
extension from a restrictive particle ‘only’ to a contrastive connector is thus
likely to be due to substrate influence.
(11) jarlwab nga-mama-na ani jarlig guyawud burr-agba:, ngilijja:,
save 1sg:3sg-have-impf only child hungry 3pl-be.pst crying
‘I was going to save it (the food) but the children were hungry, (and)
crying.’
(12) majani yawurr-irdbaj! yiga langa-marnany jubard \
maybe irr:3pl-fall but ear-priv shut
‘(We tell them) “(…) you all might fall!” – but they don’t listen, (as
if) deaf.’
(13) yathang gujang ngarrgina Nangari,
right mother 1sg:poss subsection
bugu=biya bulgarding, Jimij-nyunga
just=now father subsection-source
‘All right, my mother is a Nangari, only/but the father is (the child of
someone) from the Jimij subsection.’
7.2. Subordinating conjunctions
No subordinating conjunctions exist in Jaminjung for finite clauses except
for the general subordinator =ma which is a second position clitic, and the
postverbal conditional clitic =wunthu ‘if’. The latter is sometimes, but rarely,
replaced with the Kriol equivalent buji (< Engl. suppose) in clause-initial
position, as shown in (9) above. Other types of subordination, e.g. causal and
purposive clauses, involve case marking on uninflecting verbs in a non-finite
clause.
The only borrowed conjunction with a high frequency is the Kriol causal
conjunction dumaji (< Engl. too much), illustrated in (14). This is again a
case of a loan apparently filling a structural gap.
(14) mali rolimap yurra-ngu:, damarlung,
cloth roll:tr:up 1pl.incl:3sg-get/handle.pst nothing
gumard marring dumaji \
road bad because
‘We rolled up our blankets (in order to go on a camping trip), (but)
nothing (= we didn’t go after all), since the road was bad.’
7.3. Focus particles and phasal particles
The term “focus particle” is used here in the sense of e.g. König (1991) to
refer to particles with an additive, restrictive etc. function which are associ-
ated with a focal constituent. An additive function (‘also, too’) is fulfilled by
two Kriol borrowings in Jaminjung: tu, used much like its English source
too, and igen (< Engl. again), illustrated in (15). These also have a Jaminjung
equivalent, the clitic =gayi ‘also, too’.
(15) janyungbari ga-yu=ngardi \ janyungbari igen girrgirrlang \
another 3sg-be.prs=sfoc another too galah
ga-ruma-ny, Lijuna-ngunyi igen \
3sg-come-pst place.name-abl too
‘There is another one, another (mythological) galah as well, he came,
(him) too from Legune.’ (from a myth)
Kriol ani (< Engl. only) is used as a restrictive particle alongside the Jaminjung
equivalents bugu ‘just, only’ and =biji ‘only’, but, as shown in Section 7.1,
also as a contrastive connector.
(16) ngayin-marnany, ani darrmad
meat-priv only freshwater.crocodile
(Context: “bring us kangaroo meat!”) – “(There is) no (proper) meat,
only freshwater crocodile (meat)!”
A borrowed particle/clitic that comes closest to a pure focus marker is the
sequential discourse marker =na (< Engl. now). Its two, related, functions
are to mark a new (hence focal) event in a sequence, as in (17), and a shift in
topic (i.e. a topic sequence), as in (18) (see also Graber 1987). This does not
fill a structural gap, but rather appears in the same positions and with the same
functions as the Jaminjung clitic =biya(ng) (compare its use in (26), (29), and
(30)), which points again to substrate influence.
(17) burru-yu=na\ gurrany burr-angga bunug-bunug=na \
3pl-be.prs=now neg 3pl-go.prs rdp-steal=now
julany-nyunga,
smoke-source
(children, after being punished for stealing by “smoking” them) ‘They
stay (in one place) then, they don’t go stealing then, because of that
smoke.’
(18) nami=na, guny-b-ijga \ Bulla-bina
2sg=now 2du-pot-go place.name-all
‘You, you two will go, to Bulla.’
As a phasal particle, Kriol yet (< Engl. yet) is frequently used as a negative
polarity item as in English, illustrated in (19), but also in the meaning of ‘still’, illustrated in (20). For
both functions, there does not seem to be a Jaminjung equivalent except for
the much more strongly grammaticalized restrictive clitic =(w)ung (Schultze-
Berndt 2002), also shown in (20).
(19) gurrany yirrgbi-nyunga yet.
neg talking-source yet
‘(We have) not talked yet.’ (i.e. the speaker is aware that the linguist
was hoping to do some recording)
(20) mani ga-yu=wung yet.
money 3sg-be.prs=restr yet
‘Money is still there.’
7.4. Negation
The Kriol particles indicating a negative answer, na:(wu) or nomo ‘no’ are
used frequently in Jaminjung utterances. It is unclear whether the ‘yes’ par-
ticle, yawayi, common to several languages of the area, arose from contact
influence.
(21) na: jurlag ga-gba=ni the mugmug walthub jarriny-gi,
no bird 3sg-be.pst=sfoc there owl inside hole-loc
(Context: The boy thought “maybe the frog is here in the hole.”) ‘No,
a bird was there, an owl inside the hole.’ (from a Frog Story)
In contrast, the Kriol negation particles neba/neva (< Engl. never), employed
for sentence negation mainly in the past indicative, and nomo (< Engl. no
more), employed for sentence negation mainly in the nonpast and in non-
indicative moods, and for constituent negation, are never used in Jaminjung
utterances in my data, with one crucial exception: there are a number of ex-
amples, including (22), where the particle nomo is used for metalinguistic
negation instead of the Jaminjung general negative particle gurrany.
(22) wirr gani-ma-m jarriny-ngunyi, nomo dibard \
move.out 3sg:3sg-hit-prs hole-abl neg jump
‘It comes out of the hole, not “it jumps”.’ (about a frog, correcting
ESB’s suggestion of dibard ganiyu ‘it jumped out’)
Jaminjung and Ngaliwurru also have the special independent negative forms
damarlung (J) and gara (Ng). They are used on their own as a negative an-
swer to a request, but may also follow negated statements for emphasis, or
follow positive sentences to indicate that the action described did not lead to
the desired result. The latter function is illustrated in (14) for damarlung, and
in (23) for its Kriol equivalent, najing/nathing (< Engl. nothing). Despite the
existence of the Jaminjung forms, najing is employed so frequently that it can
justifiably be regarded as a borrowing.
(23) wardany-ni=ma skrejimbat yirra-ngu,
hand-erg=subord scratch:tr:cont 1pl.excl:3sg-get/handle.pst
nathing \
nothing
‘We were scratching (i.e. digging) with our hands, (but) to no avail.’
(looking for yam)
The frequency of use of najing is in striking contrast to the absence of the
Kriol sentence negator in Jaminjung clauses, mentioned above. The explanation
may be that najing, in addition to expressing the concept of a negative or unsuccessful
action, also functions as a discourse structuring device, since like
its Jaminjung equivalents it usually occurs in an intonation unit of its own.
Further borrowings of this nature are discussed in the next subsection.
7.5. Discourse-structuring devices
One of the most striking uses of Kriol borrowings is as discourse-structuring
and paragraph-marking devices – even though in most cases, Jaminjung
equivalents exist. Subsumed under this heading are tags, fillers, and para-
graph-boundary markers.
The Kriol tag (y)intit (< Engl. isn’t it) is ubiquitous in everyday speech
although it also has equivalents in Jaminjung (ngi’) and Ngaliwurru (ngali).
(Y)intit is used irrespective of the polarity or the verb of the utterance, or
the person and number of the subject; an example is (6) in Section 6.1. A
further tag, yuno (< Engl. you know), illustrated in (9), occurs much less
frequently.
The marking of an end of a paragraph or a transition frequently involves a
switch to Kriol. For example, the Kriol particles binij (< Engl. finish), which
marks the end and culmination of an episode, and thetsol (< Engl. that’s all),
which has a more metalinguistic function in indicating a switch in overall

topic, are used very frequently, even though an equivalent (yathang ‘finished,
enough, all right’) does exist in Jaminjung. The use of binij is illustrated
in (24).
(24) aligeita=biya mit-im=ung ga-ruma-ny=nu,

alligator=now meet-tr=restr 3sg-come-pst=3sg.obl
jum gan-uga \ binij \
grab 3sg:3sg-take.pst finish
‘the alligator came to meet it (a kangaroo), and grabbed it, finished’
Another particle, olrait (< Engl. all right), marks a transition between two
clauses which have a stronger thematic coherence, e.g. where the continua-
tion is a natural consequence of the beginning, as in (25).
(25) bard-bard nganth-arra-m larriny-ni,

rdp-cover 2sg:3sg-put-prs paperbark-erg/inst
olrait, mirrbba ga-ngga gunjarlg-di.
all.right buried 3sg-go.prs ground-erg/inst
‘You cover it up with paperbark, all right (i.e. when that’s done), it is
covered with earth.’ (meat to be roasted)
Another Kriol particle used to establish cohesion is wal, used in a similar

manner to its source in the lexifier language, well, but without a clear
equivalent in Jaminjung. The following example is from a myth about Emu and
Brolga, where Emu incites Brolga to kill all her children but two, falsely
pretending that she herself only has two children. The particle wal here indicates
a shift in local topic from Emu to Brolga.
(26) ngayug=guji jirrama ngawuny-nganja, jalig, jirrama,

1sg=first two 3sg:3du-take.prs child two
gani-yu=nu, gumurrinji-ni \
3sg:3sg-say/do.pst=3sg.obl emu-erg
wal gudarrg-di=biya jalig burrb ganurru-mangu niwina \
well brolga-erg=now child finish 3sg:3pl-hit.pst 3sg.poss
‘I already have (lit. ‘take’) (only) two children, two, she said to her,
the emu did. Well, the brolga then killed all her children (except for
two).’
To sum up this section: except in a few cases, borrowed function words were
shown not to fill structural gaps, but to correspond fairly closely to existing
equivalents in Jaminjung, much more so in any case than to the source item
in the lexifier, English. This points to a scenario of substrate influence in
terms of “pattern” on the Kriol forms, and subsequent borrowing of “mat-
ter” to Jaminjung, partly replacing the equivalents. The question about the
motivation for such borrowings will be addressed in the concluding section
(Section 10).
8. Constituent order and other syntactic patterns
Information structure and its influence on constituent order has scarcely been
studied for either Kriol or Jaminjung and other traditional languages of the
area. On the surface, Jaminjung has pragmatically conditioned word order,
while Kriol has the fixed basic constituent order SVO. However, Kriol allows
for certain alternative word orders which appear to show substrate influence.
To give just one example, both Jaminjung and Kriol allow for discontinuous
or “split” noun phrases. One of the contexts in which these occur – in both
languages – is relatively easily identified. This is in an annuntiative thetic
sentence (Sasse 1987, 2006), i.e. an utterance announcing the presence or
appearance of some entity or situation “out of the blue”. Discontinuous noun
phrases – with one component appearing before and the other after the verb
– are used to emphasize a quality of the entity whose existence or appearance
is announced. The Jaminjung and Kriol examples in (27) and (28) are transla-
tion equivalents, produced on the same occasion by the same speaker.
(27) warrgad ga-ram=yirrag mayi.

long 3sg-come.prs=1pl.excl.obl man
‘A tall man is coming for us.’
(28) longbala kaming thet men theya.

long/tall coming that man there
‘A tall man is coming there.’
No influence of Kriol on Jaminjung could be discerned in terms of
constituent order, or for that matter on any other syntactic patterns,
except where a pattern is created by the borrowing of one of the function words
for which no Jaminjung equivalent exists (Section 7). Another potential area
of influence, mentioned in Section 4, is optional ergativity. The overall im-
pression is that similarities of pattern between the two languages are due to
substrate influence, not borrowing.
9. Lexical borrowings
As already pointed out, and illustrated in many of the examples, Kriol lex-
ical items are frequently borrowed into Jaminjung, although the boundary
between borrowing and single-item or insertional code-switching is difficult
to draw. I am not aware of calques in Jaminjung on the basis of Kriol expres-
sions. Borrowed individual lexemes include verbs as well as nominals; the
latter category also includes items that are numerals, adjectives, or spatial and
temporal adverbs in English.
9.1. Nominals
It comes as no surprise in a situation of relatively recent contact between two

cultures with a very different material basis that nominals borrowed from
Kriol into Jaminjung include many names for introduced animals, plants,
tools and other artefacts, such as buliki ‘cow, cattle’, barrigi (< English pad-
dock) ‘fence’, kroba ‘crowbar’, jangayi ‘shanghai; sling shot’, teip ‘tape re-
corder’, and eroplein ‘aeroplane’. On the other hand, many Jaminjung coin-
ages or semantic extensions of existing words are in use for introduced items
alongside Kriol words – e.g. diwu-ngarna ‘plane’, lit. ‘fly-assoc’ =
‘thing for flying’, jalwany-ngarna ‘tape recorder’, lit. ‘talk-assoc’ = ‘thing
for talking’, wagurra ‘rock, money’ (Kriol mani), bagarli ‘paperbark; paper,
paper money’ (Kriol peipa) or jingil ‘plant juice, meat juice, soup; petrol/
diesel’ (Kriol petrol).
There are also some clear instances of borrowing of temporal nominals
(including those originating in English adverbs). Days of the week are gener-
ally borrowed (including expressions such as penjin dei ‘the day of the pen-
sion payments’, and peidei ‘pay day’). Kriol expressions are used for times of
the day but alongside Jaminjung equivalents. The most frequent time-of-day
expression from Kriol is alibala ‘early, in the morning, the next day’ (from
English early + fellow).
(29) alibala=biyang gud buny-agba,

morning=now rise 3du-be.pst
‘In the morning then the two got up.’
Spatial nominals including deictics, on the other hand, are used but mostly
in those places identified as switches rather than borrowings, with a few pos-
sible exceptions such as (21) and (30).
(30) bily=biya ga-gba=nu tharrei=guji \

burst=now 3sg-be.pst=3sg.obl there=first
‘(The ulcer) burst on him only there (in hospital).’
Since Jaminjung only has numerals corresponding to ‘one’, ‘two’ and ‘three’,
it comes as no surprise that numerals are borrowed from Kriol. Interestingly,
however, ordinal numerals, which are completely absent from Jaminjung, do
not seem to be borrowed; in other words, ordinal numerals are not used at all.
Other Kriol quantifiers are used occasionally, despite the existence of Jamin-
jung equivalents, but their low frequency points to code-switching rather than
borrowing. One example is (31).
(31) janju binka-ni wuju.. plenti nga-manggu gagawuli,

that river-loc small plenty 1sg:3sg-hit.pst long.yam
‘At that small creek I got lots of long yam.’
9.2. Verbs
The borrowing of Kriol verbs into Jaminjung as uninflecting verbs was al-
ready discussed in Section 6.2 with respect to their integration. Borrowed
items include not only verbs for trade and money exchange such as bayim
‘buy:tr’, for activities related to Western technology such as kikap ‘kick start
(of motor)’ or to handling of cattle such as masterim ‘muster:tr’, but (as
shown e.g. in (7), (8), (23) and (24)) also verbs denoting traditional everyday
activities for which Jaminjung equivalents exist and continue to be used.
10. Summary and discussion
This chapter has provided an overview of lexical and in particular grammat-

ical borrowings from Kriol, an English-lexified Creole spoken in northern
Australia, to Jaminjung, a Non-Pama-Nyungan language of northern Austral-

ia. As demonstrated in Section 2, this situation is one of intense contact and
strong functional pressure from the dominant language, Kriol, although this
is not a language associated with power and prestige. The contact situation
with Kriol is special in another way, as language shift to Kriol set in not long
after the Creole itself came into being. It was argued here that Kriol shows
many influences in “pattern” – themselves areal features – from the substrate
languages, e.g. in its phonology, the categories expressed by grammatical
markers, and even constituent order; these were therefore not considered as
borrowing.
The functional pressure from Kriol has led to a fairly high frequency of
borrowings of “matter”, both of lexical and of grammatical morphemes. As
far as the lexical level is concerned, it was shown that both verbs and nom-
inals (including adjectival and adverbial nominals) are borrowed. Borrowing
is not restricted to terms for introduced items, but on the other hand core
vocabulary such as kinship terms is not borrowed. Verb integration relies on
a pre-existing system of classificatory inflecting verbs which are semantically
generic and form a closed class; Kriol verbs are integrated as uninflecting
verbs.
On the grammatical level, borrowing of “matter” is restricted to free or clit-
ic forms, including connectors, subordinating particles, particles associated
with focus, and discourse-structuring particles. A sentence negation marker
from Kriol was shown to be restricted to metalinguistic negation. Other free
forms, notably auxiliaries and adpositions, are not borrowed, at least accord-
ing to the criteria set out in Section 1. Bound grammatical forms are not bor-
rowed at all, not surprisingly in part since Kriol does not have inflectional
affixes. However, it does have derivational affixes which are not borrowed,
i.e. never occur in Jaminjung utterances outside Kriol lexical forms.
The findings thus confirm hierarchies of borrowability as previously sug-
gested in the literature (e.g. Weinreich 1968, Heath 1978, Thomason and
Kaufman 1988). More specifically, they confirm the high borrowability of
grammatical morphemes with discourse-regulating functions (Stolz and Stolz
1996, Matras 1998). This is all the more striking as only a few of these
borrowings – some sentence connectors and the phase particle yet – fill true
“structural gaps” in the recipient language. In many other cases, Kriol forms,
whose functions are very different from their English lexical sources and thus
likely to be modelled on the indigenous substrate languages, take the place
of existing Jaminjung equivalents. If the model proposed by Matras (1998) is
valid – he argues that the use of discourse-regulating markers from the dom-
inant language results from “fusion” of the respective grammatical systems

in this domain in the bilingual due to cognitive pressure – one may well ques-
tion whether it makes much sense to draw a strict boundary between code-
switching and borrowing in a situation like the one described here where all
Jaminjung speakers are bilingual in the dominant language, Kriol.
Abbreviations
Apart from the abbreviations used in interlinear glosses listed below, the symbol \ is
used to indicate a final (falling) intonation contour, and a comma to indicate a non-
final intonation contour, at a prosodic break.
1, 2, 3 1st, 2nd, 3rd person

abl ablative case (starting point of motion)
all allative case
assoc associative derivational suffix (“associated with X”)
aux auxiliary
cond conditional marker
cont continuous activity; derivational marker on uninflecting verbs
contr contrastive focus marker
dir directional suffix on locative nominals (outward direction)
du dual
erg/inst ergative/instrumental case
excl exclusive (pronominal category)
impf imperfective aspect
incl inclusive (pronominal category)
irr irrealis mood
loc locative case
neg negative/negation
obl oblique pronominal
pl plural
poss possessor/possessive
pot potential mood
priv privative marker, “without”
prs present tense
pst past perfective tense/aspect
rdp reduplication
restr restrictive marker
sfoc sentence focus marker
sg singular (pronominal category)
source source, origin (case)
subord subordinator (non-specific)

tag tag question
tr transitive marker (in Kriol)
References
Bavin, Edith, and Timothy Shopen

1985 Warlpiri and English: Language in contact. In: Michael G. Clyne (ed.),
Australia, Meeting Place of Languages, 81–94. Canberra: Pacific Lin-
guistics.
Dixon, R. M. W.
2001 The Australian linguistic area. In: Alexandra Y. Aikhenvald and
R. M. W. Dixon (eds.), Areal Diffusion and Genetic Inheritance:
Problems in Comparative Linguistics, 64–104. Oxford: Oxford Uni-
versity Press.
Graber, Philip
1987 The Kriol particle ‘na’. Working Papers in Language and Linguistics
21: 121.
Harris, John W.
1986 Northern Territory Pidgins and the Origin of Kriol. Canberra: Pacific
Linguistics C89.
Heath, Jeffrey
1978 Linguistic Diffusion in Arnhem Land. Canberra: Australian Institute of
Aboriginal Studies.
Hudson, Joyce
1983 Grammatical and Semantic Aspects of Fitzroy Valley Kriol. Darwin:
Summer Institute of Linguistics.
Johanson, Lars
2002 Structural Factors in Turkish Language Contacts. Richmond: Cur-
zon.
König, Ekkehard
1991 The Meaning of Focus Particles: A Comparative Perspective. London:
Routledge.
Matras, Yaron
1998 Utterance modifiers and universals of grammatical borrowing. Lin-
guistics 36 (2): 281–331.
McConvell, Patrick
1988 Mix-im-up: Aboriginal codeswitching old and new. In: Monica Heller
(ed.), Codeswitching: Anthropological and Sociolinguistic Perspec-
tives, 97–124. Berlin: Mouton de Gruyter.
McGregor, William B.
1992 The semantics of ergative marking in Gooniyandi. Linguistics 30:
275–318.
1998 Optional ergative marking in Gooniyandi revisited: Implications to the
theory of marking. Leuvense Bijdragen 87 (3–4): 491–534.
2002 Verb Classification in Australian Languages. Berlin: Mouton de
Gruyter.
Meakins, Felicity, and Carmel O’Shannessy
Forthc. Ordering arguments about: Word order and discourse motivations
in the development and use of the ergative marker in two Australian
mixed languages. Paper submitted.
Munro, Jennifer M.
2000 Kriol on the move. A case of language spread and shift in Northern
Australia. In: Jeff Siegel (ed.), Processes of Language Contact: Stud-
ies from Australia and the South Pacific, 245–270. Saint-Laurent,
Quebec: Collection Champs linguistiques.
2005 Substrate language influence in Kriol: The application of transfer con-
straints to language contact in northern Australia. Armidale: Ph.D. diss.,
Department of Linguistics, University of New England, Armidale.
Rhydwen, Mari
1993 Kriol: The creation of a written language and a tool of colonisation.
In: Michael Walsh and Colin Yallop (eds.), Language and Culture in
Aboriginal Australia, 155168. Canberra: Aboriginal Studìes Press.
1996 Writing on the Backs of Blacks: Voice, Literacy and Community in
Kriol Fieldwork. Brisbane: University of Queensland Press.
Rose, Deborah
1991 Hidden Histories. Black Stories from Victoria River Downs, Humbert
River, and Wave Hill stations, North Australia. Canberra: Aboriginal
Studies Press.
Sandefur, John
1979 An Australian Creole in the Northern Territory: A description of
Ngukurr-Bamyili dialects. Work papers of SIL–AAB, Series B, Vol-
ume 3. Darwin: Summer Institute of Linguistics.
1991 A sketch of the structure of Kriol. In: Susanne Romaine (ed.), Language
in Australia, 204212. Cambridge: Cambridge University Press.
Sandefur, John, and John Harris
1986 Variation in Australian Kriol. In: Joshua Fishman (ed.), The Ferguso-
nian impact, 180–190. Berlin: Mouton de Gruyter.
Sandefur, John, and Joy Sandefur
1981 Introduction to conversational Kriol. Work papers of SIL–AAB, Series
B, Volume 5. Darwin: Summer Institute of Linguistics.
Sasse, Hans-Jürgen
1987 The thetic/categorical distinction revisited. Linguistics 25: 511–580.
2006 Theticity. In: Giuliano Bernini and Marcia L. Schwarz (eds.), Prag-
matic Organization of Discourse in the Languages of Europe, 255–
308. Berlin: Mouton de Gruyter.
Schmidt, Annette
1985 The fate of ergativity in dying Dyirbal. Language 61: 378–396.
1990 The Loss of Australia’s Aboriginal Language Heritage. Canberra:
Aboriginal Studies Press.
Schultze-Berndt, Eva
2000 Simple and complex verbs in Jaminjung. A study of event categorisa-
tion in an Australian language. Ph.D. diss., University of Nijmegen.
2002 Grammaticalized restrictives on adverbials and secondary predicates:
Evidence from Australian languages. Australian Journal of Linguistics
22 (2): 231264.
2003 Preverbs as an open word class in Northern Australian languages: Syn-
chronic and diachronic correlates. In: G. Booij and J. van Marle (eds.),
Yearbook of Morphology 2003, 145–177. Dordrecht: Kluwer.
Stolz, Christel, and Thomas Stolz
1996 Funktionswortentlehnung in Mesoamerika. Spanisch-amerindischer
Sprachkontakt (Hispanoindiana II). Sprachtypologie und Universalien-
forschung 49 (1): 79–123.
Thomason, Sarah G., and Terrence Kaufman
1988 Language Contact, Creolization, and Genetic Linguistics. Berkeley:
University of California Press.
Troy, Jakelin
1993 Language contact in Early Colonial New South Wales 1788 to 1791.
In: Michael Walsh and Colin Yallop (eds.), Language and Culture in
Aboriginal Australia, 3350. Canberra: Aboriginal Studies Press.
Tryon, Darrell T., and Jean-Michel Charpentier
2004 Pacific Pidgins and Creoles: Origins, Growth and Development. Ber-
lin: Mouton de Gruyter.
Weinreich, Uriel
1968 Languages in Contact: Findings and Problems. The Hague: Mouton.
Grammatical borrowing in Rapanui
Steven Roger Fischer
1. Background
Rapanui is the language spoken by about one-fourth of the descendants of the

original East Polynesian settlers of Easter Island in the south-eastern Pacific.
In its phonology, morphology, syntax and lexicon, Rapanui is a typical Poly-
nesian language. That is, vowels prevail over a limited consonantal inventory;
only open syllables are allowed; particles and lexemes constitute the mor-
phological system; verbal and nominal frames shape all syntactic units; and
nearly the entire lexicon shares cognates within related lexica of other East
Polynesian islands.
Rapanui is the sole indigenous language of Easter Island; there are no dia-
lects. (The island has only the single community of Hanga Roa.) Rapanui was
also the island’s only language until foreign settlement in 1864/1866. Until
the 1930s, when Chilean Spanish began to intrude more decisively, nearly
every indigenous Easter Islander spoke only Rapanui. (Easter Island was an-
nexed by Chile in 1888.) As of 1966, with the granting of Chilean civil rights
to the Rapanui people of Easter Island, Spanish became the preferred lan-
guage on the island, whereupon Rapanui approached extinction. Within the
past ten years in particular, however, Rapanui has made a significant comeback,
although bilingual and syncretic styles of Rapanui discourse have yielded
a “Spanish Rapanui” as well as a “Rapanui Spanish”, with many mixed
forms. Easter Island children remain mostly monolingual Spanish speakers,
although many possess passive knowledge of some Rapanui. The island’s
resident population in 2006 was c. 4,000: 1,800 indigenous Rapanui and 2,200
mainland Chileans and others, whose command of Rapanui is unknown. A further
2,200 indigenous Rapanui live abroad, mostly in Chile. Perhaps around 800
(or 1,000 if one includes expats) claim Rapanui competence; only about 500
(650) of these would be fluent in Rapanui. Recent Rapanui educational and
cultural programmes targeting children allow the hope that the number of
Rapanui speakers might increase, if only marginally, in future.
2. Phonology
The Rapanui sound system closely resembles that of its main contact language,
Chilean Spanish (“castellano” on the island). Phonological adaptation often
occurs, however, when alien Spanish forms are transferred into Rapanui,
whereupon the Rapanui speaker chooses a relevant approximation of place
and manner of articulation. For example, Spanish /g/ and /x/ will usually be
interpreted as Rapanui /k/; Spanish /d/ as Rapanui /r/; Spanish /s/ and /č/
as Rapanui /t/. Spanish consonant clusters will either be simplified (/s/ will
regularly be omitted) or expanded through vowel insertion to create a Polyne-
sian open syllabic structure (also often with omission and/or re-articulation):
Spanish canasto ‘basket’ is Rapanui kanato, and Spanish pobre ‘poor (one)’
is Rapanui poere. There are no rigid rules in this process. Makihara (2001a:
195) points out, for example, that Spanish olvida ‘(he/she) forgets’ can be Ra-
panui /orvida/, /orvira/, /orovida/ or /orovira/. In Rapanui it does appear that
Spanish /l/ and /d/ are replaced with /r/ more often than Spanish /g/ and /x/ are
replaced with /k/. Older speakers, who learnt Spanish only imperfectly, tend
to be those who replace Spanish with Rapanui consonants more regularly and
frequently than younger speakers, who are more fluent in Chilean Spanish.
However, in certain formal contexts even fluent Spanish speakers will pro-
nounce borrowed Spanish elements using Rapanui phonology, this in order
to effect a “nativization” for social or psychological reasons. Nearly all such
alterations tend to be conscious and strategic; there has been no community-
wide systemization of Spanish phonological alteration. Although all vowel
sounds are shared with Spanish, Spanish /e/ is regularly pronounced as /ɛ/,
Spanish /o/ as /ɔ/ in Rapanui. Most Rapanui speakers, even elderly ones, now
appear to share Spanish prosody and intonation, with rare exceptions.
Ergative–absolutive distinctions characterized the early stage of Proto-Poly-

nesian (c. 1000 bc). But already by the much later Proto-East Polynesian
stage (c. ad 500) such functions had been replaced by ever more important
nominative–accusative distinctions. This eventually rendered Rapanui, once
distilled from South-east Polynesian, very clearly a nominative-accusative
language (though it is possible that vestigial ergative elements remain in
Modern Rapanui). East-Polynesian -Cia (that is, consonant + /ia/) passive
desinence was lost entirely in Rapanui, but for a small number of lexical-
ized passives. Reduplication, as in ve’ave’a ‘very hot’ and ’iti’iti ‘very little,
small’, follows standard inherited East Polynesian practice, whereby beyond
their main function as intensifiers such reduplications can also convey, in
verbs, both plural agency and repetitive action.
As with all Polynesian languages, Rapanui marks possession using aliena-

ble–inalienable distinctions in personal pronouns, demonstratives and certain
prepositions that alternate between the vowels /a/ and /o/, according to for-
mal and intuited rules of usage (Fischer 2000). With Spanish contact, how-
ever, alienable/inalienable possessives /a/ and /o/ are generalizing to /o/. For
example,
(1) ta’a pōki

2sg.poss child
‘your child’
will now sometimes be heard as to’o pōki, tu’u pōki or even tū pōki (corres-
ponding to Spanish tu ‘your’), seemingly resulting from a perceived notion
that additional marking of possessiveness is no longer necessary in Rapanui,
as Spanish does not have this alienable–inalienable distinction. Rapanui /o/
already commands a greater domain in possessive marking than does /a/;
thus, /o/ has been chosen as the “default marker” in this weakening of the
possessive marking system as a result of Spanish contact.
Postpositive positioning of qualifiers (nominal and adjectival) is demon-
strating a shift toward prepositive positioning in Rapanui, and this not only
in direct borrowings or calques. For example, Rapanui motore vaka repro-
duces English ‘motorboat’; Rapanui “should” have here vaka motore as a
mixed calque. (A particularly rampant intruder due to massive tourism of
late, English is being widely spoken on the island now, too.) Also, Rapanui
kē vece imitates Spanish algunas veces ‘sometimes’. The Rapanui demonstra-
tive phrase te me’e nei ‘this thing’ is now more often nei me’e after Spanish
esta cosa. In similar fashion, although ara nei ‘this road’ (lit. ‘road this’) can
still be heard, nā hare ‘that house’ after Spanish esa casa is now preferred.
In Rapanui, locative predicates now follow more closely the Spanish
model, in existentials and in adjectival predicates, as in the following three
examples (2: locative predicate; 3: existential; 4: adjectival predicate):
(2) ki rote vai kava

rlt inside/art water bitter
‘to the sea’
(3) Ai te maika ’i nei.

ext art banana rlt dem
‘There are bananas here.’
(4) Nā hare, hare ’iti’iti.

dem house house small
‘That house is small.’
As with all Polynesian languages, Rapanui has a phrase-structure gram-

mar, with no case marking. It also has no gender marking, nor has it systemat-
ically borrowed gender marking from Spanish. (Isolated items are beginning
to intrude, however, such as gender-marked diminutive suffixes.) Rapanui
marks the possessed within the noun phrase, but, as mentioned above, is cur-
rently reducing the /a/ and /o/ alienable–inalienable possessive distinctions to
“default” /o/. This process might stem from those indigenous Easter Island-
ers, all of whom speak Spanish, who intuit here a Spanish de which, for all
such cases of possession (except possessive pronouns), would employ a prep-
osition instead to mark this function. That is, a native Rapanui speaker would
now “feel” that there is no longer a need to make an alienable–inalienable
distinction, as no such need engages the identical Spanish statement. It is for
this reason, then, that /o/ is now commonly fulfilling this function as the “de-
fault” translation of Spanish de in analogous contexts.
After the Spanish model, not only has Rapanui shifted the syntax of the
attributive demonstrative to prepositive position (see the example nei me’e
above), it now frequently uses definite/indefinite articles like Spanish as well:
Rapanui te for Spanish el/la/lo and Rapanui ’etahi ‘one’ for Spanish un/una/
uno. Traditionally following the head, the Rapanui agent will now, again fol-
lowing the Spanish model, often precede the head. For example, an original
Rapanui construction
(5) He ’aroha, he tatangi ararua.

ml suffer ml weep both
‘Both suffer and weep.’
would now, in imitation of Spanish, more likely be phrased

(6) Ararua he ’aroha, he tatangi ’ā.

both ml suffer ml weep prog
‘Both suffer and weep.’
Rapanui te tangata ‘mankind (in general)’ is now being understood, after

Spanish el hombre, to mean ‘the man’, with specificity and definiteness. It ap-
pears that original Rapanui te ‘the’ was, as in all Polynesian languages, more
generic marker than definite article. However, under Spanish influence this
is being reanalysed. The entire Rapanui system of definiteness/indefiniteness
has recently been reinterpreted in deference to the Spanish perception, a fas-
cinating, perhaps even alarming, psycholinguistic phenomenon.
In the case of diminutives, Spanish -tita and -tito (and a few others, ob-
serving, interestingly enough, a gender distinction otherwise not found in
Rapanui) might rarely be suffixed to some Rapanui words and names. How-
ever, this is code-switching, not borrowing, and it remains on Easter Island
very infrequent and individual.
Like all Polynesian languages, but quite unlike Spanish, Rapanui has no
modal verbs. “Modality” is usually achieved through simple periphrasis. For
example:
(7) Ko te rivariva he oho koe atu.

foc art good ml go 2sg dir
‘You should/must go.’ (lit. ‘It is good you go away.’)
As of quite recently, frequent use is being made of Spanish tiene que ‘he/she/
it has to’ in order to show obligation in Rapanui; tiene que is simply used here
as a lexicalized borrowing, without relevant inflection. And puē from Spanish
puede ‘he/she/it can’ is often used to express ability and possibility as well;
again as a lexicalized borrowing, without relevant inflection, it is fast becom-
ing “native” Rapanui.
It is possible that Spanish ir ‘to go’ has prompted the increased use of Ra-
panui oho ‘go’ to indicate future, rather than oho’s most common function to
indicate simple motion; but this new use of oho would certainly be individual
and infrequent, too. For example, one can now hear:
(8) He oho au, he ha’uru.

ml go 1sg ml sleep
‘I’ll go to bed.’ (lit. ‘I go, sleep.’)
When Spanish verbs are used in Rapanui, this usually entails direct integration
without extra marking of verbness, meaning and/or infinitive–nominal
distinction (and nearly always with some phonetic borrowing to reproduce the
Spanish verb perfectly, if possible). (Inflection is generally ignored by Rapanui
speakers, also when speaking Spanish.) When Easter Islanders use Spanish
verbs while speaking Rapanui, it almost always involves code-switching of
some kind, not borrowing.
Because of massive Spanish contact, particularly in the past 40 years, there
has perhaps been an increased use of the periphrastic passive voice by Ra-
panui speakers. Traditionally, Rapanui dropped its inherited Polynesian -Cia
passive desinence, though the passive is very common in other Polynesian
languages (particularly Māori); only isolated lexical vestiges of a marked
passive now remain in Rapanui. It is doubtful whether Rapanui possessed
such a robust periphrastic passive voice before 1966, the year Easter Island-
ers achieved Chilean citizenship and full civil rights. The periphrastic passive
one now hears on the island seems to be directly patterned after Spanish. For
example:
(9) Ku tike’a ’ā ia ’i te ūka.

t/a see res 3sg rlt art girl
‘He/She/It was seen by the girl.’
(10) He kī mai ’i te tangata.

ml say dir rlt art man
‘It is being said by the man.’
Since Rapanui has no verb meaning ‘to have’, Spanish tengo is often used
in discourse for first-person singular, as tengo au ‘I have’. Uninflected Span-
ish tiene ‘he/she/it has’ is then used for most other declensions, irrespective
of person or number. It is of value to note that, for this, Old Rapanui would
have used such an inherited construction as:
(11) He hare ’o’oku.

ml house 1sg.poss
‘I have a house.’ (lit. ‘It is (There is) house of mine.’)
Old Rapanui had no copula. With increased intercourse with Tahiti at the
end of the nineteenth century, Tahitian ’ē ‘and’ was borrowed. Tahitian ’ē is
used to connect the subject and predicate; hence, it is a connector which is
functioning as a copula:
(12) ’Ē he tu’u mai ki nei he adapta au ’ē he puē mo

coord ml arrive dir rlt here ml adapt 1sg coord ml mod ben
adapta.
adapt
‘And when I arrive here I adapted and could adapt’ (Makihara 2001a:
212)
Although Tahitian ’ē, which is now regarded as “native” Rapanui, is still
favoured, today one can often hear in Modern Rapanui discourse Spanish y as
copula, which appears to be slowly replacing Tahitian ’ē:
(13) He tike’a au i ’ā Juan y Leo y José y ku riri

ml see 1sg acc prs Juan coord Leo coord José coord t/a anger
’ā.
res
‘I see Juan and Leo and José who are angry.’
In the wake of recent massive Spanish contact, the verbal semantics of flu-
ent Rapanui speakers has sometimes been re-evaluated. For example, Span-
ish recibe ‘he/she/it receives’ is now Rapanui recibe (for all declensions),
which describes only the physical act of receiving – as through the Chilean
post. This borrowed verb does not embrace, however, the intricate Polynesian
social obligations attending the communal act of receiving, which process
would still demand the use of the Rapanui verb rava’a.
In formal Rapanui speech, there are hardly any Spanish borrowings; this al-
lows us to construe that, in Rapanui, the use of Spanish entails almost ex-
clusively code-switching, not borrowing. With other parts of speech, numer-
als are almost entirely Tahitian, having replaced most of the Old Rapanui
numerals already at the end of the nineteenth century; Tahitian’s decimal
system was very close to Old Rapanui’s decimal system, needing only min-
imal replacements and minor phonological alterations. In normal Rapanui

discourse (not formal speech), there is frequent use of Spanish expletives or
such discourse fillers as Bueno!, but this, too, is code-switching and bilin-
gualism, not a “contact phenomenon” within Rapanui as such. In addition,
Spanish no is used often as a discourse marker, with an emotive emphasis –
No! – that sets it apart from standard Rapanui ’ina ‘no’, which is far milder
in tone.
Relative to connectors, see example (13) above illustrating Rapanui’s use
of Spanish y ‘and’. Furthermore, Spanish pero ‘but’ is now used in Rapanui,
too, as originally Rapanui had no ‘but’. For example:
(14) Ko au ’i nei, pero he oho.

foc 1sg loc here but ml go
‘Here I am, but I’m going.’
Spanish o is now standard Rapanui ō for ‘or’, too:
(15) Hoki ku hanga ’ā mo te maika ō te ’uhi ō ’ina.

int t/a desire res ben art banana or art yam or not
‘Do you want the banana or the yam or not?’
Spanish porque ‘because’ is also frequently used in Rapanui, replacing

original ’o te aha (lit. ‘of the what’). As with the Bueno! example above,
there is frequent bilingual code-switching in Rapanui, as evidenced by the
common use of Spanish ya ‘already’ and entonces ‘therefore’, for similar
reasons. Spanish siempre ‘always’, nunca ‘never’, jamás ‘ever, never’ are
routinely heard as well in Rapanui, which has no specific words for these tem-
poral concepts, simply nō ‘only’ and ’ina ‘not’ (the latter borrowed early from
Tahitian); in this particular case, the borrowing of these Spanish temporal
items is semantic supplementation. Sometimes one can also hear Spanish/
English anti-, but this is infrequent and is often linked to topical, foreign (i.e.,
non-Rapanui) themes: anti-nuclear, for example. Such Spanish words and
phrases as además ‘furthermore’, en corto tiempo ‘shortly’, sino que ‘but’,
ante ‘before’, juto (from Spanish justo, with /x/ phonetic intrusion) ‘just’ and
a few others are now common in Rapanui discourse, too. A very recent intro-
duction is the calque ō ’ina ‘or not’, patterned after Spanish o no.
Much of the above-mentioned clearly involves code-switching. However,
both puē (from Spanish puede) ‘can’ and tiene que ‘has to’ have, within the
past decade, virtually become grammaticalized in the Rapanui language.
Spanish influence seems to be causing Rapanui’s original VSO order to be-

come more SVO. This applies also to VS becoming SV; that is, transitivity
appears to be irrelevant in the process of “Rapanui hispanicization”. However,
there is great individual variation. In Old Rapanui and early Modern Rapanui,
any fronted topic required introduction by the Rapanui possessive particle
’a. This fronting function of ’a is now largely dispensed with, primarily
because Spanish demands no such particle in this position. In addition,
postpositive possessors are now almost exclusively prepositive, again probably
in consequence of Spanish influence. Three generations ago, frequent
use was made of postpositive possessors, although prepositive possessors
prevailed. For example, archaic te hare ’o’oku ‘my house’ (lit. ‘the house of
mine’) would now only be Modern Rapanui to’oku hare, which agrees with
Spanish mi casa.
Spanish ante ‘before’ is used in Rapanui, suffixed with progressive particles,
as ante + ’ā or as ante + ana to form a noun phrase that means ‘before’,
‘previously’, or ‘in the old days’. For example:
(16) Ante ’ā, tangata ta’e rahi te ha’aura’a.

before prog man neg many art reason
‘It’s because there weren’t many people before’ (Makihara 2001a:
205)
(17) Ante ana ho’i he kompania Williamson.

before prog emph ml company Williamson
‘In the old days, there was the Williamson Company’ (Makihara
2001a: 205)
Ante can also function, without progressives, before perfective particles and
prepositions:
(18) Ante ’i oho mai nei.

before perf go dir here
‘Before coming here …’ (Makihara 2001a: 206)
(19) Ante ki te tu’u mai o te presidente.

before dat art arrive dir poss art president
‘Before the arrival of the president …’ (Makihara 2001a: 206)
8. Syntax
Other features evidence contact phenomena in Rapanui syntax. Presuming

causative to produce a complex clause, the Rapanui causative prefix haka-
(< Proto-East Polynesian *faka-) now often prefixes Spanish borrowings and
code-switchings. For example:
(20) Muy peligroso mo hakafunciona.

very dangerous ben caus.function
‘It is very dangerous to make (it) work’ (Makihara 2001a: 215, n. 31/g)
In fluent discursive Rapanui, it is also becoming more frequent to initiate

a statement with a Spanish adverb or adverbial phrase, as is also habitual in
Spanish discursive speech (this is not a characteristic of any Polynesian lan-
guage, including Rapanui). Note, for example:
(21) Entonces ka orvire [olvida] tātou i rā parte.

therefore imp forget 1pl.incl acc dem part
‘Therefore, let us [incl] forget that part …’ (Makihara 2001a: 212)
(22) Y además to’oku mana’u en corto tiempo he rē tātou.

and in addition 1sg.poss opinion in short time ml win 1pl.incl
‘And, in addition, in my opinion we’ll shortly all win’ (Makihara
2001a: 212)
Question formation in Rapanui produces insightful features as well,

whereby Spanish intrudes on several levels. First, Spanish porque ‘why?’ ap-
pears to be on its way to replacing Rapanui ’o te aha in every context:
(23) Porque he oho atu koe?

why ml go dir 2sg
‘Why are you going away?’
Spanish will even redundantly supplement the native Rapanui interrogative:
(24) Porque he aha?

why ±spec what
‘Why?’ (lit. ‘Why why?’)
Yet Spanish interrogatives almost entirely occur in Rapanui only during code-
switching: an Easter Islander speaking a “purist” Rapanui will not insert a
single Spanish interrogative. Clearly, this is not a case of borrowing. Tahitian
might now be considered “native” on the island, but Spanish manifestly is not.
Spanish remains intrusive, still being consciously manipulated for effect.
Spanish no will be used as expletive or response in Rapanui; in all other
grammatical negation, Rapanui ’ina, kai, ’ina kai, ta’e and the suffix -kore (lit.
‘-less’) are used. Old Rapanui used simple serialization, without connectors;
now, Modern Rapanui co-ordinates multiple statements using such Spanish
words as y, bueno, puē (= Spanish pues ‘then’), entonces, ni … ni, o … o.
Some Rapanui speakers attempt to “sophisticate” adverbial clauses using
Spanish inclusions, but, again, this is code-switching, not borrowing, as these
inclusions are perceived always as intrusions. For example:
(25) ’Ina au kai o’o nunca ki tū hare.

neg 1sg neg enter never dir dem house
‘I have never entered that house.’
(26) Ka kai koe porque repuē tiene que oho.

imp eat 2sg conj afterwards mod.has.to go
‘Eat! Because afterwards you’ve got to go.’
There is regular inclusion of Spanish que for the subordinating conjunc-

tion ‘that’, a category Rapanui does not possess. (Sometimes this is written in
Rapanui – which language is seldom written – as ke.) Rapanui relative-head
constituent order VS is then consistently reversed to SV with the inclusion of
Spanish que (here graphic ke):
(27) Hoki ’ite e koe ke ia ’i tu’u mai ai?

int know ag 2sg that 3sg perf arrive dir pho
‘Do you know that he’s arrived?’
However, standard Modern Rapanui, without intrusive Spanish que, would

normally express this as:
(28) Hoki ’ite e koe ’i tu’u mai ai e ia?

int know ag 2sg perf arrive dir pho ag 3sg
‘Do you know that he’s arrived?’
Again, nearly all Spanish inclusions in Rapanui involve code-switching

and, in the case of relative pronouns, such inclusions can be frequent; how-
ever, the same user can also speak Rapanui without any such Spanish inclu-
sions. Spanish adverbs, as discourse fillers, would hardly appear in “formal”
Rapanui speech.
9. Lexicon
Lexical intermixing can occur in nearly all but the most formal registers, to
such an extreme degree that Rapanui can actually appear at times to be more
correctly called “Spanish Rapanui” (Makihara 1999, 2001a). All Rapanui
speakers are also fluent in Spanish, and these speakers then freely intermin-
gle both languages, to varying degrees, throughout the day, depending on to
whom they are talking and the circumstances. Nonetheless, there is also Rapanui
speech devoid entirely of Spanish. One cannot generalize, therefore,
about the process of Spanish “borrowing” on Easter Island.
Lexical supplementation occurs in Rapanui more frequently than lexic-
al replacement. Again, this involves primarily code-switching. That is, the
Rapanui speaker might wish, in any given circumstance, to “impress” her or
his listener(s) by using the Spanish rather than the Rapanui word(s). Spanish
expressions are commonly used interchangeably with Rapanui expressions.
However, seldom are set formulae split into both languages at once, unless
perhaps to inject humour.
True lexical copying on Easter Island nearly always involves previously
unknown introductions to Rapanui culture. In this sense, a “hispanicization”
of the Rapanui lexicon has occurred, albeit in the form of lexical supplemen-
tation. However, whenever a “purist’s Rapanui” is being spoken very little of
the indigenous lexicon will include even Spanish replacements, much less
supplements. In purely Rapanui contexts – that is, in those situations not in-
volving foreign objects, introductions or situations – not one word of Spanish
need be spoken at all. Educated Rapanui people dedicated to reviving their
indigenous language are currently seeking such traditional contexts in order
to promote just such a “purist’s Rapanui”. And there is very much an intuited
sense of what this “purist’s Rapanui” should comprise, although this latter
language, for over a century no longer Old Rapanui, is a model of Polyne-
sian language intertwining as it is in fact a late nineteenth-century Rapanui–
Tahitian hybrid.
10. Conclusion
At this point in time, no fewer than five resident languages engage Poly-
nesian-oriented Islanders on Easter Island: Old Rapanui, which occurs today
only in public performance (this language is largely unintelligible); Modern
Rapanui, in both public and private domains; Spanish Rapanui, the local syn-
cretism which uses much code-switching and also relies on a greater number
of Spanish borrowings than does Modern Rapanui; Rapanui Spanish, the
local syncretic compromise which is also used as an ethnic emblem; and
Chilean Spanish, the language of the still dominant power and now preva-
lent resident population (2,200 to 1,800). As a result, the active language
continuum on Easter Island in 2006 would be: Modern Rapanui > Spanish
Rapanui > Rapanui Spanish > Spanish. The first and second of these lan-
guages, Modern Rapanui and Spanish Rapanui, are not distinctly delineated
languages per se, but robust speech varieties that interact with one another
daily. Modern Rapanui is thus more of a blanket designation for the local
Rapanui–Tahitian intertwining that is continuously changing. It is, at the very
least, not moribund but apparently enjoying an evident, though still small and
tentative, renaissance.
Spanish, when used in Modern Rapanui, is wielded as a tool: normally, but
certainly not exclusively, for code-switching. Of course, the Modern Rapanui
lexicon has borrowed a considerable amount of Spanish (Fischer 2001); but
such borrowings are limited mainly to introduced technology as well as to
hitherto unknown concepts and perceptions deriving from non-Polynesian
spheres of activity, with some exceptions, as illustrated above. Some grammatical
borrowing from Spanish figures in Rapanui as well. Owing to the paucity
of Rapanui speakers (c. 800–1,000), there has been no attempt as yet to
translate Spanish borrowings systematically and/or formally into some kind
of standardized, “received” Rapanui. This might conceivably be achieved
through expanding the semantic domain of the Old Rapanui lexicon. How-
ever, under present circumstances there would be little advantage in doing
this. Any reversal of the current dynamic of the hispanicization process will
simply be the demonstration of fewer Spanish intrusions and the gradual re-
turn to a Polynesian exclusivity.
Fluent speakers of Modern Rapanui are, on the whole, conscious of a
diminishing of code-switching particularly in more formal – that is, more
“Rapanui” or locally ethnic – contexts. More children are now being for-
mally integrated into indigenous language use. The hitherto private domain
of Rapanui use is becoming once again increasingly public. New language

projects are raising local awareness of the island’s original language, and pro-
moting more active use in the larger community in general. And the island’s
imminent political autonomy will certainly open a new domain of Rapanui
use, which must increase not only its status, but also its application as both a
spoken and written language.
Abbreviations
acc     accusative
aff     affirmative particle
ag      agentive
art     article
asp     aspect marker
ben     benefactive
caus    causative
conj    conjunction
coorc   coordinating conjunction
dat     dative
dem     demonstrative particle
dir     directional
emph    emphatic
ext     existential
foc     focus
imp     imperative
incl    inclusive
int     interrogative
loc     locative preposition
ml      mainline aspect particle
mod     modal
neg     negative
perf    perfective aspect
pho     phoric
poss    possessive
prog    progressive aspect
prs     person/place marker
res     resultative
rlt     relational particle
t/a     tense-aspect particle
1sg     first-person singular
2sg     second-person singular
3sg     third-person singular
1pl     first-person plural
±spec   ± specific
References
Du Feu, Veronica
1996 Rapanui. (Routledge Descriptive Grammars.) London and New York:
Routledge.
Du Feu, Veronica, and Steven Roger Fischer
1993 The Rapanui language. In: Steven Roger Fischer (ed.), Easter Island
Studies: Contributions to the History of Rapanui in Memory of William
T. Mulloy, 165–168. (Oxbow Monograph 32.) Oxford: Oxbow
Books.
Fischer, Steven Roger

1992 Homogeneity in Old Rapanui. Oceanic Linguistics 31: 181–190.
1997 The Rapanui language. In Rongorongo: The Easter Island Script.
History, Traditions, Texts, 358–361. (Oxford Studies in Anthropological
Linguistics 14.) Oxford: Clarendon Press.
2000 Possessive markers in Rapanui. In: Steven Roger Fischer (ed.), Possessive
Markers in Central Pacific Languages, 333–344. (Special dual
edition of Sprachtypologie und Universalienforschung 53.) Berlin:
Akademie Verlag.
2001 Hispanisation in the Rapanui language of Easter Island. In: Klaus Zimmermann
and Thomas Stolz (eds.), Lo propio y lo ajeno en las lenguas
austronésicas y amerindias, 313–332. (Lengua y Sociedad en el
Mundo Hispánico 8.) Frankfurt: Vervuert, Madrid: Iberoamericana.
Forthc. Reversing hispanicization on Rapa Nui (Easter Island). In: Thomas
Stolz, Dik Bakker and Rosa Salas Palomo (eds.), Hispanisation. Ber-
lin/New York: Mouton de Gruyter.
Makihara, Miki
1999 Bilingualism, social change, and the politics of ethnicity on Rapanui
(Easter Island), Chile. Ph.D. dissertation, Department of Linguistics,
Yale University.
2001a Modern Rapanui adaptation of Spanish elements. Oceanic Linguistics
40: 191223.
2001b Rapanui–Spanish bilingualism. Rongorongo Studies 11: 2542.
2004 Linguistic syncretism and language ideologies: Transforming sociolinguistic
hierarchy on Rapa Nui (Easter Island). American Anthropologist
106: 529–540.
Forthc. Rapa Nui ways of speaking Spanish. Language in Society 34.
Grammatical borrowing in Nahuatl
Una Canger and Anne Jensen
1. Background
Nahuatl is a Uto-Aztecan language spoken in Central Mexico by around

1 million people. The early missionaries invented an orthography for the language,
thanks to which we have access to sources written in Nahuatl as early
as the 1540s. Hence, we are in a position to trace changes in Nahuatl during
the last 500 years. Since the Spanish invasion in 1521, Nahuatl speakers have
been in contact with Spanish culture including the language. Spanish is the
official language in Mexico, and Nahuatl is not a written language anymore.
The use of Nahuatl nowadays is restricted to oral communication among fam-
ily members and friends and in the Nahuatl communities. Not all children
acquire Nahuatl, so most speakers belong to the middle and old generations,
and all Nahuatl speakers are bilingual in Nahuatl and Spanish.
Apart from English, Spanish is the only language that Nahuatl speakers
are in contact with at the present time. Since Nahuatl belongs to the Meso-
american linguistic area (Campbell et al. 1986), it shares features with other
Mesoamerican languages, e.g. a vigesimal number system and V-first. Evi-
dently, these features are contact-induced changes in Nahuatl, the speakers of
which are assumed to have been the last ones to settle in Central Mexico. The
geographic extent of the area in which Nahuatl dialects were and are spoken
is rather large, thus Nahuatl speakers have been in contact with a range of
other Mesoamerican languages. However, the results of those contact situ-
ations still have to be explored.
In the present chapter, we focus on the impact of Spanish on the gram-
mar of Nahuatl.1 The predominant possible influence from Spanish appears
in the domains of nominal structures, among these the expression of spatial
and other relations, constituent order, and to a lesser extent syntax. Nahuatl is
a polysynthetic language with agglutinative morphology, and these two fea-
tures have been exposed to little or no contact-induced changes. Concerning
phonology, Spanish phonemes, e.g. voiced stops, occur only in loanwords
that have not been adapted to the Nahuatl system. The voicing of intervocalic
stops found in some modern dialects is not necessarily a contact phenomenon
since such a change may take place without contact. However, while vowel
length was the most prominent prosodic feature in sixteenth century Nahuatl,
in the twentieth century Nahuatl vowel length is less prominent, and fixed
word stress is now generally recognized (cf. Canger ms).
The following information is based on data from two dialects, North Pue-
bla (NP) and North Guerrero Nahuatl (NG).
The first change is in the number category. With a couple of exceptions,2

inanimate nouns in sixteenth-century Nahuatl were not inflected for plural.
In modern dialects, however, most inanimate nouns are inflected for plural.
In (1), the plural morpheme -tin is suffixed to the inanimate noun root kal-
‘house’; the quantifier miyek ‘much’ is marked for plural, too, by the mor-
pheme -in:
(1) aʔmo miyek-in kal-tin.

neg much-pl house-pl
‘There are not many houses.’ (NP Nahuatl)
This structural change is likely to be contact-induced since inanimate nouns in
Spanish are inflected for plural.
The other change concerning nominal structures is the emergence of a
new part of speech, prepositions. In sixteenth-century Nahuatl, most spatial
and other relations were encoded by means of a single postposition, -k(o) ‘at’,
‘on’, ‘in’, which attached to inanimate noun roots, and by relational nouns.
A relational noun was preceded by a possessor prefix referring to the head
of the construction. The third-person singular possessor prefix is attached to the
relational noun -pan ‘on’ in (2), referring to the head λaʔtoʔka:ti:λanλi ‘ruler-
messenger’:
(2) inin λaʔto:l-li i:-pan ø-m-iʔtoa:-ya

dem word-abs.s poss.3s-on sub.3-refl-say-impf
in λaʔtoʔka:-ti:λan-λi.
def ruler-messenger-abs.s
‘These words were said about the ruler-messenger.’
(CF IV, folio 203 recto)
The same applies to the possessor prefix in (3), which is attached to the rela-
tional noun -na:wak ‘near’, ‘at’:
(3) ti-wel-la-mati-s in i:n-na:wak

sub.1pl-well-obj.indef-know-fut def poss.3pl-near
to-te:kw-yo:-wa:n siwa:-pi-pil-tin.
poss.1pl-lord-deri-poss.pl
‘You will be happy near our goddesses, the cihuāpipiltin.’
(CF VI, folio 143 recto)
In some modern dialects, relational nouns have lost the possessor prefix and
function as simple prepositions. The preposition na ‘at’ in (4) is the reflex of
-na:wak in NG Nahuatl:
(4) ma ya ø-m-namaka na Lupe Peña.

imp go sub.3-refl-sell.pres at Lupe Peña
‘Off we go, it sells at Lupe Peña’s house.’ (NG Nahuatl)
In other dialects, the possessor prefix is maintained, but the whole word including
the prefix functions as a preposition. The sentence in (5) includes
i:pan, which also appears in (2). But in (5), it encodes the path of the move-
ment encoded by the verb -wi:¢ ‘come’:
(5) ti-wi:¢ i:pan n kaʔkalaʔ-λe.

sub.2s-come.pres to def village-abs.s
‘You come to the village.’ (NP Nahuatl)
In sixteenth-century Nahuatl the path was not encoded; instead the path was a
meaning component of a movement verb. An example of that is given in (6);
the path FROM is not encoded explicitly:
(6) ka o:sto:-yoʔ i:-λan ø-wi:¢.

part cave-deri poss.3s-below sub.3-come.pres
‘It (the turquoise) comes from within the mines.’
(CF XI, folio 203 verso)
In Spanish, however, the path is encoded, in (7) by the preposition a ‘to’:
(7) Vienes al pueblo.

come.pres.2s to.det.mask village
‘You come to the village.’
Since Spanish has prepositions, the reanalysis of Nahuatl relational nouns as

prepositions may be ascribed to the influence from Spanish. The consequence
of this reanalysis is far-reaching concerning the grammar of modern Nahuatl
because a new part of speech, prepositions, has emerged. But this change has
left the language (or its speakers) in a process of change: In NP Nahuatl some
of the relational nouns still form a category of their own,3 while at the same time
they are used as prepositions. Moreover, some Spanish prepositions have
been borrowed; see (31) in Section 6.
In sixteenth-century Nahuatl the prevalent order of possessed and posses-
sor is the one shown in (2), i.e. the possessor succeeds the possessed. This
order may have facilitated the reanalysis of relational nouns as prepositions.
Apart from a few exceptions, Spanish verbal structures have not been borrowed,
nor have Nahuatl verbal structures undergone changes. Spanish
verbs have been borrowed from an early stage of the contact between Na-
huatl speakers and Spaniards. Loan-verbs are derived from Spanish infini-
tives, which are treated like Nahuatl noun roots by means of the suffix -oa
– as shown in (8) with the Spanish verb cantar ‘sing’:
(8) san ke:man ø-wenti-ʔ ø-kwi-kwi:ka-ʔ

only when subj.3-be.drunk-subj.pl subj.3-red-sing-subj.pl
ø-cantar-oa-ʔ
subj.3-sing-deri-subj.pl
‘Only when they (the women) are drunk, do they sing, do they sing.’
(NP Nahuatl)
One verbal structure borrowed from Spanish forms part of a new mode
category; it encodes obligation. The verb -piya ‘to have’, ‘to guard’4 is used
as an auxiliary; the obligatory subject prefix is attached to the auxiliary and
to the main verb (there is no infinitive in Nahuatl), as shown in (9):
(9) ti-k-piya ti-k-či:wa-s mo-tarea.

sub.2s-obj.3s-have.pres sub.2s-obj.3s-do-fut poss.2s-homework
‘You have to do your homework!’ (NP Nahuatl)
Except for the finite form of the main verb in the Nahuatl construction, it is a
calque of the Spanish construction encoding obligation:
(10) tienes que hacer tu tarea.

have.pres.2s to do your homework
‘You have to do your homework!’
In sixteenth-century Nahuatl, future tense was used for encoding obligation,

but the future tense is currently used for encoding another member of the
mode category, potential – as is also the case in Spanish. The main verb in (11) is
-yes, the future form of -kaʔ ‘to be’:
(11) ø-ye-s i:-aška Pedro.

sub.3s-be-fut poss.3s-property Pedro
‘It is probably Pedro’s property.’ (NG Nahuatl)
The emergence of potential within the mode category is a structural change in

the sense of a change of the over-all pattern of the category. The periphrastic
construction encoding obligation in (9), however, is not, since obligation was
encoded in sixteenth-century Nahuatl.
In (9) above, the morpheme -s expressing future tense appears. This morphological
marking of future tense now co-exists with a periphrastic construction
which also expresses future tense. The verb yaw ‘go’ is used as an
auxiliary to which a subject prefix is attached; the auxiliary is followed by
the main verb, to which the obligatory subject prefix is attached. According to
the transitivity of the main verb, an object prefix may also be attached as is
the case in (12), in which the main verb is iʔkitia ‘weave’:
(12) n-ya ni-k-iʔkitia.

sub.1s-go.pres sub.1s-3s.obj-weave.pres
‘I will weave it’, ‘I am going to weave it.’ (NG Nahuatl)
This is a calque of the Spanish periphrastic future shown in (13):
(13) voy a tejer.

go.1s.pres to weave
‘I will weave’, ‘I’m going to weave.’
However, it is an open question to what extent the tense category as such has
undergone contact-induced changes.
The vast majority of Spanish loans appear in other parts of speech. Nahuatl
had – and still has now – a vigesimal number system, but except for the numerals
below 10, Spanish numerals are preferred. Some Nahuatl quantifiers
are still used in everyday speech; one of these is miyek ‘much’, ‘many’ (see
(1) in Section 2), but in NP Nahuatl, the Spanish quantifier poco ‘(a) little’
has been substituted for the Nahuatl one. This is not the case in NG Nahuatl,
where the Nahuatl quantifier te¢i ‘(a) little’, ‘a few’ is used.
Apart from a few Nahuatl connectors still in use, the connectors introducing
adverbial clauses are all Spanish. However, the syntax of adverbial
clauses remains Nahuatl. Moreover, in NG Nahuatl the three Spanish coordinating
connectors have been borrowed, i.e. y ‘addition’ (14), o ‘disjunction’
(15), and pero ‘contrast’ (16).
(14) in tepewaši ø-li:l-ti-k y ya:l ø-limpio.

def tepehuaje sub.3-black-deri-partc and pron.3s sub.3-clean
‘The tepehuaje (a tree) is black, and it is clean.’ (NG Nahuatl)
(15) ø-k-ilwia i:-naʔnaʔ o i:-taʔtaʔ

sub.3-obj.3s-say to.pres poss.3s-mother or poss.3s-father
ni-k-elewia inon ičpoka-λ.
sub.1s-obj.3s-like.pres that young.girl-abs.s
‘He (the young man) says to his mother or his father, “I like that
young girl.”’ (NP Nahuatl)
(16) niʔʔ-k-tila:na ok-se-pa i:ka no-derecha

sub.1s-obj.3s-pull.pres still-one-time with poss.1s-left
pero am ø-ki:sa in nekwal-mekal.
but neg sub.3-come out.pres def necualmecate
‘I pull it once more with my left (hand), but the necualmecate does
not come out.’ (NG Nahuatl)
In NP Nahuatl, o ‘or’ and pero ‘but’ have also been substituted for Nahuatl
connectors, and in both dialects the two connectors may connect phrases,
clauses and sentences. Spanish y ‘and’ is not used in NP Nahuatl. Instead the
speakers use (i:)wan ‘and’. In (17) two sentences are coordinated:
(17) ni-k-či-čipe:wa i:šwa-to:ma-λ wa:n luego este

sub.1s-obj.3s-red-peel leaf-tomato-abs.s and then ah
ni-k-¢in-ko-koto:na čil-le.
sub.1s-obj.3s-bottom-red-cut chili-abs.s
‘I peel leaf tomatoes and then, ah, I cut off the stalk of the chili
fruits.’ (NP Nahuatl)
Apart from that, (i:)wan also coordinates phrases, i.e. like the second wan
in (18):
(18) inon costumbre ø-ki-piya n pueblo

that practice sub.3-obj.3s-have.pres def village
a:kin ø-mo-namik-tia
who sub.3-refl-meet-caus.pres
bien ø-ki-ø-λokolia-ʔ miyek cosas
well sub.3-obj.3s-obj.3s-donate-subj.pl many things
wan telpoka-λ ø-ki-ø-λokolia-ʔ cervezas
and young.man-abs.s sub.3-obj.3s-obj.3s-donate-subj.pl beers
wan guajolotes.
and turkeys
‘That practice the village has: one who marries – they donate him/her
many things, and young man [sic!], they donate beers and turkeys.’
(NP Nahuatl)
The first wan shows the third level of coordination – it combines two successive
units of discourse larger than a sentence, the contents of which are
related to each other.
In sixteenth-century Nahuatl, -wa:n occurs as a relational noun meaning
‘in the company of’;5 it was used for connecting phrases and, to a certain extent,
clauses and sentences. Still, -wa:n did not function at discourse level for
combining successive “chunks” – this function was carried out by aw ‘and’,
which is not found in NP Nahuatl. Thus, the use of wan in NP Nahuatl re-
sembles the function of Spanish y, and this fact may be considered a contact-
induced change.
In addition to the three aforementioned connectors, the Spanish adverbs
luego ‘subsequently’, ‘then’ (see (17) above) and entonces ‘then’ (19) are
used as discourse organizing devices in NP Nahuatl,6 whereas NG Nahuatl
also maintains the Nahuatl word okino ‘then’, ‘thus’ (20):
(19) ni-k-elewia inon ičpoka-λ

sub.1s-obj.3s-like that young girl-abs.s
iwkon ø-k-iʔto-s n novio
in that way sub.3-obj.3s-say-fut.s def fiancé
entonces ø-ya-s i:-čan n novio
then sub.3s-go-fut.s poss.3s-home def fiancé
‘“I like/want that young girl”, the fiancé will say it in that way, then
the fiancé will go (to his) home.’ (NP Nahuatl)
(20) okino in ok-se: kwaw-oʔλal ø-ki-piya

then def still-one wood-stick sub.3-obj.3s-have.pres
in hilo len-i:ka ni-k-wapana in hilo
def warp what-with sub.1s-obj.3s-fasten def warp
okino in kwaw-oʔλal in len-i:pa o
then def wood-stick def what-on past
ø-m-parejo ya ø-ki:sa
sub.1-refl-align.pret.s now sub.3s-come out.pres
‘Then the warp has another warp rod to which I fasten the warp,
then the warp rod on which I have aligned the warp, comes out.’ (NG
Nahuatl)
The discourse markers pues ‘then’, ‘well’, bien ‘well’ and bueno ‘well’, ‘sure’
and the hesitation marker este (see (17)) are borrowed from Spanish.
The predominant constituent order of clauses in sixteenth-century Nahuatl

is VS, or rather predicate–subject, since nouns and other parts of speech
may function as predicates; it is still discussed whether VOS or VSO is the
unmarked order. As appears from e.g. (17) and (19) in Section 4, VO-order
(and the order predicate–subject) is still used in modern Nahuatl. There
are, though, occurrences with SV- and OV-orders; in (14) above the subject
in tepewaši ‘tepehuaje’ (a tree) precedes the predicate li:ltik ‘black’, and in
(18) the object inon costumbre ‘this practice’ precedes the verb -piya ‘has’.
Still, such constituent orders are also found in sixteenth-century Nahuatl,
thus nothing can be concluded with regard to Spanish influence on constitu-
ent order in main clauses. With respect to subordinate clauses, the constituent
order in modern Nahuatl is identical to the order found in sixteenth-century

Nahuatl, although subordinating devices have been borrowed from Spanish.
Within some categories of phrases, changes have taken place. A noun
phrase may precede a modifier or vice versa in sixteenth-century Nahuatl.
The noun phrase in ¢a¢apa:sli we:yi lit. ‘the batten big’ from NG Nahuatl
(see (24) in Section 6), and the noun phrase se: pueblo kwal¢in ‘a village
nice’ from NP Nahuatl in (30) are two modern examples of the order head–
modifier. However, the preferred constituent order is modifier–head. The
abandonment of the head–modifier order may not be due to Spanish influ-
ence, since that constituent order is found in Spanish, too.
Another category of phrases includes quantifiers. In sixteenth-century Na-
huatl, the quantifier moči/noči ‘all’ occupied clause-initial position, while the
quantified noun phrase occupied a position subsequent to the verbal complex.
This applies to the subject in (21) which is quantified by the initial močintin
‘all’:
(21) moči-ntin ø-m-a:yo:-iʔčiki:-ya-ʔ in okič-tin

all-pl sub.3-refl-fontanel-scrape-impf-sub.pl def man-abs.pl
i:wan in siwa:-ʔ
and def woman-abs.pl
‘All the men and (all) the women shaved their heads.’ (CF X, folio
138 recto)
In modern Nahuatl the noun phrase always occurs immediately after noči:
(22) después este n pastel ø-mo-repartir-oa noči gente.

then ah def cake subj.3-refl-distribute-deri all people
‘Then, ah, the cake it is distributed (to) all (the) people.’ (NP Na-
huatl)
The change from the originally discontinuous phrase shown in (21) to a con-
tinuous phrase like the one in (22) may be contact-induced, since a quanti-
fier and a noun (phrase) always form a continuous phrase in Spanish. How-
ever, the development may have taken place independently. Phrases including
other quantifiers, e.g. miyek ‘much’ (see (1) in Section 2), were continuous
and the current constituent order including moči/noči may be a result of the
speakers’ wish for analogous phrase structures.
6. Syntax
One of the characteristic syntactic features of sixteenth-century Nahuatl is the
existence of pre-head relative clauses, which co-existed with post-head rela-
tive clauses. The head of the relative construction in (23) is in amiʔi:yo:¢in in
amoλaʔto:l¢in ‘your honoured breath, your honoured words’; the preceding
relative clause is enclosed in square brackets:
(23) ma: ti-k-to-mak-toka-tin

part sub.1pl-obj.3s-refl-give-follow-vet.pl
[in nika:n ø-wa:l-ki:sa]
[conn here sub.3-dir-go out
in am-iʔi:yo:-¢in in amo-λaʔto:l-¢in.
def poss.2pl-breath-hon def poss.2pl-word-hon
‘Let us avoid pretending that your honoured breath, your honoured
words [that are coming out here] are given to us.’ (CF III, folio 36
verso)
Pre-head relative clauses are absent in modern Nahuatl, and may have been
abandoned due to Spanish influence; Spanish had and still has only post-head
relative clauses.
As (23) shows, relative clauses in sixteenth-century Nahuatl are
formed by gapping;7 the relative clause is introduced by the connector in. In
NG Nahuatl, that strategy is still found. The relative clause in (24) is intro-
duced by in:
(24) m-ø-piya in ¢a¢apa:s-li we:yi y ok-se:

sub.1s-obj.3s-have.pres def batten-abs.s big and still-one
¢a¢apa:s-li [in ø-ki-toka].
batten-abs.s [conn subj.3-obj.3s-follow.pres
‘I have the big batten and another batten [that follows it].’
(NG Nahuatl)
In some cases in is omitted, and intonation is the only indication of whether

a clause should be interpreted as a relative clause or as a simple sentence of
its own.
Still another strategy for relative clause formation is found in NG Nahuatl.
Given that NP-Rel has the syntactic function of an adverbial, a relative pro-
noun combined with a preposition is used. This is the case in (20) in section
(2). The two relative clauses are introduced by len ‘what’ (an interrogative
pronoun), to which the prepositions i:ka ‘with’ and pa ‘on’ (former relational
nouns) are attached. Irrespective of the syntactic function of NP-rel,
relative clauses are introduced by a relative pronoun in NP Nahuatl. It is also
an interrogative pronoun, which shows whether the NP-head is human or not.
The composite pronoun λeinon ‘what’ is used when NP-head is -human:
(25) inon noči1 [λein-on1 ni-k-mati ašan].

dem all what-dem sub.1s-obj.3s-know.pres now
‘That is all [that I know now].’ (NP Nahuatl)
When the NP-head is +human, the pronoun a:kin ‘who’ introduces the rela-
tive clause:
(26) pero noč-ten1 [a:kin1 ø-ki-piya-ʔ servicio]

but all-pl [who sub.3-obj.3.s-have-pl service
o:-ø-λa-šλaw-keʔ
past-sub.3-obj.indef-pay-pret.pl
‘But all [who have service = electricity] have paid.’ (NP Nahuatl)
Finally, the speakers use Spanish que for introducing relative clauses:
(27) pero ye kin i:-tiempo que ø-walla-s inon ampliación

but already just poss.3s-time that sub.3-come-fut dem extension
[que ø-teč-segurar-o-kiw].
[that sub.3-obj.1pl-guarantee/secure-deri-intro
‘But soon it was already the (its) time that that extension [that came
to secure us] would come.’ (NP Nahuatl)
In sixteenth-century Nahuatl headless relative clauses were introduced

by the connector in followed by an interrogative pronoun. If the referent
was non-human, the interrogative λein ‘what’ was used. If the referent was
human, the interrogative a:kin ‘who’ was used:
(28) no: ø-ki-a:na-ti:w [in λein

also sub.3-obj.3s-grab-dir.imperf [conn what
no-yo:l-ka:-w ni-k-nemi:-tia]i .
poss.1s-live-partc-poss.s sub.1s-obj.3s-live-caus.pres
‘He will also go to grab [whatever animal I raise].’ (CF XI, folio 8
recto)
(29) inin λaʔto:l-li i:i-tečpa ø-m-iʔtoa [in a:kin

dem word-abs poss.3s-at sub.3-refl-say.pres [conn who
iλaʔ senkaʔ λasoʔ-λi ø-k-iʔλakoa]i.
something very valuable.thing-abs.s sub.3-obj.3s-destroy.pres
‘These words are said about [whoever destroys something very valu-
able]’ (CF VI, folio 199 verso)
Assuming that the abandonment of in as a general subordinator in NP Nahuatl

was the first step of a change, then there would have been no marking of a
restrictive relative clause, whereas a headless relative clause would still have
been introduced by one of the interrogatives a:kin and λein. The speakers
may have expanded the use of the interrogatives to introduce restrictive rela-
tive clauses as well. Later they may have borrowed the Spanish subordinator
que – which may be on its way to becoming the preferred item for introduc-
ing relative clauses, as it is in other Nahuatl dialects.
As mentioned in Section 4, other categories of subordinate clauses have
not been affected by the contact with Spanish. However, Nahuatl is a poly-
synthetic language, and it may be in the process of drifting towards a less
polysynthetic structure due to Spanish influence. In Section 3 two new peri-
phrastic constructions were shown: In (9) the construction used for encoding
obligation – a calque of the corresponding Spanish construction; in (12) the
periphrastic construction expressing future tense was shown. There are at
least two further instances of periphrastic constructions that may be devices
for the possible drift mentioned above. The first is shown in (30), and consists
of a main clause with the verb -elewia ‘like’, ‘want’ and a complement clause
introduced by the particle ma:
(30) ni-k-elewia ma ø-mo-či:wa se: pueblo kwal¢in.

sub.1s-obj.3s-like part sub.3-refl-make.pres one village nice
‘I like that it becomes a nice village.’ (NP Nahuatl)
The other construction is formed by a main clause with the verb -neki ‘want’
and a subordinate clause in which the verb is in the future tense:
(31) ya o ø-ki-nek ø-mo-mač-ti-s

now past sub.3-obj.3s-want.pret sub.3s-refl-know-caus-fut.s
mas entonces hasta Huauhchinango.
more then to Huauhchinango
‘Now he wanted to learn more, then (off) to Huauhchinango.’ (NP
Nahuatl)
The complex construction in (31) is also found in sixteenth-century Nahuatl,
co-existing with a composite verb form in which -neki is an auxiliary and the
main verb occurs in the future tense:
(32) aw is ø-nel-li ø-mo-toli:nia-ʔ in

and here sub.3-truth-abs.s sub.3-refl-afflict-sub.pl def
ikno:-kwa:w-λi in ikno:-o:se:lo:-λ
orphan-eagle-abs.s def orphan-jaguar-abs.s
in ø-miki-s-neki-ʔ in
conn sub.3-die-fut-want-sub.pl conn
aʔ-ø-nemi-s-neki-ʔ
neg-sub.3-live-fut-want-sub.pl
‘And here is (the) truth: the poor eagle (and) the poor jaguar that want
to die, that do not want to live, suffer.’ (CF VI, folio 17 recto)
The composite verb form in (32) is not found in the two Nahuatl dialects in
question, and its abandonment in favour of the complex construction may be
caused by the contact with Spanish as the constructions in (30)–(31) resem-
ble Spanish constructions with the verb querer as main verb.
7. Lexicon
Spanish loanwords are adopted in many domains of the Nahuatl lexicon, in-
cluding kinship terms. Presumably, the number of loanwords correlates with
a speaker’s age, in the sense that children and younger adults use more loan-
words than older speakers. However, in the domain of grammatical vocabu-
lary – demonstratives, (emphatic) pronouns and place deictics – no borrow-
ing has taken place.
8. Conclusion
Discourse-organizing morphemes and expressions have been borrowed from
Spanish, as have some prepositions and the subordinating
connectives that introduce adverbial clauses. Prepositions
as a new part of speech have emerged from relational nouns, possibly due
to the influence from Spanish. However, currently (some) relational nouns
co-exist with prepositions. Furthermore, Spanish verbs and nouns have been
borrowed into Nahuatl, but in general they have been adapted to the Nahuatl
structure of verbs and nouns, i.e., Nahuatl subject, object and possessor af-
fixes are attached to the borrowed items. The marking of tense and direction
has not been affected by Spanish. Furthermore, there are only a few obvious
contact-induced changes in the syntactic structure of Nahuatl, for example
in relative clause formation. Changes of constituent order within some types of
nominal phrases have taken place, and the changes may or may not be the
result of contact with Spanish. Negation at all levels shows no influence from
Spanish; the same applies to non-verbal predication. Nahuatl speakers still
use nominal sentences.
It seems that Nahuatl is becoming less polysynthetic, as some polysynthetic
structures have been replaced by periphrastic constructions corresponding to
Spanish ones. However, a comparison
with the development of other polysynthetic languages exposed to a long con-
tact with Spanish may reveal whether the development in Nahuatl is unique
or a common feature.
Abbreviations
abs    absolutive non-possessed
caus   causative
conn   connector
def    definite
dem    demonstrative
deri   derivational morpheme
dir    directional
dire   directional inflection with a final meaning
fut    future
hon    honorific
imp    imperative
impf   imperfective inflection, which may not have strictly imperfective meaning
indef  object prefix referring to no referent
intro  introverse verb conjugation with aspectual and final meaning
irr    irrealis
neg    negation
obj    object
part   particle
partc  participle
past   augment, present in the preterite and the pluperfect
pl/pl  plural
poss   possessor prefix showing the person and number of the possessor, or
       possessum suffix showing the number of the possessed
pres   present
pret   preterite
pro    emphatic pronoun
red    reduplication
refl   reflexive prefix
s/s    singular
sub    subject
vet    vetative
¢ = dental affricate with dental-alveolar release

č = dental affricate with palato-alveolar release
λ = dental affricate with lateral release
ʔ = glottal stop
Notes
1. Nahuatl has had an immense influence on Mexican Spanish with regard to par-
ticular domains of the lexicon, e.g. flora and fauna. Next to no research has been
carried out on the Nahuatl impact on Spanish grammar.
2. tepe:λ ‘mountain’, and ciλa:lin ‘star’.
3. Some of them form part of composite nouns, as is the case in sixteenth-century
Nahuatl:
ø-kaʔ λa:l-pan ø-kaʔ i:pan λa:l-li
sub.3-be.pres earth-on subj.3s-be.pres on earth-abs.s
‘it is on the ground’ ‘it is on the ground’
4. Before contact with Spanish -piya meant only ‘to guard’, but undoubtedly due
to Spanish influence it acquired the additional meaning ‘to have’.
5. However, i:-wa:n may already have been reanalyzed as a connector in sixteenth-
century Nahuatl. In (21) in Section 5, the possessor prefix i:- third-person singu-
lar does not agree with the plural of the noun siwa:- ‘woman’.
6. Moreover, the Spanish adverb después ‘subsequently’, ‘then’ has been borrowed
into NP Nahuatl.
7. We do not conceive of the obligatory subject and object affixes as elements ex-
pressing NP-rel.
References
Campbell, Lyle, Terrence Kaufman, and Thomas C. Smith-Stark

1986 Meso-America as a linguistic area. In: Language 62 (3): 530–558.
Canger, Una
1980 Five Studies Inspired by Nahuatl Verbs in -oa. Copenhagen: Reitzel/
The Linguistic Circle of Copenhagen.
1988 Nahuatl dialectology: A survey and some suggestions. In: International
Journal of American Linguistics 54 (1): 28–72.
N.d. (Changing) word prosody in Nahuatl. MS.
Carochi, Horacio, S. J.
1983 [1645] Arte de la lengua mexicana con la declaracion de los adverbios
della. Facsimile of 1645 edition. México: Universidad Nacional Autó-
noma de México.
Cerón, Isaías Mendoza, and Una Canger

1993 In tequil de morrales. Copenhagen: Reitzel.
Jensen, Anne
Forthc. The emergence of progressive aspect in modern Nahuatl. To appear.
Forthc. Syntactic changes in Nahuatl. To appear.
Launey, Michel
1986 Catégories et opérations dans la grammaire nahuatl. Ph.D. disserta-
tion, Université de Paris-IV.
1992 Introducción a la lengua y a la literatura náhuatl. México: Universi-
dad Nacional Autónoma de México.
Facsimile edition: CF = Códice Florentino: el manuscrito 218–20 de la Colección
Palatina de la Biblioteca Medicea Laurenziana (en facsímil). Florencia:
Biblioteca Medicea Laurenziana 1979.
Grammatical borrowing in Yaqui
Zarina Estrada Fernández and Lilián Guerrero
1. Background1
Yaqui, including Mayo, its major dialect, along with two other distinct lan-
guages of this family, Tarahumara and Guarijio, constitute the Taracahitan
branch of the Uto-Aztecan linguistic family.2 Yaqui is spoken mainly in Mex-
ico by more than 15,000 people living along the Yaqui River in the Central
West part of the State of Sonora. Across the US–Mexican border, in Pascua,
Arizona, just south of Tucson, there are an estimated 1,000 speakers of this
language. The Yaqui speakers presently in Arizona migrated to the US at the
beginning of the twentieth century. The traditional Yaqui settlements in Mex-
ico are eight small towns: Cócorit, Bácum, Tórim, Vícam, Pótam, Ráhum,
Huirivis, and Belén; the first six were founded by Spaniards beginning in
1617, although the first Spanish contact goes back to 1523, when Diego de
Guzmán tried to conquer this original, indigenous nation. Since then, Yaqui
has been in continuous contact with Spanish.3
Nowadays, most Yaqui people speak Spanish, but with different degrees
of competence. In such a contact situation, Yaqui is the minority or vernacu-
lar language, and Spanish (or English in the US) the dominant language. The
degree of bilingualism is typically asymmetrical. There are a few speakers,
most of them elderly, who do not seem to understand or speak Spanish in
Mexico, or English in the US, and who might be considered as monolingual
in Yaqui.
The Yaqui are generally known as an indigenous group that has demon-
strated strength, pride, and a demanding character throughout four hundred
years of Spanish occupation. They have probably waged more military revolts
against the Spanish or Mexican governments than any other group, particu-
larly from 1608 till 1929. The Yaqui are also among the few native groups that
do not allow others to photograph them or record their festivities.
Currently, the Yaqui language is spoken within a family context, during
religious rituals and ceremonies, as well as in traditional government events.
Most of the situations in which Yaqui is spoken take place among people be-
longing to the same ethnic group, but in other everyday activities, e.g. polit-
ical, educational, or economic, the speakers make use of Spanish. From 1994
till now they have conducted a bilingual program in order to teach Yaqui in
all elementary schools in the Yaqui area.
Most of the data considered for this chapter come from Estrada’s own field
notes, collected while preparing a Yaqui–Spanish dictionary (Estrada et al.
2004), a language documentation archive (Estrada and Buitimea, in press),4
and a collection of several discourse genres, now in progress.5
2. Phonology
The Yaqui sound system has five vowels and fifteen consonants, two of which are
glides. In comparison with other Uto-Aztecan languages, the Yaqui phono-
logical system is quite simple and it resembles the Spanish one.6 Vowels in
Yaqui are the same as in Spanish: /a/, /e/, /i/, /o/ and /u/. The complexity of the
system is found in long vowels, in their syllabic weight, and in their combin-
ation with the glottal stop. In the adaptation of loanwords, vowel lengthening
replaces a stressed vowel from Spanish:
(1) boosio ‘goiter’ (Spanish bocio)

(2) faajam ‘belt, girdle’ (Spanish faja)
(3) kameeyo ‘camel’ (Spanish camello)
(4) serbeesa ‘beer’ (Spanish cerveza)
(5) taasa ‘cup’ (Spanish taza)
The Yaqui consonant system differs from the one in Spanish only in a
few features: lack of the labiodental fricative /f/, lack of the dental stop /d/,
presence of the labiovelar fricative /bw/, presence of the glide /w/ which also
replaces the Spanish velar stop /g/, presence of an aspirated laryngeal frica-
tive /h/ written by the Yaqui from Sonora as <j>, and finally the absence of
the trill /r/ and the palatalized /ñ/. The phonological impact of some of those
features in the adaptation of loanwords can be illustrated as follows:7
(6) arorno ‘adornment’ (Spanish adorno)

(7) kape ‘coffee’ (Spanish café)
(8) kareta ‘cart’ (Spanish carreta)
(9) wolpo ‘gulf’ (Spanish golfo)
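The substitutions seen in (6)–(9) can be sketched as a small lookup table. The Python snippet below is an illustration only: it works on Spanish spelling rather than on phonemes, and the replacement values (d > r, f > p, g > w, rr > r, plus the purely orthographic c > k) are read off these four examples. It is not a complete statement of Yaqui loan phonology, and other loans in this chapter (e.g. abogar-oa) keep Spanish g. The names SUBSTITUTIONS, strip_accents and adapt_consonants are hypothetical.

```python
import unicodedata

# Ordered replacements on Spanish orthography, read off examples (6)-(9).
# Assumption: these pairs cover only the four examples, not the whole lexicon.
SUBSTITUTIONS = [
    ("rr", "r"),  # no trill in Yaqui: carreta -> kareta
    ("c", "k"),   # spelling only: Spanish <c> before a/o is /k/
    ("d", "r"),   # no /d/ in Yaqui: adorno -> arorno
    ("f", "p"),   # no /f/ in Yaqui: café -> kape, golfo -> wolpo
    ("g", "w"),   # /w/ replaces /g/: golfo -> wolpo
]


def strip_accents(word: str) -> str:
    """Remove written accents (é -> e); note this would also turn ñ into n."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def adapt_consonants(spanish_word: str) -> str:
    """Apply the substitutions in order to a Spanish orthographic form."""
    word = strip_accents(spanish_word.lower())
    for old, new in SUBSTITUTIONS:
        word = word.replace(old, new)
    return word


for source in ["adorno", "café", "carreta", "golfo"]:
    print(source, "->", adapt_consonants(source))
```

Vowel lengthening, as in (1)–(5), is deliberately left out of the sketch, since it depends on the position of Spanish stress.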
There are two phenomena related to nominal structures that arise in elem-
ents borrowed into Yaqui. Following Matras and Sakel (2007), both involve
the direct borrowing of morphemes (i.e., mat), rather than remodeling of
the structure (i.e., pat). The first phenomenon is not a general one, since it
applies randomly to a limited number of Spanish plural nouns. As shown in
the examples (10) through (13), some nouns are borrowed as plural forms
marked by -s; this suffix, however, has lost its functional value in Yaqui due
to semantic bleaching:
(10) aaso-s ‘garlic’ (Spanish ajo)

(11) laabo-s ‘nail’ (Spanish clavo)
(12) waeba-s ‘guava’ (Spanish guayaba)
(13) ankele-s ‘angel’ (Spanish ángel)
One possible explanation is that Yaqui adapts non-plural nouns from

Spanish (e.g. aasos ‘garlic’ < Spanish ajo) as if they were plural or collective
entities. Evidence for the semantic bleaching of the Spanish plural suffix is
observed in example (14), where the lexical item waka-s ‘cow-pl’ is addi-
tionally pluralized by means of the Yaqui plural suffix -im:
(14) waka-s-im ‘cows’ (Spanish vaca)
Examples of non-plural or non-collective nouns borrowed from Spanish but
bearing the Yaqui plural suffix -im are illustrated below, where, due to
morpho-phonological factors, -im sometimes reduces to -m:
(15) kuchi’-im ‘knife-pl’ (Spanish cuchillo)
(16) kajtiila-m ‘castle-pl’ (Spanish castillo)
(17) santo-m ‘Saint-pl’ (Spanish santo)
(18) ornia-m ‘stove-pl’ (Spanish hornilla)
The following examples show that Spanish mass nouns are also adapted
into Yaqui as plural nouns:
(19) aina-m ‘flour-pl’ (Spanish harina)
(20) chicharoon-im ‘pig skin-pl’ (Spanish chicharrón)
(21) pan-im ‘bread-pl’ (Spanish pan)
(22) aros-im ‘rice-pl’ (Spanish arroz)
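The distribution of -im and -m in (14)–(22) can be sketched in code. The following Python snippet is only an illustration. The conditioning it assumes (-m after a vowel-final stem, -im after a consonant-final stem, with the glottal stop <’> counted as a consonant) is inferred from the examples above and is not stated as a rule in this chapter. The function name pluralize is hypothetical.

```python
VOWELS = set("aeiou")


def pluralize(stem: str) -> str:
    """Attach the Yaqui plural suffix, separated from the stem by a hyphen."""
    # Assumption: vowel-final stems take -m, all other stems take -im.
    suffix = "m" if stem[-1] in VOWELS else "im"
    return f"{stem}-{suffix}"


for stem in ["santo", "aina", "pan", "aros", "kuchi’", "wakas"]:
    print(pluralize(stem))
```

Under that assumption the sketch reproduces the segmented forms cited above, including wakas-im from (14), where the bleached Spanish -s is treated as part of the stem.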
The second phenomenon concerns the borrowing of the suffix -ero which
derives agentive nouns. The suffix occurs in Yaqui as either -eo, -e’o or -ero;
Dedrick and Casad (1999: 77) document the suffix as -leo ~ -reo, but they
do not identify the suffix as originally from Spanish. Two examples from
these authors are kanoa-reo-m ‘boat-agt-pl’ (from Spanish canoa ‘boat’)
and kuču-leo ‘fish-agt’ (from Yaqui kuchu ‘fish’).
(23) bantear-eo ‘who carries the flag’ (Spanish abanderado)

(24) apar-eo ‘who plays the harp’ (Spanish arpero)
(25) tampar-eo ‘who plays the drum’ (Spanish tamborilero)
(26) kapint-eo ‘carpenter’ (Spanish carpintero)
(27) labele-eo ‘who plays violin’ (old Spanish ravel)
(28) bak-e’o ‘cowboy’ (Spanish vaquero)
Most of the time, the meaning of the loanword is maintained in Yaqui,
although semantic extensions may be observed. For instance, the Spanish
word vaca ‘cow’ in (14) was first borrowed as wakas ‘cow’ and then extended its
meaning to ‘meat’; later, with the agentive suffix added, it yielded the noun
wakareo ‘butcher, the person who sells meat’. A more recent loan from the
same Spanish noun vaca is shown in (28), where the phoneme /b/ from Spanish
is preserved in the derivation of the agentive noun ‘cowboy’ (or ‘horseman’).
The use of this derivational suffix is very productive within the language,
since it is also used to derive agentive nouns from non-Spanish nouns and
verbs, as demonstrated in:
(29) kuta-reo ‘woodcutter’ (Yaqui kuta ‘wood’)

(30) tajkai-reo ‘tortilla maker’ (Yaqui tajkai ‘tortilla’)
(31) bwik-reo ‘singer’ (Yaqui bwike ‘sing’)
(32) ji’ik-reo ‘dressmaker’ (Yaqui ji’ike ‘sew’)
(33) ji’ojte-reo ‘writer’ (Yaqui ji’ojte ‘write’)
The Yaqui verbal structure is barely affected as the result of contact, except
for one case of direct loan of a morpheme (mat): the verbalizer suffix -oa.
This suffix, originally borrowed from Nahuatl, is used to derive finite verbs
from Spanish infinitival forms. Some examples are presented below:
(34) abogar-oa ‘to advocate, argue in favor’ (Spanish abogar)

(35) wantar-oa ‘to hold’ (Spanish aguantar)
(36) kombilar-oa ‘to mix’ (Spanish combinar)
(37) piar-oa ‘to lend’ (Spanish fiar)
(38) leiar-oa ‘to read’ (Spanish leer)
(39) passar-oa ‘to pass’ (Spanish pasar)
(40) pensar-oa ‘to think’ (Spanish pensar)
The adaptation of Spanish verbs via the Nahuatl suffix -oa is relatively
new in Yaqui. Karttunen and Lockhart (1976: 32) mention that the -oa strat-
egy (along with two related forms, -uia and -ltia) was generalized in Nahuatl
around 1700. This process may have spread to Yaqui somewhat later,
since the grammar of Tomas Basilio, published by Buelna (1989),8 provides
only three verbal loans, none of them with the suffix -oa (and all involving
the Yaqui verbalizer suffix -te): capom-te ‘to cut testicles’ < Spanish capón;
manso-te ‘to tame’ < Spanish manso ‘tame’; and compes-ec-te or pes-ec-te
‘to confess’ < Spanish confesar plus -ek ‘to have’.
A slightly different explanation is provided by Dedrick and Casad (1999:
143), who argue that the suffix -oa may have been derived from the Yaqui
verb jooa ‘to do, to make’ (hooa in Dedrick and Casad’s orthography). Later
in their grammar, and mainly based on Karttunen (1984: 4), the authors
recognize that some of the verbs borrowed from Spanish “were mediated
through Nahuatl” (Dedrick and Casad 1999: 325–326). In addition to jooa,
Estrada et al.’s Dictionary (2004) lists only four non-Spanish verbs ending in
the suffix -oa, shown in (41)–(43). Rather than cases of morphological deriv-
ation through the suffix -oa, these verbs might be cases of phonological
coincidence:
(41) bamsoa ‘to hang’

(42) bo’ojoa ‘to walk’ (< bo’o ‘road’ and jooa ‘to do’, ‘to make’)
(43) jaboa ‘to become full’
Examples showing the full adaptation of the borrowed Spanish verbs are
provided in (44). As with any other verbal stem, tense–aspect–modal operators
as well as voice morphemes are added to these verbal loans. Further-
more, Spanish loanwords taking the suffix -oa have also spread
into other Uto-Aztecan languages, e.g. Huichol panchar-oa ‘to iron’
(Chablé p.c.).9
(44) a. jumak jente-ta alborotar-oa-k …

perhaps people-acc stir.up-vblz-pfv
‘quizás (eso) alborotó a la gente …’
‘perhaps (that) stirred up the people …’
b. junaman bea te desarmar-oa-wa-k …
there then 1pl disarmed-vblz-pass-pfv
‘allá entonces fuimos desarmados …’
‘We were then disarmed …’
Besides the cases discussed above, most Spanish lexical influence is
found in the category of ‘other parts of speech’, especially numerals, function
words, discourse markers, vocatives and idioms. Here we find both elem-
ents taken over directly from Spanish, i.e. mat, and elements remodeling the
Yaqui structure, i.e. pat. In fact, this last category opens the discussion on
where lexical and grammatical borrowing ends and code-switching begins.
The numeral system in Yaqui is vigesimal (base 20), and involves
basic, derived and compound items. The basic numerals, 1–6, are senu or
wepul, woi, baji, naiki, mamni, busani; nine is batani. Three further numerals are
derived: woobusani ‘seven’, wojnaiki ‘eight’, and wojmamni ‘ten’. Numbers
from 10 through 20 are compound nouns: all take the stem for ten as the first
part of the compound plus the adverbial locative element ama ~ aman ‘there’
followed by the cardinal number, e.g. wojmamni aman senu ‘eleven’. Above
20, the numerals take the word takaa ‘body’ (i.e. 20 fingers) as the basic unit,
e.g. senutakaa ‘twenty’. The rest of the system gets more complex as in wou-
mamni bajisi aman mamnitakaa ‘seven hundred’. For this reason, numbers
above ten are often borrowed from Spanish.
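The compounding pattern just described can be made concrete with a small sketch. The Python snippet below covers only 1–20 and uses only forms quoted in this section. Applying the ‘ten + aman + unit’ pattern to 12–19 is an assumption extrapolated from the single attested example wojmamni aman senu ‘eleven’, and the function name yaqui_numeral is hypothetical.

```python
# Forms for 1-10 as quoted in the text (senu also has the variant wepul).
BASIC = {
    1: "senu", 2: "woi", 3: "baji", 4: "naiki", 5: "mamni",
    6: "busani", 7: "woobusani", 8: "wojnaiki", 9: "batani",
    10: "wojmamni",
}


def yaqui_numeral(n: int) -> str:
    """Return a Yaqui numeral for 1-20, following the pattern in the text."""
    if 1 <= n <= 10:
        return BASIC[n]
    if 11 <= n <= 19:
        # Assumption: 'ten' + aman 'there' + unit, generalized from 'eleven'.
        return f"{BASIC[10]} aman {BASIC[n - 10]}"
    if n == 20:
        # One 'body' (takaa), i.e. twenty fingers and toes.
        return BASIC[1] + "takaa"
    raise ValueError("only 1-20 are covered by this sketch")


print(yaqui_numeral(11))  # wojmamni aman senu
print(yaqui_numeral(20))  # senutakaa
```

Higher numerals are not modelled. On one possible reading, which is our own arithmetic and not given in the source, the form cited for ‘seven hundred’ decomposes as (10 × 3 + 5) × 20 = 700, which gives an idea of why speakers prefer Spanish numerals above ten.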
The most common function word borrowed from Spanish is the conjunc-
tion o ‘or’:
(45) a. ume tomt-i kateme achai-m-ta-ka o

dem-pl born-adjvz walk.pl-nmlz father-pl-acc-sub or
mala-m-ta-ka…
mother-pl-acc-sub
‘Those youngsters from today, their fathers or mothers…’
b. o ankeliito-m-ta-k-uni kaa aet ju’unea-tek…
or angels-pl-acc-sub-also neg 3sg.obl know-cond
‘Or when the children didn’t understand…’
c. o jiba yo’ota bea au nattemai-ne…

or always elder-acc then 3sg.dat ask-fut
‘Or they always will ask to/for an elder…’
Another commonly borrowed function word is pos ~ poj ‘well’, ‘then’, ‘so’
(< Sp. pues). Such discourse particles correspond to what Mushin (2001)
considers a hesitation word. That is, pos communicates an act of hesitation,
in which the speaker suspends an opinion or action: an act of
doubt or vacillation in which the speaker or narrator takes a brief moment
to think about what follows or about what was just said. The clause in (46b) is
good evidence of the adaptation of this discourse particle into the Yaqui
grammatical system, since the second-position enclitic subject pronoun =ne
is attached at the end of poj. In (46c), pos appears in second position, imme-
diately following (bwe)ta ‘but’.
(46) a. Poj… inepo jewi im naa weye…

well 1sg.nom yes loc dir go.sg.hab
‘Well…, I live here…’
b. Poj ne kaa in=ejkuela-k kaita
so 1sg.nom neg 1sg.nom=school-have neg
ejtudio-ta ne jippue-k…
studies-acc 1sg.nom have-pfv
‘so…, I haven’t had school, no studies I had…’
c. Ta pos kaa itepo a=kulpa-k…
but then neg 1pl.nom 3sg.acc=guilt-pfv
‘But we are not guilty…’
Yaqui makes use of several strategies to express conditional clauses. How-
ever, the Spanish marker si ‘if’ is commonly used in this type of construc-
tion. As in Spanish, the particle si usually introduces the protasis clause:
(47) a. si nee a=mabett-ne-’u…

if 1sg.acc 3sg.acc=accept-fut-nmlz
‘If I were accepted…’
b. junama bea ne economia-ta bea ne
there then 1sg.nom economics-acc then 1sg.nom
nattemae-k si ama aayu-k o kaa ama aayu-k.
ask-pfv if there exist-pfv or neg there exist-pfv
‘(And) there, then, I asked if there was economics or not.’
In Spanish, the comparative conjunction como ‘as’, ‘like’ (borrowed into Yaqui
as komo) has more than one function: one expresses similarity, the other
comparison. Yaqui uses the suffix -su for the first function, as in (48a), and the
postposition benasi for the second one, as in (48b):
(48) a. joan wikia-ta wi~wikosa-su-k.

John string-acc rdp~belt-mod-pfv
‘John used a rope like a belt.’
b. aapo ousi tu’i-si tekipan-oa em=usi-ta
3sg.nom int good-mod work.prs-vblz 2sg.gen=son-acc
benasi.
comp
‘He works as well as your son.’
When komo occurs as a loanword, it resembles a comparison particle, but

it also functions as an evidential marker expressing a greater degree of cer-
tainty. In the following examples komo might be translated into English as
‘more or less’, ‘as’, ‘like about’, or ‘like’:
(49) a. tekipanoa-reo-tu-kan komo todo el tiempo, jewi!

work-agt-cop-sub like all det time, yes
‘(It was a good) worker like all the time, yes!’
b. binwatu komo setentai ocho wasukte…
time_ago like seventy eight years
‘(I was born) like about seventy eight years ago…’
c. junum bea ne komo junak tiempo-po kaa ejkuela-wa…
then then 1sg.nom like moment time-loc neg school-pass
‘Then, about that period there was no school…’
d. komo iane eme inim neu aane into=t
like now 2pl.nom here 1sg.dat exist.hab and=1pl.nom
nau=aane.
together=exist.hab
‘as now that you are here with me and we are together…’
Another borrowed particle is the Spanish temporal preposition hasta ‘up
to’, used as a locative, limitative particle as illustrated in (50a, b), and hasta
que ‘until’, used as a temporal, limitative marker in (50c). Although the
preposition hasta has two possible meanings in Mexican Spanish, ‘till’ and
‘when’, the second one has not yet been documented in Yaqui:
(50) a. jeka-po chaasime asta junum ian “Ten Jawee-po”.

wind-loc hang-go.sg.prs till there today mouth open-loc
‘It went rolling down up in the air till what is today the “open
mouth” mountain.’
b. asta Merida-wi pino-m ja’awe-ka’a-po lutula.
till Merida-dir pine-pl be-pfv-loc right
‘Till Merida, right there where the pines are.’
c. asta ke n-a=bwise-k.
until that 1sg.nom-3sg:acc=get-pfv
‘Until the time I got it.’
Contrary to what has been documented by Lindenfeld (1971, 1982), very
few cases of the Spanish subordinator que have been found in our database,
which is mainly based on oral discourse materials. The exception is one speaker,
Agripina Amarillas, who makes extensive use of this loanword:
(51) Jiokot te a=pasa-roa-k ke te trajte-ta

not-good 1pl.nom 3sg.acc=spend-vblz-pfv sub 1pl.nom dish-acc
ama wo’ota-tua-wa-k munim jitasa joona-po jo’o-wa-me
loc throw-caus-imprs-pfv beans that oven-loc make-imprs-rel
‘We spend it so bad that we were forced to throw away the beans on
the oven that were prepared by somebody…’
Vocative expressions are also among those elements borrowed commonly

from Spanish. Such elements occur with certain frequency in creative or po-
etic texts. Some of the vocatives are: no ‘no’, ‘not’ <Sp. no; aai! ‘oh!’ <Sp.
¡ay!; aa! ‘oh!’ <Sp. ¡ah!:
(52) a. in(to)-te-m-emo=waa-ek? No!

and-1pl.nom-lig-rflx=sister-have no
‘…and they sisters among them? No!’
b. ¡Aai! ¡Semalulukut kaa ko’okosi yaapo maisi empo
hey hummingbird neg fragile look like 2sg.nom
emo yoa’e!
rflx tremulous.imp
‘Hey! Hummingbird you don’t look fragile when you move your-
self (your wings) tremulous!’
c. ¡Aa! Bwe’ituk te kaabe-ta-mak etejo-machi!…

oh because 1pl.nom none-acc-com talk-seem
‘Oh! Because we are not (sure) to whom he seems to be talking
with!’
d. Ah! Caramba, neu eela kaate ume yo-im.
ah great 1sg.dat behind be.pl.prs dem.pl yori-pl
‘Oh! Great, the Mexicans are near me!’
Furthermore, in a small collection of procedural discourses, where the
preparation of distinct foods is described, the particle ori appears as a hesita-
tion word.10 This word has probably been borrowed from the Spanish tem-
poral conjunction ora. The alternative Yaqui forms ori and orita seem to be
motivated by their position within the sentence – first or last. Unfortunately,
there is no way to demonstrate at this time that ori ~ orita are truly Spanish
loans. Examples are provided in (53):
(53) a. Ori, Loma Wamochil-po into ket pa~pajko-ria-wa…

mmhm Loma Guamúchil-loc and also rdp~-festivity-appl-imprs
‘And, mmhm, in Loma de Guamúchil (the festivities) are also
celebrated.’
b. muuni-m ota-k-a-me o kesum-k-a-me…
beans-pl bone-have-a-nmlz or cheese-have-a-nmlz
paapa-m wakas-ek-a-me orita…, ala!
potatoes-pl meat-have-a-nmlz mmhm yes
‘Beans with bone or cheese… potatoes with meat, mmhm, yes…’
The vocative words borrowed from Spanish and illustrated within this sec-
tion are quite common in Yaqui narratives, although their occurrence varies
according to the speaker or the discourse type. Every individual speaker will
give a distinct communicative force to the discourse according to their own
personal choice or attitude, that is, according to the pragmatic context; all of
the vocatives are fully adapted into the phonology of Yaqui.
Yaqui’s constituent order has received, in general, little or minimal influ-

ence from Spanish. However, the conjunction komo illustrated before in (49)
seems to have introduced a change in the constituent order within compara-
tive clauses involving the Yaqui postposition benasi. The example in (54) il-
lustrates benasi before the adjective teebe ‘tall’, which is the usual position
for komo in Spanish:11
(54) aapo nee benasi teebe.

3sg.nom 1sg.acc comp high
‘He is as tall as I am.’
In contrast with the position of benasi in (54), the examples in (55) illustrate
the regular postnominal order of benasi:
(55) a. Ket-kea ili ito aet womta-la-ta benasi…

also-only dim rflx 3sg.dir scare-adjvz-acc comp
‘Like if we were scared a little bit (towards) ourselves…’
b. Tua te wepul mampusiam-po benasi…
true 1pl.nom one finger-pl-loc comp
‘In fact there was like about only one small amount…’ (lit. like
about a finger)
The order of benasi in example (56), which is the usual position of
the conjunction como in Spanish, also demonstrates the influence of this language
in introducing a change in the word order (pat) of Yaqui:
(56) Benasi t=a ta’a-pea.

comp 1pl.nom=3sg.acc know-des
‘Like when we want to know it (the sun).’
A comparative clause where the postposition betchi’ibo is in the word-order
position considered to be original to Yaqui is given in (57):
(57) Jume kuchum che’a su~sua-k-an nee betchi’ibo.

dem.pl fish.pl more rdp~think-st-pasc 1sg.acc postp
‘The fish were smarter than me.’
7. Conclusions
After four hundred years of Spanish influence on the Yaqui language and
culture, the language shows few cases of grammatical borrowing. Language
contact between Yaqui and Spanish is strong at the lexical level, but almost
nil in the grammar. Yaqui has adopted a Nahuatl morphological strategy –
the suffix -oa – to adapt Spanish verbs, and the Spanish derivational suffix
-ero, adapted as -e’o, -eo or -ero, to derive agentive nouns. Most of the loanwords
are discourse particles: conjunctions or subordinating elements. Some
of them are used as hesitative, hortative or vocative particles, which typically
operate at the discourse level. Most of the borrowings are mat-loans.
One last interesting contact phenomenon, which can be taken as a true instance
of code-switching rather than of borrowing, is the use of multimorphemic
elements, that is, expressions (idioms) which include more than one
word. Instances of code-switching are frequently used to give communicative
force to the discourse, but they can vary according to both the topic and
the speaker. Estrada, Morúa and Álvarez (2005) have illustrated some cases
where Spanish expressions are used in narrative texts as a way of introducing
communicative force. Good examples of code-switching are the following:
(58) a. Pero no es igual, ian tajti bea ne inim weama.

now dir then 1sg.nom here be-prs
‘… but it is not the same, I’m still around.’
b. Si nesio… ju’u yoi jodido!
int necio det yori jodido12
‘How stubborn… the damn white-man!’
c. soldado a huevo…
soldado a huevo
‘forced to be a soldier…’
d. Bwe tua nee chingo-k ommme…!
but int 1sg.acc chingar-pfv man
‘But, in fact, I fuck myself!’
Abbreviations
acc accusative dat dative
adjvz adjectivizer dem demonstrative
agt agentive des desiderative
appl applicative det determiner
caus causative dim diminutive
com comitative dir directional
comp comparative fut future
cond conditional imp imperative
imprs impersonal pasc past continuous

int intensive postp postposition
loc locative rdp reduplication
neg negative rflx reflexive
nmlz nominalizer rel relativizer
nom nominative sg singular
obl oblique st stative
pfv perfective sub subordinator
pl plural vblz verbalizer
prs present
Notes
1. We are grateful to the Max Planck Institute for Evolutionary Anthropology, in
particular to Bernard Comrie, for hosting Estrada during the summers of 2006
and 2007, making it possible for us to carry out this work; we also thank Fred Field
for his useful comments on an earlier version of this chapter.
2. There are several articles dealing with the influence of Spanish on the structure
of Yaqui: Dozier (1964), Johnson (1943), Lindenfeld (1982); two deal mainly
with the Yaqui of Arizona (US); only Johnson’s study deals with the
Yaqui spoken in Sonora (Mexico).
3. ISO 6393: yaq.
4. Crescencio Buitimea Valenzuela, who has coauthored several studies with Es-
trada, is a native speaker of Yaqui born in Vícam, Sonora. He is now preparing
a Handbook dictionary of the Yaqui language and several books for teaching the
language.
5. Estrada and Silva (2006) deal with some properties of the Pascolas’ discourse,
a kind of religious discourse, which is used among the Yaqui in their festivities.
The Pascola is a Yaqui man who performs the role of an anti-religious person, a
clown. We want to express our gratitude to Manuel Carlos Silva Encinas for his
generous guidance and for sharing with us his discourse materials in Yaqui.
6. The similarities between the phonological inventories of Yaqui and Spanish are due
to mere coincidence. See Voegelin, Voegelin and Hale (1962) as well as Whorf
(1935) for studies on the reconstruction of Proto-Uto-Aztecan. Throughout this
work we are using the orthography approved by the Yaqui from Sonora, Mexi-
co.
7. Escalante (1988) provided a brief account of the adaptation of loanwords
into the Yaqui from Pascua. Estrada (2005) presented a detailed analysis of the
phonological adaptation of Spanish loanwords into Yaqui.
8. Tomas Basilio’s grammar deals with Tehuelco, an extinct Cahitan dialect.
9. Chablé’s field notes are centered on the Huichol variety spoken at La Palmita,
municipality of Mezquitic, Jalisco. Chablé is working on her Master’s thesis.
10. A procedural discourse has been described as an activity-oriented discourse,

where a series of activities, which are done by somebody, are described in a
chronological order.
11. Lindenfeld (1971: 89), however, has considered those constructions
to be native to Yaqui.
12. In Spanish: ‘¡Qué necio… el yori jodido!’
References
Buelna, Eustaquio
1989 [1890]. Arte de la lengua cahita por un padre de la Compañía de Jesús.
México: Siglo XXI Editores.
Dedrick, John M., and Eugene H. Casad
1999 Sonora Yaqui language structures. Tucson: The University of Arizona
Press.
Dozier, Edward P.
1964 Two examples of linguistic acculturation: The Yaqui of Sonora and
Arizona and the Tewa of New Mexico. Language 32 (1): 146–157.
Escalante, Fernando
1988 Spanish loanwords in Yaqui. Work presented in CAIL–American An-
thropology Association. Phoenix, Az.
Estrada Fernández, Zarina
2005 Spanish loanwords in Yaqui. (Uto-Aztecan language from Northwest
Mexico). Paper presented at the Fifth Workshop on Loanword Typology.
6–7 June, Leipzig, Germany.
Estrada Fernández, Zarina, and Crescencio Buitimea Valenzuela
Forthc. Yaqui de Sonora. Archivo de lenguas indígenas de México. México: El
Colegio de México.
Estrada Fernández, Zarina, Crescencio Buitimea Valenzuela, Adriana E. Gurrola Ca-
macho, María Elena Castillo Celaya, and Anabela Carlón Flores
2004 Diccionario yaqui-español y textos: Obra de preservación lingüística.
México: Plaza y Valdés.
Estrada Fernández, Zarina, María del Carmen Morúa Leyva, and Albert Álvarez
González
2005 Actitudes lingüísticas y contacto entre lenguas: Préstamos del español
en yaqui. XIV Congreso Internacional de la Asociación de Lingüística
y Filología de América Latina. 19–23 Oct., Monterrey, Nuevo León.
Estrada Fernández, Zarina, and Manuel Carlos Silva Encinas
2006 El discurso de los pascolas entre los yaqui de Sonora, México. Friends
of Uto-Aztecan Conference (FUAC), Salt Lake City, UT. 23–26 August.
Johnson, Jean Bassett

1943 A clear case of linguistic acculturation. American Anthropologist 45
(3/1): 427434.
Karttunen, Frances
1984 An Analytical Dictionary of Nahuatl. Texas Linguistics Series. Austin:
University of Texas Press.
Karttunen, Frances, and James Lockhart
1976 Nahuatl in the Middle Years: Language Contact Phenomena in Texts
of the Colonial Period. Berkeley: University of California Publica-
tions. Linguistics 85.
Lindenfeld, Jacqueline
1971 Semantic categorization as a deterrent to grammatical borrowing: A
Yaqui example. International Journal of American Linguistics 37 (1):
614.
1982 Langues en contact: Le yaqui face à l’espagnol. La Linguistique 18
(1): 111–127.
Mushin, Ilana
2001 Evidentiality and Epistemological Stance: Narrative Retelling.
Amsterdam: John Benjamins.
Voegelin, Carl F., Florence M. Voegelin, and Kenneth L. Hale
1962 Typological and comparative grammar of Uto-Aztecan I (Phonology).
Publications in Anthropology and Linguistics, Memoir 17 of the Inter-
national Journal of American Linguistics. Indiana University.
Whorf, Benjamin L.
1935 The comparative linguistics of Uto-Aztecan. American Anthropologist
37: 600608.
The case of Otomi: A contribution to grammatical
borrowing in cross-linguistic perspective
Ewald Hekking and Dik Bakker
1. Background
In this chapter we describe the Spanish grammatical and lexical borrow-

ings in Otomi, a language from Central Mexico. Otomi is a member of the
Otomanguean family, which according to Ruhlen (1991: 37) belongs togeth-
er with Uto-Aztecan and Tanoan to the Central Amerindian stock. Otomi,
which constitutes together with Mazahua, Ocuilteca, Matlatzinca, Pame and
Chichimeca the Otopame group within Otomanguean, is currently spoken by
around 310,000 mostly bilingual speakers on the highlands around Mexico
City in the states of Mexico, Hidalgo, Querétaro, Puebla, Guanajuato, Tlax-
cala, Veracruz and Michoacán. We discuss two Otomi dialects, viz. the dia-
lect of Santiago Mexquititlán in the municipality of Amealco situated in the
southern part of the state of Querétaro, and the dialect of San Miguel in the
municipality of Tolimán situated in the northern part of the same state. San-
tiago Mexquititlán is a village with a population of around 15,000 inhabit-
ants in the mountains of Mexico’s neovolcanic axis. San Miguel is a village
with a population of around 670 inhabitants in the semidesert of the Sierra
Madre Oriental. In both villages the vast majority of the population are eth-
nic Otomis and both Otomi dialects belong to north-western Otomi, one of
the larger variants of Otomi with around 33,000 speakers in total. The Otomi
dialect spoken in Santiago Mexquititlán is rather similar to the Otomi dia-
lect of the villages in the north of the state of Mexico. The dialect spoken in
Tolimán is similar to the Otomi dialect of the Valle del Mezquital in the state
of Hidalgo. Our description is mainly based upon a corpus of around 110,000
tokens, collected during fieldwork between 1993 and 2004, and the result of
interviewing a total of 115 respondents. In this section we will give a short
historical sketch of the Otomi language community. The following sections
will discuss a number of borrowing phenomena on the respective levels of
linguistic description.
Like the majority of the larger language groups in Latin America, Otomi
has a rather long contact history with Spanish. From around the year 1500
onwards there has been contact between the two languages. As a result Otomi
has undergone pervasive influence from that European language. Otomi was the
native language of the original inhabitants of the Valley of Mexico and the
surrounding valleys. Throughout history its speakers had to confront Aztecs,
Spaniards and Mestizos, speakers of Nahuatl and Spanish. Both languages
belong to other language families, Uto-Aztecan and Indo-European respect-
ively. Since the Otomis had to surrender to the Nahuas in the fifteenth cen-
tury, there has been a very close contact between the Otomi and Nahuatl
languages. During that contact the Nahuas developed a very negative image
of the Otomis, which later was passed on to the colonial chroniclers, such
as Sahagún (1999), whose Nahuatl-speaking informants considered the Otomis
“toscos e inhábiles” (coarse and unskillful). The word Otomi is probably
a derivation of the Nahuatl word totomitl ‘birdhunter’ (Jiménez Moreno
1939). The Otomis themselves prefer to call their language Hñäñho, Hñöñhö
or Hñähñu, and themselves Ñäñho, Ñöñhö or Ñähñu. In these names, the
morpheme ñä and its variant ñö mean ‘speak’. The morpheme -ñho is prob-
ably a derivation of the adverb hño ‘well’. The morpheme h- marks the imper-
sonal or passive voice (Hekking 1995). The Otomis have also been in contact
with the Mazahuas, with whom they had a relation of equality, and the Chich-
imecs, in comparison with whom the Otomis probably felt superior. In this
connection it is interesting to mention that the Otomis from Tolimán claim
that their forbears originally spoke Chichimeca. This could mean that in the
Otomi spoken in that community Chichimeca substrate might be found.
As the Otomis were the second most numerous group after the Nahuas on
the Mexican highlands, the Spaniards were very much interested in their con-
version. Although their language was considered to be very difficult because
of the fact that Otomi has considerably more vowels and consonants than
Spanish, a spelling system for Otomi was developed, as well as vocabularies
and grammars. Catechisms and legal documents were written in Otomi on
some scale. Missionary friars of the Franciscan order in particular studied
the Otomi language, such as Fray Alonso Urbano ([1605] 1990). The colonial
documents written in Otomi are not easily accessible, since the authors do not
always distinguish the Otomi phonemes that have no corresponding element
in Spanish, especially among the vowels.
After the independence of Mexico in 1813 the indigenous groups were
officially no longer recognized as such and lost most of the status implied
by that. As a result, many Otomis could no longer afford their education and
Otomi stopped being used by the civil authorities. Only a handful of scholars
continued learning and studying the language. It was in the nineteenth cen-
tury that a process of language shift started. The Mexican Revolution (1911–
1917) did not lead to social change for the Otomi population, nor did it foster
recognition of their language or stop language shift. On the contrary, after
a long history in which the Otomis had been degraded socially, they now belong
to the lowest social levels of Mexican society. They live from subsistence
agriculture in the most remote and least fertile places on the highlands, which is
a reason for many of them to emigrate to the big cities, such as Mexico
City, Guadalajara and Monterrey. In the twentieth century several attempts
have been made to integrate the indigenous peoples into the national community
by means of what is officially called Educación Bilingüe. This is taken care
of by indigenous teachers with a very negative attitude towards their own
roots and a complete lack of knowledge about bilingual education. As a con-
sequence, most Otomis are illiterate in their first language and very often have
insufficient command of the standard variety of Mexican Spanish. Otomi is
only spoken inside informal domains such as the family.
During the last 20 years, because of the construction of roads and schools,
the growing influence of the media and the increasing trade and emigration,
contact between the relegated Otomis and the Spanish speaking world of the
Mestizos has increased considerably. As a result, a rapid increase in contact
phenomena from Spanish may be observed (Hekking 1995, 2001, 2002; Hek-
king and Bakker 1998a, 1998b, 2005; Hekking and Muysken 1995). Because
of the fact that Otomi is a stigmatized language, only spoken by poor and
traditional people, many Otomis no longer want to pass the indigenous language
on to their children.
So much for the historical background of the Otomi language community. The
rest of this contribution is organized as follows. In Section 2 we describe how
parts of the Otomi phonological system have changed as a result of contact
with Spanish. In Section 3 we will see how Otomi has changed on a number
of typological parameters. In Sections 4, 5 and 6 we have a closer look at the
borrowing in nominal and verbal structures and with respect to grammatical
elements. In Sections 7, 8 and 9 we will discuss how Otomi constituent order,
syntax and lexicon have been affected by language contact. Finally, in Sec-
tion 10 we will draw some conclusions.
2. Phonology
When Spanish loanwords are inserted in an Otomi utterance, they may adjust
to different degrees to the phonological patterns of the target language. The
degree of phonological integration of the Spanish borrowings depends on the
age of the loanword and on the degree of bilingualism of the speaker. The
following adaptations of Spanish borrowings to the phonological system of
Otomi were observed.
a. The Spanish open central vowel a (ä in the spelling system of Hekking

2002) tends to nasalize in syllables that start with the nasal consonants m, n
or ñ: animä (< anima); apenä (< apenas); ngañä (< engaña)1
b. Sibilants and voiced and unvoiced plosives at the beginning of a syllable
tend to nasalize: mbakuna (< vacuna); mprenda (< prenda); nsinku (< sin
que); ndezde (< desde); nkada (< cada)
c. Unstressed syllables tend to get lost, especially when initial: biskleta (<
bicicleta); bwela (< abuela); fende (< defende); tonse (< entonces); reglo
(< arreglo); Rike (< Enrique); wanta (< aguanta)
d. Vowels that form the nuclei of unstressed syllables tend to be replaced by
the central vowel u: asundado (< hacendado); bispura (< víspera); kasu (<
caso); sumiya (< semilla)
e. Consonant clusters that do not exist in the phonological system of Otomi
tend to become simplified: akol (< alcohol); otubre (< octubre); another
way of simplification of complex consonant cluster is the insertion of the
central vowel u: asukwenta (< haz de cuenta); ekutarya (< hectárea)
f. Diphthongs tend to get lost, especially ie after the consonants d, f, m and k:
defende (< defiende); denda (< tienda); anke (< aunque); korpo (< cuerpo)
g. Final consonants tend to get lost: a bese (< a veces); abri (< abrir); kondisyo
(< condición)
h. Mid vowels tend to become high, especially in unstressed syllables: bisinu
(< vecino); gubyernu (< gobierno); but also in stressed syllables (which
may deviate from the stressed syllable in the original): dumi (< tomín (=
old coin, eighth part of a castellano)), Huse (< José), kumpa (< compadre),
mundo (< montón)
i. The unvoiced velar fricative j tends to become a glottal fricative h: bruha (<
bruja); byeho (< viejo); hente (< gente); konhunto (< conjunto); Mehiko (<
Méjico); mehor (< mejor)
j. The s – especially between two vowels or at the beginning of a syllable
– tends to become palatalized: guxto (< gusto); mexa (< mesa); xebo
(< sebo)
k. The unvoiced bilabial, dental and velar plosives p, t and k tend to become voiced,
especially with old loanwords: baga (< vaca); bindo (< viento); denda (<
tienda); yunda (< junta)
l. axo (< ajo), paxa (< paja), Xuwa (< Juan) are examples of old loanwords,
the palatals of which are pronounced as velar fricatives in modern Spanish
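Some of the tendencies listed above can be restated as simple string rewrites. The following Python sketch is an illustration only and not a procedure taken from the corpus study: it covers the nasalization of word-initial plosives and sibilants (b) and the change of the velar fricative to h (i). It assumes a hypothetical helper, respell, that converts Spanish spelling to the Otomi-style spelling used in the examples (c to k or s, qu to k, v to b, z to s, ll to y). All function names are invented for this sketch, and since the adaptations are tendencies that depend on the age of the loan and on the speaker, the rules are not expected to derive every attested form.

```python
import re


def respell(word: str) -> str:
    """Hypothetical helper: rewrite Spanish spelling in the Otomi-style
    orthography used in the examples (an assumption of this sketch)."""
    w = word.lower()
    w = w.replace("qu", "k")              # chaqueta -> chaketa
    w = re.sub(r"c(?=[eiéí])", "s", w)    # c before e/i is a sibilant
    w = re.sub(r"c(?!h)", "k", w)         # any other c, except in ch
    return w.replace("v", "b").replace("z", "s").replace("ll", "y")


def prenasalize(word: str) -> str:
    """Tendency (b): add a nasal before a word-initial plosive or sibilant.
    Only the onsets attested in the examples (b, p, d, k, s) are handled."""
    if word[:1] in ("b", "p"):
        return "m" + word
    if word[:1] in ("d", "k", "s"):
        return "n" + word
    return word


def velar_to_glottal(word: str) -> str:
    """Tendency (i): Spanish j, and g before e or i, are rendered as h."""
    return re.sub(r"j|g(?=[eiéí])", "h", word)


if __name__ == "__main__":
    for spanish in ("vacuna", "prenda", "cada"):
        print(spanish, "->", prenasalize(respell(spanish)))
    for spanish in ("bruja", "gente", "conjunto", "mejor"):
        print(spanish, "->", velar_to_glottal(respell(spanish)))
```

Under these assumptions the sketch yields mbakuna, mprenda and nkada for (b) and bruha, hente, konhunto and mehor for (i). Forms such as ndezde or Mehiko involve further adjustments that are not modeled here.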
Apart from these adaptations, new sounds are introduced into Otomi as well.
As a result of the incorporation of unassimilated Spanish loanwords, several
new segments have been added to the Otomi repertoire. Examples are the
trilled alveolar vibrant rr in loanwords such as burru (< burro) and surru
(< zorro), as well as the lateral l in loanwords like ladriyo (< ladrillo); lado;
biskleta (< bicicleta); bwelo (< abuelo). The lateral now also appears in native
items, such as lele ‘baby’. Through the adoption of Spanish and to a much
lesser extent also Nahuatl loanwords the affricated alveopalatal may have
been introduced in Otomi, as in chachalaka (< chachalaca (= type of bird));
chikiuite (< chiquiuitl (= basket)); chaketa (< chaqueta); chofe (< chófer)
(cf. Hernández Cruz, Torquemada, and Sinclair Crawford 2004). As a pos-
sible result affricates are also palatalized in native forms, e.g. in Santiago
Mexquititlán we found tx’aki instead of ts’aki (Hekking and Andrés de Jesús
forthc.).
Furthermore we find consonant clusters which are typical for Spanish but
unknown in classical Otomi, as in ektarya and septyembre. Other patterns
unknown in the phonological system of Otomi are found in loanwords as
well, such as syllable final consonants, and unstressed first syllables, as in
prisidente and kampesinu.
So far we have no evidence that Otomi vowel and consonant harmony, tone
and intonation have been affected in any distinctive way by the contact with
Spanish.
3. Typology
In this section we will discuss how Otomi is changing on three typologic-

al parameters: constituent order, morphological structure and its parts-of-
speech system.
As in the other Otomanguean languages, VOS and VS are the basic con-
stituent orders at the clause level in classical Otomi (Suárez 1983; Yasugi
1995). In today’s language SVO order is frequently used (Lastra de Suárez
1994; Hekking 1995), although VS order is still very common, as is shown
in example (1) (from Salinas Pedraza 1983: 105 (521)).
(1) Nubye dä meengä rä ngu nu’ä rä ’ñeei.

now fut.3 go.back pos.3 house that def.sg witch
‘Now the witch will go home again.’
From a morphological point of view, classical Otomi has a rather compli-

cated synthetic structure on the lower syntactic levels, more specifically in
the noun phrase (NP) and the verb phrase (VP). At the sentence level the
structure is more analytical, and it is not uncommon to find asyndetic com-
pounding and the bare juxtaposition of constituents, with very few explicit
markers of the semantic or syntactic relations, such as adpositions, coordina-
tors and subordinators between the constituents. As a result the meaning at
the clause level must often be deduced from the meaning of the main verb
and the context. Also possession is marked via possessive pronouns rather
than via an adposition, as shown in example (2) (from Hekking and Andrés
de Jesús, forthc.).
(2) Di ’weti ár zexjo ar Xuwa.

pres.1 sew pos.3 pants def.sg Juan
‘I’m sewing Juan’s pants.’
Classical Otomi uses verbal suffixes to express additional participants in

the action expressed by the main verb, such as the suffix -wi for company and
-bi for beneficiaries. For other functions the language has a small number of
particles at its disposal: mainly dige, ir nge, nguu and ja. Historically, these
had mainly verbal functions, but nowadays they resemble Spanish prepos-
itions in the sense that they mark objects of comparison, instrument, cause,
manner and spatial orientation. This is exemplified in (3) (from Hekking and
Andrés de Jesús forthc.).
(3) Poni ndunthe ar ngunt’uzaa

come.out much def.sg dust-wood
ho thets’i ya zaa ir nge ar thegi.
where cut def.pl wood instr def.sg saw
‘You will find much sawdust where they cut wood with a saw.’
In order to combine two or more clauses into a compound sentence, clas-

sical Otomi typically uses juxtaposition, not only for simple conjunction but
also in the case of final clauses and the representation of (indirect) discourse.
For relative clauses, which are always postnominal in Otomi, the gapping
strategy is used, without any form of connection via a relative pronoun or a

subordinator.2 All these functional gaps may have motivated Otomi speakers
to adopt many Spanish prepositions, subordinators, coordinators and relative
pronouns. Possibly as a side effect of this, the verbal suffixes which mark
a relation between the predicate and certain adjuncts in classical Otomi are
disappearing from the language (Hekking 1995: 155–161). It may therefore
well be the case that, as a result of intensive contact with Spanish and the
ensuing high degree of bilingualism, Otomi is becoming a language with less
asyndetic connecting at the clausal level and more analytical structuring on
the lower syntactic levels.
Let us now have a closer look at the parts-of-speech system of Otomi.
Otomi could be characterized as a rigid language in the sense that the vast
majority of the major lexical items fall in one of two clearly separate classes,
i.e. nouns and verbs. Lexical elements of these two classes can be easily
distinguished by their specific positions in the syntax and by the different
types of proclitics and affixes that accompany them and are never found on
the other class. Among linguists with expertise in Otomi there is no general
agreement as to whether Otomi has a third and fourth class of lexemes, i.e.
adjectives and adverbs. Many concepts which in Spanish and English are ex-
pressed by way of an adjective are expressed by way of a noun or a verb in
Otomi. For example, ra’yo ‘new’, dätä ‘big’, nduxte ‘naughty, wild’, gunt’ei ‘jealous’
and ngoñä ‘bald’ are nouns, which are preceded by an article, as is shown in
example (4).
(4) a. D-ar nduxte.

pres.1-def.sg naughty
‘I am naughty.’
b. G-ar nduxte.
pres.2-def.sg naughty
‘You are naughty.’
Other lexemes, such as dathi ‘ill’, txutx’ulo ‘small’, and johya ‘content,
happy’ are verbs in Otomi, and get the usual verbal proclitics, as is shown in
example (5).
(5) a. Di=johya.
pres.1=be.happy
‘I am happy’
b. Di=johya-he.
pres.1=be.happy-pl.excl
‘We are happy’
A third group of lexemes, which are translated into English as adjectives,

such as ‘thin’, ‘fat’, ‘bitter’, ‘sweet’, ‘cold’, ‘warm’, ‘yellow’, ‘red’, ‘beauti-
ful’, ‘ugly’, ‘good’, ‘high’, and ‘low’ have the verbal suffixes -gi, -’i and -Ø to
mark first-, second- and third-person (in)direct object and are preceded by the
proclitic xi= that marks the third-person perfect in predicative function. As is
shown in example (6), these constructions could be analyzed as impersonal
constructions, i.e. ‘it is thin to me’.
(6) a. Xi=nts’ut’i-gi.
prf.3=thin-1.obj
‘I am thin.’
b. Xi=nts’ut’i-’i.
prf.3=thin-2.obj
‘You are thin.’
c. Xi=nts’ut’i-Ø.
prf.3=thin-3.obj
‘He is thin.’
The suffixes used by this third group suggest that these lexemes are a kind of
(stative) verbs. However, many linguists, notably Ecker; Hess; Voigtlander
and Echegoyen; Hekking and Andrés de Jesús; Lastra de Suárez; Andrews
and Bartholomew, although acknowledging that Otomi has fewer adjectives
than for example Spanish and English, treat this group as adjectives, since
they may be used adnominally. This is shown in example (7).
(7) a. ar hets’i ’ñoho

def.sg tall man
‘the tall man’
b. ar nzatho ’behñä
def.sg pretty woman
‘the pretty woman’
c. ar ts’ut’i nxutsi
def.sg slim girl
‘the slim girl’
On the other hand Palancar (forthc.) claims on the basis of his data from the
Otomi spoken in San Ildefonso, in the state of Querétaro, that all lexemes
which denote property concepts in Otomi are encoded as verbs and nouns
and not as adjectives. He argues that examples like (7) should be analyzed
morphologically, as nominal compounds, rather than syntactically, as nom-
inal modification.
As we will show in Section 4, in the corpora we have collected in Santiago
Mexquititlán and Tolimán very few borrowed Spanish adjectives were
indeed attested, and the majority of the Spanish adjectives we did find were
used predicatively, functioning either as a verb or as a noun. This borrowing
behaviour of Otomi with respect to Spanish gives structural support to the
view that adjectives are not a regular part of speech of this language. How-
ever, we suspect that Otomi via its contact with Spanish, a language with a
multitude of adjectives, might in fact be in the process of developing adjec-
tives as a new category. This would then be on the basis of an existing verbal
subclass, i.e. transitive stative verbs. It would eventually lead to a typological
shift with respect to the parts-of-speech system.
Concepts which in Spanish and English are typically expressed by way
of adverbs, in Otomi are also expressed by stative verbs, just like some ad-
jective-like elements. On the basis of the derivational processes involved in
this, two groups may be distinguished, exemplified by tihi ‘to run’ → ’nihi
‘quickly’, and by xi hño ‘is good’ → xi hño ‘well’, respectively.
Of the 110,541 tokens in our corpus 15,571 (14.1%) are Spanish borrowings.
Just over half of these belong to a lexical category (noun, verb, adjective
or adverb).3 These percentages are more or less similar for both dialects
we studied. Table 1 gives the percentages found for the four parts of
speech. Since such figures may not say much by themselves, we have added
the figures based on comparable data for borrowings from Spanish in two
other Amerindian languages, Quechua and Guaraní.
What is most striking is that Otomi seems to borrow much less lexical material,
and then mainly nouns, and hardly any adjectives. Spanish nouns are
almost always borrowed in their singular form, and may be accompanied by
any Otomi nominal morphology, both proclitics and suffixes. This is illus-
trated in example (8).
Table 1. Lexical parts of speech borrowed

                 Percentages borrowed
Part of speech   Otomi   Quechua   Guaraní
Noun              40.7      54.0      37.2
Verb               4.8      17.7      18.3
Adverb             4.5       3.4       2.3
Adjective          1.9       8.5       7.4
Total             51.9      83.6      65.2
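The figures in Table 1 can be cross-checked with a few lines of arithmetic. The Python sketch below, in which all names are invented for illustration, sums the four lexical categories per language and compares the result with the reported totals. It also derives the remaining, non-lexical share as the complement, on the assumption that the percentages are shares of all borrowed tokens in each corpus.

```python
# Percentages copied from Table 1 (lexical parts of speech borrowed).
TABLE_1 = {
    "Otomi":   {"Noun": 40.7, "Verb": 4.8,  "Adverb": 4.5, "Adjective": 1.9},
    "Quechua": {"Noun": 54.0, "Verb": 17.7, "Adverb": 3.4, "Adjective": 8.5},
    "Guaraní": {"Noun": 37.2, "Verb": 18.3, "Adverb": 2.3, "Adjective": 7.4},
}
REPORTED_TOTAL = {"Otomi": 51.9, "Quechua": 83.6, "Guaraní": 65.2}


def lexical_total(language: str) -> float:
    """Sum of the four lexical categories, rounded to one decimal."""
    return round(sum(TABLE_1[language].values()), 1)


def non_lexical_share(language: str) -> float:
    """Complement of the lexical total.
    Assumption: 100 percent stands for all borrowed tokens."""
    return round(100 - lexical_total(language), 1)


if __name__ == "__main__":
    for language in TABLE_1:
        print(language, lexical_total(language),
              REPORTED_TOTAL[language], non_lexical_share(language))
```

For Otomi the complement agrees with the share of function words reported for the corpus; the corresponding values for Quechua and Guaraní are inferences from the table rather than figures given in the text.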
(8) Nu’bya ya nguu ya losa-’bya, ya teja,

today def.pl house def.pl concrete-act def.pl tile
ya laminä yá njo’mi ya nguu.
def.pl metallic.plate pos.pl roof def.pl house
‘Nowadays the houses are made of flagstone, and the roofs of the
houses are made of tiles and metallic plates’
As we will see in Section 6, almost half of the borrowed grammatical material

consists of Spanish prepositions. Regularly, in those contexts, Otomi particles
are suppressed which were traditionally used to express the corresponding re-
lation now taken over by the borrowed preposition. In that sense we could say
that the little case marking that was present in the language is being replaced
by the more analytical adpositional strategy on the basis of Spanish loans,
notably to mark instrument, cause, manner and spatial orientation.
Apart from this large number of contact-induced changes in relation marking
we have also found a few examples of the introduction of gender marking.
Otomi does not mark gender, but in some native words it does as a result of
contact with Spanish. This is shown in example (9). The Otomi form is ’beto
‘grandchild’, which is unmarked for gender. In (9), the ‘o’ in the stem seems
to be confused with the Spanish suffix that marks masculine (-o) and, following
the Spanish rule, the Spanish feminine suffix (-a) is substituted for the -o
in Otomi when the speaker refers to a granddaughter.
(9) ’beto ’beta

‘grandson’ ‘granddaughter’
Over half of the borrowed Spanish adjectives (55%) we find in their canonical
function of modifier of the head noun in a noun phrase. In that function they
are accompanied by the usual Otomi nominal morphology. Occasionally, we

find them in the position of the head noun (4%). In the rest of the cases, they
occupy the position of the main verb of the sentence.
Borrowed Spanish verbs are typically treated as verbs in Otomi. The usual
form is the infinitive without the final -r of Spanish: engaña (< engañar); es-
kohe (< escoger); eskribi (< escribir). Just like Otomi verbs, Spanish stems
may be accompanied by the usual Otomi verbal proclitics and case marking
suffixes. An example is given in (10).
(10) a. Ntonse nu-r txuku tobe.

then dem-def.sg dog still
b. Mi=molesta-tho nuya kolmenä.
imprf.3=bother-lim dem.3pl beehive
‘Then the dog still bothered the bees.’
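The form in which verbs are borrowed can be stated as a one-line rule. The following Python sketch, whose function name is hypothetical, strips the final -r from a Spanish infinitive. It leaves the result in Spanish spelling, so the respelling seen in eskohe and eskribi is not modeled.

```python
def borrowed_verb_form(infinitive: str) -> str:
    """Return a Spanish infinitive without its final -r.
    Illustrative only: the output keeps Spanish spelling."""
    if not infinitive.endswith(("ar", "er", "ir", "ír")):
        raise ValueError(f"not a Spanish infinitive: {infinitive!r}")
    return infinitive[:-1]


if __name__ == "__main__":
    for verb in ("engañar", "escoger", "escribir", "molestar"):
        print(verb, "->", borrowed_verb_form(verb))
```

The resulting stem is the form to which Otomi proclitics and suffixes attach, as with molesta in (10b).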
Spanish verbs with vowel change and diphthongization generally keep their
diphthongs ‘ie’ and ‘ue’ (cf. kwenta in 11a), although not always (konta
in 11b).
(11) a. Nu-ge bwelo-ga mi=kwenta.

dem-dem grandfather-emph.1 past.3=tell
ke hä mi=ting-ar oro ’ne-r plata.
that yes imprf.3=find-def.sg gold also-def.sg silver
‘My grandfather used to tell that he used to find gold and also
silver.’
b. Bi=ma bi=konta-wi yá kompañera
past.3=go past.3=tell-incl.du pos.3 friend
nu-’u ot’=ar wela.
dem-3pl do=def.sg grandmother
‘She visited her girl-friends to tell them about the things done by
her grandmother.’
Arguably as a result of language contact, Otomi is losing some of its morphological
distinctions on its verbs. A striking one is the loss of the inclusive/exclusive
distinction in first person dual and plural agreement marking,
as in (12a). It seems to be replaced by the Spanish preposition ko (< con)
‘with’ (12b).
(12) a. Xta=e-’be ma mpädi.
prf.1=come-du.excl pos.1 friend
b. Xta=ehe ko ma mpädi.
prf.1=come with pos.1 friend
‘I have come with my friend.’
The same seems to be happening to several other verbal markers of classical
Otomi, such as the suffixes -wi and -hu, which mark the company in the dual
and the plural. They are also giving way to the borrowed preposition ko, as
shown in (13) and (14) below.
(13) a. Ar Mändo mi=ñä-wi ár nänä.
def.sg Armando past.3=speak-com.du pos.3 mother
b. Ar Mändo mi=ñä ko ár nänä.
def.sg Armando past.3=speak with pos.3 mother
‘Armando spoke with his mother.’
(14) a. Ya nxutsi ’ñeñ-hu yá ida.
def.pl girl play-com.pl pos.3 sister
b. Ya nxutsi ’ñeni ko yá ida.
def.pl girl play with pos.3 sister
‘The girls play with their sisters.’
This phenomenon has been found not only in Querétaro, but also in other
dialects (cf. Lastra de Suárez 1992; 1997).
We find it remarkable that relatively few verbs are borrowed. As was shown
in Table 1, their share is significantly lower than in Quechua and Guaraní,
and therefore cannot be a pure reflection of the frequencies of the different
parts of speech in spoken Spanish. We have no ready-made explanation for this
other than assuming that Otomi is a nominally oriented language, as opposed
especially to Guaraní, which seems to be much more verbally oriented.
Among the borrowed adverbs we find different types: manner adverbs, locative
and temporal ones, etc. They typically function at the same level as in
Spanish. Thus, borrowed sentential and verb phrase modifiers are found to
operate at that level in Otomi as well. Spanish adverbs which are derived
productively from adjectives with the suffix -mente are usually borrowed in
that form, not in the plain adjectival form.
As has already been indicated above, we found a remarkable number of
non-lexical loanwords from Spanish in Otomi. Of the 15,571 Spanish tokens in
our corpus, 48.1 percent are function words in Spanish. The figures for the
most frequently observed ones are given in Table 2. For contrast, we again add
the figures we found for Quechua and Guaraní.
Whereas borrowed lexical elements may be seen as fillers of semantic gaps
in the lexicon which do not directly affect the structure of the grammar of the
target language, the borrowing of grammatical elements may, or even must, have
implications for its morphosyntax. We will take a closer look at some of the
more striking cases.
Table 2. Grammatical parts of speech borrowed (percentages borrowed)

Part of speech      Otomi   Quechua   Guaraní
Preposition          21.2       0.5       0.5
Coordinator           7.5       6.9       4.4
Discourse marker      6.5       0.6       0.8
Subordinator          6.1       0.8       4.6
Pronoun               0.6       0.1       0.2
Other                 6.2       7.5      24.3
Total                48.1      16.4      34.8

6.1. Prepositions
Most striking is the considerable number of Spanish prepositions we found in
our corpus, especially when compared with the very few we found in Quechua
and Guaraní. After the noun, this is the category most often borrowed. Among
the 2,075 tokens there were 54 different types, a considerable part of the
complete inventory of Spanish. In classical Otomi there exist a small number
of verbal affixes with preposition-like functions. However, in general
semantic functions are not often marked explicitly and have to be deduced from
the context. Borrowed Spanish prepositions make a number of functions
transparent. The following cases were observed most frequently in our corpus.
a. The Spanish preposition pa (< para) is often used together with or instead
of the verbal suffixes -pi or -wi to mark the beneficiary.
(15) Nä-r hyokunguu bi=hyok-wi
dem-def.sg architect past.3=build-ben
’nar nguu pa-r ts’ut’ubi.
indef.sg house for-def.sg governor
‘The architect built a house for the governor.’
Note that the definite article cliticizes to the Spanish preposition.

b. The Spanish preposition ko (< con) is frequently found marking the
company instead of the equivalent verbal suffix. Also in this case, double
marking is common.
(16) Mnde ngi=’ño-hu ko hñu ya nxutsi.
yesterday past.2=walk-incl with three def.pl girl
‘Yesterday you walked with three girls.’
The same Spanish preposition is also used to mark the instrumental. Classical
Otomi would employ the particle ir nge, which would take the same syntactic
position as the preposition. It could be said that a grammaticalization
process is under way which might eventually transform this particle, and
several others, into functional prepositions.
(17) Ar jä’i bi=dak=ar k’eñä kon minge=r.
def.sg man past.3=attack=def.sg snake with pickaxe=def.sg
‘The man attacked the snake with the pickaxe.’
c. In classical Otomi the particle ngu is used to mark the object of equa-
tion. Nowadays the Spanish preposition komo (< como) is often used instead.
Very common is the use of the fused double marking komo-ngu, as in ex-
ample (18).
(18) Xi mi=txinga mi=mpefi
much past.3=work.o.s.to.death past.3=work
komongu ’nar meti.
like indef.sg animal
‘They worked themselves to death working like an animal.’
d. In order to mark purpose, typically unmarked in Otomi, the Spanish
preposition pa (< para) is often used, as shown in example (19).
(19) Thoku ’nar pont’i zaa pa da t’exu ja-r ’met’e.
make indef.sg cross wood for fut.3 put.on loc-def.sg roof
‘They make a wooden cross to put it on the roof.’
e. In order to mark the privative, the Spanish complex sinke (i.e. preposition
sin plus the subordinator que) is frequently used, as in example (20). The
standard way in Otomi employs the verb otho, which means ‘there is not’.
(20) Ya nxutsi xi=mboni sinke ar nänä.
def.pl girl prf=leave without def.sg mother
‘The girls have left without their mother.’
Also the Spanish prepositional construction embesde (< en vez de) is often
used in such contexts.
(21) Embesde-r k’ani nu’bya tam-’bya t’afi,
instead.of-def.sg vegetable now buy-act sweet
ya gayeta ’neh=ya refresko.
def.sg biscuit also=def.sg soft drink
‘Instead of vegetables they now buy sweets, biscuits and also soft
drinks.’
f. All kinds of locative relations, which are typically unmarked in classical
Otomi, get Spanish prepositions such as entre ‘between’; pa (< para) ‘in the
direction of’; desde ‘from’; asta (< hasta) ‘until’; a ‘to’; and de ‘from’.
(22) Ya dá=pengi de Jalpa.
already past.1=come.back from Jalpan
‘I have already come back from Jalpan.’
(23) Bi=dexu asta mñä dige ja-r zaa.
past.3=climb till on.the.top ref loc-def.sg tree
‘He climbed till the top of the tree.’
g. In classical Otomi possession is expressed by simple juxtaposition of
possessed and possessor. As a result of contact with Spanish, this relationship is
now often made explicit via the preposition de ‘of’.
(24) Nixi Independensya nixi Reforma nixi Rebolusyon bi=nkambyo
Neither Independence nor Reforma nor Revolution past.3=change
yá kostumbre de ya ñäñho.
pos.3 habit of def.pl Otomi
‘Neither the Independence, nor the Reform, nor the Revolution have
changed the habits of the Otomis.’
h. The Spanish preposition de is also used to mark the partitive, and all kinds
of relationships which may be subsumed under ‘Reference’, above all in the
Tolimán dialect. In classical Otomi there are no markers for the relations con-
cerned. In (25)–(27) we give three examples.
(25) ’Na de ge’u i=ndude kaha.
one of dem.3 pres.3=carry box
‘One of them carries the box.’
(26) Di=ñä-wi de byaje pa Maxei.
pres.1=speak-incl.du ref trip to Queretaro
‘We talk about the trip to Queretaro.’
(27) Hoku ’nar krusi de ’nar xithe.
Make indef.sg cross ref indef.sg wood
‘He makes a cross of wood.’
As shown in example (28) below, the Spanish preposition ko (< con) ‘with’ is
also used to mark the ‘made of’ function.
(28) Ya tsita Nt’okwä xi=thoki ko=r yeso.
indef.sg saint San.Ildefonso prf.3=made.of with=indef.sg gesso
‘The saints from San Ildefonso are made of gesso.’
6.2. Coordinators
The second most frequently borrowed grammatical category is Spanish
coordinators. In our two dialects we have found the types set out in Table 3.
We give the number of tokens in the corpus. The figures are roughly equivalent
for the two dialects. In brackets we give the number of speakers, out of the
total of 115, who used the respective forms at least once.

Table 3. Spanish coordinators in the corpus

Coordinator               Total tokens borrowed   (speakers)
Coordination: i                             200         (49)
Adversative: pero                           188         (55)
Disjunction: o                              188         (45)
Additional negative: ni                      89         (55)
Contrastive: sino                            30         (19)

An example with the adversative is given in (29).
(29) Hin=di pä-ka ko=r hñäñho,
neg=pres.1 know-emph.1 with=def.sg Otomi
pero ko=r hñämfo hä di=pä-ka.
but with=def.sg Spanish yes pres.1=know-emph.1
‘I don’t know in Otomi, but in Spanish I do know it.’
6.3. Discourse markers
The third most important category of Spanish borrowings is discourse markers.
We found the following Spanish elements quite frequently in paragraph-initial
position: pos / pwes (< pues) ‘well’ (372 tokens; 75 speakers); and este
(204 tokens; 46 speakers). The latter is the masculine singular of the Spanish
proximate demonstrative, which is clearly used by Otomi speakers as a turn
holder and hesitation marker. It is very characteristic of Mexican Spanish in
the same function. Hekking and Bakker (1998) suggest that these elements
give a Spanish flavour to Otomi utterances, and possibly also more status
to the speaker. Following Matras (1998) we might assume that, in fact, the
discourse structure of Otomi is converging towards that of Spanish discourse
in situations where Spanish is supposed to be the language with the higher
status. This is shown in examples (30) and (31).
(30) Temu gi=mä-nge?
What pres.2=say-emph.2
Pwes nuga di=mä-nga gatho ar za.
well I pres.1=say-emph.1 everything def.sg good
‘“What do you think?” “Well, I think everything is okay.”’
(31) Ar Xuwa bi … este … bí=hñuxu ’nar he’mi
def.sg Juan past.3 well past.3=write indef letter
pa bi=mända-wi ár mpädi Enrike.
for past.3=send-incl.du pos.3 friend Enrique
‘Juan … well … wrote a letter in order to send it to his friend
Enrique.’
6.4. Subordinators
The next class of grammatical borrowings we discuss is subordinators. Just
like nominal relations, subordination is often unmarked in classical Otomi.
We found a host of Spanish subordinators in the initial position of subordinate
clauses. Many Spanish subordinators consist of a preposition plus the
subordinator que ‘that’. Examples are por que ‘because’; para que ‘so that’;
ya que ‘since’; and hasta que ‘until’. Interestingly, some of these are
borrowed with the subordinator (porke; yake), while in other cases only the
prepositional part is borrowed (pa; hasta). The most frequently appearing
borrowing in this function is pa (697 tokens used by no fewer than 104
speakers). Pa (< para) is also the most frequently appearing Spanish element
in its prepositional function. It is not clear to us whether pa in its
subordinating function should be seen as a short form of para que or whether
we are dealing with the same element pa in two different syntactic functions.
Other subordinators which appear with some regularity are komo (< como) ‘how’,
kwando (< cuando) ‘when’, and anke (< aunque) ‘although’. We found quite a few
others, albeit with low frequencies and for a limited number of speakers.
Below we give some examples of the use of Spanish subordinators in Otomi
utterances.
(32) Necesita da nuya jä’i da hñunta
need fut.3 dem.pl person fut.3 get.together
pa da hoku ’nar mehe.
for fut.3 build indef.sg well
‘These people must get together in order to build a well.’
(33) No=r bätsi bi=nzoni
dem=def child past.3=cry
porke bi=n-tsät’i na nts’edi-tho.
because past.3=refl-burn very strong-lim
‘The child cried, because it burned itself very severely.’
6.5. Pronouns
There are relatively many Spanish pronouns in our data, especially when
compared to the other two languages, which hardly borrow them at all. About
half of the ones we found are relative pronouns. The classical Otomi way of
relativizing is the gap strategy, i.e. apart from agreement markers there is no
form which represents the antecedent in the relative clause, as in (34a) below.
However, it is not uncommon in our corpus to find the Spanish subordinator
que ‘that’ in the first position of a relative clause. This is shown in example
(34b). Possibly as a result of loanshift, we may now also find relative clauses
which start with Otomi demonstratives, as in (34c), or with the Otomi inter-
rogative pronoun, as in (34d).4
(34) a. Nä-r jä’i [xi=xi-ku-ga-nu],
dem-def.sg person [prf.3=say-obj.1-emph.1-emph.3
m=tiyo-ga.
pos.1-uncle-1.emph
b. Nä-r jä’i [ke xi=xi-ki],
dem-def.sg person [that prf.3=say-obj.1
ge m-tiyo-ga-nu.
dem pos.1-uncle-emph.1-emph.3
c. Ar jä’i [nä’ä bi=xi-ku-ga nuna],
def.sg person [this past.3=say-obj.1-emph.1 emph.1
ge m-tiyo-ga.
dem pos.1-uncle-emph.1
d. Nä’ä-r jä’i [to bi=xi-ku-ga],
dem-def.sg person [who past.3=say-obj.1-emph.1
nä’ä ma tiyo-ga-’ä.
dem pos.1 uncle-emph.1-emph.3
‘The person who has told me that, is my uncle.’
We find a Spanish construction also in headless relatives, as in (35) below. In
this case it is “doubled” by the newly developed Otomi demonstrative strategy.
(35) Gi=tsi lo ke nä’ä gi ne.
pres.2=drink sg.3.ntr what dem.sg pres.2 want
‘Drink what you want.’
Apart from the relative pronouns, there were quite a lot of Spanish indefin-
ite pronouns in our data. We found frequent occurrences of ni’na (< ni una)
‘none’; kadu ’na (< cada una) ‘each’; kada kyen (< cada quien) ‘everyone’;
kwalkyera (< cualquiera) ‘whomever’ and näda (< nada) ‘nothing’.
Also interrogative pronouns and adverbials are borrowed: ke (< que)
‘what’; por ke (< por qué) ‘why’; pa ke (< para qué) ‘what for’; and komo (<
cómo) ‘how’.
Even the emphasizing pronoun mismo ‘self’ is borrowed, alone as in (36a),
and in a double marked construction as in (36b).
(36) a. mismo Kwä
self God
‘God himself’
b. Enä mismo-se.
say self-self
‘He said himself.’
Almost certainly, the constituent order of Otomi has been influenced by
Spanish. As mentioned earlier, there is a tendency in Otomi to replace
the classical VOS and VS main clause orders by the Spanish orders
SVO and SV.
In nominal predications, classical Otomi has the same order as Spanish,
i.e. the subject precedes the predicate, as in (37). However, in the case of
property assigning predicates, which have proclitic xi= and optionally also
the suffixes -gi and -’i, the subject follows the predicate in the classical
language. Today, subjects may also be found in pre-predicate position in
these contexts, just as in Spanish. This is shown in (38a/b).
(37) Ar komisaryado ge
def.sg political.commissioner dem
ar heku-hai ja-r ehido.
def.sg divide-land loc-def.sg communal-fields
‘The political commissioner is the distributor of the lands in the com-
munal fields.’
(38) a. Xi=nkuhi ar ngo.
prf.3=delicious def.sg meat
‘The meat is delicious.’
b. Ar ngo xi=nkuhi.
def.sg meat prf.3=delicious
‘The meat is delicious.’
These changes in constituent order may well be in line with the restructuring
of the Otomi discourse in the direction of Spanish that we suggested above
with respect to the introduction of Spanish discourse markers.
Both in Otomi and Spanish the order of a possessive construction is pos-
sessed–possessor, so no changes may be expected here. The same applies to
the relative clause order: both languages are N–Rel.
8. Syntax
We see a clear tendency in Otomi to import some highly frequent Spanish
verbs and auxiliaries which mark aspect and modality. But also other changes
are taking place in the area of Otomi syntax. In this section, we discuss a few
examples of the most frequently attested phenomena.
A recently developed periphrastic construction marks progressive aspect.
It is almost certainly contact-induced, since Otomi uses the Spanish verb sige
‘continue’. This is shown in (39).
(39) Him-bi=patu nu-r hñäñho,
neg-past.3=change dem-def.sg Otomi
siempre sige ñätho.
always continue speak.Otomi
‘Otomi hasn’t changed, they always keep speaking it.’
The expression of modality is affected by contact-induced constructions
as well. Otomi frequently employs Spanish modal auxiliaries of necessity
and possibility, such as tyene ke (< tiene que) ‘have to’ in example (40);
debe (< debe) ‘must’; pwede (< puede) ‘can’ in example (41); and nesesita
(< necesita) ‘need’ in example (32) above. These are then followed by the
future marking auxiliary and the main verb, just like the native alternatives
mahyoni ‘is necessary’ and ar tsa ‘is possible’.
(40) Pero tyene ke n-da mpefi.
but have.to past-fut.3 work
‘But they had to work’
(41) Ya mi=pwede n-da mats’i ja-r ’batha.
already past.3=can past-fut.3 help loc-def.sg field
‘They could already help in the field’
Another periphrastic contact-induced construction of a modal nature is the
use of the calque pets’i (~tiene ‘have’) as in (42a) instead of the original form
mahyoni ‘is necessary’ in (42b).
(42) a. Di pets’i ga mpe-ka
pres.1 have fut.1 work-emph.1
ga ’yo-ga pa nu’i gi ñuni.
fut.1 walk-emph.1 for you fut.2 eat
b. Mahyoni ga mpe-ka
is.necessary fut.1 work-emph.1
ga ’yo-ga pa nu’i gi ñuni.
fut.1 walk-emph.1 for you fut.2 eat
‘I have to work and to walk, in order that you eat.’
A new periphrastic construction has developed to mark repetition. Next to the
classical construction ma ’nagi ‘again’ + main verb, the construction pengi
‘return’ + main verb is very common in modern Otomi. Arguably, this is a
calque of the Spanish construction volver a + main verb. Double marking is
common, as shown in (43).
(43) Dá=pengi dá=uni ma ’nagi.
past.1=return past.1=give again
‘I gave it to them again.’
Yet another example of potential calquing is found in expressions of
possession. There is a clear increase in the use of the verbs ’ñehe ‘to keep’,
‘to breed’ for animate and pets’i ‘to have’ for inanimate possession. The
original ways of expressing possession in Otomi employ the verb ’bui ‘exist’
or the locative auxiliary ja ‘there is’ in combination with the possessive
proclitic and the possessed noun. Both new uses are probably calques of the
Spanish tener ‘to have’.
(44) a. Ma=t’ixu ya ’bui yoho yá=bätsi.
pos.1=daughter already exist two pos.3pl=child
b. Ma=t’ixu ya ’ñehe yoho yá=bätsi.
pos.1=daughter already keep two pos.3pl=child
‘My daughter already has two children.’
(45) ’Bui ’ra ya zaa pets’i yá=’ba xi=nt’axi.
be indef.pl tree possess pos.3pl=sap past.3=white
‘There are some trees that have a white sap.’
In predicative constructions, which in classical Otomi would be without a
copula, we sometimes find the Spanish copula ta (< está). It is used with both
nominal predicates and stative verbs. Examples are the predicatively used
noun u ‘salt’ in (46) and the stative verb txutx’ulo ‘to be small’ in (47).
(46) Ar ’yot’u-ngo ta ar u.
def.sg dry-meat cop def.sg salt
‘The dry meat is salty.’
(47) Ar tsut’äxi ar dätä ne yá=t’olo t’äxi ta txutx’ulo.
def.sg she.goat def.sg big and pos.3pl=child goat cop small
‘The she-goat is big and her kids are small.’
Among the more striking contact phenomena in the syntax of Otomi are the
new strategies for the coordination and subordination of clauses. As already
discussed in Section 6, Otomi borrows a large number of Spanish coordinators
and subordinators. It uses many Spanish connectors of addition, contrast and
disjunction, many Spanish adverbial clause markers of time, place, manner,
purpose, cause, condition and concession, many Spanish subordinators in
complement clauses and Spanish relativizers in relative clauses. Because of
the introduction of these grammatical elements Otomi is now less asyndetic
in clausal relations. We gave several examples of this phenomenon above.

Yet another recent syntactic change that we observed is the disappearance
of the question particle ha that marks the beginning of a yes/no question. It
is found in example (48).
(48) Yogo hin-gi=pede ir=hñäki ko ya jä’i?
why neg-pres.2=tell pos.2=problem with def.pl person
(Ha) g-ar gone?
inter pres.2-def.sg dumb
‘“Why don’t you tell your problem to those gentlemen?” “Are you
dumb?”’
We do not have enough evidence that the Spanish interrogative intonation is
adopted in these cases, but we strongly suspect that the marker is
disappearing under the influence of Spanish, which does not have such a
marker and uses intonation to contrast such questions with their declarative
counterparts.
9. Lexicon
We have seen above that only half of the borrowed elements belong to one of
the four major lexical classes – noun, verb, adjective, adverb – and that the
other half are borrowed from the grammatical part of the Spanish lexicon.
Prominent among the latter are Spanish prepositions: they make up 44 percent
of the grammatical elements borrowed, and 21 percent of all borrowed
elements. Otomi does not have prepositions of its own, though it has deverbal
elements with adverbial function which have a syntactic slot just before noun
phrases. This is also where Spanish prepositions end up in Otomi utterances.
It may be argued that, from the perspective of Otomi grammar, these borrowed
prepositions should be analyzed as adverbs on functional grounds, and
should therefore be added to the lexical part of the borrowed inventory rather
than to the grammatical part. This would bring the total contribution of lexical
borrowings closer to the percentage we found for Guaraní and Quechua.
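The proportions mentioned in this paragraph can be checked against the figures in Table 2. The following is a minimal sketch in Python, not part of the original study. The function names are hypothetical, and the lexical share is derived on the assumption that every borrowing not counted in Table 2 is lexical.

```python
# Percentages of all Spanish borrowings per grammatical category,
# copied from Table 2 (the "Total" row is recomputed below).
TABLE_2 = {
    "Otomi": {"Preposition": 21.2, "Coordinator": 7.5, "Discourse marker": 6.5,
              "Subordinator": 6.1, "Pronoun": 0.6, "Other": 6.2},
    "Quechua": {"Preposition": 0.5, "Coordinator": 6.9, "Discourse marker": 0.6,
                "Subordinator": 0.8, "Pronoun": 0.1, "Other": 7.5},
    "Guaraní": {"Preposition": 0.5, "Coordinator": 4.4, "Discourse marker": 0.8,
                "Subordinator": 4.6, "Pronoun": 0.2, "Other": 24.3},
}


def grammatical_share(language: str) -> float:
    """Percentage of all borrowings that are grammatical elements."""
    return sum(TABLE_2[language].values())


def preposition_share_of_grammatical(language: str) -> float:
    """Prepositions as a percentage of the grammatical borrowings only."""
    return 100 * TABLE_2[language]["Preposition"] / grammatical_share(language)


def lexical_share(language: str, prepositions_as_lexical: bool = False) -> float:
    """Percentage of lexical borrowings (assumption: 100 minus grammatical share).

    With prepositions_as_lexical=True, prepositions are moved to the lexical
    side, as the reanalysis suggested in the text would imply.
    """
    share = 100 - grammatical_share(language)
    if prepositions_as_lexical:
        share += TABLE_2[language]["Preposition"]
    return share


if __name__ == "__main__":
    print(f"Otomi prepositions / grammatical: "
          f"{preposition_share_of_grammatical('Otomi'):.0f}%")
    print(f"Otomi lexical share: {lexical_share('Otomi'):.1f}%")
    print(f"Otomi lexical share, prepositions reclassified: "
          f"{lexical_share('Otomi', prepositions_as_lexical=True):.1f}%")
    for language in ("Quechua", "Guaraní"):
        print(f"{language} lexical share: {lexical_share(language):.1f}%")
```

Under these assumptions, reclassifying the prepositions moves the Otomi lexical share from roughly half of all borrowings to a value lying between the shares derived for Guaraní and Quechua.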
When we restrict ourselves to the 1,413 meaning concepts defined by the
Loan Word Typology project (LWT; Haspelmath and Tadmor forthc.) rele-
vant for Otomi, we find the following distribution of Spanish borrowings.
Table 4 gives a breakdown for the semantic fields defined by the LWT project,
and only for those that have either more than 20 percent borrowings or less
than 10 percent. The second column gives the total number of concepts de-
fined for the corresponding field; the third column gives the percentage for
which there is (also) a Spanish loanword; and the fourth column gives the
percentage of concepts for which there is only a loanword.
Not surprisingly, the category Modern World is the field with the most
borrowings: for over half the entries a Spanish form is used. However, it is
striking that for most of these there is also an Otomi word. This is different for
category 2, Animals, where loanwords are typically unique and have no cor-
responding word in Otomi. On the other side of the scale, we find that for
Body Parts, Physical World and notions for Sense Perception, there is hardly
any borrowing at all. We see that the overall percentage of borrowings is
around 16 percent. The percentage of borrowed types we found in our corpus
is somewhat higher, at 22.2 percent. This is to be expected, since the list of
concepts of the LWT project might be seen as defining the core vocabulary
of a language.
Table 4. Spanish loans per semantic field

                                Total number   Percentage   Only Spanish
Semantic field                  of concepts    borrowed     word
Modern world                              57          53%            18%
Animals                                  105          35%            29%
Agriculture/vegetation                    66          29%            17%
Warfare and hunting                       40          22%             5%
Dwelling/house/furniture                  45          22%             4%
Clothing/personal adornment               57          21%            11%
Religion and beliefs                      24          21%             8%
Spatial relations                         74           9%             4%
Moral/aesthetic                           48           8%             6%
Law                                       26           8%             4%
Body parts                               156           6%             2%
Physical world                            68           3%             0%
Sense perception                          48           2%             0%
Total                                   1413        16.3%           6.9%

Among the data that we collected in Santiago Mexquititlán and Tolimán
we have found many cases of code-switching and code mixing. On the other
hand, we have also found a large number of Spanish phrases that are not to be
considered as code mixing, but as composite or frozen borrowings, since they
appear regularly and for different speakers. Most of them are noun phrases,
such as agua de mais ‘maize water’; barryo kinto ‘neighbourhood number
five’; el beintisinko de julyo ‘the twenty-fifth of July’; and la mera berdad ‘the
very truth’. But we have also found prepositional phrases such as a los kinse
diya ‘in two weeks’ time’; kon el tyempo ‘in due time’; and por mi parte ‘as
far as I am concerned’. We came across several complex subordinators such
as asi es de ke ‘so it happens that’; mbes de ke ‘instead of’; and verbal phrases
such as kreo ke ‘I think that’ and pares ke ‘it seems that’. One could argue
that these complex constructions are interpreted as unanalyzable entities by
speakers of Otomi. Some such complex borrowings are used as discourse
markers, such as a ber; a lo mejor; ni modo; no mäs; and algo asi.
10. Conclusions
We may conclude that Otomi has been influenced by Spanish at many lin-
guistic levels. In the area of phonology, as a result of the adoption of unas-
similated Spanish loanwords, new sounds have been introduced, such as the
trilled alveolar vibrant rr, the lateral l and the affricated alveopalatal ch. Tone
does not seem to have been affected by contact.
As far as the lexicon is concerned, not only have many Spanish content
words been introduced (nouns, verbs, adverbs and some adjectives), but also a
large number of function words, such as prepositions, coordinators,
subordinators, relative pronouns and discourse markers. We have also found a
large number of complete Spanish phrases.
The systematic adoption of function words makes the Otomi grammar less
asyndetic in its clause combining strategies. The loss of certain verbal affixes
which mark inclusivity and the company, and their replacement by Spanish
prepositions, make the language less synthetic in its morphological charac-
terization.
In relation to constituent order we have observed that the basic Otomi or-
ders V–O–S, V–S and Adjectival predicate-Subject are frequently replaced by
S–V–O, S–V and Subject–Adjectival predicate, the basic orders of Spanish.
Otomi syntax is undergoing restructuring as a result of the introduction of
several auxiliaries and a copula. Several other striking changes were observed
in this area.
As a result of all these changes, we estimate that Otomi would be situ-
ated somewhere between points 2 and 3 on the borrowing scale of Thomason
(2001).
Abbreviations
1      first person       instr  instrumental
2      second person      inter  interrogative
3      third person       lim    limitative
act    actuality          loc    locative
ben    benefactive        neg    negative
cop    copula             ntr    neuter
def    definite           obj    (in)direct object
dem    demonstrative      pl     plural
du     dual               pos    possessive
emph   emphatic           prf    perfect
excl   exclusive          pres   present
fut    future             past   past
imprf  imperfective       ref    reference
incl   inclusive          sg     singular
indef  indefinite
Notes
1. (< engaña) should be read as ‘based on the Spanish word engaña’. Note that the
Spanish forms are given in the standard spelling, not in a phonological form.
2. See Comrie (1989: 147–153) for a typology of relativization.
3. We distinguish between borrowings on the one hand and code-switching on the
other hand. Stretches of text which we consider to be code switches were not
included in these figures. We consider Spanish prepositions as grammatical ra-
ther than lexical elements. They will be discussed in Section 6.
4. These examples stem from a questionnaire with Spanish sentences which were
translated into Otomi by a number of native speakers from the two commu-
nities.
References
Andrews, Henrietta
1993 The function of verb prefixes in Southwestern Otomi. The Summer In-
stitute of Linguistics and the University of Texas at Arlington Publica-
tions in Linguistics 115. Dallas: Summer Institute of Linguistics and
the University of Texas at Arlington.
Bakker, Dik, and Ewald Hekking
1999 A functional approach to linguistic change through language contact:
The case of Spanish and Otomi. Working Papers in Functional Grammar
71: 1–32. Amsterdam.
Bakker, Dik, Jorge Gómez-Rendón, and Ewald Hekking
Forthc. Spanish meets Guaraní, Otomi and Quichua: A multilingual confron-
tation. In: Thomas Stolz, Dik Bakker and Rosa Salas Palomo (eds.),
Aspects of Language Contact. Berlin/New York: Mouton de Gruyter.
Bartholomew, Doris
2004 Notas sobre la gramática. In: Luis Hernández Cruz, Moisés Victoria
Torquemada, and Donaldo Sinclair Crawford (eds.), Diccionario del
Hñähñu (Otomi) del Valle del Mezquital (Hidalgo). (Vocabularios In-
dígenas 45.) México, D.F.: Instituto Lingüístico de Verano.
Comrie, Bernard
1989 Language Universals and Linguistic Typology. Oxford: Blackwell.
Ecker, Lawrence
1952 Compendio de gramática otomí (introducción a un diccionario otomí-
español), Anales del Instituto Nacional de Antropología e Historia,
tomo 4, no. 32. México: Instituto Nacional de Antropología e His-
toria.
Guerrero, Alonso
2002 El Códice Martín del Toro. De la oralidad y la escritura, una perspec-
tiva Otomi. Siglos XV–XVII. Tesis de Licenciatura en Etnohistoria.
Escuela Nacional de Antropología e Historia.
Haspelmath, Martin, and Uri Tadmor
Forthc. Loanword Typology project: A collaborative project toward the
comparative study of lexical borrowability in the world’s languages.
Coordinated by the Max Planck Institute for Evolutionary Anthropology,
Department of Linguistics.
Hekking, Ewald
1995 El Otomi de Santiago M: Desplazamiento Lingüístico, Préstamos y
Cambios Gramaticales. Amsterdam: Institute for Functional Research
into Language and Language Use.
2001 Cambios gramaticales por el contacto entre el Otomi y el español. In:
Klaus Zimmermann, and Thomas Stolz (eds.), Lo Propio y lo Ajeno
en las Lenguas Austronésicas y Amerindias. Procesos Interculturales
en el Contacto de Lenguas Indígenas con el Español en el Pacífico
e Hispanoamérica, 127151. Frankfurt am Main/Madrid: Vervuert/
Iberoamericana.
2002 Desplazamiento, pérdida y perspectivas para la revitalización del
hñäñho. In: Yolanda Lastra de Suárez and Noemí Quezada (eds.), Estudios
de Cultura Otopame 3, 221–248. México, D.F.: Universidad
Nacional Autónoma de México.
Hekking, Ewald, and Dik Bakker
1998 Language shift and Spanish content and function words in Otomi.
In: Bernard Caron (ed.), Actes du 16e Congres International des Lin-
guistes. Oxford: Elsevier Sciences.
1998 El Otomi y el español de Santiago Mexquititlán: Dos lenguas en con-
tacto. Foro Hispánico 13, Sociolingüística: Lenguas en Contacto,
45–74. Amsterdam: Rodopi.
2005 Problemas en la adquisición de una segunda lengua: El Otomi frente
al Español. In: Claudine Chamoreau, and Yolanda Lastra de Suárez
(eds.), Dinámica Lingüística de las Lenguas en Contacto. Hermosillo:
Universidad de Sonora.
Hekking, Ewald, and Pieter Muysken
1995 Otomi y Quechua: una comparación de los elementos prestados del
español. In: Klaus Zimmermann (ed.), Lenguas en Contacto en His-
panoamérica: Nuevos Enfoques. Frankfurt am Main: Vervuert.
Hekking, Ewald, and Severiano Andrés de Jesús
1984 Gramática Otomi. Querétaro (México): Universidad Autónoma de
Querétaro.
Forthc. He’mi Mpomuhñä ar Hñäñho ko ya Njat’i/Diccionario Explicativo
Ilustrado del Otomi del Estado de Querétaro.
Hekking, Ewald, and Severiano Andrés de Jesús (eds.)
2002 Ya ’bede ar hñäñho Nsantumuriya/Cuentos en el Otomi de Amealco.
Querétaro (México): Universidad Autónoma de Querétaro.
Hengeveld, Kees, Jan Rijkhoff, and Anna Siewierska
2004 Parts-of-speech systems and word order. Journal of Linguistics 40 (3):
527570.
Hernández Cruz, Luis, Moisés Victoria Torquemada, and Donaldo Sinclair Crawford
2004 Diccionario del Hñähñu (Otomi). México: Instituto Lingüístico de Ve-
rano, A.C.
Hess, H. Harwood
1968 The Syntactic Structure of Mezquital Otomi. (Janua Linguarum. Series
Practica 43.) The Hague: Mouton.
Jiménez Moreno, Wigberto
1939 Origen y significación del nombre Otomi. Revista Mexicana de Estudios
Antropológicos III: 62–68.
Lastra de Suárez, Yolanda
1989 Otomi de San Andrés Cuexcontitlán, estado de México. Archivo de las
lenguas indígenas de México. México: El Colegio de México.
1992 El Otomi de Toluca. México, D.F.: Instituto de Investigaciones Antro-
pológicas, UNAM.
1994 Préstamos y alternancias de código en Otomi y en español. In: Carolyn
Mackay and Veronica Vazquez (eds.), Investigaciones Lingüísticas en
Mesoamérica. México: Universidad Autónoma de México.
1997 El Otomi de Ixtenco. México, D.F.: Instituto de Investigaciones Antro-
pológicas, UNAM.
Matras, Yaron
guistics 36 (2): 281–331.
Palancar, Enrique
Forthc. Property concepts in Otomi: A language with no adjectives. Inter-
national Journal of American Linguistics.
Ruhlen, Merritt
1991 A Guide to the World’s Languages. London: Edward Arnold.
Sahagún, Bernardino de
1999 Historia general de las cosas de Nueva España. México: Porrúa.
Suárez, Jorge A.
1983 The Mesoamerican Indian Languages. Cambridge: Cambridge
University Press.
Thomason, Sarah G.
2001 Language Contact. Edinburgh: Edinburgh University Press.
Salinas Pedraza, J.
1983 Etnografía del Otomi. México: Instituto Nacional Indigenista.
Urbano, A.
[1605] 1990. Arte breve de la lengua Otomi y vocabulario trilingüe. R. Acuña
(ed.). México D.F.: Universidad Nacional Autónoma de México.
Voigtlander, Katherine, and Artemisa Echegoyen
1985 Luces contemporáneas del Otomi: Gramática del Otomi de la sierra.
México, D.F.: Instituto Lingüístico de Verano.
Yasugi, Yoshiho
1995 Native American Languages. An Areal–Typological Perspective. Osaka:
National Museum of Ethnology.
Grammatical borrowing in Purepecha
Claudine Chamoreau
1. Background
Purepecha (P’orhépecha or Tarascan) is classified as a language isolate,
spoken by around 110,000 people (10 percent of them monolingual) in the
state of Michoacan, in the west of Mexico. Spanish was introduced in the
sixteenth century and became the official language of Mexico, where more
than a hundred languages are still spoken. It gained further importance
with the linguistic policies of Mexican Independence and the Revolution, in
the nineteenth and twentieth centuries respectively. Spanish functions
as a prestige language, and is connected to education, a better standard
of living, oral and written media, religion, administration, commerce, and
employment.
Nevertheless, in 2003 Purepecha (like the other indigenous languages
spoken in Mexico) acquired the status of official language. In general,
Purepecha is used only orally; it was established as a written language very
recently, and is used in that mode by only a few individuals (specifically,
intellectuals or teachers). The language is spoken by 28 percent of
Purepecha children aged between 5 and 14, a figure indicating that
Purepecha is not generally transmitted to the younger generation, who prefer to
learn and use Spanish. Moreover, the situation is not homogeneous across
communities. In some villages, the language is used for communication
among all family members and friends (greetings and conversations at home,
in the streets, in the shops or markets, and in children’s games). In other
communities, only middle-aged and older people speak Purepecha.
Spanish has been the principal contact language for many centuries; however,
before the Conquest there were speakers of other languages in the area,
mostly Nahuatl (Uto-Aztecan family) and Otomi (Otopamean family). The
influence of these languages on Purepecha has not been studied in detail, but
one hypothesis has been offered regarding constituent order. Purepecha
exhibits all the traits of an SOV language: (a) tense, aspect and modal markers
follow the verb, (b) postpositions, (c) suffixes almost exclusively, (d) case
markers, (e) main verbs precede inflected auxiliaries, (f) genitives can precede the
head noun, and (g) relative clauses can precede the head noun. Nevertheless,
in the Lake Patzcuaro area, Purepecha has become SVO (Capistran 2002).
This order has been attested since the sixteenth century, and has become pro-
gressively more widespread since that time (Villavicencio 2006). Considering
that Nahuatl – and Otomi – present a verb-initial structure, this change prob-
ably has its roots in areal contact prior to the sixteenth century, with the subse-
quent influence of Spanish, an SVO language, continuing the process.
In the present chapter, I will concentrate on the influence of Spanish
contact on Purepecha, specifically on its grammatical structure. Changes related to this
contact are found in the areas of phonology, morphological typology,
nominal and verbal structures, other parts of speech, constituent order, and
syntax. This chapter deals with the dialect of Jaracuaro (denoted Jr), a peninsula
in Lake Patzcuaro; however, when necessary, I use data from other varieties.
Purepecha varieties are more or less mutually intelligible; nevertheless, great
sociolinguistic differences exist between them (Chamoreau 2005). Most of
the data considered for this chapter are the result of my own field research
projects, carried out over a period of fifteen years.
2. Phonology
In the phonological system of Purepecha, two phonemes that are not shared
with Spanish have been influenced by Spanish: the retroflex consonant /ɽ/
and the high central vowel /ɨ/. In some varieties (for example, Cuanajo),
Purepecha has a phonological opposition between the retroflex /ɽ/ and the flap /r/
(e.g. jurani ‘to make somebody cough’/ juɽani ‘to come’); however, in certain
varieties, this opposition no longer exists: the retroflex becomes either a flap,
so that the retroflex/flap opposition is lost, or a lateral, a phoneme probably borrowed
from Spanish. In the latter case, Purepecha conserves an opposition, but shows a new lateral/
flap feature. In general, the lateral only appears in Spanish loanwords such as
azuli ‘blue’ (from Spanish ‘azul’), or limoni ‘lemon’ (from Spanish ‘limón’).
However, in some varieties (for example, Comachuen, Arantepacua), young
and middle-aged speakers use the lateral (jolempiri ‘teacher’), while the older
generation uses the flap (jorempiri), or the retroflex (joɽempiri). The use of
the lateral in Purepecha words reveals the replacement of the Purepecha
phoneme by the Spanish phoneme. Currently, Purepecha is acquiring a new
phonological opposition (Chamoreau 2002a).
The high central vowel /ɨ/ is used after /ts/, /tsh/ and /ʃ/, and a phonological
opposition appears between /ɨ/ and the high front vowel /i/ (e.g. tsiriri ‘rib’/
tsɨriri ‘paste’; kheʃi ‘shoulder’/ khaʃɨ ‘shape’). Nevertheless, particularly in
the varieties which have lost the retroflex /ɽ/ and have adopted the lateral
/l/, and in other varieties among young and middle-aged speakers, the
high central vowel /ɨ/ is no longer used, and /i/ replaces /ɨ/ (tsiriri ‘paste’, khaʃi
‘shape’). The phonological system of these varieties has lost a vowel and,
accordingly, presents the same vowel system as Spanish (/i/, /u/, /e/, /o/, /a/).
3. Morphological typology
Purepecha has not undergone an important restructuring of its typological
profile. It is an agglutinative and synthetic language with a highly
elaborate derivational verbal system. For example, for the passive, the verb
takes a derivational suffix na (1), and in order to express
equational constructions (2) the suffix e is used. In certain varieties, this suffix
is i:
(1) Tʃkurhi kuɽi-ra-na-ʃ-ti xwata-ɽu.

firewood burn-caus-passiv-aor-ass3 hill-loc
‘The firewood was burned on the hill.’ (Jr)
(2) Xwánu xoɽempiri-i-ʃ-ti.

John teacher-pred-aor-ass3
‘John is a teacher.’ (Jr)
Nevertheless, Purepecha exhibits new tendencies, in which analytic-periphrastic
constructions appear. Passive (3) and equational (5) periphrastic
structures adapt Purepecha morphemes to Spanish patterns without the
transfer of linguistic material, which suggests that contact-induced
grammaticalization processes have taken place.
The passive periphrastic construction emerges from a patient-oriented
resultative participle plus xa ‘be there’, a verb which became an auxiliary.
Evidence for analysing this construction as a remodeling of
the structure (PAT) includes: (a) passive constructions involving passive
participles appear in Indo-European languages, and are very rare in the Americas
(Haspelmath 1994); (b) the agent is introduced as an oblique complement
by using the postposition ximpo (3), in the same way as in the Spanish passive
construction with ser, whereas this is generally impossible in the Purepecha
passive derivational construction (1); (c) the subject is always the patient, as
in Spanish, whereas, in the derivational passive structure, the subject is the
divalent-patient or the trivalent-recipient (Chamoreau 2007); (d) the younger
generation uses a passive periphrastic construction with the copular verb xinte ‘be’
(4), treated like the Spanish auxiliary ser and calquing the Spanish Aux.-Part.
order, whereas, in the passive periphrastic construction with xa (3), the
Purepecha Part.-Aux. order is preserved:
(3) Tʃkurhi kuɽi-kata xa-ɽa-ʃ-ti xutʃari tata ximpo.

firewood burn-partpp be there-ft-aor-ass3 pos1pl father inst
‘The firewood was burned by my father.’ (Jr)
(4) Enka no u-a-ka xuramukwa-nkuni xinte-a-ti

sub neg do-fut-subj law-com be-fut-ass3
ʃuka-kata.
dispute-partpp
‘If he does not respect the law, he will be punished.’ (Jr)
The analytic equational construction with xinte ‘be’ is an internally
motivated reanalysis, from a demonstrative to a ‘be’ verb used as a
presentative (Chamoreau 2006). As a result of the influence of Spanish, many young
people prefer to use the verb xinte (5) rather than the derivational
construction (2). Xinte appears essentially with nouns and pronouns. Many young
people use xinte with adjectives (6) only when these indicate a quality which refers
to identity and is independent of the situation, as with ser in Spanish.
In (5) and (6), the order is a calque of the Spanish SVAdj. order. This construction
stands in opposition to the construction with xa (7), which expresses a relative
quality dependent on the situation, as with estar in Spanish. With xa, the order
is generally the Purepecha SAdj.V order, although it is possible to find the
Spanish SVAdj. order.
(5) Xwánu xinte-ʃ-ti xoɽempiri.

John be-aor-ass3 teacher
‘John is a teacher.’ (Jr)
(6) Iʃu pakanta, kwhiripu miri-kwali-s-p-ti ya ima-ni

here Pacanda people forgot-mid-aor-pas-ass3 already dem-obj
atʃati-ni ka myá-ntha-ʃa-p-ti eski ima xinte-p-ka riko.
man-obj and think-it-prog-pas-ass3 sub dem be-aor/pas-subj rich
‘Here in Pacanda, people had forgotten this man. They used to think
that he was rich.’ (Pc)
(7) Naranʃa téri xa-ɽa-ʃ-ti.

orange sweet be there-ft-aor-ass3
‘The orange is sweet.’ (Jr)
4. Nominal structures
Changes that could have arisen through the influence of Spanish include the
tendency to use the plural marker itʃa, and the object case marker ni, with
inanimate entities (8). Traditionally, the plural and object case markers are
obligatory only for animate and definite entities (Villavicencio 2006).
(8) Ints-ku-ʃn-ti=kʃ tsɨtsɨki-itʃa-ni.

give-3app-prog-ass3=3pl flower-pl-obj
‘They used to give her flowers.’ (Jr)
The Spanish diminutive suffix has become productive in Purepecha; however,
only the masculine form ito is used, pronounced ito or itu. Gender does not
exist in Purepecha. This suffix is used with nouns, adjectives (9a), and
classifiers (9b). The latter use shows an adaptation to the Purepecha nominal phrase.
(9) a. Witsintikwa phakha-ra-ʃ-ka=ni xantiakhu-itu.

yesterday stay-mid-aor-ass1/2=1 alone-dim
‘I stayed alone yesterday.’ (Jr)
b. Ma itʃakwa-itu wítʃu-ni ká-ʃn-ka=ni.
one long-dim dog-obj own-hab-ass1/2=1
‘I have one little dog.’ (Jr)
5. Verbal structures
There are only a few contact phenomena in the verbal structures. The most
relevant of these is the transfer of the ser/estar semantic opposition
(PAT-influence), which has been adapted as a xinte/xa dichotomy (Section 3). As a result
of the influence of Spanish, the constructions with the verb xinte ‘be’ have come
to resemble more closely the Spanish constructions with the verb ser: passive
constructions (4), equational constructions (5), and attributive constructions
(6). Many young speakers also use an idiomatic expression
calqued on the Spanish dejar de ser (10):
(10) Xorentpherakwa xurakhu-sn-ti xinte-ni ís.

education let-hab-ass3 be-inf thus
‘The education ceased to be like that.’ (Pc)
In the same way, the constructions with xa ‘be there’ have adopted the val-
ues of the Spanish ‘estar’ structures: passive constructions (3), and attributive
ones (7).
6. Other parts of speech
Purepecha shows a significant number of Spanish loans in the category ‘other
parts of speech’. Most loans are of the MAT type, but some cases of
PAT-influence have been established within the numeral system and discourse
markers.
The numeral system in Purepecha is vigesimal, and the remodeling to a
decimal system is due to Spanish influence. The numbers from 1 to 6, and
10 and 20 are generally known and used, but younger speakers prefer to use
Spanish numbers except for numbers below 5. Counting and adding are gen-
erally performed using Spanish numbers. There are no contact phenomena in
quantifiers.
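As a rough illustration of what a shift from vigesimal to decimal counting involves, the following sketch decomposes the same quantity in base 20 and in base 10. It is purely arithmetic: the function name `decompose` is hypothetical, the number 57 is an arbitrary choice, and no actual Purepecha or Spanish numeral forms are modelled.

```python
def decompose(n: int, base: int) -> tuple[int, int]:
    """Split n into (multiples of base, remainder).

    Illustrative assumption: only two-part numerals are handled,
    i.e. 0 <= n < base * base.
    """
    if not 0 <= n < base * base:
        raise ValueError("n must be below base squared for a two-part decomposition")
    return divmod(n, base)


# The same quantity counted in twenties (vigesimal) and in tens (decimal).
print(decompose(57, 20))  # (2, 17): two twenties plus seventeen
print(decompose(57, 10))  # (5, 7): five tens plus seven
```

The point of the comparison is only that the same number is built from different units in the two systems, so a speaker who counts in Spanish is also adopting a different way of grouping quantities.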
The indefinite pronoun siempre ‘always’ is highly integrated, whereas
other indefinite pronouns appear only occasionally. Siempre is used, with the
vowel adaptation siémpri, by all generations (11), and has gained ground
relative to the Purepecha indefinite pronouns mameni and menkhu, which also
express time (12).
(11) Ima siémpri mí-ti-ʃn-an-ti.

dem always know-face-hab-pas-ass3
‘He had always known it.’ (Jr)
(12) Xutʃari tati ménkhu xutʃi-o xa-ɽa-ʃn-ti.

pos1pl father always pos1-res be there-ft-hab-ass3
‘Our father is always at home.’ (Jr)
Purepecha has borrowed two Spanish coordinating connectors: o ‘disjunction’
(13) and pero ‘contrast’ (14). The Spanish connector y ‘addition’ has not
been borrowed. In this case, the Purepecha connector ka ‘and’ is used (15).
The connectors o and pero express concepts that were formerly unmarked in Purepecha.
The native Purepecha marker ka ‘addition’ is used to combine clauses. Pero
can additionally express a change of topic (16). On the phrasal level, o can be
used to combine phrases (15).
(13) Tʃi tʃentʃeki urapi-ʃ-ki o tʃi kawayu tuɽipi-ni.

pos2 donkey be white-aor-int or pos2 horse be black-inf
‘Is your donkey white, or is your horse black?’ (Jr)
(14) Xi wé-ka-ʃa-p-ka ni-ra-ni péro no

1 want-ft-prog-pas-ass1/2 go-ft-inf but neg
ú-ʃ-ka ʃaná-ra-ni.
be able-aor-ass1/2 walk-mid-inf
‘I wanted to go, but I was not able to walk.’ (Jr)
(15) Ínts-a-sn-ti=ks ima-nki aɽi-s-ka ya tétʃakwa

give-3plobj-hab-ass3=3pl dem-sub tell-aor-subj already wine
ka wiraterakwa ka sáni tsíri o sáni ʃapumata.
and alcohol and few corn or few roasted corn
‘They used to give them what they asked for: wine and alcohol, and a
little corn, or a little roasted corn.’ (Tr)
(16) Atʃati pyá-ʃ-ti yámintu ampe nénki wé-ta-ɽi-a-ka

man buy-aor-ass3 all what sub want-caus-body-fut-subj
ima péru ni-ntha-ʃ-ti=kʃ ya anima-itʃa.
dem but go-centrif-aor-ass3=3pl already soul-pl
‘The man has bought all he will need, but (on the other hand) the
souls have left.’ (Jr)
Many subordinating conjunctions are borrowed from Spanish. The Spanish
complementizer que, pronounced ke or ki, is never used alone. One
hypothesis suggests that ke was borrowed from Spanish. Another possibility is
that Purepecha also had a subordinating conjunction with the form ki, attested
in the sixteenth century. A convergence between the two elements has been
favoured because they present the same form. This topic has not yet been
studied. The subordinating conjunction ke is employed with other borrowed
markers, functioning as complex conjunctions: porki in a causal clause (17),
para ke in a purpose clause (18a), sikiera ke, which is a synthetic form of the
Spanish ‘si quiera’, in a hypothetical clause (19a), and sino ke in a contrast
clause (20). These elements are analyzed as subordinating conjunctions
because they respect the Purepecha constructions: the verb of the subordinate
clause is marked for subjunctive mood.
Purepecha has various subordinating conjunctions, enki, eʃka and eʃki,
which seem to be progressively giving way to ke in these constructions.
Nevertheless, many speakers, especially those of the middle-aged and older
generations, prefer to use the Purepecha ximpoki ‘because’ instead of porki,
and to use the borrowed markers para (18b) and sikiera (19b) in combination
with the Purepecha conjunctions eʃka or eʃki.
(17) Ima xu-ɽa-ʃn-ti porki thu yoɽi-ʃ-ka.

dem come-ft-hab-ass3 because 2 call-aor-subj
‘He used to come because you called him.’ (Jr)
(18) a. Xwanu xu-ɽa-ʃ-ti para ke iʃe-ka=ri.

John come-ft-aor-ass3 for sub see-subj=2
‘John came for you to see him.’ (Jr)
b. Aɽi-ʃn-ti=kini para iʃki mi-ti-a-ka.
tell-hab-ass3=2obj for sub know-face-fut-subj
‘He tells you that, so that you know it.’ (Jr)
(19) a. Sikiera ke pirí-a-ka.

provided sub sing-fut-subj
‘Let’s hope he will sing!’ (Jr)
b. Sikiera iʃki tsma xu-nkwa-ka.
provided sub dempl come-centrip-subj
‘Let’s hope they come back.’ (Jr)
(20) No-teru=kʃ anatapu-etʃa-ni iʃe-a-ʃ-ti sino

neg-more=3pl tree-pl-obj see-3plobj-aor-ass3 but
ke lwegu=kʃ tʃapa-ta-a-ka.
sub then=3pl cut-caus-3objpl-subj
‘They do not see any trees anymore, because somebody cut them.’ (Jr)
The variation between ke and iʃka/iʃki in subordinating
constructions is also attested in the comparative structure (see §8).
Generally, para is introduced in a purpose clause with para ke or para
iʃki (18a, 18b); however, it can also appear in a non-finite purpose
construction (21).
(21) Thu no xatsi-ʃ-ka para xaka-khu-ni.

2 neg have-aor-ass1/2 for believe-ft-inf
‘You don’t have to believe him.’ (Jr)
Spanish temporal adverbializers that have been borrowed include: hasta
‘until’ (22), desde ‘from’ (23), apenas ‘as soon as’, pronounced apenaʃ (24),
luego ‘then’ (20), and entonces ‘then’, generally pronounced tonses (25).
The adverbializer hasta additionally has a spatial deictic use (26). There is
a native locative suffix ɽu (26), which has an extended function (fixed
and removed localization, ablative, translative, etc.). The use of hasta allows
the type of localization to be specified.
(22) Xima khama-ʃ-ti ya ásta wéxuɽini.

then finish-aor-ass3 already until year
‘He had finished it by the new year.’ (Jr)
(23) Ántʃi-kuɽi-ʃa-ka=ni désde witsintikwa.

work-mid-prog-ass1/2=1 for/since yesterday
‘I have been working since yesterday.’ (Jr)
(24) Petu kwhi-a-ti apenaxï thu nya-ra-ka.

Peter sleep-fut-ass3 as soon as 2 arrive-ft-subj
‘Peter fell asleep as soon as you arrived.’ (Jr)
(25) Tonses no ampakiti=thu=tʃka ya ni-ntha-ʃ-ti ya.

then neg good=too=so already go-centrif-aor-ass3 already
‘Then the devil has left too.’ (Jr)
(26) Ni-a-ka=kʃ ásta xini yoɽekwa-ɽu.

go-fut-ass1/2=1pl until there river-loc
‘We will go up to the river.’ (Jr)
Purepecha did not have prepositions before contact; we can assume that
it was a language with only postpositions and some suffixed case markers.
The prepositions para and por are thus borrowed in combination with their
phrase-combining construction, i.e., they appear before the phrase or the
morpheme (Chamoreau 2002b). The preposition para functions in a recipient
clause (27a), and por expresses agentive (28a), causal (29a), and instrumental
clauses (30a). The Purepecha postposition ximpo ‘instrumental’ can be used
in functions similar to para (27b), or to por (28b, 29b, 30b). In all these
contexts, the Purepecha marker may appear in a double construction (27c, 28c,
29c, 30c).
(27) a. Ima kuɽa-tʃi-ʃ-ti=rini itʃuskuta para ama-mpa.

dem ask-1/2app-aor-ass3=1obj tortilla for mother-posp3
‘He has asked me for tortillas for his mother.’ (Jr)
b. Ima kuɽa-tʃi-ʃ-ti=rini itʃuskuta ama-mpa ximpo. (Jr)
c. Ima kuɽa-tʃi-ʃ-ti=rini itʃuskuta para ama-mpa ximpo. (Jr)
(28) a. Mí-ti-ʃ-ti por ima.

know-face-aor-ass3 by dem
‘He knows it through him.’ (Jr)
b. Mí-ti-ʃ-ti ima ximpo. (Jr)
c. Mí-ti-ʃ-ti por ima ximpo. (Jr)
(29) a. Tsɨtsɨki urapiti kunti-kuɽi-ʃa-ti por kwetsapikwa.

flower white bend-ref-prog-ass3 under/cause weight
‘The white flower is bending under the weight.’ (Jr)
b. Tsɨtsɨki urapiti kunti-kuɽi-ʃa-ti kwetsapikwa ximpo. (Jr)
c. Tsɨtsɨki urapiti kunti-kuɽi-ʃa-ti por kwetsapikwa ximpo. (Jr)
(30) a. Xu-ɽa-ʃ-ka=ni por kamioni.

come-ft-aor-ass1/2=1 by bus
‘I went by bus.’ (Jr)
b. Xu-ɽa-ʃ-ka=ni kamioni ximpo. (Jr)
c. Xu-ɽa-ʃ-ka=ni por kamioni ximpo. (Jr)
The Spanish marker komo is used in Purepecha to introduce a manner
clause.
(31) Pos=s komo ma atʃa=s xa-ɽa-ʃ-ti.

thus=foc like a man=foc be there-ft-aor-ass3
‘Thus, he was there like a man.’ (Jr)
The Spanish phrasal adverb ya is used to mark temporal values with two
different nuances: it can introduce a completive value, generally with
the aorist or the past aorist (32), or it can express a present value (33).
The story in (32) is about a vulture that has transformed itself into a woman,
and the woman into a vulture. The example expresses that the vulture turned
into a woman; that it was no longer an animal. In the same narrative, in (33),
there is a contrast between the first verb, in the past tense, which indicates the
state of the woman before, and the second verb, in the interrogative clause,
which indicates a question about the present state, which is the state of the
vulture.
(32) Ka ma waɽiti-i-ʃ-ti ya.

and a woman-pred-aor-ass3 already
‘And it is already a woman.’ (Jr)
(33) Thu no xama-ʃ-p-ka lísto antiʃ=ri xa-ɽa-ʃ-ki

2 neg walk-aor-pas-ass1/2 lively why=2 be there-ft-aor-int
ya.
already
‘You did not use to be lively, why are you now?’ (Jr)
Apart from marking temporal relations, this element ya functions like a
discourse marker, together with the addition connector ka ‘and’. The latter begins a
clause, while the former ends one (32).
Finally, many discourse markers are borrowed by Purepecha from
Spanish. The most frequent ones are the fillers pues ‘thus, then, well’,
pronounced pwes or pos (31), and bueno ‘well, sure’, pronounced wenu. It is also
possible that, as a result of the influence of Spanish, the
demonstrative inte ‘this’ is used as a filler, like este in that language. This element
appears in the same conditions as does este in Spanish: it expresses a hesitation,
a pause, etc. (34). It is a PAT-influence that is not connected to any direct
MAT-borrowing.
(34) Ximpoka=ni inte pats-nts-ka=na.

because=1 em be fade-head-subj=evid
‘Because, em, I am bald, they said.’ (Jr)
7. Constituent order
Constituent order seems to have been influenced by areal contact prior to the
sixteenth century (see Section 1), with Spanish continuing the process and
reinforcing it via the introduction of prepositions (while Purepecha had
traditionally used postpositions).
8. Syntax
The organization of passive and equational constructions has been influenced
by Spanish (see Section 3). Many subordinating conjunctions and adverbial
markers are borrowed from Spanish, together with the grammatical
constructions in which they appear in that language.
A syntactic domain which has undergone an important reorganization due
to Spanish contact is the comparison of inequality. Purepecha had traditional-
ly utilized a comparative construction of superiority of two types (Chamoreau
1995): the exceed verb of action (35), and the combination of the exceed verb
of action and a coordinated polarity construction (36):
(35) pedro hatztamahati juanoni ambaqueni

Pedro xats-ta-ma-xa-ti xwano-ni ampake-ni.
Peter surpass-caus-transf-pres-ass3 John-obj be good-inf
‘Peter is better than John.’ (Peter surpasses John in being good).
(Gilberti [1558] 1987: 109)
(36) pedro hatztamahati ambaqueni, ca noys

Pedro xats-ta-ma-xa-ti ampake-ni ka no=ʃ
Peter surpass-caus-transf-pres-ass3 be good-inf and neg=foc
juan
xwano.
John
‘Peter is better than John.’ (Peter surpasses with goodness and John
does not). (Gilberti 1987: 109)
Nowadays, there has been a reorganization as a result of a chain reaction
triggered by Spanish interference. A cross-dialectal perspective shows
that contact between the indigenous and Spanish constructions (the structure
más…que) has brought about the emergence of nine constructions which may
be classified into four different types (one type, the applicative construction, is
not treated here, because it is marginal and corresponds to the derivational
morphological characteristics of the language). I present here eight constructions
organized in three types:
Type 1 is a borrowing or a PAT-influence of the Spanish comparative
construction, with borrowed degree marker and relator (37), with borrowed relator and
the degree calque (38), or with Purepecha calques of both (39).
(37) Enrike mas ʃepe-s-ti ke Pedru.

Henry more be lazy-aor-ass3 sub Peter
‘Henry is lazier than Peter.’ (Cn)
(38) I kamisa sáni=teru xuka-para-s-ti ke iʃu

dem shirt few=more put-shoulder-aor-ass3 sub here
anapu-e-s-ti.
orig-pred-aor-ass3
‘This shirt is more expensive than those from here.’ (Ih)
(39) Thu sáni=teru wiria-ʃ-ka eʃki xi.

2 few=more run-aor-ass1/2 sub 1
‘You have run more than I have.’ (Cc)
Type 2 is a mixed type, employing the Purepecha polarity construction
plus the Spanish comparative degree particle mas (40) or its Purepecha calque
sáni=teru (41).
(40) Xi xatsi-s-ka=ni mas itʃuskuta ka no thu.

1 have-aor-ass1/2=1 more tortilla and neg 2
‘I have more tortilla than you have.’ (I have more tortilla and you
have not) (Jn)
(41) Iʃu sáni=teru yó-tha-ɽa-ʃn-ti ka no xini.

here few=more long-leg-ft-hab-ass3 and neg there
‘Here is higher than there.’ (Here is higher and there is not) (Jr)
Type 3 is a hybrid type, employing the Spanish degree marker mas plus the relator
ke, and a locative construction with the Spanish preposition de (42), which
represents an instance of code-mixing, because it only appears in a few
expressions, and never alone (the Spanish preposition de ‘of’ appears in this
context of comparative constructions and in some expressions, for example
de veras ‘sure’). This new hybrid locative construction occurs neither
in Spanish nor in traditional Purepecha. This construction can also occur with the
Purepecha degree calque sáni=teru (43):
(42) Inte atʃa mas khéri-e-s-ti ke de ʃo anapu yamintu.

dem man more old-pred-aor-ass3 sub of here orig all
‘This man is older than (of) all the others from here.’ (Tr)
(43) I tata sáni=teru=ʃ tʃhana-ʃn-ti ke de wapha-mpa

dem man few=more=foc play-hab-ass3 sub of son-posp3
‘This man plays more than his son.’ (Cn)
This hybrid construction may also occur with the Spanish preposition
entre. In (44), we can observe the presence of the borrowed marker of degree
más before the quality, and the comparative relator ke, which is followed by
the Spanish preposition entre.
(44) Iʃu más khé-ʃ-ti ke entre xini.

here more be big-aor-ass3 sub between there
‘Here is bigger than there.’ (Sat)
Finally, it is unclear whether Purepecha has ever had a comparative
construction of inferiority. In order to express this domain, Purepecha employs
two strategies: (a) using the borrowed Spanish comparative construction of
superiority with negation, employing either the Purepecha relator (45a)
or the Spanish relator (45b), and (b) borrowing the Spanish construction with the
Spanish degree marker menos (pronounced menuʃ in certain varieties), with either the
Purepecha calque relator eska (46) or the borrowed relator (47).
(45) a. Maria sáni=taru no wiŋapi-ʃ-ti eski thu.

Maria few=more neg be strong-aor-ass3 sub 2
‘Maria is weaker (less strong) than you are.’ (Maria is not strong-
er than you are) (Ar)
b. Maria sáni=taru no wiŋapi-ʃ-ti ke thu. (Oc)
(46) Selia menos yó-tha-la-ʃn-ti eska=ni.

Celia less long-leg-ft-hab-ass3 sub=1
‘Celia is shorter (less tall) than I am.’ (Cm)
(47) Xi xatsi-ʃ-ka menuʃ ke thu wé-ka-ka.

1 have-aor-ass1/2 less sub 2 want-ft-subj
‘I have less than you want.’ (Jr)
9. Conclusion
The grammatical MAT-loans are numerous, and appear within their Spanish
grammatical constructions. A relevant phenomenon is the typological profile
of Purepecha, which shows new tendencies.
Purepecha is a synthetic–agglutinative language, and, nowadays, new ana-
lytic–periphrastic constructions appear, without modifying its elaborate mor-
phological system, but revealing a structural rapprochement to the Spanish pas-
sive and equational constructions: two distinct structures (a morphological one
and a periphrastic one) may simultaneously perform the same function. Pure-
pecha is exhibiting language-internal grammaticalization processes to replicate
Spanish models.
There are PAT-influences that are not connected to any direct MAT-borrowing.
Other PAT-influences, namely the comparative constructions, are linked to
MAT-borrowing in some varieties, and show only pattern replication in others.
Abbreviations
aor aorist
app applicative
Ar Arantepacua
ass assertive
caus causative
Cc Cocucho
centrif centrifugal
centrip centripetal
Cm Comachuén
Cn Cuanajo
com comitative
dem demonstrative
dim diminutive
evid evidential
foc focus
ft formative
fut future
hab habitual
Ih Ihuatzio
inf infinitive
inst instrumental
int interrogative
it iterative
Jr Jarácuaro
loc locative
mid middle
neg negation
obj object
Oc Ocumicho
orig origin
partpp patient-oriented participle
pas past
passiv passive
Pc Pacanda
pl plural
pos possessive
posp kinship possessive
pred predicativizator
prog progressive
ref reflexive
res residential
Sat San Andres Tzirondaro
sub subordinating conj.
subj subjunctive
Tr Tirindaro
transf transference
References
Capistran, Alejandra
2002 Variaciones de orden de constituyentes en p’orhépecha. Topicalización
y focalización. In: Paulette Levy (ed.), Del Cora al Maya Yucateco:
Estudios lingüísticos sobre algunas lenguas indígenas mexicanas,
349–402. México: UNAM.
Chamoreau, Claudine
2002a Le système phonologique du purepecha: Une étude en synchronie
dynamique. Travaux du SELF IX: 133–161.
2002b Dinámica de algunos casos en purepecha. In: Zarina Estrada Fernández
and Rosa Maria Ortiz Ciscomani (eds.), VI Encuentro Internacional
de Lingüística en el Noroeste, 271–290. Hermosillo: Unison.
2005 Dialectología y dinámica: reflexión a partir del purépecha. In: Claudine
Chamoreau (Coord.), Trace 47: Dinámica lingüística, 61–81.
México: CEMCA.
2006 En busca de un verbo “ser” en purépecha: Cadena de gramaticalización
y gramaticalización en cadena. In: Rosa Maria Ortiz Ciscomani
(ed.), VIII Encuentro Internacional de Lingüística en el Noroeste,
65–84. Hermosillo: Unison.
2007 Looking for a new participant: Passive in Purepecha. In: Zarina Es-
trada Fernández, Soren Wichmann, Claudine Chamoreau, and Albert
Álvarez González (eds.), Studies in Voice and Transitivity. Munich:
Lincom.
Gilberti, Maturino
1987 Arte de la lengua de Michoacán. Morelia: Fimax. (First publ. 1558,
Mexico: Juan Pablo Impresor.)
Haspelmath, Martin
1994 Passive participles across languages. In: Barbara Fox and Paul Hopper
(eds.), Voice: Form and Function, 151–177. Amsterdam/Philadelphia:
John Benjamins.
Villavicencio, Frida
2006 P’orhépecha kaso sïrátahenkwa: desarrollo del sistema de casos del
purépecha. México: CIESAS-COLMEX.
Grammatical borrowing in Imbabura Quichua
(Ecuador)
Jorge Gómez-Rendón
1. Background
Imbabura Quichua (henceforth IQ) is a Quechua language spoken in the
Northern Andes of Ecuador by approximately 150,000 speakers.1 The
province of Imbabura ranks second among the nine Quichua-speaking provinces
of the Ecuadorian Andes in terms of number of speakers (Haboud, 1998:
91–92). Imbabura also has the largest number of bilingual Quichua–Spanish
speakers in the country (Büttner, 1993: 48: 49). Although there is a small
number of IQ monolinguals among elders, the tendency nowadays is towards
increasing levels of bilingualism accompanied by the maintenance of the
native language. IQ has been in contact with Spanish since the second half
of the sixteenth century, in a diglossic relation. The language is vigorously
spoken in most Indian settlements of the province at community and family
levels. It is taught in schools as part of the Bilingual Intercultural Education
Programme implemented by the Ministry of Education since 1986 with
support from international cooperation agencies and the National Indian
Organization (CONAIE). In recent decades IQ has entered oral media, and
regular radio broadcasting in IQ reaches all corners of the province. The
language has had a unified writing system since 1980, but this is used only for
textbooks in elementary education.
The fact that IQ shows a strong vitality in the Ecuadorian Andes should not
obscure its allochthonous origins. Quechua was brought to Ecuador by the
Incas in the second half of the fifteenth century, although another Quechua
variety was spoken as a lingua franca by autochthonous peoples long before.
From archaeological and early historical evidence it appears that one Barba-
coan language – Cara – was spoken in the present territory of Imbabura at
the time of the Inca invasion. It is likely that IQ speakers were in contact with
other languages of the same family – Tsafiqui and Awa Pit – through an ex-
tensive trade network at work until the second half of the seventeenth century
(Caillavet 2001: 81). The present chapter focuses on a one-to-one borrowing
482 Jorge Gómez-Rendón
situation between IQ and Spanish. Contact phenomena due to substratum

influence are mentioned only occasionally.
2. Phonology
The phonological inventory of pre-contact IQ does not include the consonants
/b/, /d/, /g/, /β/ and /z/, nor the mid vowels /e/ and /o/. These sounds
entered the language through Spanish loanwords (e.g. kaβažu ‘horse’, didu
‘finger’, pagana ‘to pay’, poste ‘post’) and do not occur phonemically in
native items. They show a high degree of integration in IQ phonology, as
observed by Cole (1982: 199). Their incorporation into the native inventory
has been facilitated by the fact that, with the exception of /β/, they have
native allophonic counterparts: thus, [b] is an allophone of /p/, [d] of
/t/, and [g] of /k/, all in nasal environments; similarly [e] is an
allophone of /i/ and [o] of /u/. The result is free allophonic variation in
some Spanish borrowings. Typically, Spanish mid vowels are raised (/e/>/i/,
/o/>/u/) or otherwise pronounced as close as possible to their Quichua
equivalents. Partial assimilation is more common in words with several mid
vowels (e.g. [prizidínti] ~ [prisidínte] < Sp. /presidénte/), although
non-assimilated borrowings are not uncommon. Accordingly, there may be
various ways to pronounce one and the same word. Different phonetic
realizations depend on (i) the environment, (ii) the speaker’s level of
bilingualism, and (iii) the frequency of the word.2 When ambiguities arise,
they are resolved by several mechanisms: e.g. misa (Sp. mesa ‘table’) and
misa (Sp. misa ‘mass’) are disambiguated by the voicing of the intervocalic
sibilant in the second member. The assimilation of borrowings is not always
rule-governed and may be idiosyncratic to a certain degree.
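The raising of Spanish mid vowels described above can be approximated as a plain character substitution. The following Python sketch is illustrative only: the helper name raise_mid_vowels and the handling of stress marks are assumptions introduced here, and the sketch models full assimilation only, leaving aside the partial and idiosyncratic outcomes (and the consonantal adjustments) mentioned in the text.

```python
# Illustrative sketch only: full assimilation of Spanish mid vowels to the
# IQ high vowels (/e/ > /i/, /o/ > /u/). Partial assimilation, the speaker's
# level of bilingualism and word frequency are deliberately not modelled.
RAISING = str.maketrans({"e": "i", "o": "u", "é": "í", "ó": "ú"})


def raise_mid_vowels(form: str) -> str:
    """Return the fully raised variant of a Spanish form (hypothetical helper)."""
    return form.translate(RAISING)


if __name__ == "__main__":
    # Forms are given in a broad, orthography-like transcription.
    for spanish in ("presidénte", "mesa", "dedo"):
        print(spanish, ">", raise_mid_vowels(spanish))
```

Applied to mesa and misa, the substitution yields the same string for both, which is the kind of homophony that the voicing contrast mentioned above serves to resolve.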
Other contact phenomena in phonology are found at the syllabic and
suprasegmental levels. According to the native pattern, the main stress
falls on the penultimate syllable.3 The stress pattern in borrowings depends
on their degree of assimilation (e.g. kumunidá, Sp. comunidad). The
retention of Spanish stress patterns may be a disambiguating strategy in
some cases. The native pattern of syllable structure is CVC(V), with a
limited number of consonants in coda position (/k/, /s/). Consonant clusters
in onset and coda positions occur only exceptionally, often as a result of
other morpho-phonemic processes. A few Spanish loanwords avoid consonants in
coda position: e.g. rilújo, Sp. reloj ‘watch’. The most frequent type of
cluster in Spanish loanwords involves one of a set of plosives (/p, t, k/)
plus a flap /ɾ/, as in prioste ‘sponsor of a celebration’, trabajo ‘work’,
and crema ‘cream’. Loanwords with clusters in word-initial position are
usually not assimilated into IQ phonology, although the speaker’s level of
bilingualism may be decisive. Occasionally, a vowel is inserted between the
plosive and the flap. This vowel is the same as the one following the
cluster: e.g. koronika, Sp. crónica ‘story’.
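The occasional copy-vowel epenthesis just described can be stated as a small rule. The Python sketch below is a hedged illustration, not a description of actual speaker behaviour: the helper name break_initial_cluster is hypothetical, the input is assumed to be a broad transcription written with p, t, k and r, and the rule is applied categorically even though the text describes it as occasional.

```python
# Illustrative sketch only: break up a word-initial plosive + flap cluster
# by inserting a copy of the vowel that follows the cluster
# (e.g. kr + o > koro). The process is optional in actual speech.
PLOSIVES = "ptk"
VOWELS = "aeiou"


def break_initial_cluster(form: str) -> str:
    """Insert a copy vowel into an initial /p, t, k/ + /r/ cluster (hypothetical helper)."""
    if (
        len(form) >= 3
        and form[0] in PLOSIVES
        and form[1] == "r"
        and form[2] in VOWELS
    ):
        # plosive + copied vowel + original flap, vowel and remainder
        return form[0] + form[2] + form[1:]
    return form


if __name__ == "__main__":
    print(break_initial_cluster("kronika"))
```

Forms without an initial plosive-plus-flap cluster are returned unchanged, so the helper can be applied to any transcribed loanword.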
A final issue is the existence of certain phonetic realizations peculiar to
IQ. These realizations set IQ apart from other Ecuadorian Quechua dialects
and are claimed to result from substratum influence. In what follows I focus
on the phonological survey conducted by Fauchois (1988).
The main phonetic difference between Imbabura Quichua and other varieties
spoken in the Andean Highlands is the fricativization of the plosives /p/
and /k/ in all positions except in nasal environments (cf. supra). The
resulting [f] and [j] differ in word-initial position from their aspirated
counterparts [pʰ] and [kʰ] in the rest of the Ecuadorian dialects, and in
word-medial or word-final position from their non-aspirated equivalents [p]
and [k]. Illustrative cases are pucuna ‘to blow’, realized as [fukuna] in
Imbabura but [pʰukuna] in the central dialects (e.g. Tungurahua); upiana ‘to
drink’, realized as [ufiana] in Imbabura but [upiana] in the Southern
varieties (e.g. Saraguro, Azuay); cari ‘male’, [jari] in Imbabura but
[kʰari] in the central dialects (e.g. Cotopaxi); and reciprocal -naku-,
realized as [-naju-] in Imbabura but [-naku-] in the rest of the provinces.
Voiceless [t] and voiced [d] are distinct phonemes in all other Ecuadorian
dialects but not in IQ, where [d] occurs mostly in allophonic variation with
[t] (Fauchois 1988: 62).
There exist a number of lexical localisms and toponyms with the
above-mentioned phonetic features: e.g. muchiju ‘Indian hat’; Abataj, name
of an Indian community. Spanish loanwords are assimilated according to the
same pattern: e.g. [juiřsa], from Sp. fuerza ‘strength’, realized as
[Φuiřsa] in central and Southern dialects. These facts indicate that we are
dealing with substratum influence in IQ phonology. Recent research has shown
that a distinct cultural and linguistic group lived in the present territory
of Imbabura. There being no grammars or dictionaries of this language, most
works have focused on toponymy, anthroponymy and early Colonial documents. A
short list of morphemes – both lexical and grammatical in nature – has been
identified from the substratum language (Caillavet 2001: 108).
Interestingly, some of them show phonetic patterns similar to those
described above: [-pixal] ‘sinuosity in the landscape’; and [-tux]
‘characteristic of a burial place’. Further research is required in this
field.
3. Typology
Contact with Spanish has not changed the typological profile of IQ. Like
other members of the Quechua family, IQ remains a typical agglutinating
language. This is true not only of sociolects with minimal lexical influence
from Spanish but also of relexified varieties such as Media Lengua (Gómez-
Rendón 2005). What makes IQ – and Ecuadorian dialects in general – differ-
ent from other Quechua dialects is a lower degree of synthesis resulting from
the loss of verb-object agreement and possessive nominal suffixes. Consider
the following examples from the Peruvian varieties of San Martín and Junín
in comparison with Imbabura and Ecuadorian Quechua:
(1) San Martín Quechua       Imbabura Quichua
    Ñuka-ka maka-yki.        Ñuka-ka kan-ta maka-ni.
    1s-top  hit-2.obj        1s-top  2s-acc hit-1s
    ‘I hit you.’             ‘I hit you.’ (Cole 1982: 6)
(2) Junín Quechua            Ecuadorian Quechua
    maki-yki                 kanpak maki
    hand-2s.poss             2s.gen hand
    ‘your hand’              ‘your hand’ (Cerrón-Palomino 1987: 200)
The loss of personal reference markers in IQ introduced the obligatory use
of pronominal forms to mark the arguments of the predicate, where other
dialects use pronouns only for emphasis (cf. Section 6.2). This particular
development cannot be attributed to contact with Spanish or to substratum
influence. The simplification of verbal morphology in Ecuadorian dialects
may be interpreted as a result of koineization. Cusco Quechua was brought to
present-day Ecuador alongside other dialects from central and Northern Peru.
The presence of different dialects contributed to the emergence of a koiné
(Cerrón-Palomino 1987: 343), a process also claimed for other peripheral
areas of the Inca Empire such as Salta and Tucumán in Argentina (de Granda
2001: 207ff).
No contact phenomena have been observed in the type of alignment and
affixation. Nevertheless, IQ has incorporated a few Spanish morphemes,
mainly through the borrowing of Spanish words with such morphemes.
These include agentive -dur and diminutive -itu.4 Consider the following ex-
amples from IQ:
(3) a. mam-ita            b. huas-ita
       mother-dim            house-dim
       ‘dear mother’         ‘little house’
(4) a. midi-dur           b. ñaupa-dur
       measure-ag            go.ahead-ag
       ‘meter’               ‘representative’
The diminutive ending and the agentive marker occur both with Spanish lex-
emes (left column) and native lexemes (right column).
Apart from these grammatical changes, IQ shows a large number of Spanish
borrowings that are assimilated into native patterns. In the classification
of parts of speech elaborated by Hengeveld et al. (2004), IQ is considered a
language with two lexical classes, i.e. verbs and non-verbs. The class of
non-verbs conflates nominal, adjectival and adverbial functions. Spanish
borrowings in IQ tend to match this pattern: Spanish nouns used as modifiers
of noun phrases and verb phrases are not uncommon. I have discussed the
results of an investigation into the functional patterns of lexical
borrowing elsewhere (Gómez-Rendón 2006).
In spite of the considerable influence of Spanish on IQ lexicon and syntax,
IQ continues to be a topic-prominent language. The drop of the topic marker
-ka and its replacement with the focus marker mi is a common feature. How-
ever, this new development does not imply any loss of topic prominence (see
Section 7.1).
4. Nominal structures
Nominal structures influenced by Spanish have to do with case marking and
NP structure. Borrowing of linguistic matter is present, though replication
of patterns is the most frequent phenomenon. In the following sections I
discuss these structures in detail.
4.1. Case marking
Contact-induced phenomena in the use and the semantics of case markers in-
clude: (i) the loss of distinction between inalienable and alienable possession;
(ii) the loss of distinction between comitative ntin and instrumental wan, with
the resulting conflation of both in the latter; (iii) the drop of the obligatory
accusative marker on direct objects; (iv) the increasing tendency to use the
plural marker on nouns after numeral modifiers; (v) the use of Spanish lexical
borrowings to express local and spatial relations.
The loss of the distinction between alienable and inalienable possession is
reflected in the gradual replacement of yuc with pac and in the alternative
use of lexical strategies, as in (6) in contrast to (5). In both cases, the
use or non-use of yuc marks a difference from pre-contact IQ varieties.
(5) Ñami warmi-yuc ka-ni.
already woman-poss be-1s
‘I am married already.’
(6) Ñami kazara-shka ka-ni.
already married-ptcp be-1s
‘I am married already.’
IQ has different markers for the comitative (ntin) and the instrumental
(wan). Whereas ntin relates elements as if they formed one indivisible
unity, wan indicates the contingent bringing together of two elements or the
instrumentality of one with respect to the other. For Kaarhus (1989) the
comitative–instrumental distinction entails a unique understanding of
space-time relations peculiar to Quichua culture. As a matter of fact,
heavily Hispanicized sociolects of IQ have lost this distinction in either
of two ways: (a) both case markers are used interchangeably; (b) one marker
(wan) conflates both meanings. The second case, illustrated in (7) below, is
far more frequent and has resulted in the reduction of the case system on
the model of Spanish.
(7) warmi-wan tarpu-ngapak ri-rka-ni.
woman-com sow-purp go-pst-1s
‘I went with a woman to sow.’
‘I went with my woman to sow.’
The two meanings of (7) cannot be disambiguated without context. The
contrastive use of wan and ntin resolves this ambiguity. By using ntin the
speaker implies that the woman is his wife and that the two of them form a
couple; the use of a genitive pronominal such as ñuka(pak) ‘my’ is
unnecessary in this case.
Another structure influenced by Spanish is the marking of direct-object
arguments. As a rule, pre-contact IQ marks direct objects with ta. Contact
varieties, by contrast, tend to drop this marker. Consider the following
sentences lacking the accusative marker:
(8) churamu-kri-n shuc ley.
put-inch-3 one law
‘He/she is going to pass a law.’
(9) kankuna huasi-pi kati-nchi kay programa.
2pl.poss house-loc follow-1pl this program
‘At your home we listen to the program.’
(10) chari-nchi minimercado “Charito”.
have-1pl small.market “Charito”
‘We have the small-market [called] “Charito”.’ (Fauchois 1988: 117;
my glosses)
In the foregoing examples the accusative marker is systematically dropped
on the direct objects, which contain either Spanish borrowings (8), (9) or
code switches (10). Moreover, the word order is SVO and not SOV, as is
typical of IQ. There seems to be a certain connection between dropped
accusative markers, deviant word orders and heavy lexical borrowing.
Fauchois (1988: 117) claims that the use of Hispanicized SVO word order in
IQ makes it unnecessary to mark direct objects because the element following
the verb is always the object. What Fauchois fails to notice, however, is
that post-verbal position is not assigned to objects by default and that the
identification of this position with objects is possible only by contrasting
Spanish-like word order with native IQ word order. To this extent SVO is
subsidiary to SOV, and the latter remains the most frequent word order, even
in contact varieties. Notice that Spanish lexical material reinforces the
tendency to drop the accusative marker in SVO constructions.
Other tendencies observed in contemporary IQ that may be explained in
terms of contact with Spanish concern the expression of number. Plural
marking in IQ is obligatory, except if numerals precede the noun heads (Cole
1982: 128). The preference in such cases is to leave number unmarked.
Nevertheless, marking plurality is increasingly frequent when numerals are
involved, as exemplified by (11) in comparison to (12) below:
(11) ñuka-ka ishkai churi-kuna chari-ni.
1s-top two son-pl have-1s
‘I have two sons.’
(12) ñuka-ka ishkai churi chari-ni.
1s-top two son have-1s
‘I have two sons.’
Double plural marking is another contact-induced development concerning
number. As a matter of fact, several Spanish words have been borrowed into
IQ in the plural (13). In this case the Spanish plural ending -s and the
Quechua plural -kuna co-occur on the same lexeme, which results in apparent
double marking. However, not all cases of double marking may be interpreted
in this way, for several reasons. Firstly, the number of lexemes borrowed in
the plural is comparatively small. Secondly, some borrowings occur with or
without the Spanish plural. Thirdly, cases are found of native lexemes in
which the Spanish plural ending occurs along with the native marker (14).
(13) chay kosa-s-kuna-manta mana japi-ni-chu.
that thing-(Sp)pl-(IQ)pl-abl neg understand-1s-neg
‘I don’t understand about those things.’
(14) riku-shka-ni kimsa alku-s-kuna Aguchu-pak patiyu-pi.
see-prf-1s three dog-(Sp)pl-(IQ)pl Aguchu-poss backyard-loc
‘I saw three dogs in Aguchu’s backyard.’
It may be argued that Spanish kosas in (13) is an instance of code-switching
rather than borrowing proper. This explanation, however, fails to account
for the occurrence of the native plural. If the code-switch is in Spanish,
why does the IQ plural occur at all? Accepting two plural markers in one and
the same noun phrase implies two competing grammars.5 On the other hand, if
we consider (13) as a frozen borrowing, we have to explain the large number
of borrowed lexemes with the Spanish plural, a number that goes far beyond
the few examples presented in the literature (Cole 1982: 129). In all, the
occurrence of double marking seems to be accountable in terms of borrowing
rather than of code-switching. The occurrence of lexical and grammatical
couplets consisting of native and borrowed items with different functional
distributions (cf. Brody 1987; Campbell 1993; Matras 1998) might offer an
alternative explanation in which pragmatic and processing factors motivate
double marking.
Spanish influence on IQ case marking also concerns the expression of
local relations. Two Spanish lexemes occur in local relations: ladu ‘side’
and frinti ‘front’. The former lexeme is especially productive in IQ.
Consider the following examples:
(15) kuanchi Kasko ladu kidana-ju-nchi pruyektu-wan.
1pl Casco side remain-dur-1pl project-com
‘We [the people] from Casco kept the project.’
(16) kay Imbabura ladu gente.
dem Imbabura side people
‘People from this side of Imbabura.’
(17) kay Topo ladu-kuna-pak-mi siyimpre obligatorio.
dem Topo side-pl-ben-emph always mandatory
‘That was always mandatory for [people] from Topo.’
(18) maijan ladu-man-shi Anglango ka-pa-rka.
which side-all-dub Anglango be-hon-pst
‘On which side was Anglango?’
(19) kuanchi-ka ladu-lla kausa-shka-nchi.
1pl-top side-lim live-prt-1pl
‘We lived on the side.’
(20) ishkay warmi-kuna kaballu-ta chumbi-wan ladu-ladu alza-n.
two woman-pl horse-acc belt-inst side.by.side lift-3pl
‘Two women, one on each side, lift the horse with the belt.’
The above examples can be classified according to the use of ladu: (a) those
in which ladu modifies the head of a noun phrase, be it a pronoun (15) or a
noun (16); (b) those in which ladu stands on its own, being the head of the
noun phrase itself and receiving inflectional morphology (17); (c) those in
which ladu is part of a postpositional phrase and accompanies a question
word (18); and (d) those in which ladu modifies the main predication, either
alone (19) or in reduplication (20). In addition, ladu has an ablative
meaning in (15) to (17). In these constructions the Spanish borrowing links
the noun head (implicit or explicit) to another noun indicating location.
Interestingly, ladu takes postpositions such as manta (ablative), pi
(locative), man (allative), and ta (prolative). From the difference between
local relations expressed by locatives such as ladu ‘side’, in which the
preceding noun does not take the possessive, and local relations expressed
by native locatives such as chaupi ‘middle’, in which it does, Cole (1982:
124) concludes that expressions such as (15) to (17) are complex
postpositions, one of whose components is the locative. In general, ladu may
be considered a secondary locative morpheme, as it behaves exactly like
other members of this class (e.g. uku ‘inside’, washa ‘behind’ or jawa
‘above’). Other non-relational uses of ladu include its use as head and
modifier of noun phrases, in the latter case with the meaning of “lateral”.
Finally, ladu can be adverbialized by lla or by reduplication, as is typical
of many IQ adverbials.
Though much less frequent when compared to ladu, Spanish frinti ‘front’
is used in IQ to express anterior location similarly to the native morpheme
chimba ‘front’. The occurrence of frinti is rather idiolectal, however. The fol-
lowing example is one of the few in the corpus:
(21) chay wambra-ka pungu frinti-pi shaya-ju-n.
dem young-top gate front-loc stand-dur-3
‘That youngster is standing in front of the door.’
4.2. NP structure
Contact with Spanish has influenced NP structure in the following ways: (a)
the use of the determiners shuk ‘one’ and kay ‘this’ to replace the native
topicalizer ka, which is dropped systematically in decontextualized speech
events such as radio broadcasting; (b) the occurrence of Spanish diminutive
and augmentative endings in borrowed and native lexemes; (c) the borrowing
of the Spanish agent nominalizer. While the first phenomenon may be
classified as pattern borrowing, the last two are cases of matter borrowing.
The prolific use of the determiners shuk and kay at the expense of the
topicalizer ka6 was first noticed for IQ in radio broadcasting (Fauchois
1988: 105). Interestingly, this use is found beyond the context of
broadcasting. Consider the following examples:
(22) shuk gallo-mi Katacupamba-pi-ka kanta-na.
one rooster-emph Katacupamba-loc-top crow-hab
‘A rooster used to crow at Katacupamba.’
(23) shuk tela tiya-n ni-k ka-ria-n.
one cloth there.be-3s say-pst.hab be-dur-3s
‘They used to say that there is one cloth.’
(24) kuanchi-ka kai kosecha-pi-ka puri-rka-ria-nchi.
1pl-top this harvest-loc-top go-pst-dur-1pl
‘We used to go to the harvest.’
(25) primero trata-ngapa kai kabesilla-ta trata-rka-ni.
first negotiate-purp this leader-acc negotiate-pst-1s
‘First, in order to negotiate, I negotiated with the leader.’
Originally, kay is a demonstrative while shuk is a numeral. Shuk (indefinite)
and kay (definite) are used on the model of the Spanish contrast between the
indefinite (un/una) and definite (el/la) articles. This is only half of the
explanation, though. Fauchois (1988: 106) identifies three factors leading
to the overuse of kay: (1) the influence of Spanish structure on IQ, whereby
the speaker expresses definiteness or indefiniteness through an element
(demonstrative, numeral) preceding the noun head; (2) the speaker’s
difficulty in using the topicalizer ka in non-personal speech events; and
(3) the need to codify additional information in the absence of
extra-linguistic signs. While the first factor is clearly at work, neither
the second nor the third is relevant for the examples presented here,
because these were gathered in normal communication settings. Notice that
the topicalizer does occur in (24). The co-occurrence of the topicalizer
with the demonstrative implies that the former does not mark definiteness.
Definiteness in IQ is a by-product of topic marking.7
Spanish augmentative and diminutive endings are typically used in loanwords,
though they occur on native lexemes as well. The use of Spanish augmentative
and diminutive endings has not led to the disuse of their native
counterparts (augmentatives sapa and siqui; diminutive ku). On the contrary,
the Spanish ending and the Quechua affix are sometimes used contrastively
in couplets. Consider the following examples, each with a lexeme of
different origin.
(26) ñuka-ka shuk wawitu-lla chari-ni.
1s-top one child:dim-lim have-1s
‘I have only one little child.’
(27) Uyanza tiyimpu ñukanchi-ka papasu-wan puri-shka-nchi.
Uyanzas time 1pl-top father:supl go-plus-1pl
‘In times of Uyanzas we used to go with our grandfathers.’
In (26) the Spanish diminutive ending occurs on IQ wawa ‘child’. The dimin-
utive emphasizes how young the child is. In (27) the augmentative on Spanish
papa ‘father’ does not denote any quality of the speaker’s father. Instead, it
refers to the speaker’s grandfather. The low frequency of this compound in-
dicates that grammatical borrowings are used productively even though they
do not necessarily follow IQ rules. In order to form kinship terms for gen-
erations older than ego’s parents, IQ uses modifiers jatun ‘big’ or rucu ‘old’
and not augmentatives. IQ uses augmentatives on quality nouns only. These
differences suggest that borrowing implies a compromise between the mor-
phological strategies of both languages.
The Spanish agentive nominalizer dur often occurs unanalyzed on bor-
rowed lexemes such as bindidur ‘seller’, trabajadur ‘worker’ or mididur
‘meter’. In these cases it forms an indivisible unit with the root. It occurs also
on native lexemes:
(28) a. ñaupa-dur          b. michi-dur
        front-nmlz            graze-nmlz
        ‘spokesman’           ‘shepherd’
(29) a. kalpa-dur          b. yapu-dur guagra
        run-nmlz              plow-nmlz cow
        ‘runner’              ‘plowing ox’
While the productiveness of this Spanish nominalizer is limited in IQ, the use
of compounds is attested across generations and levels of bilingualism.
5. Verbal structures
The influence of Spanish on IQ verbal structures includes matter and pattern
borrowing, with a predominance of the former given the easy integration of
Spanish loan-verbs, modals and particles into IQ. At the same time, the
occurrence of pattern borrowing in valency-changing devices and, most
importantly, the gradual replacement of nominalization strategies with
finite-clause subordination are modifying the typological outline of IQ.
5.1. TMA marking
An inventory of TMA structures influenced by contact with Spanish includes:
(i) the replacement of ngapaj with purposive chun in coreferential
constructions, on the model of the Spanish subjunctive; (ii) the use of
Spanish dizi- ‘say’ in reportatives and quotatives; and (iii) the use of
Spanish modal verbs.
In his grammar of Imbabura Quichua, Cole calls attention to the fact
that “clauses employing the verbal suffixes -ngapaj and -chun are used in
roughly the same environments in which the present subjunctive is employed
in Spanish” (Cole 1982: 157). The statement can be read simply as a
comparison to help the reader grasp the nature of these suffixes, but the
implications go beyond that. In order to understand this new development in
IQ morphosyntax, it is necessary to recall two structural properties of IQ.
All Quechua dialects use a marker of purpose in subjunctive noun clauses.
Imbabura Quichua has two different markers: ngapaj, for coreferential
subjects in the main clause and the subordinate clause; and chun, for
subjects with different referents. The following examples taken from Cole
(1982: 37f) are illustrative:
(30) muna-y-man ñuka mama-ta riku-ngapaj.
want-1s-cond 1s.poss mother-acc see-purp8
‘I want that I see my mother; I want to see my mother.’
(31) muna-ni Juzi pay-paj mama-ta riku-chun.
want-1s Juzi 3-poss mother-acc see-purp
‘I want José to see his mother.’
Although purposive constructions occur mainly as complements of volition
verbs like muna ‘want’, the same restrictions of coreferentiality apply
to other verbs. Purposive constructions in contemporary IQ do not follow
this pattern. Nowadays chun tends to replace ngapaj in coreferential
environments. Consider the following example, in which this replacement
takes place in a coreferential environment:
(32) atribi-k turiro-kuna tiya-shpa-ka paykuna toria-chun.
brave-nmlz bullfighter-pl be-ger-top 3pl fight-purp
‘If there were brave bullfighters to fight.’
Concurrently, the use of ngapaj becomes restricted to exclusively purposive
functions. Considering the frequent use of the Spanish subjunctive and the
lack of a distinction based on subject coreference in that language, it may
be hypothesized that long-term contact with Spanish led to the
specialization of native morphosyntactic structures. The possibility of an
internal development should not be entirely discarded, but the duration and
intensity of contact, along with high rates of bilingualism in the speech
community, make a contact-induced change more likely.
Of several Spanish utterance modifiers in IQ, two are used as particles
with modal nuances. Consider first the following examples of Spanish tálbis
‘perhaps’:
(33) chayka manchari-shpa tálbis uya-ria-nga tayta-kuna-ka.
then fear-ger perhaps hear-dur-fut parent-pl-top
‘If the parents fear [the penalties], they might listen [to the teachers].’
(34) chay-pash-chari tálbis asha-gu pay-kuna-pak-pash
that-adit-perhaps perhaps few-dim 3-pl-dat-adit
falta-k riku-ri-n.
be.missing-nmlz see-refl-3
‘They may see that quite a few things are still missing.’
Tálbis in (33) and (34) is a phonetic assimilation of Spanish tal vez ‘perhaps’.
This particle marks probability from an epistemic (33) or alethic modality
(34). Notice that tálbis co-occurs with its native counterpart chari in (34).
The particle may occur freely in the clause (e.g. 35).
Another particle derived from Spanish is gulpi. On the one hand, the wide-
spread use of this form across idiolects suggests it is an older borrowing. On
the other hand, the difference in the meanings of gulpi in Spanish and IQ sug-
gests a process of grammaticalization. In fact, the meaning of this particle in
contemporary IQ has no semantic relation to Spanish golpe ‘blow’, even if the
phonetic shape is basically the same. Consider the following examples from
our corpus:
(35) na llukchi-shka-nchik tukuy-lla-tak gulpi-ta trabaja-ngapak.
neg leave-prf-1pl all-lim-aff blow-advr work-purp
‘Not all of us leave at once for work.’
(36) shinallatak gulpi-lla tukuy-ta ayuda-shpa trabaja-chun
however blow-lim all-acc help-ger work-purp
muna-y-manta-pash.
want-inf-abl-adit
‘But we help all to work just because we want it.’
In the foregoing examples gulpi is used to stress the inclusivity of the first-
person plural. This interpretation is further confirmed by the co-occurrence of
gulpi with native tukuy ‘all’ in both examples. Given that IQ has lost the clusiv-
ity distinction characteristic of other Quechua languages, gulpi serves in part
to fill this gap in IQ. Notice that gulpi is adverbialized in (35) and qualified
for degree in (36). In the following example gulpi co-occurs with tukuy and
functions as an adverbial (intensifier) without any additional morphology.
(37) gulpi tukuy tandanaju-shpa llanka-na ka-n yani.
blow all unite-ger work-inf be-3 think:1s
‘It is necessary that all of us work together.’
Another phenomenon with a bearing on TMA marking involves the borrowing of
the Spanish-derived verbs minishti ‘need’, kiri ‘want’ and pudi ‘be able to’,
as shown in the examples below:
(38) komuna-kuna-wan-ma ashtawan trabaja-na minishti-nchik.
village-pl-com-aff more work-inf need-1pl
‘We need to work more with the villages.’
(39) na inkipash problema-kuna-ta tini-ngapa kiri-ni.
neg whatever problem-pl-acc have-purp want-1s
‘I do not want to have any problems whatsoever.’
(40) utru iskuila-kuna-pi-pash problema-ta tiya-shka-manda
other school-pl-loc-adit problem-acc to.be-prf-abl
mana aprindi-i pudi-n.
neg learn-inf be.able-3
‘They cannot learn because they had problems in other schools as well.’
(41) trabaxu-pak-ka ñukanchi piko-lla-mi minishtiri-n.
work-dat-top 1pl.poss pickaxe-lim-aff be.needed-3s
‘Our pickaxe is needed for work.’
Examples (38), (39) and (40) are modal verb constructions in which
Spanish-derived minishti ‘need’, kiri ‘want’ and pudi ‘can’ are used as
auxiliaries of necessity, volition and ability, respectively. By contrast,
sentence (41) shows the same verb minishti ‘need’ used as a non-modal verb
reflexivized with the suffix ri, hence its intransitive interpretation. The
phonological shape and etymological origin of minishti make it a loan-verb
of older import in IQ. The verb form comes from archaic Spanish menester
‘need’ as occurring in constructions like haber menester ‘to be needed’.
These constructions are no longer used in local Spanish but were in use
until the late eighteenth century. Pudi and kiri are of much later import.
The above examples also show that Spanish-derived verbs take IQ inflectional
morphology like any native verb.
A final issue to be dealt with in this section concerns evidentiality. IQ
and other Ecuadorian Quechua dialects show a system of evidential values
that includes one type of first-hand information and three types of
second-hand information: reportativity, quotativity and inference.9 Although
Spanish has not influenced the structure of evidential values in IQ (but see
7.1), one case of matter borrowing is attested, which consists in the
replacement of the native reportative/quotative form ni ‘say’ with the
Spanish verb root dizi ‘say’. This replacement is reported only for the
speech of younger bilinguals. The following examples illustrate evidential
and non-evidential uses of dizi:
(42) Quotative evidential
chayka kutichi-n “estoy buscando mi yunta de bueyes” dizin
then answer-3 [I am looking for my yoke of oxen] QUOT
‘Then he/she answers, “I am looking for my yoke of oxen”.’
(43) Reportative evidential
patrun da-shca rumi-ka kuri ka-shka dizin.
landlord give-ptcp stone-top gold be-prf rep
‘It is said that the rock the landlord gave [to him] was gold.’
The above examples illustrate the use of Spanish-derived dizi in a variety
of contexts. The semantic equivalence between the root of the loan-verb and
the native root ni allows for the replacement of dizin in (42) and (43) with
IQ ni without change of meaning. As is typical of IQ evidentials, dizin
occurs at the end of the quote in (42) and of the clause in (43). Moreover,
both instances of dizin carry the same tense marker as the main clause. All
of these features show that ni and dizi are semantically and
morpho-syntactically equivalent.
5.2. Integration of Spanish loan-verbs
The integration of Spanish loan-verbs in IQ occurs through direct insertion,
i.e. without extra marking. Spanish verbs are borrowed as verb roots without
infinitive endings. The resulting roots are often assimilated to IQ phonology.
Example (44) below illustrates this strategy. Several roots are subject to fur-
ther morphophonological changes such as elision (45) or epenthesis (46) of
syllables in order to conform to IQ phonotactics. Loanwords dating back to
the first century of contact with Spanish are particularly interesting. On the
one hand, their Spanish origin often goes unnoticed by IQ speakers due to
their degree of assimilation (cf. uya below). On the other hand, old loan-verbs
have fallen into disuse in local Spanish (cf. parla below). Accordingly, the
loan-verbs in (44) and (45) are invariably identified as Spanish borrowings
by the speakers while the loan-verbs in (46) and (47) are considered part of
IQ vocabulary.
(44) valora-r: value-inf > balora-

ñuka-manta ishka-ndi cultura-ta balura-ni
1s-abl two-com culture-acc value-1s
‘As for me, I value both cultures.’
(45) acompaña-r accompany-inf > compaña-

compaña-shunchi yamta minga-i-ta
accompany-1pl.fut firewood community.work-inf-acc
‘Let’s go together to collect firewood.’
(46) casa-r(se): marry-inf(+refl) > kaza+ra-

wambra-kuna ka-shpa-ka
youngster-pl be-ger-top
shuk paya paya warmi-ta kazara-ri-nga
one old old woman-acc marry-refl-3.fut
‘Though he is young, he will marry a very old woman.’
(47) parla-r talk-inf > parla-

oi-r hear-inf > uya-
chay-manta parla-shka-ta uya-pa-shka-ngi-chu
that-abl speak-ptcp-acc hear-hon-prf-2s-int
‘Have you heard [people] talking about that?’
The integration of loanwords involves the re-semantization of source-language
meanings, because loan-verbs are not always borrowed with the same
meaning they have in Spanish. This is particularly true of the early stages of
contact, when bilingualism among IQ speakers was incipient; in later stages
Spanish-derived verbs usually match the source-language semantics. For example,
Spanish botar ‘throw’ was interpreted in the sense of ‘to give away’. Other
examples include Spanish desbaratar ‘to mess up’ used in the sense of ‘to
hurt’, or tratar ‘to treat’ in the sense of ‘negotiate’.
An interesting phenomenon is the verbal use of loan nouns and adjectives
in IQ. This tendency is rooted in the lexical flexibility of IQ. In the
following example a Spanish noun is used verbally.
(48) na kai llacta shina-ka flauta-k ka-shca-n-chu nin

neg this village like-top flute-hab.pst be-plus-3-neg rep
medio Camuendo-ta flauta-k kashka nin.
similar Camuendo-acc flute-hab.pst be.plus-3 rep
‘It is said that they did not use to play the flute as it is proper of this
village [but] in the style of Camuendo village.’
(49) primero jatun flauta tiya-na cutin uchilla-gu, Castilla flauta-gu.

first big flute be.inf then small-dim Castilla flute-dim
‘First it was the big flute, then the small Spanish-like flute.’
In (48) the verbal marker of habitual past is added to the Spanish
lexeme flauta ‘flute’. The same lexeme is used as a noun in (49). Notice
that no derivation mechanism is involved in the verbal use of flauta in (48).
The trans-categorization of Spanish borrowings is not uncommon in IQ,
where one often finds Spanish nouns used as adjectives and adverbs, or ad-
jectives used as nouns and adverbs. A detailed discussion of Spanish lexical
borrowings in IQ is presented elsewhere (Gómez-Rendón 2006a).
5.3. Contact-induced valency changes
Certain developments in IQ verbal morphology may be attributed to contact
with Spanish. These include (a) the use of reciprocal naku as a plural marker
for intransitive verbs; (b) the extension of reflexive ri to cover reciprocal
meanings; and (c) the use of reflexive ri on the model of Spanish impersonal
se. In what follows I discuss these developments and their possible motiv-
ation by Spanish contact.
As Muysken (2000b: 984) notes, reciprocal naku is used with intransitive
verbs denoting actions performed together with someone else. He gives the
following example:
(50) puri-na[k]u10-n
walk-recp-3
‘They walk together.’
Notice that puri-n is marked for person but not for number so it may refer to
singular and plural subjects alike. In (50) puri-naku-n unambiguously refers
to several persons walking together. The question is whether this particular
development of IQ is induced by contact with Spanish. To explain this in-
novative use of the reciprocal in terms of contact-induced change we need to
demonstrate that (a) Spanish reciprocals can be used also as plural markers,
and (b) this use serves as a model for the IQ reciprocal.11 While the recip-
rocal-plural relation is demonstrated by the trivial fact that every reciprocal
form implies several individuals and reciprocal morphemes in fusional lan-
guages like Spanish (se, nos, os) also show number, the particular use of the
reciprocal in IQ does not necessarily follow from its contact with Spanish.
Other Quechua languages with a similar history of contact (e.g. Argentinean
Quechua) use naku as a reciprocal only (Alderetes 2002: 5).12 Accordingly,
it may be hypothesized that Spanish triggered the innovative use of the re-
ciprocal as a verbal plural marker on the basis of the common semantics of
reciprocality and plurality. From this point of view the influence of Spanish
would consist in expressing both in one morpheme naku instead of two,13 i.e.
a case of pattern borrowing.
The taking over by naku of an additional (plural) meaning seems to have
caused the reflexive ri to include reciprocity. That such extension is a gradual
process is demonstrated by the fact that it is not uncommon that both markers
occur in one and the same verb (51):
(51) paykuna-ka yanka-manta maka-ri-naku-nkuna.

3.pl-top firewood-abl fight-ref-rcp-pl
‘They are fighting with each other because of the firewood.’
A step towards the replacement of the reciprocal is illustrated in (52) below,
where the reflexive occurs instead of the reciprocal but requires a comitative
marker to signal the common action of the verb ‘coordinate’:
(52) na kunbeniu tiyanchu, purki ñukanchi kurdina-ri-nchi

neg agreement there.be-neg because 1pl coordinate-ref-1pl
uspital-wan.
hospital-com
‘There is no agreement, because we cooperate with the hospital.’
Arguably Spanish served as a model on account of the existence of one paradigm
of verbal morphemes expressing reflexivity and reciprocity. However,
one must be cautious in formulating a hypothesis that links this development
to contact only, as there may be other, internal changes at work. Contact
with Spanish is therefore one of several influencing factors and should be
understood as a trigger of change.
Reflexive ri shows another innovative use in IQ and other varieties of Ec-
uador. Notice the use of ri in the following example:
(53) ñaupak Sanjuan-ka siempre-mari obligatorio ka-na

front Sanjuan-top always-aff obligatory be-hab
ishkai pañuelo-ta binda-ri-shka.
two handkerchief-acc bandage-ref-plus
‘In former San Juan [festivals] it was obligatory to blindfold [the
horse] with two handkerchiefs.’
Consider the argument structure of binda ‘bandage’ (Sp. vendar) in (53). Ori-
ginally, binda is a transitive verb with agent and patient arguments. The agent
and patient of binda are implicit in (53). In the context of the story it is clear
that the official sponsor of the festivals used to harness a horse, and that part
of the animal’s apparel consisted of two handkerchiefs. From the participants
in the story, we assume that the sponsor is the agent, the horse is the patient
and the handkerchiefs are the instruments. However, this distribution of ar-
guments does not correspond to nominal and verbal morphology in (53). For
one thing, the accusative marker ta indicates that “two handkerchiefs” is the
patient. This interpretation does not contradict the structure of participants in
the event. Rather, the use of ri is unexpected in this context, where it cannot
be interpreted as reflexive or reciprocal. What function does ri perform? In
my view the Spanish impersonal se gives us some clues. In Spanish the
impersonal pronoun se is homophonous with the reflexive–reciprocal pronoun
se. Furthermore, both forms are often cliticized to the verb root. In this
context, the second clause in (53) seems to be a calque from the Spanish im-
personal construction in (54):
(54) se vendaba dos pañuelo-s.

impr bandage:pst two handkerchief-pl
‘They bandage two handkerchiefs.’
If we exclude the opposite word orders (Spanish VO versus Quechua OV),
both clauses show a one-to-one equivalence. From this point of view, ri is
neither a reflexive nor a reciprocal but expresses an impersonal agent just like
Spanish se.
The replacement of otherwise different morphemes for reflexive, reciprocal
and impersonal with one and the same morpheme (ri) could be explained,
satisfactorily in my opinion, by contact with Spanish, in which language one
and the same paradigm serves the three purposes.
5.4. Clause linking: nominalization versus subordination
One of the most important contact-induced changes in contemporary IQ is
the increasing replacement of embedded nominalized constructions with hierarchical,
Spanish-modelled subordinated clauses. The subordination strategy
in IQ makes use of Spanish subordinators including: (a) relativizer que ‘that’
after verba dicendi; (b) relative pronoun lo-que ‘that (which)’; and (c) sev-
eral conjunctions like purki ‘because’ or si ‘if’. In this section I focus on the
subordination of complement clauses as objects of transitive verbs and leave
the discussion of the other types for Section 7.
The replacement of nominalization with subordination can be understood
best if we compare example (55) with the corresponding embedded construc-
tion in (56):
(55) paykuna-lla chaya-shpa paykuna apa-shka-n lo-que

3pl-lim arrive-ger 3pl take-ptcp-3pl that-which
muna-shka-n.
want-prf-3pl
(56) paykuna-lla chaya-shpa paykuna munashka-ta apa-shka-n.

3pl-lim arrive-ger 3pl want-ptcp-3 take-prf-3
‘Upon their arrival, they took what they wanted.’
The use of the compound pronoun lo-que ‘that which’ as a clause linker has a
number of effects on the morphosyntax of the IQ clause: (a) while the clause
munashkata in (56) is embedded in the main clause, munashkan in (55) is
postposed to the main clause and linked to it by the pronoun; (b) the embedded
construction in (56) is marked by accusative -ta and falls within the
scope of apa- ‘take’; (c) whereas the verb in (55) is finite, the verb in (56) is
non-finite; (d) Quechua OV word order in (56) is replaced by Spanish-like
VO in (55).
The foregoing use of lo-que shows that lexical borrowing of function words
may have important effects on the morphosyntax of the recipient language.
Other studies have shown a similar impact of Spanish conjunctions and ad-
verbials on the matrix of Meso-American languages such as Pipil (Campbell
1987) and Nahuatl (Hill and Hill 1986). I show further effects of Spanish
function words on IQ in Section 7.
Even if loan nouns and verbs make up the bulk of Spanish borrowings in IQ, the
contribution of other parts of speech is not trivial and has considerable effects
on the structure of the language. This section analyzes loanwords from
different parts of speech and discusses the extent of their influence on other
levels of IQ structure.
6.1. Numerals and quantifiers
While many IQ grammars and dictionaries boast a full set of native numerals
from one to one hundred, their use in spontaneous everyday conversation is
limited to ten in the best of cases. Above ten, speakers use Spanish numerals,
and it is not uncommon to use Spanish numerals also for five to ten.
Ordinal numbers come from Spanish without exception. The following example
illustrates cardinal and ordinal numbers from Spanish:
(57) iskuila-manda-ka llukshi-wa-rka ña trese añu-mi

school-abl-top leave-1s-pst already thirteen year-val
llukshi-rka-ni, kay kuartu gradu-manda ñuka-ka
leave-pst-1 dem fourth grade-abl 1s-top
‘I quit school when I was thirteen, I was there since the fourth grade.’
Apart from Spanish numerals, which are ubiquitous in any type of speech
genre, IQ has borrowed several Spanish quantifiers. Unlike numerals, loan
quantifiers have not replaced their native counterparts and may co-occur with
them in couplets for emphasis. The most frequent Spanish-derived quantifi-
ers are tuditu ‘all’ and alkunu ‘some’. The following examples illustrate their
use:
(58) alkunus chicha maltaca chaypi-mi wardaria-na.

some:(Sp)pl chicha beer-top there-loc-val keep-hab.pst
‘Some people used to keep chicha beer in there.’
(59) chaimanda-mi ñuka prisidente tukushpa.

therefore-val 1s president become-ger
alkunas kusas-kuna-ta allichi-shka-nchi.
some:(Sp)pl things-pl-acc improve-prf-1pl
‘Therefore, when I became president, we improved several things.’
(60) tuditu-mi tukuy-lla-mari tiya-naku-rka-nchi, mikuna-gu-ta

all-val all-lim-aff be.sitting-rcp-pst-1pl food-dim-acc
miku-shpa tiya-ura-mari
eat-ger there.be-when-aff
‘At lunch time every one of us was sitting together.’
(61) ñakutin impresa tuku-rka-nchi chaymanda kumpra-rka-nchi

then business begin-pst-1pl therefore buy-pst-1pl
tuditu asinda
all estate
‘Then we started the business and bought all [the lands of] the
hacienda.’
Spanish quantifiers may be used either as noun modifiers or pronouns. Notice
that alkunus in (58) is a plural masculine pronoun while alkunas is a plural
feminine pronoun in (59). Both forms have been borrowed along with the
Spanish markers of number and gender. One might interpret alkunas in (59)
as a code switch to the extent it is accompanied by another Spanish bor-
rowing, i.e. kusas ‘things’. Both borrowings would thus form a noun phrase
inserted in the morphosyntactic frame of Spanish. But this is not the case
because IQ markers occur in the same phrase.
In (60) and (61) tuditu occurs without Spanish or IQ morphology. In (60)
tuditu co-occurs in a couplet with IQ tukuy ‘all’, with validator mi and affirm-
ative mari emphasizing the idea of inclusivity in the noun phrase. Whereas
examples (58) to (60) follow OV word order, the second clause in (61) has
VO word order. Notice that no accusative marker occurs on the direct object
tuditu asinda in (61). Arguably, the co-occurrence of lexical borrowings
eventually influences IQ morphosyntax.
6.2. Pronouns
In the last section I showed that Spanish-derived quantifiers are used as
pronouns in IQ. In this section I show that the influence of Spanish on the
pronominal paradigm of IQ goes further. A well-documented change involving
pronouns in IQ concerns the use of native adjective kikin ‘proper’ as a polite
second-person pronoun. Politeness in IQ is marked on the verb by means of
the honorific affix -pa-, as shown in (62) below. Consider the following ex-
change extracted from an interview:
(62) a. maijan iskuila-pi-tak ka-pa-rka-ngi.

which school-loc-aff be-hon-pst-2s
‘Which school did you go to?’
b. ñuka-ka iskuila T.H. kay kumunidad Uksha llakta-pi-mi
1s-top school T.H. dem community Uksha village-loc-val
ñuka-ka yachaju-pa-rka-ni.
1s-top study-hon-pst-1s
‘I went to school T.H. in this village of Uksha.’
Compare this strategy with the use of pronoun kikin in (63), where it co-
occurs with the honorific affix. Kikin can be used in plural and receive any
of a set of nominal markers. Kikin is also the base form of the possessive
pronoun kikinpa.
(63) kikin-ta tapu-gri-pa-ni may-manda-ta ka-pa-ngi,

2s.hon-acc ask-inch-hon-1s where-abl-int be-hon-2s
kikin-pa shuti-gu-ta-pash willa-shpa ali ka-pa-ngi-man
2s.hon-poss name-dim-acc-adit inform-ger good be-hon-2s-cond
‘I am going to ask you where you come from and be so kind to tell us
your name as well.’
This development is typical of IQ and other highland varieties of Ecuadorian
Quechua (CIEI 1982: 107). Arguably, kikin developed in the early stages of
contact with Spanish, when social relations between Spaniards and Indians
were modelled on a hierarchy of castes. Nowadays kikin is falling into disuse,
being preserved only in conservative sociolects. The reason to hypothesize a
contact-induced change in this case is the existence of a pronominal paradigm
based on politeness distinctions in local Spanish. The intensity of contact and
the higher levels of bilingualism among IQ speakers are two influencing fac-
tors. Why a similar development is not attested in other varieties of Quechua
remains an open question.
The presence of Spanish in the pronominal paradigm includes the subset of
interrogative pronouns. Spanish-derived ura(s) ‘hour(s)’ is suffixed to the in-
terrogative lexeme ima ‘what’ to form the loan blend imauras ‘when’, ‘at what
time’. The same form is used in indirect questions, as illustrated in (64):
(64) paykuna-ka yacha-n imauras-mi yaku-ka chiri chiri

3pl-top know-3 when-val water-top cold cold
ka-shka-ta-pash imauras-mi yaku-ka kunuc-lla ka-n.
be-prf-acc-adit when-val water-top warm-lim be-3
‘They know when the water is very cold and when it is warm.’
The use of imauras is widespread across generations and levels of bilingualism,
which leads us to assume a comparatively earlier introduction of this
form. By contrast, the use of a pronominal duplet involving first-person
pronoun ñuka and cross-reference marker wa is a late result of contact. Consider
the following example:
(65) wambra kashpa makanaju-ria-ni wambra-pura-kuna y chay

youngster be-ger fight-dur-1s youngster-com-pl and that
ñuka-ta kashtiga-wa-ria-n.
1s-acc punish-1s.obj-hab-3
‘When I was young, I used to fight with other boys and they [his par-
ents] punished me.’
Duplicated structures involving person reference serve contrastive purposes.
However, the double marking of person in (65) suggests no emphatic or contrastive
readings. We cannot ascribe this type of double marking to contact
with full certainty. The loss of verb-object agreement markers in Ecuadorian
Quechua was internally motivated, for which reason it is safer to view Span-
ish as a trigger, not as a cause.
6.3. Discourse markers and adverbial particles
The number of Spanish particles and discourse markers in contemporary IQ
deserves special attention. A broad inventory includes the following categories:
connectors, adverbial clause markers, time deictics and discourse markers.
Examples of several of these function words were presented in different
parts of this chapter. The discussion in this section focuses on connectors,
time deictics and discourse markers. Spanish adverbial clause markers are
addressed in Section 8.
The borrowing of Spanish connectors includes additive y ‘and’, contras-
tives o ‘or’ and dino (from Spanish de no ‘if not’) and disjunctive pero ‘but’.
These connectors are used to coordinate sentences and smaller constituents.
IQ has an additive of its own but lacks a function word to express contrast
and disjunction. Spanish y is ubiquitous in IQ discourse while additive pash
continues to be used in a variety of contexts. Unlike the Spanish conjunction,
pash cannot occur in the first conjunct. The following example from Cole
(1982: 79) illustrates the difference:
(66) (*y) ñuka(-pash) kamlla-ta gushta-ni;

and 1s(-adit) toasted.corn-acc like-1s
(y) ñuka pani(-pash) kamlla-ta gushta-n;
and 1s sister(-adit) toasted.corn-acc like-3
y ñuka wawki(-pash) kamlla-ta gushta-n.
and 1s brother(-adit) toasted.corn-acc like-3
‘I like toasted corn, my sister likes toasted corn, and my brother likes
it too.’
Examples of contrastive o and dino are given in Cole (1982: 80). This author
considers both connectors equivalent. However, the following example from
our corpus shows that this is not necessarily the case and very often both
occur as a single conjunctive similar in meaning to the Spanish expression o
si no ‘unless/instead’. This is illustrated in (67), where marker ka marks the
topic of the previous sentence.
(67) kulki-gu-ta paykuna apamu-n

money-dim-acc 3.pl bring-3
o dino-ka kuinta-man dipusita-mu-n.
or if.not-top account-dat deposit-ctrp-3
‘They bring the money home or deposit it in the account instead.’
The Spanish disjunctive pero ‘but’ is another connector of frequent occurrence
in IQ. It differs from additive and contrastive connectors in that it links
sentences only:
(68) ashta yarijay yarijay ka-shkan-ka nin pero payka ali

much famine famine be-prf-top rep but 3s-top good
jatu-k kashkanga nin.
sell-nmlz be-prf-top rep
‘It is said that there was a lot of famine, but it is said that he still sold
well.’
The disjunctive often co-occurs with Spanish time adverbs such as intonses
‘then’, siympre ‘always’, nunca ‘never’ or antes ‘before’. It remains to be
investigated whether we are dealing here with borrowings or code switches. Some
Spanish connectors have been incorporated into IQ discourse without assimilation.
The lack of phonetic assimilation should be attributed not to a recent history
of borrowing but to the fact that these loanwords must be perceptually salient
in native discourse.
Spanish time adverbs in IQ include all the days of the week. Times of the
day show a mixture of native and borrowed lexicon, as shown in Table 1.
Other time adverbs from Spanish are aura ‘nowadays’ (< Sp. ahora), intonses
‘then’ (< Sp. entonces), and siympre ‘always’ (< Sp. siempre). These ex-
amples show Spanish adverbs used as time deictics in IQ (see (69)–(71)).
Table 1. Times of the day in Imbabura Quichua

Spanish-derived   Quechua native   English
mañana            ---              morning
tařdi             chishi           afternoon
nuche             tuta             evening/night
(69) chayka chay ladu-kuna-man-lla-mi ashtaka siympre chiri chiri

thus that side-pl-all-lim-val much always cold cold
ka-na-ta yacha-n.
be-inf-acc know-3
‘So it is always very cold around those places.’
(70) intonses chay tanda-kuna-ta kara-k ka-rka genti-man,

then that bread-pl-acc give-hab.pst be-pst people-dat
genti-man, gañan genti-man, chai jatun tanda.
people-dat hacienda.worker people-dat dist big bread
‘At that time they used to give those big pieces of bread to hacienda
workers.’
(71) aura-pi-mari nachu fishta-kuna-pi-ka nachu

today-loc-aff neg.int festival-pl-loc-top neg.int
chay kuitis-shina-lla rucu-ta ninanta reventa-chi-n.
that rocket:pl-like-lim old-acc much explode-caus-3
‘Nowadays, in the festivals, they have lots of those old fireworks.’
Aura is different from other adverbs in that it occurs with further markers
including locative pi (cf. 71), topicalizer ka and affirmative mari. Notice that
the original Spanish lexeme ahora means ‘now’, ‘today’ and ‘nowadays’.
Only the third meaning has been preserved in IQ. The other two are covered
by the native lexeme kunan.
Intonses can be used also as a discourse marker at sentence boundaries.
In fact, the latter use is more frequent in IQ. When used as a discourse marker,
intonses does not refer to a specific point in time but signals a succession of
events as shown in the following example:
(72) intonses chaymantaka kunan banda Santa Marianita nishkami, chay

patronpa asinda korredorpi, intonses chaypi tokanajuna hashta
kolonpamba, intonses chayta ña karashka jipaka, hashta hashtami
bailak kana, jari, huarmi, intonses karashka jipaka amozeras mote
yanushcamantaima carana.
‘And then the band ‘Santa Marianita’ stayed in the front yard of the
hacienda, and then they played very hard, and then men and women
danced all together, and then the servants gave toasted corn to the
people.’
Table 2. Spanish discourse markers in Imbabura Quichua

Spanish-derived   Meaning
osea              that is
intonses          then
buino             well
este              this
diai              and then
klaro             of course
Compared to other function words borrowed from Spanish, discourse markers
show a modest rank in terms of frequency. Table 2 includes some Spanish
discourse markers in IQ from the most frequent to the least frequent.
Occasionally discourse markers are followed by the Spanish conjunctive
que. Examples of this are o sea que ‘which means that’ and claro que ‘of
course (that)’. Given that the foregoing markers are prolific in local Spanish
discourse, their frequency in this language partly explains their frequency
in IQ. Nonetheless, the primary motivation of their prolific borrowing is the
pragmatic dominance of the donor language with respect to the recipient language
(Matras 1998: 285). This dominance is clear for the contact situation
between Spanish and Quechua in the Andes, where both languages are in a
diglossic relation to each other, with Spanish as the dominant language
and vehicle of literacy.
6.4. Adjectives and adverbs
In this section I discuss adjectival and adverbial expressions of comparison
insofar as they have been influenced by Spanish. Other issues concerning adjectives
and adverbs such as the incorporation of loan adjectives along with number
and gender markers or the borrowing of manner adverbs have been addressed
in previous sections. Given that the system of parts of speech in IQ makes
no distinction between adjectives and adverbs (Schachter 1985: 17; Gómez-
Rendón 2006), the following discussion is valid also for adverbial compari-
son, even if examples are not provided for lack of space.
In his description of Imbabura Quichua, Cole (1982: 93) gives the following
example to show the way in which adjectival and adverbial expressions
of comparison are formed in IQ:
(73) Tumas-ka [Marya-ta yali-j] ali trabaja-n.

Tomás-top [María-acc surpass-nmlz] good work-3
‘Tomás works better than María.’
In (73) the standard of comparison is an embedded clause nominalized by the
agentive nominalizer -j while the subject of the main clause is the compared
element. The basis of comparison is expressed by the main predicate. The
connection between both elements is made by the verb yali ‘surpass’. How-
ever, the comparative construction in (73) is not the only one available in IQ,
where it is associated with the most conservative speakers. Other construc-
tions attested are illustrated below:
(74) chay jipa-ka Gonzalo rura-shpa saqui-na [pero más claro]cs-Sp

that after-top Gonzalo make-ger leave-hab [but clearer]
Galo ashtawan14 yali-shpa rura-k ka-shka ni-n
Galo more surpass-ger do-nmlz be-prf rep
‘Afterwards Gonzalo stopped making [the festival] but they say that
Galo used to make better [festivals than Gonzalo].’
(75) siempre runa-kuna rikunaju-shka-nchi maijan hospital-mi
always Indian-pl visit-prf-1pl which hospital-val
ashtawan mas maltratoka, Otavalo hospitalmi
more more mistreatment-top Otavalo hospital-val
ashtawan-ga yali maltrato tiya-shka.
more-top surpass mistreatment there.be-prf
‘We Indian people have always visited hospitals that mistreat patients,
the hospital of Otavalo mistreats patients more [than others].’
(76) Pidro uchilla ka-n, tuditu wawa-kuna-ta gana-n
Pidro little be-3 all child-pl-acc win-3
‘Pidro is smaller than all the other children.’ (lit. Pedro is small, he
wins all the children)
The alternative constructions of comparison differ from the traditional strategy
in several ways. Compare first (74) above. This construction differs from
(73) in that the standard of comparison in (74) precedes the compared elem-
ent without participating in the yali predicate. In addition, yali in (74) is not
nominalized but subordinated by means of the gerund marker. The compari-
son between both elements is made explicit by a code-switched connective
phrase pero más claro ‘but clearer’. In example (75) the standard of compari-
son is implicit or inferred from the preceding discourse. Although ashtawan
and yali occur in the second clause, the latter appears without extra marking.
Finally, (76) shows an innovative construction where the clause containing
the compared element and the clause containing the standard of comparison
occur one after the other without a coordinator in between. In this construc-
tion the loan-verb gana ‘win’ has replaced native yali ‘surpass’.
In this gamut of alternative constructions it is possible to trace a continuum
from the traditional IQ construction in (73) to the most hispanicized structure
in (76). Construction (76) has been reported also for Imbabura Media Lengua
(Gómez-Rendón 2001: 197) and other mixed varieties in Ecuador (Muysken
1997: 397).
7. Constituent order and syntax
The influx of Spanish loanwords in IQ goes hand in hand with less visible
changes at the levels of the clause (constituent order) and the sentence
(syntax). Although syntactic developments are not necessarily explained by
lexical borrowing, the co-occurrence of Spanish lexical borrowing and syn-
tactic calquing on the model of Spanish suggests a close relation between
these phenomena.
In syntax pattern borrowing prevails over matter borrowing, even though
the former often implies the latter. Thus, for example, subordinated construc-
tions (instead of nominalized embedded clauses) imply the borrowing of
Spanish conjunctions. Several issues related to word order have been addressed
in previous sections and will not be discussed here. An inventory
of syntactic contact-induced changes in IQ includes: (a) Spanish SVO word
order in declarative sentences and the replacement of topicalizer ka with
focus particle mi; (b) Spanish SVO word order in non-verbal predicative
constructions with copulas; (c) an ongoing shift from relative clause–head
to head–relative clause order mediated by interrogative pronouns used as
relative markers; (d) question formation on the basis of unmarked declara-
tive sentences with Spanish-like interrogative intonation contours; and (e)
the borrowing of Spanish subordinators and the replacement of nominalized
clauses with adverbial subordinated clauses. Apart from these undisputed
contact-induced changes, there are other minor developments in IQ not in-
cluded here on account of their limited frequency.
7.1. Word order in declarative sentences and non-verbal predications
In IQ the verb always occurs in sentence-final position, being immediately
preceded either by the subject in intransitive constructions or the object in
transitive constructions. There exists a clear tendency nowadays towards
Spanish-like SVO word order associated with the drop of topicalizer ka and/or the
replacement thereof with the focus particle mi. Consider the following examples
from Fauchois (1988: 117):
(77) kallari-naku-nchik shuk mushuk semana-ta.

begin-recp-1pl one new week-acc
‘We all start a new week.’
(78) churamu-kri-n shuk ley.

put-inch-3 one law
‘They are going to pass a law.’
(79) kankuna huasipi kati-nchi kay programa.

3pl house-loc follow-1pl this programme
‘We watch this programme at their place.’
Example (77) shows SVO word order but marks the direct object with accusative
ta. By contrast, examples (78) and (79) not only have Spanish-like
SVO word order but also lack the accusative marker on the direct object. According
to Fauchois “[the new SVO word order] is almost systematic if the
object is a nonce borrowing” (1988: 117; my translation). A further syntax-related
change induced by contact is the drop of the topicalizer and the eventual
replacement thereof with the focus particle mi. This change is visible in
non-verbal predicative constructions involving a copula in SV word order.
Consider the following example from an interview:
(80) bueno ñuka shuti-mi kapan Roberto ñuka-mi ka-pa-ni

well 1s.poss name-val be-hon-3 Roberto 1s-val be-hon-1s
Chaupi Inti Caluquí llacta-manta.
Chaupi Inti Caluquí community-abl
‘Well, my name is Roberto Tocagón and I come from Chaupi Inti
Caluqui.’
In (80) the interviewee uses SV word order and drops the topicalizer ka on the
subject of both sentences, using the focus marker mi instead (ñuka shuti-mi;
ñuka-mi). The replacement of the topicalizer is common in non-verbal
predicative constructions involving copulas. According to Fauchois, the
dropping of the topicalizer and its replacement with the focus marker are due
to the fact that Quechua lacks pre-established models for presenting
information in long-distance communication and therefore makes use of Spanish
models (1988: 119). However, the replacement in (80) occurs in contextualized
face-to-face speech. Spanish influence is obvious, but its effects go beyond
syntactic calquing. This change may therefore be associated with a
restructuring of the evidential system. Given that mi marks focus and
first-hand information, the use of this marker as a topicalizer results in the
loss of evidential marking. This explanation needs to be supported with
additional data to be conclusive.
7.2. Head–relative clause order and relative pronouns
IQ lacks relative pronouns. Relative clauses are embedded nominalized
constructions preceding their heads, as illustrated in the following example:
(81) llaki-yuk-kuna-ta kulki-ta tapu-shpa yanapa-k runa.

problem-poss-pl-acc money-acc ask-ger help-nmlz man
‘Man who helps people with problems by asking them for money.’
The loss of nominalization strategies discussed in Section 5.4 has resulted in
the creation of relative clauses following heads. Relative clauses and heads
are linked by interrogative pronouns used as relative pronouns. Consider
these examples:
(82) tukui llakta-kuna-pi may kay ratu puñu-ku-n

all village-pl-loc where this time rest-dur-3
‘In all the villages where people are sleeping now.’
(83) tauka mamita-kuna pi-kuna-mi kay mineros-pak

several mother-pl who-pl-val this miner:pl-poss
warmi-kuna ka-n.
woman-pl be-3
‘Several mothers who are the wives of these miners.’
(84) tandanakuy ima-ta rura-shka kay kabildu.
meeting what-acc make-prf this council
‘The meeting (that) this council celebrated.’
Unlike adverbial clauses with Spanish conjunctions, the foregoing clauses use
IQ pronouns in the function of subjects (83) and objects (84). These pronouns
may be pluralized (83) or receive case marking (84). The verb in the relative
clause is finite and receives TMA marking. The resulting clause is very
similar to a Spanish relative clause (Fauchois 1988: 113). Although these
constructions are typical of the speech of bilinguals, they are not uncommon
in conservative idiolects.
In Section 5.1 I discussed the use of Spanish verba dicendi in quotative
constructions. A parallel development in contemporary IQ is the use of the
Spanish relativizer que ‘that’ after the IQ verb ni- ‘say’, as illustrated below:
(85) shinallatak ni-n que gallu-ta yali-shpa-ka osea yanapa-n nin.
however say-3 that cock-acc pass-ger-top that.is help-3 evd
‘However they say that people help you organize the rooster festival.’
In (85) the Spanish relativizer que heads the complement clause of the finite
verb form nin, not to be confused with the evidential form nin in sentence-
final position. The fact that the finite verb nin co-occurs with the Spanish
relativizer suggests that it is not an evidential but a verbum dicendi whose
main function is to reinforce the reportative meaning of the evidential.15
7.3. Question formation: dropped markers and Spanish intonation
Yes-no questions in IQ are formed by the suffixation of interrogative -chu to
the focalized constituent of the sentence, without any particular word order
or intonation contour marking the interrogation. This strategy is illustrated
in the following examples:
(86) a. kaya-ka pay shamu-nka-chu.

tomorrow-top 3s come-3.fut-int
‘Will he/she come tomorrow?’
b. kaya-ka pay-chu shamu-nka.
tomorrow-top 3s-int come-3.fut
c. kaya-chu pay-ka shamu-nka.

tomorrow-int 3s-top come-3.fut
Yes-no questions in Spanish are formed by moving the main verb to sentence-
initial position and/or giving an interrogative intonation to the questioned
element. These strategies have been adopted by IQ speakers. In more con-
servative sociolects Spanish intonation co-occurs with interrogative marker
chu; in more Hispanicized ones, the interrogative marker is dropped, and
declarative sentences are distinguished from their interrogative counterparts
either by inverted verb–subject order with interrogative intonation (87a) or
by intonation only (87b, c).
(87) a. shaMU-nka pay kaya-ka.
come-3.fut-int 3s tomorrow-top
b. kaya-ka PAY shamu-nka.
tomorrow-top 3s/int come-3.fut
c. kaYA-ka pay shamu-nka.
tomorrow-int 3s come-3.fut
Spanish also influences the formation of wh-questions in IQ. The typical
order of wh-questions in IQ is WH + SOV, as shown in (88) below. This implies
that interrogative and declarative sentences share the same word order:
(88) ima-ta-tak paykuna-ka rura-n.

what-acc-int 3pl-top make-3
‘What do they make?’
This order is often inverted in wh-questions in contemporary IQ, with the
main verb following the wh-word and followed in turn by the subject, as in
(89) and (90):
(89) tandanakuy parti-manta rima-shpa, kikinkuna-ka imashina-ta

meeting part-abl speak-ger 2.pl-top how-int
winachi-shka-ngichi chay organisasion-ta-ka.
create-prf-2pl that organization-acc-top
‘Concerning meetings, how did you create the organization?’
(90) kikinkuna yuyay-pi ima-shi ka-n shuk grupo ni-shka.

2.pl.poss thought-loc what-dub be-3 one group say-ptcp
‘In your opinion, what might this so-called group be?’
As shown in (90), the inverted word order occurs also in non-verbal predica-
tive wh-questions involving a copula. The use of the copula itself is a calque
from Spanish, because IQ requires no copulative verb in such cases. From
the frequency of constructions like those illustrated above we conclude that
IQ has calqued the syntactic pattern of Spanish, in which the verb follows
the wh-word.
7.4. Adverbial clauses: Spanish subordinators and the loss of nominalization
In Section 5.4 I showed that IQ nominalized constructions are being gradually
replaced by subordinated clauses on the Spanish model. A related development
is the use of Spanish subordinators, including lo-que (cf. 5.4), the
relativizer que (cf. 7.2) and conjunctions of cause (porque), condition (si)
and concession (mas que). In this section I discuss the use of Spanish
conjunctions in adverbial subordinated clauses. Consider the following example:
(91) ñukanchik ishka-ndin yachachik-kuna-ka rima-nchik-yarin, pero si
1pl two-com teacher-pl-top speak-1pl-aff but if
tapu-nchik ñukanchik shuk-lla shimi-pi yachakuk-kuna-ta
ask-1pl 1pl.poss one-lim language-loc student-pl-acc
mana intindi-nga-chu porque paykuna-pa nima mana ka-n-chu.
neg understand-3-neg because 3.pl-dat nothing neg be-3-neg
‘We as teachers speak [IQ] indeed, but if you ask students in our lan-
guage, they do not understand, because it means nothing to them.’
In (91) the Spanish conditional si ‘if’ is used instead of the verbal suffix -kpi
for non-coreferential subjects. Notice the adversative conjunction pero in the
same example. The word order in the conditional sentence is SVO instead
of SOV. The last clause indicates a causal relation. It is marked by Spanish
porque ‘because’ and not by the IQ suffixes -manta or -rayku. The Spanish
subordinator porque never co-occurs with its IQ counterpart (the suffix
-manta). By contrast, causal como16 does co-occur with the native suffixes
-kpi or -shpa. Example (92) illustrates this case of doubling.
(92) chayka como yapa alpa-ta chari-shpa-ka, kay-kaman-mi
then because too land-acc have-ger-top this-all-val
ka-shka ka-n chay shuk hacienda.
be-prf be-3 that one hacienda
‘As the hacienda had a lot of land, it reached up to this area.’
Another loan subordinator, for concessive adverbial clauses, is maske
‘although’, from Spanish mas que, as illustrated below:
(93) maske ñuka ashta yapatalla wasi-pi rima-kpi-pash,

though 1s too much-acc-lim house-loc speak-ger-adit
ñuka mama wasi-pi solo kichwa rima-n
1s.poss mother house-loc only Quechua speak-3
‘Even though I speak too much [Spanish] at home, at my mother’s
home they speak only Quechua.’
This compound conjunction co-occurs with additive -pash on the main verb.
In (93) the verb of the concessive clause carries the suffix -kpi, but this is
rather uncommon.
Spanish subordinating conjunctions are frequent in contemporary IQ and
their use is widespread across generations and levels of bilingualism.
However, their co-occurrence with native suffixes is more frequent in
conservative dialects. In innovative varieties, finite verbs occur without
native suffixes more often than not. The fact that subordinated conditional
clauses occur without suffixes of coreferentiality brings innovative varieties
closer to Spanish. In general, the loss of nominalization and other
morphosyntactic changes associated with it is a gradual process, the stages of
which can be found in different idiolects within the same speech community.
8. The lexicon
The influence of Spanish on the lexicon of IQ involves all semantic fields,
from kinship and household to education and administration. According to
the results from a corpus of spontaneous speech collected in Imbabura, the
presence of Spanish borrowings in IQ amounts to roughly one fifth of the total
number of lexemes (21%). However, the contribution of Spanish borrowings
to the native lexicon is not the same across idiolects, with those of older gen-
erations showing less influence than those of younger, more bilingual speak-
ers. As for the type of Spanish borrowings, all lexical classes except pronouns
and adpositions are borrowed, though in different numbers. Nouns are by far
the largest lexical class (55%), followed by verbs (16%), adjectives (8%) and
adverbs (2%). The contribution of function words is not unimportant, with a
total of 17 percent of tokens including mainly conjunctions, discourse mark-
ers, interjections, numerals and frozen borrowings. In general, frozen borrow-
ings are distinguished from code switches on the basis of their phonological
assimilation and their integration into IQ morphosyntax. A large number of
these borrowings are idioms and situation-bound formulaic expressions for
greeting, thanking, requesting and the like. A thorough analysis of Spanish
lexical borrowing in IQ and the ways of integration of Spanish loanwords into
the native system of parts of speech has been carried out elsewhere (Gómez-
Rendón 2006a).
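The proportions reported in this section can be tallied in a short sketch. Python is used here purely for illustration, and the dictionary and variable names are hypothetical rather than part of the original study. As given in the text, the class shares add up to 98 percent; the text does not say where the remainder belongs.

```python
# Shares of Spanish borrowings by class, in percent, as reported in the text.
# The names below are illustrative assumptions, not taken from the source.
borrowing_shares = {
    "nouns": 55,
    "verbs": 16,
    "adjectives": 8,
    "adverbs": 2,
    "function words": 17,
}

# Sum over all classes and over the four content-word classes only.
total = sum(borrowing_shares.values())
content_words = total - borrowing_shares["function words"]

# The class with the highest share of borrowings.
largest = max(borrowing_shares, key=borrowing_shares.get)

print(f"content words: {content_words}%")
print(f"all classes: {total}%")
print(f"largest class: {largest}")
```

The tally only restates the published percentages; it does not recompute them from the corpus, which is not reproduced here.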
9. Conclusion
Quechua and Spanish have a history of four hundred years of contact in the
Andes. The intensity of contact has substantially increased in the last cen-
tury as a result of the expanding power of the nation-state and the diffusion
of media in rural areas. The existence of higher levels of bilingualism in Im-
babura has strengthened the influence of the dominant language on the lexi-
con and the grammar of IQ. The outcome is a strongly hispanicized variety of
Quechua. Such a variety appears to be very adaptive to the new communicative
settings imposed by modern society. In fact, contemporary IQ is a living lan-
guage after four centuries of contact because it succeeded in making a com-
promise between the communicative needs imposed by the official language
and the speakers’ cultural need to preserve their linguistic identity.
Notes
1. With no linguistic census available, this is only a reasonable estimate.
Ethnologue gives a figure of 300,000 speakers in 1977, which is evidently an
exaggeration considering that the whole population of Imbabura (i.e. Mestizos
and Indians) hardly reached 250,000 people by 1982 (INEC 2001).
2. As a matter of fact, nonce borrowings are much less integrated into IQ phonology
and may be considered cases of insertional code-switching (cf. Muysken 2000a:
32). Furthermore, it appears that phonetic assimilation into native patterns goes
hand in hand with grammatical accommodation, as noted by Fauchois (1988: 92).
3. This is typical of Ecuadorian dialects. Peruvian and Bolivian dialects show di-
vergent patterns.
4. Another possible candidate is the prefix la-. In Argentinean Quechua (Santiago
del Estero) laya occurs before all kinds of nouns and has the meaning “type of”.
However, it is neither phonetically reduced nor cliticized. Likewise, IQ speakers
use laya with all nouns except kinship terms, in which case the short form la- is
used, “indicating a type of kinship following the original” (cf. CIEI 1983: LVI;
my translation). Interestingly, the word laya is obsolete in Ecuadorian Spanish,
except in some archaic varieties spoken in rural areas. The case of la- is all
the more exceptional because no prefixes exist in IQ, nominal and verbal
morphemes being all suffixes. An alternative analysis is that la- is a reduced
(grammaticalized) form of the verb illa-c ‘be.missing-ag’.
5. An explanation can be given in the frame of the model of “embedded language
islands” (Myers-Scotton 2002: 139ff).
6. Notice that IQ has no articles to mark definiteness and uses the topicalizer ka for
definite referents.
7. Interestingly, example (22) shows emphatic mi occupying the position typically
assigned to topic marker ka. As shown in Section 7, the fact that emphatic and
focus markers usually swap places in modern IQ is a result of the structural re-
organization of the language under Spanish influence.
8. Cole gives the label “subjunctive” for ngapaj and chun alike. I prefer to call
them ‘purposives’ because of their original meaning in IQ.
9. For a study of evidentiality in Ecuadorian Quechua in the frame of Functional
Discourse Grammar, see Gómez-Rendón (2006b).
10. This morpheme has at least five different realizations in Ecuadorian dialects
(CIEI 1983: XLI): [naku], [naju], [nau], [na], [nu]. The example given by
Muysken comes from Lowland Ecuadorian Quechua, a group of dialects spo-
ken in the provinces of Napo and Pastaza where the same behaviour of naku as
verbal plural marker is observed.
11. My analysis here differs from the one provided by Cole (1982) who claims that
“-naku does not express reciprocity but rather joint action of some kind [and]
this action may be, but is not necessarily, reciprocal” (1982: 92f). Cole accepts,
however, that this marker can be used as an emphatic verbal pluralizer.
12. http://usuarios.arnet.com.ar/yanasu/main.htm. Dated 21 June 2006.
13. According to CIEI (1983), “the reason for the use of this morpheme may be an
assimilation to the verbal plural marker in a process of metathesis, i.e. the
inversion of syllables” (CIEI 1983: XL; my translation).
14. The IQ adverb ashtawan ‘more’ may occur in the embedded clause as in (74) or
co-occur with Spanish más ‘more’ as in (75).
15. This is probably due to the semantic bleaching of the evidential in the context of
a new information structure, where less emphasis is placed on evidential values,
following the model of Spanish discourse.
16. Porque and como have a causal meaning but only como clauses may be
accompanied by IQ markers.
References
Adelaar, Willem and Pieter Muysken
2004 The Languages of the Andes. Cambridge: Cambridge University Press.
Alderetes, Jorge R.
2001 El Quechua de Santiago del Estero. In: Jorge R. Alderetes (ed.), El
Quechua en Argentina, URL: http://usuarios.arnet.com.ar/yanasu/
main.htm.
Bakker, Dik, Jorge Gómez-Rendón, and Ewald Hekking
Forthc. Spanish meets Guaraní, Otomí and Quichua: A multilinguals confron-
Brody, Jill
1987 Particles Borrowed From Spanish as Discourse Markers in Mayan
Languages. Anthropological Linguistics 29: 507–521.
Büttner, Thomas
1993 Uso del quichua y el castellano en la Sierra ecuatoriana. Quito: Edi-
ciones Abya Yala.
Caillavet, Chantal
2001 Etnias del Norte. Quito: Editorial Abya Yala
Campbell, Lyle
1987 Syntactic Change in Pipil. International Journal of American Linguistics
53 (3): 253–280.
1993 On proposed universals of grammatical borrowing. In: Robert Jeffers
(ed.), Selected papers of the Ninth Conference on Historical Linguistics.
Amsterdam: John Benjamins.
Cerrón-Palomino, Rodolfo
1987 Lingüística Quechua. Cusco: Centro de Estudios rurales Andinos Bar-
tolomé de Las Casas.
CIEI
1982 Caimi Ñucanchic Shimiyuc-Panca. Quito: Ministerio de Educación y
Cultura, PUCE.
1983 Ñucanchic llactapac shimi. Quito: Ministerio de Educación y Cultura,
PUCE.
Cole, Peter
1982 Imbabura Quechua. Amsterdam: North-Holland Publishing Company.
Fauchois, Anne
1988 El quichua serrano frente a la comunicación moderna. Quito: EBI-
Abya Yala.
Gómez-Rendón, Jorge
2001 La deixis pronominal en la media lengua de Imbabura: comunidades
de Casco Valenzuela y El Topo. MA Thesis. Department of Applied
Linguistics, Pontificia Universidad Católica del Ecuador.
2005 La media lengua de Imbabura. In: Pieter Muysken and Hella Olbertz
(eds.), Encuentros y conflictos. Bilingüismo y contacto de lenguas en
el mundo andino, 3957. Madrid: Vervuert Iberoamericana.
2006a Condicionamientos tipológicos en los préstamos léxicos del castellano:
el caso del quichua de Imbabura. Actas del XIV Congreso del
ALFAL2005, Monterrey: ALFAL.
2006b Interpersonal Aspects of Evidentiality in Ecuadorian Quechua. In:
Miriam van Staden and Umberto Ansaldo (eds.), ACLC Working
Papers 1: 3750.
Granda, German de
2001 Estudios de Lingüística Andina. Lima: Pontificia Universidad Católica
del Perú.
Haboud, Marleen
1998 Quichua y Castellano en los Andes Ecuatorianos: Los efectos de un
contacto prolongado. Quito: Ediciones Abya Yala.
Hengeveld, Kees, Jan Rijkhoff, and Anna Siewierska
2004 Parts-of-speech systems and word order. Journal of Linguistics 40 (3):
527570.
Hill, Jane, and Kenneth Hill
1986 Speaking Mexicano: The Dynamics of Syncretic Language in Central
Mexico. Tucson: University of Arizona Press.
Instituto Nacional de Estadísticas y Censos
2003 Censo de población y vivienda 2001. Quito: INEC.
Kaarhus, Randi
1989 Historias en el tiempo, historias en el espacio: Dualismo en la cultura
y la lengua quichuas. Quito: Editorial Abya Yala.
Matras, Yaron
guistics 36 (2): 281331.
Muysken, Pieter
1997 Media Lengua. In: Sarah Thomason (ed.), Contact Languages: A Wider
Perspective, 365426. Amsterdam: John Benjamins.
2000a Bilingual Encounters: A Typology of Code-mixing. Cambridge: Cam-
2000b Semantic transparency in Lowland Ecuadorian Quechua morphosyntax.
Linguistics 39 (5): 873–988.
Schachter, Paul
1985 Parts of speech systems. In: Timothy Shopen (ed.), Language Typology
and Syntactic Description. Volume 1: 3–61. Cambridge: Cam-
Grammatical borrowing in Paraguayan Guaraní
1. Background
Paraguayan Guaraní (henceforth PG) is a Tupi-Guaraní language spoken by five
million people in Paraguay and the Argentinean Province of Corrientes. An
overwhelming majority of speakers of PG are also speakers of Spanish, with
different degrees of bilingualism. The centuries-long contact between PG and
Spanish has resulted in high levels of bilingualism and convergence.
Contemporary Paraguayan Guaraní differs in several ways from pre-contact
Guaraní1 as a result of lexical and grammatical borrowing from Spanish. More
recently PG has also been in contact with other Indo-European languages such
as German and Portuguese, but an evaluation thereof falls outside the scope of
this chapter. The present contribution focuses on a one-to-one borrowing
situation between Spanish and PG.2
Too often Paraguay is presented as a model bilingual society without
an accurate assessment of the facts. The figures show a somewhat different
scenario. According to the 2002 census,3 Guaraní monolinguals (27%) are
significantly more numerous than Spanish monolinguals (6.56%), particularly
in rural areas. Furthermore, bilinguals make up only 59 percent, in other
words, less than two thirds of the country’s population. Differences are also
qualitative. Spanish and PG show complementary distribution across social
spaces, a situation that may be qualified as diglossic, with Spanish as the
dominant language (Meliá 1992). Paraguay is a unique case in Latin America,
but its uniqueness is founded less on bilingualism than on the fact that
PG is the only Indian language spoken by non-Indian citizens as their mother
tongue.
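The census figures cited above can be checked with a few lines of arithmetic. This is only an illustrative sketch: the variable names are assumptions, and the three percentages are taken directly from the text (the remaining share of the population is not characterized there).

```python
# Percentages from the 2002 census as cited in the text.
guarani_monolinguals = 27.0
spanish_monolinguals = 6.56
bilinguals = 59.0

# Share of the population covered by the three groups named in the text.
accounted_for = guarani_monolinguals + spanish_monolinguals + bilinguals

# "Less than two thirds": compare the bilingual share with two thirds of 100.
two_thirds = 100 * 2 / 3
bilinguals_below_two_thirds = bilinguals < two_thirds

# Number of Guaraní monolinguals per Spanish monolingual.
monolingual_ratio = guarani_monolinguals / spanish_monolinguals

print(f"accounted for: {accounted_for:.2f}%")
print(f"bilinguals below two thirds: {bilinguals_below_two_thirds}")
print(f"Guaraní to Spanish monolinguals: {monolingual_ratio:.1f} to 1")
```

On these figures, Guaraní monolinguals outnumber Spanish monolinguals by roughly four to one, which is the asymmetry the paragraph draws attention to.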
As a matter of fact, rural PG shows less Spanish influence than urban
PG. However, Spanish is present in urban and rural varieties alike. Assuming
a clear-cut distinction between them is therefore inaccurate. Speakers
and local policy-makers often use the terms Guaraníete (‘true Guaraní’)
and Jopara (‘mixed Guaraní’) to take a stand on language use and identity.
Similarly, the rural-urban division represents two opposite linguistic
ideologies (purism versus mixture) which are at the very heart of language
policies and politics in Paraguay.4 In what follows I make reference to both
varieties grouped under the label of ‘Paraguayan Guaraní’ (PG). When
contact-induced changes are attributed to only one group of the speech
community, the bilingual status of speakers is mentioned accordingly.
2. Phonology
Spanish loans have entered PG with or without assimilation, though
unassimilated items are more numerous than assimilated ones. To the extent that core
vocabulary is composed of an important number of unassimilated loans, the
sounds introduced through them have become part and parcel of the phono-
logical inventory of the language. If compared to pre-contact Guaraní, PG has
six new segments, /Φ, č, ð, ř, l, λ/. With the exception of /λ/, present in Span-
ish but probably also introduced through an Indian language (Gregores and
Suárez 1967: 89), the presence of the above-mentioned sounds in PG may
be attributed to Spanish only. Apart from being present in Spanish-derived
vocabulary, they occasionally occur in native items in the speech of younger
bilinguals. Segments /Φ, č, ř/ show the same primary articulation as native
phonemes /p, š, r/ but differ from them in their secondary articulation. Like-
wise, Spanish-derived /ð/ differs from its native counterpart in that the latter
occurs always nasalized in the cluster /nd/. Because /l/ and /λ/ do not have
native counterparts sharing place or manner of articulation, they can be con-
sidered, for all purposes, alien to the pre-contact inventory. As for consonant
alternations, a significant degree of free variation is found across idiolects be-
tween sound pairs /č/-/š/ and /l/-/r/. The vowel inventory of PG has remained
virtually untouched by Spanish, except for the tendency observed in some bi-
lingual children and young adults to either relax the tensed high central vowel
or pronounce it as the obstruent [γ]. This development is limited mostly to
urban lects. In all, we can safely state that the six-vowel set from pre-contact
Guaraní is preserved in the vast majority of speakers.
Suprasegmental phonology also shows effects from Spanish contact. Some
bilingual children and young adults do not (fully) nasalize affixes attached to
nasal roots as required by nasal harmony rules: e.g. the non-nasal recipro-
cal jajo occurs on nasal roots instead of its nasal allomorph ñaño. Similarly,
regressive or progressive nasalization does not always occur in the speech
of some bilinguals: e.g. mitãnguera ‘children’ is realized as [mitanguera] in-
stead of [mitãŋũẽrã].
Stress patterns also have changed as a result of contact: while assimilated
loans follow the native pattern “with stress in the last syllable, no matter in
which syllable the stress originally fell” (Gregores and Suárez 1967: 91), the
bulk of unassimilated loans preserve the primary stress in the same syllable as
in Spanish: thus, for example, one may find assimilated loans such as [kesú]
‘cheese’ beside unassimilated items like [bentána] ‘window’.
As regards syllable structure, unassimilated loans have introduced non-
canonical onsets and codas: clusters formed by a plosive plus a flap (e.g. /tr/,
/pr/); sibilant /s/ occupying coda positions in non-assimilated loans, particu-
larly in frozen borrowings with the Spanish plural ending (e.g. kosa-s-kuéra
‘things’).
3. Typology
From a typological perspective PG may be defined as polysynthetic. This
statement must, however, be taken with caution in view of the following
analysis. Consider the answer given in (1) to the question whether knowing a
second language is good:
(1) a. Chéve guarã nda-i-perhudisial-ri 5

1.obj for neg-3S.prs-detrimental-neg
b. re-mombarete-ve-hína pene arandu
2S-strengthen-more-prog 2poss knowledge
c. a-medida-que la ñe’ẽ rei-kuaa.
in.proportion.as dem speak 2S-know
‘For me it is not bad, [because] you strengthen your knowledge
inasmuch as you know the language.’
The foregoing example contains three Spanish borrowings: the adjective
perhudisial ‘detrimental’; the complex conjunction a medida que ‘in proportion
as’; and the singular feminine article la. Notice that Spanish adjectives and
nouns are commonly used in PG in syntactic positions other than the
prototypical ones. I will return to the use of lexical borrowings in Section 9.
The use of the Spanish article is analyzed in detail in Section 4.3. For the
moment let us focus on the linking strategies in (1). Although (1b) is
semantically dependent on (1a), the causal relation made explicit in the
English translation is only implicit in PG. In turn, (1c) is linked to (1b) by
the Spanish conjunction a medida que ‘in proportion as’, indicating ‘fulfilled
condition’ and subordinating (1c) to (1b). Both linking strategies co-exist
nowadays in PG. However, compared to pre-contact Guaraní, which shows a strong
preference for parataxis and postpositions instead of connectives,
contemporary PG has certainly lost some of its synthetic nature. In addition,
compared to an equivalent construction in pre-contact Guaraní, clause (1c)
seems to be a syntactic calque of Spanish. In (1c) the predicate head kuaa
‘know’ has two arguments, the second-person subject expressed by the prefix
rei- and the object la ñe’ẽ ‘the language’. The construction is fully
grammatical in PG and yet syntactically different from its more synthetic
equivalent reñe’ẽkuaa ‘you know how to speak it’, in which ñe’ẽ is
incorporated into the verb stem while only one argument remains explicit.
It is uncertain whether reduplicative structures found in PG are contact-
relevant, since other Tupi-Guaraní languages use reduplication as well. Be-
sides, reduplication is present also in Paraguayan Spanish – just like in other
Spanish varieties spoken in Latin America. The most likely hypothesis here
is mutual reinforcement of reduplication in a process of convergence.
4. Nominal structures
Nominal structures are the category most influenced by Spanish. There are
five structures that show contact-induced changes: possession, number, de-
terminers, case marking, and local relations. I address each of them in this
section.
4.1. Possession
Paraguayan Guaraní has three ways to mark possession: possessor–possessed
juxtaposition (2); suffixation of ablative -gui to the possessor (3); and the
use of the Spanish preposition de between possessed and possessor (4).
(2) umi organización dirihente-kuéra ndive ro-ñe’ẽ

some organization leader-pl with 1pl.excl-speak
‘We speak with some leaders of the organization.’
(3) upépe o-ñe-monda-paite ore sy-kuéra-gui ryguasu

then 3-pass-steal-all 1pl.excl mother-pl-abl hen
‘At that time our mothers’ hens were all stolen.’
(4) Oĩ-há-pe guive o-je-gueraha preso padre de familia

3.be-rel-loc from 3-pass-take imprisoned parent of family
‘Since then parents of families were imprisoned.’
The construction in (3) is similar to possessed–possessor constructions
followed by ablative -gua. These constructions are used as fixed expressions
and show relatively low frequency of use (Guasch 1997: 62). Another less
common way to mark possession in PG is through the Spanish preposition de
(4).6 Notice that both the possessed and the possessor are Spanish loanwords.
This restriction suggests that we are dealing here with phrasal units borrowed
from Spanish. I return to the issue of phrasal borrowings in PG in the last
section.
Two other contact-induced changes concerning possession need explan-
ation. One is the loss of consonant alternation on the possessed element in
young bilingual speakers, particularly in urban areas. Accordingly, posses-
sive constructions like (5a) occur instead of (5b):
(5) a. ko jagua hesa

dem dog eye
‘The dog’s eyes.’
b. ko jagua resa
dem dog eye
‘The dog’s eyes.’
The other change has to do with the distinction between alienable and inal-
ienable possession. Although PG has no morphological marker to distinguish
one kind of possession from the other, the distinction is expressed by syntac-
tic and lexical means, for instance, through noun incorporation (6a) or the use
of specific lexemes for body parts (7a). In contact varieties noun incorpora-
tion is being increasingly replaced by phrasal constructions (6b), and body-
part lexemes do not show consonant alternation, as shown in (7b).
(6) a. a-johei-ta che-juru.

1S-wash-fut my-mouth
‘I will wash my mouth.’
b. a-je-juru-hei-ta.
1S-refl-mouth-wash-fut
‘I will wash my mouth.’
(7) a. che rague

1S.poss hair
‘my hair’
b. che tague
1S.poss hair
‘my hair’
While the loss of consonant alternation in possessive constructions like (7b)
is a recent development in PG (it is ungrammatical in a number of lects), the
phrasal alternative to noun incorporation in (6b) is a choice with specific
connotations, as shown by some authors (Velázquez-Castillo 1995: 555).
Therefore, Spanish contact has not produced the construction itself but
restricted both alternatives to one.
4.2. Number
Number is usually not marked in PG. The interlocutor depends mostly on
contextual cues for number distinction. If necessary, the ending kuéra is
added to the noun to indicate plurality (8a), except if a numeral or a
quantitative adjective precedes the head noun. At the same time an incipient
tendency towards (double) plural marking on Spanish loanwords is observed in
PG (8b), including heads modified by numerals or quantifiers (8c).
(8) a. professor-kuéra-ndi kolehio tiempo-pe.

teacher-pl-com school time-loc
‘[I speak Spanish] with the teachers during school time.’
b. brasilero-s-kuéra a-ñemongeta hendi-kuéra heta vése.
Brazilian-pl-pl 1S-talk 3.com-pl many time.pl
‘Brazilian people, I talk with them many times.’
c. heta-iterei kampesino-kuéra nd-o-kuaa-i la kastellano.
many-very peasant-pl neg-3-know-neg dem Spanish
‘Many peasants don’t know Spanish.’
I have not found evidence of systematic use of the plural kuéra in the corpus.
There is positive evidence, however, that Spanish-derived lexemes quantified
by numerals or quantitative adjectives determine the obligatory use of the
Guaraní plural marker.
Another contact-induced development in PG related to an incipient
morphological distinction of number is the differential use of the Spanish
article: la for plural and singular and lo for plural.7 There are few examples
in which the presence of one or the other calls for a singular or plural
reading. Compare the following sentences:
(9) i-porã-iterei la o-ñe-mbo’e la-mitã-(kuéra)-me.

3.be-good-very pro 3-pass-teach art-child-(pl)-acc
‘It is good that he [the teacher] teaches the child/the children.’
(10) che a-segui va’ekue

1S 1S-follow pst
ko edukasión rehegua lo-mitã(-kuéra*) apyté-pe
dem education concerning art-people middle-loc
‘I continued to be with people on education issues.’
Example (9) is ambiguous if taken out of context: mitã may refer to a number
of children or to one specific child. In (10) the homophonous lexeme mitã
‘people’ is grammatical if and only if Spanish lo is precliticized. Further-
more, the plural marker may occur in (9) without added meaning, but not in
(10) where it is ungrammatical.
4.3. Determiners and pro-forms
Guaraní boasts a complex system of deictics used to mark not only definite-
ness but also spatial relations and other referential functions. Spanish articles
have been added to this system as determiners and pro-forms.
4.3.1. The Spanish article as determiner
The Spanish article may be used in PG as a determiner with both native and
non-native nouns. Not all Spanish articles are borrowed: only the feminine
singular la and the plural masculine/neuter los. The latter has dropped the
final /s/ to become simply lo. Of these two, lo is used quite rarely and with
plural nouns only. By contrast, la is used very frequently and without
number distinction. Examples (9) and (10) above are illustrative of both uses.
In the function of determiner, la and lo are pronounced unstressed and tend
to form one phonological word with their heads. A possessive may occur be-
tween the determiner and the head as shown in example (11):
(11) ij-apyte-pe-kuéra o-u la che tio

3poss-middle-loc-pl 3S-come dem 1poss uncle
ha o-henoi la iñ-ermano-kuéra.
and 3-call dem 3poss-sibling-pl
‘My uncle came with them and then called his brothers and sisters.’
The order of constituents in the noun phrase in (11) cannot be explained by
Spanish influence, because articles never occur before possessives in
Spanish. The reason is to be found rather in PG grammar. The possessive in
PG does not perform the same referential function as the possessive in Span-
ish and must be accompanied by native or Spanish-derived determiners. Con-
sider the following example:
(12) nd-ai-kuaá-i pe nde róga.

neg-1S-know-neg dem 2poss house
‘I don’t know your house.’ [lit. I don’t know that your house]
In a proximal-distal dimension, pe refers to an object far from the speaker but
near to the addressee. Some authors classify pe as a deictic of visual reference
(Trinidad Sanabria 2004: 696). An accurate identification of referents in the
communicative setting defined by speaker and addressee seems to be relevant
in PG. This task cannot be carried out by possessives, hence the widespread
use of demonstratives, including the Spanish article in constructions like (11)
above.
Spanish-derived la also occurs with one of a set of tense-marked nomin-
alizers (Gregores and Suárez 1967: 128) which include present va (13), past
va’ekue (14/15) and future va’erã (16). These nominalizers make nouns out
of phrases. Here are some examples:
(13) la o-ñe-mbo’é-va nda-ha’e-i

dem 3-pass-teach-nmlz.prs neg-3.be-neg
[la misma cosa]8 la o-ñe-ñe’ẽ-va
[the same thing] dem 3-refl-speak-nmlz.prs
‘What is taught is not the same as what is spoken.’
(14) la o-u-ypy-va’ekue peteĩ tio

dem 3-come-first-nmlz.pst one uncle
‘One uncle who came first’
(15) che ru la o-mano-ma-va’ekue o-japo doce año.

1S father dem 3-die-already-nmlz.pst 3-do twelve year
‘My father, who died twelve years ago.’
(16) nd-ai-kuaa-i la ha’e-va’erã.

neg-1S-know-neg dem 1S.say-nmlz.fut
‘I do not know what I would say.’
When used along with la, nominalizers make relative clauses, which may be
either restrictive (14) or non-restrictive (15). These constructions are used
when the nominalized clause is the subject (13) or the object (16) of the main
clause. To nominalize a clause that stands in oblique relation to the main
clause, the relativizer ha without la is used instead.
4.3.2. The Spanish article as a pro-form
Other uses to which the Spanish article is put in PG include several types of
pro-form. Pro-forms are expressed only by the Spanish article /la/, which is
used either as a freestanding pronoun or as a relativizer. Freestanding forms
may be used in two basic co-referential functions: cataphoric (17) and ana-
phoric (18)–(19).
(17) nda-che-tiempo-i la a-japo hagua otra cosa.

neg-1S-time-neg pro(x) 1S-do for (other thing)(x)
‘I don’t have time to do other things.’
(18) alguno-ko no-ñe’ẽ-i-ete la kastellano,

some-dem neg-speak-neg-very dem Spanish(x)
oi-ke-rõ eskuela-pe-nte la ña-aprende-pa.
3-come-when school-loc-only dem(x) pl-learn-compl
‘Some [of us] don’t speak Spanish, only when we go to school, we
learn it.’
(19) arema rei-ko nde ko Hernandarias-pe?

long.time 2S-live 2S (dem Hernandarias-loc)(x)
arema ai-me-te voi la a-nace ko’ápe.
long.time 1S-be-very thus dem(x) 1S-be.born here
‘Do you live here in Hernandarias for long time? – Long time I live
here where I was born.’
In (17) la refers forward to the noun phrase otra cosa ‘other thing’ while in
(18) the second instance of la refers back to the noun phrase la kastellano ‘the
Spanish language’. As both examples make clear, la may stand not
only for bare heads but also for whole phrases, regardless of whether they con-
tain native lexemes or Spanish insertions. An illustrative case of how flexible
the use of Spanish-derived la may be in PG is found in (19), where la refers
back to the locative phrase ko Hernandariaspe ‘here in Hernandarias’. If the
second utterance stood by itself, la should be read as an adverbial equivalent
of ‘where’ in the English translation. A similar reading is suitable for (20),
where pro-form la refers back to the noun Brasil. Pro-nominal and adverbial
readings are satisfactory in both examples.
(20) che nda-se-guasu-i, Brasil-pe la a-ha.

1S neg-leave-much-neg Brasil-loc dem 1S-go
‘I don’t leave home too often, to Brazil (there) I have gone.’
4.4. Case marking
Contact-induced change is also visible in case marking of objects. The patient
and the recipient of a transitive verb are usually marked in PG by pe (or its
allomorph me in nasal environments). Although the presence of this marker
is obligatory with human objects, it is regularly dropped with direct objects,
resulting in seemingly ungrammatical constructions. Compare the following
examples:
(21) che memby-kuéra hasẽ-mba

1S child-pl cry-compl
che ména-pe o-guerahá-rõ preso.
1poss husband-acc 3S-take-when imprisoned
‘My children cried to death when my husband was taken to prison.’
(22) pe alumno-pe o-ñe-mbo’e va’erã pe

dem student-acc 3S-pass-teach oblg dem
i-lengua-materna-pe.
3poss-tongue-mother-loc
‘It has to be taught to students in their own language.’
(23) ij-apyte-pe-kuéra o-u la che tio

3poss-middle-loc-pl 3S-come dem 1poss uncle
ha o-henoi la iñ-ermano-kuéra.
and 3-call dem 3poss-sibling-pl
‘My uncle was among them and later he called his brothers and
sisters.’
The marker pe occurs in direct objects (21) and indirect objects (22). In the
first case it may be dropped, as shown in (23) where the noun phrase iñer-
manokuéra ‘3poss.siblings’ is the patient of the verb ohenoi ‘3.call’. Spanish
does not mark direct objects – except if they are proper nouns – but always
marks indirect objects. Accordingly, the elision of pe in PG cannot be attrib-
uted to the contact language. A similar elision of patient markers has been
observed in other languages in contact with Spanish such as Ecuadorian Que-
chua (Gómez-Rendón, in this vol.).
There are cases in which the Spanish preposition a – marking human dir-
ect objects – co-occurs with the corresponding Guaraní marker, as illustrated
in (24). This construction does not occur with native noun phrases, how-
ever.
(24) nda-ha’e-i ko’ága sekundária-pe

neg-3.be-neg today high.school-loc
e-je-eksihi [a los alumnos]-pe.
imp-pass-discipline [to art.pl student.pl]-acc
‘It is not that nowadays in high school students are disciplined.’
In my view these isolated cases are instances of code-switching rather than
borrowing. This assumption is confirmed by (25), where a whole
Spanish noun phrase is not marked by pe:
(25) o-ñe-me’ẽ-va’erã [a los padres].

3S-refl-give-oblg [to art.pl parent.pl]
‘It will have to be given to the parents.’
4.5. Local relations
Further evidence for contact-induced change in nominal structures comes
from the expression of local relations. There are two relations where Span-
ish borrowings occur: those expressed by the Spanish equivalents of English
‘between’ and ‘side’.
(26) entre mbovy pe-japó-raka’e la almacén.

among some 2PL-make-dist.past art store
‘Some of you set up the store.’
(27) upéi katu a-maña óga lado.

after then 1S-look house side
‘Afterwards I looked at the house’s side.’
Both loanwords are heads of their respective phrases. While the adpositional
phrase entre mbovy in (26) is head-first, the noun phrase óga lado in (27) is
head-last. On the other hand, all instances of entre show one argument as op-
posed to some uses of this preposition in Spanish (e.g. entre tú y yo ‘between
you and me’). Besides, the noun lado stands in a pseudo-possessive relation
to its modifier óga. The borrowing lado is grammaticalized in the expression
gotyo-lado, as shown in the following examples:
(28) mba’e la nde porte hína nde familia-kuéra gotyo-lado.

what art 2poss live prog 2poss relative-pl to-side
‘How is your life from the side of your family?’
(29) ro-ho kuri Brasil gotyo-lado.

1PL.excl.go rem.pst Brazil to-side
‘We went to Brazil.’
Interestingly, gotyo itself is a grammaticalization of koty ‘side’ and vo ‘for’.
It is not clear, however, whether gotyo-lado is fully equivalent to gotyo or im-
plies a different spatial relation. It appears that the correspondence between
the native structure and the mixed counterpart is not one-to-one since (28)
requires an ablative reading whereas (29) is best interpreted as allative case.
Arguably there exist grammatical borrowings from Spanish in location-
stationary relations. Consider the following sentence:
(30) o-ñe-guahẽ chupe upépe [a las dos].

3S-refl-arrive 3.acc there.loc [at the.pl two]
‘They arrived there at two.’
The fact that prepositions a ‘at’ and en ‘in’ appear only in time expressions as
heads of Spanish noun phrases suggests that we are dealing here with code
switches.
5. Verbal structures
Verbal structures show fewer contact-induced changes. These include (1) the
double marking of aspect; (2) the use of the verb ‘have’ instead of non-verbal
constructions; (3) the modification of valency-changing devices; and (4) the
integration of Spanish loan-verbs and the predicative use of non-verbal loans.
5.1. Double marking of aspect
Influence from Spanish in the marking of aspect is clear from the co-occur-
rence of the Spanish adverb ya ‘already’ alongside the native perfective
marker mã, as in (31).
(31) Kova ko masãna ya hi’ayu-pa-mã.

pro dem apple already 3.ripe-compl-prf
‘This apple is already completely ripe.’
The occurrence of the Spanish adverb ya in perfective constructions has resulted
in double-marked constructions. The loan adverb ya never occurs by itself
(Gregores and Suárez 1967: 154) but always with aspect markers – including
iterative jei ‘again’.
(32) pero ya a-ha-jei-ta-mã-hina.

but already 1S-go-again-fut-prf-prog
‘but I will be going already.’
5.2. The verb ‘have’ replacing non-verbal constructions
Another contact-induced phenomenon in verbal structures is the use of the
verb guereko ‘have’ instead of non-verbal constructions. Given the limited
use of this verb in pre-contact Guaraní – where it expresses possession – the
use of guereko in contemporary PG may be attributed to Spanish,
from which PG has calqued a number of syntactic constructions with
the verb tener ‘have’. The result is twofold: the decreasing use of non-verbal
alternatives and changes in the valency structure of certain verbs. Consider
the following example:
(33) o-porandu chéve mbovy mitã-pa a-guereko

3-ask 1S.acc how.many child-int 1S-have
Ha a-guereko dies ha’e chupe.
and 1S-have ten 1S.say 3S.acc
‘He asked me how many children I have, and I told him that I have ten.’
In pre-contact varieties of Guaraní, sentence (33) is not expressed by a predication.
The lexeme ra’y ‘son of a father’ indicates also ‘having a son’. To further
specify the subject (father) and the object (son), a person marker is prefixed to
the noun and followed by a modifier. The result (34) is radically different:
(34) o-porandu chéve mbovy che-ra’y.

3-ask 1S.acc how.many 1S-son
ha che-ra’y pa ha’e chupe
and 1S-son ten 1S.say 3S.acc
‘He asked me how many children I have, and I told him that I have ten.’
In similar terms, the predicative construction in (35) differs from its non-
predicative counterpart in (36) because the latter requires only the noun h-oga
‘3.house’, that is, the noun inflected for third person.
(35) Pa’i A. o-guereko-va’ekue mbohapy hóga Kapi’ipé-pe.

priest A. 3S-have-pst three house Kapi’ipé-loc
‘Father A. had three houses in Kapi’ipé.’
(36) Pa’i A. hóga-va’ekue mbohapy Kapi’ipé-pe.

priest A. 3S-house-pst three Kapi’ipé-loc
‘Father A. had three houses in Kapi’ipé.’
The verb ‘have’ also occurs in a number of Spanish expressions such as tener
poder sobre ‘to have influence on’ or tener la culpa ‘to be guilty’, many of
which include whole noun phrases from Spanish. Example (37) illustrates
the first expression:
(37) o-guereko tuicha poder ore apyté-pe.

3S-have big power 1pl.excl middle-loc
‘He exerted big power upon us.’
A further construction involving the verb ‘have’ is a syntactic calque from
the Spanish periphrastic construction of obligation, as shown in (38) below.
In this case, the verb ‘have’ co-occurs in a serial verb construction even if the
corresponding tense-mood marker va’erã is present also on the second verb.
(38) che a-guereko a-kobra-ve-va’erã.

1S 1S-have 1S-collect-more-oblg
‘I have to collect more [money].’
The presence of guereko in serial verb constructions like (38) was uncommon
in pre-contact Guaraní. The use and function of the verb ‘have’ in PG have
been restructured under the influence of Spanish.
5.3. Contact-induced valency changes
Contact-induced valency changes include (1) the use of the Spanish clitic se
as a model for the use of the Guaraní reflexive ñe; and (2) the use of the causa-
tive marker mbo with Spanish-derived verbal lexemes.
5.3.1. Spanish clitic se modeling the use of ñe
The proclitic se is used both for reflexive and impersonal passive constructions
in modern Spanish. There are cases, however, where the use of se results in am-
biguous constructions. On the basis of this model, the prefix ñe in PG may re-
ceive reflexive and passive interpretations. Consider the following examples:
(39) Upérõ avei campesino o-ñe-mombarete.

then also peasant 3S-refl-strengthen
‘Then the peasants got stronger.’/‘The peasants were strengthened.’*
(40) A-guereko gueteri upe kuatia-i

1S-have still dem book-dim
upérõ o-ñe-me’ẽ-va’ekue chéve.
then 3S-pass-give-pst 1S.dat
‘I still have the little book I was given.’
The Guaraní marker ñe admits passive and reflexive readings alike. Some-
times, however, only one reading is possible and contextual clues are re-
quired for disambiguation. Example (39) cannot be passive if interpreted in
a broader context: peasants were not strengthened but became stronger by
themselves despite the persecution they suffered. By contrast, example
(40) accepts a passive interpretation.
The question is to what extent the use of ñe constructions is influenced
by the inherent ambiguity of Spanish se. One way to explore the complex
relations between contact-induced uses of reflexivity and passive voice is
to analyze how ñe is used with Spanish loan-verbs. Consider this case of a
potentially ambiguous use of ñe:
(41) jamás na-ñe-komunika-mo’ãi

never neg-recp-communicate-cond
la ña-ñe-komunika háicha la Guaraní-me.
pro 1PL-recp-communicate like pro Guaraní-loc
‘They would never communicate the way we communicate in
Guaraní.’
For a correct interpretation of this example, it is necessary to analyze the
meaning of the verb in question. The Spanish verb may be transitive with
the meaning of ‘notify’ or intransitive with the meaning of ‘have an inter-
change of words’. Only the first meaning allows for reflexivity and passive-
ness. Both occurrences of ñekomunika in (41) mean ‘have an interchange of
words’ and cannot be interpreted as passive or reflexive. As the verb in the
second clause is inflected for first person plural inclusive, it becomes evident
that the only possible reading of ñekomunika is reciprocal in both cases. In
fact, example (41) refers to the communication between Spanish monolin-
gual speakers and bilingual PG speakers in Paraguay. What is relevant here
is that the speaker marks the reciprocal relation by means of ñe instead of
the reciprocal ño. This shows that reflexivity, passiveness and reciprocity are
conflated all in one verbal prefix, just like Spanish fuses the three categories
in one clitic (i.e. se).
5.3.2. Causativization of loan-verbs
A common valency-increasing strategy in PG is the causativization of transitive
(42) and intransitive (43) loan-verbs by prefixing the causative marker
mbo/mo to the root.
(42) la autorida-kuéra o-gueraha la i-kuatia o-mo-firma.

art authority-pl 3-take art 3poss-paper 3-caus-sign
‘The authorities took their paper to have them sign it.’
(43) o-ñe-ha’ã lomitã o-mo-nace peteĩ modelo pyahu.

3-refl-effort people 3-caus-be.born one model new
‘People try to create a new model.’
Person-reference markers including reflexives and reciprocals must precede
the causative. In a few cases the causative occurs before personal inflection.
The following sentence is an example:
Upe kuatia o-ha’ã-va o-mbo-jo-topa

dem paper 3-effort-nmlz 3-caus-recp-meet
obispo ha dirigente-kuéra
bishop and leader-pl
‘That document tried to reunite bishops and leaders.’
The order of morphemes in this case could be explained by the meaning of the
loan-verb topar ‘to bump’, resemanticized in PG as ‘to meet’. This meaning is
possible also in Spanish, provided the verb takes the reciprocal se. In (44) the
native reciprocal jo is attached to the loanword to give the same meaning as
in Spanish. PG has thus borrowed not only the phonetic form of the word but
also its potential meanings, the latter realized through native morphology.
A final issue related to the causativization of Spanish-derived lexemes
has to do with their status in the source language. In principle only verbs are
borrowed from Spanish as heads of predicate phrases. However, this is not
always the case. The predicative use of non-verbal loans is discussed in the
next section.
5.4. Integration of Spanish verbs and predicative uses of non-verbal loans
Spanish verbs are borrowed in PG by deleting the infinitive ending /-r/, with-
out any derivative marker (e.g. komunika ‘communicate’). The causative
mbo/mo may occur on intransitive loan-verbs which are thus transitivized
with a slightly different meaning from the original (e.g. mo-nace ‘create’).
The causative can occur also on transitive verbs to give them a mediative
meaning (e.g. mo-firma ‘to have someone sign’), in which case the prefix
mbo/mo performs the same function as the suffix uka.
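The integration pattern just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a morphological analyser: the function names are hypothetical, the choice between the allomorphs mbo and mo is left to the caller rather than computed, and the orthographic adaptation seen in forms such as komunika (Spanish c written as k) is deliberately left out.

```python
def integrate_loan_verb(spanish_infinitive: str) -> str:
    """Return the loan-verb stem: the Spanish infinitive minus its final -r.

    Hypothetical helper; spelling adaptation (e.g. c > k) is not modelled.
    """
    if not spanish_infinitive.endswith("r"):
        raise ValueError(f"not an infinitive in -r: {spanish_infinitive!r}")
    return spanish_infinitive[:-1]


def causativize(stem: str, allomorph: str) -> str:
    """Prefix the causative marker; the caller supplies 'mbo' or 'mo'."""
    if allomorph not in ("mbo", "mo"):
        raise ValueError("allomorph must be 'mbo' or 'mo'")
    return f"{allomorph}-{stem}"


# Forms taken from the examples cited in the text.
print(integrate_loan_verb("comunicar"))                   # comunica
print(causativize(integrate_loan_verb("nacer"), "mo"))    # mo-nace
print(causativize(integrate_loan_verb("firmar"), "mo"))   # mo-firma
```

Keeping the allomorph as an explicit argument avoids committing the sketch to any particular account of how mbo and mo are distributed.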
Nouns, adjectives and whole phrases may be used also predicatively in PG
through two different mechanisms: (1) the causative is prefixed to the non-
verbal lexeme; (2) the non-verbal lexeme or the phrase is integrated without
further marking. Whereas the first mechanism occurs more frequently with
nouns (45), the second is common with adjectives and phrases (46)–(47).
(45) o-moĩ i-pyti’á-re i-po,

3-put 3poss-chest-on 3poss-hand
o-mbo-kurusu ha he’i chupe-kuéra.
3-caus-cross and 3-say 3.acc-pl
‘He put his hand on his chest, made the sign of the cross and said to
them.’
(46) i-provechoso-va’erã pe i-vida-diaria-pe.

3.be-useful-oblg dem 3poss-life-daily-loc
‘It has to be useful in his daily life.’
(47) nda-i-deprovecho-mo’ãi chupe la Guaraní.

neg-3.be-useful-cond 3.acc art Guaraní
‘Guaraní wouldn’t be useful for him/her.’
The loan noun kurusu in (45), from Spanish cruz ‘cross’, is transformed into a
verb by the causative marker. In (46) the adjective provechoso ‘useful’ has the
present third-person marker and serves as the head of the predicate phrase.
The Spanish prepositional phrase de provecho ‘of benefit’ in (47) is assimi-
lated as a single unit and receives verbal inflection just like the loan adjective
in the previous example. This particular behavior of loanwords in PG is ex-
plained through the system of parts of speech of PG, in which lexemes from
several word classes are used flexibly. The classification of parts of speech in
PG is addressed in Section 9.
6. Other parts of speech
Spanish grammatical borrowings include parts of speech other than nouns
and verbs. This section addresses the influence of Spanish on numerals and
quantifiers, pronouns, connectors, discourse markers and time adverbs. Sev-
eral contact phenomena analyzed in detail in previous sections are mentioned
here only in passing – e.g. the pronominal use of the Spanish article.
6.1. Numerals and quantifiers
The presence of Spanish numerals in PG is massive. Pre-contact Guaraní had
a five-value numerical system. Nowadays it co-exists with a Spanish-based
10-value system in PG. Recently efforts have been made to expand the sys-
tem through neologisms, but their actual use by the speech community is
limited to writing. While some Spanish numerals have been assimilated (e.g.
by dropping the final /s/, dos ‘two’ > do), others are pronounced exactly as
in Paraguayan Spanish. The limitations of the vernacular system resulted in
the nearly total replacement thereof in Hispanicized urban sociolects or in
the coexistence of both systems in rural sociolects less immersed in a market
economy. These observations are valid both for cardinals and ordinals. Quan-
tifiers include only the loanwords todo ‘all’ and poco ‘few’.
6.2. Indefinite pronouns and determiners
A number of Spanish indefinite pronouns and determiners have expanded the
original Guaraní inventory. Spanish pronouns and determiners are used for
people and things in PG. They include known, negative, universal and other
referents. Pronouns and determiners are presented in Table 1 alongside their
semantic values.
Table 1. Spanish-derived pronouns and determiners in PG
            Known    Unknown  Negative  Universal   Other
Person      alguno   –        nipeteĩ   enteroveva  otro
Determiner  alguno   –        nipeteĩ   entero      otro
Thing       alguno   –        nipeteĩ   enteroveva  otro
(48) alguno o-maneja-ve ko situasión.

some 3-control-more dem situation
‘Some people controlled this situation better.’
(49) oi-ko alguno líder o-gusta-háicha.

3-be some leader 3-like-so
‘There were some leaders that liked it that way.’
(50) nipeteĩ na-i-ñapysẽ-i ore rendá-pe.

nobody neg-3-appear-neg 1pl.excl.poss house-loc
‘Nobody showed up by our house.’
(51) che nipeteĩ parte nd-a-juhú-i i-vai-ha.

1S no part neg-1S-find-neg 3.be-bad-rel
‘I found that no part was bad.’
(52) ro-japo reunión entero socio roi-mé-va.

1pl.excl-do meeting all partner 1pl.excl-be-nom
‘We had a meeting among all partners.’
(53) Piribebýi o-je-aparta-ité-voi ha o-moĩ otro téra.

Piribebýi 3-ref-apart-very-AFF and 3-put other name
‘Piribebýi separated and adopted another name.’
Indefinite determiners occur often with Spanish-derived noun heads (49),
although cases are attested which show such determiners alongside native
PG lexemes (53). Spanish indefinite pronouns are always borrowed in their
masculine form (48), (52), and (53). No indefinite pronouns are borrowed for
unknown referents. The negative pronoun nipeteĩ ‘nobody/nothing’ is a loan
blend of Spanish ni ‘nor’ and peteĩ ‘one’ (50). The same form is used as a
negative determiner in (51). The universal determiner entero ‘all’ is borrowed
in masculine form but used without gender distinction. The same form used
as a universal pronoun requires the nominalizer va.
6.3. Connectors
I have shown in Section 3 that Spanish conjunctions are borrowed in PG as
clause linkers. Spanish connectors occur both in rural and urban sociolects,
in adult and child speech. An inventory of Spanish connectors with their
PG equivalents is given in Table 2.
Table 2. Simple connectors from Spanish and their PG equivalents
Coordination         Subordination
PG        Sp         PG       Sp
terã      o          -rõ      si
ha/katu   pero       -re      porke
ha        y          -ramo    aunke
                     -ha      ke
The first criterion to classify loan connectors is whether they are com-
pound or not. A further criterion is the type of relation they establish between
clauses: connectors that link clauses at the same level (coordinators) and con-
nectors that link subordinate clauses to main clauses (subordinators). Some
examples of simple connectors follow.
(54) ndo-ro-japó-i mba’eve i-cóntra-pe pero ro-torva

neg-1pl.excl-do-neg nothing 3-against-loc but 1pl.excl-annoy
ichupe porque ha’é-nte o-rrekohe-paité-va’erã mandyju
3.acc because 3S-only 3-collect-compl-oblg cotton
‘We did nothing to her but we annoyed her because she was the only
one who had to collect cotton in the area.’
(55) ko’ãga peve livro o-ú-va castellano-pe-meme o inglés

today until book 3-come-nmlz Spanish-loc-usually or English
‘Until today the books that come are usually in Spanish or English.’
(56) o-ñe’ẽ la guaraní-me

3-speak art Guaraní-loc
si ha’e-kuéra oi-pota la kampesino vóto
if 3-pl 3-want art peasant vote
‘They speak in Guaraní if they want to get peasant’s vote.’
(57) San-Ignacio-gua no-ĩ-ri ko tembiapo ndive,

San-Ignacio-abl neg-3.be-neg dem work with
aunke la juventú San-Ignacio-gua o-apoya avei
although art youth San-Ignacio-abl 3-support also
‘People from San Ignacio didn’t participate, though San Ignacio youth did give their
support.’
The most frequently used Spanish connectors are porke ‘because’ and pero
‘but’. Coordinator y is used mainly to link code switches. The difference be-
tween Spanish subordinators and Guaraní subordinators lies in the fact that
the latter are postpositional. The only exception is subordinator ke, which oc-
curs in different constructions analyzed in the next section.
6.3.1. Subordinator ke
Besides the above-mentioned connectors, PG has incorporated the Spanish
subordinator ke ‘that’ in four different constructions: after some Spanish prep-
ositions used as conjunctions (58); in some time expressions to link the
adverbial to the clause (59); in indirect quotations, with or without the
verb ‘say’ (60); and in adjective/adverb comparison to link the compared
terms (61).
(58) Durante ke ha’e o-u, n-o-pená-i ore-rehé

during that 3S 3-come neg-3-worry-neg 1pl.excl-about
‘During the Father’s visit, nobody worried about us.’
(59) El dia ke pe jevy pende rape vaí-gui

art day that dem again 2pl.poss way bad-abl
‘On the day [when] change your bad habit again.’
(60) Pe ka’aru katu ou jevý-ma citación,

dem afternoon well 3-come again-prf notice
ke karai Isaac t-o-hóje t-o-ñe-presenta-mi
that mister Isaac, imp-3-go imp-3-rfl-report-mit
‘That afternoon a notice came saying Isaac has to report.’
(61) i-kuenta-vé-ta ña-ñe’ẽ inglés ke la Guaraní.

3.be-count-more-fut 1PL-speak English than art Guaraní
‘The fact that we speak English will count more than we speak Guar-
aní.’
Example (58) illustrates one of the few cases of Spanish prepositions in PG.
Unlike the English translation – in which during is the head of the prepos-
itional phrase – the compound connective durante que9 links two independ-
ent verb phrases. In (58) the speaker is using the preposition durante as equiv-
alent to the conjunction mientras on the basis of their similar semantics. The
subordinator ke in (59) simply connects a relative phrase to a noun phrase.
Example (60) shows the subordinator ke in a sort of quotative construction
signaling indirect speech. In this case no reportative verb such as ha’e ‘to say’
or ndaje ‘it is said’ is present. Ha’e always precedes its complement in pre-
contact PG while ndaje may occur either clause-initially or clause-finally.
When the subordinator is used with the reportative, ndaje occurs only clause-
initially. Cases like (60) illustrate the ways in which the borrowing of Spanish
connectors has influenced PG syntax. Another use of subordinator ke with
far-reaching consequences for PG morphosyntax is the one illustrated by ex-
ample (61). In this case, ke links two terms of a comparison. The use of the
subordinator has induced two changes at the level of morphology and syntax:
one is the drop of the ablative marker gui on the second term of the com-
parison (in this case, the noun phrase la Guaraní); the other is the obligatory
position of the second term after the subordinator – where in pre-contact PG
the second term may be located somewhere else in the sentence, provided it
has the comparative marker. Similar contact-induced changes in comparative
constructions have been observed in Ecuadorian Quichua (Fauchois 1988).
6.4. Discourse markers
Discourse markers borrowed from Spanish are ubiquitous in PG. Some loan
connectors and adverbs are used as discourse markers, for which reason it is
difficult to distinguish one use from the other. The double function of some
Spanish connectors and adverbs has facilitated their use as discourse markers
in PG. Among the most frequent Spanish discourse markers used in PG are:
the causal-consecutive entonces ‘then’; the appositives o sea ‘that means’ and
por ejemplo ‘for example’; and the resumptive bueno ‘well’. The analysis of
Spanish discourse markers in PG is complicated by the fact that they usually
appear on code-switching boundaries.
6.5. Time deixis
The main contact phenomena in time deixis in PG include the borrowing of
the days of the week in unassimilated or assimilated form. Time adverbs from
Spanish include ahora ‘now’, siempre ‘always’, tedia ‘today’ (from este día,
‘this day’) and entonces ‘then’. As shown in the previous section, it is dif-
ficult to classify entonces and ahora as time adverbs since they are used also
as discourse markers. In such cases the only way to tell one use from the other
is by knowing whether they are pronounced within the intonation contour of
the clause (adverbs) or independently (discourse markers).
7. Constituent order
Constituent order in PG is relatively free, with SVO as the unmarked order in
the clause (Gregores and Suárez 1967: 182). This feature makes it difficult to
track any contact-induced change in word order and explain new syntactic de-
velopments in terms of Spanish influence. There are, however, certain struc-
tures clearly influenced by Spanish. These were addressed already in previous
sections: the order of elements in possessive constructions (Section 4.1); the
order of elements in the noun phrase with respect to articles and possessives
(Section 4.3.1); and the use of the Spanish subordinator in comparative con-
structions (Section 6.3.1).
8. Syntax
Apart from syntactic structures concerning coordination and subordination,
both addressed in Section 6.3, there are two other contact phenomena in the
field of syntax that might be attributed to Spanish influence. On the one hand,
there is evidence that declarative sentences are transformed into questions by
receiving Spanish interrogative intonation contours, without using the inter-
rogative particles piko or pa. On the other hand, the first part of the negative
circumfix nda-i, when attached to third-person finite verbs, is often realized
as no- instead of ndo-. This form resembles the Spanish negative adverb no.
However, the nasal onset in both forms prevents us from being conclusive in
this respect, since a replacement of the negative circumfix with the Spanish
adverb would imply the elision of the second part of the circumfix, which is
not the case. Spanish influence is thus limited to morpho-phonemics.
9. Lexicon
Spanish lexical borrowing is pervasive in PG. All semantic fields in PG show
traces of the contact language. From religion and kinship to idiomatic and
context-bound expressions, Spanish words have entered PG in large num-
bers, with or without assimilation (cf. Section 2).10 However, there is a differ-
ence in the number and type of loanwords depending on the sociolects, with
more Hispanicized lexicon in urban than rural varieties.
Except for pronouns, all lexical classes contain loanwords from Spanish.
Of course, the contribution of Spanish lexemes to each class is different. Ac-
cording to the results from a corpus of spontaneous speech collected in urban
and rural areas, the presence of Spanish borrowings in PG amounts to one
fifth of the total number of lexemes, excluding code switches. If these are
included, the presence of Spanish in PG is even greater. An analysis of the
same corpus in terms of lexical classes identified gave nouns (37.2%), verbs
(18.3%), adjectives (7.4%) and manner adverbs (0.9%). The most frequent
grammatical borrowings included articles (19.1%), conjunctions (7.5%), nu-
merals (1.7%), discourse markers (0.8%), adpositions (0.5%) and non-per-
sonal pronouns (0.2%). Spanish grammatical borrowings either replace, or
co-exist with, native morphosyntactic mechanisms. Spanish lexical borrowings
are not always used in their prototypical functions in PG. The analysis
of Spanish loanwords for syntactic position showed that even if prototypical
functions are the most frequent for nouns (heads of referential phrases) and
adjectives (modifiers of referential phrases), these are often used in other
syntactic positions. Thus, Spanish nouns are also used as verbs and adjectives
whereas adjectives are used as verbs, nouns and adverbs. The following ex-
amples show the flexible use of Spanish loanwords:
(62) la che gente-kuéra che rú-gui o-lado.
dem 1.poss family-pl 1.poss father-abl 3-side
‘My family sides with my father.’
(63) la mbo’ehára Guaraní i-fanático.
dem teacher Guaraní 3-fanatic
‘That Guaraní teacher is a fanatic.’
(64) o-ñe’ẽ atravesado la Guaraní.
3-speak crossed art Guaraní
‘They speak Guaraní confusedly.’
The Spanish noun lado in (62) is used as the head of a predicate phrase with
the meaning of ‘align oneself’, without any derivation whatsoever. In simi-
lar terms, the adjective fanático in (63) is used as the head of a predicate
phrase (‘to be fanatic’) without any copula or derivation. Finally, the Spanish
adjective in (64) is used as the modifier of the predicate oñe’ẽ ‘they speak’,
meaning ‘in a confused manner’. This flexible use of Spanish borrowings in
PG mirrors the parts of speech system of the recipient language, where lex-
emes are grouped in two classes: verbs and non-verbs. The latter includes
Spanish nouns, adjectives and adverbs.11
10. Conclusion
Like many other Amerindian languages, Paraguayan Guaraní has been in
contact with Spanish for the last 450 years. Unparalleled in other Latin
American countries, the history of Spanish-Guaraní contact has produced a
large bilingual population and made the two languages converge.
Although it is not analyzed here, the influence of Guaraní on Paraguayan
Spanish is no less important.
In this chapter I have described the influence of Spanish on the grammar
of Paraguayan Guaraní. Contact-induced changes in grammatical structures
are especially noticeable in the noun phrase. All parts of speech show traces
of matter or pattern borrowing. Syntactic structures have been no less affected,
particularly in relation to the use of Spanish connectors in clause linking
and the calque of Spanish constructions. The all-encompassing influence of
Spanish has led some linguists (cf. Melià 1992) to claim that Paraguayan
Guaraní is not Guaraní anymore but a third language resulting from the mix-
ture of both languages in contact.
Notes
1. In what follows I use the term pre-contact Guaraní to refer to the Guaraní
language as spoken before the Spanish conquest. For a thorough description of
pre-contact Guaraní, see Ruiz de Montoya’s Arte y bocabulario de la lengua
guaraní (1994 [1640]).
2. The seminal work on the influence of Spanish on Guaraní is Morínigo’s
Hispanismos en el Guaraní (1931). In the last two decades the study of the
linguistic outcomes of Spanish-Guaraní contact has received increasing attention.
Worthy of mention are a number of articles published in the two volumes
of Sociedad y Lengua: Bilingüismo en el Paraguay (1982), which scrutinize the
linguistic and sociolinguistic aspects of language contact in Paraguay from dif-
ferent perspectives.
3. For full data on the census, visit the website www.dgeec.gov.py.
4. While current bilingual programmes in Paraguay promote Jopara as the language
of schooling, several efforts have been made in the last decade to ‘cleanse’
Guaraní by producing prescriptive grammars and dictionaries to fill lexical gaps
by introducing neologisms.
5. The examples used in this chapter come from three sources: Atlas Lingüístico
Guaraní-Románico, Sociología-Comentarios (2002); Kokuera Rembiasa: Expe-
riencias campesinas 3 vols. (1992); and my own corpus, collected during a three-
month fieldwork visit to rural and urban Guaraní-speaking areas in Paraguay.
6. Other cases in which the Spanish preposition is used instead of simple juxtapos-
ition include partitive constructions.
7. The original gender distinction in Spanish coded in feminine la and masculine/
neuter lo is absent in PG since la is used both for male and female while all cases
of lo include only plural nouns regardless of gender. Both pre-contact Guaraní
and modern PG lack grammatical gender. Spanish adjectives and nouns are bor-
rowed along with their original gender endings, but this does not mean that gen-
der is grammaticalized in PG.
8. In every example where a code switch occurs, this will be identified between
square brackets.
9. This construction is ungrammatical in Spanish. To coordinate two simultaneous
events the conjunction mientras is used, with or without que.
10. For a comprehensive study on Spanish loanwords in Paraguayan Guaraní, see
Morínigo (1931).
11. An extensive cross-linguistic study of the use of Spanish loanwords in three
typologically different Amerindian languages, including PG, is presented elsewhere
(Bakker, Gómez-Rendón and Hekking, forthc.).
References
Bakker, Dik, Jorge Gómez-Rendón, and Ewald Hekking
Forthc. Spanish meets Guaraní, Otomí and Quichua: A multilinguals confron-
Comisión Nacional de Rescate y difusión de la Historia Campesina
1992 Kokuera Rembiasa: Experiencias Campesinas. Asunción: CEPAG
Corvalan, Grazziella, and Germán de Granda (eds.)
1982 Sociedad y bilingüismo en el Paraguay. Asunción: Centro Paraguayo
de Estudios Sociológicos.
Fauchois, Anne
1988 El quichua serrano frente a la comunicación moderna. Quito: Edito-
rial Abya Yala.
Gregores, E., and C. Suárez
1967 A Descriptive Grammar of Colloquial Guaraní. The Hague: Mouton.
Guasch, Antonio
1997 Gramatica de la lengua guaraní y antología de prosa y verso. Asun-
ción: CEPAG.
Melià, Bartomeu
1992 La lengua guaraní del Paraguay: historia, sociedad y literatura. Ma-
drid: MAPFRE.
Morínigo, Marcos
1931 Hispanismos en el guaraní. Madrid: Editorial Espasa Calpe.
Ruiz de Montoya, A.
1994 [1640] Arte y bocabulario de la lengua guaraní. Asunción: CEPAG.
Thun, Harald (ed.)
2002 Atlas lingüístico guaraní-románico: Sociología. Kiel: Westensee Ver-
lag.
Trinidad Sanabria, Lino
2004 Diccionario Avañe’ẽ ilustrado. Buenos Aires: Editorial Océano.
Velazquez-Castillo, Maura
1995 Noun incorporation and object placement in discourse: The case of
Guaraní. In P. Downing and M. Noonan (eds.), Word Order in Discourse,
555–580. Amsterdam: John Benjamins.
Velazquez-Castillo, Maura
2002 Grammatical relations in active systems: The case of Guaraní. Functions
of Language 9 (2): 133–167.
Grammatical borrowing in Hup
Patience Epps
1. Background
The Hup language1 (also known as Hupda or Jupda; see Epps 2005a) is spo-
ken by approximately 1500 people in the Vaupés region, located on the bor-
der of Brazil and Colombia. Hup is a member of the small, under-described
Nadahup or Makú family,2 whose four established members all live within
the Upper Rio Negro region of Amazonia. These are Hup’s closest sister
Yuhup (Ospina 2002), the more distant Dâw (S. Martins 2004), and the most
distant relative Nadëb (Weir 1984), as measured in both genealogical (genet-
ic) and geographic distance. The languages Kakua/Nukak and Puinavé have
also frequently been classified together with the Nadahup languages (see,
e.g., Loukotka 1968), but their relationship has yet to be conclusively dem-
onstrated.
Hup is currently fully viable, learned as a first language by all Hup chil-
dren. However, by the time they are adults, virtually all Hupd’əh (‘person-
pl’, ‘Hup people’) are fluent in Tukano, the most widely spoken language of
the Eastern branch of the Tukanoan family, of which some dozen or more
members are spoken in the Vaupés region. This bilingualism on the part of
the Hupd’əh is not unusual in the context of the Vaupés, which is well known
for the striking multilingualism of its inhabitants, rooted in the local practice
of linguistic exogamy. According to this practice, marriage obligatorily takes
place across group or ethnic boundaries, which are defined primarily by the
language their members speak (e.g. Sorensen 1963, Jackson 1983). The near-constant
contact among local languages that this practice fosters has led to
the identification of the Vaupés as a linguistic area (which itself has numerous
features in common with the larger Amazonian region). The effects of diffu-
sion have been documented in detail for the Arawak language Tariana, which
has undergone significant grammatical influence from Tukanoan (Aikhen-
vald 2002, etc.).
Unlike the Tukanoan and Arawak peoples of the Vaupés, the Nadahup
peoples do not practice linguistic exogamy, preferring to marry among clans
within their own ethnic/linguistic groups. However, the Hupd’əh (and the
Yuhup, who also live within the Vaupés) are in close socioeconomic contact
with Tukanoan speakers. As forest-dwellers and masters of hunting and
gathering, the Hupd’əh have traditionally supplied labor, meat, and other for-
est products to their Tukanoan neighbors, who live along the rivers and prac-
tice fishing and manioc farming for subsistence. In exchange, the Hupd’əh
are given manioc products and other trade goods, but are considered socially
inferior by the Tukanoans and the Tariana. Accordingly, the burden is on the
Hupd’əh to learn Tukano to communicate with their neighbors, whereas the
latter have in general no interest in learning Hup. This one-sided bilingualism
appears to have been in place and stable for many generations.
This sociolinguistic situation has led to a profound Tukanoan influence on
Hup, such that Hup is clearly involved in the Vaupés linguistic area (see Epps
2007). As is the case with Tariana (Aikhenvald 2002, etc.), this influence
primarily involves the borrowing of grammatical categories, rather than the
actual forms of morphemes (i.e. ‘pattern’ rather than ‘matter’); this can be at-
tributed to Hup speakers’ awareness of the local taboo against language mix-
ing, despite the fact that they are not part of the linguistic exogamy system
that fosters it. Accordingly, speakers resist the borrowing of actual forms, of
which they are more aware, while less consciously rearranging the categories
and organization of their grammar to conform to those of Tukano.
This chapter focuses on the effects in Hup of unilateral diffusion from
Tukano, with possible influence from other Tukanoan languages as well. Be-
cause this diffusion has involved mostly substance rather than form, it may
be difficult to determine with complete certainty whether a given similarity
is indeed due to borrowing, rather than to inheritance or independent innova-
tion. In general, positive evidence for borrowing is taken to be the presence
of a feature both in the Tukanoan languages and in Hup, and its absence in
other Nadahup languages; however, features that are typologically very com-
mon are considered suspect as candidates for diffusion, since the chance that
they were independently innovated in Hup and Tukano is relatively high.
Furthermore, because Yuhup and to some extent Dâw have also undergone
contact with Tukanoan languages, in some cases hypotheses regarding diffu-
sion rest on the feature’s absence in Nadëb alone, and the possibility that the
feature existed in the proto-language and was later lost from Nadëb may be
difficult to rule out. The situation is also complicated by the fact that detailed
descriptions of Nadahup languages exist only for Hup (Epps 2005a) and Dâw
(Martins 2004).3
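The diagnostic reasoning set out in this paragraph can be restated schematically. The following Python sketch is offered only as an illustration under stated assumptions: the names `Feature` and `assess`, the verdict labels, and the encoding of each language's value as present, absent, or unknown are hypothetical and are not part of the method or database used in this chapter. The two sample entries are filled in from statements made elsewhere in the chapter (morpheme-level nasalization; the possessive marker placed between possessor and possessed).

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Feature:
    """Hypothetical record of where a grammatical feature is attested."""
    name: str
    in_hup: bool
    in_tukanoan: bool
    # Sister languages of Hup: True = present, False = absent, None = unknown.
    sisters: Dict[str, Optional[bool]] = field(default_factory=dict)
    typologically_common: bool = False


def assess(f: Feature) -> str:
    """Return an illustrative verdict on diffusion from Tukanoan into Hup."""
    if not (f.in_hup and f.in_tukanoan):
        return "not shared"        # nothing to explain
    if f.typologically_common:
        return "suspect"           # independent innovation is plausible
    nadeb = f.sisters.get("Nadëb")
    if nadeb is None:
        return "undetermined"      # no data from the least contact-affected sister
    if nadeb:
        return "not supported"     # inheritance cannot be excluded
    # Absent in Nadëb: check the other sisters for which data exist.
    others = [v for lang, v in f.sisters.items()
              if lang != "Nadëb" and v is not None]
    if others and not any(others):
        return "supported"         # absent in every documented sister
    return "tentative"             # the case rests on Nadëb alone


nasalization = Feature("morpheme-level nasalization", True, True,
                       {"Dâw": False, "Nadëb": False})
possessive = Feature("possessive marker between possessor and possessed",
                     True, True, {"Dâw": True, "Nadëb": False})

print(assess(nasalization))
print(assess(possessive))
```

On this encoding, a verdict of `tentative` corresponds to the cases described above in which the hypothesis rests on the feature's absence in Nadëb alone, so that later loss from Nadëb cannot be ruled out.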
In general, Portuguese influence on Hup grammar is very limited; it is
mentioned in the database where relevant, but comes up only rarely. Contact
between Hup speakers and non-Indians is recent (mostly limited to the past
thirty years), and is still quite infrequent. Very few Hupd’əh speak more than
a few words of Portuguese, and Portuguese borrowings into Hup are for the
most part restricted to lexical items (especially names of culturally new ob-
jects); i.e. ‘matter’ instead of ‘pattern’. Many of these may in fact have en-
tered the language via Tukano, which in most cases uses the same borrowed
lexical forms.
2. Phonology
Hup has a significantly larger inventory of vowels and consonants than do
the Tukanoan languages (e.g. nine vowels in non-nasal contexts vs. Tukano’s
six), but all of the segmental contrasts found in Tukano also exist in Hup.
The pronunciation of word-initial /y/ as [dy] ([ñ] in nasal contexts) is a pan-
Vaupés feature and almost certainly due to contact, as is the pronunciation
of intervocalic /d/ as a flap [ɾ] and the prenasalization of voiced stops. The
lack of word-initial /g/ is also common to both Hup and Tukano (but not to
Nadëb or Dâw).
The most striking contact features in Hup’s phonology are prosodic. Nasal-
ization is a morpheme-level prosody in both Hup and the Tukanoan languages
– certainly a contact feature – and accordingly nasal consonants and voiced
stops are allophones4 (whereas segments contrast for nasalization in Dâw and
Nadëb; compare Hup mǔn [fully nasal] and Dâw múd [half nasal, half oral]
‘caatinga, forest with sandy soil’). Also, Hup has a system of word-accent,
in which two tonal contrasts are realized on stressed syllables (example 1),
which parallels the system of pitch-accent found in Tukanoan languages.
(1) Hup
núh (high/falling tone) ‘head’
nǔh (rising tone) ‘tapioca’
3. Morphological typology and constituent order
Like Tukanoan, Hup has relatively agglutinative morphology, with compounding
of multiple verb roots and the piling up of affixes and clitics (examples
2 and 3).
(2) Hup
d’ǔç höd tətəd-d’óʔ-óy=mah.
timbó 3pl beat.timbó-take-dynm=rep
‘They beat timbó, it’s said.’
(3) Tukano (Ramirez 1997a: 176)
yesê-a-de pihî-dẽe-ya!
pig-pl-obj call-gather.together-imp
‘Call the pigs together!’
Also like the Tukanoan languages, Hup is dependent-marking and predominantly
suffixing. Nadëb, in contrast, is more isolating and is largely head-marking
and prefixing. Furthermore, Hup (like its sisters Yuhup and Dâw) has
nominative–accusative alignment, whereas Nadëb is ergative in its constituent
order and cross-referencing pronominal forms (see Weir 1984). Finally, Hup
and Tukanoan favor SOV constituent order (although both languages allow
considerable flexibility governed by pragmatics), while Dâw and Nadëb are
reported to favor SVO. There is no doubt that Hup fits the Vaupés typological
profile, although it is not entirely clear whether some of these changes might
actually have occurred in Nadëb rather than in its sisters.
One of the most striking contact effects on Hup nominal structures is in the
system of case marking. Like Tukano, Hup uses a single object suffix to
indicate direct objects, indirect objects, beneficiaries and recipients, and a
single oblique marker for locative, instrumental, and comitative functions,
as in (4). Moreover, object marking in Hup and in Tukanoan is sensitive to
animacy and definiteness. The Hup case markers do not appear to be cognate
with forms used for case marking functions in Hup’s sister languages, which
suggests that they developed in Hup relatively recently under Tukanoan influ-
ence – although Hup apparently used its own resources for the forms of the
markers, rather than borrow them directly from Tukano.
(4) Hup
a. Instrumental
m’ǎc-át p´d höd bib’-ní-h, děh=teg-éh.
mud-obl dist 3pl close-infr2-decl water=tree-decl
‘They would stop it up again with mud, the water tree.’
b. Location
ʔãh yamhidʔ-h, cãwyucé-ét.
1sg sing-decl São.José-obl
‘I sang at São José Village (during a drinking party).’
c. Comitative
ʔah=ʔíp-ít ʔãh ni-ʔeʔ-ní-h.
1sg=father-obl 1sg be-perf-infr2-decl
‘I lived with my father.’
Contact has also shaped the Hup system of nominal classification. As dis-
cussed in Epps (2006b), an incipient system of classifying nouns (using terms
derived from plant parts, such as tat ‘fruit, round thing’) has formed in Hup
under the influence of the noun classifier system in Tukanoan languages. As
in Tukanoan, the Hup classifying terms organize inanimate referents on the
basis of shape (round, flat and flexible, etc.), as illustrated in (5)–(6), and
animates on the basis of gender (male/female). One of the most interesting
features of the Hup system is that – as an incipient system currently affecting
only a few nouns – it is most frequently used in creating neologisms to name
new, culturally introduced items (such as soccer balls, batteries, etc.). This
can be interpreted as a mechanism for resisting outright lexical borrowing
(from Portuguese or Tukano), which is in keeping with the Vaupés avoidance
of obvious language mixing.
(5) Hup
kw=tat hot.pepper=round ‘hot pepper’ (fruit)
tác=tat kick=round ‘soccer ball’
(6) Tukano
biâ-ga hot.pepper-round ‘hot pepper’ (fruit)
kapê-ga eye-round ‘eye’
As in Tukano, possession in Hup may be either alienable or inalienable.
The inalienable construction, formed via the juxtaposition of the possessor
and possessed NPs, is a regionally widespread and cross-linguistically com-
mon structure, but the alienable construction in Hup likely owes its form to
Tukanoan. Like Tukanoan, but unlike Nadëb (which uses special possessive
classifiers for a few nouns and simple juxtaposition in all other cases), Hup
indicates alienable possession by means of a possessive marker between pos-
sessor and possessed (example 7). Dâw employs a similar construction but
an unrelated marker, suggesting that Hup’s strategy (and probably Dâw’s)
was developed recently and followed the Tukanoan model (illustrated in ex-
ample 8).
(7) Hup
pedú nˇh cug’æ̌t
Pedro poss book
‘Pedro’s book’
(8) Tukano
m’ yaá wi’i
2sg poss house
‘your house’
Number marking in Hup also resembles that found in Tukanoan languages.
As with object case marking, Hup nouns normally receive an overt plural/
collective marker (which follows the noun in both Tukanoan and Hup) only
when their referents are animate and definite, and obligatorily when these are
human; e.g. Hup tǔg=d’əh (howler.monkey=pl) ‘howler monkeys’; compare
Tukano emô-a (howler.monkey-pl) ‘howler monkeys’ (Ramirez 1997a: 205).
Also like Tukano, Hup has a singulative marker that can occur on a small set
of nouns – mostly certain kinds of insects that typically appear in groups:
(9) Hup
cw=d’əh (biting.ant.sp-pl) ‘biting ants (sp.)’
cw=ʔãw (biting.ant.sp-sing) ‘a (single) biting ant (sp.)’
(10) Tukano
butu-a (termite-pl) ‘termites’
butu-a-w˜ (termite-pl-sing) ‘a (single) termite’
Several features of Hup’s tense, aspect, and mode categories probably owe
their form to Tukanoan influence. While it is difficult to ascertain definitively
whether contact played a role in their development, the categories of comple-
tive, inceptive, iterative, and habitual aspect are used in very similar ways in
both Hup and Tukano, and are in general marked by post-stem morphology in
both languages. The same applies to the modal categories of counterfactuality
(irrealis), optative, conditional, and epistemic modality; many of these also
appear to have grammaticalized relatively recently in Hup – further evidence
that they developed during the period of Tukanoan contact. Other aspect-
ual and modal meanings are conveyed via verb compounding (see below),
and compounded verb roots are a source of markers of aspect and modality
through grammaticalization processes. In the case of tense, Hup appears to
have developed an obligatory future specification under Tukanoan influence,
as well as a distinction between recent and distant past tense, realized by
means of optional contrast particles:
(11) Hup
a. wæd-y´ʔ-ý páh ʔa h-a h.
eat-tel-dynm prx.cntr 1sg-decl
‘I ate (it) recently.’
b. wæd-y´ʔ-ý cám ʔa h-a h.
eat-tel-dynm dst.cntr 1sg-decl
‘I ate (it) a while ago; before today.’
A particularly striking effect of contact between Hup and Tukanoan is
found in the expression of evidentiality. While the reported evidential speci-
fication is found in Nadëb and reconstructs to Proto-Nadahup, Hup has since
developed additional evidential categories of nonvisual (i.e. heard, smelled,
tasted, or felt) and inferred information, using markers grammaticalized from
compounded verb roots (see Epps 2005b), and almost exactly paralleling the
evidential specifications found in Tukano. The fusion of evidentiality with
tense (obligatory in Tukano) is also found in Hup, but is limited to the report-
ed evidential morpheme and the optional distant past tense marker (a common
combination in narrative), and then only in one dialect (examples 12–13).
(12) Hup
j’ǔg-út=maám töh wɔn-kot=máh-ah.
forest-obl=rep.dst.cntr 3sg follow-go.in.circles=rep-dec
‘In the forest, long ago, they say, he wandered following (the tapir).’
(13) Tukano
diay wa’î yahá-apɨ’.
dog fish steal-rec.past.rep.3sg.nonfem
‘The dog stole the fish (in the recent past, it’s said).’
Moreover, Hup has developed a second inferred evidential construction based
on the borrowed verb ni- ‘be’ (one of the few examples of borrowed ‘matter’;
example 14); this appears to be a calque of the identical Tukano construction,
likewise built upon the verb diî [nii] ‘be’ (example 15).
(14) Hup
póh, děh=teg g’et-ʔeʔ-ní-h.
high water=tree stand-perf-infr2-decl
‘Really high, the water-tree stood.’
(15) Tukano (Ramirez 1997: 140)
yaa wecé ma’a wi’ô-’kadã dĩî-áma.
poss field path obstruct-nmlz.pl.perf be-rec.past.vis.3pl
‘They’ve blocked the path to my manioc field.’ (proof: logs across the
path)
Hup’s strategy for verbalization almost certainly owes its form to contact
with Tukano. While Tukano has a verbalizing suffix -ti, Hup employs the
verb -ni- ‘be’ (discussed above) for the same purpose, usually creating ‘have
N’ expressions (e.g. Hup h m-ni- ‘have a sore’, Tukano kamî-ti ‘have a sore’
[Ramirez 1997a: 353]).
Several of Hup’s valency-changing structures probably also arose under
Tukanoan influence. Its reflexive, reciprocal, and applicative markers are un-
like those in other Nadahup languages and all appear to be recently grammat-
icalized; they are used in similar ways to those in Tukanoan. Like Tukano,
Hup forms a passive by means of a verbal marker and an object case suffix
on the semantic agent:
(16) Hup
yaʔám tiyı̌ʔ-ǎn hup-mæ̀h-æ̀y.
jaguar man-obj pass-kill-dynm
‘The jaguar was killed by the man.’
(17) Tukano (Ramirez 1997b: 187)
di’i-t´ wĩ’ba-g´-de bopê-dõ’o-’kado
pot-cl child-masc.sg-obj break-pass-nmlz.place.perf
dĩî-ap.
be-rec.past.vis
‘The pot was broken by the child.’
Also, the reflexive marker (a verbal prefix in both Hup and Tukano) can ap-
pear as a nominal suffix/enclitic with an intensifying function in both lan-
guages (e.g. Hup ʔãh-hup, Tukano yö’ˇbasi [1sg-rflx] ‘I myself’). Finally,
causative meanings are conveyed by means of verb compounding – generally
a very productive process in both Hup and Tukanoan, although much less so
in Hup’s sisters. In some cases the Hup compounds (causative and otherwise)
appear to be calques of their Tukanoan counterparts (e.g. Hup d’oʔ-sud- and
Tukano mii-sãa [take-be.inside] ‘put inside’).
Hup’s numeral system has been profoundly influenced by contact with Tu-
kano. While the numerals 1 to 3 are reconstructable for the Hup-Yuhup-Dâw
branch of the Nadahup family, Hup 4 is clearly a calque of the Tukano form
(literally ‘has a sibling/companion’), as is 5 (‘one hand’) and the numerals 6
to 20 (a base-five strategy involving the addition of fingers and toes); these
forms (4 and above) are in fact found throughout the Vaupés and even beyond
(Epps 2006a). In more recent years, Hup speakers have also borrowed Por-
tuguese numerals (particularly for 6+) – both the actual forms as well as the
base-ten strategy. These are currently used interchangeably with the native
forms.
A few other grammatical forms have been borrowed from Portuguese,
probably via Tukano (which also uses these forms). These include the adverbial
particle té ‘until, up to’ (from Portuguese até; example 18); the negative
emphasis particle næ̀ may also derive ultimately from Portuguese nem
‘neither/nor’. Interestingly, Hup has borrowed a disjunction ʔó ‘or’ (from
Portuguese ou; example 19) without borrowing a conjunction (‘and’), a
cross-linguistically somewhat unusual scenario (Matras 1998).
(18) Hup
té yawarete ʔãh ham-b´-h.
until Yawarete 1sg go-hab-decl
‘I always go as far as Yawarete.’
(19) Hup
ʔó cap g’`, ʔó mtaʔáp g’`, ʔãh bʔ-ni-té-h.
or other year or three year 1sg work-be-fut-decl
‘Next year, or a third year, I’ll stay here to work.’
Hup and Tukano share a few interjections, such as the expression ʔagö
‘ouch’. Hup speakers also typically use borrowed Portuguese expressions for
most days of the week and times of day; these probably entered the language
via Tukano.
7. Syntax
Several features of Hup syntax probably owe their form to contact with Tu-
kano. The verb ni- ‘be’ (Tukano dĩî [nii]) – one of the few examples of a
shared form (‘matter’ rather than ‘pattern’) in these languages (as mentioned
above) – is used in both languages as a copula with predicate nominals and
adjectives (although in Hup it appears only when certain tense–aspect–mode
specifications are present), as illustrated in examples (20)–(21). In addition to
this copular function of ni-/niî, its simple verbal meaning ‘exist, be present’ is
common to both languages, as is its use in an inferential evidential construc-
tion (see §5 above).
(20) Hup
náw töh ni-ʔě-h.
good 3sg be-perf-decl
‘He used to be good.’
(21) Tukano (Ramirez 1997: 258)
teé pũdikã ãyú-sehé dĩi-ap.
those contrast good-nmlz.inan.pl be-rec.past.vis
‘Those really are good.’
Negation in Hup has several features in common with Tukanoan, also like-
ly due to contact between the languages. As in Tukano, Hup clausal negation
is expressed via a verbal suffix (non-cognate across the Nadahup family); in
addition, both languages have distinct negative lexical forms meaning ‘does
not exist’ and ‘I don’t know’.
Another point of resemblance between these languages is the appearance
of case suffixes directly on verb stems as markers of headless relative clauses
in object position within the main clause, as well as on adverbial clauses re-
lating to time, as illustrated in (22)–(23) (with Wanano, a close relative of
Tukano). The presence of this feature in Hup is almost certainly due to con-
tact.
(22) Hup
ʔam wæd-túk-uw-aˇn d’oʔ-næ̀n-æ̀h.
2sg eat-want-flr-obj bring/take-come-decl
‘(We) brought what you wanted to eat.’
(23) Wanano (Stenzel 2004: 287)
~bʉ’ʉ chʉ-dua-re ~da-ta-i
2sg eat-desid-obj bring/take-come-vis.perf.1
‘(We) brought what you wanted to eat.’
8. Lexicon
As discussed in Section 1, Hup speakers – like the other peoples of the
Vaupés – tend to resist the outright borrowing of form, particularly lexical
items. Nevertheless, some words in Hup clearly have a Tukanoan source –
although at least as many of the identifiable lexical borrowings are calques,
rather than actual borrowed forms (e.g. ‘deer’ for ‘manioc-processing tripod’;
‘medicine-house’ for ‘hospital’, etc.). An interesting characteristic of the dir-
ectly borrowed Tukano words is the high proportion of verbs in relation to
nouns (borrowed Tukano verbs outnumber nouns roughly two to one; see
Epps forthc.), whereas nouns are cross-linguistically much more commonly
borrowed. This can probably be explained by the fact that verb roots usu-
ally occur in multi-morphemic contexts, often involving several compounded
stems; thus embedded in native forms within a given word, borrowed forms
may be less easily noticed and therefore not censored by either speakers or
listeners (as observed by Aikhenvald 2002: 224 regarding Tariana). Partic-
ularly for those nouns having a Tukanoan source, borrowings and calques
seem to cluster in the semantic domains relating to agriculture and ritual/reli-
gion, suggesting that the words were borrowed together with the concepts to
which they refer. Both verbs and nouns enter the language as roots, requiring
no additional morphology to convert them to acceptable form.
Mostly within the last generation, Hup has adopted quite a few lexical
items (but almost no grammatical material) from Portuguese. These include
both nouns and verbs, and for the most part refer to newly adopted cultural
items (e.g. ‘cup’, ‘spoon’); nevertheless, Hup avoids many potential borrow-
ings by creating its own neologisms from native material. Many of these
Portuguese borrowings may have entered Hup via Tukano (which also uses a
considerable number of them), since few Hupd’əh speak Portuguese or have
much contact with non-Indians. Attitudes toward allowing Portuguese forms
to enter one’s speech are not generally as negative in the Vaupés as they are
toward other indigenous languages – possibly a reflection of the relative pres-
tige of Portuguese in the region.
9. Conclusion
Hup is an intriguing case of a language that has largely resisted the adoption
of borrowed forms (MAT), while assimilating many aspects of the grammat-
ical structures and categories found in the contact language (PAT). It is in
some cases difficult to be certain whether similarities in particular structures
are due to contact, rather than to inheritance or independent innovation, but
there is no doubt that a considerable amount of convergence has occurred,
bringing Hup firmly into line with the Vaupés regional profile. This profile
has been largely determined by the Eastern Tukanoan languages – principally
Tukano, which has exerted a more or less unilateral influence on the local
Nadahup languages (Hup and Yuhup) and on Arawak Tariana (Aikhenvald
2002).
Contact has affected Hup’s grammar at virtually all levels, including its
nominal and verbal structures, syntax, and discourse. On the level of phon-
ology, Tukanoan has had a particularly strong hand in shaping Hup’s pros-
odic features of nasalization and tone (word-accent) – likely due to the rela-
tively high discourse salience of such features. Contact also appears to have
restructured Hup’s typological profile in significant ways, affecting its gram-
matical alignment, constituent order, etc. Tukanoan has influenced the ex-
pression of number, case, noun class and gender in the Hup noun phrase; in
the verb phrase, tense, aspect, mode, and evidential categories have been af-
fected, and likewise the mechanisms for indicating changes in voice and va-
lency. Hup’s numerals have also undergone profound contact influence, and
some clausal connectors, subordinating mechanisms, and discourse markers
have been borrowed as well; pronouns, on the other hand, appear to have es-
caped Tukanoan influence. Negation and the use of a copula may also have
been shaped according to the Tukanoan model. Of all areas of Hup grammar,
the lexicon appears to have been relatively resistant to change; nevertheless,
considerable calquing (lexical PAT-borrowing) and some outright MAT-bor-
rowing have taken place, including a high proportion of verbs.
In conclusion, the pattern of contact between Hup and Tukanoan has favored
the borrowing of PAT over MAT, a cross-linguistically unusual situation.
This can be attributed to the sociolinguistic context in which Hup is
spoken, in which the pan-Vaupés resistance to overt language mixing plays
an important role – despite the fact that Hup speakers do not practice the
linguistic exogamy that fuels this resistance among the other peoples of the
Vaupés.
Abbreviations
cl          classifier
decl        declarative
desid       desiderative
dist        distal
dst.cntr    distant past contrast
dynm        dynamic
flr         filler
fut         future
hab         habitual
imp         imperative
inan        inanimate
infr2       inferential 2
masc        masculine
nmlz        nominalizer
nonfem      nonfeminine
obj         object
obl         oblique
pass        passive
perf        perfective
pl          plural
poss        possessive
prx.cntr    proximate contrast
rec.past    recent past
rep         reported
sg          singular
sing        singulative
tel         telic
vis         visual
Notes
1. Information on Hup (aka Hupda, Jupde) was obtained via the author’s fieldwork
on the Rio Tiquié, Amazonas, Brazil, conducted in 2000–2004. Support
from Fulbright-Hays, the National Science Foundation (Grant no. 0111550),
and the Max Planck Institute for Evolutionary Anthropology, Leipzig is gratefully
acknowledged. Many thanks go to my Hupd’ǝh hosts and language teachers,
as well as to the Museu Paraense Emílio Goeldi and the Instituto Socioambiental
in Brazil for practical assistance with fieldwork. I am also grateful
to Alexandra Aikhenvald and to an anonymous reviewer for comments on this
material.
2. The name “Nadahup” is here considered preferable to “Makú”. Not only is
the latter name encountered in the literature in reference to several unrelated
languages and families in Amazonia, but it is also in common use as a high-
ly insulting ethnic slur in the Vaupés, directed toward the speakers of these
languages. ‘Nadahup’ combines elements of the names of the four established
members of this family, Nadëb, Dâw, Yuhup, and Hup.
3. Information on Tukano comes from Ramirez (1997a, b). Nadëb data are prin-
cipally from Weir (1984), and Yuhup data from Ospina (2002).
4. Morpheme-level nasalization is nonetheless represented orthographically for
Hup by means of nasal segments (at least one per word), for easier reading.
References
Aikhenvald, Alexandra
2002 Language Contact in Amazonia. Oxford: Oxford University Press.
Epps, Patience
2005a A grammar of Hup. Ph.D. diss., Department of Anthropology, University
of Virginia (forthc. 2008 Mouton de Gruyter).
2005b Areal diffusion and the development of evidentiality: Evidence from
Hup. Studies in Language 29 (3): 617–650.
2006a Growing a numeral system: The historical development of numerals in
an Amazonian language family. Diachronica 23 (2): 259–288.
2006b Birth of a noun classification system: The case of Hup. In: L. Wetzels,
(ed.), Language Endangerment and Endangered Languages: Linguis-
tic and Anthropological Studies with Special Emphasis on the Lan-
guages and Cultures of the Andean-Amazonian Border Area. (Indig-
enous Languages of Latin America series (ILLA); Publications of the
Research School of Asian, African, and Amerindian Studies (CNWS))
Leiden University.
2007 The Vaupés melting pot: Tukanoan influence on Hup. In: Alexandra
Aikhenvald and R. M. W. Dixon (eds.), Grammars in Contact: A
Cross-Linguistic Typology, 267–289. (Explorations in Linguistic Typology 4.)
Oxford: Oxford University Press.
Forthc. Hup. In: Martin Haspelmath and Uri Tadmor (eds.), Loanword Typ-
ology Project, Max Planck Institute for Evolutionary Anthropology,
Leipzig.
Jackson, Jean
1983 The Fish People: Linguistic Exogamy and Tukanoan Identity in North-
west Amazonia. Cambridge: Cambridge University Press.
Loukotka, Cestmir
1968 Classification of South American Indian Languages. Los Angeles:
University of California.
Martins, Silvana A.
2004 Fonologia e Gramática Dâw [Dâw phonology and grammar]. Ph.D.
diss., University of Amsterdam. Amsterdam: LOT.
Matras, Yaron
guistics 36: 281–331.
Ospina Bozzi, Ana Maria
2002 Les structures élémentaires du Yuhup Makú, langue de l’Amazonie
Colombienne: Morphologie et syntaxe. Ph.D. diss., Université Paris 7
– Denis Diderot.
Ramirez, Henri
1997a A Fala Tukano dos Ye’pa-Masa. Volume 1: Gramática [The Tukano
Language of the Ye’pa-Masa: Grammar]. Inspetoria Salesiana Mis-
sionária da Amazônia, CEDEM: Manaus.
Ramirez, Henri
1997b A Fala Tukano dos Ye’pa-Masa. Volume 2: Dicionário [The Tukano
Language of the Ye’pa-Masa: Dictionary]. Inspetoria Salesiana Missio-
nária da Amazônia, CEDEM: Manaus.
Sorensen, Arthur
1967 Multilingualism in the Northwest Amazon. American Anthropologist
69: 670684.
Stenzel, Kristine
2004 A reference grammar of Wanano. Ph.D. diss., Department of Linguis-
tics, University of Colorado.
Weir, E. M. Helen
1984 A negação e outros tópicos da gramática Nadëb [Negation and other
topics of Nadëb grammar]. MA thesis, UNICAMP, Campinas.
Mosetén borrowing from Spanish
Jeanette Sakel
1. Introduction, including genetics and sociolinguistics1
Mosetén is spoken by approximately 800 people in the foothill region of the

Bolivian Andes. It belongs to the small, unaffiliated language family Mo-
setenan (Sakel 2004). Spanish was first introduced by missionaries in the
seventeenth century and gained in importance when permanent missions
were established in the nineteenth century. It soon became a highly
prestigious language that was used for communicating with strangers, and
that was connected to welfare and money, while the function of Mosetén was
restricted to communication among family members and friends. Still today,
Mosetén is only used orally, and not all children in the ethnic group learn
it. Most speakers of the language belong to the middle and old generations.
Moreover, all speakers of Mosetén are bilingual in Mosetén and Spanish, and
many feel more comfortable speaking Spanish. Spanish is also the language
used for writing and other official purposes. Only very recently has Mosetén
been established as a written language (Sakel 1999, 2001, 2002b; cf. also
Sakel 2004: 50–52), but it is not generally used, apart from the attempts of a few
individuals.2
Spanish is the only language that Mosetén is currently in contact with.
Even though there are speakers of other languages in the area – mostly Ay-
mara and Quechua settlers from the higher Andean regions, as well as some
speakers of Trinitario and Yurakare – only some Mosetenes know occasional
words in those languages. Certain categories in the Mosetén language, how-
ever, give clues to former contact situations. Mosetén occupies a unique pos-
ition in the transition zone between the Amazon and the Andean highlands,
and shows traits of an Amazonian linguistic area (such as proposed by Payne
1990), as well as loanwords – and possibly also grammatical influence – from
languages of the Andean highlands.
In the present chapter, I will concentrate on the one-to-one borrowing
situation between Mosetén and Spanish, more specifically the influence of
Spanish on the grammatical structure of Mosetén.3 The main contact phe-
nomena in Mosetén from Spanish are found on the level of discourse organ-
ization and the integration of verbs. To my knowledge, there are no contact
phenomena from Spanish in the phonology of Mosetén, apart from sounds
appearing in loanwords that have not been phonologically integrated into the
system of Mosetén. The same holds for the overall typological profile of Mo-
setén, which does not seem to have changed through contact with Spanish.
The contact phenomena are found in the areas of nominal structures, other
parts of speech, constituent order, and syntax. Verbal structures are only
represented by the integration strategies of loan-verbs.4
2. Nominal structures
Within nominal structures, there are only two phenomena that possibly have
arisen through contact with Spanish. Both involve the remodelling of the
structure (PAT), rather than direct MAT-loans.
In the gender agreement system, the feminine was originally used as the
unmarked gender in the language, e.g. referring to an unspecified group of
people (1).
(1) Mö’-ïn yi-’-in ats-i-jo-i katyi’ äwä’-mï’.

3f-p say-f.s-p come.m.s-vi-ins-m.s eh child-3m.sg
‘They (father and mother) said that their son came.’
Most probably under the influence of Spanish, many young speakers prefer the
use of the masculine gender in these instances (Sakel 2002a: 302–303, Sakel
2004: 90–91), i.e. mi’in ‘they masculine plural’ would be used.
Another structural change in the nominal structures that could have arisen
through the influence of Spanish is the tendency to use plural marking
with inanimate objects whose plurality is not in focus. Traditionally, marking
of plurality by the plural marker in is only obligatory with animates and in cases
where the plurality is in focus. Still, many speakers use it with inanimate
nouns even when their plurality is not in focus, which could possibly be due
to the influence of Spanish.
3. Verbal structures (the integration of loan-verbs)
In the same way as there are only very few contact phenomena in nominal
structures, verbal structures are not usually remodelled or taken over from
Spanish. The only phenomenon related to language contact in the verbal system
of Mosetén involves the integration of loan-verbs. Mosetén has a system
of complex predicates (cf. Sakel 2007b), which means that there are only
10–16⁵ ‘real’ verbs – or verbness markers – in the language, while all other
verbal elements are combinations of one of these and a non-finite element.
Spanish verbs are treated in the same way as native non-finite elements, and
accommodated into the verbal system of Mosetén by the addition of a verb-
ness marker. The elements borrowed from Spanish are all non-finite in na-
ture, including verb-roots (3), non-finite verbal forms (6, 7) or other parts of
speech, such as nouns (2, 4, 5, 8). Only two of the native verbness markers
are used to integrate loan-verbs: -yi- (examples 6–8) and -i- (examples 2–5).
The latter is very frequent in established complex predicates, but is generally
not productive in forming new verbs, whereas -yi- is productively used both
with native non-finite forms and to integrate loan-verbs. The unproductive
marker -i- only appears with 4 verbs that are borrowed from Spanish, all of
which must have been borrowed at a stage when the marker could still be
productively used to form verbs. This is supported by the fact that these are
concepts that the Mosetenes presumably did not have prior to their contact
with Spanish missionaries, but that became important in their new society
and were borrowed very early on.6
(2) viaje-i- ‘to travel’ (from the Spanish noun viaje ‘journey’)
(3) dewe-i- ‘to owe’ (from the Spanish verb-root debe- ‘owe’)
(4) reso-i- ‘to pray’ (from the Spanish noun rezo ‘prayer’)
(5) fieshta-i- ‘to party’ (from the Spanish noun fiesta ‘party’)
(6) pasar-yi- ‘to happen’ (from the Spanish infinitive pasar ‘happen’)
(7) saludar-yi- ‘to greet’ (from the Spanish infinitive saludar ‘greet’)
(8) suerte-yi- ‘to be lucky’ (from the Spanish noun suerte ‘luck’)
4. Other parts of speech
The bulk of Spanish loans is found in the category of ‘other parts of speech’,
comprising mainly function words and discourse markers. Most loans in this
category are of the type MAT, i.e. elements taken over directly from Spanish.
The numeral system of Mosetén is decimal, but has probably arisen from
a quinary system (Schuller 1917; Sakel 2004: 168), though it is unclear
whether this remodelling happened due to Spanish influence. Clear Spanish
MAT-influence is found in the usage of numerals, since in everyday speech,
Spanish numerals – especially those above 10 – are preferred. Quantifiers are
occasionally borrowed from Spanish, even though their usage is not regular
and they would qualify as instances of code-switching.
Turning to indefinite pronouns, the forms nunca ‘never’ and siempre ‘always’7
are frequently used in Mosetén. Other indefinite pronouns are only very
rarely used. There is a general tendency in Mosetén for expressions of time
to be taken over more frequently from Spanish than other expressions (i.e.
person, thing, location or manner) (cf. also adverbializers).
Particles and discourse markers make up the major part of elements bor-
rowed from Spanish into Mosetén. All Spanish connectors are well-estab-
lished loans in Mosetén, even though they differ in the environments in which
they can be used. Thus, the three Spanish coordinating conjunctions y ‘add-
ition’ (9), o ‘disjunction’ (10)8 and pero ‘contrast’ (12) can connect clauses,
as well as functioning as connectors in discourse, while only o ‘disjunction’
can also be used to combine phrases (11). The connectors are borrowed together
with their clause-combining construction, i.e. they appear between
the two juxtaposed clauses, as in Spanish. This construction is different from
native strategies in Mosetén, where addition is expressed by mere juxtapos-
ition, while contrast is marked by a clitic, appearing on the first element of
the second clause (13).9 In many instances, the borrowed Spanish markers
and grammatical constructions appear together with the native strategies in
double constructions (14).
(9) Wën-jö-’ khö’ï mömö’ jish-yi-ti-’ y

move-dj-f.s mo.f only.f comb-vy-re-f.s and.E
me’-me’ shiph-ki-’ raej dyaba.
so-rd leave-vk-f.s all peanut
‘She must have come, [and] she just combed herself and thus the pea-
nuts came out [of her hair].’
(10) Me’-nä-ki jäe’mä tse’-mö’ me’ färä-yë-bän-’-yä’-wïn

so-fo-co dm mother-3f.sg so fry.banana-vy-again-f.s-ad-c
o jäe’mä wi-ki-’-ya’.
or.E dm spin-vk-f.s-ad
‘And then [the child’s] mother was frying bananas or spinning.’
(11) Chhibin o tsiis ji’-jaem’-te penne

three or.E four ca-good-vd.3m.o raft
chapa-ti-k-dye-tyi’.
big.raft-vt-an-b-l.m
‘Three or four rafts to make a big raft.’
(12) Me’ jïmë mö’ pero mö’ maj-jo-’ me’.

so close 3f.sg but.E 3f.sg much-vj-f.s so
‘This (water-source) is closer, but that one has more (water).’
(13) Mi’-we ködy-a-j-ki-ki material, itsi-ki tata.

3m-dr beg-vi-dir-an-dk.m.s material.E nx.m.s-co father
‘They went there to ask for material, but the priest was not there.’
(14) Tyiñe-tyi’ pero-ki pen’-ki jai’ba-i.10

semi.red-l.m but.E-co side-co white-vi.m.s
‘It (the peanut) is semi-red, but one side is white.’
Pero does not only express contrast, but can also be used to mark a change
in topic. This extension in function seems to have been motivated by analogy
with the native marker -ki, which like pero can be used to express contrast
(Sakel 2007a).
In the same way as coordinating conjunctions, many subordinating con-
junctions are borrowed from Spanish. These include complementizers and
adverbializers.
Object complements can be expressed by a marker resembling the Spanish
complementizer que, often pronounced ki (cf. example 15), which was most
probably borrowed directly from Spanish and then phonologically
integrated.11
(15) Jam’-ki-ki rai’s-e-’ ki kasiki jam ji’-ka-te cemento.

ng-co-rd want-vi-f.o that.E cacique ng ca-bring-3sg.o cement.E
‘But he did not want the cacique to make him bring the cement.’
I have only one example of the Spanish complementizer a being used in Mo-
setén – and the speaker later corrects it, identifying it as a “slip of the tongue”,
i.e. it is not an integrated loan (example 16).
(16) 1992 khan aj yäe yakchh-i-ti a karij-tya-ki

1992-in yet 1sg begin-vi-re.m.s to.E hard-vd-an.m.s
jäe’mä doktor-tom hospital-khan.
dm doctor.E-com hospital.E-in
‘In 1992 I began to work with the doctor at the hospital.’
Turning to adverbializers, the Spanish markers si ‘if’, pajki ‘so that’ and
the temporal markers hasta, desde and cuando are borrowed into Mosetén.
Again, markers of time are prominent among those borrowed.
The Spanish marker si ‘if’ is used in Mosetén to introduce a conditional
clause (17). This clause appears again in the same construction as in Span-
ish. Clauses with the borrowed marker si seem to fulfil a narrower function
than in Spanish, in expressing if-clauses giving alternatives. In other cases,
the native forms are used (cf. Sakel 2007a). Thus, the functions are divided
between the borrowed and the native elements. There are several native ways
of expressing conditional clauses in Mosetén, all of which involve clitics.
Example (18) shows a hypothetical conditional clause marked by the clitic
-ya’.12
(17) Me’tyi-tyi’ yäe yi-n “si mi rai’s-e-’ jäe’mä

dm 1sg say-1sg.o if.E 2sg want-vi-3F.o dm
ji’-chhae-yi-ti khäei’-si’ phe-ya-k-dye’ o rai’s-e’
ca-know-vi-re.m.s rf.s-l.f speak-no or.E want-vi-3F.o
chhi-ban-mi jäe’mä piñ-i-dye’-in jedye’-jedye’ mö’-yä’
know-again-2sg dm cure-vi-no-p thing-rd 3F-ad
wiya’-in kïchï tsä’-ïn Köwë’dö’-wë-ïn.
old.man-p go.on.m.s alive-p Covendo-dr-p
‘Thus she said to me, “If you want to study our language, or if you
want to know about the (native) medicines, there are the old people in
Covendo, they are still alive.”’
(18) Mi’-ra’ wën-chhï-sh-än-yä’ tye-baj-te-ra’ yäe kerecha.

3m.sg-ir move-dc-ds-again.m.s-ad give-again-3m.o-ir 1sg money
‘If he comes back again, I’ll give him his money.’
Purpose and causation are expressed by the marker pajki, which seems to be
a phonologically integrated marker of Spanish para que ‘so that’. The integra-
tion process has probably happened in the following way: para que is often
pronounced paa-que in Bolivian Spanish, and a final aspiration was added to
the first part, which is a typical phonological trait of Mosetén. The meaning of
pajki in Mosetén is broader than para que in Spanish, denoting both purpose
(cf. the first occurrence of pajki in example 19) and causation (cf. the second
occurrence of pajki in example 19), while in Spanish it is only used as a purpose
marker. The marker porque, which in Spanish expresses causation, is
not borrowed into Mosetén (and pajki is used in all instances).13
(19) Khin’-ki-ra’ ti-ksi paj-ki-ra’ tsin dyaidye’-chhe

now-co-ir bring-3pl.o.m.s for-co-ir 1p stranger-su
bae’-ja’ paj-ki kï’-yä’-mïn na’-e-n’
live-vi.1pi.s for-co size.dim-ad-as born-vi-1.pi.s
yi-nä khä jike-in.
say.m.s-fo well ps-p
‘And now he will bring them [here], so that we will marry them, because
we [i.e. our babies] are born [too] small, so he says.’ [situation:
in the old days, the Mosetenes married mainly among themselves, which
led to genetic defects and babies being born prematurely. One of the
priests wanted to introduce ‘new blood’ to avoid these problems]
Finally, temporal adverbial clauses introduced by ashta (a phonologically

integrated form from Spanish hasta ‘until’) are very frequent in Mosetén.
Again, the marker borrowed from Spanish can be used with an extended
function compared to the Spanish source, denoting both an endpoint in time
‘until’ (cf. example 20) and a beginning in time ‘when’ (21).14
(20) Mi’-khan mi’ bae’-i me’-ki keo’-te-in

m-in 3m.sg live-vi.m.s so-co search-vy.3m.o-p
ashta tyaj-ke-te in.
until.E find-dk-3m.o-p
‘They searched (for him) where he lived, until they found him.’
(21) Sokoj-ko-ye-’ sombrero-in raej-jin-in ashta wën-jö-i

take.off-rd-vy-3F.o hat.E-p all-p-p until.E move-dj-m.s
virjen-ya’-in.
virgin.E-ad-p
‘They take off their hats when they come to the virgin (Mary).’
The marker ashta appears parallel to native ways of expressing temporal ad-
verbial clauses, such as marking by the polyfunctional clitic -ya’ (which is,
e.g., also used in hypothetical conditional clauses, cf. example 18 above).
Other temporal adverbial clause markers borrowed from Spanish are cuando
and desde. Cuando can, in the same way as ashta, be used to express ‘until’
and ‘when’, but is less frequent. Desde only appears in combination with
ashta, expressing ‘from–until’.
Apart from that, the Spanish expression cada vez ‘every time’ is occasion-
ally used in Mosetén to express reversal and repetition, appearing together
with, or replacing, native reduplication or affixal marking.
The Spanish particle pues ‘thus, well’ is sometimes used in functions simi-
lar to native cliticized focus markers.
Many discourse markers are borrowed from Spanish into Mosetén. The
most frequent ones are the tag question nowe, from Bolivian Spanish no ves
‘don’t you see’ (cf. example 22), and the filler awer, from Spanish a ver ‘let’s
see’, which can also appear at the beginning of a turn (23). All coordinating
conjunctions can be used as sequential discourse markers, linking elements
of discourse to the overall context (cf. examples 24–25 for the introduction of
turns by coordinating conjunctions and 9–14 above). Other discourse markers
that appear – though less frequently – are osea ‘that means, so’, explaining
or enquiring about what was said before, porlomenos ‘at least’, siquiera
‘at least’, pues ‘thus, then, well’, claro ‘sure’, claro pues ‘well, sure’, bueno
‘well, sure’ and eso es ‘that is it!’, used as a turn-taking device and for min-
imal response.
(22) Mö’-nä khä Hernan tipi-ti-’15 mäei’-ya’ jäe’mä
3F.sg-co well Hernan measure-vt-f.s first-ad dm
Marcelina Duran-tom, nowe?
Marcelina Duran-com right.E
‘And this Hernan, they measured the first time, together with
Marcelina Duran, right?’
(23) Alberto, äjj, awer-nä khä, mi’ jady-i-ti,

Alberto em let’s.see.E-fo well 3m.sg go.and.come.back-vi-dt.m.s
mi’-nä khä jäe’mä chhome’ ïtsä-dye-i.
3m.sg-fo well dm also play.game-no-vi.m.s
‘Alberto, well, let’s see, he came [hereto, performed the action, and
went away again], and he was also hmmmm playing games.’
(24) a. Edy-win
Edy-c
‘the dead Edy’
b. O-jam tata pariente-dyera’ khä mi’-tyi’-mi’.
or.E-ng father relative.E-mo well 3m.sg-l.m-3m.sg
‘Or also the relatives of the father (i.e. from his land).’ (turn)
(25) a. Aj camion-chhe’ jïj-ti tsin Rapash-khan.

yet truck.E-su go-dt.m.s 1P La Paz-in
‘We went to La Paz by truck.’
b. Y ats-i-ban-ya’-nä mï’ïn chhï-tyäkä’?

and.E come.m.s-vi-again.m.s-ad-fo 2P also-em
‘And when you came back you also [went by truck]?’ (turn)
In some cases, the Spanish answer particle si ‘yes’ is used in Mosetén, usually
in the function of a minimal response.
Place deixis is often expressed by loans from Spanish, and some of the
markers have both temporal and spatial meanings. Spatial prepositions include
ashta (from Spanish hasta ‘until’) (cf. example 26),16 desde ‘from’ (27)
and rarely a ‘to’ (28).
(26) Yij-ya’ ashta Karanawi jöf aj majmi jäe’mä.

foot-ad until.E Caranavi yet yet road dm
‘By foot until Caranavi, there was the road already.’
(27) Jama, desde bae’edye’-khan?

em from.E village-in
‘Well, from the village?’
(28) Julio-khan jiti-n-yäe a Riberalta.

July.E-ad send-1sg.o-1sg to.E Riberalta
‘In July, I was sent to Riberalta.’
Temporal deixis includes the markers ashta ‘until’, desde ‘since’, nunca
‘never’ (29) and less frequently in my data en ‘in’ (30). Furthermore, ai weses
(from Spanish a veces ‘sometimes’) is occasionally used in Mosetén.
(29) Nunca katyi’ khä bailar-yi-’ mö’ achae Diana.

never.E eh well dance.E-vy-f.s 3F.sg dog Diana
‘The dog Diana will never dance.’
(30) Me’-me’-ye-ki öi yomodye’ en mil novecientos noventa y seis …

so-rd-vy-an.m.s de.f year in.E 1996.E
‘Well, in that year, in 1996, …’
Several Spanish markers expressing the opposite meaning or delimitation of

some kind can be used in Mosetén, such as embesde, from Spanish en vez de
‘instead’ (31), ni ‘not even’ (32) and sin ‘without’ (33).
(31) Mi’-ya’-ki-ki mi’ kerecha-ki-ki embesde kerecha karto-ki

3m-ad-co-rd 3m.sg money-co-rd instead.E money carton.E-co
mö-mö’.
3F-rd
‘And there, and instead of money, (he gave) only carton money.’
(32) Khäkï jäe’mä säem’ atsi-jo-i tsin, ni phir-ti-tsin
because dm fast come-dj-vi.m.s 1p ng.E get.stuck-vt-1p
jam anik me’ wën-chhïï-tsin.
ng sure so come-dc-1p
‘Because we came here fast, we did not even get stuck [in the mud on
the road], this way we came here.’
(33) Jike ats-i i-ya’-katyi’ tyoko’-ye-’: sin
ps come-vi.m.s m-ad-eh press.stomach-vy-3f.o without.E
saeks-e-dye’.
eat-vi-no
‘And then he comes here and feels the stomach (of the dog): there is
no food.’
5. Constituent order
The constituent order of Mosetén has partially been influenced by Spanish.

Most notably, while Mosetén does not traditionally make use of adpositions,
it has introduced prepositions from Spanish – both as MAT loans in their
constructions, as well as through remodelling of native material to fulfil the
purpose (cf. e.g. the use of ashta in example 26). Furthermore, I have the im-
pression that the constituent order of Mosetén often follows Spanish patterns,
even though this is again a less clear case of contact influence.
6. Syntax
There are several Spanish contact phenomena in the syntax of Mosetén. Pri-
marily, these involve the organization of clauses, i.e. strategies for coordinat-
ing and subordinating clauses.
Negation strategies are not influenced by Spanish, but a number of delimi-
tation markers, expressing similar concepts, are borrowed from Spanish into
Mosetén (cf. examples 3133).
Coordination strategies are borrowed from Spanish together with the

MAT-loans of coordinating conjunctions y ‘and’, o ‘or’ and pero ‘but’. These
markers appear in the same grammatical construction as in Spanish, i.e. between
the clauses they coordinate (cf. examples 9–14). The same holds for
embedding. Many subordinating conjunctions are borrowed from Spanish,
together with the grammatical constructions they appear in. In this way,
complementation follows Spanish patterns when the complementizer ke/ki, from
Spanish que, is used in Mosetén. Likewise, adverbial clauses follow Spanish
patterns when Spanish adverbial clause markers are used.
7. Summary
This is a typical one-to-one borrowing situation between languages of very

different status and prestige. Most loans from Spanish are grammatical elem-
ents (MAT) used at the level of discourse organization, such as coordinat-
ing conjunctions, subordinating conjunctions, deictic markers (of time and
space), delimitation markers, and discourse markers. Apart from that, there
is a strategy for the integration of Spanish verbs, following the native patterns
of forming complex predicates. Spanish also seems to have had influence on
the grammatical pattern of Mosetén. Thus, MAT-loans usually appear within
their Spanish grammatical construction, i.e. coordinating conjunctions ap-
pear between the clauses they combine. Other PAT-influence that is not con-
nected to any direct MAT-borrowing, however, is difficult to establish. We do
not have old data of Mosetén, which leaves it unclear whether something is
the remodelling of a pattern due to Spanish influence, or whether it is a native
tendency in the language (cf. word order).
Abbreviations
ad adessive relation
an antipassive
as associative
b benefactive
c nominal past
ca causative
co contrast
com comitative
dc associative motion ‘on the way there’
de demonstrative pronoun
dim diminutive
dir directional marker
dj associative motion ‘on the way here’
dk associative motion ‘after arrival there’
dm filler
dr downriver
ds associative motion ‘after arrival here’
dt associative motion allomorph
E Spanish form
eh hearsay evidential
ele elicitation example
em emphasis marker
f feminine
fo focus marker
in inessive
ins inceptive aspect
ir irrealis
l linker in noun phrase
m masculine
mo modal marker
ng negation
no nominalization
nx existential negation
o object
p plural
pi (1st person) plural inclusive
ps past tense
rd reduplication
re reflexive and reciprocal
rf reference marker
s subject
sg singular
su superessive
vd verbness marker -tyi-
vi verbness marker -i-
vj verbness marker -jo-
vk verbness marker -ki-
vt verbness marker -ti-
vy verbness marker -yi-
Notes
1. This description is based on my own fieldwork on Mosetén in Bolivia, between

1999 and 2002.
2. My main consultant, Juan Huasna Bozo, is today teaching Mosetén at the newly
established ‘Universidad de Monseñor Jorge Manrique en Palos Blancos’ in the
Mosetén region.
3. There are a number of instances where Mosetén may have had influence on the
structure of Spanish. However, there are only few data – and often the same
categories appear in the Spanish spoken by other lowland peoples in Bolivia.
It is thus unclear whether these are areal traits of the areas or due to substrate
influence from Mosetén. I will concentrate on Spanish contact influence in Mo-
setén.
4. There are furthermore many lexical loans from Spanish.
5. This is based on an old system of event classification that has become lexicalized
in parts, and it is difficult to establish exactly how many markers were
originally involved in the system (cf. Sakel 2007b).
6. Marking by -i- and -yi- is not applied due to phonological reasons.
Rather, these are two different verbness markers with different functions in the
language (cf. Sakel 2007b).
7. Siempre is often used to express hearsay-evidentiality – both in Spanish spoken
by the Mosetenes, as well as when borrowed into Mosetén. The use of siempre
as an evidential in Bolivian Spanish seems to have come about through PAT-
influence from indigenous languages.
8. In the same way as in some varieties of Spanish, the construction o – o can be

used to list alternatives, similar to ‘either or’, etc. (cf. Sakel 2007a).
9. There is no clear native way of marking disjunction in Mosetén. Most probably,
the native marker was lost when the Spanish marker o took over this function. In
some cases, the native clitic -ki ‘discontinuous topic’ can be used in functions
similar to disjunction.
10. The clitic -ki appears on the first element of the second clause to mark contrastive
coordination. Since the conjunction pero is borrowed and added between
the clauses, the speaker struggles to identify the first element of the second
clause, which can be seen in the marker -ki appearing twice.
11. There is a native discontinuous topic marker, a clitic with the form -ki, the use of
which, however, is very different from a complementizer, making a native PAT-
reanalysis less likely.
12. Apart from marking hypothetical conditional clauses, this clitic is used to mark
adessive and temporal relations (cf. Sakel 2004).
13. I am not aware of this extended function being used in the Spanish spoken by the
Mosetenes or among Bolivians; thus, this is probably an innovation that arose in
Mosetén.
14. There is no native structure in Mosetén with such an extended function, nor – to
my knowledge – in the Spanish spoken in Bolivia (cf. Sakel 2007a for further
discussion).
15. A group of men and women, for which the feminine form is used in Mosetén
(cf. Sakel 2002a).
16. Cf. also ashta used as a subordination marker. Ashta can also express ‘even’,
parallel to Spanish.
References
Payne, Doris L.
1990 Morphological characteristics of Lowland South American languages.
In: Doris L. Payne (ed.), Amazonian Linguistics: Studies in Lowland
South American languages, 213242. Austin: University of Texas
Press.
Sakel, Jeanette
2002a Gender agreement in Mosetén. In: Mily Crevels, Simon van de Kerke,
Sérgio Meira and Hein van der Voort (eds.), Current Studies on South
American Languages, 287305. Leiden: ILLA 3.
2004 A Grammar of Mosetén. (Mouton Grammar Library 33.) Berlin: Mouton
de Gruyter.
2007a Language contact between Spanish and Mosetén: A study of grammatical
integration. International Journal of Bilingualism 11 (1): 25–53.
2007b The verbness markers of Mosetén from a typological perspective. In:
Bernhard Wälchli and Matti Miestamo (eds.), New Challenges in Typology,
315–338. Berlin: Mouton de Gruyter.
Sakel, Jeanette (ed.)
1999 Poromasi’ Pheyak’dye’in [old stories]. Bolivia: Proyecto GRAMO
(2nd rev. edn 2002).
2001 Ojtere’ [the rooster]. Bolivia: Proyecto GRAMO.
2002b Tsinsi’ kirjka [our book]. Bolivia: Proyecto GRAMO.
Schuller, Rudolph
1917 Introduction to Benigno Bibolotti’s Moseteno Vocabulary and Treatises
from an unpublished manuscript in possession of Northwestern University
Library. Evanston and Chicago: Northwestern University.
Index of subjects
ability, 69 n. 9, 33, 45
absolutive see alignment
accent
  word accent, 553, 562
  pitch accent, 553
accusative, 486, 487, 501, 502, 504, 512
  nominative–accusative alignment, 388, 554
active see alignment, prefixes
additive, 66, 218, 373, 374, 506–507, 517
adjective, 192
  active, 266, 277
  adjectival predicate, 248, 547
  adjective–noun collocation, 276
  used as adverb, 498
  attenuative, 266
  adjective–noun order, 159, 253–254, 528
  adjective formation, 81, 87
  adjective-deriving affixes, 266
  noun–adjective agreement, 127, 139
  used as noun, 498
  used as pronoun, 504
  and verb, 441, 443
  xenoclitic vs. oikoclitic inflection, 265
  see also affix
adpositions, 42, 44, 76, 169, 170, 277, 279, 372, 518, 576
  order, 86, 87, 272
  tone pattern, 85
adverb, 158, 231, 270, 277, 441, 446, 454, 458
  comparison, 509–511
  adverb-deriving affixes, 265, 266
  adverbial particles, 476, 506, 559
  adverbialization, 473, 490, 495, 571–572
  directional, 351
  as discourse device, 409, 545
  locative adverbial, 153, 270, 424
  manner adverbs, 265, 266
  modal–aspectual adverb, 250–251, 333
  negative, 330, 334, 339, 546
  phasal, 5, 56, 57, 70, 72, 158, 270, 279
  time adverbs, 170, 176, 179, 269–270, 474, 507–508, 535, 545, 573
  reason/cause adverbs, 146, 293
  relative clause, 412
  spatial adverbs, 269–270
  utterance-level, 270
  see also compounding
affiliation, 192
adverbial clauses, 61, 66, 115, 179, 224, 408, 415, 506, 511, 514, 516–517, 560, 573, 577
  linkage, 292–293
  markers, 23, 115, 206, 273, 319, 408, 457
affix, 61, 125, 169, 203, 206, 249, 253, 308, 371, 416, 417 n. 7, 460, 491, 524, 573
  adjectivizers, 253
  aspect, 290
  causative, 298
  derivational, 249–250, 251, 256, 265–266, 268, 270, 315, 382
  extraction, 265–266, 277–278
  honorific, 504
  inflectional, 315, 382
  locative, 127
  loss, 460
  person-number, 82
  preposition-like function, 448
  pronominal, 372
  and proclitics, 441
  TMA, 128–129, 369
  verb derivation, 83
  verbalizer, 251
  see also adverb
affricates, 83, 92–93, 97, 125, 152, 216, 229, 263, 417, 439, 460
agent, 44, 99, 204, 205, 369, 390, 422, 430, 467, 473, 485, 500
  habitual, 311
  impersonal, 203, 501
agent nominalization, 490, 492, 510
agglutination, 40–41, 47, 123, 484, 553–554
  change to analytic structure, 132
agreement
  case, 271, 279
  definiteness, 139, 141
  dual, 445–446
  gender, 568
  loss, 484
  negation, 273, 279
  noun–adjective, 127
  noun classes, 126–127
  object, 95
  person, 162–163, 177
  person–number, 274
  plural, 43, 281 n. 3, 445–446
  subject–object, 127
  verb, 205
  verb-object, 484, 506
aktionsart, 44–45, 46, 66, 70, 250, 269, 281
alignment, 168, 288, 369, 484
  active, 202, 204–205
  ergative–absolutive, 288–289, 388
  nominative–accusative, 554
  past tense, 180
  shift, 41
allomorphs, 278
animacy, 42, 404, 457, 469, 554–556, 568
  and gender, 79, 201
answer particles, 57–58, 66, 116, 270, 575
antipassive, 67, 577
applicative, 67, 430, 476, 479, 558
area see linguistic area
article, 390, 519 n. 6, 529–532
  definite, 41, 139–143, 155, 162, 201, 248, 348, 391, 448
  definite–indefinite contrast, 491
  indefinite, 157
  proclitic, 264
articulation, 36, 37, 199, 200, 242, 286, 305, 388
  secondary, 524
  shift, 38
aspect, 41, 82–83, 94–95, 112, 128–129, 132, 142, 155, 162, 288, 289, 290–291, 293, 296, 315, 369, 455, 474, 556–557, 560
  double marking, 535
  punctual vs. continuous, 367
  see also adverb, imperfective, particle, perfect, perfective
associative, 153
  motion, 371, 577, 578
augmentation, 70
auxiliary-verb constructions, 370
aversive, 267

basic vocabulary, 70 n. 15, 267, 276, 286, 356
benefactive, 170, 310
bilingual(ism), 8–9, 19, 21, 25–26, 27 n. 6, 31–32, 39–40, 53, 67, 68, 69 n. 2, 383
  Arabic (Khuzistani)–Persian, 137
  Chinese–Vietnamese, 357
  Hup–Tukano, 551–553
  Kriol–English, 367
  Kriol–Jaminjung, 383
  Kurdish–Turkish, 175, 180
  Manange–Nepali, 287, 297
  Mosetén–Spanish, 567
  Nahuatl–Spanish, 403
  Otomi–Spanish, 435, 437, 441
  Paraguayan Guaraní–Spanish, 523–524, 527, 549 n. 4
  Quichua–Spanish, 481–483, 492, 494, 498, 505, 514, 517
  Rapanui–Spanish, 387, 394
  Romani–Hungarian, 276
  Rumungro–Hungarian, 261
  Rumungro–Slovak, 280
  Yaqui–Spanish, 419–420
body parts, 75–76, 101, 256, 276, 527

calques, 16, 45, 53, 54, 55, 147, 171, 176, 192, 209, 233, 237, 278, 279, 293, 324, 355, 380, 406, 407, 414, 456, 457, 468, 469, 477, 478, 501, 516, 526, 536, 537, 548, 558, 559, 561
cardinal numerals, 51, 84, 269, 270, 271
case (marking), 40–41, 42, 69, 93, 95, 153–154, 162, 169–170, 205, 267, 281 n. 6, 288–289, 367, 374, 465, 469, 473, 486–490, 514, 532–535
  doubling, 154
  versus enclitics, 369
  extension, 42, 232–233
  loss, 42–43, 248, 271
  and adposition, 44, 172, 370
causal relation, 20, 55–56, 131, 219, 293, 471, 516, 519 n. 16, 525
causative, 47, 83, 98, 100–101, 146, 173, 266, 268–269, 296, 298 n. 7, 374, 537, 540, 559
  double causative marking, 269
  of transitive and intransitive loanverbs, 539, 540
  see also prefix
circumpositions, 169–172
circumstantial, 193
classifiers, 42, 43, 44, 70, 292, 297, 347, 348, 357 n. 10, 358 n. 11, 360, 371, 469, 555
clitic, 70 n. 14, 110, 154, 168–170, 173, 177, 202, 203, 205, 217, 220, 264, 238, 309, 325, 375, 382, 425, 441–442, 445, 457, 537–538, 553–554, 570, 572, 573, 574, 579 n. 9, 10, 11, 12
  deictic, 86
  and free morpheme, 309–310
  intensifying, 252, 559
  loss of clitic pronouns, 168
  versus suffix, 369
  versus verb, 178
cliticization, 178, 448, 501, 519 n. 4
clusivity, 53, 62, 206, 460, 495, 504
  loss, 445–446
coda, 235, 292, 307, 482, 525
code-switching, 16, 177, 180, 253, 303, 367, 396, 397, 459–460, 504, 510–511, 547
  vs. borrowing, 330–340, 364, 372, 380, 381, 383, 391, 392, 393–394, 397–398, 399, 424, 430, 461 n. 3, 488–489, 507, 518, 518 n. 2, 533, 535, 569–570
  and discourse markers, 545
  dropping case markers, 487
  and language loss, 338
  insertional, 9, 331, 364, 380, 518 n. 1, 532
  linking, 544
comitative, 42, 119, 169, 310, 369, 430, 479, 486, 500, 554, 555, 577
comparative, 41, 59, 172, 239, 519 n. 14, 509–511, 544
  subject, 510
  synthetic, 266
  word order, 220, 429, 510
  see also adjective, conjunction, negation, preposition, subordinator, suppletion
complement, 95, 97–98, 99, 111, 113,
  145, 168, 178–179, 209, 350, 414, 457, 467, 501–502
  factive, 208
  factual, 144, 223
  factual vs. non-factual, 145, 223
  of modal predicates, 274–275
  see also word order
complementation, 71, 179, 183, 577
complementizer, 97, 98, 102, 110, 112, 114, 119, 145, 148, 162, 208, 223, 224, 275, 279, 319, 334, 337, 471, 571, 577, 579
  factual, 54–56, 144, 145, 158
  non-factual, 274
completive, 44–45, 474, 556
complex predicates, 182, 365, 370, 569, 577
compounding, 101, 174, 276, 290, 324, 354–355, 440, 443, 492, 561
  adverbial, 85
  connective, 544
  conjunction, 517
  numeral, 292, 314, 424
  past, 204–205
  perfect, 203
  pronoun, 271, 279, 502
  verb, 100, 142, 296, 356, 553, 557, 559
  vowel, 202
concessive, 55, 56, 66, 144, 148, 208, 224, 252, 255, 517
concord, 127, 130
conditionals, 66, 93–94, 142, 156, 208, 224, 330, 336, 374, 425, 516–517, 573
  particle (marker, subordinator), 32–33, 45, 55–56, 173, 219, 224, 516, 572, 579 n. 12
  word order, 516
  see also conditional
conjunction(s), 54–56, 62, 70 n. 14, 130–131, 193, 176, 208, 218, 224, 295, 351–352, 424, 428–429, 430, 440, 502, 511, 514
  adversative, 218, 273, 278–279, 451, 516
  causal, 146, 374, 516
  comparative, 426
  concessive, 144, 252
  conditional, 330, 334–337
  contrastive, 339
  coordinating, 33, 55, 179, 206, 224, 239, 240, 255, 273, 318, 559, 570–571, 574, 577
  grammaticalized, 41
  phrasal, 20
  subordinating, 55–56, 158, 160, 177, 192, 207, 219, 222, 239, 252, 374, 397, 471–472, 476, 501, 517, 577
  temporal, 224, 428
  see also compounding, coordinator
connectors see conjunctions
consonant, 37, 65, 76, 92–93, 114, 152, 216, 230, 234, 236, 287, 304–306, 345, 387, 420, 438, 482
  alternation, 524, 527–528
  alveolar opposition, 286
  Begadkephat, 186–187
  clusters, 18, 76–77, 152, 167, 216, 232, 241, 247–248, 307, 388, 438, 439, 482
  coda, 292
  deletion, 167
  dental, 139
  gemination, 96–97, 139, 152, 188, 212, 216, 263–264
  gradation, 236
  harmony, 439
  interdental, 198
  palatalization, 93, 230, 354
  pharyngalized, 38, 152, 199
  prenasalized, 125
  retroflex, 346
constituent order see word order
construct state, 41, 140–141
converbs, 41, 61, 95, 96, 102, 173, 179, 217, 222, 293, 295
coordination, 55, 61, 85, 179, 183, 193, 254, 255, 294, 295, 409, 451, 457, 543, 546, 577, 579
  see also conjunction(s)
copula, 32, 46, 60, 94, 95, 102, 104, 119, 146, 156, 159, 163, 173, 177–179, 202, 217, 226, 272, 273, 297, 321, 325, 393, 457, 460, 468, 511–513, 516, 547, 560, 562
  difference main and subordinate clause, 99, 364
  word order, 220–221
  see also negation
correlative particle, 144
counterfactual, 93, 557
  see also irrealis
coverbs, 48, 269, 371
cross-reference, 96, 119, 120, 505
current contact, 6, 39, 246, 247

dative, 94, 95, 113–114, 120, 153, 168, 170, 217, 232, 233, 248, 267, 297
days of the week, 59, 65, 157, 192, 334, 338, 348, 380, 507, 545, 560
decimal, 50, 291, 292, 338, 348, 393, 470, 569
definite article, 35, 41, 43, 139, 140–141, 142, 148, 155, 157, 162, 201, 248, 264, 277, 390, 391, 448
definiteness, 42, 43, 44, 70, 115, 116, 138, 139–142, 148, 278, 348, 391, 491, 519 n. 6, 529, 554
  agreement definiteness
  relative clauses, 86
  and specificity, 391
deictic element, 53–54, 58, 66, 68, 86, 155, 158, 161, 163, 251, 252, 255, 256, 269, 271, 277, 278, 381, 473, 506–507, 529, 530
  development into relativizers, 130
dependent marking, 110
determiner, 154, 202, 490, 529–532, 541–542
  definiteness as relative-clause marker, 155
  negative, 542
diminutive, 43–44, 188, 217, 233–236, 249, 390–391, 469, 484–485, 490–492
direct object, 221, 230, 442, 486, 487, 504, 512, 532–533
  enclitic pronouns as, 325
  de-topicalization, 226
  suffix, 554
  topicalizing, 147
direction, 98, 127, 271, 349, 351, 356, 416
discourse
  procedural, 432
discourse markers/particles, 5, 20, 21, 25, 57, 58, 62, 66, 67, 70, 130–133, 143, 158, 176, 238–239, 241, 252–253, 270, 377–379, 382, 447, 451–452, 455, 460, 470, 475, 506–509, 541, 546, 547, 562, 569, 570, 574
  emotive, 394
  hesitation, 238, 410, 425, 428
  resumptive, 545
  sequential, 375
disjunctions, 20, 55, 114, 119, 408, 451, 457, 470, 506, 559, 570, 579
donor language, 16, 509
dual, 445–446

embedding, 254, 577
ergative, 41, 288–289, 290, 296–297, 369, 388
  past-tense inflection, 202, 204–205
  split ergative, 168, 288
  loss, 152–153
evidentials, 173, 174, 426, 496, 513, 514, 519 n. 15, 557–558, 560, 578 n. 7
exclusive see clusivity
existentials, 156, 162, 171, 226, 389
  verbs, 46, 142–143
experiencer, 101

focus particles, 56, 57, 115, 143, 158, 270, 272, 319, 374–376
frequentative, 269, 277
fricatives, 93, 124–125, 186, 212, 232, 247, 263, 305–306, 330, 368, 420, 438, 439
future tense/aspect, 44, 46, 66, 128, 155, 173, 189, 289, 315, 367, 391, 407, 414–415, 456, 530, 557
  analytic, 236–237

gemination see consonants
gender, 42–44, 79, 99, 141, 157–158, 169, 185, 189–190, 200–201, 239–240, 249, 271, 279, 281 n. 6, 311–312, 313, 390, 391, 444, 469, 542, 549 n. 7, 555, 568
genitive, 84, 138, 140, 202, 217, 233, 234, 309–310
  loss, 248
goal, 113, 114

habitual, 44, 46, 128, 129, 142, 155, 311, 498, 556
harmony, 36–37
head marking, 110
hearsay, 578 n. 7

idioms, 100–101, 147–148, 218, 276, 320, 321, 469, 518, 546–547
imperative, 100
imperfective, 95, 371
inceptive, 155, 556
inclusive see clusivity
incorporation, 47, 527–528
indicative, 155, 207, 223, 376
  vs. subjunctive, 210–211
indirect object, 170, 221, 226, 533, 554
inferred certainty, 128, 551, 557–558
infinitive, 48, 127, 157, 161, 173, 205–206, 217, 223, 268, 277, 406, 445
  vs. nominals, 392
  loss of modal ~, 217
  subjective ~, 274–275, 279
instrumental, 554
intention(al), 94–95, 156
interrogative clause, 511
  embedded, 273, 502
  intonation, 458, 546
  polar, 514–515
  see also particles
interrogative pronoun, 54, 157, 252, 270, 278, 292, 318, 322, 396–397, 414, 453, 454, 505
  as conjunction, 224
  comparison, 230
  creating subordinators, 219
  and indefinites, 114
  and negative indefinites, 237–238, 241
  and relative marker, 223, 255, 264, 275, 413, 511, 513
intonation see interrogative
irrealis, 128, 288–289, 557
iterative, 45, 129, 155, 250, 535

kinship terms, 111, 119, 304–305, 311–312, 382, 415

lexicon, 24, 47, 76–77, 161–162, 175, 209–212, 215, 225, 231, 245, 256, 275–276, 295–297, 344, 353, 387, 398, 415, 447, 458–460, 485, 507, 517–518, 562
  across, 270, 351
  addition, 408, 457, 470, 471, 570
  around, 42
  beside/next to, 153, 176
  between, 176
  have, 46, 95, 117, 118, 156, 171, 456, 535, 536–537
linguistic area, 6, 16, 22–24, 43, 69, 168, 175, 226, 272, 286, 288, 578
  Amazonia, 567
  Australia, 364, 366, 382
  Balkans, 20, 226
  Caucasus–Mideastern–South Asian, 47, 49
  Caucasus–Anatolia–South Asia, 49
  Ethiopian, 91–102
  Gurage, 20
  Meso-America, 24, 403
  Papua, 329
  South Asia, 286, 288
  Vaupés, 24, 551–552
  West Africa, 107–122
  Western Asia, 264
loanverbs, 250, 267–268, 277, 296, 568–569
locative predicate, 389

modality, 45–46, 55, 56, 57, 66, 117, 156, 162, 217, 269, 315, 334, 340, 356, 367, 456
  alethic, 494
  complement of modal predicate, 274–275
  deontic, 175, 206, 288
  epistemic, 46, 175, 223, 494, 557
  modal complements, 161, 222–223
  modal verbs, 113, 296, 391, 493, 495–496
  non-factual complementizer, 56, 158, 274
  through periphrasis, 391
  see also adverb, counterfactual, irrealis, negation, particle
morphological type
  agglutinative, 33, 41, 110, 123, 308, 369, 467, 479, 484, 553
  analytic, 479
  isolating, 40, 47, 48, 49, 110, 308, 368, 554
  polysynthetic, 40, 403, 414, 525
  synthetic, 467, 479, 526
  type change, 33, 40, 69 n. 7, 132, 308, 416, 444, 479
motion, 371–372, 391
movement

narrative, 115, 128, 293, 336, 428, 430, 557
nasal harmony, 524
necessity see modality
negation, 19, 58, 117, 119–120, 129, 152, 160, 206, 271, 291, 338, 341 n. 1, 350, 376–377, 382, 397, 416, 560, 562, 576
  agreement, 273, 279
  circumfix, 546
  comparison, 478
  constituent negation, 376
  contrastive, 339
  converb, 96
  double negation, 237, 254–255, 339
  emphasis, 20, 242 n. 4, 559
  negative copula, 129
  negative polarity item, 376
  (inherently) negative verbs, 129, 291, 350
  subordinate clause, 255
  see also particles, prefix, pronouns
nominalization, 314, 493
  agent, 492, 510
  causation, 298 n. 4
  embedded propositions, 224
  location
  loss, 516–517
  modality, 175
  relative clause, 179, 513
  vs. subordination, 501–502, 510, 511
non-verbal predications, 159–160, 163, 272–273, 416, 511, 512–513, 535, 536–537
noun incorporation, 174, 527–528
number, 86, 110, 111–112, 143, 157–158, 202, 312, 404, 528–529
  absence of, 79, 312–313, 499, 529
  in superlative, 239
  see also agreement
number system
  quintenary, 50, 569
  vigesimal, 292, 403, 408, 424, 470, 471
numerals, 20, 27 n. 4, 50–53, 62, 65–66, 84, 130, 157, 162, 176, 188, 191, 231, 269, 270, 271, 279, 291–292, 312–314, 338, 348, 381, 393, 408, 424, 487–488, 502–504, 528, 541, 559, 562, 569

object see direct object, indirect object
obligation see modality
onset, 167, 287, 307, 525
  consonant clusters, 482
  nasal, 546
optative, 189, 222, 226, 269, 557

palatalization, 38, 93, 231, 241, 346, 354
  see also consonant
participle, 80–81, 142–143
  perfective, 174
  passive, 203, 268, 277, 467
  secondary, 173
  source for tense, 189
particles, 35, 66, 84, 110, 206–207, 218, 269, 387, 492–493
  agreement, 116
  additive, 373
  adverbial, 20, 506, 559
  attributive, 139
  collective, 22, 312
  comparative, 172, 206, 253, 426, 440, 477
  complement introducing, 414
  connective, 212, 373–374
  contrast, 557
  coordinating, 85
  deictic, 251, 256, 269
  discourse, 116, 175–176, 225, 365, 377–379, 425, 428, 430
  distributive, 269
  emphasis, 559
  focus, 20, 56–57, 115, 143, 144, 158, 252, 269–270, 272, 319, 374–375, 511, 512, 574
  genitive, 202
  imperative, 100
  inclusive, 206
  interrogative, 57, 176, 218, 458, 546
  limitative, 426
  modal, 20, 57, 175, 206, 250–251, 269–270, 315, 356, 494
  negative, 61, 66, 160, 176, 206, 237, 376–377
  object, 201, 440, 448
  optative, 269
  phasal, 206
  perfective, 315, 395, 535
  possessive, 395
  progressive, 45, 395
  question–answer, 57–58, 66
  relative, 61, 86, 87, 192–193, 206, 207, 239
  repetition, 56
  requestive, 219
  sentence, 358
  subordinating, 85, 207, 208, 382
  superlative, 59, 172, 231, 239–240, 241, 253
  tense, 315
  vocative, 116, 269, 311
  see also topicalizer, utterance
partitive, 172, 233, 450, 549 n. 6
passive, 47, 173, 191, 203, 315, 350, 354, 388–389, 392, 436, 470, 479, 558
  impersonal, 113, 537
  and reflexive, 538
  see also participle, prefix, subject
patient, 203, 205, 532–533
perfect, 142, 189, 442
  compound, 203
  experiential, 45, 95
  past, 368
perfective, 81, 82, 96–97, 128–129, 174–175, 281 n. 4, 288, 289, 290, 395, 535
  see also particle
person, 82, 241, 484, 499
  double marking, 506
  see also possessive
pitch, 286–288
  pitch accent, 553
pluperfect, 142, 143, 146, 148
plural
  associative, 112, 266, 279, 281 n. 3
  double marking, 155, 488–489, 528
  see also number
place of articulation, 36, 38, 305, 388
possessive constructions, 41, 43, 44, 85, 100, 153, 172, 202, 217, 226, 248, 309, 310, 314–315, 390, 404, 440, 450, 457, 526–528, 536
  alienable–inalienable, 389–390, 486, 527, 555
  loss of affixes, 404–405, 484
  possession verbs, 47–47
  possessive marker, 94–95, 155, 348, 389, 417, 526, 555
  predicative possession, 118, 171, 156
  word order, 43, 60, 94, 117, 118, 154, 159, 171, 219, 253, 395, 404, 406, 455, 530
  see also lexicon: have, particle, prefix, pronoun, subject
potential, 371, 407
prefix, 41, 110, 112, 127, 168, 173, 264, 281 n. 5, 519 n. 4, 554
  active, 308, 315
  affiliation, 192
  artificiality, 265
  aspectual, 250
  causative, 396, 539, 540
  deictic, 53, 270–271, 278
  indefiniteness, 278
  likeness, 319
  literary, 310
  locative, 127
  loss, 405
  mood, 369
  negation, 129, 237–238, 270, 291
  noun class, 126–127
  object, 407
  from particle, 45
  passive, 308, 315, 537–538
  person, 82
  possessor, 404–405, 417 n. 5
  potential, 371
  progressive, 45, 189
  pronominal, 264, 278, 536
  quantifying, 271
  reflexive, 537–538, 559
  subject, 331–332, 333, 406–407, 526
  superlative, 59, 266, 278
  verbalizer, 331, 332
preposition, 42–43, 153–154, 159, 160, 229–230, 309–311, 440–441, 446, 447–451, 458, 473, 476
  used as conjunction, 544
  comitative, 369
  comparison, 172, 220
  coordinative, 85
  doubling, 370
  emergence, 264, 267, 281 n. 8, 404–406, 576
  genitive, 84–85
  locative, 127, 176, 205, 267, 351, 369, 370, 405, 477–478
  oblique, 369, 533
  possession, 248, 239, 315, 390, 526–527
  reinforcer, 224
  and subordinators, 412–413, 452
  temporal, 426
privative, 53, 116, 449
proclitic see clitic
pro drop, 81, 272
probability see modality
progressive, 129, 189–190, 205, 372, 395
  emergence, 44, 45, 110, 112, 455
  see also prefix
pronouns, 35, 43, 53–54, 62, 70 n. 14, 161, 176, 189, 277, 349, 372, 415, 547, 562
  attributive demonstrative, 390
  bound, 64, 349
  compound, 502
  composite, 113, 413, 502
  demonstrative, 453
  emphatic, 270, 454
  genitive see possessive
  impersonal, 501
  indefinite, 157, 251, 470, 541–542, 570
  interrogative, 252, 292, 413, 454
    as relative, 255, 413, 453, 511, 513
  negative indefinite, 237–238, 270, 570
  oblique case forms, 281 n. 6
  optionality, 81
  personal, 22, 53, 68, 111, 152, 168, 200, 205, 271, 310, 312, 389
    honorific, 316–317
  possessive, 85, 253, 390, 440, 441, 487
  quantifiers used as, 504–505
  reciprocal, 53–54, 64, 157, 271, 279, 501, 524
  reflexive, 53, 64, 84, 113, 191, 252, 270, 501
  relative, 86, 130, 239, 255, 305, 398, 412–413, 453, 454, 501, 502, 511, 513–514
  resumptive, 53, 157, 161, 331
  subject, 81–82, 425
  see also adjective, compounding, interrogative
prosody, 18, 26 n. 3, 38–39, 152, 365, 368, 388, 553
purposive, 374, 493–494, 519 n. 8

quantifiers, 5, 85, 157, 269, 270, 381, 404, 405, 408, 411, 470, 502–504, 528, 541, 569
questions see interrogative
quotative, 101, 493, 496, 514, 545

realis see modality
reason, 98, 101, 119, 224, 273, 319
  see also causal
recipient, 68, 96, 267, 354, 468, 473, 532
recipient language, 16–17, 37–38, 39–40, 62, 67, 509
reciprocal, 53, 54, 64, 157, 191, 483, 499–501, 519, 538, 539, 558
  see also pronoun
reduplication, 40, 110–111, 312, 356, 389, 479, 489–490, 526, 573
reflexive, 47, 53–54, 113, 191–192, 252, 277, 352, 499–501, 539, 558–559
  intransitive interpretation, 496
  and (impersonal) passive, 537–538
  see also pronoun
relative clause, 60, 61, 81, 86–87, 99, 115, 145, 161, 179, 207, 223, 230, 255, 275, 364, 398, 412–414, 440–441, 453, 544–545
  asyndetic–syndetic, 192–193
  causal, 98
  headless, 414, 454, 560
  locative, 321–322
  nominalizer, 531
  relative concord, 130
  subject, 80–81, 86, 98
  see also particles, pronouns, word order
relativizer see pronouns
repetition, 56–57, 270, 456, 573
reportative, 493, 496, 514, 545
reversal, 573
  figure–ground, 117

sampling, 14, 33, 68
semantic functions, 143, 448
serial verbs, 110, 117, 118, 537
situation-bound expressions, 320
source language see donor language
spatial expressions, 85, 153, 256, 267, 269–270, 279, 311, 370, 381, 403, 404, 440, 486, 473, 529, 534, 575
split ergative see ergative
Sprachbund see linguistic area
stative/active, 80, 82, 153, 271, 442, 443, 457
stem modification, 371
stops, 304
  alveo-palatal, 368
  dental, 263, 420
  distinctions, 167
  vs. fricative, 212
  glottal, 187, 304–305, 326 n. 6, 420
  preaspirated, 229
  prenasalization, 553
  velar, 420
  voiced, 199, 403, 553
  voiceless, 242 n. 2, 263
  voicing, 403
  weakening, 200
stress, 18, 82–83, 187, 188, 200, 241, 247–248, 264, 287–288, 404, 438, 439, 482, 524, 525, 529–530, 553
subject, 81, 82, 95, 96, 107, 117, 127, 202, 223, 275, 331–333, 368–369, 349, 377, 393, 406, 411, 493, 494, 512–513, 514, 516, 536
  agreement, 127, 274, 499
  collective, 312
  in comparative, 510
  of complement, 158
  impersonal, 341 n. 2
  nominalized clause, 531
  partitive, 233
  in passive, 467–468
  in possessives, 117, 180
  see also passive, prefix, pro-drop, relative clause, word order
subjunctive, 17, 45, 145, 155–157, 161, 173, 175, 189, 274–275, 279, 472, 493–494, 519
  see also subordination
subordination, 61, 66, 94–95, 99, 144, 177, 224–226, 254, 272–275, 452–453, 457, 493, 525–526, 536, 546
  conjunctive, 224
  vs. nominalization, 501–502
  see also copula, negation, relative clause
subordinator, 55–56, 66, 223, 255, 269, 273–274, 425, 427, 430, 543, 544–545, 579 n. 16
  in comparatives, 546
  see also conjunctions, particles
suffix, 23, 43, 44, 78–83, 93–96, 99, 102, 110, 111, 127, 130, 153, 155, 168–174, 177, 181, 189–192, 201, 204, 206, 233–242, 248, 249, 265–270, 277, 281, 290, 293, 310, 311, 355, 369, 370–372, 404–406, 421–423, 426, 430, 440–448, 454, 465, 467, 469, 473, 484, 493, 496, 505, 514–519, 526, 540, 554, 558–560
superlative, 232, 239–240, 241
  double marking, 240
  see also particle, suppletion
suppletion, 371
  comparative, 158
  gender, 79
  imperative, 100
  ordinals, 52–53
  superlative, 158
  tense–aspect, 369
suprasegmental phonology, 482, 524
syllable structure, 18, 78, 80, 82, 188, 213 n. 4, 241, 264, 292, 307–308, 346, 354, 346, 387, 482, 525

tense see alignment, ergative, future, participle, particle, suppletion
time expressions, 58, 535, 544, 545
  see also days of the week, time of the day
times of the day, 59, 65, 157, 192, 380, 507
tone, 18, 38, 77, 82–83, 286–288, 298 n. 6, 345, 350, 354, 553
  class, 83
  lexical, 80
  merger, 286
  pretone position, 247
  see also adposition
topic
  change, 471, 571
  development, 131, 221
  de-topicalization, 221, 226
  discontinuous, 579 n. 9, 11
  fronting, 395
  left dislocated, 115
  marker, 238, 321, 485, 507
  position, 331, 333
  prominence, 485
  shift, 131, 375, 378
  switch, 378
  topic–comment, 352, 358 n. 14
  topicalization, 147, 309
  transition see shift
topicalizer, 490–491, 508, 511, 512–513, 519 n. 6
transitivity, 96, 202–204, 218, 288–289, 296–297, 368, 372, 407, 500, 538
  agent marking, 369
  ditransitive, 96
  marker, 156, 372
  participant marking, 532
  see also causative, ergative, reflexive, word order

utterance, 38, 48, 273
  modifier, 494
  planning, 20, 33, 35, 163, 221, 226
  utterance-initial particle, 116
  utterance-final particle, 116
  see also adverb

valency, 47, 48, 372
  changes, 281 n. 4, 290, 499–501, 535, 536, 537–539, 558
  differentiation, 218
  marking, 49, 113, 562
  see also transitivity
verb compounding see compounding
verb serialization see serial verbs
verbalization, 48, 58, 181 n. 5, 268, 331–333, 423
  ideophones, 99, 100
  infinitives, 422
  interjections, 251
  nouns, 142, 268, 558
  see also prefix
verbness, 49, 371, 392, 569, 578 n. 6
vocative, 200, 249, 305, 427–428
  see also particle
voice, 47, 350, 423
  marking, 113, 562
  see also passive, valency
voicing, 232, 247, 286–287, 403, 482
vowel, 36–38, 65, 78, 80, 82, 124, 152, 167, 187–188, 247, 263, 306–307, 330, 387, 388, 420, 436, 438, 466–467
  adaptation, 470
  amplitude, 288
  alternation, 168
  compound vowel, 202
  diphthongization, 445
  epenthesis, 167
  gliding, 305
  harmony, 167, 169, 174, 263, 439
  insertion, 388, 483
  intensity, 288
  length, 77, 80, 117, 124, 213 n. 4, 216, 241, 263, 403–404, 420
  lowering, 77
  semivowels, 305

word classes, 48–49, 61, 308
word order, 19, 35, 81, 86–87, 99, 153, 158–159, 207–209, 219–222, 226, 248, 254, 272, 294, 320–321, 333, 335, 368, 377, 411, 454, 465, 487, 501, 504, 511–517
  non-predominant
  non-verbal predications, 512–513
  noun phrases, 84, 138–139, 146, 253–254, 321, 370, 528
  change, 15, 19, 43, 60, 141, 147, 177–179, 219, 230, 390, 429, 502, 546
  conditionals, 516
  direct and indirect object, 221
  inversion, 516
  pragmatic conditioning, 365, 379
  in relative clause, 255
  relative clause–head, 60, 193, 230, 397, 511, 513–514
  subordinate clauses, 177
  subject–predicate, 159, 321
  and transitivity, 86, 395
  see also comparative, copula, possessive constructions
written language, 19, 123, 161, 245, 246, 247, 302, 303, 322–323, 325 n. 3, 344, 348, 354, 357 n. 2, 367, 465, 567
Index of authors
Abondolo, Daniel, 280
Acharya, Jayaraj, 287, 292, 296
Adelaar, K. Alexander, 329
Adelaar, Willem, 329
Afanasjeva, Nina Jelisejevna, 229
Aikhenvald, Alexandra Y., 3, 26, 32, 36, 55, 551, 552, 561–563
Albert, Ruth, 250
Alderetes, Jorge R., 499
Alidou, Ousseïna, 76, 88
Álvarez, Albert González, 430
Alves, Mark J., 345, 347, 350, 352, 358
Ameka, Felix K., 110–114, 117
Andrés de Jesús, Severiano, 439–442
Andrews, Henrietta, 442
Arnold, Werner, 186, 188
Aronson, Howard I., 250

Backus, Ad, 9
Bakker, Dik, 17, 437, 451, 549
Bakker, Peter, 10, 70
Barker, Milton E., 353
Bartholomew, Doris, 442
Bavin, Edith, 369
Baxter, Alan N., 315
Becker, Alton L., 309
Behnstedt, Peter, 188
Bender, M. Lionel, 75
Benedict, Paul K., 349
Bernus, Edmond, 75
Bernus, Suzanne, 75
Bickel, Balthasar, 288, 289, 297, 298
Bin-Nun, Jechiel, 256, 257
Bisang, Walter, 91, 92, 102
Blench, Roger M., 107
Blommaert, Jan, 123
Blust, Robert, 326
Bolonyai, Agnes, 16
Boretzky, Norbert, 261, 274
Bostoen, Koen, 125–129
Bradley, David R., 283
Braukämper, Ulrich, 92
Brody, Jill, 489
Bryan, Margaret A., 107
Bubeník, Vít, 269
Buelna, Eustaquio, 423
Buitimea Valenzuela, Crescencio, 431
Bulut, Christiane, 166–168, 172–175, 179–181
Burssens, Amaat, 125
Büttner, Thomas, 481

Caferoğlu, Ahmet, 215
Caillavet, Chantal, 481, 483
Campbell, Lyle, 24, 31, 169, 403, 489, 502
Canger, Una, 404
Capistran, Alejandra, 466
Carlos Silva Encinas, Manuel, 431
Carochi, Horacio, S. J.
Casad, Eugene H., 422–423
Cerrón-Palomino, Rodolfo, 484
Chamoreau, Claudine, 23, 466–468, 473, 476
Charpentier, Jean-Michel, 366
Christiansen, Niels, 83
Christiansen, Regula, 83
Chyet, Michael, 171, 206
Cole, Peter, 482, 487–490, 493–493, 506, 509, 519
Comrie, Bernard, 2, 94, 431, 461
Correll, Christoph, 189, 193–193
Crass, Joachim, 20, 91–92, 99–102

Dakubu, M. E. Kropp, 120
Đào, Duy Anh, 358
Darjowidjojo, Soenjono, 339
de Casparis, J. G., 326
de Rhodes, Alexandre, 344, 346
De Rooij, Vincent A., 22, 131, 132
Dedrick, John M., 422–423
DeFrancis, John, 344
DeLancey, Scott, 288
Diffloth, Gérard, 345
Dimmendaal, Gerrit J., 117
Dixon, R. M. W., 3, 55, 371
Dorleijn, Margreet, 166, 168, 172–174
Dozier, Edward P., 431
Dryer, Matthew S., 2
Duval, R., 213

Ebert, Rolf, 254
Echegoyen, Artemisa, 442
Ecker, Lawrence, 442
Eggers, Eckhard, 249, 252–254, 257
Elšík, Viktor, 3, 24, 33–34, 42, 45, 52–54, 58, 59, 67, 261, 262, 265, 268, 281
End′ukovskij, A. G., 235
Epps, Patience, 20, 25, 551–552, 555–561
Escalante, Fernando, 431
Estrada Fernández, Zarina, 420, 423, 430, 431
Everett, Daniel L., 317

Faber, Alice, 91
Fabian, Johannes, 123
Fauchois, Anne, 483, 487, 490, 491, 512–514, 518, 545
Fenyvesi, Anna, 280, 281
Ferlus, Michel, 357
Fetter, Bruce, 124
Field, Fredric, 3, 33, 43, 431
Fischer, Steven Roger, 389, 399
Fleischman, Suzanne, 94
Ford, Kevin C., 120
Friedman, Victor A.

Gage, William W., 345
Galand, Lionel, 86
Gardner-Chloros, Penelope, 330
Gemalmaz, Efrasiyap, 166, 179
Genetti, Carol, 290, 297
Gilberti, Maturino, 476
Givón, T., 55
Gołąb, Zbigniew, 16
Gómez-Rendón, Jorge, 484, 485, 498, 509, 511, 518, 519, 533, 549
Gonda, J., 311
Graber, Philip, 375
Granda, German de, 484
Grégoire, Claire, 127
Gregores, E., 524, 525, 530, 535, 546
Grijns, C. D., 326
Grotzfeld, Heinz, 193
Guasch, Antonio, 527
Guerrero, Alonso, 404, 412
Gysels, Marjolein, 133

Haboud, Marleen, 481
Haig, Geoffrey, 25, 166–168, 174
Halász, Ignácz, 238
Hale, Kenneth L., 431
Harley, Matthew, 114, 116
Harris, Alice, 169
Harris, John W., 366, 367
Haspelmath, Martin, 2, 93, 298, 458, 467
Haudricourt, André G., 345
Haugen, Einar, 3, 16, 32
Hayward, Richard, 100–101
Heath, Jeffrey, 15, 32, 86–88, 365, 382
Heine, Berndt, 16, 17, 92–93, 98, 107
Hekking, Ewald, 17, 436–442, 451, 549
Hengeveld, Kees, 485
Hernández Cruz, Luis, 439
Hess, H. Harwood, 442
Hildebrandt, Kristine A., 26, 242, 285–290, 293–298
Hill, Jane, 502
Hill, Kenneth, 502
Hobermann, R. D.
Hopkins, S., 213
Hoshi, Michiyo, 290, 293, 298
Hübschmannová, Milena, 261, 262, 269, 280
Hudson, Joyce, 92, 93, 367
Hyman, Larry, 117

Ingham, Bruce, 137
Isaacs, Miriam, 247
Itkonen, Toivo Immanuel, 233

Jackson, Jean, 551
Jacobs, Neil, 245, 247
Jastrow, Otto, 167
Jiménez Moreno, Wigberto, 436
Johanson, Lars, 3, 16, 363
Johnson, Jean Bassett, 431
Jones, Robert B., 346
Jones, Russell, 326

Kaarhus, Randi, 486
Kabamba, Mbikay, 123
Kahn, Margaret, 167
Kamba Muzenga, 129
Kane, Thomas Leiper, 99, 101
Kapanga, André Mwamba, 123
Karttunen, Frances, 423
Kashoki, Mubanga E., 125
Katz, Dovid, 252
Kaufman, Terrence, 3, 16, 31, 32, 61, 68, 343, 382
Keesing, Roger M., 16
Kenesei, István, 280, 281
Kert, Georgij Martynovič, 231–231
Khan, G., 24, 213
Klaus, Väino, 231
König, Ekkehard, 374
Korhonen, Mikko, 234
Kossmann, Maarten, 82
Kulonen, Ulla-Maija, 231
Kuruč, Rimma Dmitrijevna, 229, 237, 239
Kuteva, Tania, 16, 17, 92, 93, 98

La Polla, Randy J., 120
Lastra de Suárez, Yolanda, 439, 442, 446
Leslau, Wolf, 97
Lim, Sonny, 315
Lindenfeld, Jacqueline, 427, 431, 432
Lockhart, James, 423
Lötzsch, Ronald, 248, 253–257
Loukotka, Cestmir, 551
Lowenstein, Steven, 246

Macalister, R. A. S., 151, 159
McConvell, Patrick, 367
Macdonald, R. Ross, 339
McGregor, William B., 369–371
MacKenzie, David, 167, 172, 199–202, 206, 207, 211
McWhorter, John H., 309
Maddieson, Ian, 107
Majtinskaja, K. E., 23, 241
Makihara, Miki, 388, 393–398
Martins, Silvana A., 551, 552
Masica, Colin P., 286–289
Maspero, Henri, 357
Matisoff, James A., 350
Matras, Yaron, 3, 6–10, 15–17, 20, 21, 24–27, 31–34, 41–46, 49, 52–59, 67–69, 151, 152, 163, 166, 227, 264–269, 272–276, 281, 289, 382, 421, 451, 489, 509, 559
Mazaudon, Martine, 292
Meakins, Felicity, 369
Mečkina, Jekatarina Ivanovna, 229
Mei, Tsu-Lin, 354
Meijering, Henk D., 250
Melià, Bartomeu, 548
Menz, Astrid, 179
Meyer, Ronny, 91–92, 101
Möhlig, Wilhelm J. G.
Moravcsik, Edith, 3, 33, 43, 47
Morínigo, Marcos, 549
Mougeon, Raymond, 16
Munro, Jennifer M., 366–368
Mushin, Ilana, 425
Mutzafi, H., 213
Muysken, Pieter, 3, 32, 62, 70, 364, 437, 499, 511, 518, 519
Myers-Scotton, Carol, 16, 519

Nau, Nicole, 16
Németh, Gyula, 216
Newman, Paul, 77
Nguyễn, Đình Hoà, 347, 357
Nguyễn, Phú Phong, 350
Nguyễn, Tài Cẩn, 353, 357
Nguyễn, Văn Lợi, 345
Nicolaï, Robert, 76–76, 88–88
Noonan, Michael, 283–286, 289, 290, 298
Norman, Jerry, 356
Nugent, Paul, 107

O’Shannessy, Carmel, 369

Palancar, Enrique, 443
Payne, Doris L., 567
Polomé, Edgar C., 133
Prentice, D. J., 329
Pulleyblank, Edwin G., 358

Ramirez, Henri, 556, 558, 564
Reershemius, Gertrud, 17, 254, 256
Rhydwen, Mari, 367
Richter, Renate, 91
Rijkhoff, Jan, 2
Ring, Andrew J., 107, 108
Rogers, Clint, 285
Romaine, Suzanne, 130
Rose, Deborah, 365
Ross, Malcolm D., 3, 25, 33
Ruhlen, Merritt, 435
Ruiz de Montoya, A., 548

Sahagún, Bernardino de, 436
Sakel, Jeanette, 7–8, 15–17, 23, 31, 41, 45, 289, 421, 567–572, 578–579
Salinas Pedraza, J., 439
Salmons, Joe, 330
Sammallahti, Pekka, 237, 239
Sandefur, John, 367
Sandefur, Joy, 367
Sasse, Hans-Jürgen, 379
Savić, Jelena M., 16
Schachter, Paul, 509
Scheller, Elisabeth, 230
Schicho, Walter, 128–129, 133
Schmidt, Annette, 368–369
Schrammel, Barbara, 269
Schuller, Rudolph, 569
Schultze-Berndt, Eva, 17–19, 371, 376
Šebková, Hana, 261, 262
Shabibi, Maryam, 20, 137, 147
Shopen, Timothy, 369
Sidibé, Alimata, 75
Siegel, Jeff, 16
Sinclair Crawford, Donaldo, 439
Sinh, Vinh, 354
Siptár, Péter, 280
Snellgrove, David L., 283–285
Socin, A., 213
Sorensen, Arthur, 551
Spitaler, Anton, 192
Stadnik, Elena, 230, 231
Stassen, Leon, 2
Stenzel, Kristine, 561
Stilo, Don, 168, 181, 213
Stolz, Christel, 3, 23, 32, 33, 382
Stolz, Thomas, 3, 23, 32, 33, 54, 55, 382
Struck, R., 107
Suárez, C., 524, 525, 530, 535, 546
Suárez, Jorge A., 439
Szabó, László, 232

Tadmor, Uri, 23, 458
Taylor, Keith W., 357
Thomason, Sarah G., 3, 10, 16, 31–32, 61, 68, 317, 343, 382, 460
Thurgood, Graham, 345
Timm, Erika, 245
Tompa, József, 280
Törkenczy, Miklós, 280
Tosco, Mauro, 102
Treffers-Daller, Jeanine, 16
Trinidad Sanabria, Lino, 530
Troy, Jakelin, 366
Tryon, Darrell T., 358
Tryon, Ray, 366
Tsereteli, K. G., 213
Tufan, Şirin, 20, 32, 227

Urbano, A., 436

Vago, Robert M., 280, 281
van der Auwera, Johan, 56
van Driem, George, 283
Van den Heuvel, Wilco, 329
van Hout, Roeland, 3, 32
Van Minde, Don, 339
Van Valin, Robert D. Jr, 120
Villavicencio, Frida, 466, 469
Vitale, Anthony J., 130
Voegelin, Carl F., 431
Voegelin, Florence M., 431
Voigtlander, Katherine, 442

Wang, Li, 354, 358
Weinreich, Uriel, 16, 65, 69, 247–257, 382
Weir, E. M. Helen, 551, 554, 564
Weissberg, Joseph, 257
Westermann, Dietrich, 107
Wexler, Paul, 257
Whorf, Benjamin L., 431
Wichmann, Søren, 47, 69
Williamson, Kay, 107
Winford, Donald, 32, 36, 38, 43, 47, 70, 278
Wirasno, Umar, 309
Wohlgemuth, Jan, 47, 69
Wolff, H. Ekkehard, 76, 88

Yar-shater, E., 206
Yasugi, Yoshiho, 439

Zaborski, Andrzej, 91, 102
Zwicky, Arnold, 168
