Odisho, E. Y. Pronunciation Is in The Brain, Not in The Mouth: A Cognitive Approach To Teaching It

Pronunciation is in the Brain, not
in the Mouth
A Cognitive Approach to Teaching it
Edward Y. Odisho
Gorgias Press LLC, 954 River Road, Piscataway, NJ, 08854, USA
www.gorgiaspress.com
Copyright 2014 by Gorgias Press LLC
All rights reserved under International and Pan-American Copyright
Conventions. No part of this publication may be reproduced, stored in a
retrieval system or transmitted in any form or by any means, electronic,
mechanical, photocopying, recording, scanning or otherwise without the
prior written permission of Gorgias Press LLC.
ISBN 978-1-4632-0415-0
Library of Congress Cataloging-in-Publication Data
Odisho, Edward Y.
Pronunciation is in the brain, not in the
mouth : a cognitive approach to teaching it / By
Edward Odisho.
p. cm.
ISBN 978-1-4632-0415-0
1. English language--Pronunciation. 2.
Cognitive grammar. 3. Psycholinguistics. I.
Title.
PE1137.O423 2014
421.540071--dc23
2014032984
Printed in the United States of America
TABLE OF CONTENTS
Table of Contents ...................................................................... v
Foreword................................................................................ xiii
Acknowledgments .................................................................. xix
Lists of Symbols and Phonetic Labels ...................................... xxi
Chapter 1: My Story with Languages, Pronunciation and
Accent ............................................................................... 1
1.1. Prelude ....................................................................... 1
1.2. The Evolution of my Interest in Linguistics and
Phonetics ................................................................... 2
1.2.1. Natural Language Internalization:
Language Acquisition ............................................ 3
1.2.2. A Major in English Language in a non-
English Environment ............................................. 4
1.2.3. Full Immersion as an Adult in Two
Languages ............................................................. 5
1.2.4. Phonetic and Linguistic Orientation in
Graduate Education ............................................. 10
1.2.5. Educational and Professional Challenges
in the U.S. ........................................................... 11
1.3. The Impact of my Linguistic/Professional
Background on the Evolution of an Approach ........... 12
1.3.1. Impact of my Linguistic Background ................. 12
1.3.2. Impact of my Teaching Career .......................... 14
1.4. Concluding Remarks ................................................. 20
1.4.1. Childhood Trilingualism Triggered
Interest in Languages........................................... 21
1.4.2. Learning Kurdish Triggered Interest in
Linguistics ........................................................... 21
1.4.3. Graduate Study Immersed me in
Phonetics and Linguistics .................................... 22
1.4.4. Professional Challenges in the U.S. .................... 23
Chapter 2: The Cognitive Base of Language ............................. 25
2.1. Language: A Species-Specific Code of
Communication ........................................................ 25
2.2. Language: A Cognitive-Social System Superimposed
on other Systems ...................................................... 27
2.2.1. Vocal Tract Modification ................................... 28
2.2.2. Vocal Folds (Cords) Modes ................................ 29
2.2.3. Tongue Functions and Maneuverability ............. 30
2.2.4. Lip Configurations............................................. 30
2.2.5. Cavities Resonance............................................ 31
2.3. Brain Speaking via Respiratory and Digestive
Systems .................................................................... 31
2.4. Economy in Language ............................................... 33
2.5. Conscious and Subconscious Brains ........................... 37
Chapter 3: Language in the Brain of a Child ............................ 41
3.1. Learning vs. Acquisition: Conceptual Differences ...... 41
3.2. The Brain of a Child and Language ........................... 42
3.2.1. Child Brain Formation and Maturation .............. 42
3.2.2. Formative Months and Years of Mother
Tongue ................................................................ 44
3.3. Cognitive Transition in Sound Perception and
Production ............................................................... 46
3.3.1. Transition from Phonetics to Phonology ............ 48
3.3.2. The Brain as the Commander-in-Chief of
Language Acquisition: The Cognitive Roots
of Linguistic Accent ............................................. 50
3.4. Fossilization or Psycholinguistic Insensitivity ............ 52
3.5. There is Room in the Human Brain for more than
One Language .......................................................... 54
3.6. Narrowing Down the Broad Definition of Accent....... 55
3.7. Implications for Understanding the Cognitive
Nature of Accent ...................................................... 56
Chapter 4: Linguistic Accent: Definition, Classification and
Demonstration ................................................................. 59
4.1. Introductory Remarks ............................................... 59
4.2. Intralanguage and Interlanguage Accents .................. 60
4.3. Phonetic and Phonological Accents ........................... 62
4.4. Accent: A Normal Linguistic Phenomenon................. 64
4.5. What is Meant by Accent Acquisition, Accent
Reduction and Accent Impersonation ....................... 65
4.5.1. Accent Acquisition ............................................ 66
4.5.2. Accent Reduction (Remediation) ....................... 67
4.5.3. Accent Impersonation or Faking ........................ 69
4.5.4. Intralanguage Accent Reduction and
Impersonation ..................................................... 72
4.6. Cultural Accent ......................................................... 73
4.7. Transition of Accent into Orthography ...................... 74
Chapter 5: A Broad Base for Understanding the Pedagogy of
Teaching Pronunciation ................................................... 79
5.1.1. Speech: A Cognitive Phenomenon ..................... 80
5.1.2. Pronunciation: Multisensory Access .................. 81
5.1.3. Pronunciation: Multicognitive Access ................ 82
5.1.4. Pronunciation: An Integrated and Holistic
Process ................................................................ 83
5.1.5. Pronunciation: Top-Down & Bottom-Up
Dynamics ............................................................ 84
5.1.6. Pronunciation: The Complementary
Nature of Acquisition and Learning ..................... 85
5.1.7. Pronunciation: A Natural Gift for Children ........ 86
5.1.8. Pronunciation Should be Premised on a
Triangular Base of Perception, Recognition
and Production .................................................... 87
5.1.9. Pronunciation & Psycholinguistic
Insensitivity......................................................... 89
5.1.10. Pronunciation: Understanding its
Scientific Premises............................................... 90
5.1.11. Pronunciation: Its Feedback Mechanisms ........ 91
5.1.12. Pronunciation: In Light of Multiple
Intelligences Theory ............................................ 91
5.1.13. Pronunciation: A Generative Skill.................... 92
5.1.14. Pronunciation: Interactive Involvement
of Instructors and Learners .................................. 93
Chapter 6: Ten Commandments for Teaching Effective
Pronunciation .................................................................. 95
6.1.1. Thou Shall Teach Pronunciation as a
Cognitive Undertaking ........................................ 97
6.1.2. Thou Shall Teach Children and Adults
Differently ........................................................... 97
6.1.3. Thou Shall be Qualified for Instruction in
Pronunciation...................................................... 98
6.1.4. Thou Shall Familiarize Learners with
Human Speech Production .................................. 99
6.1.5. Thou Shall Orient Learners
Psychologically ................................................... 99
6.1.6. Thou Shall Use all Sensory Modalities to
Prop up Instruction ........................................... 100
6.1.7. Thou Shall Use all Cognitive Modalities to
Prop up Instruction ........................................... 101
6.1.8. Thou Shall Transform Learners from
Listeners into Performers ................................... 101
6.1.9. Thou Shall Refrain from Insistence on a
Learner.............................................................. 102
6.1.10. Thou Shall Make the Classroom a Place
for Learning and Fun ......................................... 102
6.2. Concluding Remarks ............................................... 103
Chapter 7: Examples of Cross-Language Accent-Causing
Consonants .................................................................... 105
7.1. Introductory Remarks ............................................. 105
7.2. Outline of the English Consonant System ................ 105
7.2.1. Interdental Pair /θ, ð/ ..................................... 106
7.2.2. Approximant /r/ ............................................. 107
7.2.3. Voiceless and Voiced Alveolar Fricatives
/s/ and /z/ ........................................................ 110
7.2.4. English Plosives: /p b, t d, k g/ ....................... 111
7.2.5. Labio-Dental Fricatives /f, v/ .......................... 113
7.2.6. The Affricates /tʃ, dʒ/ ....................................... 114
Chapter 8: Examples of Cross-Language Accent-Causing
Vowels ........................................................................... 117
8.1. Salient Features in General Vowel Description ........ 117
8.2. The Vowel System of English .................................. 121
8.2.1. Simple Vowels of General American
English .............................................................. 122
8.3. Selections of Cross-Language Accent-Causing
Vowels ................................................................... 124
8.3.1. Hispanic Learners of English Vowels ............... 125
8.3.2. Arab Learners of English Vowels ..................... 128
Chapter 9: Examples of Cross-Language Accent-Causing
Suprasegmentals ............................................................ 133
9.1. A Description of the Most Salient Features of
Suprasegmentals..................................................... 133
9.2. Stress and Rhythm .................................................. 136
9.3. Tone and Intonation................................................ 140
9.4. Basic Pitch Patterns................................................. 140
9.5. Consonant Clusters.................................................. 141
Chapter 10: The Role of Articulatory Settings in
Pronunciation and Accent .............................................. 147
10.1. Introductory Remarks ........................................... 147
10.2. Salient Features of Articulatory Settings of
Selected Languages................................................. 150
10.2.1. English Articulatory Settings ......................... 150
10.2.2. Spanish Articulatory Settings ........................ 155
10.2.3. Arabic Articulatory Settings .......................... 162
10.3. Concluding Remarks ............................................. 165
Chapter 11: Principles of a Multicognitive Approach to
Teaching Pronunciation ................................................. 167
11.2. Multicognitive Principles for Teaching
Pronunciation......................................................... 169
11.2.1. Think about L2 Speech Sounds ...................... 170
11.2.2. Transition from Hearing to Listening............. 171
11.2.3. Learn Something about Speech
Production ........................................................ 171
11.2.4. Mechanical Repetition Hardly Works
with Adults' L2 Learning .................................... 172
11.2.5. Follow the Perceive, Recognize and
Produce Procedure ........................................... 173
11.2.6. Instructors' Academic and Professional
Qualifications .................................................... 176
11.2.7. Plan Instructional Connection with
Learners ............................................................ 177
11.2.8. Explain, Demonstrate and Demonstrate
Multisensorily ................................................... 177
11.2.9. Deal with Pronunciation in a Holistic
Fashion ............................................................. 178
11.2.10. Consider both Top-Down and Bottom-
Up Perspectives ................................................. 179
11.2.11. Do not Confuse Memorization with
Retention .......................................................... 179
11.2.12. Deal with Pronunciation as a
Generative Skill ................................................. 181
Chapter 12: Principles of Multisensory Approach to Teaching
Pronunciation ................................................................ 183
12.2. Multisensory Principles for Teaching
Pronunciation......................................................... 184
12.2.1. Auditory Modality......................................... 184
12.2.2. Visual Modality ............................................. 186
12.2.3. Tactile, Kinesthetic, Proprioceptive
Modalities ......................................................... 188
12.3. Developing Teaching and Learning Strategies ....... 189
12.3.1. Developing Teaching Strategies ..................... 189
12.3.2. Developing Learning Strategies ..................... 192
Chapter 13: Exemplary Applications of Accent Remediation
Techniques .................................................................... 197
13.2. Techniques for Teaching Selected Consonants ....... 197
13.3. Techniques for Teaching Labial-Dental Sounds...... 198
13.4. Techniques for Teaching Interdental Fricatives
/θ, ð/ ........................................................................... 202
13.5. Techniques for Teaching Tense (Long) vs. Lax
(Short) Vowels ....................................................... 207
13.6. Techniques for Teaching Vowel Reduction ............ 213
13.7. Techniques for Teaching Accentuation (Stress) ..... 218
Chapter 14: Tips for Accent Reduction and Accent Detection. 227
14.2. Tips for Accent Reduction ..................................... 228
14.2.1. Tackle the most Salient Phonological
Problems ........................................................... 228
14.2.2. Tackle the most Salient Phonetic
Problems ........................................................... 231
14.2.3. Improve other Linguistic Skills ...................... 232
14.3. Accent Detection ................................................... 232
14.3.1. Accent Detection by Ordinary
Individuals ........................................................ 233
14.3.2. Accent Detection by Professionals ................. 233
14.3.3. Telling the Linguistic Background of a
Speaker through Accent .................................... 234
14.3.4. Hiding an Agent through Hiding an
Accent ............................................................... 237
References ............................................................................. 243
FOREWORD
My fascination with human language, in general, and pronunciation,
in particular, seems to be intimately connected to my trilingual
upbringing as a child in the city of Kirkuk, Iraq, and my later
exposure to three more languages. The immense diversity in the
sound systems of those six languages afforded me extensive
coverage of a wide range of sounds, sound patterns and sound
systems. Gradually, throughout the five decades that I spent
teaching English language, linguistics and pronunciation in
different countries and to speakers of a wide variety of
languages, my passion for the diversity in their pronunciation
patterns deepened. Thus, teaching pronunciation did not just
become my favorite subject, but it also became the focus of my
academic research. Throughout the last decade, I became fascinated
with the concept of accent and whether it is a sensory deficiency
with some people at a certain age or the side effect of cognitive
and linguistic perfection in the internalization of the native
language (L1). After lengthy observations and investigations, I
came to the conclusion that accent is a cognitive phenomenon and
seems to be the outcome of a deficit between what is known in
psycholinguistics as language acquisition and language learning.
The focus in this book is, above all, on accent and its cognitive
roots. Such understanding will determine the framework of an
approach to teaching pronunciation.
In my book Techniques of teaching pronunciation in ESL, bilingual
and foreign language classes (2003),1 I introduced several
innovative principles which outlined my cognitive approach to
teaching pronunciation. One such principle emphatically stated
that pronunciation is in the brain prior to being in the mouth,
which constituted the foundational premise for my cognitive
approach to teaching pronunciation in general. Naturally,
therefore, several of the 2003 principles are reintroduced in
certain chapters with further refinement, elaboration and
application to accent in human speech. As I am still pursuing the
teaching of pronunciation in more detail in this book, with the
linguistic phenomenon of accent being a major constituent, it
certainly further highlights the significance of the cognitive
perspective, especially in the case of adult learners' rendition
of an L2 pronunciation.
1 München: Lincom-Europa.
Since the cognitive potential of human beings is fed and nurtured
via the five senses, the approach to teaching pronunciation has
been identified as multisensory and multicognitive. The former
emphasizes the significance of the joint involvement of as many
sensory modalities as possible, especially those that feed the
cognitive processing of human language and speech, namely,
auditory, visual and tactile/kinesthetic, whereas the latter
highlights the mobilization of all cognitive processes that human
beings practice such as thinking, comparing, contrasting,
analyzing, synthesizing, associating and memorizing, among others.
It is worth underscoring that accent is not a pathological defect
and its remediation is not exclusively the responsibility of a
speech pathologist, as claimed by some. In an accent modification
instruction course announcement, it is stated that "a
speech-language pathologist then teaches the client appropriate
speech modification techniques."2 Accent remediation falls within
the professional domain of any person qualified in the science/art
of speech production, foremost of whom is a phonetician.
Nevertheless, each one of those who are professionally qualified
to deal with the teaching of speech and pronunciation should
combine his/her academic expertise with an effective instructional
methodology that incorporates the latest linguistic and
educational findings.
2 http://www.speechenrichmentcenter.com/index.php?option=com_Content&view=article&id=10
Hence, if accent is not a pathology, then what is it? From the
perspective of this book, it is one of the normal symptomatic side
effects of the cognitive and physical maturation of human beings
and the gradual transformation in delegating the majority of
conscious biological and social survival functions, foremost of
which is language, to the subconscious brain. Stated differently,
accent is a consequence of the cognitive transformation of humans
from all-conscious operants to predominantly subconscious ones.
The recent literature on human language internalization is replete
with substantial evidence that the perfection in the mastery of
the native language, especially its pronunciation, falls primarily
within the years of childhood ranging from birth (in fact, even
before birth) until puberty and adolescence. Once a person is
fully immersed in the native language (L1) throughout those years,
he/she will grow up accentless, but he/she is highly likely,
beyond that period, to manifest a certain degree of accent in L2
ranging from light to heavy. It is the inherent cognitive bias to
L1 that interferes with the accurate perception and recognition,
hence production, of the L2 sound system. The years of childhood
are so significant for language internalization that a child can
grow up as a balanced bilingual or even a balanced trilingual if
there is ample exposure to and immersion in more than one
language. My childhood is a typical example. This is so because
the brain of any normal child is neurolinguistically wired for
such a mission and is powerful enough to handle more than one
language. Accent, therefore, tends to be a symptomatic trait of
adulthood resulting from the cognitive maturation with regard to
L1 rather than being a physical sensory deficiency in the
perception and production of speech sounds by L2 learners. It is
this difference that has convinced linguists to label the natural
internalization of L1 by children as acquisition and adults'
attempts at internalizing L2 as learning. Doubtless, the two
processes are not treated here as mutually exclusive, as some
inexperienced persons may make them appear; rather, they are
complementary in nature, depending on the age of the person among
several other factors to be dealt with in due course.
Nevertheless, the younger the person is, the greater the role of
acquisition in comparison to learning, with the roles reversed in
adults.
The emergence of accent in the speech of adult L2 learners should
never imply the closing of the window on the likelihood of
improvement of pronunciation and the remediation of accent. The
claim that adults suffer from the so-called fossilization in their
learning of L2 pronunciation (Selinker, 1972) is rejected in this
study partly due to the lack of a thorough understanding at the
time of the cognitive nature of L1 acquisition (internalization)
and its subsequent impact on L2 learning and partly due to the
inefficient approach and methodologies used to teach pronunciation
to adults and to remediate their accent. A multisensory
multicognitive approach has proven very effective in learning
pronunciation and teaching it to adults of different linguistic
backgrounds and in the remediation of their accent to different
degrees depending on several other specific factors such as
aptitude, enthusiasm, focus, time allocation, etc. All those
factors have been relevant in my personal case as I, at the age of
thirty-three (33), improved my pronunciation of English and
immensely broadened my inventory of different sounds in
perception, recognition and production. Certainly, remediation may
not be perfect in most cases, but it is definitely evident.
As hinted earlier on, the cornerstone of implementing this
approach is primarily the multisensory multicognitive principles;
however, there are several other functional considerations for
implementation, foremost of which is the distinction between
phonological accent and phonetic accent. The former represents
sound substitutions that directly result in semantic confusion in
words as well as in sentences, whereas phonetic accent may not
result in semantic confusion directly, but it may generate noise
or uncertainty that interferes with proper conveyance of meaning.
Although I had actively implemented this functional distinction in
my classes in the mid-1990s, it appeared formally in print in
2003. Field and classroom observations substantiate the paramount
functional significance of this dichotomy. In teaching
pronunciation, the remediation of phonological accent should take
precedence over phonetic accent. To demonstrate, it is far more
significant to teach a Hispanic learner of English not to
substitute [dʒ] for the [j] sound in <you> = [ju], which changes
the word <you> to <Jew> = [dʒu], than to teach the same learner
not to substitute a tap or trill <r> for the English approximant
<r>, as the former example represents a phonological accent,
whereas the latter represents a phonetic one.
In light of what preceded, the effective and efficient teaching of
pronunciation of an L2 to adults requires highly qualified
professionals implementing an approach that seriously takes into
consideration the cognitive nature of pronunciation and the
implementational approach that is compatible with it. This book is
all about such an approach.
The book is in fourteen (14) chapters with the first chapter being
a reflection on the author's trilingualism as a child and later
addition of three more languages as well as the experiences that
led to his majoring in linguistics with focus on phonetic science
and pronunciation. If the reader is eager to go directly to the
core materials related to human speech and pronunciation, he/she
can skip the first chapter although the chapter is rich with
personal experiences. Chapters two through five are essential to
understanding the cognitive nature of human language,
pronunciation and accent. The reader is expected to encounter
difficulty in understanding certain contents of the next chapters
without thoroughly absorbing the contents of those four chapters.
Chapter six represents a sketch of the most significant principles
to be considered for the application of the multisensory
multicognitive approach to teaching pronunciation. As for chapters
seven through nine, they cover the most common segmental
(consonants and vowels) and suprasegmental (stress, rhythm, tone,
intonation) features that cause pronunciation problems and accent
in cross-language learning. Intimately related to those chapters
is chapter ten, which deals with the very important but less known
aspect of pronunciation, the so-called articulatory settings.
Chapters eleven and twelve elaborate on the main applications of
the multicognitive modalities and multisensory modalities,
respectively. The book ends with chapters thirteen and fourteen,
which focus more narrowly on accent and its acquisition, reduction
and remediation.
Finally, a couple of stylistic notes are unavoidable so that the
reader would not misinterpret my intention as the writer. First,
parts of the book are written in the first person simply because
they reflect my personal life and experience; it would, therefore,
be awkward to state them in the third person. Second, my use of
the pronoun "he" for the learner does not exclude "she". I have
tried the combination "he/she", but after a while it becomes
stylistically too repetitive. I am a liberal thinker and my
acknowledgements recognize three women that have deeply impacted
my life.
Edward Y. Odisho
March 3, 2014
ACKNOWLEDGMENTS
I recently celebrated my 75th birthday and with it came my
decision not to write more books after this, my 11th one, but
rather to focus on producing more research papers which reflect
some of the thoughts and themes that have been with me for a long
time. Certainly, I have not had the opportunity to develop them
and bring them all to fruition. However, I seize this opportunity
to express my deep gratitude to all institutions and publishers
which have assisted me in transforming my writings into books and
promoting them locally and internationally. My first book was
published by the Iraqi Ministry of Culture in 1971, followed by
the second one published by Al-Mustansiriya University/Baghdad.
After my escape from Iraq during Saddam's regime in 1980, my third
book was published by Otto Harrassowitz Verlag/Wiesbaden, 1988.
Between 1988 and 2002, my academic efforts were devoted to
publishing research papers which put me on the fast track for
tenure and academic promotion to professorship. In 2003, I decided
to produce a series of books to document my experience in
classroom teaching. Towards the end of 2003 my book Techniques of
teaching pronunciation in ESL, bilingual and foreign language
classes was published by Lincom Europa/München, followed by A
linguistic and cognitive approach to the teaching of the English
alphabet, Edwin Mellen Press/New York in 2004. Between 2005 and
2011, I published the following books, all by Gorgias Press of New
Jersey: Techniques of teaching comparative pronunciation:
English-Arabic, 2005; Linguistic tips for Latino learners and
teachers of English, 2007; Linguistic and cultural studies in
Aramaic and Arabic, 2009; Modern Assyrian (Aramaic) language
between speech and writing: linguistic examination, 2011.
Although this book still deals with pronunciation, it has a
somewhat narrower thematic focus as it attempts to deal with
accent in pronunciation in greater academic and empirical depth.
It certainly incorporates the views of many distinguished scholars
and experts in the fields that deal with human sound production
intricacies, at large, and pronunciation, in particular. To those
views, I added the personal reflections on my childhood as a
balanced trilingual alongside my experiences accumulated during
five decades of classroom teaching and real-life observations.
In publishing this book, I must express my gratitude to the tens
of thousands of students that I have taught and who, in return,
taught me through their difficulties as well as their successes in
overcoming those difficulties.
Finally, there are a few people to whom I am indebted. There are
three women who have changed my life: my mother, Shakira, with
whom I shared my interest in books; my doctoral supervisor, Celia
Scully, who twisted my arm to be more science-oriented; and my
wife, Wardia Shamiran, for being my friend and partner in whatever
I have accomplished during the last four decades of our marriage.
Edward Odisho,
Morton Grove, Illinois,
March 3, 2014
LISTS OF SYMBOLS AND PHONETIC LABELS
The conventions and symbols of the International Phonetic
Association (IPA) and their acceptable substitutes are listed
below and they have been used throughout this book. The Arabic
alphabet and its diacritics are also listed because they have been
employed where necessary. The following is a list of symbols and
conventions used:
Vowels  Phonetic Description
i       Close front with spread lips
ɪ       Close front to close-mid (somewhat centralized) with spread lips
e       Close-mid front with unrounded lips
ɛ       Open-mid front with unrounded lips
ɜ       Open-mid central with unrounded lips
æ       Near-open front with unrounded lips
a       Open front with unrounded lips
ɑ       Open back with unrounded lips
ɔ       Open-mid back with rounded lips
o       Close-mid back with rounded lips
u       Close back with rounded lips
ʊ       Near-close near-back with rounded lips
ʌ       Open-mid back with unrounded lips
ə       Mid central (neutral) vowel (schwa)
ɚ       R-colored (rhotacized) mid central (schwar)
ɝ       R-colored (rhotacized) open-mid central
Consonants Phonetic Description

b Voiced bilabial plosive
p Voiceless unaspirated bilabial plosive
pʰ Voiceless aspirated bilabial plosive
d Voiced alveolar plosive
t Voiceless unaspirated alveolar plosive
tʰ Voiceless aspirated alveolar plosive
ɟ Voiced palatal plosive
g Voiced velar plosive
k Voiceless unaspirated velar plosive
kʰ Voiceless aspirated velar plosive
c Voiceless unaspirated palatal plosive
cʰ Voiceless aspirated palatal plosive
q Voiceless (unaspirated) uvular plosive
ʔ Glottal stop
dʒ Voiced postalveolar affricate
tʃ Voiceless postalveolar affricate
v Voiced labial-dental fricative
f Voiceless labial-dental fricative
ð Voiced interdental fricative
θ Voiceless interdental fricative
z Voiced alveolar fricative
s Voiceless alveolar fricative
ʒ Voiced postalveolar fricative
ʃ Voiceless postalveolar fricative
ʁ Voiced uvular fricative
χ Voiceless uvular fricative
ʕ Voiced pharyngeal fricative
ħ Voiceless pharyngeal fricative
h Voiceless glottal fricative
ʋ Voiced labial-dental approximant
ɥ Voiced labial-palatal approximant
ɹ Voiced alveolar approximant
ɻ Voiced retroflex approximant
l Voiced alveolar lateral approximant
j Voiced palatal approximant
w Voiced labial-velar approximant
m Voiced bilabial nasal (approximant)
n Voiced alveolar nasal (approximant)
ŋ Voiced velar nasal (approximant)
ɾ Voiced dental/alveolar tap
ɽ Voiced retroflex tap
r Voiced dental/alveolar trill
Diphthongs in RP English
au as in <how, now>
ai as in <high, tie>
oi as in <boy, noise>
ou as in <go, know>
ei as in <bait, gate>
iə as in <here, dear>
eə as in <there, bear>
uə as in <poor, tour>
Conventions
/ / Phonemic transcription
[] Phonetic transcription
ː Vowel full length
ˑ Vowel half-length
ʰ Superscript indicating aspiration
ˈ Superscript indicating strong stress
Subscript dot under /d t s ð/ = [ḍ ṭ ṣ ð̣] indicates /ض، ط، ص، ظ/,
the emphatic sounds of Arabic
C In syllable structure patterns, C stands for a Consonant
V stands for a Vowel
Arabic Symbols
Consonants IPA Phonetic Description

ء [ʔ] glottal stop
ب [b] voiced bilabial plosive
ت [tʰ] voiceless aspirated alveolar plosive
ث [θ] voiceless interdental fricative
پ [p] voiceless bilabial plosive (Farsi)
ج [dʒ] voiced postalveolar affricate
چ [tʃ] voiceless postalveolar affricate (Farsi)
ح [ħ] voiceless pharyngeal fricative
خ [χ] voiceless uvular fricative
د [d] voiced alveolar plosive
ذ [ð] voiced interdental fricative
ر [r] voiced alveolar trill
ز [z] voiced alveolar fricative
ژ [ʒ] voiced postalveolar fricative (Farsi)
س [s] voiceless alveolar fricative
ش [ʃ] voiceless postalveolar fricative
ص [ṣ] voiceless emphatic alveolar fricative
ض [ḍ] voiced emphatic alveolar plosive
ط [ṭ] voiceless (unaspirated) emphatic alveolar plosive
ظ [ð̣] voiced emphatic interdental fricative
ع [ʕ] voiced pharyngeal fricative
غ [ʁ] voiced uvular fricative
ف [f] voiceless labial-dental fricative
ڤ [v] voiced labial-dental fricative (Farsi)
ق [q] voiceless unaspirated uvular plosive
ك [k] voiceless velar plosive
گ [g] voiced velar plosive (Farsi)
ل [l] voiced alveolar lateral
م [m] bilabial nasal
ن [n] alveolar nasal
ه [h] voiceless glottal fricative
و [w] labial-velar approximant
ي [j] palatal approximant
ـّ Superscript on consonant indicating geminated (double) consonant.
Vowels (Letters)
ا [aː] long counterpart of [a]
ي [iː] long counterpart of [i]
و [uː] long counterpart of [u]
Vowels (Diacritics)
ـَ Superscript over consonant indicating short [a] vowel.
ـِ Subscript under consonant indicating short [i] vowel.
ـُ Superscript on consonant indicating short [u] vowel.
ـْ Superscript on consonant indicating absence of vowel.
CHAPTER 1: MY STORY WITH LANGUAGES,
PRONUNCIATION AND ACCENT
1.1. PRELUDE
Very simply, this book focuses on the linguistic nature of human
language, in general, and pronunciation and accent, in particular.
Two important principles govern the overall approach, namely, the
cognitive and the pedagogical principles. With regard to
the first, language as a structure and system originates in society
but its blueprint is in the brain. It is, therefore, a social product,
but a cognitive entity. Whenever social survival needs the ser-
vices of language, it signals to the brain which, in turn, activates
its neurons and synapses to generate communication. As for na-
tive language (L1) pronunciation, it, ceteris paribus, is a process
of natural acquisition with perfection; however, learning effective
pronunciation of a second language (L2) by adults requires
conscious effort by the learner, assisted by the linguistic and
educational know-how of the instructor. This latter statement
highlights the pedagogical principles of the approach taken in
this book. The instructor should have a high level of profession-
al competence and experience in the sound systems of the lan-
guages involved. He/she should also follow an educational phi-
losophy that premises the success of a teaching approach on the
extent of interactive connection with the learners to ascertain
that there is an effective mode of two-way interaction. It is im-
perative that the instructor diversify his/her cognitive and sen-
sory strategies and techniques of teaching as well as discover the
individual learning styles of the learners and encourage them to
get actively involved in the process.
1.2. THE EVOLUTION OF MY INTEREST IN LINGUISTICS AND PHONETICS
There are five major linguistic experiences in my life that have
contributed to the evolution of my interest in linguistics and
phonetics.
First, I naturally acquired three languages in my childhood,
namely Assyrian (Modern Aramaic), Turkmeni and Arabic, owing to
the multilingual environment in which I grew up in the city of
Kirkuk, Iraq. My natural trilingualism was the result of full
immersion in a context-embedded and situation-embedded
linguistic environment. In the framework of this study,
the use of these two terms is somewhat different from their cir-
culation in previous literature, especially by Cummins (Cum-
mins, 1979, 1984). Context-embedded implies the use of lan-
guage in discourse format rather than in isolated words and sen-
tences, while situation-embedded implies the use of a discourse
that matches the situation in which it actually takes place. To
clarify the latter statement, a discussion of airplanes taking off
and landing will be more authentic when held at an airport than a
discussion of the same theme in a café or a classroom situation.
This is simply because in the former scenario the
actions are multisensorily and realistically perceived, hence
readily comprehended.
Second, for my first degree, I majored in English language
at Baghdad University, which implied partial immersion in the
target language (L2) due to the absence of fully context-
embedded and situation-embedded environments. During my
four-year study for my degree, English was dominant only dur-
ing the class sessions and occasional conversations with faculty
members (the majority of whom were native English speakers).
Outside those two environments, the day-to-day language of
conversation was in Arabic or in any other native languages of
Iraq such as Assyrian, Turkmeni, etc.
Third, in my early adult life, I experienced full immersion in
the Kurdish language. This type of full immersion was repeated
later in my adult life with English in both England and the Unit-
ed States.
Fourth, in my graduate studies of four years in England, I
specialized in linguistics, in general, and phonetic science, in
particular. This period thoroughly exposed me to a wide variety
of sound materials from different languages. Such an experience
afforded me a scientific insight into human language both
as an acquisition process and a learning one.
Fifth, the five-decade long professional experience in teach-
ing I have had at different levels of education, in different coun-
tries and in diversified educational and professional situations
added to the depth of my linguistic experience and honed it.
Below is an elaboration on the above five experiences.
1.2.1. Natural Language Internalization: Language Acquisition
Any study of human language internalization should consider
the manner in which normal children master their native lan-
guage or any language they are immersed in as opposed to
adults embarking on learning a second language. The two pro-
cesses are known in psycholinguistic literature as acquisition vs.
learning. Acquisition tends to be a subconscious, automatic and
effortless process of internalizing a language, whereas learning
tends to be more conscious, mechanical and effortful. Insofar
as pronunciation acquisition is concerned, all that children need
to accomplish it is ample exposure to speech in real-life contexts
and situations. My hometown Kirkuk was at the time1 the most
multilingual city in Iraq with the five languages of Turkmeni,
Arabic, Kurdish, Assyrian and Armenian spoken in its different
neighborhoods. The overwhelming majority of its population
was, at minimum, bilingual, with many being trilingual. Which
of the five languages one mastered depended partly on the size
of the population that spoke a given language and partly on the
neighborhood in which one resided. In my case, Assyrian was
my home language while Turkmeni and Arabic were the two
languages to which I was most exposed. It should be clarified,
however, that Arabic was then confined to a couple of neigh-
borhoods because of the very small population of Arabs in Kir-
kuk at the time. Fortunately, I lived in one such neighborhood.
What is important to highlight with regard to Arabic is the fact
that it was the official language of the country, especially in
education and governmental business; everyone had to pick it up
or, at least, be familiar with it.
1
Demographics have significantly changed since the 1940s.
In light of the diversified linguistic environment in Kirkuk,
I grew up fully orally competent in Assyrian, Turkmeni and Ar-
abic. I always felt that functionally all three were my native
tongues, the only difference between them being that Assyrian
was my ethnic native tongue and my home language. The pro-
cess of internalizing Turkmeni or Arabic to the degree of a na-
tive tongue can be attributed to the context-embedded and situ-
ation-embedded environment in which the two languages were
encountered and used. No conscious effort was made to internal-
ize them; they were simply naturally acquired as long as the
exposure to them continued almost every day and throughout
most of the day. In a nutshell, my native competency in those
three languages as well as their cultures was a typical example
of child language and culture acquisition.
Turning to another aspect of this childhood linguistic expe-
rience and its impact on my later adult linguistic experience,
several observations can be made especially with regard to my
competence in the future performance of sounds and sound sys-
tems of other languages. First, this childhood experience afford-
ed me a much broader inventory of sounds and richer compe-
tence in the production of some unfamiliar sounds of other lan-
guages. Second, on the flip side, in certain aspects of the sound
systems, especially in the domain of accentuation (stress place-
ment) my childhood experience was so dominant that it subcon-
sciously permeated the English language that I learned in the
early stages of my exposure to English. In many respects, my
stress placement in English was seriously sullied by my child-
hood languages, especially Assyrian and Arabic. I lived with this
misplacement of stress in English without knowing it until I set-
tled in Britain for four years receiving systematic education in
phonetic sciences and linguistics. There will be extended elabo-
ration on this subject in due course.
1.2.2. A Major in English Language in a non-English Environment
Completing a major in English language at Baghdad University
(obviously a non-native environment) implied partial immersion
in the target language (L2) due to the absence of a fully
context-embedded and situation-embedded environment. During
my four-year study for my degree, English was dominant only
during class sessions and occasional conversations with faculty
members (the majority of whom were native English speakers).
Outside those two environments, the day-to-day language of
conversation was in Arabic and in any other native languages of
Iraq such as Assyrian, Turkmeni, etc. In other words, my expo-
sure to L2 was very limited in terms of time as well as in the
variety of contexts and situations in which a language is normal-
ly used. Stated differently, we rarely, if ever, had real exposure
to and use of English in, for instance, a market place situation or
a casual family gathering. After four years of education,
my fellow students and I were more capable of talking about the
three witches in Shakespeare's Macbeth than of conducting a
conversation with a shopkeeper in a fruit and vegetable market.
This limited exposure in terms of time, contexts and situa-
tions in which a language is naturally used is in striking contrast
to the ample time and the multitude of contexts and situations
in which my three childhood languages were acquired. Majoring
in English in Baghdad was, to a large extent, a typical example
of a context-reduced and situation-reduced experience; it was a
classic example of an adult language learning model as opposed to
a child language acquisition model. Consequently, my learning of
English in my early adulthood in a context-reduced and
situation-reduced environment had many deficiencies.
1.2.3. Full Immersion as an Adult in Two Languages

The following deals with my full immersion in the Kurdish and
English languages as an adult and the impact of such experienc-
es on my linguistic orientation and the later evolution of a pas-
sion for pronunciation and the pedagogy of teaching it.
1.2.3.1. Full Immersion as an Adult in an all-Kurdish Environment

After gaining my degree in 1960, I was appointed as a teacher of
English language in the Kurdish city of Sulaimaniya where the
daily language of communication was predominantly Kurdish.
Thus, I was immersed in another language as an adult. Learning
Kurdish was a necessity in a city where almost all daily
interactions were in this language. Additionally, for reasons unknown
to me at the time, except for the fact that I was an open-minded
and progressive young man with an interest in other languages
and cultures, I embarked on learning the Kurdish language and
culture exceptionally seriously, and with a passion.
The manner in which I learned (or acquired) Kurdish turned out
later to be innovative. At the time I was not a linguist who
could implement the approaches or techniques promoted by applied
linguistics for mastering a second language as an adult, nor was I
familiar with the modern concept of immersion in an L2 situation
to acquire or learn it. I simply handled the Kurdish language in
the following manner, which, decades later, I discovered to be
full immersion in the target language:
The attempt at learning Kurdish went through four stages.
The first stage was to focus entirely on the oral communica-
tion with minimum recourse to the written form except
when a certain word or phrase was extremely necessary. For
a couple of months or so, I did not attempt to speak; it was
simply a period of listening to other people's communication
and carefully watching the facial and body gestures that
accompanied the conversation. Occasionally in the classroom, I
used to ask my students to translate some short statements in
an English language dialogue into Kurdish and I would care-
fully listen to the translated segments.
The second stage began around the third month when I de-
cided to go one step further beyond the listening period. I
ventured to speak by getting involved in very short conver-
sations with some intimate Kurdish friends who appreciated
my intention to learn their language. They would correct me
when necessary and they would also repeat certain state-
ments so that I would be able to internalize them. I was nev-
er intimidated by the mistakes or hesitations that would oc-
cur in my conversation. In order to carry my listening and
speaking skills one step further and associate them with real-
life contexts/situations, I took upon myself almost daily to
go to the marketplace to do my shopping and carefully listen
to live interactions between shop-keepers and customers.
This experience was the most effective and efficient in helping
me predict the meaning of many words and expressions
that I did not know previously. The actual context/situation
of the conversations aided me in the prediction of meaning.
Perhaps, more significant than just predicting the meaning
was the higher possibility of retaining the meaning also be-
cause of the clues from the context/situation.
At the third stage, which began by the end of the first school
year (usually nine months), I was able to sustain simple social
conversations. After my marketplace ventures, I made
more Kurdish friends with whom I spent my evenings in
their homes or in cafés and social clubs for teachers and other
civil service employees.
The beginning of the second year was the final stage when
my colleagues at school, my students and my friends in the
community began to address me in Kurdish most of the time.
I avoided using Arabic as much as possible; however, when I
found myself groping for the right word in Kurdish I did not
hesitate to double-dip in both Arabic and Kurdish. In other
words, I resorted to some familiar linguistic devices that bi-
linguals use such as code-switching or code-mixing of two
languages. I should not forget to reveal another strategy I
used to teach myself Kurdish. This was a two-pronged strategy
of listening to songs and retaining their lyrics, as much as
possible, as well as the retention of some popular proverbs
and sayings. In both cases, the retention was aided by the
music in the first instance and by the uniqueness of meaning
and other linguistic niceties that this genre of human lan-
guage usually has. By the end of the fifth year, which was
the last year of my service in Sulaimaniya, my communica-
tion with people was predominantly in Kurdish. I was good
in overall fluency, but excellent in pronunciation.
After this brief outline of my experience with Kurdish, it is
absolutely essential to point out that the manner in which I learned
the language so successfully is still an experience which I cannot
fully explain. Did I succeed with Kurdish because I was original-
ly a multilingual or was it because I accidentally ran into an oral
approach that turned out to be a context/situation-based linguis-
tic pedagogy of which I knew nothing then except for the fact
that it seemed a more natural way for human language
acquisition or learning? Equally puzzling was the motive that pushed
me towards a future in the study of linguistics, in general, and
phonetics, in particular, as a profession while still living in a
country (Iraq) where linguistics, let alone phonetics, was at the
time much less known.
After successfully finishing my graduate studies, the ques-
tion of how I had become so obsessed with pronunciation and
phonetic sciences was still nagging me and I craved for a satis-
factory explanation. As I was accidentally sifting through the
pages of some of my early college psychology and methodology
textbooks I noticed that I had persistently underlined statements
and paragraphs that were related to language learning, in gen-
eral, and child language acquisition, in particular. In my view,
my childhood multilingualism contributed significantly to my
pursuit of linguistic studies as an adult.
1.2.3.2. Full Immersion as an Adult in an all-English Environment

My four-year stay in England placed me in a linguistic environ-
ment with maximum immersion in English coupled with exten-
sive exposure to diversity in authentic contexts and situations in
which the language was used. Those four years were, in many
respects, virtually the reversal of my experience with English
during the four years in Baghdad majoring in English. The long-
er I stayed in England and became acquainted with the linguistic
and phonetic principles of language learning and teaching, the
more I discovered the weaknesses and flaws in my English at all
levels: grammar, style, lexicon and, above all, pronunciation,
which is the focus of this book.
Fortunately, in the consonantal domain of English, there
were no serious phonological problems simply because of my
broad phonetic and phonological base attributed to my childhood
trilingualism. The most noticeable consonantal mispronunciation
was my rendition of the approximant English <r> as a
tap or a trill, which, fortunately, constituted a mere phonetic
accent2 that did not interfere with meaning. Obviously, all
consonant clusters involving an <r> such as /pr, tr, kr/, etc.,
did cause some deviation in pronunciation. Nevertheless, this
was a minor phonetic aberration compared to, for instance,
stress placement in words and longer stretches of speech. For
instance, in almost all verbs patterned like <amplify>,
<criticize> and <facilitate>, among many other word patterns, I,
like the overwhelming majority of adult Assyrians and Arabs,
placed stress on the final syllable, a placement that hardly
occurs in English. Also, in noun compounds such as <black bird>,
<junk food>, etc., where the accent is usually on the first word, I
would place it on the second, not knowing that it makes a huge
difference in reference. A <BLACK bird> (capitals mark the
stressed word) is a certain type of bird that happens to be black,
whereas a <black BIRD> is any bird that is black. These
mispronunciations were all transfers from my Assyrian and Arabic
languages. For example, in Assyrian the compound word for an
<old man> (literally, a white-bearded man) is < > with accent on
the second part. I did not
know what stress was and how it functioned within a language,
and neither did I know what intonation was. Throughout my
four years of English language education in Baghdad, which was
more or less literature-oriented, I had never heard an instructor
draw attention to such important language dynamics.3 In a nut-
shell, the English we learned in Baghdad was subconsciously
colored, to different extents, with the overall pronunciation of
Arabic and/or the other native languages of the learners.
In light of such facts, and with my schooling in the De-
partment of Phonetics in the techniques of listening to a wide
variety of sounds and producing them, I gradually began to
concentrate my attention on the pronunciation of the native
speakers of English with emphasis on both the segmental and
suprasegmental aspects of their pronunciation.
2
It will be argued later that a phonetic accent is a mispronunciation that
does not alter meaning as opposed to phonological accent which does.
3
Bear in mind, I am referring to the years 1956–1960. In the years
beginning with the 1970s there were several faculty members in different
universities in Iraq who specialized in linguistics and they would
emphasize such prosodic features of both English and Arabic.
I remember vividly
how I rectified my pronunciation of the verbs <develop> and
<deposit> which I used to pronounce with stress on the first
syllable instead of the second one. In fact, in the case of
<deposit> I had a transaction at the bank and I pronounced the
word as <DEposit> (capitals mark the stressed syllable) and the
banker said: "You mean <dePOSit>." With a very mellow touch of
linguistic embarrassment, I said: "Yes." I also soon discovered
that a certain category of
words such as: <permit, export, import, contract> could be
rendered verbs or nouns depending on the position of stress. In
this latter case, I do remember that one of our British instructors
in Baghdad did mention such a rule in passing, but he neither
dwelt on it enough nor demonstrated the difference for us so that
we could internalize what he was saying. Pedagogically, this
example is typical of what happens when the instructor fails to
establish a bridge of interaction with his students, assuming that
once a rule is mentioned everyone picks it up. Establishing such a
methodological strategy of connection between instructors and
learners has become one of the cornerstones of my approach to
teaching: simply, make sure you and your students are on the
same page and they connect with you.
1.2.4. Phonetic and Linguistic Orientation in Graduate Education
The first year of my graduate education was a two-pronged in-
tensive orientation in theoretical and applied principles of pho-
nology and phonetics. The theoretical component covered a
wide variety of schools and theories of phonology. The applied
side of it involved thorough exposure to human speech from the
acoustic, physiological and aerodynamic perspectives which
were reinforced with experimental laboratory work as well as
practicals in the perception, recognition and production of a
wide variety of human sound specimens. This type of education-
al orientation continued throughout the four years although it
was most intensive during the first year.
Such specialized intensive education in human speech had a
far-reaching impact on my perception, recognition and production
of a broad array of sounds from different languages. For
instance, prior to this education I was able to recognize and
produce two bilabial sounds [b p]; however, after the orientation
the inventory doubled or even tripled into such sounds as
[b b p p ]. This perceptual and productive enhancement in
discriminative and articulatory skills afforded me far-reaching
insight into other languages; perhaps, more importantly, it af-
forded me better insight into my own language repertoire. In the
latter case, for instance, I was able to phonetically identify dif-
ferent bilabial plosives such as [b, p, p, p, p]4 some of which
turned out to be phonologically significant, i.e., they triggered
semantic differences between words. Also, equally important to
broadening my range of sound perception, recognition and pro-
duction, I nurtured an acute kinesthetic and proprioceptive
sense for articulatory maneuvers of sound production. Stated
differently, I began to feel the movements, positions and shapes
of my tongue in the oral cavity, feel the tightness or relaxation
of muscles and proprioceptively detect airflows, frictions and
vibrations. Such kinesthetic and proprioceptive skills are ex-
tremely helpful for learning and teaching sounds especially in
L2 situations. If the instructor does not bring such skills to the
attention of the learners, they will pass unnoticed and learning
fails.
1.2.5. Educational and Professional Challenges in the U.S.

I entered the U.S. as a refugee at the age of forty-three (43) with
a family. I stayed unemployed for a year, desperately looking for
employment to help my family survive and help myself pursue
my academic profession. There was no room for me then to be
choosy about employment. The English sayings "Beggars cannot be
choosers" and "Don't look a gift horse in the mouth" applied to
me. Every job that I took in the U.S. was only broadly within the
realm of my academic and professional background, but was not
at the core of my expertise, where I could immediately begin to
be creative. As one will see later on, I went through three major
professional and academic assignments each of which was a
daunting challenge that occasionally pushed me to the brink of
frustration. Nevertheless, I was a refugee and I was determined
to face any challenge head-on.
4
The subscript dot indicates emphatic ( ) versions.
I succeeded in all three challenges and ended up gaining massive
professional and academic experience from each one of them.
Very humbly, I felt that the
challenges I encountered during the first ten years in the U.S.
granted me the value of several more doctorates. I felt that I was
academically reborn and professionally baptized as an applied
linguist that added depth to my skill as a teacher and enhanced
my research fervor in both theoretical and applied linguistics
and phonetics with pronunciation being at the core.
1.3. THE IMPACT OF MY LINGUISTIC/PROFESSIONAL BACKGROUND ON THE EVOLUTION OF AN APPROACH
No doubt, my linguistic upbringing coupled with my linguistic
education provided the hidden dynamics that helped cultivate
my passion for teaching pronunciation; in turn, it helped mold
my approach to teaching pronunciation that is cognitive in es-
sence, but requires further well-defined pedagogical principles
and techniques for its implementation.
1.3.1. Impact of my Linguistic Background

The scenario of my linguistic background detailed in the above
sections represents diversity in the types of languages to which I
was exposed both as a child and adult. Briefly, I had different
degrees of linguistic exposure to three major language families,
namely Semitic (Assyrian and Arabic), Indo-European (English,
Kurdish and German5) and Turkic (Turkmeni).
A different perspective for my linguistic exposure is reflect-
ed in the nature of the particular experience I have had with
each language. Was the experience in the form of acquisition
very much akin to what children go through in their native lan-
guage as subconscious internalization and full-time immersion
in a context/situation-embedded environment? Was it more in the
form of adult conscious learning without full-time immersion
and with only a minimum context/situation-embedded environment?
5
I had some exposure to German through courses in my first degree
coupled with other courses at a German Institute in Baghdad. Most of all,
my focus was on pronunciation.
Or was it a partial combination of the above two experiences
in the sense that it was a full-time immersion in a
context/situation-embedded environment, but as an adult?
The exposition of my cumulative linguistic experience in
the previous sections indicates that I have been through all three
types of linguistic environments. My early childhood experience
with trilingualism in Kirkuk was of the first type: full-immersion
as a child in context/situation-embedded environment which led
to typical acquisition of the three languages involved. My
exposure to English from 5th grade through high school, on the
basis of one hour per day, could at best be described as minimal
exposure to a language in an almost context/situation-reduced
environment using the most traditional approach to teaching,
known as grammar-translation. This approach uses translation
into the native language (L1) as a medium to teach L2. It is a
crude and utterly mechanical way of
learning L2. My four-year college experience with English in
Baghdad was relatively much better and more effective than the
seven years in elementary and high school. There was more ex-
posure to English, better contact with native speakers and more
coordination between oral skills (listening and speaking) and
literacy skills (reading and writing). However, the whole experi-
ence of school and college was a learning experience compared
to my childhood trilingual acquisition. My experience with Kurd-
ish and the four years of my graduate study in England had the
characteristics of both acquisition and learning. Acquisition was
facilitated by the linguistic environments that afforded me am-
ple context/situation-embedded experiences in both languages. I
also had plenty of opportunity in terms of time to intentionally
learn what I wanted to. I vividly remember that when I was first
introduced to tones and intonation in human language I was
totally lost because I had no idea, whatsoever, about these two
aspects of human language. To familiarize myself with these
aspects of human language, I imposed on myself a very strict
schedule of attendance at the phonetic lab throughout one week
and listened to a diversity of tones and intonation patterns. I
also focused on stress and stress patterns in English as well as
other languages. For English, I did two things to improve my
ability in stress identification and placement. First, I listened
very carefully to native speakers of English and closely
watched their body gestures. In one instance, I watched the
economic correspondent of BBC television in the 1970s (Dominick
Harrod) whose straight black hair moved visibly down and up his
forehead often synchronized with stressed syllables, especially
those with primary stress. I rarely missed his appearances. For
other languages, I used to listen to the English pronunciation of
scores of foreign students attending Leeds University. Second,
once I discovered my misplacement of stress, I focused all my
attention on the correct pronunciation and repeated that force-
fully as many times as needed and at times loudly so that I could
hear my performance. I stopped when I felt that I had auditorily,
kinesthetically and cognitively developed an acoustic image in
my mind of the location of stress and its rendition in the target-
ed word and pattern of words.
1.3.2. Impact of my Teaching Career

Altogether, forty-nine (49) years of my life were spent in education.
I spent the first eleven (11) in different Iraqi high
schools teaching English. Four years after that were devoted to
my graduate studies in England where I also taught Arabic on a
part-time basis. After graduation, I returned to Iraq and taught
linguistics and phonetics at both undergraduate and graduate
levels for five years. I had to flee Iraq for political reasons and
settle in the United States at the beginning of March 1981. In
the United States, I first joined Loyola University as an adjunct
professor teaching two courses in linguistics. In 1984, I was of-
fered a full-time position of lecturer in English as a Second Lan-
guage (ESL) with the City Colleges of Chicago (CCC) with which
I stayed until 1990. During the years 1987–1990 I assumed the
position of instructional advisor for ESL and Bilingual teachers
within the CCC system. In August 1990, I resigned my position
to join Northeastern Illinois University (NEIU) as associate pro-
fessor in the Department of Teacher Education where I stayed
until my retirement (effective January 2009). In the following
sub-sections I will highlight the types of educational and profes-
sional challenges that I encountered and how each challenge
contributed to building and shaping my cognitive and psycho-
logical approach to teaching pronunciation.
1.3.2.1. Teaching Linguistics and Phonetics

The five years of university teaching of linguistics and phonetics
in classroom situations in Baghdad helped put my theoretical
knowledge into practice. While I was teaching my students, I
was gaining real-life experience which I then transformed
into research papers that were recognized internationally. Nevertheless,
I still did not have a vision of an approach to which I
could claim ownership. The experience of twenty (20) years of
teaching linguistics at Loyola University was, more or less, a
continuation of what I did in Iraq except for the fact that one of
the courses I was assigned focused on the ethnic and linguistic
communities of Chicago. This course was quite challenging for
me in nature since I was still a newcomer to Chicago. I devoted
long hours to educating myself about the ethnic and linguistic
composition of Chicago, which, in reality, represented almost all
large urban areas of the United States. It was this course that
introduced me to the ethnic and linguistic nature and composi-
tion of the overall society in my adopted country.
1.3.2.2. Teaching and Training ESL Teachers

The first three (3) of the six (6) years I spent with CCC were
simply classroom teaching of English to L2 learners; however,
the last three years were far more challenging both professional-
ly and pedagogically. I was given the responsibility of training
ESL teachers to assume the task of teaching English to a wide
variety of speakers of other languages. At this juncture, I was
professionally only a linguist who knew something about applied
linguistics of which ESL is a discipline. The reader should be reminded
that towards the end of the 1980s ESL had become a primary
discipline of applied linguistics with its own pedagogy in the
form of theories and methodologies. Sensing that I was not prepared
for the challenge, I decided to devote my whole time to
reviewing as much published literature in ESL as possible. I was so
overwhelmed with sifting through the relevant materials, especially
those most relevant to the daily teaching of ESL in classroom
situations with all the needed teaching and learning strategies,
that I felt as if I was preparing for another doctorate. After
approximately six months, I had built up some confidence in
conducting lectures and running training workshops. I conducted
many of them in all of the seven colleges of CCC; in fact, I
did more than what was expected of me. I began to receive kudos
from the participants in those workshops, the deans of the
ESL programs as well as the administrators in the central office
of CCC. I was even offered an administrative position in the central
office, which I declined because I preferred to be in
the classroom or field rather than in the office.
After the end of those three years, I sensed that I had trans-
formed myself into an ESL specialist. I began to present work-
shops and papers at local and state conferences as well as pub-
lish articles and papers. Personally, I felt that I had enhanced
my academic reputation not just as a phonetician and theoreti-
cal linguist,6 but also as an applied one. Apparently, because of
my performance and professional reputation I was approached
by the College of Education at NEIU to apply for a position new-
ly opened in the Bilingual/Bicultural Program (BLBC) which was
part of the Department of Curriculum and Instruction later
known as the Department of Teacher Education. After the formalities
of interviews, I was offered the position at the rank of associate
professor to teach four courses, two bilingual and the
other two language arts (English), in the regular program of
Teacher Education. I believe I was offered the position not really
because of my expertise in teaching bilingual education and lan-
guage arts, but rather because of my academic credentials, my
quality publications and my multilingual background in several
languages. My academic assignment at NEIU turned out to be
far more challenging than the ESL one with CCC simply because
I had limited professional experience in both areas. Neverthe-
less, I welcomed the challenge simply because I still felt I was a
refugee ready to confront any challenge in order to survive pro-
fessionally and socially as a person with a family.
6
In 1988 I published my first book outside Iraq titled: The Sound System of
Modern Assyrian (Neo-Aramaic), Harrassowitz Verlag, Germany.
1.3.2.3. Teaching Language Arts

In preparing teachers of English for elementary and
high schools in the United States, the course that covers
methods of teaching is usually named language arts. Once I was
assigned two language arts courses the first semester at NEIU, I
had a quick look at the text and I was shocked at how remote
the contents were from the modern linguistic perspective of
teaching a language in an age when linguistics was permeating
every study of every aspect of human language be it as L1 or L2.
Unlike the ESL and bilingual language instruction, the teaching
of English as a native language (L1), usually under the rubric of
language arts, is so untouched by linguistics that, at times, the
approach to its analysis is utterly surface-structured and
orthography-based, and many inaccurate practices and misconceptions
riddle it. For instance, English vowels are taught as
if they are five in number and occasionally six when <y> is
added. Such a statement, which is characteristic of phonics,
reflects a letter-based approach as opposed to a sound-based one.
Linguistically such a statement is baseless because in all varieties
of English there is a minimum of 15–20 vowel phonemes
(units) in the form of both simple vowels and diphthongs.
Moreover, in the phonics-based approach consonant clusters
(so-called blends) may still be determined on the basis of letters
rather than sounds. The teaching of spelling is so generically
and obscurely conducted that there is hardly any distinction
made between graphic spelling (based on written letters) and oral
spelling (based on letter-names), two totally different language
processes that require different methodologies to implement.
I was so frustrated with this assignment that I was almost
about to ask the chair of my department to replace both courses
with others. After further contemplation I thought it would be
unwise on my part to ask for replacement since I was a rookie
professor who had to build up his professional and academic
reputation in order to secure employment, promotion and tenure.
In the face of such a professional dilemma, I recalled fondly my
successful experience with my ESL professional struggle. Once
again I decided to confront the professional and academic chal-
lenge. I embarked on a massive reading and researching cam-
paign to familiarize myself with all aspects of teaching language
arts including the so-called theories, approaches, techniques and
styles. Everywhere I thought I could infuse knowledge from linguistics
into teaching language arts I did not hesitate to do so.
I was relieved when the first semester came to an end and I
managed to survive satisfactorily. With the next semester, I felt
more prepared and at ease. As the years passed, I fell in love
with the teaching of language arts because I succeeded in gear-
ing it in the direction of modern linguistics in combination with
up-to-date cognitive theories in language teaching pedagogy. At
this juncture, I began to feel that I had accumulated enough
knowledge and experience to formulate what could be called
my personal approach. Theoretically, the pedagogy of this ap-
proach was premised on the works of four intellectual pillars of
cognitive pedagogy to human language acquisition and learning.
They were Noam Chomsky with his transformational-generative
(TG) approach to the nature of human language and child lan-
guage acquisition premised on the concept of language acquisi-
tion device (LAD); Howard Gardner with his multiple intelligence
theory (MIT); Lev Vygotsky with his zone of proximal development
(ZPD); and Jean Piaget with his child cognitive development
(CCD). With the knowledge I acquired from these giants, I in-
fused my own academic and professional experience of decades
of teaching and researching to formulate what I had called at an
early stage: Multisensory Multicognitive Approach to teaching
pronunciation (Odisho, 2003; 2007/a). Since then I have bene-
fitted from further theoretical insight into child vs. adult acqui-
sition and/or learning of sounds and sound systems. I have
equally benefitted from the feedback that I received during the
last ten years of the application of the approach in classroom
situations with a large number of adult learners from a broad
range of linguistic and cultural backgrounds.
Gradually, I began to feel that there were so many weak-
nesses in the traditional approach to teaching language arts
that I had to unveil them publicly for other people in the field.
My focus was primarily on areas related to my academic orien-
tation and research interests. I began presenting and publishing
research works with focus on the shaky and linguistically inde-
fensible premises of phonics as a tool to teach letter-sound cor-
respondence and the ensuing vowel and consonant systems as
well as the overall pronunciation and spelling. Phonics, for instance,
confuses letters with sounds in both domains of consonants
and vowels but more so in the latter. The alleged dichotomy
of long vs. short vowels is the most striking example of
confusion in the quality and quantity7 of the English vowel sys-
tem. The manner in which phonics approaches the study of the
sound system of English is utterly vulnerable from the perspec-
tive of modern linguistics. After several years of teaching lan-
guage arts, I had so many inaccuracies and misconceptions to
reveal that I had to write a book titled: A Linguistic approach to
the application and teaching of the English alphabet (2004).
1.3.2.4. Teaching Bilingual Education

I have to emphasize the fact that I was hired by NEIU primarily
to fill a position in the BLBC program. In other words, teaching
bilingual/bicultural courses was my primary responsibility. Cer-
tainly, my multilingual background and my professional linguis-
tic orientation coupled with my quality research work were of
great help in handling some of the courses in the program. Nev-
ertheless, there were areas in bilingual theories and bilingual
education with which I was not familiar. Since the late 1960s
bilingual/bicultural education and bilingualism, at large, have
taken gigantic strides forward in theory and pedagogy. Very
much like ESL, bilingual education has during the last few dec-
ades constituted a major domain in applied linguistics. In order
to be ready to handle all aspects of bilingual/bicultural educa-
tion I had to seriously acquaint myself with the latest innova-
tions and applications in the field. The works of Jim Cummins,
Stephen Krashen, Colin Baker, among many others, were thor-
oughly studied.
7
Because the words quality and quantity will keep recurring and acting
jointly in vowels, I have opted to blend them together in the form of qualtity to
be used where necessary.
As if I had not done enough self-education in the field of bilingual
education, the Dean of the College of Education called
for a special meeting of the BLBC faculty during which he assigned
me the responsibility of developing two master level programs
in bilingual education. This implied the need to develop
from scratch a minimum of six (6) graduate courses supported
with the needed bibliography. Fortunately I was relieved of 50%
of my teaching load for one semester. Throughout the semester,
besides teaching my other two courses, I devoted
my time exclusively to this new professional assignment. The first two
months of the semester were dedicated to surveying the literature
on bilingualism and biculturalism in combination with pertinent
insights from applied linguistics, especially ESL. The focus
during the remaining two months was on designing the framework
of each course, identifying its contents and supplying a core
bibliography. The courses were approved by the College of Education
and certified by the educational authorities in Springfield,
Illinois. Very humbly stated, this achievement in bilingual
education enhanced my professional status among my col-
leagues; besides, personally I felt it significantly added to my
professional stature. I also began to feel that I had stepped beyond
my theoretical linguistic or phonetic skills into the realm of ap-
plied linguistics and the pedagogy of teaching language in a
wide variety of classroom and real-life situations. Concurrently
with my triumphant academic wrestling with ESL, language arts
and bilingual/bicultural teaching I began a massive campaign of
national and international academic presentations and publica-
tions. I published scores of papers in refereed journals and spe-
cial volumes in honor of renowned scholars throughout the
world. In 2002, I decided to put my long experience in teaching
and research in several books prior to my retirement. Thus, I
published: Techniques of teaching pronunciation in ESL, bilingual
and foreign language classes in 2003; A Linguistic approach to
the application and teaching of the English alphabet in 2004; Techniques
of teaching comparative pronunciation: English-Arabic, in
2005; and Linguistic tips for Latino learners and teachers of English,
in 2007.
1.4. CONCLUDING REMARKS

My early multilingualism seems to have been the motive behind
my interest in languages, while learning Kurdish seems to have
been the turning-point that guided me in the direction of lin-
guistics. As for my early pre-doctorate writings, they narrowed
down my general interest in linguistics focusing it on phonetics
and pronunciation. Some light should be shed on this statement.
1.4.1. Childhood Trilingualism Triggered Interest in Languages

There seems to be ample evidence that my trilingual/tricultural
upbringing in Kirkuk was the flicker that lit the road for me in
the direction of interest in languages and their teaching. Since
the formal teaching of my native Assyrian language was very
limited, in the last two years of high school I gradually developed
an interest in both Arabic and English beyond their formal classes.
I finally ended up majoring in English for my first degree
and graduated with honors.
1.4.2. Learning Kurdish Triggered Interest in Linguistics

My first degree in English in Baghdad does not seem to have
triggered my future pursuit of self-education in linguistics in
the mid-1960s; instead, I give most credit to the five years of my
stay in Sulaimaniya where I passionately pursued the learning of
Kurdish. The first inkling of an infatuation with what I later
found to be linguistics was born there. During the six years of
teaching after that in Basra and then in Baghdad most of my
readings were in the realm of human language with focus on the
history of the English language, the etymology of words and
some pronunciation issues. My pursuit of language studies in the
direction of linguistics was utterly self-motivated. I published
more than twenty (20) articles all in Arabic8 in different Iraqi
newspapers and magazines followed by my translation of a book
from English into Arabic titled Sounds and signs which was pub-
lished by the Ministry of Culture in 1971. By this time, in Iraq, I
was known as a linguist. Remember this was all prior to my
formal graduate study in phonetics and linguistics in England.
8
Except for one article in English, published in the Baghdad Observer
dealing with Arabic loanwords in English via Spanish which retained their definite
article <al> [= the]. 25 years later I revisited the theme extensively and
in depth with a publication in Zeitschrift für arabische Linguistik, Vol. 33, 1997.
1.4.3. Graduate Study Immersed me in Phonetics and Linguistics

In 1971, after eleven years of applying for some sort of scholarship,
I was granted a study leave, which was financially a third-class
sort of scholarship. It took this long to
obtain even a third-class privilege simply because of political,
ethnic and religious discrimination. Nevertheless, after gaining
this scholarship I joined Leeds University determined to do my
utmost. I joined Leeds because it was the only university in Brit-
ain that granted degrees in phonetics as a separate subject from
linguistics.
In my wildest imagination, I never thought that majoring
in phonetics would be so challenging, demanding and frustrating
on top of the culture shock of moving from Iraq and settling
in Britain. The shock was so powerful in the first weeks
that it pushed me to the verge of frustration; however, I put all
my thoughts together and remembered all those eleven years of
my struggle to continue my higher education as an ethnic Assyr-
ian. I, then, decided to face the challenge and do everything
possible to succeed.
My work was the second doctoral project since the founding
of the Department of Phonetics in 1948, the first doctorate having been
granted in 1970. I had two supervisors: a linguist and a phonetician.
The latter was a young, very active and academically ambitious
faculty member in charge of the phonetics lab and the
courses in acoustics, anatomy and physiology. She wanted me to
be in line with her academic interests and ambitions. She
pushed me almost beyond my scientific tolerance; fortunately, I
managed to respond and began gradually to be more science-
oriented in my research.
The bulk of my research was lab-based. I had to conduct a
wide variety of experiments using different gadgets. The most
challenging, and at times scary, experiments were the ones that
involved inserting catheters or polyethylene tubes through my
nostrils for two purposes. One catheter, with a photo-transducer
at the end of it, was to be positioned just above the vocal folds
to detect the opening and closing of the glottis and whether
there was any vibration in the vocal folds. The other catheter
had two small openings at its end to be inserted through the
nasal passage and positioned between the larynx and pharynx.
The purpose of this experiment was to monitor the intraoral
pressure changes in the vocal tract.
Such experiments were initially very intimidating and I did
not have the support of anyone who had conducted them before.
Nevertheless, the dream of a doctorate in phonetics by an Assyr-
ian who suffered from discrimination and struggled for eleven
years to have a go at it made the trial less daunting. Inserting
the catheters was a truly scary and disgusting ordeal. I suffered
from severe coughing and nausea and my nose was almost dripping
with mucus. During one year, I conducted fifteen successful
attempts and accumulated more data than what I needed for
my doctoral thesis. Being a fast writer, I finished my work, had
my thesis of 420 pages typed and bound 6 months before the
first day of my eligibility to submit.
My graduate studies radically changed me as an educated
person. They gave me academic depth, enhanced my tolerance for
research and innovation and infused more scientific orientation
in my approach to problem finding and problem solving. I be-
came a passionate researcher with a broader horizontal perspec-
tive and deeper vertical vision.
1.4.4. Professional Challenges in the U.S.

The professional challenges in the U.S. in the form of teaching
ESL, bilingual education and traditional language arts were not
in nature what a phonetician was trained to do. Nevertheless,
for a refugee professor, facing those professional challenges was
a matter of survival; indeed, I did not just survive, but I also gained
a treasure trove of professional expertise in those three fields of
teaching. As a researcher and professional, I developed a better
understanding of my strengths and weaknesses, promoted more
intimate connection with my students and cultivated richer
teaching and learning styles. Perhaps more importantly, my
knowledge base in linguistics and phonetics became more ap-
plied especially because of a better understanding of human
language and speech as a multilingual person from childhood to
adulthood. My greatest discovery was that any study of lan-
guage, especially human speech, should begin with the evolu-
tionary potential of the brain and its application through inter-
actions in real-life situations. The knowledge and experience I
accumulated throughout almost five decades of teaching led me
to the pedagogy of teaching pronunciation that is multisensory
and multicognitive in theory and in application.
CHAPTER 2: THE COGNITIVE BASE OF LANGUAGE
2.1. LANGUAGE: A SPECIES-SPECIFIC CODE OF COMMUNICATION

Language as a species-specific entity implies that only human
beings are genetically born with a potential for language acqui-
sition in its generative sense that Chomsky has intensively pro-
moted. In this study, the generative nature of human language
entails the potential to produce and comprehend instinct-free
and stimulus-free infinite chains of meaningful structures using
a finite number of rules. In its underlying structure, the generative
characteristic of human language also implies the potential for
producing an infinite number of meaningful structures from a very
finite number of meaningless sound units (minimal structures)
that are traditionally known as phonemes. This potential is part
of the genetic makeup of human beings and it is ingrained in
their brain.
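The arithmetic behind this claim can be illustrated with a brief sketch. Python is used here purely for illustration; the five-unit inventory, the function names and the sample clause are hypothetical and are not drawn from any particular language. The sketch counts how many distinct sequences a finite inventory permits as the allowed length grows, and shows one recursive rule yielding ever longer structures. Real languages restrict which sequences are permissible, so the counts are only an upper bound on possible forms.

```python
from itertools import product

# Hypothetical toy inventory of meaningless sound units (not a real phoneme set).
INVENTORY = ["p", "t", "k", "a", "i"]


def count_sequences(inventory_size: int, max_length: int) -> int:
    """Count the distinct sequences of length 1 up to max_length."""
    return sum(inventory_size ** n for n in range(1, max_length + 1))


def sequences(inventory, length):
    """Enumerate every sequence made of exactly `length` units."""
    return ["".join(units) for units in product(inventory, repeat=length)]


def embed(clause: str, depth: int) -> str:
    """Apply one recursive rule `depth` times: S -> 'she said that ' + S."""
    if depth == 0:
        return clause
    return "she said that " + embed(clause, depth - 1)


if __name__ == "__main__":
    # The count grows without bound as the permitted length increases.
    for max_len in (1, 2, 4, 8):
        print(max_len, count_sequences(len(INVENTORY), max_len))
    print(len(sequences(INVENTORY, 3)))
    # A single finite rule, reapplied, never runs out of new structures.
    print(embed("it rained", 2))
```

The inventory and the rule both stay fixed; only the number of applications changes, which is the sense in which a finite system can be open-ended.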
The code of language with human beings is radically differ-
ent from that of other beings including apes and birds simply
because with human beings the code is open-ended (infinite in
the generation of meaning) whereas with non-humans it is closed-ended
(finite in the generation of meaning). Thus linguistically,
the word language is exclusively used for human beings, while
its use for other creatures is, at best, figurative; theirs is simply a
finite code of communication to serve finite functions during
their life span.
This faculty of language is the result of millions of years of
natural evolution of Homo sapiens away from chimpanzees, especially
with regard to encephalization, or the amount of brain
mass relative to body mass.1 In more strictly scientific terms, the brain
of a human being has the highest number of nerve cells (neurons)
compared to all other animals; these neurons number in
the hundreds of billions. A certain percentage of those billions
constitutes the blueprint of the most sophisticated code of meaning-generation
and meaning-decipherment: language. Although
certain locations or areas in the brain such as Broca's expressive
and Wernicke's receptive areas (Joseph, 2011) are more
commonly associated with human language, language as it is
understood nowadays is the function of the brain as a whole.
1
http://en.wikipedia.org/wiki/Encephalization.
Many scholars whose scientific works are in line with Darwin's
theory of evolution link the gradual human brain encephalization
with the gradual emergence of the faculty of language.
This reciprocal relationship of evolution between the brain and
language has been identified as a co-evolution phenomenon in
nature (Pinker, 1994; Deacon, 1997; Christensen, 2001). How-
ever, the evolution of modern man goes beyond just the co-
evolution of brain and language. Developing an upright gait and
freeing the hands (so-called bipedality, see Ackerman, 2006)
seems to have preceded this particular co-evolution and, indeed,
prepared for it. Bipedality afforded man three extremely significant
advantages. First, it freed the front limbs and transformed
them into what became hands to be manipulated for more crea-
tive survival strategies. Second, it granted him a more efficient
panoramic vision. Thus, vision-wise, he did not have to stand on
his hind legs, as many animals still do, to see beyond the level of
his eyes when resting on four legs. Third, the upright gait gradually
helped prepare the vocal tract for both physical survival and a
powerful potential for speech-sound generation. These three so-called
sub-evolutions, jointly with others, enhanced his power to
control his environment and interact with it more intimately
and creatively. It is such combinations of sub-evolutions that
have triggered the reciprocal evolution of the human brain in the
form of massive multiplication of its neurons. Figuratively
speaking, the evolution of human brain capacity is similar to the
gigantic increase in the capacity of the hard drive of modern
computers from megabytes to gigabytes to terabytes, etc., but in
a more creative manner.
Obviously, natural survival forced man to be as creative as
possible. The most important strategy for survival was to become
a social animal and form a cooperative community with
others. In turn, living in a community necessitated an efficient
means of communication to plan and consolidate cooperation.
Consequently, this need for communication led to the gradual
emergence of a system to facilitate it. It is highly likely that in
the beginning, the system was based on primitive hand and
body gestures combined with different types of murmurs, grunts
and noises which were more suprasegmental (long) in nature.
No doubt, this was a very primitive and crude system. With
time, and in order to grant the system more efficiency, it even-
tually evolved into well-defined segmental sounds to be coa-
lesced together in conjunction with melody and rhythm to gen-
erate longer meaningful structures. In a sense, it was an evolu-
tion and progression from more generic to more specific, similar
to the evolution of writing from longer and more general units
to shorter and more specific (i.e. pictographic, ideographic, syl-
labic and alphabetic).
In order for such meaningful messages to be comprehended
by all members of the community, they had to be governed by
rules: rules for sound combinations and others for word combi-
nation or syntax. It was these minimal sound units that were
joined together with rules into larger meaningful units that
paved the way to what we identify now as language. This grad-
ual evolution of rule-governed systems and structures of mean-
ing generation and comprehension could not have been possible
without the qualitative and quantitative enhancement of the
human brain. In turn, this enhancement through the power of
selective evolution was genetically internalized. Such a genetic
transformation granted newly-born babies the potential for acti-
vating, in appropriate social environments, a code of infinite
production/comprehension of semantic messages that justified
identifying language as species-specific. It is this potential that
convinced Lenneberg to name his book Biological foundations of
language (1967).
2.2. LANGUAGE: A COGNITIVE-SOCIAL SYSTEM SUPERIMPOSED ON OTHER SYSTEMS

The gradual evolution of language required the generation of
certain types of neurons and neural connections to assume re-
sponsibility for the internalization and activation of language.
The more language distanced itself from grunts, murmurs and
gestures in the direction of more readily generated and easily
identified sounds (noises and voices), the greater the need became
for manipulating many of the basic organs that help the
other systems to assume additional functions. Along this line of
thinking, this newly-evolving sociocognitive system had no specif-
ic organs assigned to it to initiate the needed acoustic and aero-
dynamic conditions for speech generation. Hence, in order for
speech to be generated, it had to manipulate the existing organs
in the human body, especially organs from the digestive and
respiratory systems. From the evolutionary perspective this phe-
nomenon of assigning double functions to certain organs has
been recently named exaptation.2 A standard definition of the
term is an evolutionary process in which a given adaptation is
first naturally selected for, and subsequently used by the organ-
ism for something other than its original, intended purpose
(Croom, 2003). Stated differently, the process stands for assign-
ing an extra function to a system or organ in addition to its orig-
inal function or purpose. The most commonly cited example for
exaptation as an evolutionary process is the case of feathers in
birds. Biologically, it has been stated that the original purpose of
feathers in birds had been for the control of body temperature; it
was only later that the evolutionary process adapted them for
flying. In the case of speech, a typical example of exaptation is
the radical modification in the shape of the vocal tract and the
additional functions assigned to different digestive and respira-
tory organs. Below are some of the typical exaptation examples
that facilitated the proper production of speech.
2.2.1. Vocal Tract Modification

The most significant outcome of exaptation accompanying language
evolution is the modification of the tract between the lips
and the vocal folds (cords). Presently, in human beings, instead
of the tract having an approximately 45° curve, as it still does in
many mammals as well as in newborn babies, in a mature
adult that tract has an almost 90° curve and is known as the
vocal tract. A newly-born baby has a mammalian larynx that
can rise, enabling concurrent breathing and eating, and not until
the age of three months are its speech organs ready for producing
vowels (Pinker 1994:354). It is because of the early 45°
shape of the vocal tract and their ability to separate the breathing
and eating tracts that babies can drink liquids while lying on
their back whereas adults have serious difficulties doing that
with a 90° tract. This 90° curve has resulted from the considerable
lowering of the larynx, thus pulling the epiglottis away from
the velum, and making contact between the epiglottis and velum
no longer possible. It is because of this separation that swallowing
is not possible while breathing for adults and vice versa.
2
Etymologically, the word was coined from the prefix <ex> +
<(ad)aptation> with the deletion of (ad).
In modern man, the 90° vocal tract has two fairly distinct
dimensions: a horizontal dimension running from the lips in
front to the end of the oral cavity in the back and a vertical
dimension running from the pharynx down to the vocal folds
enclosed in the larynx. This evolutionary modification of the
vocal tract has remarkably enhanced the articulatory, aerody-
namic and acoustic suitability of the vocal tract for immensely
diversified sound generation (in the larynx) and noise genera-
tion almost along the entire vocal tract.
2.2.2. Vocal Folds (Cords) Modes

The vocal folds (commonly known as cords) are small lip-like
muscular tissues that are jointly and horizontally anchored to
the thyroid cartilage (Adam's apple) in front and separately
connected at the back to the arytenoid cartilages.
Their natural function in the human body is to close during
swallowing, sealing the airway against the accidental entry of any
object into the lungs, and to open during breathing. With the evolution of
speech, the vocal folds assumed an extremely important func-
tion, namely, voice generation through their different modes of
vibration. In fact, the vocal folds constitute the major source of
enrichment for the human sound inventory. They are involved in the
generation of both segmental (consonants and vowels) and
suprasegmental (stress, tone and intonation) features. It is worth
mentioning that the presence or absence of vocal fold vibration
constitutes the richest and most economic principle for the
generation of contrastive pairs of sounds. The voiced vs. voiceless
dichotomy, which is a universal feature in human speech
(Aitchison, 1996: 183), is the most economic and convenient
distinctive feature to double the basic sounds in any language.
Aesthetically, when voice was embellished with harmony and
melody, humans crafted the most popular form of entertainment
in human history: singing.
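The doubling effect of the voiced vs. voiceless dichotomy can be illustrated with a minimal sketch. The five-sound inventory and the function name below are assumptions chosen for illustration only; they are not drawn from any particular language description.

```python
# Hypothetical toy inventory: five voiceless sounds and their voiced partners.
VOICED_COUNTERPART = {"p": "b", "t": "d", "k": "g", "f": "v", "s": "z"}


def with_voicing(voiceless_sounds):
    """Pair every voiceless sound with its voiced counterpart."""
    inventory = []
    for sound in voiceless_sounds:
        inventory.append(sound)                      # vocal folds not vibrating
        inventory.append(VOICED_COUNTERPART[sound])  # vocal folds vibrating
    return inventory


base = list(VOICED_COUNTERPART)   # the voiceless series only
doubled = with_voicing(base)      # one binary feature added
print(len(base), len(doubled))
```

In this sketch a single binary choice (vibration or no vibration) turns five articulations into ten contrastive sounds without requiring any new place or manner of articulation.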
2.2.3. Tongue Functions and Maneuverability

The role of the tongue in all mammals is of prime importance
for survival. It has vital functions in feeding: It plays a major
role in ingestion, as in licking, lapping, and browsing; and it
moves food distally through the oral cavity from the incisors to
the post-canines for chewing, and then to the pharynx for bolus
formation and swallowing (Hiiemae and Palmer, 2000). Nevertheless,
the role of the tongue in speech is equally important. It
is so important that in many cultures, such as English, Turkish,
Assyrian, Hebrew, etc., the word "tongue" is synonymous with
"language". In actual articulation, its configurations, distances
from and approximations to the other passive or active articula-
tors inside or outside the vocal tract (such as the lips) give birth
to almost all common vowels in human languages. In the pro-
duction of consonants, its role is no less significant. It is the
primary articulator that determines the classification of the ma-
jority of unmarked (common) speech sounds in terms of place
and manner of articulation. The versatile muscular structure of
the tongue grants it so much plasticity that it can even endure
antagonistic movements such as having the tip of the tongue placed
at the incisors and alveolar ridge with a simultaneous gesture of a
drastic push of its back/root posteriorly into the pharynx, as is the
case with the emphatics in Arabic. In short, evolution has given
the tongue, the forefront organ of the digestive system, equally
important functions in the human speech production system,
especially through its greater maneuverability within the
vocal tract.
2.2.4. Lip Configurations

The essential biological function of the lips is to help food begin
its journey along the digestive system. In speech production,
they are the organs that generate most of what could be called
the "visible" sounds or sound features for both vowels and
consonants (Odisho, 2003). In the case of vowels, for instance,
lip-position (spread, neutral, rounded), which is the only distinctly
visible feature in vowel production, constitutes one of the three
primary parameters in vowel formation and description. As for
consonants, the lips, jointly or severally, are active in the
formation of all bilabial, e.g., [b, p], labial-dental [v, f] and
labial-velar [w] sounds. The labial/bilabial feature is not only felt
by the speaker, but is also seen by the listener.
2.2.5. Cavities Resonance

The current 90°-shaped vocal tract has augmented the sources of
speech resonance, thus contributing considerably to the diversi-
fication of speech sounds and allowing different languages
throughout the world to select their own sound inventories
based on places and manners of articulation. Among such reso-
nance cavities are the laryngeal, buccal (mouth), nasal and
pharyngeal cavities. No doubt, all those cavities were biologically
designed primarily to serve the respiratory and
digestive systems. The laryngeal and buccal cavities have
responded to evolutionary pressure leading to clear-cut exaptation;
the nasal cavity, for instance, has done so to a lesser degree. The structure
and the anatomy of the nasal passage with the absence of a
movable part (articulator), with many side chambers and heavy
coating of mucous membrane render it a highly efficient air-
conditioning tract for respiration, but a very inefficient cavity
for sound production and resonance. This explains why, in the
human speech production mechanism, it is the oral passage that is
predominant. Only a few nasals and nasalized sounds are
attested in human languages, in general. Proprioceptively, it is quite
difficult to feel nasal airflow, but one can certainly prove its
presence by sustaining a nasal sound, such as [m] or [n], and
then suddenly pinching the nostrils shut with the fingers, at which
point the sound is abruptly cut off.
2.3. BRAIN SPEAKING VIA RESPIRATORY AND DIGESTIVE SYSTEMS

It is clear from the above brief descriptions that the biological
functions of certain parts and organs of the digestive and respir-
atory systems and the additional functions assigned to them
throughout millions of years of evolution to generate language
have all led to radical changes in the brain of human beings and
other biological systems. The evolutionary growth in brain ca-
pacity and the concomitant evolutionary modifications in some
organs have granted human beings much better qualifications
for physical, cognitive and social survival and creativeness. In
the forefront of such qualifications is the emergence of language
as a unique sociocognitive privilege that sets human beings apart
from other primates.
When the brain fires its instructions for a message to be de-
livered, it is primarily the respiratory system and the upper end
of the digestive system that facilitate the transformation of the
cognitive message into an audible one through aerodynamics
and acoustics. The lungs pump the air (the dynamic power) and
send it through the necessary channels for vibration generation
(voice) at the vocal folds level, if needed, combined with the
appropriate degree of turbulence noise at different junctions
along the vocal tract. It is this concomitant combination of
systematic and rule-governed acoustic signals that impacts the ear of
the listener, whose brain then decodes those signals according to
the pre-conceived code of a given language. Without such a
pre-conceived code, the on-coming acoustic signals would be
meaningless. It is just like listening to a language that one does not
know.
Usually in a face-to-face communication, speech is more
readily transmitted and decoded between speakers and listeners
because it is naturally accompanied by facial, hand and body
gestures. This is yet another aspect of human speech where it
out-performs the communication code of other primates.
According to more up-to-date research, it seems that gestures
have a tight and perhaps special coupling with speech in
present-day communication. In this way, gestures are not merely
add-ons to language; they may actually be a fundamental part
of it (Kelly, et al, 2009). In other words, the authors conclude:
"If you really want to make your point clear and readily
understood, let your words and hands do the talking."
2.4. ECONOMY IN LANGUAGE

Some of the plain definitions of the term "economy" read like the
following: "Careful, thrifty management of resources; an
orderly, functional arrangement of parts; an organized system."3
According to such definitions, human language turns out to be one
of the most economic systems ever developed in nature. It is a
system that uses finite minimal meaningless units to generate
infinite meaningful multi-length units. It is this genius in human
language that Martinet (1964) calls "double articulation". The
principle of double articulation is mathematically generative in
nature. It first creates minimal units without meaning because if
they were with meaning then any single unit or any combina-
tion of them would result in meaningful units. With such a sys-
tem it would be too confusing for the brain to retain so many
units with meaning. In the face of such vulnerability to confu-
sion, the brain imposes rules that generate redundancy which, in
turn, provides the sufficient stimulus needed to acquire the sys-
tem of language. Redundancy, therefore, provides the stimulus
needed to acquire a complex grammar system.4 One such
powerful rule of redundancy is to deprive the minimal units of the
bottom-most layer of language (the so-called phonemes) of
meaning and to start assigning meaning to units at a higher level,
generally known as morphemes. Other sets of rules are added to
regulate the syntactical relationships of longer stretches of
speech. It is this multi-layered reversed-pyramidal structure of
language that becomes infinitely generative. It uses a finite set
of rules to create finite systems which jointly are capable of
generating infinite meaningful structures. These sets of rules
serve two primary purposes: first, to enable the speaker/listener to
guess meaningful units from meaningless ones within a given
language; second, to enable the brain to internalize the finite rules
subconsciously and intervene consciously when needed.
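The generative arithmetic behind double articulation can be made concrete with a minimal sketch. The toy inventory, the function names and the length limit below are assumptions introduced for illustration; the count covers every combinatorially possible string, whereas a real language admits only the subset that its own rules permit.

```python
from itertools import product

# Hypothetical toy inventory of meaningless minimal units.
PHONEMES = ["p", "t", "k", "a", "i"]


def count_strings(n_units, max_len):
    """Count all strings of length 1..max_len built from n_units distinct units."""
    return sum(n_units ** length for length in range(1, max_len + 1))


def strings_of_length(units, length):
    """List every possible string of exactly `length` units."""
    return ["".join(combo) for combo in product(units, repeat=length)]


print(count_strings(len(PHONEMES), 3))   # 5 + 25 + 125 = 155
print(count_strings(40, 5))              # 105025640 with 40 units, up to 5 long
print(strings_of_length(PHONEMES, 2)[:5])
```

Even under these simplified assumptions, a few tens of units yield over a hundred million candidate strings of five units or fewer, which suggests why a small finite inventory is sufficient for an open-ended vocabulary.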
Such a transformational production/reception of meaning
is made possible by the very nature of the human brain, which
has the capacity to function as the perfect encoder (generator of
the code) in the case of the speaker and the perfect decoder
(decipherer of the code) in the case of the listener. In order for the
speaker-listener flow of interaction to be triggered and
continued, the embedded cognitive code in the brain of the speaker
has to be able to reciprocate with the cognitive code in the brain
of the listener and vice versa. Stated differently and plainly, they
have to be speakers of the same language. Meaningful linguistic
communication will be impeded if the underlying cognitive code
is not the same. Evolution through natural selection
has empowered certain centers in the human brain to function
as the decoding or encoding centers. Although the human brain
is holistically responsible for human language, more specifically
the former is identified as the reception center, commonly
known as Wernicke's area, and the latter as the production center,
commonly known as Broca's area.
3
http://www.thefreedictionary.com/economy.
4
http://en.wikipedia.org/wiki/Redundancy_linguistics.
Nature is simultaneously a creation and a creator. It rests
on three pillars, namely biology, physics and mathematics. Biol-
ogy creates, physics balances and mathematics computes. It is
the change in the biology of the brain and other survival sys-
tems (respiratory and digestive) brought about by evolution that
made language possible. Then, the rules of physics kicked in to
balance the ensuing changes through computations that mathe-
matics made available. This is nature at its best. Under the in-
fluence of evolution, creations in nature are susceptible to
change. At times, with changes problems emerge. Nevertheless,
nature is conscientious enough not to disturb its own balance by
only initiating problems; its problems are followed by solutions.
For example, when nature enabled the brain to grow larger and
more powerful and capable of housing the species-specific lan-
guage, it (nature) forced the respiratory and the digestive sys-
tems to adapt and facilitate the physical, aerodynamic and
acoustic prerequisites of language generation. For instance, the
45° mammalian digestive/respiratory tract evolved into a 90°
tract to accommodate the speech apparatus. The vocal folds had
to acquire far more maneuverability than merely adducting the
passage to the lungs when swallowing. They increased in
elasticity to develop different and more sophisticated patterns of
abduction/adduction and tension/relaxation. The relevant
muscular apparatus and innervations had to be able to enhance the
raising/lowering capability of the larynx to adjust the
aerodynamics of voice and noise generation not just for speech
production but also for different forms of singing.
Another of the most characteristic features of the human
brain is its hemispheric functional asymmetry which is complet-
ed through the so-called lateralization process according to
which the left brain controls the right side functions of the body
and vice versa. This lateralization process is usually completed
by the age of puberty. If one seeks an explanation for this hemi-
spheric specialization the answer seems to combine the principle
of functional economy with increased specialization.5
The ability of human beings to develop their cognitive,
physical, aerodynamic and acoustic potentials to accommodate
language is the best gift uniquely bestowed by nature upon
them. The evolution of such a potential to express limitless
meaning with minimum physical and mental effort is a superb
example of the principle of economy in action when economy is
defined as the minimum amount of effort to achieve the maxi-
mum result (Vicentini, 2003). It is because of this dominating
principle of economy that human beings are able to produce
about three words per second or one sound every tenth of a sec-
ond on average and make only about one sound error per mil-
lion sounds and one word error per million words (Caplan,
1995).
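The rates cited above can be converted into a rough sense of scale. The sketch below assumes, purely for illustration, uninterrupted speech at the stated average rates; the constant and function names are hypothetical and are not taken from Caplan (1995).

```python
SOUNDS_PER_SECOND = 10   # "one sound every tenth of a second"
WORDS_PER_SECOND = 3     # "about three words per second"
UNITS_PER_ERROR = 1_000_000


def hours_per_error(units_per_second, units_per_error=UNITS_PER_ERROR):
    """Hours of continuous speech expected between two errors."""
    seconds = units_per_error / units_per_second
    return seconds / 3600


print(round(hours_per_error(SOUNDS_PER_SECOND), 1))    # hours per sound error
print(round(hours_per_error(WORDS_PER_SECOND), 1))     # hours per word error
print(round(SOUNDS_PER_SECOND / WORDS_PER_SECOND, 2))  # average sounds per word
```

Under these assumptions a speaker would talk continuously for more than a full day before producing a single sound error, which conveys how little effort is spent on correction.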
Within this grand system of economy in language, there are
different sub-systems that contribute to building the unique sys-
tem of speech generation. Without such sub-systems the burden
of language on the brain would simply be too much to endure.
To illustrate, the human speech apparatus can hypothetically
generate an infinite number of sounds; however, two questions
immediately arise. First, does a human language need that many
sounds to generate speech? Second, does the brain, which has
millions of other biological functions to handle, like to burden
itself with stocking up thousands of speech sounds? The answer
to both questions is no. In the first instance, the generative
design of language requires only tens of sound units (phonemes) to
be recycled again and again in a recurrent and generative
manner. This highly economic design is a universal feature of human
language without which it would lose its limitless creativeness. In
the second instance, there is a salient tendency in language to
manipulate sounds whose articulatory maneuvers are
predominantly easy to produce and easy to perceive.
5
http://pandora.cii.wu.edu/vajda/ling201/test4materials/language_and
the_brain.html.
The above two dynamics minimize ambiguity and enhance
clarity. It is, therefore, not accidental that the majority of speech
sounds are produced in the anterior half of the vocal tract in the
form of labial, bilabial, inter-dental, dental, alveolar, etc.,
sounds; moreover, although in actual speech each sound unit
may have a large number of different phonetic variations
(allophones) for the same unit,6 the speaker does not recognize and
store all those phonetic variations; rather, the speaker cognitively
internalizes only one abstraction for all the variations of a given
sound unit, known as a phoneme. It is only in the context
of live speech that the phonemes mold themselves to the context
in which they occur, thus yielding suitable contextual variants
(allophones).
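The relationship between one stored abstraction and its contextual variants can be sketched as follows. The example is a deliberately crude illustration built on the four English words cited in the footnote on /p/; it works on spelling rather than on real phonetic transcription, and the function names and feature labels are assumptions.

```python
def realize_p(word):
    """Describe how the <p> in `word` is realized, judging from its neighbors."""
    i = word.index("p")                      # raises ValueError if there is no <p>
    after_s = i > 0 and word[i - 1] == "s"
    features = ["unaspirated" if after_s else "aspirated"]
    following = word[i + 1 : i + 3]          # crude look-ahead at the spelling
    if following == "oo":
        features.append("lip-rounded")
    elif following == "ee":
        features.append("lip-spread")
    return tuple(features)


def categorize(features):
    """Every contextual variant collapses onto the same stored abstraction."""
    return "/p/"


words = ["pit", "spit", "pool", "peel"]
allophones = {word: realize_p(word) for word in words}
phonemes = {categorize(variant) for variant in allophones.values()}
print(allophones)
print(phonemes)
```

In this sketch four distinct realizations are produced, yet only one category needs to be stored, which is the economy that the phoneme/allophone distinction captures.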
Let us consider yet another example to demonstrate the
principle of economy in language. When speaking, the brain
fires instructions to the phonemes (abstractions) to construct a
word, but the dynamic nature of speech, especially the mutual
interactions of sounds in the flow of speech, creates different
shades (allophonic variations) for each fired phoneme. It is only
the listener who hears in phonemes because they go directly to
his brain which stores only a finite inventory of phonemes. To
state the same fact in more accurate linguistic terms, the speaker
speaks phonetically while the listener recognizes phonemically
(phonologically).
6
For instance, the /p/ phoneme may have several different realizations in
different contexts; it can occur as aspirated as in <pit> or unaspirated as in
<spit>, with lip-rounding as in <pool> or with lip spreading as in <peel>.
2.5. CONSCIOUS AND SUBCONSCIOUS BRAINS

In the preceding section, different pieces of evidence were knit
together to highlight the economic premise of language as a re-
flection of the robust tendency towards economy in brain ener-
gy. It is the massive data storage capacity of the brain that ren-
ders language an open-ended system capable of producing and
recognizing endless meaningful stretches of speech. Neverthe-
less, one of the most salient attributes of the brain that relates to
human language and empowers it to be such a rich, creative and
sublime medium of communication will be discussed in this
section. This attribute is the twin nature of the human brain as
conscious and subconscious (or unconscious). Although some
non-human creatures may manifest a few hints of such a division of
labor for the brain, in reality it is only a nominal division com-
pared to that of human beings. It is, therefore, logical and sub-
stantiable to say that the human dichotomy of conscious-
subconscious brains has been one of the main evolutionary de-
velopments that gradually emerged to manage, administer and
execute millions of biological, social and cultural functions that
humans have to successfully execute in order to survive healthi-
ly and rationally. One such fundamental function of the brain is
language; indeed, without a highly sophisticated brain there
would be no language. Furthermore, without the dichotomy of
conscious and subconscious brains, language would be too much
of a mental burden on the conscious brain to be able to handle it
so smoothly and effortlessly.
Here again there is a distinct division of labor between the
two brains for the sake of economy in effort through securing
maximum coordination and harmony in the execution of the
myriad functions of the brain. At best, the conscious brain is
responsible for approximately 10% of total brain functions, the
rest being the responsibility of the subconscious brain. The
conscious brain is responsible for any action that it decides to
initiate. Once it decides on a certain action, most of the
requirements for completion are automatically delegated to the
subconscious brain. For instance, the decision to participate in a
marathon is the responsibility of the conscious brain, but the
physical preparation of the body (neuromuscular coordination
and respiration) and the implementation of the actual running
are executed by the subconscious brain. To cite another
example, when one is engaged in an informal conversation with family
members and friends, a sizeable percentage of the conversation
is managed by the subconscious brain. One does not seriously
engage in planning the contents of the conversation, such as
selecting the needed vocabulary and monitoring the morphological
and syntactical rules, because the lexicon and the rules of
grammar are stored in the subconscious.
However, unlike this informal conversation, if a person
delivers a formal speech, the conscious brain assumes a greater
role to cater for targeted contents, careful selection of the
needed lexicon and greater adherence to formal
grammatical rules of morphology and syntax. Because of the
conscious role of the brain, the speaker might have more pauses,
hesitations and repetitions than in a casual social conversation
due to a covert conflict between the two brains.
For all biological functions, the subconscious brain never
sleeps because it has to monitor the heartbeat, respiration, blood
circulation, secretion of the necessary glands, digestion and
scores of other survival tasks. Most important of all, the subcon-
scious brain is the sentinel of our normal continued existence
especially inasmuch as language is concerned; it is the seat of
the long-term memory unlike the conscious brain which usually
handles sensory memory (for just a split second) and at best short-
term memory (for a few seconds). With every impression, experi-
ence and event that a person intends to maintain for the long
term, it is the duty of the conscious brain to serve as a medium
to transfer them to the subconscious brain followed by some
reinforcement strategies.7 It is not unrealistic to say that when in
the morning we prepare ourselves to go to work most of what
we do is to obey our autopilot (subconscious brain) which will
help us dress ourselves, have breakfast, start the car, leave the
garage, close the garage door and drive. If all those habitual ac-
tions were to be handled by our conscious brain, by the time we
were at work we would already be stressed out.
7
The retention in the long-term memory is the result of anatomical or bio-
chemical changes that occur in the brain (Tortora and Grabowski, 1996).
In an earlier study of the strategies for teaching pronunciation,
it was pointed out that language acquisition is a process of
mental (cognitive) habit formation (Odisho, 2003). I used the
term cognitive to mean storing the linguistic habits in the brain
and retrieving them instantaneously when needed. It is common
knowledge that whenever anything has been repeated a
sufficient number of times to have become habitual, it becomes
second nature, or rather a subconscious action (Larson, 1912).
We all as babies, toddlers or young children have slowly and
gradually learned how to grasp things with our fingers, balance
our bodies, walk, run, ride a bicycle and perform a countless
number of functions which through constant and systematic
repetition have been transformed into automatic, effortless and
subconscious survival functions. Such transformations are the
greatest relief that nature has ever bestowed upon human be-
ings. It is this transfer of the mental load from the conscious
brain to the subconscious that helps the former avoid collapse
under the pressure of too many requests and commands for ac-
tion. Without mental habit formations, life would be too stress-
ful and burdensome on the conscious brain. In our day-to-day
life, when we say that life is becoming stressful we simply mean
that we are engaging the conscious brain in responding to sev-
eral problems simultaneously, knowing that the brain prefers to
handle one problem at a time. Thus, normal survival of human
beings without a powerful subconscious brain is virtually impos-
sible. It is estimated that the subconscious brain has 40,000,000
nerve impulses per second, while the self-conscious brain fires
40 nerve impulses per second.8
The ease with which human beings use language as their
most efficient social and cultural tool has only been possible
through continuous mental transferences from the conscious to
the subconscious. All the required basic constituent units of
sounds, the rules that combine them in different formations and
assign meaning to them are gradually transferred from the
conscious brain to the subconscious, especially in childhood. It is
estimated that the window for language is from birth to 10 years
old. Note how quickly children learn a new language compared
to adults. Moreover, unless the children learn the new language
at a very early age, they will most likely have an accent in that
language for life (Carpenter, 2004).
8
http://brainforsuccess.com/howyourbrainwork.html.

Evolution is the magic with which Mother Nature works its mir-
acles. If nature imposes a new function or creates a new prob-
lem for its beings, it must, later on, devise a road map to solve
the problem. In this instance, nature forced man to innovate a
culture of survival. The foremost tool for such a culture of sur-
vival was language. However, language as a sophisticated gen-
erative system needs a large brain to store its building blocks, its
rules and organize them to generate the semantic strings that
speech is made of. So, the human brain gradually grew larger
and larger to be the largest relative to body size of all brains of
all animals. In order for language to be spoken, it had to have
the necessary organs to initiate physical energy, to create the
aerodynamic conditions that, in turn, afford the needed acoustic
prerequisites that are translated into speech in the brain of the
listener. This is how nature instructed the brain to negotiate
with the respiratory and digestive systems to collaborate with-
out compromising their biological functions. Both systems re-
sponded by imposing some modifications on their basic biologi-
cal structures and functions. Thus, evolution gave human beings
a modified physical tract that became capable of (almost) per-
fectly operating three systems (respiration, digestion, speech)
instead of two. The evolution of human language is a typical
example of compromise between nature and nurture (Bates,
1999). If the human brain is the most powerful computer ever to
have evolved (even compared to our modern computers) it is
because the brain is the result of millions of years of evolution
whereas the latter are only the product of the last few decades.
Stephen Hawking, in The Universe in a Nutshell, says, "Present
computers remain outstripped in computational power by the
brain of a humble earthworm" (cited in Carpenter, 2004).
CHAPTER 3: LANGUAGE IN THE BRAIN OF A CHILD
3.1. LEARNING VS. ACQUISITION: CONCEPTUAL DIFFERENCES

In dealing with human mastery of language, the two terms
learning and acquisition are the most frequently used to de-
scribe the process. Obviously, learning is more generic in con-
notation than acquisition. The latter carries more technical de-
notation inasmuch as the phenomenon of human language in-
ternalization is concerned. The technical denotation has
emerged with the rise in the popularity of modern linguistic
studies, especially applied linguistics with specific focus on child
language mastery (i.e. acquisition), bilingualism and second
language learning in adulthood. In its specific sense, the term
acquisition has been used to identify the natural process
through which a normal child masters its native spoken lan-
guage perfectly and almost effortlessly compared to second lan-
guage learning by an adult.
There are very significant theoretical and applied distinc-
tions between the two processes that are unfortunately known
only to people who have been exposed to neurolinguistics, cog-
nitive psychology and linguistics, per se. The majority of people
use the term learning; some use the two terms interchangeably.
The applied side of the distinction will be dealt with in the up-
coming chapters. Theoretically, however, there are three ration-
ales for shedding light on the distinction this early. First, the
distinction between the two processes is related to the age of the
learner and the nature of brain involvement in each case. Sec-
ond, the implicit association of the two processes with the di-
chotomy of nature vs. nurture is significant in the context of this
book. Third, although learning and acquisition are occasionally
used as mutually exclusive, in the application of the pedagogy
promoted here the two processes are handled as complementary
in nature.
3.2. THE BRAIN OF A CHILD AND LANGUAGE

Children are the best learners of language on earth simply be-
cause nature has endowed them with this gift. They begin inter-
nalizing their mother tongue not just in the postnatal period, but
also before birth. Language begins burgeoning with the early
formation of the fetus. One of the key findings of an ongoing
research project by Canadian and Chinese researchers who are
studying infant development suggests that while still in the
womb, our brains learn speech patterns, laying the groundwork
for language acquisition.1 Although we think of human infants
as being behaviorally immature at birth, there are many aspects
in which the human brain is unusually mature at birth (Clancy
and Finlay, 2001).
3.2.1. Child Brain Formation and Maturation

In terms of the power and speed of information encoding,
generation, transmission, reception and decoding, no match for
the human brain has yet been identified in nature or in modern
technology. If a comparison is made between the human
brain and the latest most powerful computer in the world,
research shows that each neuron is comparable to a single
computer. With one hundred billion neurons in the brain, we have
the neural capacity of approximately one hundred billion net-
worked computers each of which gets modified and updated on
an everyday basis.2 Since the focus of this book is on language,
in general, and pronunciation, in particular, many of the biolog-
ical and physiological aspects of the human brain fall outside
the realm of the book. All attention will be centered on the
function of the neurons (brain cells) and their synapses,3 which
constitute the neural network of the brain in crafting and operating
human language. The neuron transmits information through its
axon and receives information through its dendrites (Frana,
2006). The intricate manner in which this network enables a
child to internalize the code of a language into which it is born
with so much ease is simply the prime miracle of Mother
Nature.
1
http://abcnews.go.com/Technology/story?id=97635&page=1.
2
Wesson, Neuroscience: http://www.sciencemaster.com/columns/wesson/wesson_part_03.ph
3
A synapse is a junction between two nerve cells, consisting of a minute
gap across which impulses pass by diffusion of a neurotransmitter. When a nerve
impulse reaches the synapse at the end of a neuron, it cannot pass directly to the
next one; instead, it triggers the neuron to release a chemical neurotransmitter.
The neurotransmitter drifts across the gap between the two neurons. On
reaching the other side, it fits into a tailor-made receptor on the surface of the target
neuron, like a key in a lock. This docking process converts the chemical signal
back into an electrical nerve impulse.
(http://www.sciencemuseum.org.uk/WhoAmI/FindOutMore/Yourbrain/Howdodrugsaffectyourbrain/Whatsasynapse.aspx).
It is startling to realize how much of fundamental brain
morphology and organization is already laid down by the end of
the first trimester even before many mothers realize that they
are pregnant (Clancy and Finlay, 2001). At birth, each neuron in
the cerebral cortex has approximately 2,500 synapses.4 By the
time an infant is two or three years old, the number of synapses
is approximately 15,000 synapses per neuron (Gopnick, et al.,
1999). At its peak, the cerebral cortex of a healthy toddler may
create 2 million synapses per second.5 In general, young
children may have more synapses than they will ever need. The
development of synapses occurs at an astounding rate during
children's early years in response to childhood experiences. In fact,
the sudden increase in synapses amounts to more than what an
adult brain usually has.
4
http://faculty.washington.edu/chudler/plast.html.
5
https://www.childwelfare.gov/pubs/issue_briefs/brain_development/how.cfm, 2009
The significant question at this juncture is: why is there
such a gigantic surge in synapse formation or synaptogenesis?
The obvious answer lies in the immense environmental pressure
on the brain brought about by the physical and mental growth
of a child. In fact, the abundance in synapse formation is a
safety and precautionary measure on the part of nature to
successfully carry a child through the most significant phase in human
life. Once the formative years carry the child along the path to
maturity and enable it to successfully and creatively engage in
normal physical and cognitive survival, a visible sudden drop in
the number of synapses, called synaptic pruning, takes place.
Human and animal studies show that the mammalian brain undergoes
massive synaptic pruning during childhood, removing
about half of the synapses before puberty (Chechik et al., 1999).
It is, therefore, fair to say that the infant arrives in the world
with a nervous system whose working components are in place
and organized. All cells are generated, major incoming sensory
pathways are in place and all have already gone through a period
of refinement of their total number of cells, connections, and
topographic organization (Clancy and Finlay, 2001). There is
ample scientific evidence that the human brain is primed, right
from its fetal phase through adulthood, to be the unique and
powerful organ that constructs, operates and manages the most
perfect communication system ever known in history: human
language. The relationship between the human brain and human
language typifies a perfect balance between nature and nurture:
the former creates and demands while the latter reacts and
heals.
As for building cognitive habits in the brain, it is worth
mentioning that the more frequently the neuron connections are
used, the more they retain information and the stronger they
become.6 It is in this manner that children execute the process of
natural acquisition of language.
3.2.2. Formative Months and Years of Mother Tongue

Research indicates that infants are able to respond to sound 10
weeks before birth through bone conduction (Shiver, 2001).
Nevertheless, the actual acquisition7 of language begins at
birth, when the infant is exposed not only to its mother's speech
but also to the speech of other members of the family and to all
natural sounds in its environment. The period from birth
through childhood and early adolescence is the prime time
for language acquisition. The brain, through billions of neurons
and trillions of synapses, is ready to assimilate any structures of
language, beginning with its minimal sounds and moving through
larger combinations of them in the form of words, phrases, clauses
and sentences leading to discourse. This burgeoning of language
acquisition neatly coincides with the mushrooming of the synapses
in the brain that become the pathways to the neurons for a
two-way communication of speech production and perception.
6 http://www.brighthubeducation.com/infant-development-learning/35203-sensory-stimulation-for-infant-brain-development/
7 Acquisition is used here in its technical denotation as a natural, effortless
and subconscious process of language internalization. It is collectively, but
not absolutely, opposed to the process of learning, which tends to be effortful
and conscious.
Clancy and Finlay (2001) neatly summarize child language
evolution during the first thirty (30) months as follows:
First, the period between 8-10 months is a behavioral wa-
tershed, characterized by marked changes and reorganiza-
tions in many different domains including speech perception
and production, memory and categorization, imitation, joint
reference and intentional communication, and of course
word comprehension. It seemed plausible that this set of
changes (which are correlated within individual children)
might be related to patterns of connectivity and brain me-
tabolism. Second, the period between 16 and 30 months en-
cases a series of sharp non-linear increases in expressive lan-
guage, including exponential increases in both vocabulary
and grammar. A link seemed possible between this series of
behavioral bursts and a marked increase in synaptic density
and brain metabolism that was estimated to take place
around the same time.
Once the base of language is consolidated in the brain of a child,
the massive synaptic activity slows down in volume and speed.
It is at this stage that the synaptic pruning phase kicks in. A huge
number of neurons perish and even greater numbers of synapses
are simply eliminated; it is truly a use-it-or-lose-it situation
(Shiver, 2001). Other synapses are redirected to handle additional
cognitive functions. For instance, not all of the millions or
billions of synapses that were engaged in internalizing the sound
system (phonology) of the native language are needed any longer,
because the phonology and the articulatory maneuvers required for
its actual production/perception are already perfectly internalized
to the extent of being fully cognitively habitual or subconscious.
It is also quite likely that a sizeable percentage of the neurons
and synapses are assigned different linguistic functions, such as
enriching the lexicon (vocabulary) and upgrading
the morphological and syntactical rules. For example, it is quite
common for very young native speakers of English to demonstrate
excellent mastery of their sound system, but fail to
correctly form the past tense and past participle of irregular verbs
such as <go>. For instance, instead of producing
<went> and <gone>, they apply the dominant rule of the <-ed>
ending, resulting in <goed> in both cases. In light of such
grammatical flaws, it is likely that some of the phonology-oriented
synapses in the brain of those young learners will be
redirected to engage in the mastery and refinement of
grammatical rules. This neural and synaptic scenario of
peak activity, slowdown and redundancy is very reminiscent of
the working conditions in a factory. When there is high demand
for goods, the laborers work overtime; when the demand decreases,
a slowdown follows, forcing the management to lay off
some laborers and/or reassign others to different jobs. In this
game of brain and language, the brain is the manager, the neurons
and synapses are the laborers and the requirement of language
acquisition is the demand.
3.3. COGNITIVE TRANSITION IN SOUND PERCEPTION AND PRODUCTION
Once the child is born into the world, it has to transition from
the world of hearing to the world of listening, as the former is a
sense while the latter is a skill; indeed, listening can be
thought of as applying meaning to sound, allowing the brain to
organize ... listening is where hearing meets brain ... listening to
language is uniquely human (Beck, 2011). In the process of
language acquisition, the child has to hone its listening skills to
be able to discriminate one sound from another. Discriminative
listening is a cognitive process through which sounds are internalized
into long-term memory to be subconsciously and
effortlessly identified, selected and produced when needed.
Neurolinguistic literature is replete with evidence substantiating
the fact that very young infants discriminate not only the
segmental contrasts of their native language, but many
nonnative contrasts as well (Best, 1991). This implies that the
phonetic inventory available to infants and young children
is richer than the one they retain as adults. Best's (2002)
findings further suggest that infants begin life with language-universal
abilities for discriminating phonetic sounds that do not
exist in their language environment. However, these abilities
gradually decline in cross-language speech perception between
infancy and adulthood; in fact, adult speech perceptual ability is
more limited, reflecting discrimination of only those contrasts
which are phonemic in the listener's native language (Werker
and Tees, 2002).
It might be that the difficulties adults have reflect a
permanent, absolute loss of sensory-neural sensitivity to the
acoustic properties that distinguish the
pair members of many nonnative contrasts, or to their linguistic
properties (Eimas, 1978). True, the loss in sensory-neural sensitivity
is a serious one, but it is neither absolute nor permanent;
there is always room for neuroplasticity in early adulthood.
Nevertheless, the gradual loss in sensory-neural sensitivity is a
natural phenomenon and seems to correspond neatly with the
synaptic pruning that the infant brain undergoes in the early
years. The synaptic pruning is an attempt on the part of the
brain to internally reorganize its neural functions and avoid any
redundancies in the whole process of phonology acquisition.
Thus, when the baby hears people speak a certain language, the
brain strengthens connections for the sounds of that language;
consequently, the connections for the sounds of other languages
become weaker and may eventually wither.8
When the brain of the infant senses that it has an almost
complete mastery over the sound system (phonology) and communication
is sustained almost flawlessly with parents and
siblings, it reinforces the system, locks it in the subconscious
brain and indirectly excludes other sounds that are not experienced
and do not recur in the native language. Linguistically,
this transition from discriminating a large number of sounds in
childhood to recognizing only a limited core in adulthood is
known as the transition from phonetics to phonology, or a transition
from physical entities to cognitive ones. Stated differently,
it is a transition from the world of concrete speech sounds in the
environment to abstract sounds in the brain. Once this
concrete-to-abstract transformation takes place, the child gradually
becomes psycholinguistically insensitive or, at times, "deaf"9 to
sounds that are absent in his language or occur only allophonically
(as contextual variations of a phoneme). This transition is
the foremost reason why adults learning a second language (L2)
usually fail to accurately pronounce sounds that are alien to the
native phonology of their language or dominant dialect; it is this
failure that we identify as accent in the pronunciation of a given
language or even a given dialect within a language.
8 http://www.fcs.uga.edu/outreach.
3.3.1. Transition from Phonetics to Phonology

In linguistic studies, phonetics is the study of human capabilities
for speech sound production: the places and manners of production
of sounds, their voicing and voicelessness, among others. Human
speech organs are capable of generating thousands of sounds,
some of which differ from each other in only minuscule and
insignificant features (Catford, 1977). However, this profusion in
physical sound generation is incompatible with the dominating
tendency in the brain towards economy in both physical and
mental effort. Stated differently, the rule of the brain is to store
finite sound units in memory, but generate infinite stretches of
meaning. In order to attain this goal, the brain applies a sound
compression technique in the form of the process of abstraction.
In other words, the brain, right from infancy, begins the process
of compressing many physically and/or perceptually related
sounds under one cognitive unit stored in the subconscious
brain, which is traditionally known as a phoneme or a sound unit.
9 When in England as a graduate student, I had an Arab friend from Syria
who used to pronounce <judge> = [] as []. When I brought that to his
attention, he said: "I said []; I didn't say []." He absolutely did not realize
that he was still pronouncing [] as []. Indeed, he was "deaf" to this sound.
It took me a long time to convince him that he had to pronounce <judge> with
[] not [].
As pointed out earlier, each phoneme is the abstraction of many
physical sounds identified as allophones. To illustrate, the sounds
of <p> in the following words: <pit>, <spit>, <pool>, and
<peel> are different from each other. Generally speaking, they
are phonetically described as follows: aspirated; unaspirated
or deaspirated; aspirated with lip-rounding; and aspirated
with lip-spreading. Each of those four represents an allophone of
the phoneme /p/.
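As a minimal illustrative sketch, and not part of the pedagogy proposed in this book, the many-to-one relation between allophones and a phoneme described above can be written as a small lookup table in Python. The four descriptions are taken from the text; the names ALLOPHONES_OF_P, phoneme_for and distinct_allophones are hypothetical and are introduced only for this illustration.

```python
# Hypothetical illustration: four allophones of English /p/, keyed by the
# example word in which each occurs, using the descriptions given in the text.
ALLOPHONES_OF_P = {
    "pit": "aspirated",
    "spit": "unaspirated",
    "pool": "aspirated with lip-rounding",
    "peel": "aspirated with lip-spreading",
}


def phoneme_for(word: str) -> str:
    """Return the abstract phoneme that the word's <p> sound belongs to."""
    if word not in ALLOPHONES_OF_P:
        raise KeyError(f"no allophone recorded for {word!r}")
    # Every recorded allophone collapses into the same cognitive unit.
    return "/p/"


def distinct_allophones() -> int:
    """Count the physically distinct variants stored in the table."""
    return len(set(ALLOPHONES_OF_P.values()))


if __name__ == "__main__":
    for word, description in ALLOPHONES_OF_P.items():
        print(f"<{word}>: {description} -> {phoneme_for(word)}")
    print(distinct_allophones(), "allophones, 1 phoneme")
```

The table stores four physically different sounds, yet every lookup returns the same unit, which is the sense in which abstraction compresses many sounds into one phoneme.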
In reality, the human brain is not simply satisfied with this
process of abstraction; abstraction is coupled with a process of
selection and elimination. The brain of a typical native speaker
of a given language selects a number of sounds,10 usually in the
tens, from a huge inventory of possible natural sounds to form
the inventory of his L1 sound system. It is this limited inventory
of units that is known as phonology.
Once this transition is completed in the brain of a child,
two major linguistic consequences are distinctly noticeable. First,
the child's natural phonetic gift for perceptually
discriminating a wide variety of sounds, including many that are
not part of the inventory of his native language (L1), gradually
begins to subside with age. Accordingly, his skill in perceiving and
discriminating, let alone producing, non-native sounds erodes
progressively. Naturally, the erosion continues with advancing age.
It is worth emphasizing that the erosion is not the result of a
loss in the plasticity of the vocal organs as much as it is a
consequence of the loss of sensitivity (neural plasticity) to sounds
and sound phenomena outside the periphery of the native language
phonology. Second, barring any speech deficiency, the mastery
over the phonology of L1 becomes perfect, but the perfection
comes at the expense of perceiving and producing non-L1 sounds.
Such biases facilitate rapid perception of native speech, but
seriously impede perception of speech sounds in a language other
than the native one (Port, 2007). It is in the above transitions in
the skill of sound perception and production that the root cause
of the linguistic phenomenon of accent is buried. In sum, it is the
phonology of the native language (L1) that imposes phonetic
limitations on perceiving and producing L2 sounds.
10 The majority of languages throughout the world have an inventory
range of between 20 and 50 units.
3.3.2. The Brain as the Commander-in-Chief of Language Acquisition: The Cognitive Roots of Linguistic Accent
The applied implications of accent in language will be discussed
in detail in a separate chapter. At this juncture,
since the focus of this book is on the cognitive pedagogy of
teaching pronunciation, only the cognitive roots of accent will
be tackled in this section. In a nutshell, accent is a psycholinguistic
problem; it is a cognitive problem whose solution should
be primarily cognitive as well. It emerges as a result of the brain
trying to immaculately internalize the phonology of the native
language, transfer it from the conscious brain to the subconscious
and render the process of pronunciation as effortless as
possible, with the purpose of alleviating the stress that the brain
would face if all language interactions were conscious. It seems
that the brain of an infant assigns a massive contingent of neurons
and synapses to perfect the process of internalizing the phonology.
It is as if, during the formative years, the brain keeps communicating
mutely with the infant and negotiating the conditions for
accomplishing the mission. The mute dialogue between the infant
and his brain might go on in the following manner:
Brain: Do you love your mother tongue?
Infant: Yes, I do; indeed, I adore it.
Brain: Do you want to master its pronunciation perfectly?
Infant: Yes, I would love to.
Brain: I will help you with it.
Infant: Oh! Thank you! Thank you!
Brain: Are you planning to learn a second language as an adult?
Infant: Maybe.
Brain: It may be difficult for me to help you master its phonology as perfectly as your L1.
Infant: What do you mean?
Brain: You may have an accent.
Infant: Why?
Brain: You have either to start much earlier or to have a gift as well as a creative instructor to help you reduce your accent.
Infant: O.K. I have no better choice.
The gist of the above assumed and tacit dialogue between the
brain and the self implies that once a person approves the condi-
tions of the brain for securing perfection in the phonology of L1
at the expense of the phonology of L2, he/she implicitly ap-
proves the likelihood of emergence of accent in L2.
The argument that accent is a cognitive problem and that its solution
should also be cognitively oriented, with the assistance of as
many sensory modalities (channels) as possible, is premised on
three cognitive strategies on the part of the Commander-in-Chief,
the brain. First, apply the economic principle of employing
almost negligible cognitive effort to generate the maximum
product by internalizing a finite inventory of sound units and
storing them in the subconscious for spontaneous and effortless
retrieval. Second, store the sound system in a safe zone, as if the
brain intends to grant it immunity against interference from
non-native sound systems and other outside sources. Third,
redirect the redundant neurons and synapses, after successful
internalization of the sound system, to reinforce other linguistic
skills, including the lexical, morphological, syntactical and stylistic.
It is the above argument that has inspired the title of this
book: Pronunciation is in the Brain, not in the Mouth. If one
intends to eliminate or avoid an acute accent, the only option
is to negotiate with the brain for "entry visas" to
accommodate additional alien sounds. Children are always
granted such visas whenever they have ample linguistic exposure
to an alien language. Adults, unfortunately, have to try
hard and even wait longer to secure a visa. Oftentimes,
adults are granted the visa, but with constraints attached to
it: you have to have a certain degree of accent. It is almost a
universal tradition that a child born and raised in a certain
country secures the citizenship of that country. By the same token,
a child born and raised in a certain language becomes a
citizen of that language. It is a natural privilege of children.
3.4. FOSSILIZATION OR PSYCHOLINGUISTIC INSENSITIVITY

There is no doubt, whatsoever, that children show more adept-
ness in language mastery than adults do, especially in the area
of pronunciation. As has been discussed earlier, this distinction
is attributed to the natural age-bound neuronal and synaptic
proliferation and activity with regard to native language inter-
nalization in early childhood followed by a gradual slowdown.
With age, adults begin to progressively lose their aptitude for
the automatic and subconscious internalization of pronuncia-
tion. Consequently, the process of mastering the pronunciation
of a second language becomes increasingly more conscious, me-
chanical and effortful.
In the available literature, the failure of adults to further improve
their mastery of L2 pronunciation beyond a certain limit
has been identified by some researchers as fossilization (Selinker,
1972).11 According to the cognitive pedagogy adopted here,
the use of the term "fossilization" to identify the slowness and
imperfection of adults' mastery of L2 pronunciation is rejected
for several reasons. First, the choice of the term is
unfortunate since it implies too much rigidity to be associated
with the enormous potential plasticity of the human brain. Second,
the neuronal and synaptic activities which are behind the
internalization of the L1 sound system do not seem to be exhausted,
because many adults, young and middle-aged, can still secure a
highly satisfactory proficiency in pronunciation. Some investigators
(Johnson and Newport, 1989) conclude that there is no single
moment when the window of linguistic opportunity slams
shut.12 Third, part of the problem lies in inefficient methods of
teaching pronunciation that rely exclusively on the auditory
modality (i.e., the repeat-after-me technique) instead of an approach
that is multisensory and multicognitive in essence and secures
significant results (Odisho, 2003). Fourth, the extent of exposure
to L2 and the context of exposure13 are of considerable importance
in reducing the acuteness of L2 accent. Finally, if self-motivation
on the part of the learner is added to the above four factors, the
accent can be considerably reduced and, occasionally, minimized.
The rejection is justified by the fact that a systematic
multisensory and multicognitive orientation helps all learners,
regardless of age and aptitude for pronunciation, to improve
their skills, though to different degrees, in the
acquisition/learning of L2 pronunciation. Using a combination of
diversified multisensory and multicognitive techniques and exercises,
the learning process can continue, albeit slowly, and it will
hardly cease completely to the extent claimed by fossilization.
In an earlier work (Odisho, 2003), this slowness or intransigence
of adults in the acquisition/learning of L2 pronunciation
was termed psycholinguistic deafness which, unlike fossilization,
does not imply total cessation of learning; rather, it keeps the
doors of acquisition/learning of L2 pronunciation open, depending
on the factors mentioned above. Perhaps "deafness" was too
loaded a term; nevertheless, what I implicitly
meant was a transitional decline in efficiency or sensitivity to
speech sound perception/production, but not a total cessation.
Stated differently, with adulthood, speakers of any language
gradually develop a certain degree of psycholinguistic insensitivity
to the perception, hence production, of speech sounds outside
the phonological realm (inventory) of their native language
(L1). It is, however, noteworthy that the distinction between the
adeptness of children and adults at language acquisition is primarily
confined to the linguistic domain of pronunciation and
not necessarily to other domains, such as morphology, syntax
and lexicon, in which adults may be equally adept or even more
adept than children in many cases. It is, therefore, logical to
conclude that for a child there is no easy or difficult language
to acquire, especially with regard to pronunciation; the difficulty
in mastering the pronunciation of a second language tends to
be a trait of adults.
11 The definition of the word "fossilization" is "the process of being turned
to stone."
12 The author significantly improved his English pronunciation at the age
of thirty-three (33) by eliminating many phonetic and phonological residues left
over in his English from his Aramaic and Arabic languages. He also added a
wide variety of new sounds to his phonetic inventory. It was a very demanding
and time-consuming task, but it was doable for two reasons: first, it was full
immersion in an all-English speaking environment; second, it was an environment
of academic specialization that compelled the brain to react to the specific
situation.
13 Is it classroom exposure only or is it combined with sufficient social
intermingling in the community of the targeted language?
3.5. THERE IS ROOM IN THE HUMAN BRAIN FOR MORE THAN ONE LANGUAGE
Bilingualism is a normal, natural phenomenon in human civilization.
There are as many bilinguals around the world as there are
monolinguals; indeed, there is hardly any country that does not
exhibit a certain degree of bilingual communication. In the long
history of human civilization, bilingualism has never been a
marginal or accidental linguistic phenomenon that emerges sporadically
and intermittently here and there. On the contrary, bilingualism
is a constant component of the overall structure of human
civilization; it automatically emerges when two or more
language communities or speakers come into contact. Bilingualism
is an easily justifiable normal and natural sociolinguistic
and psycholinguistic phenomenon. The sociolinguistic naturalness
of bilingualism is substantiated by its pervasiveness
throughout all linguistic communities; likewise, psycholinguistically,
bilingualism is a normal and natural phenomenon because
human beings, especially the young, internalize it readily, implying
that the human brain is endowed with enough cognitive
potential to absorb more than one language (Odisho, 2002).
The brain of a child, with billions of virgin neurons and
synapses, is a massive generator of cognition, imagination, creation
and innovation. It is, therefore, quite natural for a child to
automatically acquire two, or even three, languages if it has ample
access and exposure to them. If the exposure is balanced,
then naturally the competency in the two languages will be balanced
as well. In fact, each language may be handled as a system
in its own right and the competency will greatly resemble
that of a monolingual child in either language (de Houwer, 1990).
Stated differently, if the child is exposed to two languages from
a very early age, he will essentially grow as if there were two
monolinguals housed in one brain.14 The linguistic competency
of many bilingual children can be so high that it enables them
to transition instantaneously and subconsciously from one language
to the other. Obviously, some unintended switching and
mixing between the two languages is to be expected. The human
brain is powerful enough for more than one language, but in the
absence of a bilingual or multilingual environment, the child
will naturally grow up as an immaculate monolingual.
3.6. NARROWING DOWN THE BROAD DEFINITION OF ACCENT

The broadest definition of linguistic accent is that it is a deviation
from a given norm of pronunciation acceptable to a group
of speakers. With this broad definition, it is irrational and impractical
to design one approach to remedy all types of deviations
from the given standard or acceptable model of pronunciation.
Often, the term "accent" carries a negative connotation
since it is usually associated with comments such as: "He speaks
with an accent"; "He has a strange accent"; "I can't understand
him because of his very heavy accent." Even with today's linguistic
refinement in the description and assessment of human languages,
the term is still often used generically without much
specificity. The only distinction that has been made
somewhat clearer is between a "dialect" and an "accent". The
former is usually used to refer to a combination of grammatical,
lexical and pronunciation differences, whereas the latter is
essentially confined to pronunciation differences.
In light of the above information, two points should be taken
into consideration. First, highlight all identifications and descriptions
of the term "accent" and select the one that will be the
focus of this study. Second, identify the approach that will be
used to teach pronunciation, in general, and accent remediation
in L2, in particular. The response to the first point is covered by
the following two dichotomies: intralanguage accent vs. interlanguage
accent and phonetic accent vs. phonological accent, which
will be the focus of Chapter 4. The approach will be touched
upon lightly in this chapter, but details will be dwelt upon thoroughly
in some of the remaining chapters.
14 Petitto, quoted in sue.knapp@dartmouth.edu.
3.7. IMPLICATIONS FOR UNDERSTANDING THE COGNITIVE NATURE OF ACCENT
There are both theoretical implications and practical applica-
tions for the cognitive identification of the underlying causes of
accent especially with adult L2 learners in contrast to children
internalizing their L1 or even L2.15 The practical applications
will be elaborated on elsewhere in this book; thus, the focus
here is on the theoretical implications that should outline the
pedagogical roadmap for application. Foremost among those
theoretical implications are the following.
1) Develop a cognitive approach that reflects the latest understanding
of distinctions between the nature of human
language acquisition and language learning.
2) Envisage the methodology or methodologies and identify
the teaching and learning techniques that take the above distinctions
into consideration and implement them.
3) Be aware of the phonetic vs. phonological differences in
pronunciation and how the latter should receive priority
as they cause semantic (meaning) confusion.
4) Determine the proficiency level of pronunciation that is
targeted; is it native, near-native or just acceptable?
5) Determine the main functional goals and objectives of
teaching pronunciation.
The striking disparity between children and adults is that children
internalize language through a natural process identified in
applied linguistics as acquisition. Cognitively, they are primed
to master their L1 or even beyond that. Their brain is ready to
masterfully absorb any linguistic materials to which they are
amply exposed in authentic social contexts and situations. Collectively,
those three conditions of cognitive readiness, ample
exposure and authentic context lead to the process of acquisition,
which is a teacherless process in the formal meaning of teaching.
All that a normal child needs is the habitat of a family and the
community. Children acquire the core of the native language,
especially pronunciation, from mere exposure to the community
around them, which is also known as incidental learning. Incidental
learning through overhearing occurs when children listen
to speech not directly addressed to them, yet they learn from it.
Amazingly, very young children learn approximately 90% of the
information they acquire incidentally (Beck, 2011).
15 In fact, being fully exposed to and immersed in three different languages
as a child, I grew up trilingual with native oral mastery of all three of them
(Assyrian, Arabic and Turkmeni).

In Chapter 2, the focus was on the interaction between brain and
nature and the power of the latter in the evolution of the former.
This chapter had a triangular emphasis on nature, brain
and social environment, in that language is a product of all three
and children are the most fortunate beneficiaries of the combination
of the three forces. Language acquisition for a child is identical
with the acquisition of walking; they both evolve, ceteris paribus,
naturally and are seamless in their function. This is why children
do not have an accent when thoroughly exposed to a language
besides their own. Accent is a linguistic phenomenon that
is associated with adults. Naturally, no adult wants to speak an
L2 with an accent; nevertheless, there will often be a sliver of
accent. How thick or thin the "sliver" will be depends on
many conditions. In a social-linguistic context, a thin sliver is
better than a thick one. It is for this reason that the next chapter
emphasizes the distinction between a phonological accent and a
phonetic one, the former being the thick (implying a thick accent)
and the latter being the thin. Functionally and professionally,
priority should be given to thinning (i.e., reducing) the
phonological accent first, then handling the phonetic one.
CHAPTER 4: LINGUISTIC ACCENT: DEFINITION, CLASSIFICATION AND DEMONSTRATION
4.1. INTRODUCTORY REMARKS

As mentioned earlier on with regard to the difference between
dialect and accent, the former is usually used to refer to a com-
bination of grammatical, lexical and pronunciation differences,
whereas the latter is essentially confined to pronunciation dif-
ferences. In fact, in ESL classroom situations or real-life interac-
tions, one is able to identify the accent of an L2 speaker not
necessarily based exclusively on pronunciation inaccuracies and
deviations from acceptable targeted standard pronunciation. In
many instances, there are grammatical, morphological and lexi-
cal hints that trigger an accent in pronunciation. Take, for in-
stance, the case of some English past tense and past participle
formations of verbs which end with suffixes that are in the form
of voiced or voiceless consonant clusters. Often such clusters are
broken up or reduced in one way or another by Hispanic learn-
ers of English because they are alien to their native phonology.
For example, the past tense forms of <keep, rock, suck, sum, bog> are
<kept, rocked, sucked, summed, bogged>, which are pronounced
as [kɛpt, rɑkt, sʌkt, sʌmd, bɑgd]. Traditionally, Hispanic learners try to
overcome the problem in one of two ways both of which are,
unfortunately, wrong: 1) Reduce the cluster by dropping one of
its elements, usually the latter. This means that instead of pro-
nouncing [] and [] correctly, they would simply reduce
them to [] and []. 2) Insert a vowel between the two el-
ements of the cluster, break up the cluster and cause a reshuffle
in the syllabic structure of the word. To demonstrate, instead of
pronouncing the word <kept> or <summed> as [] and
[], they are rendered roughly as [] and []. Un-
doubtedly, this is a grammatical problem but its roots are in
pronunciation. In the totality of the performance of English,
such instances could be treated as both morphological and
grammatical errors besides being pronunciation errors. Errors
such as those are different from replacing [v] with [b] or [z]
with [s] as in the words <have> and <has>, respectively,
which are sheer pronunciation inaccuracies. By the same token,
an Arab student may make a statement as follows: "The house
beautiful" instead of "The house is beautiful". Surely, this is a
grammatical error attributed to the absence in Arabic of an
equivalent to the verb "to be" in English; nevertheless, it is more
readily captured by the listener through a gap in the overall
pronunciation. Certainly, errors in the overall pronunciation of a
statement in L2, such as the ones above, suggest that accent, as
part of the speech production and perception process of the
speaker/listener, can receive interference from linguistic systems
other than pronunciation. In light of such broadening of the
concept of accent, one can justify considering it in terms of a
surface layer vs. a deep layer, with the former ascribed exclusively
to mispronunciations and the latter ascribed to other linguistic
factors, including grammatical and lexical ones.
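The two cluster-repair strategies described earlier in this section (dropping the final element of a cluster, or inserting a vowel between its two elements) can be illustrated with a minimal sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the forms are rough ASCII stand-ins rather than phonetic transcriptions, and the inserted vowel "e" is an assumed placeholder, not a claim about what any particular learner produces.

```python
# Assumption: plain ASCII letters stand in for sounds; this is not IPA.
VOWELS = set("aeiou")


def ends_in_cluster(form: str) -> bool:
    """Return True if the form ends in two consecutive consonant letters."""
    return len(form) >= 2 and form[-1] not in VOWELS and form[-2] not in VOWELS


def reduce_final_cluster(form: str) -> str:
    """Strategy 1: drop the last element of a word-final cluster."""
    return form[:-1] if ends_in_cluster(form) else form


def break_final_cluster(form: str, inserted_vowel: str = "e") -> str:
    """Strategy 2: insert a vowel between the two elements of the cluster.

    The default vowel "e" is a hypothetical placeholder.
    """
    if ends_in_cluster(form):
        return form[:-1] + inserted_vowel + form[-1]
    return form


def syllable_count(form: str) -> int:
    """Crude syllable count: one syllable per vowel letter."""
    return sum(1 for ch in form if ch in VOWELS)


if __name__ == "__main__":
    for form in ["kept", "samd"]:  # rough stand-ins for <kept> and <summed>
        reduced = reduce_final_cluster(form)
        broken = break_final_cluster(form)
        print(form, reduced, broken, syllable_count(form), syllable_count(broken))
```

The sketch shows why the second strategy reshuffles the syllabic structure: the inserted vowel adds a syllable, whereas the first strategy keeps the syllable count but loses the past-tense marker.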
4.2. INTRALANGUAGE AND INTERLANGUAGE ACCENTS

The intralanguage accent refers to sound differences that exist
among different dialects or varieties of a given language such as
the difference in the pronunciation of /r/ as an alveolar trill or a
uvular trill or fricative across different German dialects. Similar-
ly, the pronunciation differences between New York and Chica-
go dialects or even the differences between Received Pronuncia-
tion (RP) of England and the General American English (GAE) of
the Midwest constitute intralanguage accent. In the latter case,
there are, for instance, some major pronunciation differences
between the two varieties. The differences are more distinct in
the vowel system than the consonant system. Within the vowel
system, the difference is more in diphthongs than in pure (sim-
ple) vowels. In consonants, except for the phonetic nature of /r/
and the pronunciation of the intervocalic /t/ as in the word
<pretty>, there are hardly any significant differences. In Assyrian
(Modern Aramaic) dialects, the grapheme < > has different
renditions such as [k], [k], [c] and []. In Arabic, the
standard realizations of the graphemes <ج> and <ق>, for example,
are different throughout the Arabic-speaking countries.
The former has at least four different realizations, namely [dʒ], a
voiced postalveolar affricate as in Iraq; [g], a voiced velar plosive
as in Egypt; [ʒ], a voiced postalveolar fricative as in Syria
and Lebanon; and [ɟ], a voiced palatal plosive as in Sudan. The
<ق> has at least three different realizations. It is a voiceless
unaspirated uvular plosive [q] in Standard Arabic and in several
other dialects; in Iraq, it may be realized as a voiced velar plosive
[g]; and in Egypt as a glottal stop [ʔ]. In sum, intralanguage accent
represents pronunciation differences within L1.
Conversely, the interlanguage accent stands for pronuncia-
tion differences that emerge when one moves from the native
language (L1) to the target language (L2). For instance, when a
native speaker of English embarks on learning French or Span-
ish, a major pronunciation difference one encounters is the radi-
cal shift in rhythm. The same is true for Frenchmen and Span-
iards learning English. Some typical vowel differences that result
in acute accent for native speakers of English learning German
and French are the front rounded vowels such as [ ] and
also the French nasal vowels such as [ ]. A foremost diffi-
culty in learning the pronunciation of the Semitic languages,
especially Arabic, is the dominance of the so-called guttural
sounds [, , , , ] and the emphatic (also known as pharyn-
gealized) sounds [ ]. For the specific teaching of the lat-
ter category see Odisho, 1981.
In teaching pronunciation in cross-linguistic situations, it is
important to be aware of the difficulties that arise when one
moves from one language to another or even when one moves
within the different varieties of one given language. There are
other specific situations in which two individuals of the same L1
learning an L2 may have different pronunciation problems due
to dialectal differences. To illustrate, if two German speakers,
one with an alveolar trill-r dialect and the other with a uvular
trill-r dialect embark on learning Spanish or Italian, the former
is less vulnerable to r-pronunciation problems than the latter.
There are scores of such cross-language pronunciation problems.
With native speakers of Arabic learning English, the /tʃ/ as in
<church> is not a problem for speakers of Iraqi Arabic, as
opposed to Egyptian, Lebanese or Syrian speakers, because Iraqis
have the sound in their local dialect. The focus in this study
is essentially on interlanguage phonological and phonetic accent.
4.3. PHONETIC AND PHONOLOGICAL ACCENTS

From the functional perspective, the phonetic vs. phonological
distinction in the nature of accent is extremely significant in
teaching pronunciation. It was a major distinction that I devel-
oped in the mid-1990s and began implementing in my classes.
The distinction appeared in print in 2003. Pedagogically, it made a
substantial difference in helping learners focus on the more important
problems of pronunciation facing them rather than
scratching the surface of phonetic accent.
One needs to understand the difference prior to any elabo-
ration on the applied side of the dichotomy. Phonetic accent re-
fers to a mispronunciation that does not result in a semantic
(meaning) change, though it may negatively interfere with the
proper comprehension of meaning due to partial detraction from
the acceptable standard rendition of a given pronunciation. In
other words, it is a mispronunciation that does not directly cause
a miscomprehension, but it may hamper it or delay it and become
a distraction. Let us now elaborate on the key words in the last
statement. An example of the non-semantic nature of this accent
is the massive replacement of the English approximant /r/
(English [ɹ] and American [ɻ]) by a tap/flap [ɾ], a trill [r] or a
retroflex tap/flap [ɽ] by millions of learners of English. This
replacement does not cause a change in meaning in English; it
simply deviates phonetically from the normal, standard and
acceptable rendition of the sound. It is, therefore, a form of phonetic
accent. Similarly, if a native speaker of English learning Spanish
aspirates the unaspirated Spanish [p, t, k], or a native speaker of
Spanish learning English deaspirates the normally aspirated English
[p, t, k], no miscomprehension will result, only a phonetic
deviation.
Although a phonetic accent may not directly hamper com-
prehension by replacing one sound for another, it may indirectly
interfere with or impede comprehension because a phonetic
mispronunciation or a combination of phonetic mispronunciations
may serve as an element of noise that can confuse the phonological
filter of the listener and, hence, cause miscomprehension
or, at least, a delay of comprehension. The latter two symptoms
portray themselves verbally through repeated questions or
statements such as "What?", "What did you say?", "I beg your pardon!"
or "I did not understand". Nonverbally, the listener
may give you a facial impression of lack of understanding.
As for the phonological accent, it is a mispronunciation that
directly causes semantic confusion and impedes comprehension.
Obviously, a phonological accent always implies a phonetic ac-
cent. If, for instance, the inaccurate phonetic rendition of aspira-
tion and non-aspiration between the speakers of English and
Spanish was a good example of phonetic accent, a similar inac-
curacy in phonetic rendition between the speakers of English
and Thai languages will certainly amount to a phonological ac-
cent on the part of the English learner of Thai since in the latter
language aspiration vs. nonaspiration amounts to a phonological
distinction. To illustrate, if a speaker of English fails to aspirate
the name <Thai> [tʰai] (a person from Thailand), he will be
pronouncing it as if it were the word [tai], which means "kidney".
A real conversation between an Indian student from Kerala, India,
and a Thai student went like this.
Indian student: "Are you Tai?"
She replied: "No."
He went on: "Aren't you from Thailand?"
She replied: "Yes."
He continued: "Why do you say you are not Tai?"
She replied: "Tai means kidney. I'm not kidney. I am Thai."
There was laughter.
Similarly, the absence of the [dʒ] sound, as in <judge> [dʒʌdʒ], in
German is easily felt in the pronunciation of German learners of
English. Germans usually replace the [dʒ] with its voiceless
counterpart [tʃ], as in the English words <junk> and <joke>,
which are rendered <chunk> [tʃʌŋk] and <choke> [tʃouk],
respectively, thus causing radical semantic change, a typical
outcome of phonological accent. Interestingly, when Arnold
Schwarzenegger occasionally appeared on Jay Leno's show, one
could hear him say "Jay, are you joking?", which sounded like
"Chay, are you choking?" Therefore, the English pronunciation
of the word <German> and its derivations constitutes a problem
for Germans because it sounds like <chairman>. However,
luckily, they do not have this problem in their native language
because they either pronounce those derivations with a [g]
sound, as in <germanisch> [gɛʁˈmaːnɪʃ], similar to the <g> in the
English word <give>, or use the Germanic counterpart of the
word, <deutsch>.
For Spanish learners of English, one of the main sources of
phonological accent is the vowel system, both for quantity
(short/long or lax/tense) and quality (overall vowel impression).
The absence of short/long or lax/tense contrasts in Spanish and
their presence in English create a very serious problem for Hispanic
learners of English because virtually thousands of words
are semantically confused, some of them resulting in embarrassing
situations. Luckily for English learners of Spanish, hardly
any phonological accent results from their handling of the five
vowels of Spanish; however, English learners of Spanish may
have a noticeable phonetic accent, especially because of the imposition
of their lax vowels, such as schwa [ə], and other vowel
reductions and schwaizations¹ that accompany unstressed syllables.
In addition to the semantic difference that ensues from the
distinction between phonetic and phonological accent, percep-
tually the phonological accent is much more readily identified
by the native listener than the phonetic one because it either
triggers confusion in meaning or propagates meaninglessness.
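The distinction drawn in this section can be restated as a simple decision rule: a substitution counts as phonological accent when the intended and the produced sounds contrast meaningfully in the target language, and as phonetic accent otherwise. The sketch below is only an illustration of that rule. The contrast table, the ASCII sound labels (for example "dZ" for the affricate in <judge>) and the function name are hypothetical simplifications introduced here; the entries reflect only the examples discussed above and are in no sense complete inventories.

```python
# Hypothetical, deliberately tiny contrast tables based on the examples above.
# Labels are ASCII stand-ins (assumptions), not IPA:
#   "dZ"/"tS" = the affricates in <joke>/<choke>
#   "I"/"i:"  = the vowels in <this>/<these>
#   "th"/"t"  = aspirated vs. unaspirated plosive
CONTRASTS = {
    "English": {frozenset({"dZ", "tS"}), frozenset({"I", "i:"})},
    "Thai": {frozenset({"th", "t"})},
}


def classify_substitution(target_language: str, intended: str, produced: str) -> str:
    """Classify a sound substitution made by a learner of target_language."""
    if intended == produced:
        return "no accent"
    pair = frozenset({intended, produced})
    if pair in CONTRASTS.get(target_language, set()):
        # The two sounds distinguish words in the target language.
        return "phonological"
    # The deviation is audible but does not by itself change meaning.
    return "phonetic"


if __name__ == "__main__":
    print(classify_substitution("English", "dZ", "tS"))  # <joke> heard as <choke>
    print(classify_substitution("English", "th", "t"))   # deaspirated English plosive
    print(classify_substitution("Thai", "th", "t"))      # <Thai> heard as "kidney"
```

The same deaspiration is classified differently for English and for Thai, which is the point of the Thai example above: the label depends on the phonology of the target language, not on the sound change itself.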
4.4. ACCENT: A NORMAL LINGUISTIC PHENOMENON

The overall tone of the preceding discussions of the phenome-
non of accent should give the impression that accent is a normal
linguistic fact. It emerges as a result of the cognitive attitude of
the brain of a child acquiring L1 versus that of an adult learning
L2. Every adult human being has to demonstrate a certain degree
of accent whether trying to learn a dialect within his language
or a second language besides his. Thus, to speak a dialect
or a language with an accent is not a stigma or a pathology and
should not be treated as such; however, when the accent seriously
interferes with meaning, it then begins to obstruct the
normal conveyance of the message between two individuals.
Earlier on, this was identified as phonological accent versus
phonetic accent, the latter of which does not seriously obstruct
meaning directly except when it involves a certain degree of
deviation from the targeted rendition of several segmental (consonants
and vowels) and suprasegmental (stress, tone, intonation)
components of the targeted language. Briefly, the general
goal of teaching pronunciation should be an attempt at eliminating
phonological accent and reducing the phonetic one.

1 Schwaization, after the neutral vowel schwa [ə]: a vocalic change in the
direction of this vowel.
Some learners or speakers of an L2 claim that they do not
care whether they manifest an accent in their speech or not.
Such a claim is made under the pretext that the speaker wants
his accent to reflect his ethnic and linguistic identity. In my
view, this is a baseless claim and it is no more than a cover-up
for the linguistic failure. Logically and aesthetically, any speaker
of L2 should portray his ethnic and linguistic identity through
perfect demonstration of his/her L1 rather than through the
loading of an L2 with mispronunciations infused through L1.
4.5. WHAT IS MEANT BY ACCENT ACQUISITION, ACCENT REDUCTION AND ACCENT IMPERSONATION

To have or not to have an accent is a particularly controversial
question. How much accent one has is even more controversial
simply because it can be a very subjective judgment, especially
by non-linguists. In real-life situations, there are different routes
to accent minimization, namely accent acquisition, accent
reduction (remediation) and accent impersonation or faking.²
In all three cases, the goal of the person is to try to diminish the
differences in pronunciation between himself and a typical native
speaker of a target language (L2). In what follows, an attempt
will be made to afford a descriptive account of each term
coupled with some highlights.

2 Accent impersonation and faking will be used interchangeably according
to the context in which they occur.
4.5.1. Accent Acquisition

Accent acquisition is primarily the gift that nature bestows upon
a child who grows up immersed in the authentic environment of
a language, be that the language of the country of birth or of the
country of resettlement, where he grows up abiding by the
rules and conventions of the pronunciation of that given language.
The gift could be broadened in scope to include two
more conditions that qualify a child or a person for accent acquisition.
The first is the total and equal immersion of a child
in two languages, while the second is the case of a linguistically
talented adolescent or even a young adult who is amply exposed to an L2
and has a passion for the pursuit of mastering that language. No
one should exclude others from qualifying for accent acquisition,
but the instances are very rare for adults. Nevertheless,
very many adults can excel and even outsmart native speakers
in their immaculate and creative competence in L2 lexicon,
morphology, syntax, stylistics and overall fluency. Unfortunately,
they may not master the pronunciation simply because the latter
seems to be bound cognitively with the limitations of age: the
younger the more perfect, the older the less. The renowned novelist
Joseph Conrad, a Pole in origin, is a typical example of a
person who achieved the highest level in all aspects of linguistic
performance in English except for his Polish accent in pronunciation,
a case commonly identified as the Joseph Conrad
syndrome. Another example is Henry Kissinger, the former Secretary
of State, whose fluency, syntax and lexicon are immaculate,
but whose accent is striking to the ear.
The contrast in matters of pronunciation between the skill
of a child and that of an adult reminds one of the difference between
the terms "acquisition" and "naturalization" in becoming a citizen
of the United States. Briefly, obtaining citizenship by acquisition
is primarily a right by birth, whereas citizenship by naturalization,
especially for adults, is a privilege to be earned through a set of
procedures and requirements. Translating this difference into
terms of accent-free speech and accented speech does not
amount to an exact analogy, but it hints at the fact that being
born and raised in the same language is different from being
born and raised in a language, but relocated to a second one
later in life as an adult. Simply, accentless speech is a natural
gift for a child born and/or raised in what is supposed to be
practically an L1 environment.
4.5.2. Accent Reduction (Remediation)

Nevertheless, if accent acquisition is a difficult goal to achieve
because it requires perfection or near-perfection in the execution
of a language skill, it should not deter anyone from emulating
the targeted pronunciation through accent-reduction as much as
possible. In accent reduction, the objective is to suppress the most
salient features of one's L1 and attempt to replace them with
those of L2 when speaking it in order to camouflage any readily
detectable indications of an accent. Accent reduction is targeted
for different purposes, foremost of which are the following.
4.5.2.1. Linguistic and Aesthetic

Attain the highest proficiency in pronunciation as part of the
overall competency in L2 to avoid accent as much as possible.
This could simply be for linguistic and aesthetic purposes. Needless
to say, in any L2 class, learners aim at a better
level of proficiency in the target language, especially in its pronunciation,
through which they give the native listener a very
positive impression, especially because of higher intelligibility.
4.5.2.2. Spontaneous Interpretation

Spontaneous oral interpretation is a daily practice in the
United Nations General Assembly, the Security Council or any
high-level negotiation, where room for misinterpretation and
miscomprehension attributed to any linguistic component (syntactic,
morphological, lexical and phonological) should be
eliminated or reduced to the minimum.
4.5.2.3. Acting and Broadcasting

An actor or actress would certainly be far more impressive and
convincing if he/she were to impersonate a foreign-speaking
character as accurately as possible. For instance, Ben Kingsley's
performance in the film Gandhi was extremely impressive in acting
and demeanor, although his language would have needed a
touch of retroflexion, especially in his "r" sound, to impact the
native Hindi audience more authentically. Equally important is
the role of newscasters in delivering their news bulletins loaded
with names of foreign personalities. For instance, now that the
name of the Syrian President <بشار الأسد> is in the news, I
have yet to hear an American newscaster or anchor pronounce
it correctly; in reality, its pronunciation is seriously
distorted. Ironically, the name does not contain any sound that
does not exist in English; rather, the distortion in its rendition
lies in the misplacement of stress, reduction of Arabic vowels to
schwas [ə] and the overall shift in syllabification. The Arabic
pronunciation of the name sounds like this: [baʃʃar alasad]³
as opposed to an approximately typical mispronunciation by a
native English-speaking newscaster: [br lsd]. The following
are some of the major differences between the two renditions.
a) Almost all Arabic vowels are Anglicized, especially in the
direction of vowel neutralization, i.e. infusion of [ə] vowels
in place of [a]; also the replacement of vowel [a] with []
b) The gemination (doubling) of [ʃ] (= <sh>) is reduced
to a single one.
c) The elimination of the gemination results in reshuffling
the syllabic structure of the first name; thus, instead of [baʃ]
+ [ʃar] it becomes [b] + [r].
d) Finally, the Arabic rolled <r> = [r] is replaced with an
approximant retroflex one [ɹ] or [ɻ], which is not a major
change.
Cumulatively, the change in the overall rendition of the name
goes beyond being an accent; rather, it is an overall distortion in
pronunciation both phonologically and phonetically.
A very similar example of mispronunciation in broadcasting
that is retained in my memory from some decades ago is the
pronunciation of the name of the former French President Valéry
Giscard d'Estaing. The proper pronunciation of the name is
typically [valeʁi ʒiskaʁ dɛstɛ̃]; however, the Arabic-speaking
Baghdadi television anchor pronounced the name with such a
heavy Arabic accent that it was stripped of all its French language
characteristics. Some such typical phonetic/phonological
features of pronunciation in French include: a strong
tendency for word stress to fall on the final syllable; nasal vowels;
a voiced uvular fricative [ʁ] instead of an alveolar <r>; and
a typical voiced postalveolar fricative [ʒ] in place of a voiced
postalveolar affricate [dʒ]. None of those features were found in
the rendition of the name by the Iraqi news anchor.

3 The bold syllables indicate the stressed syllable in each case.
4.5.2.4. Spying and Espionage

Spying or espionage is a profession in which the person may use
language, especially pronunciation to camouflage his/her per-
sonality without raising suspicion. In other words, the person
hides his identity behind an adopted pronunciation. Because
this profession is an extremely risky assignment, it requires a
high level of accent reduction and/or accent impersonation. This
dimension of accent reduction will be revisited from a different
perspective.
4.5.3. Accent Impersonation or Faking

In this section the term accent faking is used with a specific
connotation; hence, it undoubtedly needs some clarification. In
a broad sense, accent reduction and accent faking overlap to
some extent because they involve impersonation of a targeted
speaker of a language except for the fact that the purpose can be
different. Accent faking can project itself in different forms or
strands.
4.5.3.1. Comedians Faking an Accent

A common example of someone impersonating or faking an accent
is what comedians do, especially when mimicking speakers
of languages other than their own to generate laughter. It is interesting
to note that not all comedians are equally skillful in
impersonation; some are better than others. Let us just consider
the case of Jay Leno impersonating Arnold Schwarzenegger. Leno
is a great comedian, but is not a good impersonator; nevertheless,
he is very impressive when tackling Schwarzenegger. It
is noteworthy that phonetically, his success is not attributed to
his imitation of Schwarzenegger's vowels and consonants, but
rather to his overall rhythm, tempo and intonation. What comedians
usually do is highlight the most salient features of the targeted
speech and mimic them in a caricaturized way to capture
attention. This means that a comedian may not necessarily be
skillful in meticulous impersonation, but may be good at highlighting
some salient pronunciation features, similar to what caricaturists
do in drawing: they highlight the most striking facial
and bodily features and exaggerate them for comedic effect.
4.5.3.2. Building Intimacy

Accent faking may also take the form of socially impersonating
the language or dialect of an interlocutor to sound friendly, intimate
and trustworthy; however, in some instances, all three
attributes may be used either honestly or with a twist of dishonesty.
In the former instance, the impersonator aims sincerely at
bonding with the interlocutor with no ill intention. In the latter
instance, he may be aiming at enticing the interlocutor to unveil
personal information or even secrets. This may sound like spying,
but in reality it tends to be more the action of a curious and
nosy individual; however, there is always the likelihood that it
can cross over into indirect interrogation or information gathering.
As an instructor of phonetics and pronunciation, I have
used accent faking of languages that I did not speak, but I knew
the sound systems of, such as Russian, Greek and Hindi, among
others. I did the faking for two purposes. First, to demonstrate
the power of the knowledge of phonetics, especially articulatory
phonetics, as a science. Second, to assess my own skill in faking
as a strategy for teaching. In the former instance, faking an ac-
cent was used as a tool to attract the attention of the learners to
my presentation and build up confidence in their ability to hone
their skills through attention and practice. In the latter instance,
I did it to test my own skill in faking a given accent. Let me cite
some examples of my attempts at accent impersonation.
In one instance, I had the following encounter with an
adult American lady of Russian ethnic background:
One day on the campus of my university, I stopped by a colleague
of the Department of Chemistry to arrange for a university
event. While we were chatting, the lady in charge of
the chemistry labs stopped by to ask my colleague a couple of
questions. Just as a courtesy gesture, my friend introduced
her to me as Ludmila. Looking at my non-fair complexion,
she apparently became more curious as to my ethnic background.
I also became equally curious because she portrayed
a distinct Russian accent in her English. Once I detected the
Russian accent, I wanted to play my so-called accent impersonation
game as a means to collect data for this book. The
following conversation went on between the two of us:
She asked me: "Where are you from?"
I said, infusing some sort of Russian phonetic features in my
English: "I am from ex-Soviet Union."
She looked at me strangely as if in disbelief: "Are you Russian?"
I said: "No, my parents were originally from Azerbaijan." I
chose Azerbaijan to justify my facial complexion as non-fair.
She went on: "Do you speak Russian?"
I replied: "Just a few words."
At this stage, I felt that she was extremely confused because
she was suspicious of my story. I also felt guilty about the confusion
I caused her. I immediately apologized to her and told
her I was simply trying to impersonate a Russian accent. One
could readily notice that she was relaxed after my apology.
She further looked at me and said: "But you don't know how
good you make it." Linguistically, I was very happy after this
encounter because I felt that my impersonation of the Russian
accent seemed to have been good enough to make her
believe in my fake story.
With regard to the Greek language, I know only a few words
and phrases including: <Good morning>, <Good evening>,
<Good night>, <Thank you>, <Thank you very much>;
however, when I use them in a Greek shop or with a Greek person,
the immediate question is: "Are you Greek?" When my response
is "No", the next utterance is always: "I don't believe
you." All that I do is apply my articulatory knowledge of the
Greek sound system and use the right accentuation. This makes
the difference between a phonetician and a lay speaker of a
language. Let me explain this difference with the help of the
following real anecdote:
One day, one of my friends asked me whether I knew
Greek.
I said: "No."
He went on to say: "Do you know the greetings?"
I said: "Yes."
He then said: "What is 'Good Morning'?"
I said: [kali meɾa].
When I asked him why, of all the Greek language, he wanted this
phrase, his response was that his mother shared a room in a
nursing home with an old Greek lady and he simply wanted to
greet her in her native language as a courtesy gesture. A few
days later, I saw my friend and asked him about his visit to his
mother and her Greek roommate.
He said: "Yes, I greeted her, but she did not show any response
at all."
I said: "Are you sure you used the right phrase?"
He responded: "Yes, I said [kl m]."
If one listens carefully, there are at least seven (7) phonetic differences
between the authentic Greek pronunciation and his
Americanized rendition of it. I then thought to myself that she
did not respond to his greeting because she did not realize it was
in Greek due to the very heavy English accent.
4.5.4. Intralanguage Accent Reduction and Impersonation

Accent reduction and impersonation can be at both dialect level
(intralanguage) as well as language level (interlanguage). Although
the emphasis is on the latter, some comments must be
made about the former. All languages have many dialects that
evolve for geographic, socio-economic or ethnic reasons. In
many instances, a certain stigma may be attached to a given
dialect, causing many of its speakers to avoid using it either
completely or in certain situations. By contrast, one of the
dialects, usually the standard,⁴ tends to be the most prestigious;
hence, a large number of speakers of other less prestigious or
local dialects tend to adopt it through formal education or
through special orientation, as radio and television anchors
and announcers do. Many such educated and professional people
tend to revert to their own social, regional and local dialects
in casual communication; hence, they are typically bidialectal,
moving back and forth between the standard and
their own dialects as the situation dictates.
4.6. CULTURAL ACCENT

As important as the linguistic accent is the cultural accent, for
whose existence there are solid grounds. Thus,
following the common pattern of L1 for native language and L2
for a target language, C1, henceforth, will stand for native culture
and C2 for target culture. However, the question still remains as
to those grounds that justify the recognition of a cultural accent.
There are certainly some non-verbal gestures such as hand, eye
and body movements as well as some interjectional filler words
that differ from culture to culture.5 Strictly speaking, the filler
words are not necessarily supposed to be words in the linguis-
tic sense. Some of them can be simply interjectional utterances
with no well-defined lexical denotations such as <uh>= [:]
or [], and <um> = [m]; [m] or [m]. The reason why such
interjections are phonetically transcribed somewhat differently
is because they do not have an exact pronunciation across indi-
viduals, dialects as well as languages. For instance, in English
<uh> is more popular than <um>; besides, even if the latter
is shared, for example, between native English and Russian
speakers, its pronunciation is somewhat different. Interestingly,
the use of such language-specific filler words is so deeply ingrained
in one's native language due to acquisition that they
tend to be some of the last remaining accent traces of L1 and/or
C1 in L2 and/or C2. If one carefully listens to some fairly competent
Russian speakers of English, their speech tends to be
punctuated with [m]s as fillers. An equally interesting observation
about these filler words is that the vowel element in them
tends to be consistent with the phonology of the given language.
It is not unexpected to have the schwa [ə] as a dominant vowel
in the coinage of filler words in English since this vowel has the
highest frequency of occurrence in English. In the absence of a
schwa in Arabic and the high frequency of [a] or [] (), it is
this vowel that coins the filler word <ah> = [a] or [] in Arabic.

4 Linguistically, the so-called standard language is also a dialect which
happens to be associated with the schooling system and formal education per se.
5 http://en.wikipedia.org/wiki/Filler_(linguistics).
With regard to hand and body gestures, as part of the cul-
tural accent, there are many examples of them differing with
different peoples and cultures. For example, one of the most typical
markers of cultural accent for the Japanese, when meeting
and greeting people of other languages and cultures, is to bow
while shaking hands. Equally noticeable is
the tradition among Arabs, especially of rural areas or of rural
background, who immediately and subconsciously touch their
chest with the right hand after shaking hands.
4.7. TRANSITION OF ACCENT INTO ORTHOGRAPHY

Unquestionably, phonological accent is far more detectable than
phonetic accent not just in speech, but also occasionally in or-
thography (written form). In fact, one of the indications of the
seriousness of a phonological accent is when it is carried on by
the L2 speaker into the orthographic renditions of the mispro-
nunciations.
Let us consider some misspellings of adult Hispanic students
learning English. It is quite likely for a Hispanic student to spell
<this> as <these> or vice versa. This is certainly attributable
to mispronunciation. It is an established fact that the Spanish
vowel system has a limited inventory of vowels based chiefly on
quality differences with no quantitative (length) distinctions.
This system was first identified technically as a centrifugal
one, as opposed to the centripetal system of English in which
vowels allow both qualitative and quantitative distinctions
(Odisho, 1992). In Spanish, the vowel in <sin> (without)
has a quality that is almost halfway between the English
vowels in <sin> and <seen>. Consequently, a Hispanic student
fails to distinguish English words such as <this> vs.
<these>, <bid> vs. <bead> or <pill> vs. <peel>. It is
exactly because of the absence of such quality differences in
their language that pairs of words such as the above are misspelled
by adult Hispanic learners of English. In light of such examples,
it is not uncommon for a Hispanic person to write a sentence
such as: This cars are expensive.
For Arabic, the sound /p/ is phonologically irrelevant,
though phonetically the sound [p] may occur in certain contexts
such as when followed by an aspirated sound as in <>
[] (beginning); nevertheless, without training and
practice, /p/ is the most difficult, and at times embarrassing,
sound for Arabs because in the absence of such a sound, hundreds,
perhaps even thousands, of words are confused. Words
such as <park, pray/prey, pitch, prick and push> can become
<bark, bray, bitch, brick and bush>. The impact of this
mispronunciation can occasionally overflow into orthography in the
form of replacing <p> with <b> in the spelling of words. In fact, at
times, the phenomenon known as over-compensation may devel-
op, according to which the fear of mispronouncing a given
sound leads the speaker to reverse the rendition of the relevant
two sounds. Once this over-compensation kicks in, the situation
worsens because the Arab learner of English will not only re-
verse the pronunciation of /b/ and /p/, but will also reverse
their orthographic renditions. I have come across many Arab
students who pronounce or write <but> as <put> and
<put> as <but>. In one instance, in the Iraqi city of Basra, a
traffic officer had ordered the sign <Keep Right> to be en-
graved on a concrete slab; unfortunately, it ended up being
spelled as <Keeb Right>.
In Kurdish, which is an Indo-European language, the English
interdental fricative pair [θ, ð] is absent. It is consistently
replaced in pronunciation with the alveolar fricative pair [s, z].
This sound substitution is so powerful that it occasionally
sneaks into the orthography of Kurdish speakers. The anecdote below is
relevant to this phenomenon.
During the years 1960-1965, I worked as a teacher of English
in a high school in the Kurdish city of Sulaimaniya, Iraq.
Once, I gave my students a written test. During the test, I
moved around the classroom monitoring the performance.
On the desk of one of the students, I noticed a small piece of
paper with Arabic scribbling on it. At first glance, I did not
pay attention to it because the written test was in English;
however, when I moved away from the student's desk, I had a
different vision of the Arabic scribbling. There were some
strange orthographic features in the scribbling that made it
look different from the overall visual impression of standard
Arabic orthography. The writing had a more than usual recurrence
of the letters <s س> and <z ز>. I went back to the
student's desk, picked up the piece of paper and began
reading it. To my utter surprise, it was a text in English
transliterated into Arabic script and had relevance to the exam. The
reason why it had an extraordinary number of <s س> and
<z ز> was simply that all English <th> sounds (there
are many of them) were transcribed as <s> and <z> because
they reflected his Kurdish rendition of <th> sounds.
The student was assigned an F in the test.
All the above examples from Spanish, Arabic and Kurdish lan-
guages serve as evidence that when cross-language pronuncia-
tion causes a heavy accent, the accent may occasionally be
transferred into the orthographic system of the targeted lan-
guage.
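The three substitution patterns above can be pictured as a simple lookup from an L2 sound to the nearest L1 sound. The following is only a minimal sketch under stated assumptions: the substitution table, the small word list, the broad transcriptions and the function names are all hypothetical illustrations built from the examples in this section, not a phonological analysis of any of the languages.

```python
# Hypothetical substitution tables built from the examples in this section.
SUBSTITUTIONS = {
    "Spanish": {"ɪ": "i", "iː": "i"},   # one high front vowel instead of two
    "Arabic": {"p": "b"},               # no phonemic /p/
    "Kurdish": {"θ": "s", "ð": "z"},    # no interdental fricatives
}

# Broad, illustrative transcriptions (one list element per sound).
LEXICON = {
    "bid": ["b", "ɪ", "d"], "bead": ["b", "iː", "d"],
    "pill": ["p", "ɪ", "l"], "peel": ["p", "iː", "l"],
    "park": ["p", "ɑː", "k"], "bark": ["b", "ɑː", "k"],
    "push": ["p", "ʊ", "ʃ"], "bush": ["b", "ʊ", "ʃ"],
    "thin": ["θ", "ɪ", "n"], "sin": ["s", "ɪ", "n"],
}


def accented(word, l1):
    """Return the word's sounds after the L1-driven substitutions."""
    table = SUBSTITUTIONS[l1]
    return tuple(table.get(sound, sound) for sound in LEXICON[word])


def collapsed_pairs(l1):
    """List word pairs that become identical for a speaker of the given L1."""
    words = sorted(LEXICON)
    return [
        (a, b)
        for i, a in enumerate(words)
        for b in words[i + 1:]
        if accented(a, l1) == accented(b, l1)
    ]


if __name__ == "__main__":
    for language in SUBSTITUTIONS:
        print(language, collapsed_pairs(language))
```

With this toy word list, the Arabic table merges <park> with <bark> and <push> with <bush>, the Spanish table merges <bid> with <bead> and <pill> with <peel>, and the Kurdish table merges <thin> with <sin>. Once two words are no longer distinguished in pronunciation, it is easy to see how the same confusion can surface in spelling.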

It is about time we stopped using the term accent in a generic
manner, especially when accent has professional implications
such as in teaching, acting, broadcasting and information gath-
ering at large. Any instruction in pronunciation, especially when
related to accent reduction, should be designed for the targeted
purpose. If the purpose is L2 learning, especially for adults, it is
of prime importance to distinguish between phonetic accent and
phonological accent and place the emphasis on the latter. Be-
sides, any orientation in accent reduction should bear in mind
that accent is not confined to segmental features (consonants
and vowels); rather, the suprasegmental features are of equal
significance in shaping an accent or reducing it. Furthermore, it
is important to bear in mind that the acquisition of accentless
pronunciation is an ideal achievement for all children and some
adolescents immersed in an L2 environment; also, perhaps, for a
few adults with a distinct linguistic aptitude for pronunciation.
Serious and purposeful instruction in accent reduction should be
handled exclusively by professionals with general linguistic
knowledge and specific phonetic/phonological expertise who
implement a multisensory and multicognitive approach.
CHAPTER 5: A BROAD BASE FOR UNDERSTANDING
THE PEDAGOGY OF TEACHING PRONUNCIATION

The pedagogy of teaching pronunciation according to the approach
promoted here is premised on cognitive principles,
implying that pronunciation at large, and accent in particular,
reflect underlying cognitive settings. Consequently,
any instruction targeting the improvement of pronunciation,
especially in L2 situations for adults, should be designed
in light of those cognitive settings.
Teaching pronunciation to adults through memorization in
the form of mechanical repetition is a highly ineffective
practice because sound features and segments are simply meaningless,
mono-dimensional acoustic signals that impact the ear
and have no semantic mnemonic to assist with their retention.
They are very much unlike morphemic units (especially
words) and syntactical stretches, where structure, meaning and
organization kick in to render them bi-dimensional or even
multi-dimensional, all of which assists with comprehension and
retention. This is exactly why many adult L2 learners may excel
in morphology, syntax and lexicon, but manifest a distinct phonetic
and phonological accent, Joseph Conrad being a typical
example. Consequently, teaching sounds through mechanical
auditory repetition by the instructor and rote memorization by
the learner makes their cognitive retention quite difficult. More
channels of learning are needed for better and more permanent
retention. This is why the suggested approach calls for the joint
involvement of as many sensory and cognitive channels of input
as possible.
To understand the difference between memorization and re-
tention, it is enlightening to consider the following analogy. A
balloon that is fastened with a single string may easily be lost
when the string snaps, whereas a balloon that is fastened with
several strings remains firmly in place even if one or more of
them snap. Based on this analogy, teaching a sound feature au-
ditorily (by listening to the sound), visually (by observing the
accompanying facial and bodily features) and kinesthetically (by
feeling the concomitant sensations) will secure much better re-
tention, while mere rote memorization will not.
Below are some of the most relevant principles that help
with the understanding of the pedagogy.
5.1.1. Speech: A Cognitive Phenomenon

Human speech is a cognitive faculty, a potential in the brain
before being in the mouth. Teachers will often see adults experi-
encing serious difficulty in producing a new sound to which
they have never been exposed. This is a good example of the
cognitive requirement for sound production, meaning that the
brain may need enough exposure time to the new sound to perceive
and recognize it before being able to produce it appropriately.
Therefore, any instruction in pronunciation should target
the cognitive potential for perception and recognition before
the necessary physical maneuvers of production. If, for instance,
an adult native speaker of English is asked to produce a
Spanish trilled [r], and the learner, after continuous modeling by the
instructor, fails repeatedly to produce it and instead persists in
producing a frictionless continuant (approximant), like the English
<r>, then the whole situation indicates that the learner is
psycholinguistically unable to recognize it and, hence, produce it. This
is a typical condition that is identified in this study as psycholin-
guistic deafness or insensitivity to sound, a condition that is char-
acteristic of adults learning L2. Psycholinguistic deafness or in-
sensitivity in the teaching of pronunciation cannot be remedied
without an approach and sets of techniques that enable the
brain to cognitively perceive and recognize the new sound and
then fire the commands to the vocal organs to embark on a pe-
riod of trial and error in executing the articulatory maneuvers
needed for the accurate production of the targeted sound.
5.1.2. Pronunciation: Multisensory Access

In real-life situations, sensing is rarely mono-dimensional and is
often multi-dimensional. Usually two or more senses function
jointly in a situation which reflects the nature of our physical
and cognitive existence. Since speech is a cognitive faculty
which is fed by a broad base of sensory modalities, especially
the auditory, visual and tactile/kinesthetic ones, any pedagogy
for teaching speech and pronunciation should be multisensory in
nature. In fact, learning occurs more rapidly when more than
one sense is involved. This also implies that all the strategies
and techniques used for implementation should emanate from
all those sensory modalities. For instance, the visual modality
should take into consideration all facial and body gestures that
are intertwined with the overall dynamics of speech production.
All the non-verbal gestures that accompany speech perception,
recognition and production are extremely helpful in teaching
pronunciation. Learners have to be prepared not just to hear and
produce the sounds, but also, and equally importantly, to see
and feel the sounds in conjunction with the concomitant sensa-
tions and physical gestures in the context of authentic speech.
This is why the approach to implement this pedagogy is multi-
sensory in essence.
In light of this approach, a certain group of sounds could legitimately
be labeled visible sounds because the listener can see
the facial features that produce them. For example, consonantal
sounds such as the bilabials [b p], labiodentals [v f], interdentals
[θ ð] and dentals/alveolars [d t] fall squarely into this category.
Features of sound production such as lip configurations
(i.e. lip spreading and rounding), lip protrusion, jaw depression
and elevation are also fairly visible features. It is also possible to
visually detect some facial and bodily gestures indicating some
characteristics of tense and lax sounds, especially vowels. Such
gestures are readily detectable in the pronunciation of some
English vowels, e.g. the high front ones / i / = [i], as in
<seat> vs. [ɪ] as in <sit>. Speakers of languages
such as Spanish, Greek, Russian and French have difficulty distinguishing
those two English vowels because their languages have only
one such vowel, which is transcribed here as []. To visually detect
the difference between the two English vowels, learners have to
notice the extra stretching and spreading of the lips and the adjacent
cheek musculature, with less jaw separation, in <seat>.
Even with suprasegmental features, such as stress and
rhythm, there are several facial and bodily gestures that are part
and parcel of stress execution and they hardly go unnoticed. In
fact, a proper and natural placement of stress in actual speech
cannot be executed without some facial gestures and occasional-
ly body gestures especially when the stressing is emphatic. A
teacher has to bring those features to the attention of learners
who, in turn, should watch for those gestures; they are usually
synchronized with the syllables or words to which stress is as-
signed.
5.1.3. Pronunciation: Multicognitive Access

Indeed pronunciation is a cognitive process and in order to in-
ternalize it one needs a multicognitive approach. Learners have
to be encouraged to try to attentively listen to sounds, remem-
ber them, compare and contrast them with sounds already part
of their psycholinguistic (cognitive) inventory or with versions
of sounds produced by other learners. Simply, learners have to
practice thinking consciously of the sounds and the process of
their production through association, analysis, synthesis, comparison,
contrast, memorization, etc. There is also room for metacognition,
which is a state of mind in which people are aware
of their own cognitive processes (Bourne et al., 1986); in other
words, they are highly conscious of their thinking process.
Learners should be instructed to feel their tongue movements in
their mouth, watch the shape of their lips and sense the contact
of the tongue with the teeth and lips as well as other facial mus-
cular gestures. Although these cognitive activities may sound
too abstract for some learners or even instructors to know about
them, in reality, they do exist and their presence can be felt in
different ways. Often when an instructor models a certain sound
and then allows for a break before the reproduction session,
many of the learners are already thinking of the reproduction.
One can readily infer the thinking process through the facial and
bodily gestures they unconsciously manifest. For instance, you
can easily see a learner moving his tongue inside the oral cavity
to feel the place of articulation or to try to create a rounded
configuration for the lips, or even to depress or elevate the jaw to
secure the targeted degree of oral opening.
Every speaker of every language unconsciously manifests
some head, hand, foot movements or facial gestures synchro-
nized with the muscular effort needed for the execution of stress
placement. These movements and gestures are all reflections of
inner and mute endeavors on the part of the speaker and should
be brought to the attention of the learner to master the dynam-
ics of stress placement. There are times when the learner is quite
conscious of his inner effort at the processing of the sounds and
the outer gestures accompanying them. Quite often, when learn-
ers are asked about those cognitive processes and their physical
reflexes, they admit to them. It is because of all those mental
processes and the physical gestures that are associated with
them that the approach in this book is identified as multicognitive.
It should be highlighted that the so-called classical technique
of ear training cannot by itself accomplish the
mission of teaching L2 pronunciation effectively; eye training,
neuro-muscular training and, above all, brain training should supplement
ear training. In fact, teaching efficient pronunciation
requires guiding the learners of L2 through the processes of cog-
nition and metacognition.
5.1.4. Pronunciation: An Integrated and Holistic Process

Pronunciation is an integral part of overall human communica-
tion (Morley, 1991). Human speech is portrayed in a wide varie-
ty of integrated combinations of segmental and suprasegmental
(prosodic) elements. Both categories should be handled insepa-
rably from the overall articulatory, visual, auditory and tac-
tile/kinesthetic features accompanying speech production. The
latter sets of features form the basis of what is differently la-
beled as articulatory settings (Honikman, 1964) or phonetic
settings (Laver, 1980), among others. No cross-language teaching
of pronunciation will be authentic and dynamic in nature,
or reflect the native speaker's proficiency, without
serious consideration of the articulatory settings of the targeted
language.
For instance, a language like Arabic, with a limited vowel
system and a heavy dependence on guttural sounds and emphatics,
has such specific articulatory settings that, unless those
settings are incorporated into the overall approach to the learning
of Arabic by non-native speakers and the learning of foreign languages
by Arabs, the results will be highly unsatisfactory. Similarly, it is
important to know that there are languages whose vowel systems
are identified in this book as centripetal as opposed to centrifugal
systems (Odisho, 1992). The former systems tend to have
a schwa vowel, [ə], with the rest of the vowels in the system
showing a tendency to be reduced to schwas or schwa-like vowels
in unstressed positions. The centrifugal systems tend to avoid
contrasts in vowel quantity (length) as well as any noticeable
schwa-inclination of other vowels regardless of their stressed or
unstressed positions. English is a typical representative of a centripetal
system as opposed to Spanish, which is a typical representative
of a centrifugal system. Any cross-language teaching of
pronunciation between these two types of languages can hardly
be effective if the approach handles the vowels individually and
as decontextualized segments. An efficient mastery of each other's
vowel system cannot be attained without a holistic approach
to the teaching of the systems through their most characteristic
features. Learners with a native centripetal vowel system
should be trained to avoid both schwas and the reduction of vowel
quality and quantity when learning languages with
centrifugal systems. Conversely, learners with centrifugal systems
should be trained to produce schwas and to reduce vowel
quality and quantity in the appropriate contexts and conditions
of L2 learning.
5.1.5. Pronunciation: Top-Down & Bottom-Up Dynamics

Pronunciation is a dynamic cognitive and physical process. It
should be taught in a dynamic way with both bottom-up and
top-down approaches. A bottom-up approach implies teaching it
from smaller to larger units (i.e., segments to suprasegmentals),
while a top-down approach entails the reversal of the order of
units. In other words, teaching pronunciation is like two-way
traffic in which both directions of movement are needed in or-
der to complete the cycle of communication. Traditionally, pro-
nunciation has been taught through a bottom-up approach with
emphasis on vowels and consonants often lacking proper contex-
tualization and embedding in longer meaningful stretches of
speech. Recently, there has been a twofold focus of attention in
teaching pronunciation: firstly, an intra-segmental emphasis with
attention on distinctive features, i.e. a more microscopic perspective;
secondly, an inter-segmental emphasis with attention
on prosodic features, i.e. a more macroscopic perspective (Pennington
and Richards, 1986).
5.1.6. Pronunciation: The Complementary Nature of Acquisition and Learning

Teaching pronunciation should distinguish between the process-
es of acquisition and learning. Acquisition tends to be a subcon-
scious, automatic and effortless process of internalizing a sound
system, whereas learning tends to be more conscious, mechanical
and effortful. The former is primarily characteristic of normal
children's mastery of the pronunciation of their native language
or any given language, whereas the latter is primarily
associated with the manner in which adults master pronunciation.
Despite the difference between the two processes, acquisition
and learning are not mutually exclusive in nature and function.
On the contrary, they tend to be complementary depending on
many factors such as the age of the learners, extent of exposure
and the conditions of exposure to the linguistic materials, level
of motivation and the approach to teaching. Generally speaking,
research as well as life experience adduce ample evidence in the
direction of more acquisition than learning by children as op-
posed to more learning than acquisition by adults. Hence, in the
description of language internalization by children, the appro-
priate compound action verb would be acquire-learn and the
reversed order, learn-acquire, would be more appropriate for
adults. However, the above two orientations in language/speech
internalization should not, in any way, imply that some adults
are unable to attain a near-native proficiency. No doubt, those
adults who have some degree of linguistic aptitude and a gift for
language internalization will tend to handle languages with an
acquire-learn strategy similar to that of children. Nevertheless,
even some adults who do not possess such linguistic aptitude may
improve their chances of better learning regardless of age if the
conditions and techniques of learning/teaching are conducive
enough to motivate them and activate all their sensory and cognitive
processes of knowledge and skills acquisition.
The immediate instructional implication of the above
statement is that acquisition and learning are not two processes
of language internalization that are mutually exclusive in the
absolute sense. The statement also entails that teaching pronun-
ciation to normal children requires a methodology that is drasti-
cally different from that of adults. All that children need to ac-
quire-learn the pronunciation of a given language is cognitive
and developmental readiness coupled with ample authentic ex-
posure to the reception and production of linguistic materials in
the target language (L2). This does not mean that children learn
only inductively with no benefit attached to deductive learning.
There is always some of both: more of the former in this case
and less of the latter. Consequently, mere exposure to language
materials as well as repetitive drilling such as the traditional
repeat-after-me technique may be far more functional and effec-
tive with children than with adults. With the latter, mere expo-
sure is not sufficient and, oftentimes, the above technique turns
out to be useless because adults tend to repeat after themselves,
i.e. repeat in terms of their own phonology. In other words,
adults may reproduce L2 sounds in terms of their L1.
5.1.7. Pronunciation: A Natural Gift for Children

The acquisition of pronunciation by children, which is so natu-
ral, efficient and perfect, should not exclusively be attributed to
the pre-puberty potential and readiness for language acquisition.
There are several other factors which are no less efficient in ac-
tivating natural potential and nurturing it. The following are
some of the foremost factors relevant in this regard:
• tens of thousands of hours of exposure to authentic native language materials in the first few years,
• exposure often to context-embedded and situation-embedded materials,
• holistic exposure to language materials and skills,
• natural feel for language as a tool for normal social survival,
• affectionate risk-free environment afforded by the parents.
The absence of some or all of the above conditions, or a severe
deficiency in them, renders L2 learning/acquisition by adults a far
more challenging social and cognitive task.
5.1.8. Pronunciation Should be Premised on a Triangular Base of Perception, Recognition and Production

Any teaching of pronunciation should thoroughly follow the
three-stage procedure of sound acquisition: perception, recogni-
tion and production in the sequence indicated. The above trian-
gular procedure is highly consistent with the three-stage proce-
dure of registration, retention and retrieval in learning and with
the three types of sensory, short-term and long-term memories in
information storing. In each case, the earlier stage serves as the
gateway to the next and final stage. The transition to the final
stage cannot be completed without continued rehearsal.
A brief clarification of the terminology is invaluable. Per-
ception is used to denote the condition of feeling and sensing the
presence of a given sound; recognition includes the condition of
perception as well as the condition of being able to distinguish
the given sound from others and, perhaps, identify the difference(s)
in comparative/contrastive situations. According to Parasuraman
and Beatty, in terms of cognitive processing, the
distinction between perception and recognition appears to be
"the matching of the external sensory pattern with some internal
sensory engram1 and the bringing of this to awareness" (cited in
Kissin, 1986). Elaborating on the above quotation,
Kissin states, "The definition of recognition as the process of
matching external perceptions against existing internal correlates,
implies a second level of activity" (ibid.). As for production,
it satisfies the above two conditions of perception and recogni-
tion in addition to the ability to retrieve the sound and repro-
duce it at will with an acceptable degree of proficiency and ac-
curacy.
1 What was supposed to be the trace of a sound in the brain.
The sequential triplet of learning is: registration, retention
and retrieval. In standard literature on learning, registration refers
to the perception, encoding and neural representation of stimuli
at the time of an original experience; retention is the neurological
representation of an experience to be stored for later use;
and retrieval is the gaining of access to previously registered and
retained information (Arnold, 1984; Levitt, 1981). As for information
storing in the brain, there are three different kinds of
storing systems. The sensory memory is the initial level of infor-
mation storing in the form of an impression; information stored
here is extremely limited and is retained for only a few seconds.
Sensory memory is a sort of photographic memory (Loftus,
1980) that is represented in two forms: auditory sensory memory
known as echoic and visual sensory memory known as iconic. It is
interesting to note that auditory memory appears to be more
durable than visual because a sequence of spoken digits is better
remembered than a sequence of digits presented visually (Bad-
deley, 1993). Short-term memory is not as limited as sensory
memory; it can store about seven items plus or minus two items
and for no more than half a minute or so. Although short-term
memory may be transient and limited in capacity, it may be
very useful in ear-training sessions where the temporary reten-
tion may allow the learner to better perceive the sound; it may
also play a crucial role in conscious thinking. In plain wording,
the half-minute or so allows the learner to think about the sound
and its production. Long-term memory is the storing system
where information is retained for longer periods of time and
even permanently. In terms of cognitive knowledge, the process
of learning is essentially one of transferring information from
the environment into the long-term memory. Long-term memory
is a more-or-less permanent repository of general knowledge
about the world and past memories (Bourne et al., 1986).
The explanation above suffices to portray the functional
and operational parallelism across the processes of sound acquisition,
general learning and memory and the sequential stages
through which they usually go. For instance, in order to perceive
a sound one has to be exposed to it at least in passing
through the sensory memory; to have it registered temporarily,
it should be stored in the short-term memory; however, in order to
retrieve and produce a sound at will, it has to be retained and
consolidated in the long-term memory through rehearsal. Sequencing
of stages is significant and bypassing a stage may negatively
impact the outcomes. For instance, with insufficient and
improper exposure to unfamiliar sounds, one is highly unlikely
to succeed in producing them. On the contrary, it is highly likely
that the learner will subconsciously relapse into his native inventory
and produce a sound that is not the targeted L2 one. A
serious flaw in the traditional approach to the teaching of pronunciation
is attributed to either insufficient dwelling on the
perception and recognition stages or their total neglect.
Those two conditions lead to an immediate jump to the production
stage, a condition that is typically embodied in the repeat-after-me
technique of teaching pronunciation, which is usually
so incompatible with the learning styles of adults.
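The staged storage described above can be pictured with a toy model. This is a sketch for illustration only, not a cognitive model: the class name, the method names and the choice of three rehearsals are hypothetical assumptions, while the capacity of seven items and the half-minute span follow the approximate figures quoted in the text.

```python
from collections import deque


class ToyMemory:
    """Toy picture of a short-term store feeding a long-term store."""

    def __init__(self, capacity=7, span_seconds=30.0, rehearsals_needed=3):
        self.capacity = capacity                    # "about seven items"
        self.span_seconds = span_seconds            # "half a minute or so"
        self.rehearsals_needed = rehearsals_needed  # arbitrary assumption
        self.short_term = deque()   # (item, time_last_heard), oldest first
        self.rehearsals = {}        # item -> consecutive rehearsal count
        self.long_term = set()

    def _expire(self, now):
        # Drop items that were not refreshed within the retention span.
        self.short_term = deque(
            (item, t) for item, t in self.short_term
            if now - t <= self.span_seconds
        )

    def hear(self, item, now):
        """Expose the learner to `item` at time `now` (in seconds)."""
        self._expire(now)
        still_held = any(held == item for held, _ in self.short_term)
        # Refresh the item: remove any old entry, then append the new one.
        self.short_term = deque(
            (held, t) for held, t in self.short_term if held != item
        )
        self.short_term.append((item, now))
        while len(self.short_term) > self.capacity:
            self.short_term.popleft()
        # Hearing an item again while it is still held counts as rehearsal;
        # otherwise the count starts over.
        if still_held:
            self.rehearsals[item] = self.rehearsals.get(item, 0) + 1
        else:
            self.rehearsals[item] = 1
        if self.rehearsals[item] >= self.rehearsals_needed:
            self.long_term.add(item)

    def can_produce(self, item):
        """Production at will requires consolidation in the long-term store."""
        return item in self.long_term
```

In this sketch, a sound heard three times within the retention span is consolidated and becomes producible, whereas the same three exposures spread too far apart never accumulate. This mirrors the point that bypassing or shortening the earlier stages undermines production.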
5.1.9. Pronunciation & Psycholinguistic Insensitivity

When very young children are able to perceive and even recog-
nize a sound, but fail to produce it, the reason may be attributed
to lack of neuromuscular maturation. To express it differently,
such cases, which are very common with children, indicate cog-
nitive perception and recognition of sounds, but lack of physical
maturation and practice to coordinate, synchronize and set in
motion the relevant articulators to assume the targeted articulatory
configurations and postures. To substantiate such instances
of lack of maturation, I conducted the following test with
my son when he was four (4) years old:
He had an alveolar lisp because he was unable to pronounce
the alveolar fricatives [s] and [z]. So I selected the
minimal pair of <thin> vs. <sin> and repeated them several
times and asked him to repeat them after me. He produced
<thin> for both of them. I wrote the words on a
piece of paper in big letters and read them very clearly and
emphatically and asked him to repeat them after me. Once
again he repeated <thin> for both. When I tried a third
time, he yelled at me and said: "This is []," pointing to
<thin>, "and this is []," pointing to <sin>. I knew at that
juncture that he was able to perceive and recognize the
sounds, but was unable to articulate the difference. Six
months later, when he went to pre-school, he came to me
one day and said: "Do you want me to pronounce <thin>
and <sin>?" He pronounced them perfectly. It was simply a
maturational problem.
It is not a serious problem teaching pronunciation to children.

Ample exposure and proper practice will automatically enable
the child to overcome the difficulty. If, however, adults feel it is
difficult to produce certain sounds in L2 or if they completely
fail to produce them in spite of repeated modeling by the in-
structor, it is highly likely that this situation has arisen because
of a failure to perceive and recognize the targeted sounds. Such
a condition was labeled earlier as psycholinguistic insensitivity
or deafness. This is a condition which develops as a result
of exclusive exposure to one's native language (L1) in which a
certain sound does not exist. It is assumed that the exclusive
exposure to the speech of L1 creates an unintentional and inad-
vertent phonetic and phonological bias to the sounds and the
sound system of L1 at the expense of those of any L2. It is in a
situation like this that the repeat-after-me procedure is highly
ineffective. Nevertheless, a multisensory and multicognitive
approach, which provides a broad selection of teaching styles,
tends to be very effective in managing the
perception/production of unfamiliar L2 sounds. Obviously, different
learners will respond differently to various sensory and cogni-
tive techniques or a combination of them.
5.1.10. Pronunciation: Understanding its Scientific Premises

Some knowledge of the articulatory movements is indispensable
for teachers and can be very beneficial for learners, especially
adults. Familiarity with the function of the vocal folds, the dif-
ferent parts of the tongue, the hard and soft palates and the lips
is invaluable in understanding the nature of speech as a dynam-
ic process as well as its teaching. If, for instance, a Hispanic
learner of English tends to replace a voiced labialdental fricative
[v] with a voiced bilabial plosive/stop [b], it should not be a
difficult problem for the instructor, as well as the student, to
overcome. The teacher should focus on visual demonstration of
the labial articulatory differences between the two sounds: a
posture of the two lips contacting each other for [b] as opposed
to the lower lip contacting the upper teeth for [v]. Because these
two articulations are clearly visible, they become readily
imitable and learnable.
5.1.11. Pronunciation: Its Feedback Mechanisms

The production of speech requires the simultaneous and coordi-
nated use of respiratory, phonatory and articulatory mecha-
nisms. Speech is such a complex activity that some method of
feedback control seems likely (Borden and Harris, 1980). The
physical, aerodynamic and acoustic dynamics, movements and
perturbations that result from the action of the mechanisms of-
ten yield multifarious sets of internal sensations of touch, pres-
sure, movement, position etc., which constitute the feedback
control systems.
Two important instructional facts emerge as a result of the
above emphasis on diversified speech production feedback sys-
tems. Firstly, it is the auditory feedback system, which still, al-
most exclusively, dominates our approach to teaching speech
and pronunciation in our schools. Unfortunately, in addition to
the exclusiveness of the auditory feedback, it is oftentimes
applied in the mechanical sense of mere listening or listening and
repeating after the instructor. Secondly, all types of feedback
mechanisms, especially tactile/kinesthetic/proprioceptive, which are
extremely important in sound acquisition, should be brought
into play jointly in the form of different pronunciation teaching
and learning techniques and activities.
5.1.12. Pronunciation: In Light of Multiple Intelligences Theory
Gardner's Multiple Intelligences Theory (MIT) is one of the
significant pillars of recent cognitive philosophy and orientation in
knowledge acquisition. MIT is broad enough to encompass any
aspect of our life and any field of knowledge dissemination and
acquisition. Due to the relatively recent emergence of MIT in
terms of classroom application, it is still at the abstract and
philosophical level of an approach to instruction. Any attempt to
bring it down to the classroom floor in the form of strategies for
daily application requires its transformation from an approach
to sets of techniques and strategies.
MIT comprises nine intelligences: linguistic,
logical/mathematical/scientific, visual/spatial, musical,
bodily/physical/kinesthetic, interpersonal, intrapersonal, naturalist
and existential. Several of them are directly or indirectly related
to the teaching/learning of language, in general, and more than
one of them is directly involved in the teaching/learning of
pronunciation. It is the belief here that in the teaching/learning of
pronunciation, the linguistic, visual, musical and kinesthetic
intelligences are involved to different extents in the
development of the cognitive pedagogy for teaching pronunciation
promoted in this book. It is, therefore, very important for the
instructor to approach the learner via more than one sense and the
learner should be prepared and encouraged to learn likewise. It
was also made clear that the so-called technique of ear training
cannot do the job alone. Eye training, neuro-muscular training
and brain training should supplement ear training. It is through
a joint set of procedures involving the above modes of training
that sound production and its dynamics will be assimilated and
accommodated by the brain in the form of traces in long-term
memory for immediate retrieval.
The view that pronunciation is a complicated cognitive
process that taps into more than one intelligence implies the
need to activate those intelligences and involve them in the
learning process. In order to secure maximum involvement of
the learners, it is incumbent on the instructor to diversify the
teaching techniques/styles to encompass all relevant sensory
and cognitive modalities. It is only through this diversity that
more adult learners will be invited to participate and multiple
intelligences will be stimulated. The extent of the success of any
class depends on the degree to which the teaching styles match
the learning styles. The diversity of the teaching/learning styles
will serve the significant purpose of discovering the intelligences
of the learners and designing future instruction accordingly.
5.1.13. Pronunciation: A Generative Skill

Obviously, the term generative is associated with Chomsky's
theory of linguistics. The term is reused here with a slightly
different denotation, though still somewhat related to the
Chomskyan connotation. The generative nature of the pedagogy
promoted here implies that mastering the perception,
recognition and production of one sound should facilitate
the mastery of more than that one sound. In other words,
developing a skill in one aspect/domain of pronunciation should
serve as a key to enhance or generate a skill to master other
aspects/domains of pronunciation. For instance, in English,
mastering the production of the schwa vowel [ə] does not
only help with the mastery of the complicated vowel system, but
it will also considerably facilitate the process of stress placement
and the overall rhythmic performance. In the domain of
consonantal features, mastering the features of aspiration vs.
non-aspiration in one pair of consonants (e.g. /pʰ/ vs. /p/) should
enable the person to apply that skill to other pairs of consonants
(e.g. /tʰ/ vs. /t/, /kʰ/ vs. /k/, /cʰ/ vs. /c/). Also, learning how
to kinesthetically and proprioceptively sense a tongue tip
contact at the alveolar ridge should develop the skill of sensing
other contacts of the tongue in the oral cavity. Even in the
dynamics of sound production, mastering accentuation in a given word
should pervade other words and the overall rhythm mastery in
the targeted language, or any other language for that matter. In
sum, the pedagogy espoused here for teaching pronunciation is
a holistic one because human speech is more than a
combination of sounds; human speech is gestalt in nature.
5.1.14. Pronunciation: Interactive Involvement of Instructors and Learners
The instructor of pronunciation according to the multisensory
multicognitive pedagogy suggested here should be knowledgea-
ble about the processes involved in speech production and per-
ception as well as having some awareness of the recent cognitive
orientation in the theories of language and education. The
theoretical knowledge base must also be reinforced by some
practical skills of application. Such theoretical and applied know-how
is indispensable because teaching pronunciation is no longer a
strict and exclusive imposition of mechanical repetition.
Pronunciation is no longer a stand-alone physical phenomenon;
it is rather contextually ingrained in the brain like the
rest of the cognitive processes and activities. More importantly,
the instructor should ascertain that there exists an interactive
connection between him and the learners. To establish this
connection, the instructor should also be knowledgeable in the
following respects. Firstly, he should make sure that the learners
know what the theme/activity under demonstration is all about.
For instance, if the activity is about accentuation (stress place-
ment), he should make sure that learners know what stress and
stress placement as phonetic phenomena are. He should not pass
instructions that only he comprehends. Secondly, he should take
the age of the learners into consideration and plan accordingly.
Thirdly, to secure the above two considerations, the instructor
should conduct a few testing and assessment activities to
discover the learners' knowledge of and familiarity with speech acquisition.
Finally, he should also conduct further direct and indirect as-
sessment and probing activities to identify the type of pronunci-
ation difficulties they may have, especially those emerging as a
result of L1 and L2 interference.

Language acquisition for a child follows the pattern of Caesar's
phrase "Veni, vidi, vici" (I came, I saw, I conquered) restated in
the following format: I listened, I recognized, I acquired.
Unfortunately, it is not that straightforward for adults embarking on
an L2. There is no doubt that different adults have different
potentials for L2 learning; however, the majority of them do not
have the disposition that a child has in mastering an L1. Adults
have already exhausted their natural potential while acquiring
their L1. It is the exhaustion of their innate L1 bias that stands
in the way of duplicating the childhood experience as adults. It
is because of this L1 bias that the methodology of teaching
adults an L2 requires a broad array of sensory and cognitive
modalities and strategies to render their mastery of L2 as suc-
cessful as possible.
CHAPTER 6: TEN COMMANDMENTS FOR TEACHING
EFFECTIVE PRONUNCIATION

This section will serve as an introduction to the main implemen-
tational principles of the pedagogy promoted in this book. It will
also summarize the factors that throughout my long career in
teaching qualified me to manage the teaching of pronunciation
in an effective and efficient way. The layout of the chapter is in
the format of ten commandments stated in a positive tone.
They are collectively a reaction to three educational experiences
in my life both as a student and instructor. First, occasionally
learners do not understand the explanations and/or instructions
of the instructor pertaining to certain aspects of language, but
they fail to ask for clarification for one of three reasons:
they are shy, they think they understand the point, or
they are simply negligent. So, they remain in limbo. Second,
some instructors, including myself at the early stage of my
teaching career, fail to double check on the learners'
connection with what they are teaching or explaining. Such a scenario
disrupts the interaction between the two, a situation that defeats
the purpose of teaching. If such a disconnection prevails, the
instructor will often be the only learner together, perhaps, with
a minority of students. Third, some instructors, including myself
at the early stage of my teaching career, may not be
professionally qualified to teach pronunciation. This latter fact makes
them vulnerable to irrelevant or inaccurate explanation of lin-
guistic facts about speech, in general, and pronunciation, in par-
ticular. Later in my career, those three experiences were a major
source of feedback for developing my own approach to teaching
pronunciation. I benefited tremendously from my graduate
education and my extensive classroom experience. I tried to
overcome my own weaknesses and those of my teachers throughout
my early education.
I gradually made the adjustments needed. Fortunately, my
graduate education in speech science extended the horizons of
my knowledge and experience. I became more experienced in
discovering my pronunciation problems and those of my stu-
dents. Solutions to those problems had to be part of my lesson
plan. Since my students were usually of different language
backgrounds, I had to prepare strategies for each linguistic
group. I always pressed my students to be straightforward in
asking questions and making comments regarding my presenta-
tions, explanations of materials and implementational tech-
niques used. I also used different strategies to discover whether
they understood the technical jargon I was using in my instruc-
tion. A wide variety of techniques, demonstrations, facial, bodily
and hand gestures were used to prop up demonstrations and
explanations of facts and procedures. I carefully watched their
faces for any indications of confusion and uncertainty. In asking
a certain student whether he understood what I was saying, I
never took a "yes" response for a genuine "yes". I usually probed
the students to verify their positive response. The following is a
real example pertinent to the latter point. While I was once
engaged in teaching my students about stress placement or
accentuation in English and how it can render the same word
as a verb or a noun depending on the position of stress, such
as in <permit, contract, refuse, desert, import>, I noticed that
the facial gestures of one of the students indicated uncertainty
and confusion. I asked the student whether she grasped the
difference. She said "Yes", but with a tone of indecision. I had to
demonstrate the examples once again highlighting the difference
in accentuation. When I asked her to repeat the demonstration,
she managed some of them but stumbled with others. In light of
this situation, I had to use additional strategies until the student
grasped the fundamental difference. The strategies were a
combination of visual and auditory sensory modalities. I selected the
word <contract> for demonstration and marked the strong
syllable with a large dot and the weak one with a small dot. For
the noun <contract> = [● •], I tapped on her desk strongly for
the first syllable and lightly for the second one. For the verb
<contract> = [• ●], I reversed the strength of the taps: light
for the first syllable and strong for the last. I also asked her to
watch the movement of my hand while tapping. After two days,
we had another session in the beginning of which I once again
highlighted the difference and asked the same student to give a
demonstration. This time she was excellent and with no hesita-
tion, whatsoever. The mission was accomplished successfully.
Let us now move to the ten commandments.
6.1.1. Thou Shall Teach Pronunciation as a Cognitive Undertaking
Like thinking, language is a cognitive skill that occupies
considerable space in the brain and needs a social environment to
nurture and cultivate it. The roots of human language
grow with the birth of the brain and its gradual maturation. It is
one of the first gifts the fetus receives; they grow and mature
together. Since the brain is responsible for the physical, cogni-
tive and social growth of a child, it has to function with superb
economy of effort in all three types of growth. The brain gives
the child a golden opportunity to perfectly internalize the lan-
guage of the community into which it is born. This perfection is
a one-time privilege for a child; once the child gradually steps
out of the realm of its childhood the perfect skill in language
acquisition begins to taper off. This explains why adults usually
manifest a certain degree of imperfection in pronunciation
known as accent. Hence the deficiency in the pronunciation of
L2 and the emergence of accent are primarily attributed to the
perfect efficiency in the pronunciation of L1. Accent in the ren-
dition of L2 by adults is a normal natural cognitive phenomenon
and the instruction to reduce it should be premised on a cognitive
basis and implemented accordingly.
6.1.2. Thou Shall Teach Children and Adults Differently

Since language internalization by children is a process of natural
acquisition of L1, while an adult's embarking on an L2 is a contrived
process of learning, the approach and the teaching/learning
strategies have to be different in each case. In actual fact, chil-
dren are not formally taught the sounds of their language and
their collective pronunciation; rather, their brain is simply ready
to assimilate the sound system through social exposure and
interaction within the family and the community. The school
simply reinforces the early linguistic education that home and
community had already initiated. Adults are taught their L2 in
schools or gradually pick it up in the L2 community in which
they reside. The task of learning an L2 by adults is a conscious
and effortful one. Oftentimes, they manifest a certain degree of
accent.
6.1.3. Thou Shall be Qualified for Instruction in Pronunciation
Teaching pronunciation at a professional level requires general
linguistic know-how with qualification and experience in
phonetics. Unlike other linguistic skills, such as morphology, syntax
and vocabulary, pronunciation is the earliest skill a child is ex-
posed to and it is, thus, naturally acquired in L1. This is why if
the skills of sound production and overall pronunciation of L2
are attempted in adulthood the process can be very demanding
because the brain shows reluctance to reshuffle the L1 system
through relearning additional constituents. Any instructor of
adults learning L2 pronunciation should be thoroughly aware of
the sound system of both L1 and L2 to highlight the differences
and similarities and the areas where problems would be
expected. Knowledge of sound description and identification in
terms of voicing/voicelessness, place of articulation and manner
of articulation is highly beneficial. Perhaps most important
of all is the possession of a diversified set of teaching strategies
that take the learners beyond the most common, and the least
effective, strategy of parroting or repeat-after-me. Because of
adults' imminent psycholinguistic insensitivity to L2 phonology,
they often repeat after themselves under the influence of their
L1 phonology. The instructor has to use as many sensory and
cognitive strategies as needed and diversify them in order to be
able to penetrate the protective shield of L1 phonology. At
times, individual learners require individualized attention and
instruction. It is also the responsibility of the instructor to intro-
duce adult learners to new sensory and cognitive learning strat-
egies beyond the mechanical repetitions after the instructor.
6.1.4. Thou Shall Familiarize Learners with Human Speech Production
Since pronunciation is the reflection of cognitive activities
transmitted via the speech organs, it is quite helpful for learners
of other languages to familiarize themselves with some of the
articulatory activities involved in speech production. As sounds
in any language are based on three distinctive features, namely,
voicing/voicelessness, place of articulation and manner of artic-
ulation for consonants or the shape and location of tongue for
vowels, it is advantageous to introduce learners to the basics of
those features. For instance, if one is teaching a group of His-
panic learners of English who are having difficulty in distin-
guishing a /b/ from a /v/, it is absolutely necessary to visually
demonstrate the two sounds. The demonstration will readily
show them the difference between the two sounds because the
/b/ is bilabial (the two lips come together), whereas the /v/ is
labialdental (the upper incisors touch the lower lip). Once they
realize the physical (articulatory) difference, it is very easy to
master the pronunciation of both sounds.
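
The three-feature description of consonants used above can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not part of the pedagogy itself: the `Consonant` record and the `differing_features` helper are hypothetical names, and "labiodental" in the code stands for the labialdental place of articulation named in the text.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Consonant:
    """A consonant described by the three features named in the text."""
    symbol: str
    voiced: bool   # voicing vs. voicelessness
    place: str     # place of articulation
    manner: str    # manner of articulation


# Feature values follow the descriptions given in this chapter.
B = Consonant("b", True, "bilabial", "plosive")
V = Consonant("v", True, "labiodental", "fricative")
S = Consonant("s", False, "alveolar", "fricative")
Z = Consonant("z", True, "alveolar", "fricative")


def differing_features(a: Consonant, b: Consonant) -> list[str]:
    """Return the names of the features on which two consonants differ."""
    return [
        f.name
        for f in fields(Consonant)
        if f.name != "symbol" and getattr(a, f.name) != getattr(b, f.name)
    ]


print(differing_features(B, V))  # place and manner differ; both are visible on the lips
print(differing_features(S, Z))  # only voicing differs; nothing to see, something to feel
```

Read this way, the /b/ vs. /v/ contrast lends itself to visual demonstration because it rests on place and manner at the lips, whereas a contrast such as /s/ vs. /z/ rests on voicing alone and calls for a different sensory channel.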
6.1.5. Thou Shall Orient Learners Psychologically

The instructor has to prepare adult learners and guide them step
by step during the process of mastering L2 pronunciation. First
and foremost, he has to instill in them a positive attitude for
learning coupled with confidence. Next comes the need to
transform the physical ability to hear into a cognitive skill of
listening to hone the potential for distinguishing one sound from
the other. The instructor has to be careful not to let learners feel
that they are under constant watch by him; rather, the interac-
tion between him and the learners should be very casual yet
focused on problem identification and problem solving. Learners
should feel absolutely free to comment, criticize and seek fur-
ther clarification. During all the interaction that goes on be-
tween the instructor and the learners, the instructor has to bear
in mind that some learners tend to be more outgoing than oth-
ers. It is his responsibility to make every learner feel utterly
comfortable to ask, participate and demand further clarification
of points under discussion. The instructor has to understand that
in response to his question "Did you understand what I said?"
learners nodding their heads does not necessarily mean that
everyone has really understood what was said. There are always
some learners who are shy, a couple who are slow and do not
want to openly admit they did not understand, while some
others may think they understood, but in reality they did not. It is
entirely the responsibility of the instructor to check all these
possibilities and respond accordingly.
6.1.6. Thou Shall Use all Sensory Modalities to Prop up Instruction
Pronunciation is not the exclusive responsibility of the auditory
sense; unfortunately, the overwhelming majority of people think
that is the case. This is why statements such as "Listen to me, I
will say it again" or "repeat after me" are commonly used by
ordinary people as well as by teachers who are not professionally
qualified in the methodology of teaching pronunciation. This
exclusive focus on the auditory sense is one of the primary rea-
sons for the failure of teaching effective and efficient pronuncia-
tion to adults. In addition to being heard and listened to, sounds
are oftentimes visually detected through the facial and bodily
gestures of the speaker. Besides, the speaker has the potential
for kinesthetic and proprioceptive sensing of movements and/or
contacts of vocal organs. In other words, the speaker has the
potential to sense the position, location and orientation of
speech organs inside his body beginning with the air movement
in the lungs and throughout the vocal tract as well as the
movements of the tongue in the pharyngeal and oral cavities. As
for the opening of the oral cavity, the different lip configura-
tions and lips/tongue contact, they are both seen and sensed.
Accordingly, one of the foremost instructions to give to
learners is to watch the facial and bodily gestures of the instruc-
tor during the modeling of new and unfamiliar sounds. Also one
has to try to sense the movements or contacts of the tongue or
other parts of the vocal organs while pronouncing or practicing
sounds. This is the best technique to monitor one's own
articulation of sounds.
6.1.7. Thou Shall Use all Cognitive Modalities to Prop up Instruction
Foremost among the cognitive modalities to be considered in
teaching and learning pronunciation is for the learner
to think about his production of sounds or to associate a sound
that he is learning with a sound that he has already mastered.
The best way to demonstrate the cognitive modalities is through
examples. Let us go back to the Hispanic learners of English and
their difficulty with the [v] sound. All that the learners need to
do is to watch the instructor's face and create a contact between
the lower lip and upper incisors. Next, ask them to repeat the
contact several times so that the brain registers the kinesthetic
and proprioceptive impression between the teeth and the lip. If
the brain registers the contact through repetition it will serve as
a reminder for the learner to perform the contact when a [v] is
the targeted articulation. Proper repetition of articulation will
gradually transform the maneuver into a cognitive process.
The same Hispanic learners will face difficulty with the
production of a [z] sound which is habitually replaced with an
[s] sound. Since the difference between the two sounds is the
vibration of the vocal folds with [z] and its absence with [s], the
best advice an instructor can give his students is to perform the
exercise of detecting vibration. This is executed by pressing the
palms of one's hands on the ears while performing a sustained
[sssssss] sound followed by a sustained [zzzzzz] sound. With the
latter sound, the person should feel some sort of humming-like
echo in the ears. If such an echo is missing, it means the person is
not able to set the vocal folds into vibration. The instructor
should demonstrate the two sounds and ask the learner to do so
until he gets the hang of it. Once the learner feels the difference
in the form of the vocal folds' vibration, it gradually becomes a
cognitive process and is registered in the brain.
6.1.8. Thou Shall Transform Learners from Listeners into Performers
Although explanation of some aspects of speech production is of
help in the process of learning pronunciation, too much of it becomes
distracting. Emphasis is better placed on practical performance
and demonstrations. The instructor should monitor the
performance of learners and identify those who achieve an early
success. Those early achievers should be encouraged by him to
guide the other learners who need further demonstration and
practice. At times, peer teaching and learning can be more effec-
tive than that of the instructor. The aim of the instructor should
be the transformation of the learners into performers. This im-
plies that each session of theoretical explanations should be fol-
lowed by a session of actual performance of the targeted sounds.
6.1.9. Thou Shall Refrain from Insistence on a Learner

If a learner fails to master the performance of a certain sound
after repeated demonstrations by the instructor or other success-
ful performers, the instructor should not persist in pressing the
learner for continued action. A situation like this confuses the
learner even more and makes him conscious of his inability to
perform. Once the instructor feels that such a situation has
arisen, he should immediately stop, either by moving to another
learner or by changing or ending the exercise. The instructor
should also avoid focusing on one or two learners simply be-
cause they are good performers. Such a scenario will make other
learners feel inferior or see themselves as under-performers.
6.1.10. Thou Shall Make the Classroom a Place for Learning and Fun
Mastering certain segmental speech sounds (consonants and
vowels) or suprasegmental ones such as stress, tone and
intonation can result in funny situations that elicit laughter. For
example, if one wants to prove that an <m> is a nasal sound (the air
runs through the nose not through the mouth), one can begin
humming and then suddenly clip the nostrils. Of course, the
humming (or the mmmmmm) will suddenly cease because the
nasal air is blocked from flowing. The cessation of airflow sud-
denly builds up high pressure in the vocal tract and the ears.
Usually, this unexpected swift buildup of pressure results in a
burst of laughter which is perfectly acceptable and part of the
hands-on learning about the dynamics of human speech. Also,
when one teaches a retroflex <r>, typical of the languages of
the Indian subcontinent, the exercise requires teaching
learners the tilting of the tongue-tip backwards. This is a very
unfamiliar articulatory maneuver and few people can perform it
at first attempt. Working on it can raise a lot of laughter which
is a normal and integral part of mastering it. Perhaps, teaching
pitch movement or the tones is the funniest of all, simply be-
cause many of the learners may turn out to be tone-deaf. You
may ask a learner to do a rising tone and he gives you a falling
one. Indeed, there are a few other sounds or sound phenomena
that are rarely acquired without fun and laughter, such as the
different types of clicks in some African languages. In light of such
funny situations all that the instructor has to do is to control the
fun and laughter within acceptable limits. Having fun while
practicing unfamiliar sounds is a normal, natural learning
situation; all that the instructor has to do is not let some of the
learners be carried away with fun and interfere with class
management.

Effective teaching in classroom situations is a collaborative ef-
fort between instructors and learners. To make this collabora-
tion successful, effective teaching and learning strategies should
be premised on combinations of sensory and cognitive consider-
ations. This diversity in considerations is indispensable in teach-
ing L2 pronunciation to adults.
CHAPTER 7: EXAMPLES OF CROSS-LANGUAGE
ACCENT-CAUSING CONSONANTS

Since English is the most widely used international language,
the focus will now be on English to help restrict the number
of comparisons and contrasts. We will carefully identify some of
the most noticeable pronunciation problems that natives of oth-
er languages encounter and, in turn, identify problems that na-
tive speakers of English encounter when learning other lan-
guages. Another way to contain the unneeded breadth of com-
parisons is by limiting the number of selected sounds and sound
phenomena that will be tackled. This chapter will be devoted to
consonant features with the next two chapters covering vowels
and suprasegmental features.
7.2. OUTLINE OF THE ENGLISH CONSONANT SYSTEM

In terms of unmarked (common) and marked (uncommon)
sounds, the consonant system of English leans in the direction of
the former more than the latter. In actual fact, there are no
consonants in English that can be identified as marked (such as the
pharyngeal and emphatic consonants of Arabic, the pervasive
retroflexion of Hindi and the three-way plosive distinction of
Korean, etc.). However, based on the difficulties of learners of
English as L2, the most problematic pair of consonants is [θ]
and [ð] as in <thigh> and <thy>, respectively. Also, English
sounds such as <r>, <l>, <v>, <s>, <z> can be
the source of phonetic and phonological accent for speakers of
some languages. Below are further comments on some of the
cross-language phonological inconsistencies.
7.2.1. Interdental Pair /θ, ð/

In terms of markedness and unmarkedness of human language
sounds, this pair [θ, ð] can be considered marked because it is
rarely attested in the majority of known languages. Due to its
rarity, many learners of English encounter serious pronunciation
problems resulting in a distinct phonological accent. More
widely, the pair [θ, ð] is often replaced with the alveolar fricative
pair [s, z] or the alveolar plosive pair [t, d] depending on the
phonology of the specific language. The less common
replacement of [θ, ð] is with the labialdental fricatives [f, v]. This last
substitution is also known in the literature as th-fronting, which is
common in some dialects of British English as well as in
African-American English.¹ It is also observed in New Zealand English.²
To demonstrate examples of the above-mentioned substitu-
tions, a German or Kurd learner of English, for example, renders
the words <thank> and <then> as <sank> and <zen>,
while a Pole, Filipino or Assyrian learner of English renders the
same two words as <tank> and <den>, respectively. In each
case, the outcome is a very serious phonological accent as well
as a phonetic one. It results in a phonological accent because it
changes the meaning of thousands of words causing crucial se-
mantic confusion.
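The two substitution patterns described above can be sketched in a few lines of Python. This is only an illustrative sketch: it works on spelling rather than on phonemic transcription, and the names SUBSTITUTIONS, VOICED_TH and render are hypothetical, introduced here for the example.

```python
# Hypothetical illustration: spelling stands in for phonemic transcription.
SUBSTITUTIONS = {
    "fricative": {"voiceless": "s", "voiced": "z"},  # e.g. German or Kurdish learners
    "plosive": {"voiceless": "t", "voiced": "d"},    # e.g. Polish, Filipino or Assyrian learners
}

# Words whose initial <th> is voiced; any other <th> word is treated as voiceless.
VOICED_TH = {"then", "thy", "they", "those"}


def render(word: str, strategy: str) -> str:
    """Replace an initial <th> according to the chosen substitution strategy."""
    if not word.startswith("th"):
        return word
    voicing = "voiced" if word in VOICED_TH else "voiceless"
    return SUBSTITUTIONS[strategy][voicing] + word[2:]


for strategy in ("fricative", "plosive"):
    print(strategy, render("thank", strategy), render("then", strategy))
```

Because each output is itself an existing English word, the sketch shows why the substitution amounts to a phonological accent and not merely a phonetic one.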
Even without the semantic change resulting from the failure
to properly pronounce the pair [θ, ð], the mere phonetic
change can cause serious phonetic accent. If one listens to ex-
Pope Benedict XVI3 delivering speeches in English, there is a
sibilant impression running throughout his English pronunciation.
The impression of sibilance is generated by the fact that
English is naturally a language rich in sibilant sounds, typically
[s, z, ʃ, ʒ]. Now, if a German, French or Kurdish learner of
English were to convert all the [θ, ð] sounds into [s, z], that
1
http://en.wikipedia.org/wiki/Th-fronting.
2
http://www.victoria.ac.nz/lals/resources/publications/nzej-backissues/2003-elizabeth-wood.pdf.
3
Listen to his reading a text in English (http://www.youtube.com/watch?v=w3fi93umuc4)
would seriously reinforce the dominance of sibilance. It should
be pointed out, however, that the high frequency of occurrence of
[θ, ð] sounds, especially [ð], is not because they occur in the
structure of many English words; rather, it is because the words in
which they occur are of high frequency, such as
<the, this, that, those, thing, think, thin, thick, etc.>. Thus,
when all the sibilance imposed by an L2 learner of English is
added to the sibilance that is already naturally present in English,
it generates pervasive semantic confusion and a disturbing
sibilance which reverberates as noise. Because of this
pervasiveness of sibilance in the pronunciation of a German
or Kurdish learner of English, the influence at times may be reflected
even in orthography (as was pointed out in 4.6, above).
The other form of conversion of [θ, ð] is into [t, d], which
causes equally serious phonological accent as it semantically
confuses scores of words such as <they, then, those, three, thin,
thank> which are rendered <day, den, dose, tree, tin, tank>,
respectively. Phonetically, it creates an overflow of [t, d] sounds
which, in turn, infuses considerable noise into the flow of oral
discourse in English.
7.2.2. Approximant /r/

The most unmarked (common) types of <r> sounds in the majority
of languages throughout the world are the tap <r> = [ɾ]
and the rolled <r> = [r]. Although such <r> sounds exist in
several dialects of English, the one dominant in Standard Eng-
lish varieties such as the English RP and GAE tends to be an ap-
proximant or the so-called frictionless continuant despite some
phonetic difference between the two. The English RP <r> is a
voiced postalveolar approximant [ɹ], whereas the American one
is a voiced postalveolar retroflex approximant [ɻ]. Perhaps the
most interesting and atypical fact about the phonetic behavior
of this sound in English is its suprasegmental impact on the con-
text of the words in which it occurs.
The contextual rules of r pronunciation in the two varie-
ties are different. In the general classification of the English dia-
lects and varieties, GAE is categorized as an r-dialect (or rhotic
dialect) meaning that the r is pronounced in all linguistic con-
texts, whereas RP is, perhaps, the most typical of all r-less-
dialects (non-rhotic dialect) meaning that the r is not pronounced
except in certain linguistic contexts. The rule for the
positional pronunciation of r in RP is very simple: r is pronounced
only when it is followed by a vowel sound, whether within a word
or across a word boundary. For instance, in the following
pre-consonant, pre-silent vowel and word-final positions, the r
is not pronounced:
Pre-consonant: <bird>, <card>, <verb>, <dirt> and <curve>,
Pre-silent vowel: <here>, <there> and <care>,
Word-final: <car>, <sir>, <doctor> and <bear>,
whereas in the following pre-vowel positions, within a word or across a
word boundary, it is pronounced:
Pre-vowel: <ring>, <reading>, <writing> and <arrange>,
Cross-word pre-vowel: <for instance>, <here is>.
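The RP rule can be sketched as a small Python function. This is a minimal sketch under stated assumptions: words are given as strings of broad, hypothetical sound symbols (not spellings, so silent letters are already omitted), the five letters in VOWELS stand in for all vowel sounds, and the name rp_render is introduced here for illustration. The sketch only deletes the r; it does not model the diphthongization that accompanies the deletion.

```python
VOWELS = set("aeiou")  # hypothetical broad symbols standing in for vowel sounds


def rp_render(words):
    """Keep an "r" only when the next sound is a vowel, even across a word boundary."""
    # Flatten the utterance so that the sound after a word-final "r"
    # is the first sound of the following word.
    sounds = [(wi, ch) for wi, word in enumerate(words) for ch in word]
    kept = [""] * len(words)
    for i, (wi, ch) in enumerate(sounds):
        following = sounds[i + 1][1] if i + 1 < len(sounds) else None
        if ch == "r" and following not in VOWELS:
            continue  # pre-consonant or utterance-final r is dropped
        kept[wi] += ch
    return kept


print(rp_render(["kard"]))       # pre-consonant
print(rp_render(["hir"]))        # word-final before silence
print(rp_render(["hir", "iz"]))  # cross-word pre-vowel
```

Looking ahead across the flattened utterance, not word by word, is what lets the same function cover both the word-internal and the cross-word pre-vowel cases.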
It should be reiterated that the difference in RP or
GAE <r> pronunciation by learners of English as L2 is essentially
of a phonetic nature, implying that it rarely amounts to
phonological accent;4 nevertheless, the phonetic difference is
extremely noticeable. The retroflexion in GAE colors the adjacent
vowels, whereas the positional constraints on RP r result
in two phonetic variations in the overall pronunciation. First,
dropping the <r> creates a consonantal vacuum for an L2
learner. Second, dropping it in word final positions or pre-silent
vowels results in more diphthongal renditions of the vocalic el-
ements as in <dear> [] and <there> [] versus their
renditions in GAE as less diphthongized vowels [] and [].
Additionally, the substitution of the approximant English <r>
with a tap, rolled or retroflex flap /r/, as with speakers of Spanish,
Italian, Arabic, Russian, Turkish and Indian languages, among many
others, tends to result in so much phonetic accent that it
may snowball into some sort of acoustic noise that, in turn,
interferes with comprehension on the part of the native English
listener.
4
Except, of course, when the <r> is confused with another sound, such
as in Japanese when it is replaced with <l>.
A special note is in order with regard to the typical retro-
flex flap /r/ of the sub-continental Indian languages. In such
languages, retroflexion is pervasive and phonetically covers al-
most all consonants and vowels. Actually, from the perspective
of Firthian linguistics,5 phonetic retroflexion in such languages
functions as a prosody that runs throughout complete words and
even discourse. In languages such as Hindi and Urdu, retroflex-
ion seems to run throughout their entire oral discourse. This
explains why many Hindi or Urdu learners of English, or any
other language, color their L2 with a heavy trace of retroflexion.
Most strikingly, retroflexion constitutes the most salient feature
in the articulatory settings of Indian learners of other languages,
in general.
It is not a thematic diversion in this section to bring in the
/l/ and /r/ confusion by Korean, Japanese and Chinese learners
of English. Let us consider the case of Japanese learners. Phonetically,
in Japanese the <r>, which is generally
identified as a tap or flap, and the <l>, which is
identified as a lateral, are two phonetic variants (allophones)
of the same phoneme. In terms of the neurolinguistic storage of
sounds in the brain, the two sounds of Japanese are stored in
one slot. Unlike Japanese, in English the <l> and <r> are two
autonomous phonemes independently stored in two separate
slots. Besides the primary difference between the phonological
function of the <l>s and <r>s, their phonetic realizations are
also different. It is, therefore, quite natural for Korean, Japanese
and Chinese students to encounter serious phonological difficul-
ty in mastering the independent contextual production of the
two phonemes of English. In fact, these two liquid phonemes in
English remain some of the last sounds of English that those ori-
ental learners of English manage to successfully pronounce.
5
After the late Professor J. R. Firth of London University.
7.2.3. Voiceless and Voiced Alveolar Fricatives /s/ and /z/

This pair in English represents the unmarked variety of fricatives.
Their mispronunciation is not as pervasive as that of the interdental
pair /θ, ð/. However, in cross-language studies, differences
arise that lead to mispronunciations in the form of accent. For
instance, the voiced alveolar fricative /z/ phoneme is missing in
Spanish, which results in its replacement with /s/ in Latin American
Spanish and with /θ/ in Continental Spanish, leading to both
phonological and phonetic accent. In Latin American Spanish,
words such as <zoo>, <zip>, <zeal> and <zinc> are con-
fused with <sue>, <sip>, <seal> and <sink> resulting in
semantic confusion. Furthermore, what is linguistically signifi-
cant about [z] is that even when it does not cause phonological
confusion, it certainly causes a very noticeable phonetic accent
because of the high frequency of occurrence of /z/ sounds in
English. The failure to pronounce the [z] is extremely noticeable
even among otherwise fluent Hispanic speakers of English. Just
recently, CNN television broadcast a program on the techniques
of interrogation of foreign terrorists in which a CIA high-ranking
officer of Hispanic background pronounced most of his [z]
sounds as [s] in spite of the fact that his English was fluent and
native-like. I vividly recall that the word <pleasant> =
[plɛznt] was rendered [plɛsnt]. English words such as <is>,
<result>, <preserve>, etc. in which the <s> is pronounced
[z], are traditionally realized as [s] by Hispanics.
It is also noticeable that for Greek learners of English, the
alveolar sounds [s, z] are often replaced by the alveolo-palatal
sounds [ɕ, ʑ], which are auditorily and impressionistically confused
with the [ʃ, ʒ] pair by the native listener. The confusion is
not unexpected, as they actually sound like them because [ɕ, ʑ]
are located exactly half-way between [s, z] and [ʃ, ʒ] in terms of
their place of articulation. It is worth noting that because Greek
lacks the postalveolar sibilant pair [ʃ, ʒ], their <σ> sounds
like Sean Connery's <s>.6 Traditionally, classical Greek historians
have confused the alveolar pair [s, z] or the postalveolar
6
http://greek.kanlis.com/phonology.html.
one [ʃ, ʒ] of other languages with their alveolo-palatal sounds
[ɕ, ʑ]. A piece of historical evidence that supports this claim is
embedded in the chronicles of ancient Greek historians documenting
Alexander the Great's invasion of the Middle East. For
example, notice that the English word Assyria is the anglicized
rendition of the Greek name Ασσυρία in which the geminated
sigma <σσ> originally represents the geminated <shsh> = [ʃʃ]
in the ancient Assyrian name <ashshur> = [aʃʃur]. This indicates
that the Greek historians identified the ancient Assyrian [ʃ]
as sigma <σ>, which English translators, in turn, rendered as
<s>, thus creating the word <Assyria>.
7.2.4. English Plosives: /p b, t d, k g/

These three pairs of English plosives are quite unmarked (com-
mon), as in many languages, but learners of English as L2 with
different linguistic backgrounds may encounter some problems
in pronouncing them. Before citing some examples of pronunciation
problems, it is necessary to point out that the voiceless plosives
/p, t, k/ of English are aspirated, i.e., [pʰ, tʰ, kʰ], which
means they are followed by a puff of air when released. English
has voiceless unaspirated variants of the aspirated ones
only in initial clusters after [s] such as [sp__, st__, sk__]. Hence,
the aspirated plosives in <pin> = [pʰɪn], <till> = [tʰɪl] and
<kin> = [kʰɪn] become unaspirated (or rather deaspirated) as
in <spin> = [spɪn], <still> = [stɪl] and <skin> = [skɪn].
Unfortunately, sounds that have only a phonetic occurrence in a
language are not recognized phonologically by the brain of the
native speaker.
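The contextual nature of aspiration can be sketched as a rule that maps a broad transcription to a narrow one. The sketch is deliberately simplified and rests on assumptions: it covers only the word-initial position illustrated by the examples above, and the names VOICELESS_PLOSIVES and narrow are hypothetical.

```python
VOICELESS_PLOSIVES = {"p", "t", "k"}


def narrow(broad: str) -> str:
    """Add the aspiration mark to a word-initial voiceless plosive.

    A plosive that follows an initial [s] is not word-initial,
    so it is left unaspirated.
    """
    if broad and broad[0] in VOICELESS_PLOSIVES:
        return broad[0] + "ʰ" + broad[1:]
    return broad


for word in ("pɪn", "spɪn", "tɪl", "stɪl", "kɪn", "skɪn"):
    print(word, "->", narrow(word))
```

The native speaker applies this rule without being aware of it, which is why the aspirated and unaspirated variants are heard as the same sound.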
With some learners of English the presence or absence of
aspiration may lead to phonological and/or phonetic accent. For
instance, the Thai language has phonologically three-way plo-
sive contrasts, namely voiced, voiceless unaspirated and voiceless
aspirated as in the following examples in table 7.1, below.7
7
Handbook of the international phonetic association, Cambridge University
Press, 1999.
Voiced Plosives                  Example    Meaning
[b]                              [baːn]     to bloom
[d]                              [daːn]     calloused
Voiceless Unaspirated Plosives
[p]                              [paːn]     birthmark
[t]                              [taːn]     sugar palm
[k]                              [kaːn]     act
Voiceless Aspirated Plosives
[pʰ]                             [pʰaːn]    belligerent
[tʰ]                             [tʰaːn]    alms
[kʰ]                             [kʰaːn]    shaft
Table 7.1. Three-way plosive distinction in Thai language
Thai learners of English are not expected to encounter pronunciation
difficulties with the English aspirated plosives [pʰ, tʰ, kʰ]
except maybe in final position, where Thai speakers pronounce
them with no audible release whereas in English there is an option
to release them (Kanokpermpoon, 2007). They may also
have a perceptual problem in distinguishing the voiced plosive
from the voiceless unaspirated plosive. Conversely, English
learners of Thai will encounter serious problems in mastering
the production of the voiceless unaspirated plosives and distin-
guishing them from the voiced ones.
Hispanic learners of English do not have a phonological
problem with English plosives; nevertheless, they might demonstrate
a certain degree of phonetic accent in rendering the aspirated
plosives, especially in initial position, as unaspirated
since their own voiceless plosives are unaspirated in nature. Because
of the high frequency of occurrence of initial plosives in English,
a rendition of English by someone who fails to produce the
aspiration gives the overall pronunciation a distinct phonetic
accent. This has been typical of some of my Hispanic students.
Arabic has all the plosives of English8 except the voiceless
bilabial plosive /p/. Its absence is by far the most significant
phonological problem for Arab learners of English as far as
segmental phonology is concerned. The failure to produce /p/,
and its almost exclusive replacement with /b/, riddles the Arabic
rendition of English with /b/s. It creates an unmistakable
phonological and phonetic accent.
Also noticeable in the rendition of English by Turkish, Farsi and
Greek learners is the replacement of the velar plosives /k/
and /g/ with palatal plosives: /c/ or /cʰ/ for the voiceless and
/ɟ/ for the voiced. Although such a shift does not add up to a
phonological accent, it does, however, color the rendition of
English discourse with a distinct palatalized impression which,
subsequently, manifests a readily distinguishable phonetic accent
that runs throughout their discourse in English. Such palatal
plosives are most distinct among Iranian speakers of English
because of the palatal plosives in Farsi. Just to demonstrate the
Farsi substitution of English velar plosives with palatal plosives,
an acquaintance of mine used to pronounce the title of a cultural
festival in the village of Skokie, Illinois, "Coming Together in
Skokie," as [cʌmɪn təɟɛdər ɪn scoci] for [kʌmɪŋ təgɛðər ɪn
skoki]. Although there is no phonological accent, the phonetic
one is more than a mouthful.
7.2.5. Labio-Dental Fricatives /f, v/

This pair of sounds can be a source of difficulty for learners of
English of different linguistic backgrounds. For instance, in the
Tagalog language of the Philippines, this pair is missing;
therefore, the pair is replaced with /p/ and /b/.9 The substitution of
such a pair in combination with the substitution of the English
pair /θ, ð/ with /t, d/ results in a very serious phonological
accent in the rendition of English by Tagalog speakers. Typically,
8
It does not have /g/, but it is widely found in local vernacular Arabic
and many dialects.
9
For an interesting impersonation of English by a native speaker of Tagalog,
watch the video at: (http://www.highpoint-ieltsblog.com/2011/03/filipino-pronunciation.html).
when the people of the Philippines are asked about their identity
as a people they reply "Pilipino" not "Filipino."
In Farsi and Turkish, the English /v/ is realized differently,
most likely as the labial-velar approximant [w] or the labiodental
approximant [ʋ]. When replaced with the former it may lead
to a phonological accent in pronouncing <vine> as <wine>
or <veal> as <wheel>; otherwise, it merely leads to a pho-
netic accent. In Assyrian (Modern Aramaic), /v/ has a wide
range of phonetic realizations (phonetic variants) depending on
different regional and tribal dialects as demonstrated in table
7.2, below.
Realizations of /v/    Phonetic Description
[w]                    Labial-velar approximant10
[ɥ]                    Labial-palatal approximant
[ʋ]                    Labial-dental approximant
Table 7.2. Different phonetic realizations of the English phoneme
/v/ in Assyrian.
7.2.6. The Affricates /tʃ, dʒ/

Although these two sounds in English do not have specific let-
ters to signal them, they are in fact quite common sounds and
can be a cause of mispronunciation for L2 learners of English. It
has already been pointed out earlier that Germans have difficulty
with /dʒ/, which they replace with its voiceless counterpart
/tʃ/. The French do not have this pair; therefore, they substitute
the sounds with their fricative counterparts /ʃ, ʒ/.
For Arab learners of English, the difficulty of such sounds
may be relative depending on which Arabic dialect is in the
background. For instance, speakers of eastern Arabic dialects
such as those of Iraq, Saudi Arabia and the Gulf have much less difficulty
10
The feature labial with [w] and with [ɥ] should really be bilabial
because both lips are involved in conjunction with the tongue configuration in
the velar region for the former and the palatal region for the latter.
in pronouncing /tʃ/ and /dʒ/ as opposed to the speakers
of western dialects, especially those of Lebanon, Syria and Egypt. In the
first two countries, they are usually replaced with /ʃ/ and /ʒ/,
respectively, whereas in Egypt the /dʒ/ tends to become a voiced
velar plosive /g/ and the /tʃ/ is non-existent, and if attempted it
will become a /ʃ/.
Interestingly, the pair /tʃ, dʒ/ does not occur in Greek; therefore,
Greek learners of English tend to replace the pair with the
alveolar affricates /ts, dz/. For instance, a Greek learner of English
is expected to pronounce <teach> as [tits] and <judge>
as [dzʌdz]. It is a substitution that generates a distinct phonetic
accent.

Because English is the most widely used language throughout
the world, it has been used as the basis for reflecting the broad
variety of phonological and phonetic accents that learners of
English as L2 demonstrate in their rendition of its consonants.
This, however, should not conceal the fact that English learners
of other languages encounter a broad array of phonological and
phonetic accents. Another aspect of the English consonantal sys-
tem that should be highlighted is that most consonants of Eng-
lish are phonetically unmarked (common). This fact suggests
that English learners may encounter their most salient difficulties
when tackling the marked (uncommon) sounds of other languages
such as the gutturals and emphatics of Arabic, retroflexes
of Hindi, palatal plosives of Farsi and above all, the clicks in
some African languages.
ACCENT-CAUSING VOWELS
8.1. SALIENT FEATURES IN GENERAL VOWEL DESCRIPTION

Foremost among the phonetic features used for vowel descrip-
tion are the terms quality and quantity. Quality is defined in
terms of tongue-height, tongue-position and lip-shape, etc. and
their combined acoustic impact on the ears of the listener. As for
quantity, it is defined in terms of shortness vs. length and/or
laxness vs. tenseness. A more comprehensive option for vowel
systems identification has been introduced in the form of cen-
tripetal vs. centrifugal dichotomy which affords a more general
model (Odisho, 1992). Because of some differences between the
GAE and RP vowel systems and the manner in which those differ-
ences are transcribed, the RP system will be used to demonstrate
the nature of the centripetal system. Generally speaking, this system
tends to have a schwa vowel, [ə], with the rest of the vowels
displaying different degrees of quantitative and/or qualitative val-
ues in unstressed positions.
In figure 8.1, below, the vowel symbols indicate only dif-
ferences in the quality of RP vowels, while in 8.2 the vowels are
impressionistically marked phonetically for full-length and the
short are left unmarked with schwa being the shortest and most
reduced in quality. In the description and transcription of RP
vowels, Gimson's (1967) model is followed because his transcription
of individual vowels indicates both quality and quantity
features, unlike Daniel Jones's model (1956) where the emphasis
is primarily on quantity (i.e., either short or long). Gimson's, in
my opinion, is the most innovative and practical system, especially
in teaching advanced and efficient cross-language comparative
pronunciation with emphasis on accent reduction. In
this study, the only divergence from Gimson is in the identification
of the English vowel [æ]. Almost all those who have dealt
with this vowel have identified it as short. However, as a phonetician
and native speaker of Arabic, I beg to differ with this
identification because the Arabic vowel Alif, which is the
long counterpart of Fatha, sounds phonetically, especially in
non-emphatic contexts,1 almost identical with English [æ] as in
the English and Arabic words in table 8.1, below.
Arabic words   Meaning    Arabic words   Meaning      English words
with Fatha                with Alif                   with [æ]
<هَمّ>           worry      <هامّ>          important    <ham>
<بَتّ>           decision   <بات>          stayed       <bat>
<سَمّ>           poison     <سامّ>          poisonous    <Sam>
<فَتّ>           weaken     <فات>          elapsed      <fat>
<سَدّ>           dam        <ساد>          prevailed    <sad>
Table 8.1. Words matching Arabic Alif with English vowel [æ].
Impressionistically, the Arabic words in column #3 and the English
ones in #5 are pronounced almost identically. Thus, in contexts
other than with Arabic emphatics, the gutturals and <r>,
Alif has a phonetic variant that is identical in vowel quality
with English [æ], especially when the pronunciation adheres
to that of Modern Standard Arabic and Classical Arabic. It is
true, though, that Fatha is not of exactly the same
vowel quality as Alif, but the latter is identical with English
[æ] in both quality and quantity. This is the rationale for
assigning the length mark to the English vowel [æ]. Marking
such quantity and quality differences is extremely important,
especially for learners of English whose native languages tend to
have centrifugal systems.
Essentially, the centrifugal system displays two tendencies.
First, it avoids any quantity (length)
contrasts. Second, it leans in the direction of avoiding vowel reduction,
especially in the form of a schwa. Doubtless, there will
1
Not adjacent to < >.
be some difference in the quantity of the unstressed vowels versus
the stressed ones, but this is hardly a striking difference. According
to such templates for vowel identification, English will
be a typical representative of a centripetal system as opposed to
Spanish, which will be a typical representative of a centrifugal
system (figure 8.3) with Arabic (figure 8.4) falling half-way be-
tween the two. Notice that the vowels for Arabic indicate major
quantitative contrasts coupled with a limited degree of quality
difference which is not readily perceptible by people with no
professional phonetic experience.
Figure 8.1. The English RP simple vowel system: a typical centripetal one.
Figure 8.2. The English RP simple vowel system: a typical centripetal one with relative length marks.
Figure 8.3. The Spanish vowel system: a typical centrifugal one.
Figure 8.4. The Arabic vowel system: halfway between centripetal and centrifugal.
Any cross-language teaching of pronunciation to avoid phonological
and phonetic accent can hardly be effective if the approach
handles the vowels individually and as decontextualized
segments. An efficient mastery of a different vowel system cannot
be attained without a holistic approach to the teaching of
the systems through their most characteristic features. Even minor
phonetic differences in a given language may have phonological
weight in the target language. Thus, in cross-language
teaching of pronunciation any phonetic differences matter, if
not to avoid being caught in the phonological trap then at least
for dodging a phonetic accent. To demonstrate, those learners
with a native centripetal vowel system should be trained on
both the avoidance of schwas and of vowel quality and quantity
reduction when learning languages with centrifugal systems.
Conversely, learners with centrifugal systems should be trained
on the production of schwas and vowel quality and quantity
reduction in appropriate contexts and conditions.
8.2. THE VOWEL SYSTEM OF ENGLISH

Obviously, with English being the native language of several countries
and also the most widely used international language, one
expects some differences, or perhaps even some major differences,
within its different standard varieties. To avoid drifting
sideways into minutiae, the focus will be on GAE simply because
it is gradually becoming more commonly used internationally.
Occasional mention of RP will be made when and where neces-
sary.
8.2.1. Simple Vowels of General American English

There are some noteworthy differences between the vowel systems
of GAE and RP. Generally speaking, the two systems are
more different in the domain of diphthongs than in simple vowels.
With regard to simple vowels, there are some differences in
the qualitative and quantitative values assigned to the same phonetic
symbols. For instance, in GAE the vowel phonemes /e/ as in
<bait> and /o/ as in <boat> are used to indicate simple
vowels, whereas for RP the quality of vowels in the preceding
two words is diphthongal and is marked by the symbols /ei/ and
/ou/, respectively. However, a more striking difference is in the
absence from GAE of the vowel quality [o] = [ɒ] as in the RP rendition of
the words <hot, dot, com, pot, shot>, etc.
Another divergence in GAE away from the RP system is the
emergence of what are known as r-colored vowels such as [ɚ]
and [ɝ], also named unstressed schwar and stressed schwar,
respectively.2 Table 8.2, below, represents GAE simple vowels
with examples. The relative phonetic and/or phonological quantitative
differences of vowels are marked with the length mark [ː]
and its absence indicates shortness. Highlighting such phonetic/phonological
differences is absolutely essential in cross-language
teaching of pronunciation, without which learners can
hardly avoid manifesting accent.
2
MacKay, 1978.
Vowel Quality Symbol    Vowel Quantity Indicator    Example
beat
bit
bait
bet
bat
father; shot
boot
book
but
boat
bought
about
writer
bird
Table 8.2. Simple vowels of General American
Several points are worthy of consideration. First, the English
vowel system, whether GAE or RP, is rich in quality compared
to many vowel systems, the most widespread of which is the
five-vowel system.3 Second, marking the quality and quantity
differences sheds better light on the nature of the vowel systems,
a fact that is extremely significant to highlight in comparative
phonetic and phonological studies, especially when accent reduction
and remediation are targeted. To demonstrate, if one has to
phonetically transcribe the Spanish vowel in the word <sin>
(without) compared to <sin> and <seen> of English, one has
no choice but to mark the vowel in Spanish <sin> with the half-
length mark [iˑ] because it is slightly different in both quality
and length compared to English <sin>. English <seen> has to
be transcribed as [siːn] to set it apart from both English [sɪn],
with a very short vowel, and Spanish [siˑn] with a half-long
3
Ladefoged and Maddieson, 1996.
vowel. Third, in teaching comparative vowel systems such as
English and Spanish, for instance, it is very difficult to enable
the learners on both sides to master each other's vowels without
highlighting the tiny phonetic differences of quality and quantity
across the vowel systems. Typically, Russian, Italian, French,
Japanese and Spanish learners of English seriously confuse the
English vowels such as in <dim> vs. <deem> and <pull>
vs. <pool>. They reduce each pair of words to one in the form
of half-long vowels as [diˑm] and [puˑl].
As surveyed above, the differences in the vocalic systems of
the two main varieties of English are many but, fortunately,
they mostly result only in phonetic accent between its native speakers;
however, at times, they may result in phonological accent as in
the examples in table 8.3, below:
GAE Pronunciation    RP Pronunciation
<cod> [kɑd]          <card> [kɑːd]
<hot> [hɑt]          <heart> [hɑːt]
<pot> [pɑt]          <part> [pɑːt]
<shop> [ʃɑp]         <sharp> [ʃɑːp]
Table 8.3. Phonetic differences in vowels between GAE and
RP can be phonological.
Thus, a combination of phonetic and, in rare cases, phonological
differences makes the study of vowels in RP and GAE an area
worthy of attention in teaching pronunciation, especially when
making a choice between the teaching of one or the other varie-
ty.
8.3. SELECTIONS OF CROSS-LANGUAGE ACCENT-CAUSING VOWELS

In this section, attempts will be made to identify some of the
most salient and readily perceptible features of both phonologi-
cal and phonetic accent of learners of English as L2 with differ-
ent linguistic backgrounds. Evidently, there will be limitations
on what this section will include partly because of my limited
linguistic knowledge and experience with languages and partly
to avoid repetition of similar information. Additionally, at the
end of each sub-section, the experience of learners will be re-
versed, i.e. native English speakers learning other languages.
8.3.1. Hispanic Learners of English Vowels

It has already been demonstrated that English and Spanish vow-
el systems are almost maximally contrastive, the former being a
centripetal system and the latter a centrifugal one. Due to this
radical difference, mastering the vowel system of English be-
comes the foremost phonological and phonetic difficulty for
Hispanic learners of English. The difficulty is not simply at-
tributed to differences in quality and quantity of vowels; rather,
the dynamics that govern the qualitative and quantitative fea-
tures of the vowels involved further complicate the problem. It
is the combination of both factors that leads to serious
phonological and phonetic accent for Hispanic learners of Eng-
lish.
A simple analogy in this regard may be beneficial in help-
ing the learner envisage the difference between the two vowel
systems. If each vowel in English and Spanish is likened to a car
and the slot of the vowel to a garage, then Spanish will have
five garages with five cars in them, while English will have
twelve garages with twelve cars in them. Thus, when a Hispanic
intends to learn English, he, subconsciously, has no choice but to
allow the parking of more than one car in each of his five garag-
es as it is schematically represented in figure 8.5 below, where
the arrows show which vowels in English may mistakenly be
identified as the same vowel in Spanish. In linguistic terms,
Hispanic learners will take two or three vowels in English as one
vowel. And this is the most prominent pronunciation problem
for Hispanics embarking on learning English. To demonstrate,
each pair of the English vowel contrasts such as in the pairs
<bid> = [bɪd] vs. <bead> = [biːd] or <full> = [fʊl] vs. <fool>
= [fuːl] will go into one slot of [i] for the former pair and [u] for
the latter one. Most important of all, in terms of both quality and
quantity, neither [ɪ] nor [iː] is [i], and neither [ʊ] nor [uː] is [u].
Phonetically, as well as phonologically in this case, these are
vowels that should be precisely and carefully taught in
an English/Spanish cross-language teaching of pronunciation.
These examples serve to highlight one of the most challenging,
if not the most challenging, problems for L2 learners of English
whose vowel systems are very limited in quality and quantity.
At times, the confusion can be extremely embarrassing when
some obscene or taboo words are involved such as <beach> vs.
<bitch>, <sheet> vs. <shit>, <cheek> vs. <chick> and
<wiener>4 vs. <winner>.
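The garage analogy can be sketched in Python as a mapping from English vowels to Spanish vowel slots. This is only a sketch under stated assumptions: it includes just the two mergers named in the text and not the full mapping of figure 8.5, and the names SPANISH_SLOT, WORDS, perceived and merged_pairs are hypothetical.

```python
# Only the two mergers named in the text; the full mapping of figure 8.5
# is not reproduced here.
SPANISH_SLOT = {"ɪ": "i", "iː": "i", "ʊ": "u", "uː": "u"}

# Each word is a tuple of segments so that a long vowel counts as one unit.
WORDS = {
    "bid": ("b", "ɪ", "d"),
    "bead": ("b", "iː", "d"),
    "full": ("f", "ʊ", "l"),
    "fool": ("f", "uː", "l"),
}


def perceived(segments):
    """Park every English vowel in the Spanish slot it is mistaken for."""
    return tuple(SPANISH_SLOT.get(s, s) for s in segments)


def merged_pairs(words):
    """Group the words that become identical once the vowels share a slot."""
    groups = {}
    for word, segments in words.items():
        groups.setdefault(perceived(segments), []).append(word)
    return [sorted(group) for group in groups.values() if len(group) > 1]


print(merged_pairs(WORDS))
```

Every group the function returns is a pair of distinct English words that the learner's five-slot system no longer keeps apart.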
Figure 8.5. Misidentification of English vowels by Hispanic learners.
The above schematic diagram indicates that the English vowel
system is a complex one, whereas the Spanish system is a rela-
tively simple one as demonstrated with the words in table 8.4,
below.
4
In its vulgar sense. These are some of the embarrassing pairs that one
may hear in classroom situation.
Vowel Grapheme    Vowel Phoneme    Example
<a>               /a/              <paso>
<e>               /e/              <peso>
<i> or <y>        /i/              <piso>
<o>               /o/              <poso>
<u>               /u/              <puso>
Table 8.4. Simple Spanish five-vowel system.
In fact, some describe it as the essence of simplicity and elegance
(Stockwell and Bowen, 1965). What drives the two systems
further apart is the substantial difference in the dynamics
of vowel reduction in English, especially with regard to word
syllable structure and the location of primary stress within a
word or stretch of words. Spanish has hardly any noticeable var-
iation of vowel quality and quantity in different contexts and
across its different dialects (Stockwell and Bowen, 1965). As for
its diphthongs, it is quite natural for a simple vowel system to
have a simple and basic combination of diphthongs. The most
common diphthongs in Spanish are the following:
Diphthong    Example
/ei/         ley, reina
/ai/         hay, taita
/oi/         soy, coy
/au/         auto, chao
Table 8.5. Simple Spanish diphthong system.
The limited variety of diphthongs in GAE compared to RP, especially
with diphthongs in words ending with <r>, where in RP the
<r> is deleted and a schwa [ə] is inserted as in <here> =
[hiə] and <there> = [ðɛə], makes the transition of Hispanics
into GAE somewhat easier. However, the above observation
about the easier transition into GAE compared to RP should be
considered with caution because one can unequivocally state
that phonetically every single simple vowel or diphthong in the
two languages is virtually different from what phonologically
may be considered a counterpart. For instance, phonologically
one tends to say that both languages have the /au/ diphthong;
nevertheless, phonetically, this is not exactly accurate because
the two constituents that make up the diphthong in each lan-
guage are phonetically different in the first place. At least, one
can state that the English diphthong /au/ is originally composed
of an [ɑ] vowel gliding into an [ʊ] vowel, whereas the Spanish
diphthong is the coalescence of the [a] and [u] vowels. This is
a strictly phonetic assessment that many teachers of pronunciation
may not be professionally trained to make. From
this perspective, the intention should be to secure as near-native
a pronunciation as possible, one that will dispel any confusion
in meaning on the part of both listeners and speakers.
8.3.1.1. English Learners of Spanish Vowels

Generally speaking, English learners of Spanish are expected to
reverse what Hispanic learners of English do. They have to learn
to shrink the domain of vowel quality diversity and eliminate
the influence of schwa completely. Fortunately for such learners,
because the vowel inventory of English is much richer than that of
Spanish, the problems will be more phonetic in nature
than phonological. Nevertheless, it is worth mentioning that the
dominating tendency of vowel reduction in English is expected
to be the culprit for many unwanted qualitative and quantitative
changes in the proper production of Spanish vowels. This ten-
dency, if left unchecked, can become the cause of a serious dis-
tortion of the overall rendition of the pronunciation of Spanish
primarily in the form of imposing a stress-timed rhythm on a
syllable-timed one for which Spanish is well-known. A native
speaker of English should never allow himself to be misled in
pronouncing the Spanish word <color> = [kolor], which is
orthographically identical with the English one, as [kʌlɚ] or
[kʌlə]. Unlike English, which has remarkable inconsistency of
sound and orthography, Spanish has one of the highest consist-
encies among known languages.
8.3.2. Arab Learners of English Vowels

Arabic and English are two languages that are drastically differ-
ent in language family, sound systems and orthographic systems.
English belongs to the Indo-European family, while Arabic is a
typical Semitic language. The sound systems of the two lan-
guages differ extensively in consonants, vowels, stress placement
and the dynamics.
Arabic, like English, is the native language of a large popu-
lation inhabiting a very large area. Consequently, it has a wide
range of different regional, social and ethnic dialects. Some familiarity
with a few of the most salient linguistic characteristics of the
dialects is important for any learner of Arabic because even the
so-called Modern Standard Arabic (MSA) is regionally influ-
enced by those dialects. In fact, one can easily distinguish
among different standard varieties of Arabic such as Iraqi Stand-
ard Arabic, Egyptian Standard Arabic and Lebanese Standard
Arabic etc. These standard variants are not only different in
segmental (consonants and vowels) pronunciation, but also in
lexicon and, at times, even in the overall rhythm and melody,
especially if North African Arabic varieties are considered.
One typical deviation of the dialects away from MSA is the
enhancement of the basic three vowel-quality system into a five
vowel-quality one by adding the mid vowels [e] and [o].
The three simple vowels of Arabic usually combine in produc-
ing the diphthongs [ai] and [au] as in table 8.6, below.
Thus, the above enhancement of vowel quality, through the
influence of the Arabic dialects, does, somewhat, help Arab
learners of English in handling more English vowels. Neverthe-
less, the system remains restricted in quality compared to Eng-
lish. Essentially, it is a simple triangular system of maximally
differentiated vowels, /i, a, u/. As a corollary to the restricted
vowel quality in Arabic, its diphthongs are also limited in number.
This is why some linguists are reluctant to accept the existence
of diphthongs in Arabic. They opt to identify them as a vowel
plus a semi-vowel, <ي> = [j] or <و> = [w].
Word      Meaning     SA in IPA    DA in IPA
<بيت>     house       [bajt]       [bet]
<بين>     between     [bajn]       [ben]
<ليل>     night       [lajl]       [lel]
<كيف>     how         [kajf]       [kef]
<سيف>     sword       [sajf]       [sef]
<كون>     universe    [kawn]       [kon]
<يوم>     day         [jawm]       [jom]
<بول>     urine       [bawl]       [bol]
<لون>     color       [lawn]       [lon]
Table 8.6. Vowel contraction in Arabic and creation of the mid vowels [e] and [o].
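The contraction shown in table 8.6 is regular enough to be stated as a simple rewrite rule. The following Python sketch is a minimal illustration only; the function name `contract` and the dictionary `TABLE_8_6` are hypothetical labels, and the forms are the simplified transcriptions from the table.

```python
# Standard Arabic (SA) form -> dialectal Arabic (DA) form, as listed in table 8.6.
TABLE_8_6 = {
    "bajt": "bet", "bajn": "ben", "lajl": "lel", "kajf": "kef", "sajf": "sef",
    "kawn": "kon", "jawm": "jom", "bawl": "bol", "lawn": "lon",
}


def contract(standard_form):
    """Contract the sequences [aj] and [aw] into the mid vowels [e] and [o]."""
    return standard_form.replace("aj", "e").replace("aw", "o")


# Every SA form in the table yields its DA counterpart under the rule.
for sa, da in TABLE_8_6.items():
    assert contract(sa) == da

print(contract("bajt"), contract("jawm"))  # bet jom
```

The rule only covers the nine words of the table; it is not offered as a general account of dialectal Arabic phonology.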
The nature of the problems facing Arab learners of the English
vowel system is, overall, typical of a transition by speakers of a
non-centripetal vowel system to a centripetal one. Such learners
of English are usually pressed for enhancement and diversification
of their vowel quality range alongside the mastery of vowel
reduction and schwa production. The most typical feature of the
centripetal system is the existence of a schwa vowel [ə]. The
predominance of a schwa [ə] in English and its absence in Arabic
is the main culprit behind the tendency of the so-called
word-deflation in English vs. word-inflation in Arabic
(Odisho, 2009), which is one of the primary causes of accent
among Arab speakers of English (Odisho, 2013).
To demonstrate, the reader is referred to the manner in
which the full name of the former U.S. President William Jefferson
Clinton is transliterated in Arabic. Its expected traditional
Arabic transliteration would be <> 5 instead
of a more accurate rendition of < > . It is clear
that in the English pronunciation of the President's name there
are no long vowels, whereas in its Arabic rendition there emerge
five long vowels which, in turn, bring about a major shift in the
5 http://ar.wikipedia.org/wiki.
rhythmic structure of the name and its pronunciation as visually
demonstrated by the following stress pattern in English rendi-
tion vs. the stress pattern in its Arabic rendition
with larger dots standing for the stressed syllable
in each case. Such a shift in stress pattern is the most powerful
source of accent generation by Arab speakers of English includ-
ing those who are highly educated (Odisho, 2013).
8.3.2.1. English Learners of Arabic

English learners of Arabic bring with them a centripetal vowel
system with a broad range of vowel qualities dominated by a
schwa. Consequently, the first thing they have to do is to learn
how to restrain their strong inclination toward vowel quality
diversity. For instance, they have to eliminate the use of a schwa
as well as any tendency in the direction of schwaization and
vowel reduction. Equally importantly, they have to try carefully
to maintain the vowel quality of Arabic vowels. They should not
attempt to render the Arabic past tense forms, which are pre-
dominantly formed with triliteral consonantal roots and vocal-
ized with three < >vowels such as < = > //, < = >
// and < > = //, as // and //,
respectively. Additionally, even the stressed vowel [a] may be
replaced with English []. Such a substitution of vowels does
not only result in a shift in vowel quality and relative quantity,
but, most importantly, in the overall rhythm. In the case of learners
of Arabic with an RP English background, there is yet one
more shift in vowel quality in the form of replacing some long
vowels of Arabic by English diphthongs, as in words with a long
vowel in pre-r position of a syllable such as <سفير> = /safiːr/,
<سمير> = /samiːr/ and <كبير> = /kabiːr/, which are pronounced as
/səfiə/, /səmiə/ and /kəbiə/, respectively.

Vowels play a very significant role in generating accent in cross-
language situations. There are four main reasons for this role.
First, vowel systems across languages can be drastically different.
Second, vowels do not have a well-defined contact area during
their articulation; they are the result of a configuration rather
than actual contact. Usually, it is more difficult to form a
configuration than to execute definite contact. Third, except for
the configuration of the lips, the other two constituents of vowel
production, namely location and size of the narrowing, do not
yield themselves easily to precise visual, kinesthetic and propri-
oceptive assessment. Fourth, vowels carry the weight of stress
within words and determine the nature of the overall rhythm.
Unfortunately, in the traditional teaching of pronunciation, in
general, and accent, in particular, there has been more emphasis
on consonants than on vowels, and this is partially the reason
behind the ineffective approach of many instructors, especially
those who depend on the so-called phonics approach to teach-
ing pronunciation.
CHAPTER 9: ACCENT-CAUSING SUPRASEGMENTALS
9.1. A DESCRIPTION OF THE MOST SALIENT FEATURES OF SUPRASEGMENTALS

For many people, especially those without any linguistic orienta-
tion, the most natural categorization of sounds is into conso-
nants and vowels. However, the design of human speech is too
structurally and systematically complex, intricate and diversified
to be straitjacketed in the dichotomy of consonants and vowels,
which usually represent short segments of speech. Obviously,
the use of the attribute "short" for some segments (individual
segments) implies the presence of units that represent "long"
segments (relevant to more than one segment or a stretch of
segments), which are often known as suprasegmentals. Without
suprasegmentals, segmental features alone would not suffice to
carry the complex and open-ended communicative message of
human speech. A combination of segmental features and multi-
length suprasegmentals generates tremendous structural and
systemic diversity of sound units, which, in turn, account for the
multi-layered and multidimensional construct of human speech.
A relevant aspect of the study of suprasegmentals is to decide
what a long segment is. The response to such a question de-
pends on the linguistic school and perspective one follows and
the targeted linguistic refinement. In the early works of struc-
tural linguistics, the primary emphasis was on stress and intona-
tion; duration (length) and juncture were also occasionally
treated. In the tradition of prosodic analysis, of the London
school of linguistics, the longer segments called prosodies are not
necessarily confined to the traditional stress, intonation, dura-
tion and juncture. A prosody, according to prosodic analysis, may
represent any feature that extends over more than one segment
or pervades throughout a stretch of segments. For instance, the
sound phenomena named with the suffix -ized, such as labialized (lip
rounding), nasalized, velarized, palatalized and pharyngealized, may all be
handled as prosodies. For a simple and straightforward illustration
of the nature of prosody, let us examine the differences between
the sounds of the following minimal pair: <seep> =
[siːp] vs. <soup> = [suːp]. A standard phonemic approach will
identify three phonemes in each word with the difference being
confined to the vowel element. Unlike the phonemic approach,
prosodic analysis will identify three segmental units in each
called phonematic units and one prosody represented by the lip-
spreading in the former and lip-rounding in the latter each of
which pervades throughout the whole word; hence, both lip
spreading and lip-rounding are prosodies or suprasegmentals.
With this broadening of the domain of suprasegmentals, the
phonetic and/or phonological relevance of suprasegmentals will
not only be associated with syllables, as the shortest units in
speech, and the sentence as the longest unit, but also will cer-
tainly include any stretch of speech or discourse. This trend is
highly consistent with the recent emphasis on the study and
teaching of language at discourse level. Consequently, the teach-
ing of pronunciation with the inclusion of suprasegmentals, in
general, and the inclusion of articulatory settings (Honikman,
1964) or phonetic settings (Laver, 1980; 1994), in particular,
creates a primary shift in emphasis and direction. The significance
becomes even greater at higher levels of proficiency acquisition.
In special cases when accent acquisition and accent reduction
are targeted, orientation in suprasegmentals and articulatory
settings is vital; in fact, it becomes indispensable
for refined pronunciation.
Generally speaking, the formal study of suprasegmentals
receives attention only after having exhausted the study of the
segmental features. In the beginning, the segmental elements
(consonants and vowels) receive more attention because they
are more tangible by virtue of their easily identifiable nature.
They have become even more tangible and identifiable in lan-
guages that have been reduced to writing. In writing, especially
an alphabetic system, the major target has been the assignment
of symbols to segmental sounds only. Few languages have cared
to incorporate prosodic features into their orthographies. Today
no linguistic description and study of a given language is complete
and coherent if its prosodic aspects are left untouched.
This advancement in the description of languages has led to the
application of the findings in different L1 and L2 language in-
struction situations. It is interesting to note about language edu-
cation in the United States that second and foreign language
education and instruction are far more linguistically geared than
native language instruction (i.e., English language arts). Perhaps
two reasons may account for all or part of the discrepancy. First-
ly, native language acquisition is completed in a subconscious,
effortless and automatic manner due to ample exposure to au-
thentic context-embedded and situation-embedded language
materials. Secondly, under normal native language acquisition
by normal individuals, there does not seem to be much need for
formal linguistic intervention and support. Unlike the native language
acquisition situation, second and foreign language learning
situations, especially with adult learners, may require more
formal and linguistics-based intervention strategies to better
systematize and enhance the learning process. This latter fact
may account for the use of more linguistics-oriented teaching
materials and textbooks in ESL and bilingual education instruc-
tion than in English language arts instruction (at least in the
U.S.). For instance, in English language arts, vowels are still taught
as if they were five in number, and occasionally six when <y> is
added. Consonant clusters (the so-called blends) may still be determined
on the basis of letters rather than sounds. Letter (grapheme), sound
(phoneme) and letter-name (nomeneme)1 identities are easily
mistaken for each other. Phonics is a mere letter-based tech-
nique that fails to handle proper teaching of pronunciation.
Many of those misconceptions are less frequently encoun-
tered in ESL and bilingual language materials perhaps because
those two disciplines have developed in close connection with
applied linguistics.
1 This term was coined after the patterns of phoneme and grapheme, based
on the Latin root nomen (name), to designate letter-name (Odisho, 2004).
9.2. STRESS AND RHYTHM

Stress may have different interpretations from the speaker's or
the listener's standpoint. When the speaker's activity in producing
stressed syllables is in focus, stress may be defined in terms
of the greater effort that is exerted in the production of a stressed
syllable as opposed to an unstressed one (Lehiste, 1970). When
stress is defined from the listener's standpoint, the claim is often
made that stressed syllables are louder than unstressed syllables
(ibid). This is why Ladefoged (1982) tends to think that stress can
always be defined in terms of something a speaker does, while it
is difficult to define it from the listener's point of view.
To avoid those complications, it suffices to deal with stress in
terms of greater or lesser physiological effort by the speaker and
greater or lesser prominence by the listener who assesses promi-
nence as the overall index of greater loudness, length and higher
pitch.
Stress is, hence, primarily the result of greater physiological
effort exerted by the speaker at a certain point within a polysyl-
labic word and at repeated points within the flow of speech. A
greater respiratory effort makes a given syllable more prominent
and with the decrease in this effort, syllable prominence dimin-
ishes. A realistic division of the prominence continuum is to
identify three degrees of prominence to be associated with the
trichotomy of weakly stressed, medium stressed and strongly
stressed syllables. However, the dichotomy of unstressed and
stressed syllables has customarily been more dominant. The
term unstressed is only figuratively employed to subsume the
first two degrees of stress because the term is literally meaningless:
no portion of speech is produced without physiological
effort; consequently, every portion should have some prominence.
In other words, the unstressed syllables stand for the portions
with minimum stress.
It is possible to distinguish between stress assignment with-
in a word and within a sentence because within the latter it is
likely for words to undergo a shift in the location of stress or to
emerge with partial stress only. Within a word a certain syllable
sounds more prominent in relation to others, while in a sentence
a certain word or certain words sound more prominent in relation to the
rest. The former is called word stress and the latter sentence
stress.
Languages differ in the manner they use word stress and
sentence stress to signal linguistic/nonlinguistic variations.
Some languages show a strong tendency to retain the stress on a
certain syllable within the word regardless of the syllabic struc-
ture and the number of syllables. Obviously, in such cases stress
becomes highly predictable. Czech words tend to have stress
predominantly on the first syllable irrespective of the number of
syllables (Ladefoged, 1982), whereas Turkish tends to place
stress on the last syllable. In other languages, stress changes its
place according to several factors: foremost of all are the num-
ber of syllables, their internal structure and arrangement within
a word, the grammatical category of words and their status as
native or loan words.
To facilitate the rules of stress assignment, linguists use the
classificatory terms of ultimate, penultimate, and antepenulti-
mate to identify the structural location of stress. If no rules can
be formulated or if the rules can only capture certain instances
leaving the rest of the instances unaccounted for without some
ad hoc rules, then the predictability of stress becomes less likely
and its role as a distinctive feature between the lexical items
more striking. If stress is highly predictable, its function is pri-
marily that of determining the rhythm and the overall pronunci-
ation though it still can have a demarcative function, i.e. it helps
to signal the word boundary (Hyman, 1975). In languages
whose stress placement resists straightforward predictability, the
function of stress is no longer confined to pronunciation and
demarcation; it can assume a wide and diversified range of lexi-
cal and grammatical functions.
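The classificatory terms mentioned above simply count the stressed syllable from the end of the word. A minimal Python sketch follows; the function name `stress_location`, the hand-made syllabifications and the example words are hypothetical illustrations, not taken from the text.

```python
LABELS = {1: "ultimate", 2: "penultimate", 3: "antepenultimate"}


def stress_location(syllables, stressed_index):
    """Label the stressed syllable by its distance from the end of the word.

    `syllables` is a list of syllables; `stressed_index` is the 0-based
    position of the syllable carrying primary stress.
    """
    if not 0 <= stressed_index < len(syllables):
        raise ValueError("stressed_index must point at one of the syllables")
    from_end = len(syllables) - stressed_index
    return LABELS.get(from_end, f"{from_end} syllables from the end")


print(stress_location(["be", "gin"], 1))       # ultimate
print(stress_location(["po", "ta", "to"], 1))  # penultimate
print(stress_location(["ci", "ne", "ma"], 0))  # antepenultimate
```

In a language with highly predictable stress, the same label would be returned for almost every word; in a language with less predictable stress, the label has to be stored word by word.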
The distribution of stressed and unstressed syllables within
a language determines its rhythm. The traditional view is that
rhythm in languages follows the dichotomy of stress-timed or
syllable-timed (Adams 1979; Dauer, 1983). However, there are
some linguists who tend to think that the concept of a dichotomy
is too rigid a characterization to realistically portray the nature
of rhythm in human language. Ladefoged (1982) states:
"Perhaps a better typology of rhythmic differences among languages
would be to divide languages into those that have variable
word-stress (such as English and German), those that have
fixed word-stress (such as Czech, Polish and Swahili) and those
that have fixed phrase-stress (such as French)." But since there is
more than one factor that determines the nature of rhythm in a
given language, there is no compulsion to have one or the other
of the stress-timed or the syllable-timed rhythmical bases
(O'Connor, 1973). However, one needs to understand what is
meant by a stress-timed or syllable-timed rhythm.
A stress-timed rhythm is the one in which stressed syllables
tend to recur at regular intervals of time and the syllables vary
considerably in length depending on whether they are stressed
or unstressed. On the other hand, a syllable-timed rhythm is one
in which each syllable tends to retain more or less the same du-
ration regardless of stress (Adams, 1979; Ladefoged, 1982;
Roach, 1983). In the stress-timed rhythm, only syllables receiv-
ing the primary stress stand out prominently, while the un-
stressed syllables are reduced and compressed in time to become
far less prominent. Unlike such uneven distribution of promi-
nence, in the syllable-timed rhythm, all syllables, stressed or
unstressed, receive a relatively even prominence; syllables take
approximately the same time, and the overall length of an utter-
ance depends on the number of syllables involved. In other
words, in this latter type of rhythm there is hardly any noticea-
ble reduction in the prominence of the unstressed syllables. It is
in light of the above-mentioned characteristics that English is
said to have a typically stress-timed rhythm, whereas Spanish is
said to have a typically syllable-timed rhythm.
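The contrast between the two rhythm types can be made concrete with a deliberately idealized timing model. In the Python sketch below, the function names and the durations `syllable_ms` and `foot_ms` are arbitrary assumptions chosen for illustration; real speech only approximates either pattern, and the stress-timed model ignores any unstressed syllables that precede the first stress.

```python
def syllable_timed_duration(stresses, syllable_ms=200):
    """Idealized syllable-timed rhythm: every syllable takes the same time."""
    return syllable_ms * len(stresses)


def stress_timed_duration(stresses, foot_ms=500):
    """Idealized stress-timed rhythm: every stress-to-stress interval takes
    the same time, however many unstressed syllables it contains."""
    return foot_ms * sum(stresses)


# True marks a stressed syllable, False an unstressed one.
short_utterance = [True, False, True]
long_utterance = [True, False, False, True, False]

# Syllable-timed: length grows with the number of syllables.
print(syllable_timed_duration(short_utterance))  # 600
print(syllable_timed_duration(long_utterance))   # 1000

# Stress-timed: both utterances have two stresses, hence the same length.
print(stress_timed_duration(short_utterance))    # 1000
print(stress_timed_duration(long_utterance))     # 1000
```

In the stress-timed case the extra unstressed syllables must be compressed to fit into the same interval, which is where vowel reduction comes in.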
At this stage, it is interesting to consider the possibility of
an underlying connection between stress and rhythm type and
the type of the vowel system in a given language. In English,
vowel quality and quantity fall heavily under the influence of
stress and this interaction is part of the dynamics of the vowel
system. The location of stress and its strength within the word
or sentence greatly influence the vowels both qualitatively and
quantitatively. In syllables with a primary stress, vowel quantity
(length) reaches its maximum and its quality is very distinct. In
syllables with a secondary stress or a weak stress, both quality
and quantity of vowels are reduced drastically. In unstressed
syllables, almost all the English vowels can be reduced in both
quality and quantity to the shortest vowels, namely [ə] or [ɪ]
(Dalbor, 1969; Ladefoged, 1982; Dale and Poms, 1985).
Such a qualitative and quantitative process of vowel reduction
is a typical feature of English, but very
uncharacteristic of a language such as Spanish. The above exposition
seems to point in the direction of the plausibility of a connection
between the centripetal vowel system and the stress-timed
rhythm type, on the one hand, and a centrifugal vowel system and
a syllable-timed rhythm type, on the other hand. In the case of Spanish,
if vowels tend to retain their relative quality and quantity
regardless of stress, and if they never undergo any schwaization
or even vowel reduction, how does one expect the syllables to
be manifestly different in length and prominence? A univalent
system of vowels should undoubtedly yield a temporally uni-
form and univalent type of syllables, which is typical of a sylla-
ble-timed rhythm as in Spanish. By contrast, with a multivalent
system of vowels combined with a very pervasive tendency to-
ward schwaization, one should expect multivalent types of syl-
lables as is the case typically of English stress-timed rhythm. To
paraphrase it differently, if the vowel system of a given lan-
guage does not maintain long/short or tense/lax phonological
contrasts, if it does not have a schwa as part of its phonological
system, and if it does not tolerate schwaization or a tangible
degree of vowel reduction, it implies the presence of a synchron-
ic constraint on the extent to which stress can alter the quality
and/or quantity of its vowels. It is true that stress in Spanish can
somewhat change vowel quality and quantity, but the change
will still be confined to the phonetic domain. Thus, in Spanish,
vowels may phonetically be slightly longer/shorter or tens-
er/laxer, but the absence of phonological contrasts based on
those features will deny the language the potential for creating
syllables that are significantly different in length and promi-
nence.
This argument in favor of binding the rhythm type to the
vowel system does not mean that the vowel system is the only
factor that determines the rhythm type in languages; undoubted-
ly, other factors such as syllable structure (Dauer, 1983),
fixed/variable word stress and word/phrase stress (Ladefoged,
1982) are relevant in this regard. The most significant conclu-
sion drawn from the preceding discussion is that the nature of
the vowel system should be a factor to be seriously reckoned
with in the typological classification of speech rhythm, its study
and teaching. Those modifications to the traditional syllable-timed
and stress-timed rhythm types amount to a major change
that should be seriously considered in the development of the
approach and the techniques of teaching pronunciation and the
remediation of accent.
9.3. TONE AND INTONATION

In speech, there is always a continuous change in the fundamen-
tal frequency, which is auditorily realized as pitch also known as
the melody of speech. Languages use pitch in two essentially
different ways. If it signals semantic differences between words,
the languages are called tone languages. In Chinese, for instance,
the basic unit <ma> may have more than one meaning depending
on the rising, falling, falling-rising or level pitch it carries.
It is the pitch difference (or toneme, to be consistent with
other -eme-suffixed linguistic labels such as phoneme and grapheme)
that triggers the semantic difference. Many of the African
and Asian languages fall into this category. By contrast, a language
whose pitch pattern has no specific role in the semantic
shaping of words, but is rather used to signal a combination of
syntactic, semantic and attitudinal features of the utterance, is
called an intonation language.
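The role of the toneme can be illustrated as a lookup in which the segments stay constant and only the pitch pattern changes. The Python sketch below uses the commonly cited Mandarin glosses for <ma>; the names `TONE_LEXICON` and `gloss` are hypothetical.

```python
# (segments, pitch pattern) -> meaning; same segments, four different tonemes.
TONE_LEXICON = {
    ("ma", "level"): "mother",
    ("ma", "rising"): "hemp",
    ("ma", "falling-rising"): "horse",
    ("ma", "falling"): "scold",
}


def gloss(segments, pitch):
    """Look up the meaning selected by the pitch pattern on the same segments."""
    return TONE_LEXICON[(segments, pitch)]


for pitch in ("level", "rising", "falling-rising", "falling"):
    print(pitch, "->", gloss("ma", pitch))
```

In an intonation language such a table would be pointless at word level: changing the pitch on a word alters the attitude or sentence type conveyed, not the lexical item.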
9.4. BASIC PITCH PATTERNS

Pitch patterns are very vividly explained and schematically rep-
resented in terms of pitch height and pitch direction (Laver,
1994). The labels used to refer to different pitch levels are high,
mid, low, mid-high, mid-low etc. Pitch contour refers to the shape
and direction of pitch yielding different shapes such as rise, fall,
level, rise-fall, fall-rise etc. A combination of attributes from both
pitch height and pitch direction produces the basic pitch patterns
whose recognition and production should be an essential goal in
any program for training in phonetics, in general, and pronunciation,
in particular.
No teaching of tone and intonation will be effective with-
out the mastery of the basic pitch patterns. An interesting aspect
of the basic pitch patterns is the cross-linguistic commonality in
their nuances. The falling pitch patterns, both low and high,
have the general purpose of expressing an utterance with a
sense of completeness so that the attention of the listener is no
longer required insofar as that particular utterance is concerned.
A high fall usually indicates a more vigorous and determinate
notion of completeness and finality than the low fall
does. The rising pitch patterns, unlike the falling patterns, imply
a sense of the incompleteness of the utterance as if further in-
formation is expected from the speaker or a response is neces-
sary on the part of the listener.
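The way pitch height and pitch direction combine, and the general nuance attached to falls and rises, can be sketched as follows. This Python fragment is only an illustration: it restricts itself to two heights and two directions, whereas the full inventory the author has in mind is the one given in figure 9.1, and the names used are hypothetical.

```python
from itertools import product

HEIGHTS = ("high", "low")
DIRECTIONS = ("fall", "rise")

# General nuance of each direction, as described in the text above.
SENSE = {"fall": "completeness", "rise": "incompleteness"}


def basic_patterns():
    """Combine every pitch height with every pitch direction."""
    return [f"{height} {direction}" for height, direction in product(HEIGHTS, DIRECTIONS)]


def implied_sense(pattern):
    """Return the nuance signalled by the direction of a pattern."""
    return SENSE[pattern.split()[-1]]


for pattern in basic_patterns():
    print(pattern, "->", implied_sense(pattern))
# high fall -> completeness
# high rise -> incompleteness
# low fall -> completeness
# low rise -> incompleteness
```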
The basic pitch patterns are:2
Figure 9.1. Basic pitch patterns in human speech in general.
2 For a demonstration of tone patterns in Chinese go to section 14.2.1/f.
9.5. CONSONANT CLUSTERS

Consonant clusters, or so-called consonant blends, should be distinguished
from consonants that occur juxtaposed to each other.
The former is a combination that structurally belongs to
one syllable and is pronounced as one intact piece. The latter is
a combination that is spread over two syllables. Take the word
<catfish>, which has a <tf> combination of two consonants,
but it is not a cluster because the <t> belongs to the first syllable
and the <f> belongs to the second. Compare the <tf> of
<catfish> with the <tr> of <contract>, in which the <tr>
is one intact combination and belongs to one syllable. The <tf>
of <catfish> is linguistically termed abutting consonants as
opposed to the <tr> of <contract>, which is a consonant cluster
proper. This phonetic differentiation is quite important in training
students in areas pertinent to pronunciation because the difference
will stress the point that clusters, not abutting consonants,
are the real source of trouble (Odisho 1979a; 2003).
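The distinction between a cluster and abutting consonants depends only on where the syllable boundary falls, as the following Python sketch shows. It is a simplification under stated assumptions: the syllabification is supplied by hand, letters stand in for sounds, and the function name `classify` is hypothetical.

```python
def classify(syllables, pair):
    """Classify a two-consonant sequence in a syllabified word.

    Returns "cluster" if both consonants sit inside one syllable,
    "abutting" if a syllable boundary separates them, and "absent" otherwise.
    """
    if len(pair) != 2:
        raise ValueError("pair must consist of exactly two consonants")
    if any(pair in syllable for syllable in syllables):
        return "cluster"
    for left, right in zip(syllables, syllables[1:]):
        if left.endswith(pair[0]) and right.startswith(pair[1]):
            return "abutting"
    return "absent"


print(classify(["cat", "fish"], "tf"))   # abutting
print(classify(["con", "tract"], "tr"))  # cluster
```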
Based on two linguistic facts, there is a strong rationale to
include consonant clusters among the suprasegmentals. First,
they are at a minimum longer than one segmental sound. Second,
they can be a major source of mispronunciation and distinctive
accent, especially for those learners whose native languages
do not contain clusters and who are embarking
on a language loaded with them. Japanese, for example, is a
typical language that is almost consonant cluster-free. Arabic is
also a language that has relatively few clusters compared to Eng-
lish. Consequently, Japanese and Arab learners of English do
exhibit serious problems with consonant clusters and impose
phonetic changes that represent their L1 phonotactic rules. To
avoid complex phonetic transcription, the changes in pronunciation
are represented here as simply as possible. Also interesting is the
fact that speakers of different languages handle consonant clusters
or abutting consonants differently. The prevailing rule is the
breaking up of the cluster usually by inserting a vowel to re-
arrange the syllabic structure of the word containing the cluster
or abutting consonants. The following are some of the most
common ways that learners employ to avoid producing a cluster.
First, if the cluster is initial, some languages add a vocalic
element to the beginning of the cluster called a prosthetic or
anaptyctic vowel. This is attested in various languages including
Arabic, Hindi, Sinhalese (Odisho, 1978; Fleischhacker, 2000;
Jabbari, et al, 2012) among others, as in table 9.1, below.
Language     English Word    Rendition
Arabic       <sport>         <isbort>
Sinhalese    <school>        <iskool>
Hindi        <spelling>      <ispelling>
Table 9.1. Examples of initial consonant cluster breaking with a prosthetic vowel.
Second, some languages break up the cluster or abutting formation
by inserting a vowel element called an epenthetic vowel,
as in table 9.2 below for Korean and Farsi, among others.
Language     English Word    Rendition
Farsi        <fruit>         <furut>
Korean       <steam>         <siteam>
Table 9.2. Examples of initial consonant cluster breaking with an epenthetic vowel.
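The two repair strategies illustrated in tables 9.1 and 9.2 can be sketched in a few lines of Python. This is a deliberately crude, spelling-based toy and not a phonological model: it treats the letters a, e, i, o, u as vowels and every other letter as a consonant, and the function names and the default vowel i are assumptions made only for illustration.

```python
# Toy, letter-based sketch of two cluster-repair strategies.
# Assumption: a, e, i, o, u are vowels; every other letter is a consonant.
VOWEL_LETTERS = set("aeiou")


def starts_with_cluster(word):
    """True when the first two letters are both consonant letters."""
    return (
        len(word) >= 2
        and word[0] not in VOWEL_LETTERS
        and word[1] not in VOWEL_LETTERS
    )


def add_prosthetic_vowel(word, vowel="i"):
    """Place a vowel in front of an initial cluster (table 9.1 pattern)."""
    return vowel + word if starts_with_cluster(word) else word


def add_epenthetic_vowel(word, vowel="i"):
    """Place a vowel inside an initial cluster (table 9.2 pattern)."""
    if starts_with_cluster(word):
        return word[0] + vowel + word[1:]
    return word


if __name__ == "__main__":
    print(add_prosthetic_vowel("spelling"))
    print(add_epenthetic_vowel("steam"))
```

Because the sketch works on letters rather than sounds, it reproduces renditions such as <ispelling> and <siteam>, but it cannot produce sound-based renditions such as <isbort> or <furut>, where a consonant or a vowel is changed as well.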
The epenthetic vowel can also occur in syllable-final position as
well as across word boundaries. Arabic, Assyrian and Spanish
speakers quite commonly avoid final clusters in one of two ways.
The first is to split the cluster with a vowel, creating an
additional syllable, as in pronouncing <barked> as barkid
instead of the normal [bɑrkt]. The second, found with some
Hispanics, is to drop the <-ed> suffix completely. In fact, in
the latter case, the tendency is so strong that it is reflected
in their spelling of the past tense and past participle of
<-ed> suffixed verbs (Odisho, 2007).
Speakers of cluster-deficient languages usually also break up
abutting consonants that form across word boundaries. For
example, speakers of Egyptian Arabic pronounce a phrase such as
<white house> as whiti house.3 Korean ESL students tend to
pronounce <English Language> as Englishi languagi because
Korean does not even allow words to end in most consonants.
A similar tendency applies to Japanese. In an ESL instructional
film, a Japanese student pronounced <south gate> and <north
gate> as sausi geti and norsi geti, respectively.4
Three questions are still relevant in the context of this
discussion of the role of consonant clusters in accent generation.
First, how serious a source of accent can this linguistic aspect
be? Second, what vowel quality is inserted to break up the
cluster? Third, where is the vowel inserted to eliminate the
cluster? There is a straightforward answer to the first question.
It all depends on the linguistic gap between the two languages in
terms of cluster-deficiency and cluster-richness: the wider the
gap, the greater the difficulty. It is in terms of this gap that
some languages do not tolerate even combinations of abutting
consonants. As for the second question, the answer lies in the
phonology of each language and the phonotactical rules that
govern it. For example, if a language, such as Spanish, has a
centrifugal vowel system without a schwa vowel [ə] or any short
lax vowel such as [], [] or [], the inserted vowel tends to be
[i], a tense one. Finally, the answer to the third question also
depends on the phonotactical rules of the two languages involved,
especially those rules that govern the syllable structure of
words.
Any combination of consonants, whether in the form of
clusters or abutting consonants, can be a source of serious
accent for learners whose native languages are consonant-cluster
deficient. A humorous but authentic anecdote pertinent to the
breaking up of word-initial clusters and the subsequent semantic
confusion is associated with the deposed dictator Saddam
Hussein after he invaded Kuwait in 1990. Kuwait in Iraqi Arabic
is pronounced either [] or []. After the invasion, a
foreign journalist interviewed Saddam Hussein in the presence
of his interpreter because his English was well known to be
of very low proficiency due to his poor education. During the
interview, the journalist made a statement which I do not recall
exactly, but it was lexically and grammatically as follows: "We
should not equate this situation with that of the West Bank."
Saddam jumped ahead of his interpreter and said: "Tell him (the
journalist), I did not mention Kuwait." Obviously, there was no
mention of Kuwait in the statement of the journalist, but
Saddam mistook the word <equate> = [] for his own
pronunciation of the name of Kuwait = []. The interpreter
had no choice but to translate his master's extraneous
interjection because he did not want to lose his life after the
interview. The journalist was bewildered at the translation and
Saddam did not understand what had happened.
3
The generic vowel is used for simplicity. The exact phonetic quality of an
epenthetic vowel in a particular language may vary depending on segmental and
prosodic factors, such as the quality of the surrounding consonants, the quality
of other vowels in the word, and the position of the epenthetic vowel within the
word (Repetti, 2012).
4
Obviously, Japanese natives replace the <th> = [θ] and [ð] sounds with
[s] and [z], respectively.

No study of pronunciation or the teaching of it is complete and
comprehensive without thorough coverage of the suprasegmental
features in the speech of any language. Unfortunately,
the traditional and the non-linguistic approaches to teaching
pronunciation focus almost exclusively on the segmental features
(consonants and vowels) with hardly any attention paid to the
long features that run through those segmental constituents and
bind them together into more semantically expressive stretches
of speech. Without the study of stress placement one fails to
recognize which syllable is prominent; without recognizing the
rhythm, the organization of beats within the stretch of speech is
lost; and without being aware of intonation one fails to appreci-
ate the melodic difference across languages. In sum, if the su-
prasegmentals are not a primary part of a study of pronuncia-
tion, the resulting accent in L2 is expected to be distinct and
telling.
CHAPTER 10: THE ROLE OF ARTICULATORY
SETTINGS IN PRONUNCIATION AND ACCENT

After the pioneering work of Honikman on articulatory settings
(Honikman, 1964),1 many researchers elaborated on this concept,
its characteristics and its linguistic and non-linguistic
relevance (Laver, 1980, 1994; Esling & Wong, 1983; Lowie and
Bultena, 2007). Unfortunately, different names such as phonetic
settings, voice quality, voice quality settings and
paralinguistic features (Pennington and Richards, 1986) were
assigned to it. These features, which Catford calls the
initiatory, articulatory and phonatory prosodies (1994), pervade
speech in a continuous and/or recurrent manner and characterize
the speech of a group of people with a distinctive overall
impression. In light of this, Laver's coinage of phonetic
settings seems to be more comprehensive in denoting the
phenomenon, since it includes initiatory, articulatory and
phonatory features. Laver defines a setting as "a featural
property of a stretch of speech which can be as long as a whole
utterance; but it can also be shorter, characterizing only part
of an utterance, down to a minimum stretch of anything greater
than a single segment" (1994: 115). This definition of phonetic
settings, with its use of the words stretch and utterance,
brings it under the rubric of suprasegmentals, the only
difference being that some features of phonetic settings may
last as long as the speech act (or discourse) is maintained. In
other words, some features of phonetic settings may readily
qualify as phonetic features of discourse; typical in this
regard is the retroflexion in some sub-continental Indian
languages.
1
Articulatory Settings is treated as singular. Honikman did not publish
much, but this paper is one of the most brilliant pieces of phonetic literature.
As discussed above, articulatory settings represents a co-
herent combination of some of the most salient features in a
language that may persist throughout stretches of speech of dif-
ferent lengths including discourse. In the study and application
of the concept of articulatory settings, the focus will not be on
the idiosyncratic manifestations of the characteristics of the
settings. Rather, the focus will be on the collective
manifestation of a habitual phonetic orientation by all or most
of the speakers of a given language within the native language
environment or its extension into the target language environment.
To illustrate, if an individual nasalizes his speech because of a
physical deficiency or improper articulatory habit, the
articulatory settings is regarded as an idiosyncratic one. By
contrast, if the speakers of a given language manifest
nasalization as a very consistent feature in their vowel system,
as in French, and carry the feature with them into their learning
of other languages, then nasalization becomes a primary feature
of French articulatory settings. It should, thus, be the focus of
attention in the teaching of pronunciation to French learners of
L2 so that they avoid imposing unneeded nasalization. Conversely,
learners of French should be
instructed to acquire its nasal vowels. By the same token, vowel
harmony in Turkish and Hungarian will certainly become a
component of the articulatory settings of those two languages.
The articulatory settings jointly represents the most charac-
teristic consonantal, vocalic and prosodic features that are in-
grained in the overall speech production in a given language.
They help generate the speech at its most authentic form, color
it throughout with those features and give it its most genuine
native impression. In the overall approach to teaching
pronunciation, there are different levels of proficiency, ranging
from the most elementary to the near-native or native level, that
are targeted depending on the overall objectives of a program or
an individual learner. In targeting the highest level of
proficiency, which is native proficiency, the learner should
attempt not only to perfect the phonological distinctions, but
also to master any type of phonetic distinctions and
characteristics of the targeted language. Any learner whose
objective is native language proficiency in L2 should gradually
progress from a conscious and belabored impersonation of L2
pronunciation to a subconscious and automatic production of it,
first at the phonological level and then at the phonetic level,
with both segmental and
prosodic features. Native language proficiency is not confined to
the mastery of phonological contrasts alone. All or most of the
refined phonetic features should be perfected to the extent that
the learner is indistinguishable or at least hardly distinguishable
from a native speaker. To reach this level of proficiency both the
instructor and the learner must be qualified to play their roles
each in his own way. The instructor must be highly knowledge-
able both theoretically and practically; must be aware of the
most distinctive and exclusive features of L1 and L2; and must
possess a set of strategies to implement his approach. As for the
learner, he must be highly motivated; must work hard; and must
have ample exposure hours through classroom practice and real-
life practice in the authentic environments of L2.
It is noteworthy that not every learner may succeed in at-
taining native language pronunciation proficiency in L2. Cer-
tainly, every learner has the potential to improve proficiency in
L2 pronunciation, but most of those who excel tend to have
some sort of linguistic aptitude and high motivation. However,
regardless of the level of achievement in L2 pronunciation, the
learner must have enough opportunity to practice L2 in both
perception and production. It is ample exposure time to
authentic L1 speech that makes a native speaker a native
speaker. Consequently, if a learner, especially an adult, aims at
nearing or matching the proficiency of a native L2 speaker, he
should go through the same linguistic experience the native went
through, or a similar one. In the following subsections a survey
is made of the most salient features of the articulatory settings
of some languages which are drastically different from each
other. If a learner aims at acquiring high-level proficiency in
the pronunciation of any one of those languages, he should
seriously consider mastering the most salient features of its
articulatory settings.
10.2. SALIENT FEATURES OF ARTICULATORY SETTINGS OF SELECTED LANGUAGES
The selected languages have been strictly limited for obvious
reasons, foremost of which are the limited languages with which
I am familiar as well as the limited space allocated for each
chapter. For each language, there will be a selected combination
of the most salient segmental and suprasegmental features that
actively mold the articulatory settings of the given language.
Stated differently, the articulatory settings functions as the
distinctive mark that all speakers of a given language share and
manifest when they speak it. With regard to the latter statement,
it is imperative that any teaching of L2 to adults should
take into consideration not only the articulatory settings of their
L1, but, equally importantly, the articulatory settings of L2. In
the first instance, the instructor should try to block the features
of the L1 settings from seeping through into the L2, while in the
second instance, the instructor should enable learners to absorb
and assimilate the features of L2 settings.
10.2.1. English Articulatory Settings

In identifying the most salient features of the English articulato-
ry settings, the focus will be on three general components: con-
sonants, vowels and the impact of vowel dynamics on rhythm.
All three domains mold the articulatory settings, but to highlight
the characteristic of each domain helps clarify the overall con-
figuration of the settings.
10.2.1.1. Salient Consonantal Features

In highlighting the salient consonantal features of English, it is
advisable to consider them in terms of natural classes as much
as possible. For example, English /p t k/ are voiceless aspirated,
therefore speakers of languages in which these plosives tend to
be predominantly unaspirated (Spanish, French, Italian, Greek)
should consider the difference and learn how to aspirate the
voiceless plosives. /b d g/ plosives tend to be predominantly
voiced, fully or partially, in all positions. Consequently, learners
of English, such as Germans, should be instructed not to devoice
them in final position as they do in their native language.
English /k g/ are velar plosives; it is, therefore, incumbent on
speakers of languages such as Persian, Hungarian, Greek, Turkish,
Modern Assyrian (Aramaic), etc., whose plosives tend to be
palatal /c ɟ/, not to replace the English velars with their own
palatals. Unfortunately, they often do so and, hence, they enhance
their phonetic accent. English affricates /tʃ dʒ/ may be the cause
of phonological accent since many languages have only one of
the two affricates. For instance, Germans have /tʃ/ only, while
speakers of Western Arabic dialects (Egyptian, Syrian, Lebanese)
have neither of them; they tend to replace them with the
fricatives /ʃ ʒ/; in the case of Egyptian Arabic, the /dʒ/ is
replaced by /g/. As for fricatives, English has many of them, most
of which are not specifically problematic for a wide variety of
foreign learners of English except for the pair /θ, ð/. However,
natives of some languages may have problems with specific
English fricatives, in which case the problems should receive
selective attention. Typically, Hispanic learners of English should be
seriously instructed not to replace a voiced labiodental fricative
/v/ with a /b/ or even a /β/. The substitution results in a
serious phonological problem since the /v/ is a relatively common
sound in English. An equally serious phonological problem for
Hispanics is the failure to recognize and produce the English /z/
as opposed to /s/. Filipino learners of English should be
cautioned against replacing an English /f/ with an unaspirated /p/.
Equally, Hindi, Urdu, Farsi and Assyrian speakers, among
others, should guard against replacing English /w/ with a voiced
labiodental fricative /v/, a labiodental approximant /ʋ/ or a
labial-palatal approximant /ɥ/. Typically, most Greek learners of
English may have a conspicuous phonetic or even phonological
accent when they replace the English pair of alveolar fricatives
/s/ and /z/ with their alveolo-palatal fricatives [ɕ] and [ʑ],
respectively, since Greek does not have the more common pairs of
fricatives [s, z] and [ʃ, ʒ]. This type of replacement amounts to
one of the most characteristic features of Greek pronunciation of
other languages with the sibilant sounds /s, z, ʃ, ʒ/. For
Russians, the voiceless glottal fricative /h/ can be a source of
phonetic accent as they tend to replace it with a voiceless velar
fricative /x/ or a voiceless uvular fricative /χ/. In learning
some languages, such as Arabic, the failure of Russians to
pronounce /h/ may amount to a phonological accent as Arabic has a
distinct and quite common contrast between /h/ and //.
As for the approximants /l/ and /r/, they can be seriously
problematic, especially for Far Eastern Asians such as Japanese,
Chinese and Koreans. The confusion between these two
approximants amounts to a major phonological and phonetic problem.
The confusion is too ear-catching to be ignored; it is, indeed,
one of the most classic sources of their interlanguage accent.
Speaking of the English /r/ in particular, in both British RP and
GAE, its approximant nature makes it a very rare type of r
sound compared to taps, flaps and trills. Although failing to
master the typical approximant English /r/ and replacing it with a
tap, flap or trill <r> rarely results in a phonological problem, it
does, indeed, amount to a significant source of noticeable
phonetic accent.
Another important aspect of the consonantal system of Eng-
lish is its complex array of consonant clusters in all three struc-
tural positions in a word: initial, medial and final. If, for in-
stance, the Japanese learners of English fail to master the pro-
duction of English clusters, they will be speaking it with a heavy
phonological and phonetic accent. They will not only be dis-
torting the pronunciation of individual words, but will also be
heavily impacting its stress-timed rhythm.
10.2.1.2. Salient Vowel Quality/Quantity Features

In this study, it is believed that the vocalic system constitutes
the most salient feature of English articulatory settings. This
saliency is attributed to two major factors: a) the wide range of
vowel qualtity (combined quality and quantity) contrasts in Eng-
lish; b) the prominent role of the rules of vowel dynamics, espe-
cially in vowel qualtity reduction and heavy schwaization.
Any learner of English coming from a language background
in which vowel qualtity range is narrow (such as the three
through five-vowel systems), will face a major phonet-
ic/phonological problem because of the resulting vowel qualtity
discrepancy. With a five-vowel qualtity system as in Spanish,
some of the most frequently used words of English are semantically
confused due to mispronunciation. Examples of such
confusions have been repeatedly cited throughout this book. The
English vowel // is so characteristic of English that any
mispronunciation of it may result not only in a phonological
accent, but also in an easily distinguishable phonetic accent; for
instance, Israeli or Polish speakers usually tend to replace the //
vowel with an //-like vowel, making the pronunciation of
English so un-English.
What makes the English vowel system even more difficult
and challenging is the dynamics of vowel qualtity change
through vowel reduction, especially in the form of a schwa [ə].
Consequently, an L2 learner of English faces not only the
problem of mastering the wide range of vowel qualtity, but also
the strong trend towards vowel reduction. Perhaps the group of
English words that most characteristically demonstrates the
dynamics of vowel qualtity change in English is a set of
approximately fifty words (see table 10.1 below for examples)
that appear in two or more forms known as strong form and weak
form. For instance, the strong form of <and> is /ænd/, but it
has at least three weak forms such as /ənd/, /ən/ and /n̩/.2 The
strong form is habitually of minimum circulation since it occurs
only in emphatic use. It is the weak forms of <and> that are of
more frequent recurrence. The often schwa-based weak forms of
this group of words and the weak syllables of other words are
the linguistic units that collectively govern the overall rendition
of the English vowel system and its general rhythm type.
The weak/strong forms of this group of words amount to
a major area which should receive serious attention in the
teaching of English pronunciation. No adult learner of English can
bring his pronunciation near that of a native speaker without
proper mastery of the weak forms of those words. Such mastery
seriously helps with the correct rendition of the overall rhythm
of English.
2
The so-called syllabic n.
Word Stressed (Strong) Form Unstressed (Weak) Form

A
An n; n
Been bin ; bn bn
Can kn kn ; kn
For f ; f f ; f
Had hd hd ; d ; d
Shall l l; l
Some sm sm ; sm
The (V.) ; (C.)
Would wd wd ; d ; d
Table 10.1. Weak and strong forms of some of the common words in English.
10.2.1.3. Salient Vowel Dynamics and Rhythm

English is a language that is governed radically by excessive
vowel reduction in the absence of primary stress. Most of the
unstressed syllables are dominated by the short vowels //, of
which the schwa [ə] is the most frequent. It is these short vowels
that reduce syllables to minimum length and distinction, thus
allowing them to be glossed over rapidly and with minimum
prominence. The speed with which the unstressed syllables are
uttered coupled with their minimal prominence help the
stressed syllables stand out. It is the enhanced prominence of the
stressed syllables that justifies labeling English rhythm as a typi-
cal stress-timed rhythm. The enhanced prominence of stressed
syllables is not just attributed to receiving the primary stress,
but also because the other syllable or syllables within a word
tend to become short and reduced in their vowels. To demonstrate,
the first syllable in the word <comfortable> =
[ˈkʌmftəbl] is prominent not exclusively because it is stressed,
but equally because the other syllables are significantly reduced.
There is no way for any learner of English as L2 to master
its rhythm without possessing a skill for vowel reduction and
schwaization. All instructors of English pronunciation to natives
of other languages should make the teaching of vowel qualtity
and of vowel reduction at syllable and sentence levels a foremost
consideration. Without an intensive joint effort in this regard,
it is almost impossible to arrive at a satisfactory native-like
impersonation of the articulatory settings of English.
10.2.2. Spanish Articulatory Settings

In numerous Spanish-English bilingual situations, especially in
the United States, features of Spanish such as the strict qualtity
limitations on its vowel system, the absence of tense-lax vowel
contrasts, the non-schwa nature of its centrifugal vowel system,
the absence of certain consonantal sounds such as [v, z, ] coupled
with the phonological contrast between its tap <r> = [ɾ]
and its trilled <rr> = [r], and its syllable-based rhythm jointly
constitute the most salient and commanding characteristics of
the articulatory settings of Spanish. They not only color the
whole pronunciation of Spanish, but also influence Hispanics'
pronunciation of other languages.
To begin with, the Hispanic learner of English, or any
learner of English whose language uses the Latin alphabet,
should not be fooled by the familiarity of the alphabet because
in actual application the same symbols may be used to signal
drastically different sounds. Let us consider the comparative
example below of the word <color> in English and Spanish as
a means of displaying a synopsis of the striking differences be-
tween the two languages that are expected to be the source of
serious phonological and phonetic accent at both segmental and
suprasegmental levels.
a) The <c> grapheme in English is pronounced as a voiceless
aspirated velar plosive [kʰ], whereas in Spanish it is a
voiceless unaspirated velar plosive [k];
b) The two <o> vowels in Spanish retain the traditional
vowel quality of [o], whereas in English the traditional
vowel quality completely drifts away in both instances into
[ʌ] and [ə] vowels, respectively;
c) The <l> in English under the influence of the [] tends
to be somewhat velarized or what is traditionally identified
as dark L, whereas in Spanish it has no velarization (i.e., it
remains a clear-L); and,
d) The <r> in English is a retroflex approximant that
coalesces with the preceding schwa vowel [ə] to produce an
r-colored vowel [ɚ].
At the suprasegmental level, the retention of the quality of the
[o] vowels in Spanish grants the whole word a lip-rounding
feature, or a prosody of lip-rounding according to Firthian
linguistics, which contrasts strikingly with the
lip-neutralization prosody of the English pronunciation. In
addition, the stress in English falls on the initial syllable,
whereas in Spanish it falls on the final syllable. The cumulative
phonetic and phonological differences (at both segmental and
suprasegmental levels) afford a very good example of the
considerable differences between the two languages, which are
even further complicated by the orthographic traditions.
Figure 10.1. General tendency of stress position in English and Spanish words.
Thus, if a simple comparison of two words which are identical
in orthography yields considerable phonetic and phonological
differences both segmentally and suprasegmentally, then any
learner of the other's language should expect some serious
difficulties in the domain of pronunciation. Let us consider some
such difficulties.
10.2.2.1. Vowel System

In a previous chapter the vowel system in English was identified
as centripetal which turned out to be in sharp contrast with the
centrifugal system of Spanish in both quality and quantity. The
difference in quality is a straightforward and significant problem,
which is summarized in the presence of twelve (12) vowel
qualities in English vs. five (5) in Spanish. It is a serious
source of both phonological and phonetic accent. Any instructor
can readily identify the general qualitative differences; however,
what is more challenging to identify and teach is the case in
which vowel quality is intertwined with vowel quantity (length).
Vowel length is a term that is somewhat controversial in that
some phoneticians prefer to portray those differences in terms of
laxness and tenseness. To maintain a level of simplicity in
handling the feature quantity, the dichotomy of short vs. long is
preferred; nevertheless, this preference should not exclude the
use of lax vs. tense whenever the need arises.
In light of these simplifying propositions, English has cer-
tain pairs of vowels which are distinctive (i.e., are contrastive)
on the basis of length and they cause a major problem for His-
panics since length is phonologically irrelevant (not distinctive)
in their language. Another dimension that further aggravates
this problem is the high frequency of occurrence of those pairs
of vowels in English. Accordingly, there are virtually thousands
of pairs of words which in English are distinguished by vowel
length, but when pronounced by Hispanics, the length differen-
tial is eliminated thus reducing the pair to a single word pro-
nounced according to the specifications of the single Spanish
vowel. Let us consider some of these pairs in table 10.2, below
and demonstrate the Hispanic rendition of those pairs.
Contrastive Pair        Phonetic Transcription    Spanish Rendition
<sit> vs. <seat>        [] vs. []                 []
<pull> vs. <pool>       [] vs. []                 []
<bet> vs. <bait>        [] vs. []                 []
<boat> vs. <bought>     [] vs. []                 []
Table 10.2. Comparison of English and Spanish vowels in qualtity.
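The collapse of each contrastive pair into a single Spanish rendition can be sketched as a simple many-to-one mapping. The sketch below is only an illustration: the broad transcriptions of the four pairs and the choice of the nearest Spanish vowel are assumptions made for the example, not the transcriptions of table 10.2, and the names NEAREST_SPANISH_VOWEL, PAIRS, spanish_rendition and is_merged are hypothetical.

```python
# Assumed mapping from some English vowels to the nearest of the five
# Spanish vowels. The symbols are illustrative broad transcriptions.
NEAREST_SPANISH_VOWEL = {
    "ɪ": "i", "i": "i",
    "ʊ": "u", "u": "u",
    "ɛ": "e", "eɪ": "e",
    "oʊ": "o", "ɔ": "o",
}

# Assumed segment-by-segment transcriptions of the contrastive pairs.
PAIRS = {
    ("sit", "seat"): (["s", "ɪ", "t"], ["s", "i", "t"]),
    ("pull", "pool"): (["p", "ʊ", "l"], ["p", "u", "l"]),
    ("bet", "bait"): (["b", "ɛ", "t"], ["b", "eɪ", "t"]),
    ("boat", "bought"): (["b", "oʊ", "t"], ["b", "ɔ", "t"]),
}


def spanish_rendition(segments):
    """Replace each listed English vowel by its nearest Spanish vowel."""
    return [NEAREST_SPANISH_VOWEL.get(seg, seg) for seg in segments]


def is_merged(first, second):
    """True when two words become identical after the replacement."""
    return spanish_rendition(first) == spanish_rendition(second)


if __name__ == "__main__":
    for words, (first, second) in PAIRS.items():
        print(words, "merged:", is_merged(first, second))
```

Under these assumptions every pair is reported as merged, which is the point of the table: two English words that differ only in vowel qualtity are reduced to one and the same form once only five vowels are available.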
These examples serve to highlight one of the most challenging,
if not the most challenging, problems for Hispanic learners of
English. At times, the confusion can be extremely embarrassing
when some obscene or taboo words are involved such as those
mentioned in section 8.4.1. The confusion is not confined
to phonology; rather, it extends to serious phonetic
mispronunciation. One can easily identify Hispanics by their
accent in the rendition of words such as <kidding> and
<building> as [kiding] and [bilding] instead of [kɪdɪŋ] and
[bɪldɪŋ], respectively.
For a thorough phonetic study of sounds and sound sys-
tems, a person has to go beyond phonology because the latter
functions at the level of abstractions which conceal many pho-
netic details. From the pedagogical and educational perspective,
to use a phonetic yardstick in assessing the acceptable targeted
proficiency in pronunciation may be too strict, and perhaps too
ideal, to be used. Instructionally, however, the intention in
teaching any L2 pronunciation should be to minimize semantic
confusion as much as possible. Practically, in L2 teaching the
intention should be to attain as near-native pronunciation as
possible that dispels any confusion in meaning and reduces to
minimum the demand for semantic clarification on the part of
the listener.
10.2.2.2. Rhythm Types

When one specifically compares the rhythm types of Spanish
and English, they typically represent a syllable-timed type vs. a
stress-timed one, each of which requires some explanation.
A stress-timed rhythm is the one in which stressed syllables
tend to recur at regular intervals of time relative to each other.
The syllables vary considerably in prominence relative to each
other depending on whether they are stressed or unstressed or
whether their vowels are reduced or enhanced. In the stress-
timed rhythm, only the syllables receiving the primary stress
stand out prominently, while the unstressed syllables are re-
duced and compressed in time to become far less prominent.
Therefore, the length of an utterance in terms of time, in stress-
timed rhythm depends on the number of stressed syllables not
the overall number of syllables within the utterance. The man-
ner in which time is distributed over the syllables is uneven in
that the speaker dwells longer on the stressed syllables, whereas
he glosses over the unstressed syllables with minimum time. The
unstressed syllables serve as time savers for the speaker through
the facility of vowel reduction and even, at times, consonant
reduction or dropping. Vowel reduction and/or consonant dropping
are most vividly displayed in the so-called function words
in English, typically represented by articles, prepositions,
conjunctions, etc. For instance, the strong form of <and> is
[ænd], which may be reduced to only a syllabic [n̩] as in <bake
and take>, usually pronounced bake n take.
On the other hand, a syllable-timed rhythm is one in which
each syllable tends to retain more or less the same duration re-
gardless of stress (Ladefoged, 1982; Roach, 1983). Unlike the
uneven distribution of prominence in stress-timed rhythm, in the
syllable-timed rhythm, all syllables, stressed or unstressed, re-
ceive a relatively even prominence; syllables take approximately
the same time, and the overall length of an utterance depends
on the number of syllables involved. In other words, in Spanish
rhythm type, there is hardly any noticeable reduction in the
prominence of the unstressed syllables nor is there significantly
noticeable enhancement of the stressed syllables. It is in light of
the above-mentioned characteristics that English is said to have
a typically stress-timed rhythm, whereas Spanish is said to have
a typically syllable-timed rhythm. It is, indeed, the restriction on
vowel reduction in unstressed syllables, coupled with the
restriction on vowel enhancement in stressed syllables mentioned
earlier for Spanish, that was identified as the main rationale
for determining the nature of rhythm in Spanish. All those basic
differences between the two rhythm types, at the level of words
as well as sentences, may be demonstrated schematically as in
figure 10.2, below.
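The two timing claims made above, that utterance length in a stress-timed rhythm depends on the number of stressed syllables while in a syllable-timed rhythm it depends on the total number of syllables, can be put into a small idealized sketch. The interval values of 500 and 200 milliseconds, as well as the function names, are arbitrary assumptions chosen for illustration; real speech only tends towards these patterns.

```python
# Idealized sketch of the two rhythm types. A stress pattern is a list
# with one entry per syllable: True for stressed, False for unstressed.
# The millisecond values are arbitrary assumptions, not measurements.


def stress_timed_duration(stress_pattern, interval_ms=500):
    """Length depends only on the number of stressed syllables."""
    stressed_count = sum(1 for stressed in stress_pattern if stressed)
    return stressed_count * interval_ms


def syllable_timed_duration(stress_pattern, syllable_ms=200):
    """Length depends on the total number of syllables."""
    return len(stress_pattern) * syllable_ms


if __name__ == "__main__":
    four_syllables = [False, True, False, False]
    six_syllables = [False, True, False, False, False, False]
    for pattern in (four_syllables, six_syllables):
        print(
            len(pattern), "syllables:",
            stress_timed_duration(pattern), "ms stress-timed,",
            syllable_timed_duration(pattern), "ms syllable-timed",
        )
```

In this idealization, adding two unstressed syllables leaves the stress-timed length unchanged, because the extra syllables are compressed, whereas the syllable-timed length grows with every added syllable.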
Figure 10.2. Comparative differences in syllable prominence in syllable-timed (Spanish) vs. stress-timed (English) rhythm types.
The diagram above compares the pronunciation of the word
<phenomenon> in English with that of <fenómeno> in Spanish. Each
oval shape indicates a syllable, and the size of the oval
indicates the prominence that the syllable receives in
pronunciation. Based on these clues, three syllables of the
Spanish pronunciation are of the same prominence, with only a
slight increase in prominence in the stressed syllable. In the
English pronunciation, two of the syllables are of minimum
prominence because they contain the reduced vowel schwa [ə]. The
last syllable has slightly greater prominence because its vowel
is not reduced³ and the syllable has a CVC structure. Obviously,
the greatest prominence is associated with the syllable that
receives the primary stress. This type of pictorial
representation of rhythm types helps many vision-oriented
learners to better grasp the differences between them. It may
also help learners notice that in pronouncing the Spanish
<fenómeno>, the speaker takes four equidistant steps, whereas in
pronouncing its English cognate, he takes one small step followed
by a large step, then another small one, and ends with a modest
(medium) step. For a demonstration of the syllabic and rhythmic
differences in sentences, notice the schematic representations in
figure 10.3 attached to the following two sentences from Hadlich
et al. (1968).
Figure 10.3. Patterns of syllable arrangement in syllable-timed
rhythm (Spanish) vs. stress-timed rhythm (English).
It is quite evident from the schematic representations in figure
10.3 above that there is only minimal difference in the
prominence of the seven syllables in the Spanish sentence, except
for the second and the sixth syllables, which are slightly more
prominent because of the placement of stress. Conversely, in the
English sentence, there are four reduced syllables with minimum
prominence conjoined with two syllables with distinctly enhanced
prominence. It is the even arrangement of syllables in Spanish
that determines its syllable-timed rhythm type, which is
popularly known as the "machine gun" rhythm or the so-called
staccato rhythm. This impressionistic feeling of a machine gun
rhythm in Spanish is also, at times, interpreted as a rhythm with
a faster tempo or speed. In more popular terms, people think that
Spanish is spoken faster than English or other languages. Once
again, this is not exactly true; it is yet another
impressionistic feeling resulting from the nature of the evenly
structured syllables, each taking the same or similar time, which
makes the transition time across syllables equally even. It is
this unvaried mode of cross-syllable and cross-word transitions
that gives the impression of a faster tempo, as opposed to the
varied mode of cross-syllable and cross-word transitions in
English.
³ At least in GAE, as opposed to [ə] in RP.
In sum, if English speakers speak in words, Hispanics speak in
syllables, and when Hispanics speak in words, Englishmen speak in
phrases and clauses. Yet another false impression is common among
all beginner learners of L2: they think the targeted speech is
much faster than their L1 speech. L2 speech only sounds faster to
beginners because their decoding skill is slower.
The above discussion leads to a major conclusion: rhythm types,
especially inasmuch as the dichotomy of syllable-timed vs.
stress-timed is concerned, depend largely on the nature of the
vowel system in each language. A language whose vowel system
imposes restrictions on the qualitative and quantitative
diversity of its vowels is not expected to breed qualitative and
quantitative diversity in its syllable structures. These
conditions are valid for Spanish because each of its five vowels
has a defined quality and quantity that change only slightly in
most linguistic contexts. This is exactly the opposite of what
vowel quality and quantity undergo in different linguistic
contexts in English. One then asks: how could a vowel system that
does not allow reduction or enhancement in the quality and
quantity of its vowels have syllables that are reduced or
enhanced in quality and quantity? The natural and axiomatic
conclusion is that a vowel system whose units are uniform in
quality and quantity should only yield syllable structures that
are, more or less, uniform. This explains the syllable-timed
rhythm type of Spanish vs. the stress-timed type of English.
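The contrast between the two rhythm types can be illustrated with a
small numerical sketch. The Python fragment below is only a toy model:
the millisecond values and the function names are assumptions
introduced here for illustration, not measurements or terminology from
the sources cited. It also deliberately ignores the slight extra
prominence of the Spanish stressed syllable and the medium prominence
of an unreduced, unstressed English syllable.

```python
# Hypothetical syllable durations in milliseconds. These values are
# illustrative assumptions, not measurements from the text.
EVEN_MS = 200      # every syllable in the syllable-timed sketch
STRESSED_MS = 300  # stressed syllable in the stress-timed sketch
REDUCED_MS = 100   # unstressed (reduced) syllable in the stress-timed sketch


def syllable_timed(stress_pattern):
    """Give every syllable the same duration, stressed or not."""
    return [EVEN_MS for _ in stress_pattern]


def stress_timed(stress_pattern):
    """Dwell on stressed syllables and shorten unstressed ones."""
    return [STRESSED_MS if stressed else REDUCED_MS
            for stressed in stress_pattern]


def spread(durations):
    """Difference between the longest and the shortest syllable."""
    return max(durations) - min(durations)


# Four syllables with stress on the second, as in <fenómeno> / <phenomenon>.
pattern = [False, True, False, False]

spanish_like = syllable_timed(pattern)
english_like = stress_timed(pattern)

print(spanish_like, spread(spanish_like))
print(english_like, spread(english_like))
```

In this sketch the total length of the syllable-timed utterance grows
only with the number of syllables, whereas the stress-timed utterance
shows a large spread between its longest and shortest syllables.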
10.2.3. Arabic Articulatory Settings

Naturally, the differences between the articulatory settings of
English and Arabic are expected to be quite striking since the
two languages belong to two language families that are drasti-
cally apart. Except for the general claim that both languages
have their rhythmic systems generally categorized as stress-
timed, their vowel and consonantal systems are hugely different.
10.2.3.1. Consonantal System

In the area of consonants, it is the English learners of Arabic
that will have far more pronunciation problems than the Arab
learners of English. The consonantal system of Arabic is
particularly complicated, especially because of nine back
consonants: the uvulars [q χ ʁ], the pharyngeals [ħ ʕ] and the
emphatics [tˤ dˤ sˤ ðˤ], which are some of the toughest sounds to
handle in any
type of phonetic orientation for L2 learners of Arabic. The uvu-
lars [, ] are attested in other languages, but the rest are al-
most exclusively typical of the Semitic languages. Besides, the
four emphatics [tˤ dˤ sˤ ðˤ] have direct phonological contrasts
with their plain counterparts [t d s ð]; therefore, any failure to
produce the emphatics will leave the foreign learner with no
option other than replacing them with their plain counterparts,
which subsequently causes a serious phonological accent coupled
with semantic confusion. The situation may be worse when any of
the plain counterparts does not exist in the learner's L1
inventory, as is the case in Farsi and Urdu where the Arabic [ ]
and [ ] are rendered [s]. Unlike the emphatics, the uvulars and
pharyngeals do not have plain counterparts, but the failure to
articulate them properly will fill their slots with other sounds
with which those uvulars and pharyngeals are in contrast. For
instance, the word for <eye> in Arabic is <عين> = [ʕajn]; if the
voiced pharyngeal fricative [ʕ] is mispronounced, it is usually
replaced with a glottal stop [ʔ], which changes the word into
<أين> = [ʔajn], meaning <where>. Similarly, the voiceless
unaspirated uvular plosive [q] is predominantly rendered as /k/,
which will also cause a phonological accent and transform words
such as <قلب> = [qalb], meaning <heart>, into <كلب> = [kalb],
meaning <dog>.
10.2.3.2. Vowel System

It is with the vowel system of English that Arab learners face
some serious difficulties, not so much because of quantity
(length) differences, but rather because of a wide range of
quality differences. MSA has the simplest vowel system, with
three vowel qualities /i a u/ that are doubled by lengthening.
Obviously, there is some quality difference between the short and
the long versions, but this is not a major issue. However, unlike
in Spanish, the existence of short vs. long contrasts in Arabic
is helpful in distinguishing English minimal pairs such as <did>
vs. <deed> and <pull> vs. <pool>. For example, in Arabic, the
short vowels /a, i, u/ as in < >, < > and < > have their long
counterparts as in < >, < > and < >, respectively. Conversely,
the restricted vowel quality in Arabic is detrimental in
distinguishing an [ɛ] from an [ei] as in <bet> [bɛt] and <bait>
[beit], for instance. What potentially enhances the vowel system
of MSA is its enrichment under the influence of the local
dialects, most of which have developed the mid vowels [e or ɛ]
and [o]. This addition somewhat minimizes the drastic qualitative
difference.
Vowels in Arabic do not have much qualitative diversity;
therefore, the system does not pose itself as a difficult one.
However, foreign learners of Arabic coming from a language
background with a qualitatively rich vowel system, such as
English with its strong tendency toward vowel reduction, should
avoid imposing their rich vowel system on the pronunciation of
Arabic. They should shrink the broad domain of their vowel
diversity to essentially three basic qualities /i a u/ with long
and short versions and refrain from forcing unwanted vowel
reductions. Even though the pronunciation of the short versions
of Arabic vowels may be somewhat qualitatively different from
their long counterparts [ a], the difference, [ ], is too
marginal to be detected by phonetically unsophisticated people.
What the learners should do is maintain the quality of Arabic
vowels, whether short or long, and avoid any insertion of schwas.
One of the most characteristic sources of accent for English
learners of Arabic is the heavy imposition of vowel reduction,
especially the schwaization of short Arabic [a] or the so-called
fatḥa. This vowel has a high frequency of occurrence in Arabic;
in fact, it dominates the overwhelming majority of past tense
forms of verbs. It is highly characteristic of English learners
of Arabic to pronounce verbs such as <درس> [darasa] (studied),
<كسب> [kasaba] (gained), <كتب> [kataba] (wrote), etc. as
[darəsə], [kasəbə], [katəbə], which causes a major distortion of
the overall pronunciation of Arabic and the rendition of its
rhythm. In a nutshell, it almost imposes the English rhythm on
Arabic.
10.2.3.3. Suprasegmentals
Because Arabic does not have a vowel system with extensive vowel
qualities, nor does it allow considerable vowel reduction (as is
the case in English), it is quite natural to expect noticeable
differences in rhythm even though both English and Arabic are
associated with the stress-timed rhythm type. In Arabic, the rule
of assigning stress to the long syllable is a very powerful
prosodic feature that causes pervasive interference with the
correct placement of stress in English, hence generating a
perceptible accent. For instance, English words of more than two
syllables ending in <-ate, -ify, -ize>, such as <negotiate,
notify, recognize>, never receive stress on the final syllable.
Regardless of where the stress falls in words with those
suffixes, Arab learners have a very strong inclination to shift
it to the last syllable. Notice the following examples in table
10.3:
English rendition   Stress pattern   Arabic rendition   Stress pattern
<simplify>          initial          <simplify>         final
<terrify>           initial          <terrify>          final
<alternate>         initial          <alternate>        final
<educate>           initial          <educate>          final
<emphasize>         initial          <emphasize>        final
<dramatize>         initial          <dramatize>        final
Table 10.3. Systematic misplacement of stress by Arab speakers of
English.
There is yet another cause for the misplacement of stress in
English by Arabs that seems to be a self-inflicted phonetic sin.
The short vowels of Arabic are marked with diacritics over or
under the consonants in the following manner: <ـَ>, <ـُ> and
<ـِ>, known as fatḥa, ḍamma and kasra, which roughly designate
the quality of the short vowels [a], [u] and [i], respectively.
In most Arabic orthographic texts, these vowels are not marked.
When foreign languages are transliterated in Arabic by native
Arabs, there is a very robust tendency to replace the fatḥa,
ḍamma and kasra with their long vowel counterparts <ا>, <و> and
<ي>. Once this vocalic transformation takes place in its visual
form through publications and/or its perceptible form on radio
and television, the Arab reader of those transliterated words
pronounces them with long vowels. The result is not just a
significant change in the rendition of vowels, but also a major
shift in the assignment of the primary stress, resulting in a
distinct accent, identified elsewhere as word inflation vs. word
deflation, as demonstrated in section 8.3.2, above. This type of
asymmetrical vowel length change between the two languages in
their rendition of the same unit, and the subsequent rhythm
change, is one of the primary sources of mispronunciation and
accent among Arab speakers of English (Odisho, 2013).

The concept of articulatory settings is extremely helpful in
teaching pronunciation. Any learner of an L2 who is able to
identify and master the most salient phonetic and phonological
characteristics of the targeted language is undoubtedly on his
way to securing quality pronunciation. The success of such a
linguistic mission may often require expertise and guidance on
the part of the instructor. Instructors who do not have
professional knowledge and expertise in the science of
pronunciation may not be aware of the concept of articulatory
settings, and hence fail to highlight its pedagogical
significance.
CHAPTER 11: PRINCIPLES OF A MULTICOGNITIVE
APPROACH TO TEACHING PRONUNCIATION

Right at the outset, the reader has to bear in mind that the
cognitive and sensory principles imply multicognitive and
multisensory guidelines and considerations. The cognitive
perspective essentially implies that speech, at large, and
pronunciation, in particular, are functions of the brain, whether
in childhood or in adulthood, prior to being the responsibility
of the speech organs. At birth, the human brain is already
biologically endowed with incredibly efficient systems that have
immense power to store information in the most economical manner.
These systems, of which language is only one, are responsible for
the smooth biological and social survival of human beings.
In light of the above explanation, any normal child has a golden
opportunity to naturally, automatically and effortlessly
internalize the system of the language of the community in which
he grows up, but not necessarily in which he is born. The latter
statement implies that a child might be born in one language
community but then be raised in a different one; it is the latter
language that counts as his L1 if the exposure to the former
community ceases. This natural process is linguistically known as
acquisition. Under normal conditions, once the process of
acquisition of a language is completed, it cannot be undone and
can hardly be repeated as an adult with a second language (L2).
Only children can naturally achieve double-acquisition (grow up
as balanced bilinguals with perfect pronunciation) if the
following conditions are valid: a) ample and balanced exposure in
childhood and early adulthood to language materials in authentic
socio-cultural and discourse contexts (i.e., materials that are
context-embedded and situation-embedded); b) ample multisensory
and multicognitive immersion and rehearsal of language materials
in perception, recognition and production; and c) an integrative
approach in which all language skills and subskills are practiced
and internalized jointly. The older one grows, the more remote
the opportunity of double-acquisition becomes, specifically in
pronunciation. For a better mastery of L2 pronunciation, more
intentional and conscious effort is required. This process is
here known as learning, which is broadly in contrast with
acquisition. Consequently, adults are expected to manifest
different degrees of accent in L2 pronunciation, including the
three main domains of consonants, vowels and suprasegmental
features (stress, tone, rhythm, intonation, etc.).
Since the focus of this study is on pronunciation, any linguistic
statements made here relate to pronunciation unless stated
otherwise. Pronunciation of L1 is here identified as sets of
physical phenomena that have to be transformed into cognitive
codes stored in the long-term memory, or the so-called
subconscious brain, and retransformed into acoustic signals. If
the physical phenomena are precisely internalized and encoded in
the brain, their reproduction in the form of speech later will
manifest no traces of accent. In other words, if the person
stores the right sound input in early years, he should produce
the right sound output, entailing no accent. This is usually what
children do in their L1 or their L1s.¹ With age, adults begin to
slowly lose their adeptness in the perception of sounds that are
not part of their L1 inventory (phonology). What usually happens
with adults learning L2 is that they become vulnerable to
internalizing a sound from L2 that is not precisely the sound
they hear. Even if they attempt to transition from hearing to
listening, their listening might be biased by their own L1
inventory of sounds. This means that adults will produce what
they thought they heard or listened to, not what the actual sound
input was. Consequently, their production of an L2 sound will be
inaccurate because the perception and recognition of that sound
were inaccurate due to the bias of their L1. Stated differently,
they will have an accent.
¹ If they are amply exposed in childhood to more than one language.
It should be made clear that the distinction between children and
adults in their adeptness at language acquisition is in this book
confined to the skill of pronunciation and not necessarily to
other skills, such as lexicon, morphology and syntax. It is the
belief here that the evidence in favor of children's skill in
acquiring and mastering L2 pronunciation, as opposed to adults',
is overwhelming in real-life situations as well as in published
literature. It is, therefore, reasonable to consider children as
gifted in the acquisition of language, in general, and
pronunciation, in particular. The rapidity and apparent ease with
which children learn language is a phenomenon of childhood and
can hardly be repeated with such ease by most adults (Borden &
Harris, 1980); however, such a statement should not imply that
adults lose their ability to learn in the absolute sense of the
word. The claim that adults enter a stage of linguistic
fossilization (Selinker, 1972) is rejected in this study. The
rejection is premised on cognitive principles in conjunction with
classroom experience. Systematic multisensory and multicognitive
orientation, regardless of age and aptitude for pronunciation,
helps all learners to improve their skills to different degrees
in the proper learning of L2 pronunciation. This last statement
constitutes the basic conceptual premise on which the approach
stands.
11.2. MULTICOGNITIVE PRINCIPLES FOR TEACHING PRONUNCIATION

Since human language has been defined previously as a code of
communication that is a genetically determined cognitive
potential before being a set of physical maneuvers, it is
imperative for any instructor or learner to try to comprehend
such a definition in its applied sense. The instructor is likely
to see that adults may experience serious difficulty in producing
a new sound² to which they have never been exposed. This is a
good example of the cognitive requirement for sound production,
meaning that the brain may need enough exposure time to the new
sound to perceive and recognize it before being able to produce
it appropriately. Therefore, any instruction in pronunciation
should target the cognitive potential for perception and
recognition prior to the necessary physical maneuvers of
production. If, for instance, an adult native speaker of Spanish
is asked to pronounce the English pair <pill> = [pʰɪl] vs. <peel>
= [pʰil] and fails, after repeated modeling, to distinguish the
two vowels, replacing both with the vowel [i] and substituting
the unaspirated [p] for the aspirated [pʰ], then the whole
situation indicates that the learner is psycholinguistically
(cognitively) unable to perceive and recognize the difference
between the two English words. This is a typical condition that
is identified in this study as psycholinguistic insensitivity, a
condition that is characteristic of adults learning L2. The
sub-sections below are meant to shed more light on the
multicognitive approach to teaching pronunciation.
² For the sake of stylistic brevity, the term sound will
henceforth stand for sound segment, sound feature and sound
dynamics collectively, unless specified otherwise.
11.2.1. Think about L2 Speech Sounds

Upon planning to learn the pronunciation of an L2, an adult
should make sure not to fall victim to his own L1 phonology bias,
i.e., hearing L2 sounds through his L1 filter, which is one of
the most pervasive threats to ideal L2 pronunciation
internalization. If the learner's L1 phonology filter determines
the hearing, or even the listening, process without proper focus
and guidance, all unfamiliar sounds coming from an L2 phonology
source are likely to be filtered out and replaced with the
nearest sounds to them in L1. If, however, the learner discovers
an unfamiliar sound or is alerted to it by the instructor, the
first step is to try to perceive the sound as accurately as
possible. The learner should try to think about it in the
following terms. Is it similar to one of his native sounds? If it
is similar, but not identical, what is the sound feature that
renders the L2 sound different from the L1 sound? Personally, I
remember that when I began my so-called "ear-training" course in
my graduate studies, I encountered serious difficulty in the
perception and production of the less familiar alveolo-palatal
pair [ɕ, ʑ] as opposed to the more familiar postalveolar pair
[ʃ, ʒ]. It was persistent thinking about and focusing on the
perceptual and the articulatory/kinesthetic differences that
helped me set them apart.
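The idea of an L1 filter that replaces an unfamiliar sound with the
nearest L1 sound can be pictured as a nearest-neighbor lookup. The
Python sketch below is a deliberately simplified illustration: the
two-dimensional vowel space, its coordinates and the function name
l1_filter are all assumptions made here for the sake of the example,
not a model proposed in this book. It uses the five-vowel inventory
of Spanish and the English <pill> vs. <peel> contrast discussed
earlier.

```python
# Toy vowel space: (height, backness) on an arbitrary 0-to-1 scale.
# All coordinates are hypothetical values chosen for illustration.
SPANISH_VOWELS = {
    "i": (1.0, 0.0),
    "e": (0.5, 0.0),
    "a": (0.0, 0.5),
    "o": (0.5, 1.0),
    "u": (1.0, 1.0),
}


def l1_filter(target, inventory):
    """Return the L1 vowel closest to the incoming L2 vowel."""
    def distance(symbol):
        # Squared Euclidean distance in the toy vowel space.
        return sum((a - b) ** 2 for a, b in zip(inventory[symbol], target))
    return min(inventory, key=distance)


# Assumed positions of the two English vowels in the same toy space.
ENGLISH_PEEL_VOWEL = (1.0, 0.0)   # [i] as in <peel>
ENGLISH_PILL_VOWEL = (0.85, 0.1)  # [ɪ] as in <pill>

print(l1_filter(ENGLISH_PEEL_VOWEL, SPANISH_VOWELS))
print(l1_filter(ENGLISH_PILL_VOWEL, SPANISH_VOWELS))
```

Under these assumed coordinates both English vowels are mapped to the
same Spanish vowel, which is one way of picturing why the two words
may sound identical to an unguided adult learner.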
11.2.2. Transition from Hearing to Listening

As was pointed out earlier in Chapter 2, hearing is a sense while
listening is a skill. Simply put, learners have to be encouraged
to listen attentively to sounds and remember their acoustic
images to facilitate reproducing them correctly. The learner can
internally compare and contrast the target sound or sounds with
what is already part of his psycholinguistic inventory. With time
and practice, the learner can hone his listening skill further to
make it more discriminative in the accurate perception of the
targeted L2 sounds. The best example to cite for this type of
advice is my long struggle with the identification of the sound
[ɥ], a voiced labial-palatal approximant in the phonology of
Neo-Aramaic (Modern Assyrian). When I was preparing my doctoral
thesis, I felt that the phoneme /w/ had a variant, but I did not
realize that it was a [ɥ] because I had had no conscious exposure
to it. Simply hearing the sound and listening to it were not
enough; it was only further discriminative listening and
investigation that enabled me to pin down that variant as [ɥ].
11.2.3. Learn Something about Speech Production

An instructor or learner need not become a speech specialist or
phonetician; however, it is tremendously helpful to acquaint
oneself with the basics of sound production. For example, it is
helpful to know that the overwhelming majority of speech sounds
are produced either with vocal fold vibration or without it.
Also, speech sounds are identified by their places of
articulation, such as bilabial sounds (with two lips),
labio-dentals (lower lip and upper incisors), inter-dentals
(tongue tip at the biting edge of the incisors), etc. The example
of [ɥ] above is also relevant in this regard. The sound was
related to [w], a labial-velar approximant, but [ɥ] ended up
being a labial-palatal approximant, i.e., a labial sound but
articulated somewhat in front of the velar sounds. Additionally,
some acquaintance with the way in which sounds are generated
along the vocal tract is of significant importance. Once again, I
cite my experience with the phonetic phenomenon of retroflexion.
As a beginner in phonetic studies, not only had I not heard the
term, but I also did not know how to consciously produce a
retroflex <r>. Ironically enough, I later discovered that a
retroflex exists in a few Neo-Aramaic (Assyrian) dialects such as
Tiari and Alqosh. Once I mastered the production of a retroflex
<r>, it was an easy transition to the mastery of other retroflex
sounds. This last statement is of extreme significance in
phonetic training and education because excellence in phonetic
education is not built up sound by sound, but rather by bundles
of sounds, or, more accurately, by natural phonetic categories.
Once I learned how to tilt the tip of my tongue for the retroflex
/r/, I gradually moved into the production of the retroflex
sounds /l, n, t, d/, etc.
11.2.4. Mechanical Repetition Hardly Works with Adults' L2
Learning
The classical and old-fashioned style of teaching pronunciation
in the tradition of the Audiolingual Method,³ known as
"repeat-after-me", may be beneficial with children and
adolescents, but it is often practically ineffective, in fact
pedagogically damaging, with adults because adult learners of L2
usually and subconsciously repeat after themselves. Stated
differently, due to L1 phonology bias, they repeat the sound that
they think exists in their L1 inventory, not what they actually
hear demonstrated by the instructor. Such a practice is far more
damaging when the class as a whole is asked to repeat after the
instructor (i.e., the so-called chorus practice). In a chorus
practice, the bad practitioners are usually hidden among the good
practitioners because there is more noise in the background to
mask the targeted sound.
In the repeat-after-me procedure, the instructor's immediate
demand that the learner produce after him means that the learner
bypasses two important stages, namely, perception and recognition
of the unfamiliar L2 sound. Thus, mispronunciation is not the
fault of the learners, because skipping those two stages implies
that the learner will not be well primed for the targeted
performance.
³ Popular in the 1950s and 1960s.
11.2.5. Follow the Perceive, Recognize and Produce
Procedure
In the previous sub-section, the instructor was cautioned against
the repeat-after-me procedure; instead, it is suggested that he
should abide by the three-stage procedure of perceive, recog-
nize and produce to prepare the learner for better chances of
securing the targeted performance. In reality, any teaching of
pronunciation should thoroughly follow the natural three-stage
procedure of sound acquisition: perception, recognition and pro-
duction in the sequence indicated. The above triangular proce-
dure is highly consistent with the three-stage procedure of regis-
tration, retention and retrieval in learning and with the three
types of memories of sensory, short-term and long-term in infor-
mation storing. In each case, the earlier stage serves as the
gateway to the next and final stage. The transition to the final
stage cannot be completed without continued rehearsal.
Because the above triplets will be repeatedly used, a brief
clarification of the terminology is in order. Perception is used
to denote the condition of feeling and sensing the presence of a
given sound; recognition includes the condition of perception as
well as the condition of being able to distinguish the given
sound from others and, perhaps, identify the difference(s) in
comparative/contrastive situations. According to Parasuraman and
Beatty, and in terms of cognitive processing, the distinction
between perception and recognition appears to be the matching of
the external sensory pattern with some internal sensory engram⁴
and the bringing of this to awareness (cited in Kissin, 1986). As
a further enhancement of the above quotation, Kissin states, "The
definition of recognition as the process of matching external
perceptions against existing internal correlates, implies a
second level of activity" (ibid.). As for production, it
satisfies the above two conditions of perception and recognition
in addition to the ability to retrieve the sound and reproduce it
at will with an acceptable degree of proficiency and accuracy.
⁴ An assumed unit of sound imprinting in the memory.
The sequential triplet of learning is registration, retention and
retrieval. In standard literature on learning, registration
refers to the perception, encoding and neural representation of
stimuli at the time of an original experience; retention is the
neurological representation of an experience stored for later
use; and retrieval is access to previously registered and
retained information (Arnold, 1984; Levitt, 1981). As for
information storing in the brain, there are three different kinds
of storing systems. Sensory memory is the initial level of
information storing; information stored here is extremely limited
in volume and is retained for only a few seconds. Sensory memory
is a sort of photographic memory (Loftus, 1980). Short-term
memory is not as limited as sensory memory; it can store about
seven items, plus or minus two, for no more than half a minute or
so. Although short-term memory may be transient and limited in
capacity, it may be very useful in ear-training (auditory
orientation) sessions, where the temporary retention may allow
the learner to better perceive the sound; it may also play a
crucial role in conscious thought. In plain wording, the
half-minute or so allows the learner to think about the sound and
its production. Long-term memory is the storing system where
information is retained for a longer time and even permanently.
In terms of cognitive knowledge, the process of learning is
essentially one of transferring information from the environment
into the long-term memory. Long-term memory is, more or less, a
permanent repository of general knowledge about the world and
past memories (Bourne et al., 1986).
The explanation above suffices to show that in order to perceive
a sound, one has to be exposed to it, at least in passing,
through the sensory memory; to have it registered, at least
temporarily, it should be stored in the short-term memory;
however, in order to retrieve and produce a sound accurately at
will, it has to be retained and consolidated in the long-term
memory through rehearsal. The sequencing of stages is
significant, and bypassing a stage may negatively impact the
outcomes. For instance, with insufficient or improper exposure to
unfamiliar sounds, the learner is highly unlikely to succeed in
producing them. A serious flaw in the traditional approach to the
teaching of pronunciation is attributed to either insufficient
dwelling on the perception and recognition stages or their total
neglect. Those two conditions lead to an immediate jump to the
production stage, a condition that is typically embodied in the
repeat-after-me technique of teaching pronunciation, which may be
quite incompatible with the learning styles of adults.
The suggested triangular procedure is cognitive in nature. It
assumes that the learner may not cognitively perceive the alien
sound because it is not part of his sound inventory, and hence
fails to recognize it in preparation for the correct rendition.
With the failure to perceive the incoming sound, he is liable to
substitute one of his L1 sounds for it or, at best, produce a
sound that is not the intended one; in both cases, he misses the
targeted sound.
Only children and a few gifted people or highly trained
phoneticians can perceive, recognize and produce an alien sound
at the first attempt or after a few attempts. I remember that in
1979 I had prepared for the presentation of a paper at the 9th
International Congress of Phonetic Sciences in Copenhagen, titled
"A Voiceless Unaspirated Emphatic Alveolar Affricate".⁵ This is a
very rare sound, if not unique to one of the dialects of
Neo-Aramaic (Modern Assyrian). A few months prior to the
presentation, I met the late Peter MacCarthy, an accomplished
phonetician and Chair of the Department of Phonetics, Leeds
University, who asked me whether I was presenting at the
Copenhagen Congress. I said: "Yes." He said: "What is the topic?"
I said: "I guess a new sound." He asked for the description of
the sound. I gave him a description of it in terms of phonetic
labels as "a voiceless unaspirated emphatic alveolar affricate".
With a few attempts, he was able to produce a quite satisfactory
sample of the sound. This is what accomplished phoneticians are
able to do; however, ordinary adult learners of L2 often
encounter difficulties with unfamiliar sounds.
5 The paper was published in the Journal of the International
Phonetic Association, Vol. 9, 1979.
According to this three-step teaching strategy, the first two
steps help the learner to become cognitively familiar with the
targeted alien sound. In other words, the brain recognizes the
identity of the sound as new and different from what it has in
its L1 sound inventory (phonology). Once the brain assigns the
alien sound a different identity, the brain is ready to store
it as a cognitive entity. Stated differently, the brain
neuronizes the sound or registers it with the neurons. If the
learner expresses further interest in dealing with the sound,
the next step for the brain is to begin sending the appropriate
articulatory instructions to execute the sound. A few
trial-and-error attempts may be needed before the learner
successfully masters the production of the sound. It is quite
likely for the learner to forget the proper production of the
sound the next hour or the next day simply because the
specifications are still in the short-term memory en route to
full registration in the long-term memory. Further practice
should, however, create an autonomous slot for the new sound.
11.2.6. Instructor's Academic and Professional Qualifications

Are you, as an instructor, academically and professionally
qualified to teach pronunciation, especially when it involves a
second language and adult learners? Have you had enough
linguistic and educational orientation to tackle a technical
subject that requires ample professional preparation? Do not be
surprised at these questions that I am raising. Once upon a
time, prior to my professional orientation in phonetic science
at graduate level, I thought I was well prepared for teaching
pronunciation when in reality, as I discovered later, I was not.
I have also known instructors who taught topics in pronunciation
that were described and demonstrated wrongly. The best example I
recall was in Baghdad, when one of the instructors, holding an
M.A. in methodology from Britain, was targeting the teaching of
the Standard Arabic voiced emphatic alveolar plosive /dˤ/ while
pronouncing the voiced emphatic interdental fricative /ðˤ/
instead. I knew that he was confused because in the Iraqi
variety of Standard Arabic, the two sounds are pronounced
identically as /ðˤ/. Nevertheless, an experienced phonetician
should know the difference between the two sounds. The worst
large-scale example of the wrong approach to teaching
pronunciation, especially of English, is demonstrated by
thousands of teachers of English in the United States who still
believe in the so-called phonics approach to teaching
pronunciation. In many instances, phonics
fails all the basic principles of modern linguistics. First and
foremost, phonics fails to distinguish between letters and
sounds or between what linguistics identifies as graphemes and
phonemes. One of the worst examples of matching graphemes with
phonemes in phonics is in the phoneme identity of the letter
(grapheme) <u> in the words <up> vs. <use> (Fox and Hull, 2002),
where the former is cited as the short counterpart of the
latter. The two <u>s have hardly any qualitative basis for
comparison because in <up> the grapheme is the vowel proper [ʌ],
whereas in <use> it stands for the approximant [j] + [u].
Besides, phonics cites the vowel in words such as <fat> as a
short vowel vs. its long counterpart in <fate>. Phonetically,
the matching lacks a basis in both quality and quantity. The
vowel in <fat> is [æ], a simple vowel, while <a> in <fate> is a
diphthong [ei].
11.2.7. Plan Instructional Connection with Learners

As an educator, the instructor has to ascertain that he has estab-
lished connection with the learners and, in turn, that the learn-
ers are equally in connection with him. This is one of the most
important pedagogical factors to consider. Learning is at its best
when learners and their instructors are on the same page. The
instructor should be careful not to teach to himself, so to say,
and learners should pay attention when the instructor empha-
sizes a certain point and highlights a certain explanation or
demonstration. Every now and then the instructor should
double-check that learners are connected with him. Oftentimes,
instructors ask their students the classical question: "Do you
understand?" Some of the students confidently reply: "Yes",
while several others remain silent. It is usually among the
silent students that those who did not understand hide, some of
whom are too shy to ask for further explanation.
11.2.8. Explain, Demonstrate and Demonstrate Multisensorily

The instructor has to use the simplest possible language that is
not loaded with technical jargon. In many instances, the jargon
distracts the attention of the learner and doubles the complexity
of a certain problem. If, for example, you want to give details
about the phonetic nature of the sounds /p, t, k/ in English and
state that they are "aspirated", your statement involves a
technical term; instead, it is better to replace "aspirated"
with "a puff of air" that follows the articulation of /p, t, k/.
To make the explanation even easier to grasp, the instruc-
tor has to use as many demonstrations as possible including vis-
ual, auditory, tactile-kinesthetic, etc. These will be further elab-
orated on in the next chapter. Such multiple demonstrations
render the technical jargon and the complicated language more
comprehensible.
11.2.9. Deal with Pronunciation in a Holistic Fashion

Pronunciation is an integral part of overall human communica-
tion (Morley, 1991). There should be more consideration for
integration than isolation among the components of speech and
the dynamics of pronunciation. In other words, pronunciation is
seen not only as part of the system for expressing referential
meaning, but also as an important part of interactional dynam-
ics of the communication process (Pennington and Richards,
1986). This implies that the teaching of pronunciation should be
considered in the context of a much broader base of human
communication.
At a lower level of integration, human speech consists of a
combination of segmental and suprasegmental (prosodic) fea-
tures. Both of them should be handled inseparably from the
overall articulatory, visual, auditory and tactile/kinesthetic fea-
tures accompanying speech production. The latter sets of fea-
tures form the basis of what is differently labeled as articulatory
settings (Honikman, 1964) or phonetic settings (Laver, 1980;
1994), among others. No cross-language teaching of
pronunciation will be authentic and dynamic in nature, or a
reflection of the native speaker's proficiency, without serious
consideration of the articulatory settings of the targeted
language. For instance, a language like Arabic, with a limited
vowel system and a heavy dependence on guttural sounds and
emphatics, has such specific articulatory settings that without
their incorporation in the overall approach to the learning of
Arabic by non-native speakers, and to the learning of foreign
languages by Arabs, the results will be highly unsatisfactory.
11.2.10. Consider both Top-Down and Bottom-Up Perspectives

Pronunciation is a dynamic cognitive and physical process. It
should be taught in a dynamic way with a bottom-up approach
(i.e. from smaller to larger units: distinctive features to segments
to the prosodics of the syllable, word, sentence and discourse) in
conjunction with a top-down approach that reverses the order of
processing from discourse down to distinctive features. Stated
differently, teaching pronunciation is like two-way traffic in
which both directions of movement are needed in order to com-
plete the cycle of communication. Traditionally, pronunciation
has been taught in a bottom-up approach with emphasis on
vowels and consonants often lacking proper contextualization
and embedding in longer meaningful stretches of speech. Re-
cently, there has been a twofold emphasis in teaching pronunci-
ation; firstly, an intra-segmental emphasis with attention on dis-
tinctive features (i.e. more microscopic perspective); secondly,
an inter-segmental emphasis with attention on prosodic features
and the overall features of the articulatory setting (i.e. more
macroscopic perspective) (Pennington and Richards, 1986).
11.2.11. Do not Confuse Memorization with Retention

In teaching pronunciation, distinction should be drawn between
memorization and retention. Although meaningful memorization
is more effective than rote memorization, memorization per se is
only one of the many cognitive processes that help with
knowledge acquisition, retention and retrieval. Retention of
knowledge and perfection of skills can be achieved by means
other than sheer memorization. Association, categorization,
analysis, synthesis, etc., can be highly effective means of
knowledge retention. To cite an example of retention through
association, let us take the subskill of spelling. On several
occasions, I have taken my college-bound students by surprise
and asked them to spell (graphic/written spelling, not oral) a
word such as "accommodation". Frequently, the misspelling rate
has been over 70% and occasionally as high as 90%, not because
<accommodation> is a very difficult word to spell, but because
our approach to teaching spelling has often been, and still is,
based on rote memorization with minimum cognitive processing.
There is certainly ample room for a cognitive approach to the
teaching of all language skills and subskills. In this instance,
the teaching of the spelling of <accommodation> should be
approached from a different perspective. In the first place, the
word is not difficult to spell in its entirety. The misspelling
is almost always associated with one or two segments only, not
the word as a whole. With <accommodation>, the focus should be
on the doubled letters, i.e. <cc> and <mm>. Thus, if the nature
of the association is clarified and cognitively highlighted,
there will be no misspelling. My students have always had
instant perfect success with the spelling of this word when they
were instructed to think of the need for "two doubles" or to
associate the spelling with the phrase "accommodation for two".
Teaching pronunciation to adults through memorization in
the form of mechanical repetition becomes a highly challenging
task because sound features and segments are meaningless in
themselves. They are simply acoustic signals that impact the ear.
Consequently, their retention is difficult if memorization is the
sole channel of input. Definitely, more channels of input are
needed for better and more permanent retention. This is why
this proposed approach calls for the joint involvement of as
many sensory and cognitive channels of input as possible. Let us
cite the following teaching scenario. Suppose the teacher
intends to teach Hispanic learners of English the difference
between the English and Spanish plosives /p, t, k/. They are
aspirated in English [pʰ tʰ kʰ], but unaspirated in Spanish
[p t k].
If you are the instructor, after a very brief explanation, take
a flimsy piece of paper and place it just under your nose,
pressing on it with your left index finger. Pronounce a typical
aspirated sound, e.g. [pʰ], and do the same with an unaspirated
sound [p]. Direct the attention of the learners to the
fluttering movement of the paper with the aspirated articulation
and the absence of the movement with the unaspirated
articulation. This fluttering movement is the direct result of a
tangible puff-of-air that accompanies the production of
aspiration and its absence with nonaspiration. The puff-of-air
is the ideal demonstration to reinforce the cognitive retention
of the difference between the two types of plosives. Obviously,
the auditory difference between the two sounds is additional
reinforcement of the phonetic difference.
11.2.12. Deal with Pronunciation as a Generative Skill

The generative nature of this approach implies that mastering
the perception, recognition and production of one sound should
facilitate the mastery of more than that one sound. In other
words, developing a skill in one aspect/domain of pronunciation
should serve as a key to enhance or generate the skill to master
other aspects/domains of pronunciation. For instance, in
English, mastering the production of the schwa vowel [ə] not
only helps with the mastery of the complicated vowel system, but
also considerably facilitates the process of vowel reduction
and the overall rhythmic performance. Also, learning how to
kinesthetically and proprioceptively sense the upper incisors
contacting the lower lip for a [v] or [f], or the tongue tip
contacting the alveolar ridge for the articulation of a [d] or
[t], should develop the skill of sensing any other contact of
the tongue in the oral cavity. Even in the dynamics of sound
production, mastering accentuation in a given word should carry
over to other words and to the overall mastery of rhythm in the
targeted language or any other language for that matter.

The above principles will be elaborated on through further ex-
planation or application in the course of developing the imple-
mentational techniques; however, it is worth noting that the
most important theoretical aspect of this approach is that the
brain is the seat of all language skills prior to becoming physical
manifestations in the form of speech. This approach is described
as multicognitive and multisensory because its implementational
techniques are directly designed on the basis of a wide variety
of cognitive and sensory processes. Based on the approach
presented here, teaching pronunciation becomes more of a
multi-faceted educational process than a mere repeat-after-me
mechanical reproduction of speech sounds. Such an approach
requires more effort on the part of the instructor and learner
and a stronger collaboration between them through the
diversification of teaching and learning styles, respectively.
It is certainly time-consuming, but the outcome is worth the
time spent. This approach is no longer a single technique or
drill that tackles one sound at a time; instead, it is a joint
selection of cognitive and sensory techniques that are applied
concurrently to facilitate L2 mastery in a creative and
generative manner similar to the process of child language
acquisition.
CHAPTER 12: PRINCIPLES OF MULTISENSORY
APPROACH TO TEACHING PRONUNCIATION

The previous chapter emphasized the significance of the
cognitive foundation of pronunciation and the multicognitive
principles of teaching and learning it. In this chapter, the
emphasis shifts to the sensory support of the cognitive
foundation of pronunciation via the multisensory channels. In
other words, it is
the sensory modalities that jointly reinforce the foundation for
the cognitive modalities to kick in. Once internalized, speech is
used as a tool for social and physical existence and survival.
The relationship of the brain and the five senses is one of
mutual dependency and co-existence; one cannot function and
survive without the other. The power and creativity of the brain
are nullified without the feeding of information for processing
through the five senses: visual, auditory, olfactory, tactile,
and gustatory. To state it more dramatically, the senses are the
five windows of the brain to the outside world; if they are
closed, the brain will wither in darkness. In turn, the senses
will be redundant organs if the brain does not process the
information they feed and transmit it back for execution.
As far as the acquisition of L1 by a child is concerned, it
is a natural process for which the brain of the child is tuned. The
acquisition progresses very smoothly, effortlessly and subcon-
sciously as long as the child is immersed in the language. The
brilliant success of the process of language acquisition is prem-
ised on two main pillars: first, total immersion in the language
and exposure to a variety of multisensory stimuli; second,
transmission of such stimuli to the brain for processing and de-
cision making and then dispatching the decisions back for im-
plementation.
12.2. MULTISENSORY PRINCIPLES FOR TEACHING PRONUNCIATION

When the human brain is in the process of decision-making, it
does not just depend on one sensory source; rather, it manipu-
lates all the senses to gather as much information as possible.
The analogy of "all roads lead to Rome" applies here because
all the senses meet in the brain; besides, receiving data from
different sensory channels provides the brain with a far more
accurate, diversified and comprehensive assessment of the prob-
lem to be solved. In solving a linguistic problem, the brain takes
in the needed information, sifts through it and then processes it
to arrive at a gestalt solution. According to the approach sug-
gested here, of all the five senses only three are indispensable,
namely, the auditory, visual and tactile senses. They are the fo-
cus of the next sections.
12.2.1. Auditory Modality

Traditionally, the auditory modality has received priority in
teaching pronunciation. No doubt, this is understandable because
it is the primary sense of acoustic intake. Nevertheless, in
real-life situations, especially in child language acquisition,
speech is not the exclusive function of the auditory sense; ra-
ther, it is the collective function of the input from the auditory,
visual and tactile senses. It should, however, be pointed out
briefly that the tactile sense covers all the kinesthetic and pro-
prioceptive sensations that are transmitted to the brain via mus-
cular innervations.
In adult L2 teaching classes, the instructor should not take
for granted that the auditory modality will do the job of
teaching sounds that are alien to L1. As discussed earlier, the
phonology of L1 will mask many of the L2 sounds and hence filter
them out. In order to ascertain that the targeted L2 sound is
really perceived and recognized by the learners, the instructor
should lead the learners step by step towards the targeted
sounds using different exercises. At times, the auditory
modality alone fails to do the job; therefore, it should be
supplemented by other modalities. Let us take the case of /b/
vs. /v/ for Hispanic learners of English. The substitution of
/b/ for /v/ is one of the most salient accent indicators among
Hispanics. Experience has shown that when handling this
phonological confusion
with dependence on the auditory modality alone, the results have
been dismal. For much better results, the auditory modality has
to be supplemented with other sensory modalities, especially the
visual one. The following step-by-step procedure, which abides
by the "perceive, recognize, produce" strategy, is recommended
to overcome the difficulty.
1) Model the two sounds as many times as you deem necessary,
asking learners to pay attention. Without attention to the
targeted sound, as they say in some cultures, "it enters this
ear and goes out of the other one."
2) Ask if any learners are willing to demonstrate them for the
class. This will serve two purposes. First, it helps discover
the learners who are more gifted for sound perception,
recognition and production so that they can be used as models.
Second, peer demonstrations may encourage other learners to
pitch in.
3) Prepare a couple of exercises to encourage learners to
distinguish and recognize the two sounds in random
demonstrations. Make sure that the number of learners who manage
to perceive the two sounds and distinguish them is increasing
gradually. If the instructor feels that the number of those who
failed the initial test of perception and recognition is
sizeable, he should repeat the experience of perception and
recognition with the help of the visual and tactile modalities.
4) Model the pair a few times, asking learners to watch your
facial gestures during the demonstration of /b/ and then /v/.
Typically, emphasize the two lips closing together tightly for
/b/ and the upper incisors touching the lower lip for /v/. Ask
the class to perform the articulatory gesture for both /b/ and
/v/ separately and repeat the gesture several times for each
one.
5) Ask for volunteers to come before the class and demonstrate
the articulation of the bilabial (two lips) for /b/ and the
labiodental (lower lip and upper incisors) for /v/. Then ask all
learners to perform the articulatory posture while watching
them.
6) Place the learners in pairs facing each other and taking
turns in performing a /b/ articulation (bring the two lips
together) followed by a /v/ articulation (bring the upper
incisors in touch with the lower lip). Move around the class to
observe the performance.
All the above demonstrations and exercises will collectively
send auditory, visual and tactile-kinesthetic messages to the
brain for consideration and registration in the memory. At the
end of the above multisensory input, the brain will be more
prepared to cognitively recognize the two sounds and produce
them successfully. Obviously, the first stage of cognitive
retention will be in the short-term memory; hence, it is not
uncommon for some learners to lose the cognitive impression of
the two sounds. This means that some of the exercises have to be
repeated in the next sessions until the brain transfers the /b/
and /v/ articulatory impressions from the short-term memory to
the long-term memory en route to the subconscious. Once the /v/
sound is perceived, recognized and produced, the brain begins to
make all the preparations to register the sound in a slot that
is cognitively separate from /b/. This is how the /v/ phoneme
becomes part of the learners' enriched phonology.
12.2.2. Visual Modality

The common impression among people is that speech sounds are
"heard". They rarely envisage "seeing" the sounds; yet there are
several speech sounds, including consonants and vowels as well
as other sound phenomena such as stress and pitch, whose
articulatory organs, or the maneuvers that generate them, lend
themselves to vision. As highlighted earlier on, pronunciation
is not exclusively an audio-lingual activity because it depends
on a much broader base of sensory and physical activities.
Hence, serious consideration should be given to non-verbal
gestures, including facial and body gestures, that are
intertwined with the overall dynamics of speech production. In
other words, in teaching pronunciation it is not enough to hear
the sounds; it is equally important to see and feel them.
In light of such a broad definition of the sensory features
involved in human speech production, a certain category of
sounds has been identified as visible, including consonantal
sounds such as the bilabials, labiodentals and interdentals.
Sounds produced inside the oral cavity beginning with the
alveolar
ridge will decrease in visibility with further movement
backwards in the direction of the larynx. Features of sound
production such as lip configurations (lip spreading, protrusion
and rounding) and jaw depression and elevation are also fairly
visible. It is also possible to visually detect some facial and
bodily gestures indicating some features of tense and lax sounds,
especially with vowels. It is worthwhile reiterating that the
visual modality does not work separately; in many instances, the
visual sensations are accompanied by
tactile/kinesthetic/proprioceptive sensations.
Let us cite some examples to demonstrate the assistance
that the visual modality affords in teaching certain vowel and
consonant sounds. One way to distinguish the German vowel [u]
as in <sputen> (to hurry) as opposed to [y] as in <hüten> (to
guard) is by the degree of lip-rounding and lip-protrusion; they
are more visible with the latter. Another good example for
demonstrating the effectiveness of the visual modality to distin-
guish between sounds and master their pronunciation is through
the teaching of aspirated vs. unaspirated sounds. Take a flimsy
piece of paper and place it just under your nose, pressing on
it with your index finger so that it covers your lips. Pronounce
a typical aspirated sound, e.g. [pʰ], and do the same with an
unaspirated sound [p]. Direct the attention of the learners to
the fluttering movement of the paper with the aspirated
articulation and the absence of the movement with the
unaspirated articulation. This fluttering movement is the direct
result of a tangible puff-of-air that accompanies the production
of aspiration and its absence with nonaspiration. One can also
see the difference by placing some talcum powder on the palm of
the hand in front of the mouth while pronouncing the aspirated
and the unaspirated sounds consecutively. The powder will spread
with the aspirated consonant but not with the unaspirated one.
One can equally distinctly see the flame of a burning candle
flutter with aspiration and remain steady without it. Even
kinesthetically, one can feel the aspiration if the hand or
fingers are placed in front of the lips while producing
aspirated sounds such as [pʰ, tʰ, kʰ].
12.2.3. Tactile, Kinesthetic, Proprioceptive Modalities

Although "touch" is a common word and "the sense of touch" an
equally common label, "tactile sense" is the somewhat more
technical term. For the purpose of teaching a refined level of
pronunciation, especially for adults embarking on an L2, the
more technical terms that arise from different types of touching
are "kinesthetic" and "proprioceptive". Obviously, the purpose
is not to teach those adults such technical jargon; rather, it
is to enable them to feel the sensations that the two terms
invoke. Kinesthesia is the sense that detects bodily position,
weight, or movement of the muscles, tendons, and joints, whereas
proprioception is the ability to sense the position, location,
orientation and movement of the body and its parts. Let us
consider the detection of the difference in kinesthetic
sensation between the articulations of a voiceless aspirated
velar plosive [kʰ] vs. a voiceless unaspirated uvular plosive
[q]. Place your index and middle fingers jointly on the thyroid
cartilage (the projection in front of your neck known as the
Adam's apple) and do the [kʰ] sound a couple of times followed
by [q]. Certainly, you will feel (as well as see) greater upward
movement of the thyroid cartilage with [q] than with [kʰ] simply
because [q] is a uvular sound that requires the whole laryngeal
structure to rise and execute a complete contact with the uvula.
One can also visually detect the relatively greater movement
with [q].
The movements of different parts of the speech production
mechanism and the vibrations that may accompany them can all be
picked up by proprioception. The vibrations created by the vocal
folds travel along the bones, cartilages, and muscles of the
neck, head, and upper chest, causing them to vibrate (McKinney,
2005; Daniloff, 1973). A brilliant historical example, from
thirteen centuries ago, of telling the voiceless consonants from
the voiced ones, even prior to any awareness of the existence
and role of the vocal folds in this regard, is Sībawayhi's1
choice of the attributes of voiceless (mahmūsa) and voiced
(majhūra), based simply on his impressionistic sensations
detected by proprioceptive feedback channels (Odisho, 1988;
2010).
1 Renowned grammarian of the Arabic language during the 8th
century A.D.
12.3. DEVELOPING TEACHING AND LEARNING STRATEGIES

After a lengthy discussion of the modalities of teaching and
learning, the natural progression is into the domain of teaching and
learning strategies to implement the modalities. Two points are
important to highlight in this regard. First, as long as the general
approach to teaching pronunciation is premised on multiple
cognitive and sensory modalities it is quite natural for the teach-
ing and learning strategies to follow the same pattern: multiple
cognitive teaching strategies and multiple learning strategies.
Second, there should be some degree of matching or reciprocity
between instructors' teaching strategies and students' learning
strategies. Stated differently, diverse teaching strategies should
simultaneously promote diverse learning strategies that serve
the same goals.
The theory of multiple intelligences with its, thus far, nine
(9) intelligences affords both instructors and learners a
different perspective for teaching and learning. For instructors
to know their students by name is a very commendable skill for
class management, but knowing their strengths and weaknesses and
their learning styles is even more commendable for their
academic success. Knowledge of this personal aspect of learners'
strengths and weaknesses allows the instructor to individualize
his instruction when necessary, which will be beneficial to both
sides. At least the instructor will know what learning style
appeals to a particular intelligence that a given learner
demonstrates.
12.3.1. Developing Teaching Strategies

A successful instructor, regardless of the subject he is
teaching, is the one who prepares the entire class to listen and
think: to listen when there is a need for listening and to think
critically when attempting to solve a problem. Some learners, by
nature and home and/or family culture, are good listeners and
thinkers, whereas others require some guidance and orientation.
Part of the educational responsibility of the instructor is to
improve the thinking and listening habits of all. The thinking
habits should focus on the assessment of the complexity of the
problem, identification of the most relevant sensory/cognitive
modalities and the assignment of roles in implementation between
the instructor and the learners. As for teaching discriminative
listening, the instructor should thoroughly demonstrate the
effectiveness of moving the learners from hearing to listening
and finally to discriminative listening.
12.3.1.1. Assess the Degree of the Complexity of the Posed Problem

It is true that any non-native sound or sound phenomenon may
pose a certain degree of difficulty for the learners; it is,
however, equally true that some sounds or sound phenomena are
relatively more difficult to master than others. In general,
teaching consonants may be more readily manageable than teaching
vowels simply because most consonants tend to have a
better-defined articulatory posture than vowels. However, even
among the consonants, the teaching of some seems to be more
straightforward than that of others. In fact, the so-called
visible consonants (e.g., bilabials, labiodentals and
interdentals) are the easiest of all sounds to teach.
12.3.1.2. Identify the Most Convenient Sensory/Cognitive Modality

After deciding on the pronunciation problem to be tackled, one
has to identify the most convenient sensory/cognitive modalities
to be used. To illustrate this point, let us remind ourselves of
the teaching of [v] to Hispanics, which constitutes one of the
most characteristic examples of their pronunciation
difficulties. In spite of this fact, teaching [v] vs. [b], in my
case as an instructor, has been the easiest task because of
their visible articulatory postures and their readily detectable
kinesthetic and proprioceptive sensations. Cognitively, in
teaching the visible sounds, and this pair in particular, the
instructions are more readily comprehended and implemented
because of the multiple channels of sensory feedback that the
brain receives.
12.3.1.3. Assign Roles for Instructor and Learner

Once the problem is posed and the sensory and cognitive modal-
ities and strategies are selected, the instructor has to strategize
his role and that of the learners. Foremost of the things to be
decided is that the instructions given should be as easy to
comprehend and implement as possible. For example, if the
instructor aims at teaching Hispanic learners of English that [z]
involves vocal fold vibration whereas [s] does not, he should give
learners the instructions to detect the difference as demonstrated
in section 6.1.7, above.
12.3.1.4. Highlight Discriminative Listening

The progression in the direction of teaching discriminative
listening should be premised on teaching listening, which is a
skill, as opposed to simply hearing, which is a sense. Let us
consider the following example for teaching discriminative
listening. In English, the plosives /p, t, k/ are usually
aspirated, particularly in initial position. Thus, the aspirated
production of such plosives is cognitively the model for the
native speaker of English. This implies that the unaspirated
production of such plosives is cognitively unrecognized, hence
difficult to perceive, recognize and produce. In teaching
unaspirated plosives to native speakers of English learning
Spanish, for example, it is recommended not to give them Spanish
words such as <pero> (but) or <perro> (dog) directly; rather,
stay with native English words in which /p, t, k/ follow an /s/
sound, as in <pit> vs. <spit>. The phonetic rule of pronunciation
in English is that when /p, t, k/ follow /s/ in a consonant
cluster, the three plosives are deaspirated (they lose their
aspirated feature and become typically unaspirated sounds).
Therefore, the instructor can use the pair <pit> and <spit> to
familiarize learners with the two sounds in perception and
recognition following the steps below:
a) Demonstrate the pair <pit> vs. <spit> several times and
highlight the difference to be picked up auditorily. It is quite
acceptable to exaggerate the difference somewhat for the sake of
clarity.
b) Ask for volunteers from among the learners to do what you did.
c) If you notice that learners are having difficulty or even doubt
in telling the difference, move to the next step.
d) Move to the visual sensory modality. Place a flimsy piece of
paper in front of your mouth and repeat the demonstration. The
paper should flutter visibly with <pit>, but will hardly flutter
in the case of <spit>.
e) Ask for volunteers to do what you did.
f) Arrange the class in pairs, the members of which face each
other while conducting the flimsy paper experiment.
After such joint demonstrations, some of the learners should be
able to perceive and recognize the difference because of the
combined auditory and visual modalities of instruction. It is
after this phase of orientation that the instructor can begin the
production phase with phonetic materials from the targeted
language such as [p:r] (odd numbers) vs. [p:r] (lambs) in the
Assyrian language. One can take the exercises one step further by
involving the other two plosives: [tʰ] vs. [t] and [kʰ] vs. [k].
In summary, the above orientation began with listening to
the targeted phonetic phenomenon. The auditory modality was
supported by visual modality to help in two ways. First, elevate
natural listening to discriminative listening. Second, use both
listening and discriminative listening to activate the cognitive
involvement of the brain in the process of internalizing the pho-
netic difference.
12.3.2. Developing Learning Strategies

Learning strategies are as important as teaching strategies. When
the instructor dominates the classroom situation and reduces the
learners to mere listeners with minimum interaction, this is an
old-fashioned teaching strategy. There is always a wide variety
of learning styles that should be considered and encouraged, some
of which will be considered below.
12.3.2.1. Discover Learning Strategies

When it comes to promoting learning strategies, the instructor
has to discover the learning styles of as many learners as possi-
ble. To achieve this, he has to be very alert and observant. The
discovery procedure usually takes place when the instructor
presents one targeted problem with diverse auditory, visual and
tactile-kinesthetic modalities and styles and monitors the
reaction of students to each modality of the presentation. Once
the instructor senses that a given learner responds to a given
sensory modality, he should reinforce that in future attempts.
12.3.2.2. Consider Cultural Diversity of Learners

Foremost of the requirements for teaching pronunciation is the
audible oral exercise that usually requires the presence of one or
more persons. This face-to-face scenario can become intimidat-
ing, especially with some learners whose background culture is
conservative and sensitive to errors and failures in public and in
the presence of other learners, especially those who are
strangers to them. In fact, this is even more relevant to female
learners of those societies. The instructor should carefully con-
sider the psychological and cultural differences between learners
and encourage all of them to participate in class activities. This
participation cannot be achieved without the promotion of mutual
trust and respect for each other's cultural traditions as well as
the tolerance of errors in early performance trials.
12.3.2.3. Encourage a Relaxed Attitude among Learners

Some aspects of teaching pronunciation can be intimidating as
well as humorous. This is especially true with some very unfa-
miliar sounds. To have fun while experimenting with sounds
should be part of the classroom culture of teaching pronuncia-
tion provided it is controlled fun lest it should overflow and in-
terfere with class management. Learners should be instructed
that failures and mistakes are part of the learning process. All
learners should be encouraged to build up mutual trust between
themselves. The more the interaction grows between the learners,
the greater the discovery of each other's learning styles.
12.3.2.4. Move between Individual and Group Learning Styles

It was pointed out previously that all learners have their own
personal strategies of learning. In a classroom environment,
however, personal strategies should be complemented with
group strategies. It is natural to have group strategies because
teaching and learning take place collectively in a classroom
setting. One way of promoting alternative strategies to one's
individual style is through group and cooperative learning. This
requires forming groups or teams led by facilitators from amongst
the learners. All of this should be done under the guidance and
supervision of the instructor.

In any successful classroom environment, there should be a bal-
ance between teaching and learning strategies because they are
complementary in the mission of educating learners. Further-
more, a successful classroom environment should afford a three-
way learning process: a) Learner from instructor; b) Learner
from learner; and c) Instructor from learner. The first two learn-
ing styles are self-explanatory, whereas the third needs some
further elaboration. With regard to the latter style, I am reflect-
ing on my personal classroom experience which has been the
richest source of gaining hands-on experience in instruction.
There have been repeated instances when I corrected myself or
improved my instruction because of direct or indirect feedback
from learners. Let me cite the following example, which is not
related to pronunciation but has a genuine linguistic connection.
In one English language class, I was teaching spelling
through a multisensory and multicognitive approach. I selected
the word <grammar> as an example of a commonly misspelled
word, often written as <grammer> or <gramar>. I brought the
following facts to the attention of the learners and suggested
the following mnemonics:
Remember 7 letters = <grammar>;
Remember 1 single letter plus 3 pairs = <g> + 2r + 2a + 2m = <grammar>;
Remember mirror image: ram + mar;
Remember ram
At this stage I had exhausted my sensory and cognitive mnemonic
hints that help with the retention of the correct spelling of the
word <grammar>. All of a sudden, one of the learners, an artist,
asked to go to the board to demonstrate an addition. He quickly
sketched two rams facing each other as if preparing for a
headbutt. His quick sketch was meant to reflect the mirror image
of ram and mar in <grammar>. It was an excellent and creative
idea coming from an artist. I learned from it and added the
head-butting rams to my sketches.
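The letter-count and mirror-image mnemonics above can be checked mechanically. The following is only a minimal sketch in Python; the variable names are illustrative assumptions and are not part of the classroom procedure.

```python
from collections import Counter

word = "grammar"
counts = Counter(word)

# Mnemonic 1: the word has seven letters.
length = len(word)

# Mnemonic 2: one single letter plus three pairs (g + 2r + 2a + 2m).
singles = sorted(c for c, n in counts.items() if n == 1)
pairs = sorted(c for c, n in counts.items() if n == 2)

# Mnemonic 3: after the initial g, "ram" is followed by its mirror image "mar".
mirror_ok = word == "g" + "ram" + "ram"[::-1]

print(length, singles, pairs, mirror_ok)
```

The misspellings <grammer> and <gramar> each fail at least one of these checks, which is what makes the mnemonics useful.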
CHAPTER 13: EXEMPLARY APPLICATIONS OF
ACCENT REMEDIATION TECHNIQUES

In this chapter, we will try to apply the approach detailed in the
previous chapters with as many cognitive, sensory and educa-
tional teaching and learning strategies as relevant to the sound
or sound phenomenon selected for elaboration. To be sure, every
sound or sound phenomenon can be a source of difficulty for
some learners of a given L2. Consequently, there has to be a
limited selection of cross-language problems that large numbers
of L2 learners encounter. Due to the international popularity of
English as an L2, the application of the strategies tends to lean
strongly in that direction with a reversal of roles, i.e., with na-
tive speakers of English tackling other languages.
In many instances, teaching a given sound, whether a vowel
or a consonant, implies teaching a natural class of sounds. For
example, teaching an aspirated/unaspirated consonant pair
implies teaching an aspirated vs. unaspirated class of consonants,
usually /pʰ tʰ kʰ/ vs. /p t k/; nevertheless, the two classes can
involve other plosives such as palatals as well as affricates,
etc., as is the case in Modern Assyrian (Neo-Aramaic). It is
relevant to point out that some selected sounds, such as /θ, ð/,
may be difficult for many L2 learners with a wide variety of
linguistic backgrounds, whereas other sounds may be difficult for
learners with more specific linguistic backgrounds, such as the
distinction of /b/ vs. /v/ for Hispanic and Filipino learners of
English.
13.2. TECHNIQUES FOR TEACHING SELECTED CONSONANTS

The consonants or consonantal phenomena selected for
demonstration include the labial-dentals, especially /v, ʋ/; the
interdental pair /θ, ð/; retroflex sounds for learners of
subcontinental Indian languages and the reversal of retroflexion
for most subcontinental Indians learning L2s.
It is important to bring to the attention of the instructor
and learner that this section will contain sets of strategies that
are procedurally, though not specifically, typical of teaching
most sounds according to a multicognitive and multisensory
approach. This means that not all sounds lend themselves to the
same sensory modalities and teaching techniques; however, all
difficult sounds require some sort of cognitive orientation. Con-
sequently, most of the details expounded in this section will not
be reproduced for all sounds taught.
13.3. TECHNIQUES FOR TEACHING LABIAL-DENTAL SOUNDS

The voiced labial-dental fricative [v] and the voiced labial-dental
approximant [ʋ] are more marked (less common) sounds than
the voiced bilabial plosive [b] and the labial-velar approximant
[w]. However, because of the generic labial nature of all those
sounds as well as the absence or presence of some of them in
given languages, any instructor of English as L2 will come across
scores of learners who will demonstrate serious difficulties in
pronunciation amounting, at times, to phonological accent while
others will indicate at least a certain degree of phonetic accent.
For instance, Hispanics typically replace [v] with [b], whereas
Persians, Turks and Assyrians, among others, replace [v] with
either a [w] or a [ʋ]. I have an Assyrian friend from Iran who
pronounces <vote> and <Harvard> as <Wote> and <Harward>.
In some other Assyrian dialects the [w] may be replaced with a
labial-palatal approximant [ɥ]. If one works with Hispanic
students, he will readily notice that the mispronunciation of [v]
is pervasive even among some individuals whose proficiency in
English is otherwise excellent. Obviously, the primary reason is
the absence of a /v/ sound in the phonology of Spanish.
Pedagogically, the persistence of the problem even with
well-educated Hispanics with high competency in English may be
attributed to two factors: a) the mispronunciation has not
received much attention from instructors at an early stage of
learning; b) the instructors did not follow effective techniques
in teaching it. To put it more bluntly, the instructors did not
have the know-how for effective remediation of learners'
mispronunciations. Most probably, if they addressed the problem
at all, they followed the mechanical procedure of
"repeat-after-me", which is often less effective with adults due
to the psycholinguistic insensitivity or deafness explained in
earlier chapters.
In what follows, some strategies are put forth to develop an
effective set of procedures to overcome the problem of /b/ vs.
/v/. The strategies typically reflect different cognitive and sen-
sory modalities.
a) Cognitive Orientation: Prepare the learners mentally
(cognitively) to recognize the existence of the problem and its
seriousness because it leads to serious phonetic and phonologi-
cal accent. The preparation requires the following steps:
1) Instruct them to be ready to accept the problem and be
willing to pay utmost attention.
2) Tell them they will certainly manage the pronunciation.
3) Tell them to watch your facial gestures, especially those
of the mouth, and recognize the difference in the pronunciation
of [b] vs. [v]. In fact, to dramatize the postural difference in
the articulation of the two sounds, you may call the [v] posture
a "dogface" because when one assumes the posture, one looks like
an angry dog ready to bite. In contrast, you may call the [b]
posture a "tight-lip face" since the lips have to come together
tightly for the sound. The dramatization of the articulatory
facial postures for the sounds oftentimes functions as a
humorous, albeit robust and concrete, mnemonic to remind the
learners of the required articulatory differences.
4) Demonstrate the pronunciation of the sounds in selected
minimal pairs of words for which the difference in meaning
is easily noticeable and, perhaps, even funny or embarrassing,
such as <vowel> vs. <bowel>; <vote> vs. <boat>
or <valet> vs. <ballet> (Figure 13.1).
Figure 13.1. Notice the difference when /v/ is replaced with /b/.
5) Use colors and pictures or any other audio-visuals to
highlight the difference that results from substituting one sound
for the other.
6) Ask learners to watch carefully your facial gestures,
especially your mouth and lips, while you slowly and distinctly
demonstrate the production of the two sounds. To put it
differently, ask them to watch the "dogface" posture for [v] and
the "tight-lip face" posture for [b].
7) While you do all the above, carefully watch the facial
gestures of the learners. If you notice that learners' faces seem
attentive and serious, you can be reasonably sure that the
learners are in a mode of thinking. In other words, they are
trying to cognitively grasp the difference between the two
sounds.
b) Auditory Orientation: Go back to the minimal pairs,
number each member of the pair as #1 and #2, then produce
each member of the pair and ask learners to identify the word as
#1 or #2. Do this demonstration with your mouth covered with
a piece of carton to prevent lip reading and easy guessing. An-
other major difference between the two sounds is that [v] being
a fricative sound is sustainable (can be prolonged), while [b]
being a stop is unsustainable (cannot be prolonged). If some
learners still experience some difficulty in perceiving and recog-
nizing the difference between the sounds, then go to the next
step.
c) Visual Orientation: Remove the carton and pronounce
the two sounds quite consciously while exaggerating the bilabial
(upper and lower lips, or the so-called "tight-lip face") posture
for [b] and the labial-dental (lower lip and upper teeth, or the
so-called "dogface") posture for [v]. Put the learners in pairs
facing each other and ask each member of the pair to perform the
articulatory postures for the two sounds while the other learner
observes. Allow them to take turns in this performance.
d) Kinesthetic/Proprioceptive Orientation: Ask the
learners to carefully watch your demonstration of the two
sounds with distinct performance of their articulatory postures.
Stick with one of the sounds and repeat its articulatory posture
while saying the name of the sound (i.e., its letter-name), then
do the same with the other one. In other words, pronounce the
sound [v] or [b] several times followed by <Vee, Vee, Vee, Vee,
etc.> or <Bee, Bee, Bee, Bee, etc.>. Ask them to imitate what
you have been doing, with emphasis on the need to develop a
kinesthetic (tactile) and proprioceptive (inner) sensing of
the articulatory contacts made for [v] and [b]. To rephrase the
latter statement, learners must be asked to sense the contact of
the two lips for [b] and the contact of the upper teeth and the
lower lip for [v].
e) Cognitive Reinforcement and Internalization: The
initial cognitive orientation is considerably reinforced by the
three sensory modalities that follow it: auditory, visual and
kinesthetic/proprioceptive. The activity and performance
conducted via each sensory modality plays a role in the joint
reinforcement of the articulatory difference between the two
sounds. Once the brain receives the input through each sensory
modality, it begins to process it and develop the impressions
required for the neuronization (i.e., imprinting in neurons) of
the two sounds as two different entities.
f) Follow-up Procedures:
1) Obviously, the manner in which human memory functions
should be taken into consideration. To put it differently,
human internalization of an impression may last for just a
short time or for a long time. Those different durations were
referred to earlier as sensory memory, short-term memory and
long-term memory.
2) It is quite natural and normal for the learner to be able to
correctly articulate the targeted sound, but then forget it in a
split second. This situation may be highly indicative of the
behavior of a sensory memory mode (i.e., perceive, produce
and forget).
3) The above situation certainly tells you that more rehearsal
is needed, which is often the case.
4) Return to more auditory, visual and
kinesthetic/proprioceptive rehearsal to at least transfer the
impression of the sound into short-term memory, during which
it undergoes a remember-forget-remember process. This is an
important stage in the learning/acquisition of the targeted
sound because it most likely indicates the initial stages in
the cognitive internalization of the sound. In other words,
the learner is actively engaged in a conscious and cognitive
processing of the sound.
5) If, after an active day of remember-forget-remember, the
learner comes back the next day and has forgotten the targeted
pronunciation, do not panic. All that the learner needs is a
refreshment of memory.
6) Once you notice that the learner produces the sound
incorrectly, but then instantaneously realizes the
mispronunciation and rectifies it immediately, you should relax
because the learner is most likely at the final stage of the
correct internalization of the sound in long-term memory.
7) After this stage, what the learner needs is more occasional
rehearsal and practice to finally subconsciously internalize
the sound for immediate and automatic retrieval.
13.4. TECHNIQUES FOR TEACHING INTERDENTAL FRICATIVES /θ, ð/

These two sounds are extremely rare in many languages
throughout the world; consequently they are typically classified
as marked (unfamiliar; uncommon) sounds. Their rarity may be
attributed to their association with the clinical speech problem
known as alveolar lisp according to which individuals, especially
young children, manifest the symptoms of such a pronunciation
problem by replacing the [s, z] with [θ, ð], respectively. What
matters here is the fact that in spite of the rarity of this pair, it
is, nonetheless, of high frequency of occurrence in English and
this renders the pair a major pronunciation problem for a large
number of L2 learners of English from a wide variety of linguistic
backgrounds. It has already been pointed out, in 7.2.1 above,
that one of the most interesting aspects of the mispronunciation
of this pair is that it is realized differently depending on the na-
tive language of the learner and its phonological system. Its re-
placement is usually with far more unmarked (common) pairs of
sounds such as /t, d/ or /s, z/. Surprisingly, when teaching this
pair of sounds, one can come across individual learners who are
capable of pronouncing the pair, but are psychologically reluc-
tant to do that in conversation because they feel they are lisp-
ing. This is a psychological observation that should be taken
into consideration. Since the orientation for those who replace
/θ, ð/ with the plosives /t, d/ will be different from the
orientation of those who replace them with the fricatives /s, z/, there
will be two sets of procedures of orientation. Below are, first,
the orientation procedures and techniques for /t, d/ substitu-
tion.
a) Cognitive Orientation: The instructor has to ask
learners to carefully watch the articulatory postures and facial
gestures concomitant with the production of [θ] and [ð]. Carefully
and dramatically emphasize how the tip of the tongue slightly
sticks out at the biting edge of the upper incisors in the
production of both of them. It is helpful to remind learners that
in many cultures, children tend to stick their tongue out as a
gesture of mocking. The articulatory postures for [θ] and [ð] are
similar to the mocking gesture except in a very moderate and
gentle manner. Just to exaggerate the visual, auditory and
kinesthetic differences between [θ, ð] and [t, d], carefully and
slowly demonstrate the production of the latter pair to highlight
the several sensory differences. Cognitively, the purpose of this
comparison is to encourage learners to consciously think about
the physical and articulatory production of the two targeted
sounds and their unwanted substitutions.
b) Auditory Orientation: You need to prepare suitable
lists of minimal pairs for both [θ] vs. [t] and [ð] vs. [d] to
orient the learners on the perception, recognition and production
of the two pairs of sounds. The initial group of minimal pairs
should be carefully selected not only to highlight the
pronunciation differences, but also to highlight the semantic
differences that result from the mispronunciation. One way to
better highlight the pronunciation differences is to select
monosyllabic words. Table 13.1 below presents some such minimal
pairs:
Word         Mispronunciation
[θ]          [t]
<three>      <tree>
<math>       <mat>
<thank>      <tank>
<bath>       <bat>
<cloth>      <clot>
[ð]          [d]
<they>       <day>
<then>       <den>
<than>       <Dan>
<there>      <dare>
Table 13.1. The [t, d] rendition of the English interdentals
[θ, ð] by different learners of English.
To further highlight the semantic difference ensuing from
the substitution, select a couple of minimal pairs for pictorial
(visual, Figure 13.2) difference such as <three> vs. <tree>.
Bring to the attention of learners that, unfortunately, the
production of [θ] and [ð] may sound like a speech defect (alveolar
lisp), but that is how the two sounds are in English.
Figure 13.2. Visual and semantic difference when substituting
/t/ for /θ/.
c) Visual Orientation: In order to warm up the learners for
the visual orientation, go back to the actual production of the
pair /θ, ð/ vs. /t, d/. This time use a paper to block the visual
channel of guessing the sounds through lip-reading. Next,
remove the paper and pronounce [θ] and [ð] with a clear facial
posture showing the tip of the tongue at the biting edge of the
incisors. Then do the articulatory postures for [t] and [d] while
drawing the attention of the learners to the disappearance of the
tip of the tongue.
d) Tactile-Kinesthetic Orientation: To help learners with
this type of sensory orientation, all that the instructor has to
do is to direct learners to put the tip of the tongue at the
biting edge of the incisors and repeat the contact for [θ, ð]
several times until the brain makes an impression of the contact.
The brain may forget the sensation minutes, hours or days later,
but it is easy to recall the impression with a couple of
maneuvers.
Word         Mispronunciation
[θ]          [s]
<thank>      <sank>
<thin>       <sin>
<think>      <sink>
<thumb>      <some>
<thick>      <sick>
<theme>      <seem> or <seam>
[ð]          [z]
<the>        <zee> (letter)
<then>       <Zen>
<breathe>    <breeze>
<clothe>     <close> (v.)
Table 13.2. The [s, z] rendition of the English interdentals
[θ, ð] by different learners of English.
After completing all the cognitive and sensory orientations for
/θ, ð/ vs. /t, d/, move to contrasting /θ, ð/ with /s, z/. This
contrast may be more challenging than that of /t, d/ simply
because in this case there is only one distinctive feature, place
of articulation, that separates them (i.e., interdental vs.
alveolar) instead of two in the case of /t, d/ (i.e., place of
articulation: interdental vs. alveolar, coupled with manner of
articulation: fricative vs. plosive). The instructor should use
the general procedures in a, b, c and d above with emphasis on
the place of articulation using the minimal pairs in Table 13.2,
above.
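The feature count behind this comparison can be made explicit with a small Python sketch. The place and manner labels follow the paragraph above; the voicing values, the dictionary layout and the function name are assumptions added for illustration only.

```python
# Illustrative feature table. Place and manner labels come from the text;
# voicing is added here as a standard assumption.
FEATURES = {
    "θ": {"place": "interdental", "manner": "fricative", "voicing": "voiceless"},
    "ð": {"place": "interdental", "manner": "fricative", "voicing": "voiced"},
    "t": {"place": "alveolar", "manner": "plosive", "voicing": "voiceless"},
    "d": {"place": "alveolar", "manner": "plosive", "voicing": "voiced"},
    "s": {"place": "alveolar", "manner": "fricative", "voicing": "voiceless"},
    "z": {"place": "alveolar", "manner": "fricative", "voicing": "voiced"},
}


def differing_features(a, b):
    """Return the sorted names of the features on which sounds a and b differ."""
    return sorted(k for k in FEATURES[a] if FEATURES[a][k] != FEATURES[b][k])


# Compare each interdental with its two typical substitutions.
for target, substitute in [("θ", "t"), ("ð", "d"), ("θ", "s"), ("ð", "z")]:
    print(target, "vs.", substitute, "->", differing_features(target, substitute))
```

Under these assumptions the /t, d/ substitutions differ from the targets in two features (manner and place), whereas the /s, z/ substitutions differ in place only, which is why the second contrast offers the learner fewer cues.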
In addition to the above guidelines and examples, the
instructor should attempt to demonstrate the considerable
difference in meaning when /θ, ð/ are replaced with /s, z/,
similar to the example of <three> vs. <tree>. A good example would
be the contrast between <think>1 and <sink>.
Figure 13.3. Visual and semantic difference when substituting
/s/ for /θ/.
After the different cognitive and sensory orientations, the
instructor has to put his work to the test through designing and
implementing some perception and cognition exercises leading
finally to production in the following manner:
a) Perception: Demonstrate the /θ, ð/ pair several times
for perception by learners with /t, d/ substitution;
b) Recognition: Move to the recognition assessment by
numbering the sounds /t, d, θ, ð/ as 1, 2, 3 and 4; produce a list
of ten sets of the four sounds arranged in a random order;
pronounce the ten sets slowly and methodically while asking the
learners to identify the sounds according to their assigned
numbers on the worksheet; for example, if the sounds were
pronounced in this order: /t, ð, d, θ/, then the set of numbers
should be: 1, 4, 2, 3; collect the worksheets to assess the general
accuracy. If the matching is considerably satisfactory, proceed to
the production phase; if not, repeat the exercise, asking learners
to print their names on the worksheets. The latter addition of
names will make the learners more alert and attentive; besides,
it will also identify the learners who need more orientation.
1 Point to the head while pronouncing it.
c) Production: Ask for several volunteers to demonstrate
the pair /θ, ð/ while in their seats; if all are successful, ask
the same volunteers to appear before the class and demonstrate the
sounds while the rest of the learners listen to the demonstrations
and watch the facial gestures of the performers; ask the
volunteers to produce minimal pairs that you provide such as
<then> vs. <den> and <thin> vs. <tin> and so on with
contrasts of /θ, ð/ vs. /s, z/.
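The recognition exercise in step (b) can be prepared and marked with a short script. The following is a minimal sketch in Python; the function names and the `seed` parameter are hypothetical conveniences, and the sketch deliberately leaves the judgment of what counts as a satisfactory score to the instructor.

```python
import random

# Numbering used in the recognition step: /t, d, θ, ð/ = 1, 2, 3, 4.
SOUND_NUMBERS = {"t": 1, "d": 2, "θ": 3, "ð": 4}


def make_sets(n_sets=10, seed=None):
    """Return n_sets lists, each holding the four sounds in a random order."""
    rng = random.Random(seed)  # a seed makes the worksheet reproducible
    sets = []
    for _ in range(n_sets):
        order = list(SOUND_NUMBERS)
        rng.shuffle(order)
        sets.append(order)
    return sets


def answer_key(sets):
    """Translate each set of sounds into the numbers learners should write."""
    return [[SOUND_NUMBERS[sound] for sound in order] for order in sets]


def score(worksheet, key):
    """Return the fraction of worksheet answers that match the key."""
    total = sum(len(row) for row in key)
    correct = sum(
        given == expected
        for given_row, key_row in zip(worksheet, key)
        for given, expected in zip(given_row, key_row)
    )
    return correct / total


# The worked example from the text: /t, ð, d, θ/ should be answered 1, 4, 2, 3.
print(answer_key([["t", "ð", "d", "θ"]]))
```

The instructor would read each generated set aloud, collect the numbered worksheets and pass each learner's answers to `score` together with the key.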
13.5. TECHNIQUES FOR TEACHING TENSE (LONG) VS. LAX (SHORT) VOWELS
Obviously, the features of quantity and quality are in many
instances too intertwined to be separated and autonomously
evaluated and described. The case of the English vowels in <sit>
vs. <seat> is typical in this regard. Even though many authors,
in many instances, handle the relationship of those two vowels as
short vs. long, the relationship is too complex to be glossed
over as short vs. long; it involves a feature of lax vs. tense
accompanied by a difference in quality. This complex feature
combination becomes an instructional reality with adult L2
learners of English in whose languages there are no short vs.
long or lax vs. tense vowel distinctions. Typical in this regard
are speakers of Spanish, Italian, Russian, Greek and Filipino,
among many other languages. The first time I came across such a major
pronunciation difficulty for Hispanic learners of English was
some three decades ago when I was teaching an ESL class at
Loyola University Chicago. In order to demonstrate to my stu-
dents the so-called short vs. long vowel contrast in English, I
wrote the words <ship> and <sheep> on the board and asked
the best student in class to pronounce them. To my utter sur-
prise, I heard him produce the same pronunciation for both in
the form of [p]. I asked him to repeat the pronunciation and
the result was no more than [p] for both. As a linguist, I real-
ized for the first time that apparently there was only one version
of an [i]-type vowel in Spanish and that the language does not
have a [ʃ] consonant sound. As an instructor, I paid utmost
attention to such problems. In this case, the incident became a
focus of part of my future research.
Let us consider more examples of teaching Hispanics the
vowel system of English.
a) Cognitive Orientation: In an experiment conducted
with adult Hispanic learners of English at a beginning proficien-
cy level, the learners were asked to pronounce the following
minimal pairs:
<did> <deed>
<sit> <seat>
<hit> <heat>
<pill> <peel>
<bid> <bead>
The overwhelming characteristic of most of the tokens obtained
was the failure to distinguish the vowel difference within each
minimal pair. The vowel that dominated those words sounded
shorter than the English long one and longer than the short one,
although it shared a degree of tenseness with the English long
one. The failure to distinguish minimal pairs based on vowel
quantity is certainly the most serious pronunciation problem for
Hispanic learners of English for two reasons. First, they have no
cognitive perception and recognition of the quality and quantity
differences between the two sets of English vowels because such
differences do not exist in Spanish. Second, without the
realization of such a vocalic feature, thousands of words in
English may be semantically confused, some of which may be
socially very embarrassing because of obscene or vulgar
connotations. This is why some Spanish, French or Russian
learners of English avoid words such as <beach, sheet, winner,
keys>, etc.
Right at the outset, the instructor should bring this major
vocalic difference to the attention of the learners highlighting
the considerable phonetic and semantic differences. He should
also prime them for the cognitive realization of the difference
through the following steps:
1) Cite some monosyllabic minimal pairs and highlight the
difference in both pronunciation and meaning such as:
<sin> vs. <seen>
<pill> vs. <peel>
<pick> vs. <peak>
2) Demonstrate the above minimal pairs again and highlight
the difference, especially in the shape of the lips, which tend
to be more stretched sideways in the latter member of each pair
than in the former.
3) Try to insert the Spanish word <sin> (without) in between
the English minimal pair <sin> and <seen> to further underscore
the phonetic difference.
4) Transcribe the difference between the three words
phonetically in the following format:
English: <sin> = [sɪn]; Spanish: <sin> = [si·n]; English:
<seen> = [siːn].
The Spanish vowel should be transcribed as [i·] with [i]
indicating the quality of tenseness and the single dot indicating
the medium or half-length somewhere between the two English
vowels [ɪ] and [iː]. The transcription is meant to signal both
quality and quantity (length) differences. Schematically, the
instructor can express the quantity (length) and quality (tense
vs. lax) by the thickness of the dark band as demonstrated below:
English Vowels Schematic Representation
<sin> = [p]
<seen> = [p]
The expected Spanish rendition of <sin> (without) will look

like the following
which is shorter than the long English vowel, but slightly less
tense. On the flip side, it is longer than the short English one
though somewhat more tense. The following strategies are sug-
gested to handle such vocalic multiple-feature differences.
b) Perception:
1) Produce a simulated Spanish vowel in the context of
<s-n>, for example, and transcribe it as [siˑn]. Model this
simulated pronunciation several times while learners are
listening.
2) Embed the third item (i.e., the simulated Spanish examples)
in between the English minimal pair of <sin> = /sɪn/
and <seen> = /siːn/. It is very convenient to select
monosyllabic words to avoid any perceptual interference with
bisyllabic or multisyllabic words. Number the three items as 1,
2 and 3 as in table 13.3.
English    Transcription  Spanish        English    Transcription
Word #1                   Rendition #2   Word #3
<it>       [ɪt]           [iˑt]          <eat>      [iːt]
<bit>      [bɪt]          [biˑt]         <beat>     [biːt]
<sit>      [sɪt]          [siˑt]         <seat>     [siːt]
<pill>     [pɪl]          [piˑl]         <peel>     [piːl]
<rich>     [rɪtʃ]         [riˑtʃ]        <reach>    [riːtʃ]
Table 13.3. Simulated Spanish rendition of English minimal
pairs with [ɪ] and [iː] vowels.
Model each of the above triplets very carefully and ask stu-
dents to listen, sense and reflect on the process.
3) Cite more English minimal pairs such as:
[ɪ]      [iː]
<bid>    <bead>
<tin>    <teen>
<lip>    <leap>
and simulate the expected mispronunciations by Hispanic
learners, i.e., [biˑd], [tiˑn] and [liˑp].
4) Encourage learners to practice what Catford calls silent
introspection, with emphasis on sensing the vowels rather
than hearing them. According to Catford, whenever one
makes sounds aloud, the auditory impression tends to mask
or override the sensations of muscle movements and other
proprioceptive sensations. Certainly, student awareness of
these proprioceptive sensations can be a useful adjunct to
making the articulatory adjustments necessary for learning
the articulation of new sounds (Catford, 1994).
c) Recognition:
1) Select the triplet [bɪt], [biˑt], [biːt] from table 13.3 in
Perception and number the items 1, 2 and 3. Record them in
random order, each repeated twice, in at least ten to fifteen
attempts. Play the recordings back one attempt at a
time, with a few seconds of pause between attempts, and
ask the learners to mark the items as 1, 2 or 3 on a specially
prepared worksheet.
2) Give the learners the key to the correct answers, ask them to
identify the errors and note the tokens with the
highest percentage of inaccuracy. The results may be very significant
for the further design of exercises and drills. If learners
misidentify the first token, such a result is to be expected because
the first token may be uttered more emphatically by the reader
before settling into a normal mode of pronunciation. Besides,
with the first token, learners are, so to speak, taken by surprise
since they have not yet developed a mental or psycholinguistic
yardstick for evaluating the samples.
3) Ask learners to return all the worksheets of the first trial,
then ask them to prepare for a repetition of the exercise. Usually,
a second and a third trial are much better than the first one;
more exposure creates more familiarity and both lead to more
confidence and better focus.
4) Select as many minimal pairs as in perception and mark the
columns as #1 and #2, then pronounce the items
in arbitrary order and ask learners to identify them as #1 or #2.
5) Create carrier sentences and ask learners to place the appropriate
member of a minimal pair, as in perception/3, in the
blank, then read the sentences once or twice:
Example: select either mill or meal and place it in the sentences
below:
A ____ is the place where we turn wheat into flour.
A breakfast is a ____ in the morning.
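For instructors who prepare the recognition worksheets on a computer, the randomized identification exercise in steps 1 and 2 above can be sketched as a short Python script. This is only an illustrative sketch, not part of the original method: the function names build_trials and error_rates, the default of twelve attempts and the sample answers are all assumptions. It shuffles the numbered items into a playback order (the answer key) and then reports, for each item, the fraction of its tokens that a learner misidentified.

```python
import random
from collections import Counter


def build_trials(items=(1, 2, 3), attempts=12, seed=None):
    """Return a shuffled playback order (the answer key), one item per attempt.

    Items are cycled before shuffling so that each one occurs about
    equally often. Repeating a token twice within an attempt is left
    to the recording itself.
    """
    rng = random.Random(seed)
    trials = [items[i % len(items)] for i in range(attempts)]
    rng.shuffle(trials)
    return trials


def error_rates(key, answers):
    """Return {item: fraction of that item's tokens marked incorrectly}."""
    if len(key) != len(answers):
        raise ValueError("key and answers must have the same length")
    totals = Counter(key)
    errors = Counter(k for k, a in zip(key, answers) if k != a)
    return {item: errors[item] / totals[item] for item in sorted(totals)}


# Hypothetical worksheet: the learner heard item 2 once as item 3.
key = [1, 2, 3, 2, 1, 3]
answers = [1, 3, 3, 2, 1, 3]
print(error_rates(key, answers))  # {1: 0.0, 2: 0.5, 3: 0.0}
```

The per-item error rates point to the tokens that need further drilling; a high rate for item 2 would suggest that the simulated Spanish rendition is still being heard as one of the two English vowels.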

d) Production:
1) Ask volunteers to repeat after your modeling of the items in
perception/2 above as many times as necessary. The repetition
should be instantaneous.
2) When some learners excel in the impersonation or production
of the targeted sounds, allow them to replace you, as
the instructor, in modeling. All learners should be given
the opportunity to participate. Preferably, learners should
model while seated in their places among the students; this
setting creates a more learner-friendly situation.
13.6. TECHNIQUES FOR TEACHING VOWEL REDUCTION

What is vowel reduction, anyway? A reduced vowel becomes
shorter in length (quantity) and/or moves in the direction of a
neutral vowel, typically a schwa [ə]. Vowel reduction is used
here to refer to the overall diminishing in the length (longer to
shorter) and quality (from markedly distinct features to less
distinct ones) of the vowel. For instance, the change of [æ] in the
strong form of <than> = [ðæn] into [ə], as in its weak form [ðən], is a
change from a longish and more distinct vowel into a very short
and qualitatively less distinct one. Earlier on, the English vowel
system was identified as a typical centripetal system as opposed
to the Spanish centrifugal one. Since vowel reduction in English
leads predominantly to schwa-type vowels, English is best de-
scribed as a schwa vowel system, whereas Spanish, without any
vowel reduction, is best described as a schwaless vowel system. At
this juncture, the question of what the characteristics of a schwa
are becomes unavoidable. Auditorily, the schwa is the most ob-
scure vowel. Visually, the lips are in the most neutral position:
not spread, not rounded with medium opening. Kinesthetically,
the whole vocal musculature is relaxed with no sensation of an
extreme forward, backward or upward movement of the body of
the tongue. Features such as those make the schwa a very elusive
sound and a difficult one to teach. With the elusive articulatory
maneuvers involved in the production of a schwa, the primary
dependence will be on the auditory channel in the form of:
listen (perceive), recognize and produce. Nevertheless, in order
to maximize the effectiveness of the instruction, all the other
sensory channels have to be brought into play. This should be
reinforced by considerable non-linguistic and non-verbal activities
and by contextualizing the schwa in words and longer pieces
of speech. The typical action word to describe the instructor's
productive action will be demonstrate; the learner's action word
will be impersonate. The word imitate will be avoided because
it usually connotes mechanical action, whereas impersonate
implies watching, contemplating and purposeful production.
Indeed, vowel systems that tend to be tense (centrifugal)
are usually without a schwa unit; therefore, their speakers face
great difficulty in neutralizing the vowels. This simply implies
that teaching the production of a schwa requires concentrated
sensory efforts until the sound is internalized cognitively. Below
are some perceptual orientations learners have to go through.
a) Perception:
1) Demonstrate to learners a posture of relaxing the muscles
as opposed to tensing them. Muscle tension is most visible
in the form of stretched neck muscles; if it is detected,
the whole musculature is inappropriate for lax
sound production, of which a schwa is most typical.
2) Once the relaxed position is maintained, proceed to
demonstrate short intermittent moaning sounds, as if feeling
pain, which tend to be typical of schwa vowel postures.
3) If you, as an instructor, are a native or near-native speaker
of English who has mastered the articulatory process of
vowel reduction or schwaization in English, try to demonstrate
the so-called hesitation vocal gesture (Delattre, 1965),
which happens to be very much a schwa-like sound.
4) Conduct a demonstration of CV type of nonsensical syllables
with the following vowel qualities:
[ɑ] as in the English <hard>: [lɑ lɑ lɑ]
[a] as in Arabic <لا> (= No): [la la la]
Then pronounce the three syllables with each vowel in two
formats: firstly, change the stress from the first syllable
through the third, maintaining the vowel quality and with no
temptation, whatsoever, of reducing it to a schwa; secondly,
do the same but this time reducing the unstressed vowels to
schwas. Repeat as demonstrated (the mark ˈ precedes the
stressed syllable):
ˈlalala    laˈlala    lalaˈla
ˈlalələ    ləˈlalə    lələˈla
5) Shift from nonsensical syllables to real words that are typical
of schwa articulation, such as banana and America.
6) Model the pronunciation of banana, emphasizing the reduction
of the vowels of the first and third syllables and accentuating,
and somewhat exaggerating, the length and quality
of the vowel in the second syllable.
7) Model the pronunciation of banana in English, e.g.,
[bənænə]², and compare it to a simulated pronunciation of it
in Spanish, e.g., [banana]. It is perfectly acceptable to
somewhat over-accentuate (exaggerate) the differences between
the two pronunciations to attract attention to
them. This over-accentuation is similar to caricature drawing,
in which the distinctive features are exaggerated to attract
attention. However, once the objective has been attained,
the instructor should stop the exaggerated versions and
end his demonstration with normal and natural modeling of
pronunciation.
8) To reinforce the auditory channel with the visual one,
transcribe the English pronunciation of banana phonetically
and portray the syllabic structure of the word with various
schematic diagrams that visualize the difference, such as the
one below.
[schematic diagram not reproduced]
The schematic representation can be in any form or shape as
long as it visually demonstrates the reduced shapes versus
the augmented one, the latter of which often stands for the
accentuated syllables. Model the syllabic pronunciation
repeatedly.
² Or [bənɑːnə] as in RP.
9) For further visualization, demonstrate vowel reduction and
accentuation through nonverbal gestures. To demonstrate the
English pronunciation of [bə næ nə] nonverbally, take one
short step, followed by a long step and then another short one;
the size of the steps indicates the strength of the stress and
degree of prominence of the vowel. For the demonstration of the
Spanish pronunciation, take three steps, which are, more or
less, of the same medium size although the stressed syllable
will slightly affect the quantity (length) and prominence of the
vowel. Visually, the Spanish pronunciation of <banana> will
schematically appear as follows.
[schematic diagram not reproduced]
b) Recognition:
1) Number [bə næ nə] and [ba na na] as #1 and #2, model
them several times and ask learners to identify them as 1 or
2.
2) Repeat Recognition/1 above, then ask learners to identify
the word as its reduced (English) form or unreduced (Spanish)
version.
c) Production: Return to activities in the perception stage
and redo them. Then ask learners to reproduce everything seen,
heard or perceived in the following manner:
1) Initially, any attempt at production by learners should come
immediately after the instructor's modeling, with no
background noise or speech separating the modeling from
production. This procedure secures the recency effect of the
input in the learner's short-term memory circuitry for instant
retrieval and reproduction.
2) If the instructor feels the learners still experience difficulty
in recognizing and producing a schwa or a schwa-like
vowel, viz., [ ], which is acceptable for some native speakers
of English (Whitley, 1986:58), the instructor should sustain the
modeling. If the failure continues, the instructor should not
persist. A break should be taken to let the learners relax. The
physical and psycholinguistic distancing of self from the
practice and drilling helps both the instructor and the learner
to start afresh: the instructor with a more energetic and
enthusiastic attitude and the learner with a more determined,
thoughtful and hopeful attitude.
3) The instructor should encourage learners to think of the
targeted sounds and the overall practice for their mastery.
This reflective thinking can go on anytime during the practice
and drilling or the distancing-of-self breaks. Techniques
1 (immediate repetition) and 2 (a break), above, should apply
in the orientation of any aspect of pronunciation.
4) If the instructor feels learners need a memory refresher,
he should afford them more opportunity to internalize
the production of a schwa through comparison
with other non-reduced vowels in syllables similar to those
in Perception/4 (the mark ˈ precedes the stressed syllable):
ˈlalala    laˈlala    lalaˈla
ˈlalələ    ləˈlalə    lələˈla
5) Place the word banana in short appropriate sentences,
portray the English rendition in broad phonetic transcription,
model the pronunciation and ask learners to produce it
after you.
Example: Barbara eats a banana a day.
[bɑrbərə iːts ə bənænə ə de]
6) Select other appropriate English sentences, such as the
one below, model them and ask learners to reproduce them.
Example: I can see a banana in the tree.
[ai kən si ə bənænə ɪn ðə tri]
13.7. TECHNIQUES FOR TEACHING ACCENTUATION (STRESS)

Before understanding the nature of rhythm and its teaching, one
has to be aware of the premise on which rhythm is based:
stress. The traditional approach to teaching stress has been
through the auditory modality and its predominant technique
has long been, and unfortunately still is, repeat-after-me. This
exclusive auditory sensory modality in teaching stress may not
be effective with all learners. Besides, in some very traditional
methods of teaching, stress and rhythm are hardly ever tackled.
To help more learners master the process of stress perception,
recognition and production effectively and efficiently, the
instructor has to resort to a cognitive approach that is implemented
through a wide variety of activities and exercises based on
diverse sensory modalities: auditory, visual and
kinesthetic/proprioceptive. Occasionally, multisensory modalities
are jointly invoked.
Once again, one has to be reminded of the triangular base
of teaching pronunciation in the form of perception, recognition
and production. These three sub-processes are interrelated and
one leads to the other.
In the following sections, the techniques of teaching stress
perception and recognition will be clustered together, while its
production will be handled somewhat separately since the tech-
niques applicable to teaching the first two processes may not be
applicable to the teaching of the third and vice versa.
a) Perception:
1) Take a two-syllable word such as <produce> and
demonstrate it with accent on the first syllable and then with
accent on the second syllable as exhibited below (the stressed
syllable is shown in capitals). Also, to highlight the grammatical
and/or semantic role of stress, inform the learners that the
first rendition of <produce> indicates a noun, whereas the
latter indicates a verb.
PRO duce    pro DUCE
2) Ask whether any of the learners wants to demonstrate the
difference. If so, let them demonstrate it. Such
exercises are the initial exposure to the phenomenon of
stress. You should expect to have learners in your class who
do not know what is going on and what the difference is between
the above two demonstrations of stress. Personally, I
am confident that such learners will always be there. I was
once such a learner myself, and I have always encountered
such learners in my classes as an instructor.
3) Go one step further to impress the difference on the
learners and try again by tapping on a desk or table: one
strong tap followed by a weak tap for the noun <PROduce>,
and the reverse order for the verb <proDUCE>.
4) To dramatize the difference and capture the attention of
the learners, grab an empty carton box or can and beat the
rhythm of the two words on it. The visualization that goes
on with the latter demonstration is of great help to reinforce
the retention of the beats and the overall stress location.
b) Recognition:
1) Assign colors to the stressed and unstressed syllables.
Produce a prepared poster with the targeted two words. Tell
learners that the red-colored syllable is the stressed one in
each case and demonstrate the difference.
2) To do it differently but still visually, you may write/print
the stressed syllable in larger size (shown here in capitals) as in,
PRO duce    pro DUCE
then take a big step followed by a small step for PRO duce,
followed by a small step and a big one for pro DUCE. The
arrows mark the size of steps.
[arrows not reproduced]
3) One important set of features for the learners to observe
and impersonate are the facial and body gestures that accompany
the stressing of a syllable. Stressing a syllable immediately
implies the exertion of greater physiological effort.
This additional effort reveals itself physically in the
overall body gestures of the speaker. Such features may be in
the form of a downward head movement, raising of an eyebrow
or both, or simply a sudden movement with the hand
or arm.
4) For an auditory check on stress recognition, take another
noun-verb contrast of English, such as <CONtrast>
vs. <conTRAST>, number them #1 and #2, consecutively,
and read them randomly, asking the learners to identify them
as #1 or #2.
5) The learners will be taken one step further in the recognition
challenge if they are given three nonsense syllables to
be demonstrated three times, each with stress on a different
syllable, as below,
[stress triplets; figure not reproduced]
Mark the triplets as #1, #2 and #3 and then read them randomly,
asking the learners to identify them as #1, #2 or #3.
One can also tap the three stress triplets on a desk, empty
can or on anything that resounds and ask the learners to
identify the location of the strongest stress within each triplet.
If you have a small wooden hammer or a gavel, tap or
beat the same stress triplets, then ask learners to recognize
the strongest beat in each case. In both of the above demonstrations
(reading or tapping) there are certainly visual signs
to indicate the stressed syllable. Bring the latter fact to the
attention of the learners.
6) Repeat the beating (tapping) of the stress triplets mentioned
above, putting more emphasis on the movement of the
hand or the gavel so that the movement will visually attract
the attention of learners, then ask them to identify the
stressed syllable. One can also use the overhead projector or
PowerPoint slides of the stress triplets, telling the learners
that red-colored syllables are the stressed ones.
7) Finally, you can ask for volunteers to demonstrate the
stress triplets.
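When posters, slides or worksheets for the stress doublets and triplets are prepared on a computer, the patterns can be generated rather than typed by hand. The following Python sketch is only an illustration under stated assumptions: the function names mark_stress and stress_patterns are hypothetical, the nonsense syllable la is an arbitrary choice, and capital letters stand in for the larger print or red color suggested above.

```python
def mark_stress(syllables, stressed):
    """Join the syllables, writing the stressed one in capitals.

    `stressed` is the zero-based position of the stressed syllable.
    """
    if not 0 <= stressed < len(syllables):
        raise IndexError("stressed index out of range")
    return "".join(
        s.upper() if i == stressed else s.lower()
        for i, s in enumerate(syllables)
    )


def stress_patterns(syllables):
    """Return one rendition per syllable, advancing the stress each time."""
    return [mark_stress(syllables, i) for i in range(len(syllables))]


print(stress_patterns(["la", "la", "la"]))  # ['LAlala', 'laLAla', 'lalaLA']
print(stress_patterns(["pro", "duce"]))     # ['PROduce', 'proDUCE']
```

The same function extends to quadruplets or to real words split into syllables, so one helper covers both the nonsense-syllable stage and the noun-verb doublets.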
c) Production: Obviously, production is the last phase in
the training of learners to master the dynamics of stress. It is in
the production phase that learners will be far more able to sense
stress dynamically and proprioceptively because they will actually
be physically producing the stress either through non-linguistic
activities or through authentic linguistic ones as follows:
1) Foremost in this regard is for the instructor to demonstrate
any stress performance that he intends to ask learners
to execute through tapping or beating of syllables in doublets,
triplets or even quadruplets. It is preferable for both
instructor and learner to use the hand or a small hammer or
gavel to perform the demonstration. It is easier for the learner
to begin with two-syllable structures and proceed further
with multi-syllable structures. Ask for a volunteer to perform
the beating of two syllables, once with the first syllable
receiving the strong stress and then with the second syllable
receiving it, as in:
[beat diagram not reproduced]
Gradually, the performance should include more learners
with more syllables and beats as in:
[beat diagrams not reproduced]
While all this tapping or beating of syllables goes on, you
may ask the learners to notice the difference in the physical
effort exerted by the hand of the performer. The physical
difference is visually quite noticeable. In fact, the performer
may be asked to intentionally exaggerate or intensify the
physical effort or bodily gesture that accompanies the
production of a stressed syllable.
2) After one is done with the non-linguistic exercises, it is
time to move to the linguistic ones, especially counting
numbers, beginning with one two and moving to one two
three, etc., as follows (the stressed number is shown in capitals):
/ONE two/
Reverse the stress.
/one TWO/
Move to three units, advancing the stress every time:
[figure not reproduced]
Move to four units, advancing the stress every time:
[figure not reproduced]
3) In English, the most appropriate materials to teach stress
placement and its linguistic significance at word level are
the noun-verb doublets such as those in table 13.4, below.
Noun       Verb
SUBject    subJECT
REcord     reCORD
CONtract   conTRACT
PERfect    perFECT
PREsent    preSENT
INsult     inSULT
Table 13.4. Sets of noun-verb contrasts signaled by stress
placement (the stressed syllable is shown in capitals)
Pronounce pairs such as the above slowly and somewhat
emphatically (i.e., with somewhat more determined articulation)
and ask the whole class to repeat after you as a chorus.
Later on, ask for individual volunteers to come to the front
and repeat the above list after you. At this stage, the instructor
may carry the performance a step further by asking individual
learners to produce the pairs without being modeled.
If there are learners who still experience some difficulty,
you have to be patient with them and be ready to repeat
the performance and the demonstrations. In such instances,
you should proceed in the following manner. First,
demonstrate the examples with somewhat more emphatic
articulation in the hope of highlighting the difference between
them even further. In fact, you may select pairs of words in
English which contrast not only grammatically, such as the
above noun-verb pairs, but also semantically and orthographically,
such as <insight> vs. <incite> and <August> vs. <august>.
Second, demonstrate them in conjunction with
somewhat more visible facial and body gestures. Third, as
pointed out in the early stages of teaching production, learners
should produce the targeted modeling instantaneously after
you, with no background noise or speech separating the modeling
from production, to trigger the recency effect and its impact
on memory. Fourth, ask the learner to produce a series of
repetitions for a given item of demonstration. These repetitions
may sound mechanical in nature, but they may have a
positive impression on the brain insofar as their long-term
retention is concerned. After all, learning by human beings is
a process of transforming mechanical habits into cognitive
ones.

There are no solid and comprehensive rules that predict and
capture stress placement in English words, for two main reasons.
First, English is a stress-timed language with mostly unpredictable
stress placement (MacKay, 1978:150). Second, there is a
high percentage of foreign loanwords in English that maintain
their own stress patterns and create exceptions to the native
rules. This makes mastering stress placement a major difficulty
and source of accent for learners of English as L2.
It is, therefore, an area which is worthy of the attention of any
instructor teaching English or any other language as L2. The
instructor has to be professionally qualified to teach this very
essential linguistic aspect in cross-language teaching. As has
been discussed earlier on, instruction can begin as a phonetic
practice which gradually should lead to authentic linguistic
materials. The most appropriate linguistic materials would be the
so-called strong vs. weak forms, the majority of which fall under
the categories of prepositions, conjunctions, auxiliary verbs, etc.
Word      Strong Form   Weak Form (most frequent)
<and>     [ænd]         [ənd] or even [ən] or [n]
<than>    [ðæn]         [ðən]
<have>    [hæv]         [həv]
Equally appropriate is the noun-verb category of words, which are
distinguished grammatically only by the placement of stress, such
as the ones cited earlier in section 13.7. Additional excellent
materials for teaching stress placement in English words are
grammatical derivations. One of the most typical examples cited
by teachers of English to demonstrate the variability of stress
and rhythm in English is the word <photograph>. Notice the
following changes in primary stress, indicated below by the
capitalized syllables.
<PHOtograph>
<photoGRAPHic>
<phoTOGraphy>
Because of the highly unpredictable nature of stress placement
in English, some phoneticians and instructors of pronunciation
advise learners of English to listen carefully to native speakers
or materials recorded by them, or to use a pronunciation
dictionary. This is not bad advice, but it is a privilege that is not
attainable by all learners of English. Consequently, some of the
most common and effective rules of stress assignment in English,
however few, should be taught systematically.
However, foremost in importance is that stress and stress placement
should be taught because many learners in many languages
are not really aware of the existence of stress as a linguistic
dynamic or may not have the auditory sensitivity to pick
up the stress on their own.
CHAPTER 14: TIPS FOR ACCENT REDUCTION AND
ACCENT DETECTION

Accent reduction has already been defined and elaborated on
(4.5.2) as an attempt to bridge the pronunciation gap between
the native speaker of a given language and someone who is attempting
to learn that given language as L2. The only further
addition to the theme of accent reduction in this chapter is to
provide some key tips to help learners focus on the points which
are considered significant in its reduction.
With regard to accent detection in the context of this book,
two points are worthy of consideration. First, how does one discover
that he has an accent? Second, how can accent lead one to
the identification of the linguistic affiliation of the speaker? In
response to the first question, one has to bear in mind that perhaps
one of the most controversial aspects of pronunciation at
large is that many people do not realize they have an accent, or
do not admit to having it, because they hear themselves through
their own native linguistic filter or prism. There are two ways to
convince one to admit having an accent. First, the instructor or
coach brings it to one's attention and advises him to work on it.
Second, it is self-discovered after having ample exposure to the
target language, especially in cases of full immersion coupled
with experience in speech production and pronunciation. The
latter case applies to my personal attempt at accent reduction.
Nobody brought my accent in English to my attention; I was
fortunate to be able to discover it and work on it diligently.
I pointed out in chapter 1 that my so-called Iraqi English
was distinctly accented with the linguistic substrata of my three
native languages. I did not have any problem with consonants
except for replacing the English approximant <r> with my tap
and trill <r>s. I had some problems with the vowels, especially
with the neutralization of unstressed ones. My most serious
problem involved accentuation and rhythm at large. I also have
to admit that I did not discover my accent right away; it took
about six months to gradually realize that I had a few fundamental
problems. I had to spend long days, weeks and months
paying intensive attention to native speakers of English and
practicing what I listened to. I certainly reduced my accent
considerably in segmental as well as suprasegmental constituents.
Specifically, I have to emphasize that I improved my stress
placement and accentuation in a very marked way. In spite of
all that improvement coupled with decades of full immersion in
English as an adult in the United States, I still feel I have an
accent; however, it hardly ever interferes with communication.
Briefly, I have overcome almost all sources of phonological
accent, but there are still some residues of phonetic accent that
pop up here and there.
14.2. TIPS FOR ACCENT REDUCTION

There are three general routes that lead to accent reduction.
First, try to improve pronunciation as much as possible by first
mastering the basic phonological differences between L1 and L2,
including those pertaining to segmental elements (consonants,
vowels) and suprasegmental elements (stress, rhythm and
intonation and/or tones) to avoid semantic confusion. This is one way
to prove the functionality of the phonological vs. phonetic accent
dichotomy and why, educationally, the former should receive
more attention. Second, work on the most striking phonetic
differences that do not result in semantic change, but do generate
different degrees of noise, ranging from slight to substantial,
which indirectly interfere with comprehension. Third, raise the
competency level in other aspects of language, including
morphology, syntax, lexicon and idiomatic expressions, to facilitate
comprehension.
14.2.1. Tackle the most Salient Phonological Problems

Because the phonological deficiencies in one's L2 proficiency are
more perceptually outstanding and semantically distracting,
they should be identified first and given priority in tackling.
Naturally, if the individual is not able to identify such deficiencies,
a competent instructor or a seasoned supervisor has to do
that. In any case, the following are some of the most significant
phonological accent-causing problems that should receive priority.
a) There is a significant number of consonantal elements
generating phonological accent across languages. Just to cite
some examples, Arabic confronts its learners with a set of
several really challenging sounds. They include
the uvulars [], the pharyngeals [ħ ʕ] and the emphatic [ ].
For instance, /q/ = [q] is overwhelmingly confused with
/k/ = [k] by non-Arabs. For Germans, Arabs, Filipinos and
Greeks the sounds of //, /p/, /f/ and // are, respectively,
quite demanding. The pair /θ ð/ is absent in many languages;
consequently, it is a fundamental source of both phonological
and phonetic accent; frequently, the pair is replaced with either
/s, z/ or /t, d/.
What is of utmost importance is not just the ability to accu-
rately pronounce such sounds in isolation (out of linguistic con-
text), but also in proper conversational context and in lengthy
discourse. In my teaching career, I have come across many
learners who master the execution of the phonologically alien
sounds in isolation, but occasionally fail when a lengthy and in-
depth conversation is sustained. This happens when the alien
sound has not yet become a cognitive entity stored in the sub-
conscious; consequently, the speaker may become vulnerable to
a phonological lapse or pull in the direction of L1.
b) Care must be taken of the most significant vocalic elements
generating phonological accent, such as the absence of the
short (lax) vs. long (tense) vowel contrast of English in many
languages, including Italian, French, Russian, Tagalog and Spanish,
among others. For instance, a Mexican must distinguish between an
English short/lax vowel [ɪ] and a long/tense vowel [iː] in order
not to confuse a <bid> = [bɪd] (for a contract) with a
<bead> = [biːd] (on a rosary). There is no place, whatsoever,
for a typical Mexican [bid] rendition in English. Equally, a native
Italian should be able to distinguish between <pull> =
[pʊl] and <pool> = [puːl], as there is no place for his rendition
of those two English words as [pul]. As for individual vowels,
one can readily notice the absence of the close front rounded
vowel /y/ of French and German in English and many other
languages. The very English vowel // has no counterpart in
many languages; it is, therefore, replaced with /a/ or //, etc.
c) Seriously consider the most significant consonant clusters
generating phonological problems for speakers of languages
without clusters or with very limited word-initial, medial or
word-final clusters. Typically, such problems are observed with
learners of English of Japanese, Korean, Chinese, Hispanic, Italian
and Arabic linguistic backgrounds, among others. Just for the
sake of example, a Japanese learner should work extremely
hard on moving his pronunciation of <football> = [f] far
away from the traditional Japanese rendition of [fut-ta-bo-].
d) Carefully abide by the stress-placement differences be-
tween L1 and L2. Differences in stress placement can be a major
source of both phonological and phonetic accent. For instance,
one should be aware of the fact that the difference in stress loca-
tions between some English words determines whether they are
verbs or nouns/adjectives such as with <import, contract, de-
sert>, etc. Also, the learner should be aware of the fact that
regardless of the difference in spelling between <incite> and
<insight>, it is the stress location that triggers the semantic
difference between them.
e) Observe the rhythm differences between L1 and the targeted L2.
The shift from stress-timed rhythm to syllable-timed rhythm, or
vice versa, is one of the most significant markers of accent.
Although the accent tends to be more phonetic than phonological,
the phonetic noise that ensues causes serious pronunciation
distortion that obscures meaning. If, for instance, a Spanish
speaker imposes his syllable-timed rhythm on the stress-timed
rhythm of English, the overall rhythmical chunking and tempo of
the target language change. In many instances, a Hispanic speaker
whose overall linguistic competency in English is quite high,
except for the retention of his syllable-timed rhythm, may still
sound somewhat unintelligible to a native speaker of English. It
is like speaking a morse-code rhythm with a machine-gun rhythm.
f) Tone and intonation are as important as stress and rhythm.
They are extremely significant when the individual involved is
moving from a tone language, such as Chinese, to an intonation
language, such as English, or vice versa. For instance,
Mandarin Chinese has four tones: high level, rise,
fall-rise and fall. To demonstrate the lexical function of such

tones, one can use the much-cited Mandarin monosyllable <ma>,
which conveys the following meanings with the four tones:
/mā/ = (mother); /má/ = (hemp); /mǎ/ = (horse) and /mà/ = (scold)
(http://mandarin.about.com/od/pronunciation/a/tones.htm).
Although both tone and intonation are
based on pitch, the difference is in the use of pitch. In the into-
nation languages, pitch tends to be the feature of a phrase or
sentence to signal attitudinal, emotional and, at times, grammat-
ical differences, while tone tends to be the feature of a word or
syllable to signal lexical and/or grammatical differences. Thus,
the difference between tonal and intonational languages consti-
tutes a major difference in the overall melody of a given lan-
guage. This is why adult learners of intonational languages, such
as English, experience serious difficulty when learning tonal
languages. By the same token, speakers of tonal languages
encounter a similar level of difficulty when embarking on learning
intonational languages. Each group struggles to restrain the
strong drive in the direction of pitch orientation in their lan-
guages when working on their L2s. Failing to restrain that drive
results in serious overall melody change coupled with lexical,
grammatical and attitudinal alterations.
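The lexical role of tone described in item (f) can be sketched in a few lines of Python. This is only an illustrative sketch: the table reuses the four <ma> glosses cited above, written with standard pinyin tone marks, while the names MA_TONES, gloss and toneless are hypothetical and introduced here purely for illustration.

```python
# Illustrative sketch only; all names are hypothetical, not from the book.
# The four Mandarin tones on the syllable <ma>, with the glosses cited in the text.
MA_TONES = {
    "high level": ("mā", "mother"),
    "rise": ("má", "hemp"),
    "fall-rise": ("mǎ", "horse"),
    "fall": ("mà", "scold"),
}


def gloss(tone: str) -> str:
    """Return the meaning that <ma> carries with the given tone."""
    return MA_TONES[tone][1]


def toneless(forms_by_tone: dict) -> set:
    """Model a learner who ignores tone: every entry collapses to plain 'ma'."""
    return {"ma" for _ in forms_by_tone}


for tone, (form, meaning) in MA_TONES.items():
    print(f"{tone:10s} {form} = {meaning}")

# With tone there are four distinct words; without it, one ambiguous syllable.
print(len({form for form, _ in MA_TONES.values()}), "distinct forms with tone")
print(len(toneless(MA_TONES)), "distinct form without tone")
```

The point of the sketch is only that pitch does lexical work in a tone language: once the tone is dropped, the four dictionary entries are no longer distinguishable.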
14.2.2. Tackle the most Salient Phonetic Problems

Obviously, the instructor has to point out to learners which
problems rise to the level of a phonological accent and which
ones remain at the phonetic level between the two languages
involved. Between any two languages, there are innumerable
phonetic differences, some more prominent than others. Let us
take English and Greek. In Greek, the English postalveolar
affricates [tʃ, dʒ], as in <church> and <judge>, are absent and
are usually replaced with their alveolar counterparts [ts, dz].
The substitution does not confuse meaning; it simply renders
unfamiliar the words in which the sounds occur. Although for
English learners of Spanish the confusion between the two <r>s,
as in <caro> = [kaɾo] (dear) and <carro> = [karo] (cart/car),
constitutes a phonological problem, for Spanish learners of
English the problem remains phonetic. For speakers of the
languages of the Indian subcontinent, retroflexion is the primary
source of phonetic accent; however, it is so pervasive that it
can seriously interfere with how well they are understood by the
natives of the language they are targeting. Comprehension is
further hampered by the syllable-timed rhythm that most Indian
languages have. On the flip side, it is worth pointing out that
for English speakers learning other languages, it
makes a substantial difference in their phonetic rendition of
those languages if they can replace their approximant <r>
with a tap, flap, or trill <r> according to the specific language
they are targeting.
14.2.3. Improve other Linguistic Skills

Developing high competency in all linguistic systems including
syntax, morphology, lexicon and idiomatic expressions alleviates
the negative impact of both phonological and phonetic accent in
two ways. First, it affords the person using L2 greater sentential
and discourse fluency. Second, the resulting fluency increases
the likelihood that the native listener will predict the meaning
of mispronounced words from the context of the sentence or
discourse produced. This is exactly what most educated adults
attempting communication in L2 have recourse to. It was
mentioned earlier that retroflexion permeates the native
languages of the Indian subcontinent. Consequently, when their
speakers learn or speak other languages, the strong retroflexion
colors their rendition of any L2 they attempt. It thus generates
serious phonetic noise to the ear of the natives of the attempted
L2. One strategy to mitigate the pervasive interference of
retroflexion in their L2s is to enhance their competency in
syntax, morphology, lexicon, idiomatic expressions, etc. These
will all contribute toward a better conveyance of meaning to
their listeners regardless of the phonetic interference.
14.3. ACCENT DETECTION

Research in the domain of automatic speech and accent detection
and recognition systems using advanced software is making
considerable progress (Zheng et al., 2005). However, machine
accent detection is not the focus here; rather, the focus is on
live accent detection in real-life situations. Stated
differently, the attention is on man-to-man accent detection, not
machine-to-man. There are different scenarios for the detection
of accent by different individuals, ranging from an ordinary
unsophisticated person to a professional one. Let us shed some
light on these scenarios.
14.3.1. Accent Detection by Ordinary Individuals

This is a scenario which involves ordinary native speakers of a
given language listening to others who are speaking their
language as L2. The listeners in this case may range from
illiterate to educated and highly educated individuals, but none
of them has a professional orientation in linguistics, in
general, or in phonetics and pronunciation, in particular. All
that they have is a sensitive ear for their language that helps
them to innately assess the accuracy of the pronunciation of
their L1 by an L2 speaker. Often, they perceptually, as well as
cognitively, sense the accent, but cannot describe it or pinpoint
the details. The assessment of accent by some of them might be
very general and as simple as saying: "He has an accent." Others
might say: "The vowels sound somewhat off the normal." A few
might be fairly
detailed in their commentary by identifying certain vowels, con-
sonants or even stress and rhythmic performances that did not
sound accurate and/or native-like.
14.3.2. Accent Detection by Professionals

In this context, professionals represent those individuals who
possess knowledge and experience in the nature and structure of
human language with reasonable exposure to linguistics and
phonetics. They understand human language as sets of struc-
tures and systems and the manner in which they collaborate
accurately to generate meaning. With specific focus on
pronunciation, they possess enough expertise to make very refined
judgments as to the source of mispronunciation and accent, as
well as to recommend techniques to eliminate the source of the
problem. However, to be able to achieve all that, the
professional has to have knowledge about the structures and
systems of the two or more languages involved. This knowledge
does not imply knowing the two languages fluently or even fairly
well; however, he must have a certain degree of experience with
and exposure to the two languages in action and in real-life
situations.
Personally, I do not consider Spanish a language that I know,

but I have had ample exposure to it in the community as well as
in my classes. Over a period of three decades, I accumulated
rich knowledge and experience about Hispanic learners of
English. I identified almost all the major pronunciation errors
they made in English. This authentic interaction with Hispanic
learners of English encouraged me to document my experience in a
book1 which covered a wide range of pronunciation problems they
encounter in learning English and tips to enable them to
overcome them. It was through such
educational exchange with my students that I enriched my own
knowledge and field experience in teaching pronunciation. Their
paramount difficulty in distinguishing English <fill> from
<feel> or <full> from <fool> helped me identify the vowel
system of Spanish as a centrifugal one in which the vowels tend
to be almost frozen in their quality and quantity. This forces
them cognitively and articulatorily to cluster the two vowels of
each pair into one vowel of their own that is neither of the two
English vowels. It took me a long time to discover the problem,
analyze it and design effective strategies and exercises to help
them perceive the difference, recognize it and execute it com-
fortably.
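The merging of the two vowels of each pair into a single learner vowel can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the broad segment lists, the placeholder symbols "I" and "U" (standing for the learner's own vowels, which match neither English vowel), and the hypothetical names WORDS, MERGER, learner_form and homophone_sets.

```python
# Illustrative sketch only; names and placeholder symbols are hypothetical.
from collections import defaultdict

# Broad transcriptions of the English pairs mentioned in the text.
WORDS = {
    "fill": ["f", "ɪ", "l"],
    "feel": ["f", "i", "l"],
    "full": ["f", "ʊ", "l"],
    "fool": ["f", "u", "l"],
}

# A learner with one high front and one high back vowel category.
# "I" and "U" are placeholders for the learner's own vowels.
MERGER = {"ɪ": "I", "i": "I", "ʊ": "U", "u": "U"}


def learner_form(segments):
    """Rewrite a word with the learner's merged vowel categories."""
    return tuple(MERGER.get(seg, seg) for seg in segments)


def homophone_sets(words):
    """Group the words that come out identical after the merger."""
    groups = defaultdict(list)
    for word, segments in words.items():
        groups[learner_form(segments)].append(word)
    return [sorted(group) for group in groups.values() if len(group) > 1]


print(homophone_sets(WORDS))
```

Under these assumptions each pair collapses into one form, which is the kind of confusion between <fill> and <feel> or <full> and <fool> reported for these learners.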
14.3.3. Telling the Linguistic Background of a Speaker

through Accent
Some people, especially linguists or those who have a passion
for languages, manifest a hobby or skill at trying to identify the
linguistic background of a speaker of a second language (L2) by
simply listening to the person without even having any prior
knowledge of the native language of the speaker. This is not an
easy task because it needs talent, linguistic skill and experience;
most likely, a combination of all three. Personally, I have tried
this hobby and I succeeded at times and failed at others. Such a
linguistic skill can work as a two-edged sword for the
professional orientation of under-cover agents as well as for
those who hunt for them. More will be said on this in due course.
1 Linguistic tips for Latino learners and teachers of English, 2007.
This is not an easy task because it needs thorough
knowledge about the overall pronunciation of the L2 that is spo-
ken and the L1 of the speaker; additionally, it requires
knowledge about the two phonologies and at times even more
refined phonetics. Let us take English as the attempted L2 by an
X language speaker. One of the foremost prerequisites to make a
correct prediction is to be aware of some of the most salient
phonological features of the X language and their expected re-
flexes in the rendition of English. Even with this apparent thor-
ough knowledge, one can go wrong for two primary reasons.
First, some typical phonological and/or phonetic features may
be shared by more than one language. Thus, one tends to think
that the language of the speaker is X but it turns out to be Y or
Z. Let me cite one example of how I went wrong in my linguistic
identification of one of my students. It was the first day of a
class and I wanted to get as much background information
about my students as possible. Everyone began to give a brief
summary of his/her background. In the short presentation of
one of the students, I heard a couple of retroflex <r>s, so,
I said: "Are you from India or Pakistan?"
She responded: "I am from Guatemala."
Of course, I was mute and I apologized. Deep in my heart, as a

linguist, I was embarrassed for the glaring misidentification. She
identified herself as a native speaker of Spanish, but where did
she get that retroflex <r>? I still do not know to this day. My
only explanation at the time was that the retroflex <r> might
have been part of a linguistic substratum of a local Native Amer-
ican language or an idiosyncratic feature of her overall speech.
To add more specifics to this linguistic encounter, I have to ad-
mit that besides the hearing of the retroflex <r>, the student
was of a dark complexion very much like many sub-continental
Indians. The complexion enticed me to make the judgment that
turned out to be wrong. Since then, I have taught myself a lesson:
"Don't be fooled by the complexion."
Obviously, my attempts at accent detection have not al-
ways been disappointing. On several occasions I was right on
target and then the question came: "How did you know I was
X?" Let me cite some such accent detection anecdotes.
Years back, in one instance, I was involved in the registra-
tion for the new semester courses. One course was canceled, but
was still on the list. Some students came complaining and
yelling: "Why is it still on the list?" One of the students who
desperately needed the course was the most vocal. I heard typical
repeated alveolo-palatal [ɕ] and [ʑ] sounds in place of the
English [s] and [z]. I wanted to cool her down and,
I said to her: "Are you from Greece or Cyprus?"
She abruptly said: "How did you know?"
I said: "Register for my course and I will tell you later."
No doubt, my course was relevant to what she was planning to
study.
In another instance, I was in hospital for surgery. A day
after surgery, I was allowed to take some clear soup. When it
was brought to my room the nurse said: "You hab to pinish all of
it." Naturally, her sentence contained several pronunciation
hints that pointed in the direction of her Tagalog language
background. Three hints were outstanding, namely, [b] for [v],
[p] for [f], and her unaspirated [t] and [p]. When she came back
to check on me, I greeted her in Tagalog.
She was surprised and said: "Are you Pilipino?"2
I said: "No, but I thought you were. I read your name tag."3
Just recently, I greeted one of my new neighbors simply to
strike up a friendly relationship. There was work being done on
his garage door.
I said: "You must be busy."
He replied: "Yes, I am changing my garage door."
2 Remember, not Filipino.
3 Of course, I gave a fake answer to avoid commenting on her accent.
Once I heard [ instead of [] and [g in-

stead of [g , I guessed he was Greek.
I said: "Tikanis?" (How are you?).
He responded: "Kala" (Good).
And he was Greek. It took one little indicator of accent to
identify him linguistically.
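The guessing procedure behind these anecdotes can be sketched as a simple cue-counting routine in Python. It is a toy sketch, not a description of any real accent-detection system: the cue table contains only the cues and backgrounds mentioned in the anecdotes above, and the names CUE_TO_L1 and candidate_l1s are hypothetical.

```python
# Toy sketch only; the table holds just the cues named in the anecdotes,
# and every name here is hypothetical.
from collections import Counter

CUE_TO_L1 = {
    "retroflex <r>": ["languages of the Indian subcontinent"],
    "[b] for [v]": ["Tagalog"],
    "[p] for [f]": ["Tagalog"],
    "alveolo-palatal sounds for [s] and [z]": ["Greek"],
}


def candidate_l1s(heard_cues):
    """Count how many heard cues point to each candidate L1.

    The result is a ranked list of guesses, never a verdict: a cue may be
    shared by several languages or be idiosyncratic to one speaker.
    """
    votes = Counter()
    for cue in heard_cues:
        for l1 in CUE_TO_L1.get(cue, []):
            votes[l1] += 1
    return votes.most_common()


print(candidate_l1s(["[b] for [v]", "[p] for [f]"]))
print(candidate_l1s(["retroflex <r>"]))
```

Two converging cues, as with the nurse, give a firmer guess than a single one; the Guatemalan student with the retroflex <r> shows how a guess resting on one cue can go wrong.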
14.3.4. Hiding an Agent through Hiding an Accent

In this section, the aim is not to detect an accent, but rather to
hide it lest it should be exposed with unknown consequences. It
was made clear earlier on that accent reduction is professionally
very beneficial for instructors of languages, actors, newscasters,
spontaneous interpreters and under-cover agents; however, in
the real world, there is one difference between under-cover
agents and the rest of these professionals: the latter can
tolerate making mistakes or even blunders, whereas under-cover
agents cannot. Any linguistic gaffe can cost them dearly. No
doubt, linguistic talent, language competency, in general, and
pronunciation, in particular, are indispensable prerequisites
for someone who is willing to serve as an under-cover agent or,
to use the cruder name of the profession, as a spy. Grosjean
(2010-a) wrote a compact, but linguistically and culturally very
rich, article concerning the linguistic and cultural
prerequisites for this very risky and mentally demanding
profession of spying, especially as a sleeper agent (or deep
cover agent).
A sleeper agent can be either the native of the same coun-
try that is spied on or a native of the spying country who has
been embedded in the target country for a long time. In the case
of a native citizen, there is no problem of native language and
culture competency and native unaccented pronunciation because
he belongs to that linguistic/cultural community. The lack of
native or even native-like linguistic and cultural qualifications
may be a source of problems for the non-native agent. This is so
because, in spite of the very disciplined linguistic and cultural
orientation of the person, he may inadvertently and
subconsciously reveal a linguistic or cultural hint, regardless
of how minor, that is inconsistent with the language and culture
of the targeted country. As the intruder country does its best to
groom its spies, so does the targeted country in priming its
counterintelligence cadre. Spy trackers should supposedly be
coached to watch for the tiniest linguistic, cultural and
behavioral indications to be able to detect deeply covered
agents.
The above discussion brings forth the very delicate respon-
sibility of preparing an under-cover agent. Naturally, if the
agent is a native speaker then accent becomes a non-issue. For
the non-native agent achieving native proficiency in L2 is a rari-
ty except if he is embedded as a very young person so that he
will have ample opportunity for full immersion in the target
language and culture. In the absence of the latter condition, the
candidate will come from amongst adults. Even with adults, the
younger the better for mastering an L2 or C2. Insofar as
pronunciation is concerned, the target will be near-native
proficiency, as native proficiency is a rarity. Even near-native
proficiency is not easy to achieve in pronunciation, as early
exposure to the sound system seems to be a prerequisite, unlike
the rest of the linguistic systems of lexicon, morphology and
syntax, in which many educated non-natives can outperform
many natives. Once again, international figures such as Joseph
Conrad, Krishna Menon and Henry Kissinger achieved the high-
est levels of native competency in all linguistic systems of Eng-
lish except for pronunciation. In the case of the latter two,
whom I have heard speaking, it has been a phonetic accent (no
interference with meaning) rather than a phonological one (in-
terference with meaning).
It is assumed that individuals recruited for spying must
demonstrate prowess in many skills, foremost of which is
expected to be competency in L2. This implies extensive
orientation in pronunciation with particular focus on areas that
trigger
the most revealing phonological accent followed by areas of less
revealing phonetic accent. To achieve this goal, the individual
should go through a very stringent preparation and orientation
under the supervision of highly gifted linguists and phoneticians
whose focus would be all the steps highlighted in section 14.2
perhaps with greater depth and rigor.
The greater in-depth and rigorous orientation of spies is
needed because of the risky nature of the assignment. Greater
in-depth orientation implies preparation not only in the
linguistic and phonetic nature of the targeted language, but also
in C2 orientation. "If a person is claiming to be a native
speaker, there may be occasional language oddities or incongruous
phrasing that can alert you to this person not being a native
speaker but trying to hide this fact"
(http://www.wikihow.com/Spot-a-Spy). This implies that any L1 and
C1 residues seeping accidentally through the adopted L2 and C2
can become potential traps.
At the highest level of orientation, the person may need a
certain degree of theoretical familiarization with linguistics and
the dynamics of speech production and pronunciation. Consid-
erable difference is expected in the interpretation of speech acts
and linguistic facts between an ordinary person, a person with
some exposure to linguistics and a specialist. Consider this
example that I recently had to comment on. One of my friends,
who is a geologist, thought that another friend of ours was a
speaker of a certain Neo-Aramaic (Modern Assyrian) dialect
because all his <r>s sounded fully retroflex instead of the
regular tap or trill <r>s of the language. I said: "No, he was
not." Then, I explained to him that our friend has a speech
defect that makes his <r>s sound as if they were retroflex. This
indicates the
huge gap between the impression of a lay person and the reality
of a specialist. There is no doubt, whatsoever, that the theoreti-
cal knowledge and expertise in any field facilitates the mastery
of applied skills and the creativity in them.
One last piece of advice from a linguist to whoever is willing
to consent to the mission of an under-cover agent: beware of
your subconscious when handling the target language (L2) and
target culture (C2). It has been emphatically stressed throughout
this book that both native language (L1) and native culture (C1)
are the creation of the brain and are stored in the subconscious.
This is why Grosjean highlights the fact that "Sleeper agents
have to speak monolingually at all times, even when the situation
is conducive to code-switching" (Grosjean, 2010-b). Certainly,
Grosjean inspires me to stress the fact that sleeper agents should
also behave monoculturally lest they should be snared because of
falling victim to their cognitively deep-seated C1. Let me cite
the following anecdote from years back when two undercover
Iraqi policemen tricked two Iranian smugglers into admission
and arrested them by manipulating the linguistic/cultural sub-
conscious of the smugglers:
Half a century ago when we used to travel from Kirkuk in
the north of Iraq to Baghdad in the center, we had the option
of taking a very slow train that took twelve (12) hours or the
option of a bus that took six hours, but was vulnerable to
getting lost due to the lack of modern roads with proper
signs. Once I took the bus and it pulled out in the morning
and continued for two (2) hours, after which it stopped at a
small town, picked up two more passengers and continued the
journey. The two passengers were dressed in folk attire and
carrying midsize bags on their backs. After an hour or so, and
to the surprise of all passengers, the driver announced that he
had lost his way and had been circling around during the last
hour. The driver was embarrassed to find himself back at the
same town where he had first stopped and picked up two
passengers. We were all frustrated, but we had no better choice.
By coincidence, two more passengers were picked up who were
dressed in pants and jackets. They proceeded to the last seats on
the bus, which were empty. The bus drove for about ten (10)
minutes and all of a sudden I heard someone from the back seats
loudly uttering a few words which I knew were not Arabic,
Turkmeni or Assyrian because I spoke those languages. They did
not sound Kurdish, as I had some impression of how the Kurdish
language sounds. When the few words were spoken out, I saw only
two people from the middle seats turn their heads to look back at
the source of the words. It was one of the last two passengers
who was the speaker. In a split second, they both dashed forward
in the direction of the first two passengers who had looked back,
yelled out some obscenities and handcuffed them, saying that they
were Iranian smugglers whom they had been tracking for hours.
They then ordered the driver to return to the same town once
again where the police station was.
I discovered later that the last two passengers were undercover
policemen who had lost track of those two Iranian smugglers, but
had a feeling that they had taken the bus. Ironically, the bus
lost its way and by a stroke of chance was back where it had
been two hours earlier. Apparently, the two undercover policemen
intentionally went to the back of the bus and sounded out loudly
a phrase in the Farsi language. Of all the passengers, only the
two smugglers turned their heads decidedly and looked back
because they thought they were being welcomed by an acquaintance.
Certainly, in turning their heads to the source of the Farsi
greeting, they did not act consciously; rather, it was the two
experienced policemen who played a linguistic trick on them by
manipulating the unintentional, subconscious and reflexive
response of the two native speakers of Farsi. It was their
linguistic and/or cultural subconscious that betrayed them.
In sum, those two smugglers were caught because they
failed to behave as Arabic-speaking monolinguals, as Grosjean
phrased it, and hide their identity; hence, under-cover agents
should be more careful than those two smugglers in order not to
take the bait and uncover themselves under the influence of the
subconscious.

No one likes to have an accent; however, accent is usually part
and parcel of attempting an L2 when one is no longer a child.
Some adults stay with their accent throughout their life, but
others try to reduce it as much as possible. The degree of success
in accent reduction is contingent on many factors including
age (the younger the more successful), linguistic talent,
enthusiasm, professional orientation, etc. If all factors are
available, it
is the professional instruction and orientation that eventually
make the difference in the degree of success. Doubtless, not eve-
ryone is qualified to conduct the professional orientation in mat-
ters of language mastery and accent reduction. Any person dele-
gated with this responsibility must have the qualifications of a
linguist and phonetician or be a linguistically gifted individual.
From the linguistic perspective, all attempts at accent reduction
or accent faking have no negative intent except for spying which
is a double-edged sword: good for the spying agent, but bad for
the spied on. If one accepts this type of professional service and
wants to avoid the risky consequences, he has to learn how to
restrain the subconscious linguistic and cultural dominance of
L1 and C1 in situations where L2 and C2 are needed.
REFERENCES
Abercrombie, D. (1967). Elements of general phonetics. Edin-

burgh: Edinburgh University Press.
Adams, C. (1979). English speech rhythm and the foreign learner.
The Hague: Mouton.
Aitchison, Jean (1996). Seeds of language: language origin and
evolution. New York: Blackwell.
Anderson, John R. (1980). Cognitive psychology and its implica-
tions. San Francisco: W.H. Freeman & Company.
Arnold, Magda B. (1984). Memory and the brain. Hillsdale, New
Jersey: LEA Publishers.
Baddeley, Alan D. (1976). The psychology of memory. New York:
Basic books, Inc. Publishers.
Baddeley, Alan D. (1993). Your memory: a user's guide. London: Multimedia
Books Limited.
Bates, Elizabeth (1999). On the nature and nurture of language.
Frontiere della Biologia. The Brain of Homo Sapiens. Rome:
Giovanni Trecanni.
Beck, Douglas L., and Flexer, Carol (2011). Listening is where
hearing meets brain.
http://www.hearingreview.com/issues/articles/2011-02_02.asp.
Best, Catherine T. (1991). The emergence of native-language
phonological influences in infants: a perceptual assimilation
model, Haskins laboratories status report on speech research,
SR107/108, 1-30.
Best, Catherine T, McRoberts, Gerald W. and Goodell, Elizabeth
(2001). Discrimination of non-native consonant contrasts
varying in perceptual assimilation to the listener's native
phonological system. Journal of the Acoustical Society of
America, Vol. 109, 2, 775-794.
Borden, Gloria J. and Harris, Katherine S. (1980). Speech science

primer: physiology, acoustics and perception of speech. Balti-
more: Williams & Wilkins.
Bourne, L.E, Dominowski, R.L., Loftus, E.F. and Healy, A.F.
(1986). Cognitive processes. Englewood Cliffs, New Jersey:
Prentice-Hall.
Caplan, David (1995). Language and the Brain, Vol. 4, No. 4, 14.
Carpenter, Harry (2004). The Genie within: your subconscious
mind. San Diego: Anaphase II Publishing.
Catford, J.C. (1977). Fundamental problems in phonetics. Edin-
burgh: Edinburgh University Press.
Catford, J.C. (1994). A practical introduction to phonetics. Oxford:
Clarendon Press
Chechik, Gal, Meilijson, Isaac and Ruppin, Eytan (1999). Neu-
ronal regulation: a mechanism for synaptic pruning during
brain maturation. Journal of Neuronal Computation, Vol. 11,
2061-2080.
Chomsky, N. and Halle, M. (1968). The sound pattern of English.
New York: Harper & Row.
Christensen, Ken Ramshøj (2001). The Co-evolution of language
and the brain: a review of two contrastive views (Pinker &
Deacon). Grazer Linguistische Studien, Vol. 55, 1-20.
Croom, Christopher, (2003). Language origins: did language
evolve like the vertebrate eye, or was it more like bird
feathers? http://www.csa.com/discoveryguides/lang/gloss_f.php
Clancy, Barbara and Finlay, Barbara (2001). Neural correlates of
early language learning. In Language development: the essential
readings (Mike Tomasello & Elizabeth Bates, eds.), 307-330,
Wiley-Blackwell.
Cummins, Jim (1979). Cognitive/academic language proficien-
cy, linguistic interdependence, the optimal age question
and some other matters. Working papers in bilingualism, 19,
121-129.
Cummins, Jim (1984). Bilingualism and special education: issues
in assessment and pedagogy. Clevedon: Multilingual Matters.
Dalbor, J. (1969). Spanish pronunciation: theory & practice. New
York: Holt, Rinehart & Winston.
Dale, P. & Poms, L. (1985). English pronunciation for Spanish
speakers. Englewood Cliffs, New Jersey: Prentice-Hall.
Daniloff, Raymond G. (1973). Normal articulation processes.

(Fred D. Minifie, Thomas J. Hixon, Frederick Williams, David J.
Broad, eds.). Normal aspects of speech, hearing and language,
169-209. Englewood Cliffs, New Jersey: Prentice-Hall.
Dauer, R. (1983). Stress-timing and syllable-timing reanalyzed.
Journal of Phonetics, 11, 51-62.
Deacon, Terrence (1997) The Symbolic species: the co-evolution of
language and the human brain. London: Penguin Books.
de Houwer, Annick, (1990). The Acquisition of two languages from
birth: a case study, Cambridge Studies in Linguistics.
Delattre, P. (1965). Comparing the phonetic features of English,
German, Spanish and French. Heidelberg: Julius Groos Ver-
lag.
Eimas, P.D. (1978), Developmental aspects of speech perception.
(In R. Held, H. W. Leibowitz, & H. L. Teuber, eds.), Hand-
book of sensory physiology, vol. 8. Berlin: Springer.
Esling, John H. and Wong, R. F. (1983). Voice quality settings
and the teaching of pronunciation. TESOL Quarterly,
Vol. 17, 89-95.
Fleischhacker, Heidi (2000). Cluster-Dependent Epenthesis
Asymmetries
(http://www.linguistics.ucla.edu/people/grads/fleischhacker/uclawpl.pdf).
Fox, Barbara, J. and Hull, Marion A. (2002). Phonics for the
teacher of reading. Upper Saddle River/New Jersey: Merrill.
França, Aniela Improta (2006). Introduction to neurolinguistics.
In: Textos em Psicolinguística, Ingrid Finger; Carmen
Matzenauer (Org.). Pelotas: Editora da Universidade Católica
de Pelotas, Vol. 1, 152.
Gardner, Howard. (1983). Frames of mind: the theory of multiple
intelligences. New York: Basic Books.
Gardner, Howard (1993). Creating minds: an anatomy of creativity
seen through the lives of Freud, Einstein, Picasso, Stravinsky,
Graham, and Gandhi. New York: Basic Books.
Gimson, A. C. (1967). An introduction to the pronunciation of Eng-
lish. London: Arnold.
Gopnik, A., Meltzoff, A., Kuhl, P. (1999). The scientist in the crib:
what early learning tells us about the mind. New York: Harper
Collins Publishers
Grosjean, François (2010-a). Bilingual: life and reality. Harvard

University Press.
(2010-b). http://www.guardian.co.uk/education/2010.
Hadlich, Roger, Holton, James and Montes, Matias (1968). A
drillbook of Spanish pronunciation. New York: Harper & Row,
Publishers.
Handbook of the International Phonetic Association (1999).
Cambridge: Cambridge University Press.
Hiiemae, Karen M. and Palmer, Jeffrey B. (2003). Tongue
movements in feeding and speech. Critical Reviews in Oral
Biology & Medicine, Vol. 14, 413–429.
Hyman, Larry (1975). Phonology. New York: Holt, Rinehart &
Winston.
Jabbari, Ali A., van de Weijer, Jeroen, Safari, Parvin and
Falaknaz, Farane (2012). Journal of Teaching Language Skills,
3/4, 59–76.
Jakobson, R., Fant, G. & Halle, M. (1969). Preliminaries to speech
analysis. Cambridge, Massachusetts: M.I.T. Press.
Johnson, J. S. and Newport, E. (1989). Critical period effects in
second language learning: the influence of maturational
state on the acquisition of English as a second language,
Cognitive Psychology, Vol. 21, 60–99.
Joseph, Rhawn (2011). Origin of thought: consciousness, lan-
guage, egocentric speech and multiplicity of mind. Journal
of Cosmology, Vol. 14.
Kanokpermpoon, Monthon (2007). Thai and English consonantal
sounds: a problem or a potential for EFL learning? ABAC
Journal, Vol. 27/1, 57–66.
Kelly, Spencer, Özyürek, Asli and Maris, Eric (2009). Two sides
of the same coin: speech and gesture mutually interact to
enhance comprehension. Psychological Science, Vol. 20, 1–8.
Kenyon, J. S. and Knott, T. A. (1953). A Pronouncing dictionary of
American English. Springfield, Massachusetts: G & C Merri-
am Company.
Kissin, Benjamin (1986). Conscious and unconscious programs in
the brain. New York: Plenum Medical Book Company.
Ladefoged, P. (1982). A course in phonetics. New York: Harcourt
Brace Jovanovich.
Ladefoged, P. and Maddieson, Ian (1996). Sounds of the world's
languages. Cambridge, Massachusetts: Blackwell.
Larson, Christian D. (1912). Your forces and how to use them
(http://www.sacred-texts.com/nth/yfhu/yfhu02.htm).
Laver, John (1980). The phonetic description of voice quality.
Cambridge: Cambridge University Press.
(1994). Principles of phonetics. Cambridge: Cambridge
University Press.
Lehiste, I. (1970). Suprasegmentals. Cambridge, Massachusetts:
M.I.T.
Lenneberg, Eric H. (1967). Biological foundations of language.
New York: John Wiley & Sons.
Levitt, Robert A. (1981). Physiological psychology. New York:
Holt, Rinehart & Winston.
Loftus, Elizabeth (1980). Memory: surprising new insights into how
we remember and why we forget. Reading, Mass.: Addison-
Wesley Publishing Co.
Lowie, W. and Bultena, Sybrine (2007). Articulatory settings and
the dynamics of second language speech production
(http://www.phon.ucl.ac.uk/ptlc/proceedings/ptlcpaper_15e.pdf)
MacKay, I. R. A. (1978). Introducing practical phonetics. Boston:
Little Brown & Company.
McKinney, James C. (2005). The diagnosis and correction of vocal
faults: a manual for teachers of singing and for choir directors.
Long Grove/Illinois: Waveland Press.
Martinet, André (1964). Elements of general linguistics. London:
Faber & Faber.
Miller, Cynthia. Brain for success. (http://brainforsuccess.com/
howyourbrain work.html).
Morley, Joan (1991). The pronunciation component in teaching
English to speakers of other languages. TESOL Quarterly,
Vol. 25, 481–520.
O'Connor, J. D. (1973). Phonetics. London: Penguin Books.
Odisho, Edward Y. (1977). Arabic /q/: a voiceless unaspirated
uvular plosive. Lingua, Vol. 42, 343–347.
(1977). The opposition / t / vs. / th / in Neo-Aramaic.
Journal of the International Phonetic Association, Vol. 7,
79–83.
(1979/a). An emphatic alveolar affricate. Journal of the
International Phonetic Association, Vol. 9, 67–71.
(1979/b). Consonant clusters and abutting consonants.
System, Vol. 7, 205–210.
(1981). Teaching Arabic emphatics to the English learners
of Arabic. Papers and Studies in Contrastive Linguistics,
Vol. 13, 275–280.
(1988a). The sound system of modern Assyrian (Neo-Aramaic).
Wiesbaden: Otto Harrassowitz Verlag.
(1988b). Sibawaihi's Dichotomy of majhūra and
mahmūsa Revisited. Al-'Arabiyya, Vol. 21, 81–90.
(1990). Phonetic and phonological description of the
labio-palatal and labio-velar approximants in Neo-Aramaic.
Wolfhart Heinrichs (ed.). Studies in Neo-Aramaic, 29–33.
Atlanta, Georgia: Scholars Press.
(1992). Transliterating English in Arabic. Zeitschrift für
arabische Linguistik, Vol. 24, 21–34.
(2002). Bilingualism: a salient and dynamic feature of
ancient civilizations. Mediterranean Language Review, Vol.
14, 71–97.
(2003). Techniques of teaching pronunciation in ESL, bilingual
and foreign language classes. München: Lincom-Europa.
(2004). A linguistic approach to the application and teaching
of the English language. New York: Edwin Mellen Press.
(2005). Techniques of teaching comparative pronunciation
in Arabic and English. New Jersey: Gorgias Press.
(2007/a). A Multisensory, Multicognitive Approach to
Teaching Pronunciation. Revista de Estudos Linguísticos da
Universidade do Porto, Vol. 2, 3–28.
(2007/b). Linguistic tips for Latino learners and teachers of
English. New Jersey: Gorgias Press.
(2010). An Aerodynamic, Proprioceptive and Perceptual
Interpretation of Sibawayhi's Misplacement of // and //
with Majhūra Consonants. Zeitschrift für Arabische Linguistik,
Vol. 52, 39–52.
(2013). Some primary sources of accent generation in
the pronunciation of English by native Arabs. Nicht nur mit
Engelzungen (Beiträge zur semitischen Dialektologie:
Festschrift für Werner Arnold zum 60. Geburtstag, eds. R. Kuty,
U. Seeger and Sh. Talay, 265–274). Harrassowitz Verlag,
Wiesbaden.
Pennington, Martha C. and Richards, Jack C. (1986). Pronunciation
revisited. TESOL Quarterly, Vol. 20/2, 207–225.
Petitto, Laura-Ann (2002).
http://www.dartmouth.edu/~news/releases/2002/nov/110402a.html.
Pinker, S. (1994). The language instinct. New York: Harper Per-
ennial Modern Classics.
Port, Robert F. (2007). The graphical basis of phones and
phonemes. (Murray Munro and Ocke-Schwen Bohn, eds.).
Second-language speech learning: the role of language experience
in speech perception and production, 349–365. Amsterdam:
John Benjamins Publishing Co.
Repetti, Lori (2012). Consonant-final loanwords and epenthetic
vowels in Italian. Catalan Journal of Linguistics, Vol. 11,
167–188.
Roach, Peter (1983). English phonetics and phonology. Cambridge:
Cambridge University Press.
Selinker, Larry (1972). Interlanguage. International Review of
Applied Linguistics, Vol. 10, 209–231.
Shiver, Elaine (2001). Brain development and mastery of language
in the early childhood years.
(http://www.idra.org/IDRA_Newsletter/April_2001_Self_Renewing_Schools_Early_Childhood/Brain_Development_and_Mastery_of_Language_in_the_Early_Childhood_Years/)
Stockwell, R. P. and Bowen, J. (1965). The sounds of English and
Spanish. Chicago: The University of Chicago Press.
Tortora, G. and Grabowski, S. (1996). Principles of anatomy and
physiology (8th ed.). New York: HarperCollins College
Publishers.
Vicentini, Alessandra (2003). The economy principle in lan-
guage: notes and observations from early Modern English
grammars. (http://www.ledonline.it/mpw/.)
Werker, J.F. (1995). Exploring developmental changes in cross-
language speech perception. (Lila R. Gleitman and Mark
Liberman, eds.). An invitation to cognitive science. Cam-
bridge, Massachusetts: MIT Press.
Werker, Janet F. and Tees, Richard C. (2002). Cross-language
speech perception: evidence for perceptual reorganization
during the first year of life. Infant Behavior & Development,
Vol. 25, 121–133.
Werker, J.F., Yeung, H.H., & Yoshida, K.A. (2012). How do in-
fants become experts at native-speech perception? Current
Directions in Psychological Science, Vol. 21/4, 221–226.
Wesson, Kenneth A. Neuroscience
(http://www.sciencemaster.com/columns/wesson/wesson_part_03.ph).
Whitley, M. S. (1986). Spanish/English contrasts: a course in Span-
ish linguistics. Washington, D.C.: Georgetown University
Press.
Zero to Three. How the brain develops (2009).
(https://www.childwelfare.gov/pubs/issue_briefs/brain_development/how.cfm).
Zheng, Yanli, Sproat, Richard, Gu, Liang, Shafran, Izhak,
Zhou, Haolang, Su, Yi, Jurafsky, Dan, Starr, Rebecca, Yoon,
Su-Youn (2005). Accent detection and speech recognition
for Shanghai-accented Mandarin.
(http://www.isca-speech.org/archive/interspeech_2005/i05_0217.html).
