Words, Context and Contextualization

Dunărea de Jos University of Galați
Faculty of Letters
Specialization:
Romanian Language and Literature, English Language and Literature
Elective course in
English language
Assoc. Prof. Gabriela Dima, PhD
2nd year, 1st semester
D.I.D.F.R.
UDJG
Faculty of Letters
Words. Context. Contextualization

(An elective course in English language
for 2nd year students)
Course tutor:
Associate Professor Gabriela Dima, PhD
Galati
2013
Contents
Chapter 1. Word Definitions
1.1. The Phonological Word
1.2. The Orthographic Word
1.3. The Grammatical Word
1.4. The Lexical Word
1.5. Exercises
Chapter 2. Word Classifications
2.1. Content /vs/ Function Words. Lexical Density
2.2. Core Words /vs/ Non-Core Words. Core Vocabulary
2.3. Dictionaries as Repositories of Words
2.4. Lexical Words and Computer Corpora
2.5. Exercises
Chapter 3. Context and Contextualization
3.1. Acceptations of the Terms Context and Contextualization
3.2. Text Typology
3.3. Lexical Profiles. A Corpus Illustration
3.4. Exercises
Final Test
References
CHAPTER 1. Word Definitions

The term word has been given numerous definitions corresponding to the area
of research and the linguistic objectives envisaged. A good synthesis has been
provided by Crystal (1995: 379):
a. A unit of expression which has universal intuitive recognition by native speakers,
in both spoken and written language.
b. The physically definable unit which one encounters in a stretch of writing
(bounded by spaces) or speech. The word in this sense is often referred to as the
orthographic word or the phonological word.
c. A unit of meaning incorporating all the grammatical variations or forms in which it
is liable to occur, i.e. a lexeme, defined in its turn as the minimal distinctive unit in
the semantics of a language.
d. A linguistic unit that combines to form phrases, clauses and sentences and is
otherwise distinguished as the smallest possible sentence unit.
The present course will illustrate the way these definitions work along two
coordinates by firstly considering words in isolation and secondly in various
contexts, underlining the function of words as the most complex means of verbal
communication: "What can you talk of? he'd ask. Go on pick a subject. Talk. Use
language. Do you know what language is? Well, I'd never thought before - have
you - it's automatic to you isn't it, like walking? Well, language is words, he'd say, as
though he were telling me a secret. It's bridges, so that you get safely from one
place to another. And the more bridges you know about the more places you can
see" (Arnold Wesker, Roots).
1.1. The Phonological Word
The word identified in a stretch of speech is called a phonological word. It is
studied within the phonological system of each and every language under the domain
of phonology. It can be represented by a range of sounds varying between different
dialects, contexts and individual speakers.
The English phonological system is made up of 12 vowels, 8 diphthongs and 24
consonants (Dima 2009). The twelve English vowels are: /i:, i, e, æ, ʌ, ɑ:, o, ɔ:, ʊ, u:,
ɜ:, ə/ (Dima 2009). They have been established by applying the commutation and
substitution methods. It is important to realize that simple English vowels may
contrast with one another in several different ways. The most relevant contrastive
factors are tongue-position, length and lip-rounding. The eight English diphthongs
are: /ai/, /ei/, /ɔi/, /aʊ/, /əʊ/, /iə/, /ɛə/, /ʊə/. They have been defined as sequences of
two (vocalic/consonantal) sounds and treated as being made up of a vowel nucleus
followed by one of the glides /i/, /ʊ/, /ə/. The twenty-four English consonants are: /p,
t, k, b, d, g, tʃ, dʒ, f, θ, s, ʃ, v, z, ð, ʒ, h, m, n, ŋ, l, r, j, w/. They have also been
established by applying the commutation and substitution methods. In order to get an
overall view of the whole range of consonantal sounds, we can arrange them in a
number of groups, each group having in common a certain mechanism of
articulation. The standard pronunciation of the English sounds is changed when
registers, levels of formality and dialects are taken into account. Such a treatment has
been the subject of various debates among both linguists and sociologists.

1.2. The Orthographic Word
The word identified in a stretch of writing is called an orthographic word. It
is studied under the domain of orthography which specifies the correct way of using
a corresponding writing system to write a certain language. The word orthography is
derived from the Greek words orthós (correct) and gráphein (to write).
The spelling of concrete words is based upon one or another of the graphical
rules that exist in a language. In all European languages spelling systems are based
upon the use of the alphabet. The English orthography is chiefly based on the
historical principle, the majority of words preserving their Old English or Middle
English spelling, sometimes without modifications. The English language spelling
system is highly irregular, but still regular to some degree, its mastery only requiring
knowledge of the 26 letters of the alphabet.
The phonetic principle (one letter for one sound) is not expected to play any
role at all in the English spelling. No letter represents one sound only; most letters
are used to represent several sounds each and many combinations of letters are
sometimes used to denote the same sounds.
The morphological principle can be traced by and large on the basis of the
following morphemes indicating: the plural of nouns; the 3rd person singular of the
simple present tense; the past tense; the comparative and superlative degrees of short
adjectives. Except for these rules, the English orthographic word varies with
typography and script. Variation is seen not only at the grammatical level, but at the
lexical level, too.
1.3. The Grammatical Word
Grammatical words are traditionally referred to as items which have the
capacity to fit particular types of linguistic environment according to some patterns
and to the kind of meaning associated with a particular class of word. The canonical
classes defined for the English language as parts of speech are: nouns, pronouns,
adjectives, articles, verbs, adverbs, conjunctions, prepositions, interjections. In what
follows we consider it helpful to have a quick review of these, as they are at the core
of obtaining contextualization at the local and sentential levels.
a. Noun. A word or word group that names a person, a place, a thing, an attitude, an
idea, a quality, a condition, an event, a process, a phenomenon etc. Examples: girl,
garden, chair, boldness, truth, clairvoyance, solitude, oxidation, photosynthesis etc.
The noun is at the core of nominality, a characteristic of the English written
language.
b. Pronoun. A word that functions as a substitute for a noun. Examples: it, he, she,
we, they, us, ourselves, you, this, them. English requires the overt occurrence of
pronouns with a significant referential capacity, whereas Romanian is a pro-drop
language, which can suppress the pronoun in subject position or as a marker of
possession (e.g. the possessor and the possessed: Give me your hand / Dă-mi mâna).
c. Verb. A word or word group that expresses an activity, condition or state of being.
The verb syntactic function is referred to as predication and the main verb in a
sentence is therefore called predicate. Examples: run, sleep, be, feel, believe,
promise, write. A verb phrase (a group of words acting like a single part of speech)
will usually consist of a main verb plus an auxiliary verb like have or be: has been
going, is moving. A special subclass of auxiliary verbs is provided by modals: can,
may, must, ought, shall, will.
d. Adjective. A word or word group that modifies (limits, defines, characterizes, or
describes ) a noun. Examples: nice, young, stubborn, cozy, perfect, impressive,
sublime, undeniable, annoying.
e. Adverb. A word or word group that modifies a verb, an adjective, or another
adverb. Examples: run fast, sleep deeply, very rarely, extremely delicate, run
extraordinarily fast.
f. Preposition. A word or word group that signals relationships of space, time,
direction, or association between its object (the object of a preposition is always a
noun) and some other word or word group. Examples: in the courtyard, after 7:00
P.M., to the lighthouse, with a vote.
g. Conjunction. A word or word group that connects two or more sentence
components. There are three major subtypes: coordinating conjunctions (examples:
and, but, for, yet, so); subordinating conjunctions (examples: although, because, if,
whether); and correlative conjunctions (examples: either... or, neither... nor, both...
and, not only... but also).
h. Interjection. Any part of the sentence that is syntactically independent of the rest of
the sentence and expresses the speaker's attitude towards various elements of the
context of situation. Examples: Well! Oh! For God's sake!
1.4. The Lexical Word
Lexical words are units of lexical meaning; i.e. of non-grammatical meaning
which contribute to organizing our world experience into a wide range of categories.
This classificatory capacity is illustrated by dividing the lexicon/vocabulary of a
certain language into lexical sets or fields: "The diversity of approaches to the
lexicon of a natural language emerges from its dynamics manifested both at the
synchronic level in form of variation and at the diachronic level in form of change.
The most complex way of analyzing these coordinates is the ordering of the lexicon
in lexical fields considered as useful tools for the exploration of lexical meaning,
providing information about the form, meaning, usage, categorization and
relationships of words and phrases" (Dima 2011:).
Lexical fields are organized along a common dimension of meaning which
allows the occurrence of lexical relationships among words such as: antonymy,
synonymy, hyponymy, polysemy, homonymy. As examples of lexical fields we can
mention:
a. Board games: chess, draughts, go, Monopoly etc.
b. Parts of the body: arm, leg, head, finger etc.
c. Cakes and pastries: almond cake, éclair, muffin, pie, tart etc.
d. Chairs: stool, bench, pew, sedan chair etc.
e. Sea mammals: dugong, sea lion, elephant seal etc.
Lexical fields are studied within the domains of lexicology and semantics, as
branches of theoretical linguistics.

1.5. Exercises
1. Practice the following sentences containing words with the contrastive sounds
/i, e, æ/ and then provide your own examples:
a. Bill has seven children.
b. When did Dan tell him?
c. Has Ed been pretty busy?
d. The women met last Saturday.
2. Practice the following sentences containing words with the contrastive sounds
/ ai, ei, oi / and then provide your own examples:
a. John is my baby boy.
b. I like boiled rice and soy sauce.
c. What kind of a noise annoys the oyster?
d. We eat steak each day.
3. Read aloud the following sentences with words containing the contrastive
sounds /θ/ and /ð/.
a. This is the third toothbrush I've lost this month.
b. They have to think this thing through.
c. The baby's teething, so her mouth is rather sore.
4. Read aloud the following sentences with words containing the contrastive
sounds /ʃ/ and /ʒ/.
a. It's a pleasure to see you, Mr. Shaw.
b. She wore a beige suit and red shoes.
c. Shall we wash our clothes, or brush them?
5. Read aloud the following sentences with words containing the contrastive
sounds /tʃ/ and /dʒ/.
a. George bought that chair last July.
b. Which subject does Mr. Jackson teach?
c. Did Charles and Joe enjoy the lecture?
6. Write the following nouns in the plural. Specify the rule you have
applied:
a. wharf, roof, wolf, chief, sheaf, safe, strife, self, proof, leaf, shelf, hoof, calf,
wife, half
b. volcano, solo, mosquito, tomato, hero, octavo, potato, allegro, halo, photo,
cargo, negro, piano, echo, radio
c. monkey, turkey, baby, supply, chimney, body, cry, day, glory, reply, donkey, joy,
folly, dormitory, victory, sympathy
7. Add the endings, keeping -y or changing it to i:

alloy-s/es
employ-ed
pay-ing
ray-s/es
fly-s/es
berry-s/es
empty-ed
marry-s/es
betray-ed
dizzy-er
delay-ed
verify-ing
heavy-er
canary-s/es
guy-s/es
busy-est
8. Add the endings for the comparative of the following short adjectives and
mind the changes in spelling.
narrow
busy
happy
free
long
old
funny
rich
9. Make up lexical fields using the words listed below:

bathroom, house, apple, chair, bill, banknote, money, pear, purse, grapes, stream,
waistcoat, cardigan, pullover, train, bird, sparrow, seagull, wood, oak-tree, birch,
canvas, painter, brush, ink, pen, spatula, satellite, star, planet, chess, tennis, skirt,
wheel, kitchen, path, window, cardinal, knight, queen, blackbird, swallow, path, lark,
cracker, biscuit, chocolate cake, pie, teacher, politician.
10. Complete the sets you have delineated in exercise 9 by adding further words
until each field has 10 members.
CHAPTER 2. Word Classifications

It has become almost a truism to state that words in texts are distributed very
unevenly: a few words are very frequent, some are fairly frequent, and most are very
rare. These facts can be described by means of two distinctions which provide ways of
talking about the vocabulary of a natural language and the distribution of the lexis in texts:
the first distinction refers to content /vs/ function words, and the second to core
/vs/ non-core words.
2.1. Content /vs/ Function Words. Lexical Density
Content words specify what a text is about, while function words relate
content words to each other. Content words are also referred to as major, full or
lexical words. They carry most of the lexical content in being able to make reference
outside language. Function words are also referred to as minor, empty, form,
structural and grammatical words. They are essential to the grammatical structure of
sentences. The above mentioned categories divide the traditional parts of speech into
two broad sets: content words: nouns, adjectives, adverbs, main verbs; function
words: auxiliary verbs, modal verbs, pronouns, prepositions, determiners,
conjunctions.
There is a fuzzy boundary between the two word classes. For example, modal
verbs (must, can, should, will etc.) express an array of modalities such as
obligation, permission, ability, volition, willingness etc., and therefore convey content,
yet they have the syntactic function of auxiliary verbs. Pronouns can have extra-linguistic
reference, while playing a distinct role as noun substitutes. Nevertheless, besides a rough
semantic distinction, content and function words can have strikingly different formal
characteristics. Content word-classes have many members (e.g. tens of
thousands of nouns, but only a couple of dozen pronouns) and are open to hosting
new words coming from children's vocabulary, literary language etc., whereas it is very
rare for new pronouns to enter the language. And only content words take inflections
(such as plural inflections on nouns, person endings on verbs).
This distinction is relevant when analyzing text structure, because different
types of texts have predictably different proportions of content and function words.
On average, written texts have a higher proportion of content words than spoken
texts, because written texts can be more tightly packed with information. The
proportion of lexical words expressed as a percentage represents the lexical density
of a text. If N is the number of running word-forms in the text, and L is the number of
lexical word-forms, then lexical density = 100 × L/N. This measure can be very helpful in
comparing texts from both similar and different registers of a natural language.
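
The formula above can be tried out on a short text with a few lines of Python. This is only a minimal sketch: the FUNCTION_WORDS list, the tokenize helper and the lexical_density function are hypothetical names introduced for illustration, and the function-word list is deliberately small and incomplete, so results on real texts will only be approximate.

```python
import re

# Hypothetical, deliberately incomplete list of function words (an assumption
# made for this sketch; a real study would use a full list or a tagged corpus).
FUNCTION_WORDS = {
    "a", "an", "the", "this", "that", "these", "those",
    "i", "you", "he", "she", "it", "we", "they", "me", "him", "her", "us", "them",
    "my", "your", "his", "its", "our", "their",
    "am", "is", "are", "was", "were", "be", "been", "being",
    "have", "has", "had", "do", "does", "did",
    "can", "could", "may", "might", "must", "shall", "should", "will", "would", "ought",
    "in", "on", "at", "of", "to", "for", "with", "by", "from", "after", "before",
    "and", "but", "or", "so", "yet", "if", "because", "although", "whether",
}


def tokenize(text):
    """Split a text into lower-case running word-forms (apostrophes kept inside words)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def lexical_density(text):
    """Return 100 * L / N, where N counts all word-forms and L the lexical ones."""
    tokens = tokenize(text)
    n = len(tokens)
    if n == 0:
        return 0.0  # an empty text has no density to speak of
    lexical = sum(1 for token in tokens if token not in FUNCTION_WORDS)
    return 100 * lexical / n


# Six running word-forms, three of them lexical (cat, sat, mat)
print(lexical_density("The cat sat on the mat"))  # 50.0
```

Comparing the value obtained for a transcript of conversation with the value for a newspaper paragraph is a simple way of checking the claim that written texts tend to be lexically denser.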
2.2. Core Words /vs/ Non-Core Words. Core Vocabulary
Texts can also be compared by calculating what percentage of words from the
core vocabulary they contain: "By definition, the core vocabulary is known to all
native speakers of the language. It is that portion of the vocabulary which speakers
could simply not do without" (Stubbs 2001: 41). In approximate synonymic series
such as happiness, beatitude, blessedness, bliss, felicity, gladness; compete,
contend, oppose, rival, vie; great, eminent, illustrious, notable, noteworthy, there
would be widespread agreement that one word in each list is somehow more basic
than the others: happiness, compete, great. Such assertions are partly based on
frequency, but also on functional criteria: which words would be most easily
understood by children or non-native speakers; which words would be most useful to
introduce in the early stages of teaching English as a foreign language; which general
words are more likely to be used in different registers.
In conclusion, core vocabulary will certainly contain the most frequent
words in the language (as marked for example in dictionaries such as the Longman
Dictionary of Contemporary English or the Macmillan Dictionary). The 100
most frequent word-forms from a large general corpus will be mainly function words
such as the, of, and, to, plus a few content words such as think, know, time, people,
two, see, way, first, new, say, man, good, little. And the 2,000 or 3,000 most frequent
word-forms will include words which are indispensable for the discussion of a wide
range of topics.
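
A frequency list of the kind mentioned above can be produced with Python's standard library. The sketch below is only illustrative: the helper names tokenize, frequency_list and coverage are assumptions introduced here, and the one-line sample stands in for a large general corpus.

```python
from collections import Counter
import re


def tokenize(text):
    """Split a text into lower-case running word-forms."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def frequency_list(text, n):
    """Return the n most frequent word-forms with their counts."""
    return Counter(tokenize(text)).most_common(n)


def coverage(text, n):
    """Percentage of all running word-forms accounted for by the n most frequent ones."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    top = Counter(tokens).most_common(n)
    return 100 * sum(count for _, count in top) / len(tokens)


sample = "the cat and the dog and the bird"
print(frequency_list(sample, 2))  # [('the', 3), ('and', 2)]
print(coverage(sample, 2))        # 62.5
```

Even in this tiny sample the two most frequent word-forms are function words and cover more than half of the running text, which mirrors the uneven distribution described in this chapter.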
The main defining criterion of core vocabulary is that of maximum usefulness,
which can be operationalized both by distribution in texts and by semantic usefulness.
Thus, we can discover which words are widely and relatively evenly distributed in
texts of different kinds, and which words can be used for defining other words. For
example, doctor will occur in texts of many kinds, both everyday and specialist,
whereas oculist may be common in a few texts, but only on restricted specialist
subjects. Likewise, core vocabulary is not restricted to specialist fields or genres: for
example children (versus offspring or progeny), brothers and sisters (versus
siblings), and stomach (versus abdomen). And core vocabulary is neutral
stylistically, neither markedly casual nor formal: for example, child (versus kid or
kiddy), drunk (versus pissed or inebriated), and give (versus award or donate). Core
words such as laugh and softly can be used to define non-core chuckle, while clumsy
and walk can be used in defining waddle. Sometimes, the two criteria coincide: for
example, oculist is a hyponym of doctor, and award and donate are hyponyms of
give.
The concept of coreness is thus discussed starting from the notion of core
vocabulary. The claim is that some words are more tightly integrated than others into
the language system, and some are more discoursally neutral (unmarked and
unexpressive) than others: "[...] it is more accurate to speak of clines and gradients
and of degrees of coreness in words" (Carter 1987: 43). The greater the degree of
coreness a word has, the more neutral the word is with respect to field (i.e. the field
is not easily identifiable), and tenor (i.e. it will emerge as neutral in a formality test,
or fall at the mid-point of the formality-informality cline).
Core words, according to Carter, are words of high frequency with an
evenness of range and coverage of text: "[...] they will have to be measured as being
evenly distributed over a range of different spoken and written texts" (idem: 45).
Based on Carter's reasoning, core words appear most frequently in the central
regions while non-core words occur at the two poles of the cline of register. It is the
degree of formality that determines their place on the cline of coreness. Hence, it is
to be expected that the more formal spoken registers and the less formal written
registers employ core words most frequently (i.e. registers in the middle regions of
the cline) (see Figure 1).
LEXIS
informal --------------------------------------- formal
------------------------- core words -------------------
Figure 1: Cline of Coreness

Non-core words which occur towards the literate pole tend to be technical
and specific to the register. The registers representing the latter part of the
developmental processes involve a reclassification of knowledge. In the discussion of
technical lexis, Martin asserts that the move from describing to classifying is a
move from the everyday to the technical. He investigates what the functions of
technicality in the text are (1985: 26): "Through technicality a discipline establishes
the inventory of what it can talk about, and the terms in which it can talk about
them". That is, the function is field creating (Martin 1985: 58). He disputes the
belief that for any technical term there already exists a perfectly adequate
"common or garden" word which could be used in its place. In other words, he is
arguing that technical words are not simply core words in jargon form.
The reverse process is pointed out by Viel with reference to the use of high
frequency technical terms in common English: "[...] as soon as an invention or a
new device leaves the closed circle of scientists and technicians, gains popularity and
is used in everyday life, the corresponding word passes from the category of general
scientific and technological words to that of general English"
(http://esp-world.7p.com/Articles_1/vocabulary.html).
2.3. Dictionaries as Repositories of Words
Before finding out where words can be deposited, we must mention
that they are studied within the following theoretical domains:
a. Lexicology, a branch of theoretical linguistics which analyses the phonological,
morphological and contextual (semantic) behavior of general words, the change in
their form and meaning, their origin, development and current use.
b. Terminology, which deals with the study of special-language words or terms
associated with particular areas of specialist knowledge. Terminology is concept-based,
reflecting the fact that the terms which it contains map out an area of
specialist knowledge in which encyclopaedic information plays a central role.
The practical counterparts of lexicology and terminology are lexicography
and terminography. Lexicography is one of the most dynamic branches of applied
linguistics; it deals with the compilation of dictionaries in order to facilitate
people's understanding of the meaning of general words. Terminography is
concerned exclusively with compiling collections of the vocabulary of special
languages.
Dictionaries are reference works which have communication-orientated
functions named after the use situation, as briefly presented below, following
Bergenholtz and Nielsen (2006: 287):
a) to assist the users in solving problems related to text reception/ production of texts
in the native language
b) to assist the users in solving problems related to text reception/ production of texts
in a foreign language
c) to assist the users in solving problems related to translation of texts from the native
/a foreign language into a foreign /the native language
The most frequently mentioned criteria in classifying dictionaries are:
- scope of coverage (e.g. the general or special dictionary, the monolingual or
bilingual dictionary);
- shape/size or content (e.g. the pocket, unabridged or desk dictionary);
- manner of financing (e.g. the commercial dictionary or scholarly dictionary);
- the complexity of the headword (e.g. the dictionary of idioms or collocations,
dictionary of phrasal verbs);
- the type of target user (e.g. the learner's dictionary or dictionary for native
speakers; specialized dictionaries: legal, accounting, medicine, mechanical
engineering etc.);
- the nature of the dictionary seen as a product under various formats: paper
dictionaries, recorded dictionaries on CDs, internet dictionaries and online
dictionaries, computer corpora (BNC, MICASE, LOB, FROWN, etc.);
- other criteria refer to age (children's dictionaries), learners' language level
(dictionaries for advanced learners), the number of entries etc.
Dictionaries can be defined through the elements they contain. A look at any
type will reveal a rather general, acknowledged structure, including the entry,
alphabetization, lexical and linguistic information (phonology/pronunciation,
spelling, morphology, syntax, semantics, context/register, etymology, usage),
illustrations, front and back matter. The absence of one or more of these elements is a
matter of choice among the authors.
The entry represents the alphabetized headword by which the word or
expression being defined is identified. Most headwords are canonical forms making
up a paradigm and being representative of a certain natural language in its standard.
The standard here refers primarily to spelling and pronunciation. Not all words count
as entries.
In what follows I shall briefly comment both upon the nature and number of
the entries as contained in the dictionaries in general and in certain dictionaries in
particular by having a look at what might be called ways of counting the contents of a
dictionary. Here I mention the criterion of multiple (vs) single entry.
When a word belongs to more than one word-class there are two possibilities:
to use as many entries as there are word-classes the word belongs to, or to use only one entry
for all the word-classes the word belongs to. For example, heavy may be used as an
adjective, adverb or noun: the Longman Dictionary of Contemporary English (1995)
contains three headwords heavy in the specified order:
e.g. heavy1 /ˈhevi/ adj. heavier, heaviest
1. <WEIGHT> weighing a lot: I can't lift this case - it's too heavy. / The baby
seemed to be getting heavier and heavier in her arms / how heavy? (= how much
does it weigh?)
How heavy is the parcel? opposite LIGHT3 (4)
heavy2 adv time hangs/lies heavy on your hands if time hangs or lies heavy on
your hands, it seems to pass slowly because you are bored or have nothing to do.
heavy3 n [C] 1 informal [usually plural] a large strong man who is paid to
protect someone or to threaten other people 2 a serious male character in a play or
film, especially a bad character; VILLAIN (1) 3 the heavies BrE large, serious
newspapers. (Longman Dictionary of Contemporary English, 1995: 664)
The Chambers Twentieth Century Dictionary (1997) includes the adjective,
the adverb and the noun under a single entry.
e.g. heavy , hevi adj. weighty: ponderous; laden: abounding.
- adv. heavily
- n. the villain on stage or screen
When a word has several/various meanings or senses, they are separated
under the same entry either by numbers, in the case of different meanings or by
letters, in the case of closely-related meanings, both methods being in the practice of
most dictionaries nowadays.
e.g. dip2 /dɪp/ n 1 [C] an act of dipping. 2 [C] (infml) a quick BATHE(3) in the sea:
have/take/go for a dip. 3 [U] a liquid for dipping sheep in to protect them from
infection or insects. 4 [C,U] a thick mixture into which biscuits or pieces of raw
vegetables are dipped before being eaten, eg at parties: a/some cheese dip. 5 [C] a
downward slope: a dip in the road ◦ a dip among the hills. See also LUCKY DIP
(OXFORD: 325)
e.g. ridge1 /rɪdʒ/ n 1 [C] a long area of high land, especially at the top of a
mountain: a windswept ridge - see picture on page 835. 2 a) a line of something that
rises above a surface: a ridge of boulders / a sandy ridge b) a long narrow raised part
of a surface: The ridges on the soles give the shoes a better grip 3 a ridge of high
pressure technical a long area of high ATMOSPHERIC pressure. (Longman: 1220).
The method of counting by references has been designed by Landau as a
system used to maximize the number of entries in American dictionaries. The
following items are considered references by Landau (1989: 84-85):
1. The main entry, or headword
2. Any additional word class (part of speech) the headword belongs to, i.e.
as a verb/noun/adjective. Some dictionaries allot separate headword status to each
part of speech, others do not.
3. Any inflected forms given, such as the optional presence of identical past tense and
past participle forms in -ed, and -ing forms.
4. Run-on derivatives without definitions.
5. Run-in idioms or other fixed expressions included within an entry.
6. Variant spellings.
7. Words given in lists and derived by prefixation with common prefixes, such as in-, non-, re-, un-.
8. Anything in a bold typeface is counted as a reference.
These references may overlap with the following headwords contained in
modern dictionaries: a. abbreviations; b. prefixes; combining forms; e. open
compounds; f. encyclopedic entries, though not necessarily in the specified order or with the
obligatory presence of all of them. The main entry form in a dictionary serves a
number of different purposes according to Landau (1989: 87); it indicates the
preferred spelling, the usual printed form of the lexical unit and syllabication.
When speaking about dictionaries, the usual question of how many words
they contain will surely arise, but the answer depends, not surprisingly, upon the
author's choice to save space: "Every decision a lexicographer makes affects the
proportion of space his dictionary will allot to each component. It is perfectly fair for
critics to question his judgment, but they must realize that the length of a dictionary
is finite, and as large as it may appear to them, it is never large enough for the
lexicographer" (Landau 1987: 87).
Alphabetization in dictionaries concerns primarily the headwords, thus
enabling the user to quickly find the word that he is looking up. There are two ways
of alphabetizing: letter by letter and word by word.
Letter by letter arrangement is by far the most general method, having the
great advantage that users need not bother about knowing whether a compound is
spelled as one word, a hyphenated one, or as two words, e.g. power, powerful, power
of attorney.
Word by word arrangement seems rather complicated for the average user but
it might be of great use for specialists, e.g. power of attorney comes before powerful.
It is important to notice that few dictionaries operate with a strict alphabetical order
of the lexical items. This is because all dictionaries use some degree of nesting where
a lexical item may be included within the entry of another lexical item which has
headword status.
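
The difference between the two ways of alphabetizing can be made concrete with two sorting keys. The sketch below is a simplified illustration, not the procedure of any particular dictionary; the function names letter_by_letter_key and word_by_word_key are assumptions introduced here, and nesting is ignored.

```python
def letter_by_letter_key(headword):
    """Compare letters only: spaces and hyphens are ignored."""
    return "".join(ch for ch in headword.lower() if ch.isalpha())


def word_by_word_key(headword):
    """Compare the first word as a whole, then the second word, and so on."""
    return headword.lower().replace("-", " ").split()


headwords = ["powerful", "power of attorney", "power"]

print(sorted(headwords, key=letter_by_letter_key))
# ['power', 'powerful', 'power of attorney']

print(sorted(headwords, key=word_by_word_key))
# ['power', 'power of attorney', 'powerful']
```

In the letter-by-letter ordering the user does not need to know whether a compound is written solid, hyphenated or open, since all three spellings yield the same key.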
This nesting policy or running-on refers to the arrangement of the following
categories of words: words derived by suffixation and prefixation; fixed phrases;
idioms; compound words. In general, some dictionaries accord headword status only
to those derivatives whose meaning has diverged significantly from the root, others
ordinarily give headword status to any derivative that merits separate definition.
In the case of suffixation, run-ons do not need separate definitions, since the
user is assumed to be capable of deducing the meaning from the headword and the
suffix, e.g. cello - cellist.
A stricter selection operates in the case of words derived by prefixation,
which are always allotted separate headword status, because a user would not
otherwise be able to find them in the alphabetical listing, e.g.
believable/unbelievable. Fixed phrases and idioms are nested under the headword of
the first main word in the phrase, even if this is not always very clear. By and
large, no form of alphabetization can be fully successful, since it is not always
clear whether the idiom should be placed under the first word or under the most
important word, nor which word is the more important one. In
conclusion, most dictionaries prefer to list idioms under the first word, but there are
also exceptions, e.g. that one could cut with a knife could be entered either under
the headword knife or under the headword cut.
In the case of compound words nesting depends primarily upon spelling,
since such words can be spelt either as solid, e.g. landmark, hyphenated,
e.g. land-law, or open, e.g. land mass.
Some dictionaries (usually those which nest all suffixed derivatives) nest all
compounds. At the other extreme, there is the practice of according headword status
to all compounds, without nesting: "Intermediate between these two extremes is the
policy that accords headword status to solid compounds but nests hyphenated and
open compounds or that nests only open compounds, treating them like phrases,
while giving headword status to solid and hyphenated compounds" (Landau 1987:
165).
Since dictionaries are printed, alphabetically ordered reference works,
spelling is a given. Deciding upon spelling is of paramount importance for the
dictionary maker: "The first task of the editor of a dictionary is to decide on the
spelling of his word-entry. Usually on a modern dictionary this affords no difficulty
as usage has fixed a single spelling" (Hulbert 1992).
In English, the standard (which here refers primarily to spelling and
pronunciation) which emerged during the fifteenth century was that of the East
Midland district that included London, but it was not until Bailey's dictionaries of
the 1730s and more particularly Johnson's Dictionary of 1755 that the spellings of
many words became fixed.
There are two particular kinds of information about spelling that dictionaries
usually provide. The first concerns the spelling changes in the root of a word when
adding a suffix. As illustrations we can quote both irregular and regular inflections
for nouns, adjectives and verbs, e.g. country-countries, bad-worse, reply-replied,
sin-sinning.
The second refers to alternative spellings: British and American English, e.g.
travelling-traveling, centre-center, colour-color; alternatives in British English only,
either systematic, e.g. the -er/-or alternation in adviser/advisor or the -ie/-y
alternation in words like auntie/aunty, or idiosyncratic, e.g. baloney/boloney,
botch/bodge. The desire for uniformity is so great that popular variants are not
welcomed: "The graphic dress of the language is now so sacrosanct that dictionaries
are used as authoritarian, style manuals in matters of spelling, hyphenation and
syllabification" (Zgusta: ).
Besides spelling, dictionaries provide lexical and linguistic information about
the pronunciation, morphology, syntax, semantics, context/register and etymology of
words.
2.4. Lexical Words and Computer Corpora
Words can be studied from various points of view by means of computer corpora
within the domain of corpus linguistics. Corpus Linguistics (CL) provides a new
point of view for researching language by means of computer-assisted methods, and it
has been proclaimed a discipline by various linguists, with the first occurrence of
the term dating back to 1984 (Aarts & Meijs 1984). Despite the objections which
have been formulated against it by the supporters of transformational grammar, corpus
linguistics has been continuously gaining ground by making it possible to collect
new types of data and study patterns of language use: "Many of the patterns are
consistent and found in independent corpora: they are reproduced by speakers,
though speakers are often unaware of them, and they are observable only across the
language use of many speakers in a discourse community" (Stubbs 2001: 220).
As the nomenclature indicates, one of the key concepts with which CL
operates is that of corpus (the Latin for body), meaning any collection of more than
one text. When used in the context of modern linguistics, the term frequently tends to
have more specific connotations than this simple definition. Its up-to-date
profile would include the following characteristics:
a. Sampling and Representativeness
CL specialists are interested in creating a corpus which should be maximally
representative of the variety under examination, e.g. spoken academic American
English, Cockney dialect, economic jargon, etc.: "What we are looking for is a broad
range of authors and genres which, when taken together, may be considered to
average out and provide a reasonably accurate picture of the entire language
population in which we are interested" (McEnery & Wilson 1996: 88).
b. Finite Size.
A corpus should consist of a finite number of words which is usually determined at
the beginning of a corpus-building project. For example, the Brown Corpus
contained 1,000,000 running words of text. See also Longman Dictionary of
Contemporary English Language on CD.
c. Machine-Readable Form.
In its modern sense, the term corpus nearly always implies the additional feature
machine-readable, i.e. computer-readable data.
d. A Standard Reference.
A corpus constitutes a standard reference for the language variety that it represents.
This presupposes that it will be widely available to other researchers, which is indeed
the case with many corpora. A standard corpus also means that a continuous base of
data is being used.
e. Concordance. KWIC
A concordance is the main tool of corpus linguistics. The computer is programmed
to search for all examples of a node word in a corpus and to print them out in the
centre of the page or screen within a given context of a few words to left and right
(Stubbs 2001). The node word could be any of the word-forms that a LEXEME
(LEMMA) might take (e.g. SEE = lemma; seeing, sees, saw, seen are word-forms),
while the words to the left or right of the node are called collocates. The frequency
of occurrence of the node words and their collocates is retrieved by the computer
in the display format known by the acronym KWIC (Key Word in Context). A
concordancer has the following functions:
- contributes to understanding how a word is used in a variety of contexts (meaning,
grammar, style, register)
- produces a list of occurrences for any search of a word, phrase or any string of
letters
- gives information on the overall frequency of occurrence of words
- indicates how core a word is in vocabulary (general or specialized)
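The concordancer functions listed above can be illustrated with a minimal sketch. It is not the software used for this course: the function name kwic, the sample text and the set SEE_FORMS are assumptions made for the illustration, and the tokenization is deliberately naive (it ignores sentence boundaries).

```python
import re

def kwic(text, node_forms, span=3):
    """Return (left context, node, right context) for each node occurrence.

    Tokenization is naive: lower-cased runs of letters and apostrophes,
    so contexts may cross sentence boundaries.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    lines = []
    for i, tok in enumerate(tokens):
        if tok in node_forms:
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            lines.append((" ".join(left), tok, " ".join(right)))
    return lines

# Word-forms of the lemma SEE (hypothetical, hand-listed set).
SEE_FORMS = {"see", "sees", "seeing", "saw", "seen"}

sample = ("You can see the coconut trees. He could see them in his telescope. "
          "So I see that you are from Hartland.")

concordance = kwic(sample, SEE_FORMS, span=3)
for left, node, right in concordance:
    # Node printed in the centre, context to the left and right.
    print(f"{left:>25}  {node.upper():^6}  {right}")

print("overall frequency of the node:", len(concordance))
```

Each printed line shows the node in the centre with up to three words on either side, and the final count gives the overall frequency of the node in the sample.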
f. Other CL tools.
The use of corpora brings back into linguistics the text as a record of an actual
event which can be printed, picked up and examined. In what follows we shall
introduce some of the most common computer devices used to analyze and interpret
the text and its component elements.
The Parser:
- connects form and meaning
- helps in decoding the meaning of the text by replacing a string of
characters with category labels
- interprets the grammatical meaning of new texts
The Tagger:
- assigns word class tags to the word-forms in a text
- extracts information from text
- provides statistical information for the classification of texts
The Collocator:
- determines the collocations of a word-form (left - node word - right)
- the study of collocation brings out the notion of a lexical set
The Compounder:
- identifies compound lexical units
The Lemmatiser:
- conflates inflections into lemmas
The Disambiguator:
- is the mark of collocational consistency, i.e. intercollocation of
collocates
- helps in disambiguating meaning when one word may have several
meanings (collocations can be used to disambiguate)
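As an illustration of one of the tools listed above, the following sketch shows what conflating inflections into lemmas might look like. The lookup table LEMMA_TABLE is a hypothetical, hand-made one covering only a few word-forms mentioned in this chapter; real lemmatisers rely on large lexicons and morphological rules.

```python
from collections import Counter

# Hypothetical hand-made table: word-form -> lemma (written in capitals).
LEMMA_TABLE = {
    "see": "SEE", "sees": "SEE", "seeing": "SEE", "saw": "SEE", "seen": "SEE",
    "country": "COUNTRY", "countries": "COUNTRY",
    "reply": "REPLY", "replied": "REPLY",
}

def lemmatise(tokens):
    """Map each word-form to its lemma; unknown forms are simply capitalized."""
    return [LEMMA_TABLE.get(tok.lower(), tok.upper()) for tok in tokens]

tokens = "She saw what he sees and they have seen the countries".split()
lemma_counts = Counter(lemmatise(tokens))

# Three different word-forms are counted under the single lemma SEE.
print(lemma_counts["SEE"], lemma_counts["COUNTRY"])
```

Counting lemmas rather than word-forms is what allows a frequency list to treat see, sees, saw and seen as one lexical item.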
The lexical tools mentioned (collocator, compounder, disambiguator) operate on
texts in order to discover information about the organization of the language as a
whole, according to the scheme in Figure 1:
collocator ------- lexical parser => bring out the collocations in a text instance and
                                     relate them to the meaning of the words involved
compounder ------- phrase finder  => bring out the mutually conditioned choices and
                                     assess which may be proposed as separate lexical items
disambiguator ---- exemplifier    => evaluate the instances and assign them to meanings
Figure 1
The output from the phrase finder can be used to improve the lexical parser and
clear the ground for the exemplifier.
g. Types of Corpora
A rough classification of corpora divides them into the following classes:
- General reference corpus (language as a whole)
- Special purpose corpus (particular aspect of a language)
- Written corpus (written texts)
- Spoken corpus (transcripts of spoken material)
- Monolingual corpus (texts in a single language)
- Multilingual corpus (texts in two or more languages)
- Parallel corpora (texts in language A alongside their
translations into language B, C, etc.)
- Comparable corpora (texts originally written in languages A, B, C,
etc., all on the same subject and of the same text type and period,
e.g. instruction manuals, technical reports, etc.)
- Synchronic corpus (limited time frame)
- Diachronic corpus (long period of time)
- Open corpus (constantly being expanded)
- Closed corpus (no longer expanded)
2.5. Exercises
1. Write down three sentences containing the word umbrella.
a. Ask your classmate to do the same. Compare your sentences with those of your
classmate. Did you both think of exactly the same sentences?
b. Use BNC and lextutor and compare your sentences to those shown in the
concordance lines.
c. Compare the concordance against the entry taken from the Oxford Paperback
Dictionary (1988 edition):
umbrella n. 1. a portable protection against rain, consisting of a circular piece of
fabric mounted on a foldable frame of spokes attached to a central stick that serves
as a handle 2. any kind of general protecting force or influence.
d. What information do you learn from the corpus that was not present in the
dictionary and vice versa? These two types of resource may offer complementary
information.
2. Consider the words procrastinate and shoddy. Use BNC and WebCorp to
search for the distribution of the items. Evaluate the usefulness of the resources
in becoming familiar with their usage and meaning.
3. Read the following text and, without consulting any reference material, try to
rewrite the text in such a way that it conforms to the original. Once you have
finished, look on the Internet for some recipes. Compare your version to the
recipes that you find. What have you got? Are the terms, collocations and style
that you used in line with those of the recipes that you found?
To make this recipe for a creamy potato casserole, you will need a package of frozen
hash brown potatoes (a 2 pound bag), some green onions, cheddar cheese, cream
of potato soup (a 10oz can), butter, sour cream and salt and pepper. First turn on your
oven. Set it to 350 degrees Fahrenheit (or 175 degrees Celsius if it is a metric oven).
Get a pan and put it on the stove on a setting that's not too high, then pour in the
soup as well as a quarter of a cup of butter and two cups of sour cream. Now chop up
one third of a cup of green onions and grate two cups of cheddar cheese. Get a bowl
(not too small!) and put in the package of frozen potatoes along with the green
onions and one cup of cheddar cheese. Put the soup/butter/sour cream from the pan
into the bowl with the potatoes/green onion/cheddar cheese. You can also put in
enough salt and pepper so that it will be to your liking. Now pour everything into a
casserole dish that measures nine inches by thirteen inches. Put the rest of the grated
cheddar cheese on the top. Now it all goes into the oven for about thirty to forty-five
minutes. Eat it before it gets cold!
CHAPTER 3. Context and Contextualization

3.1. Acceptations of the Terms Context and Contextualization
In linguistics a word is a bundle of information related to phonology,
morphology, lexicology, semantics, syntax, morpho-syntax, text, grammar,
etymology, metaphor, discourse, pragmatics and world knowledge (Pinker
1995: 344). It is not easy to capture all the information provided by a word just by
looking at its surface form or orthography. A versatile system, along with our native
language intuition, is required to decipher all the possible explicit and implicit
meanings of a word.
Words acquire value only when used in a context. Similarly, the value of a
word is given by its meaning in that context. When asking "What does the word love
mean?" we consider the word love in isolation. It is used correctly but quite
differently in "I love my child", "Love makes the world go round" and "The score is
love-fifteen". Therefore we cannot say what the meaning of a word is until it is put
into an adequate context. But we should not think that the meaning resides in the
word itself: it is rather spread over both the word and the neighbouring words, since
only the latter identify the semantic field, the group of relevant associations (cf.
Saussure), where we find contrasting words by which to measure the one used: "The
occurrence of a unit (e.g. a sound, word) is partly or wholly determined by its
context, which is specified in terms of the unit's relations, i.e. the other features with
which it combines as a sequence" (Crystal 1995: 78).
The term context has various acceptations. In linguistics it refers to the
specific parts of an utterance or text adjacent to the word which is focused upon.
Providing a context for a word means in fact putting that word in context in order to
clarify its meaning, a process called contextualization.
Other acceptations of the term are given by collocating it with specialized
terms. Reference is often made to:
a. In the field of generative grammar we can distinguish between context-free /vs/
context-sensitive grammar [...] where forms can be classified in terms of whether
they occur only in specific context (context-sensitive/restricted/dependent rules) or
are independent of context (context-free rules) (Crystal 1995: 79).
b. Variants of units (sound, morpheme, etc.) which are dependent on context for their
occurrence are called contextual variants, e.g. allophone, allomorph, etc. An analysis
performed in these terms is called a contextual analysis.
c. Situational context is the term used to refer to the relationships holding between
the extra-linguistic features against which the linguistic units are used so as to
specify what is immediately observable in the co-occurring or immediate situation.
d. Context of situation is the term first used in linguistics by Firth (1951) in order to
draw attention to the context-dependent nature of meaning and as a refinement of the
term context in interdependence with the general, cultural conditions where
language is used. This widening of contexts, linguistic and non-linguistic, has
influenced the study of meaning relating on the one hand to external-world features
and on the other hand to different levels of linguistic analysis such as phonetics,
grammar, lexicology and semantics.
e. Context of utterance refers to all the factors that determine the form and meaning
of utterances.
With reference to word meaning we shall also consider the classification
proposed by Dash (2008), which we have adapted to our purpose here:
a. Local context refers to the immediate linguistic environment (i.e. neighbouring
words) in which a word appears. It provides us with the necessary information
regarding the nature of the relation with the neighbouring words (e.g. whether it is
idiomatic) and the understanding of the lexical collocation of words used in a lexical
block. Hence, whether the co-occurrence of any two words is caused by choice (to
evoke an intended sense) or by chance (having no special significance), variation in
the meaning of a word is due to its relation with the neighbouring words.
b. Sentential context refers to the information retrieved within the sentence where the
word occurs; more particularly, it supplies information about the respective word's
explicit or implicit syntactic relations with the other words making up the sentence. It
allows us to explore whether there is any variation in the meaning of the word analyzed
due to its relation with other words located further away in the sentence.
c. Topical context refers to the whole topic of discussion and focuses on the content
of a piece of text when looking for the actual meaning of the word in question.
Variation of topic or content brings about meaning variation of the word in question:
"Taken together, the sentences display a network of meanings, which is not
obtainable from individual sentences. Here, special meaning is possible to extract
only when we refer to the topic and interpret the sentences with close reference to the
topic of the text" (Dash 2008: 28).
d. Global context helps in acquiring information from the extralinguistic world for
deciphering the contextual meaning of the word when other contexts cannot fulfill
this purpose. The meaning of a word is not only related to the meanings of other
words occurring within the local, sentential and topical context, but also to
extralinguistic reality surrounding the linguistic acts undertaken by language users.
The verb forms of a language, for instance, depend upon the NPs gravitating around
them, rendering a scene of action indicating an agent, a patient, a place, and a time.
This signifies that in order to understand the meaning of a verb form under
investigation we need to consider all the elements in a cognitive interface to realize
both its denotative and connotative meanings.
Generally, a huge chunk of information of the global context is available from the
external world that supplies vital cues of place, time, situation, interpretation,
pragmatics, discourse, demography, geography, society, culture, ethnology, and
various other things (Allan 2001: 20). Since the global context builds up a cognitive
interface between language and reality, it also becomes a valuable source of
information for meaning disambiguation of words, helping us to understand if the
word has any meaning variation, and if so, what it is.
3.2. Text Typology
Contextualization can be better understood if we analyse the type of text we
are dealing with or the words we want to use in a context. Therefore we will briefly
consider only some of the most complex classifications.
Cognitive classifications, e.g. Kinneavy (1980) and Werlich (1976),
concentrate on ways of conceptualizing, perceiving or portraying the world by:
a. Narration, which stands for our viewing the continuous changes of reality in a
dynamic way, by providing the differentiation and interrelation of perceptions in
time: narrative texts.
b. Description, which stands for our static view focused on individual experience, by
providing the differentiation and interrelation of perceptions in space: descriptive
texts.
c. Exposition, focusing on the comprehension of general concepts through
differentiation by analysis or synthesis: expository texts.
d. Argumentation, understood as evaluation of relations between concepts by
extracting similarities, contrasts and transformations: argumentative texts.
e. Instruction, meaning prospective attitudes, planning of future behaviour with
option and without option: instructive texts.
Textual specificity has been enlarged by De Beaugrande and Dressler's
procedural approach (1981), adding the categories of scientific, didactic, literary and
poetic texts.
A more general classification has been made by Trosborg (1997) who
distinguishes texts by taking into account two criteria:
- purpose, based on communicative functions, suggesting that texts are intended to
inform, to express an attitude, to persuade and to create a debate
- mode of discourse, underlining rhetorical strategies, hence the grouping into
descriptive, narrative, expository, argumentative or instrumental texts.
A typological synthesis has been proposed by Reiss who classifies texts into
informative, expressive and operative (in Fawcett 2003: 104).
3.3. Lexical Profiles. A Corpus Illustration
In order to analyze words at a certain linguistic level we can start either from
the system as a whole or from its ultimate constituents i.e. linguistic units. One
possible variant of the latter approach has been proposed by Stubbs (2001) who,
starting from the most frequent words and their use in recurrent patterns in various
English corpora, has designed semantic schemas which are in fact abstract, mental
models which constitute a central part of the individual's communicative
competence: "Semantic schemas are general and simple patterns which have
considerable lexical variation due to local context and choice" (Stubbs 2001: 97). He
calls such schemas models of extended lexico-semantic units, which are
combinations of lexis, syntax, semantics and pragmatics. Frequent word patterns
having strong phraseological tendencies have also been referred to as multi-word
units by Pawley and Syder (1983), constructions by Fillmore et al. (1988), and
extended units of meaning or lexical items by Sinclair (1996, 1998).
The model of extended lexico-semantic units helps in building up lexical
profiles for the words which have frequent, typical and central uses in the everyday
vocabulary of English and may even constitute the basis for the norms characteristic
for the behaviour of a language community in any of its manifestations, i.e. registers,
dialects, etc. The purpose of profiles (Crystal 1991) is to summarize the information
and present it in a coherent and systematic manner which should facilitate
comparisons and contrasts and the discovery of significant and typical patterns by a
numerical dimension. It is frequency and collocation which state the order and
reestablish the facts.
By reformulating attested concepts in lexicology, semantics, syntax and
pragmatics, Stubbs (2001) reiterates the Saussurean principle of defining a linguistic
unit by specifying its constituents and the possible relations among them; the
constituents define the semantic content of the unit whereas the relations define its
structure. There are seven relations together with their corresponding constituents
which Stubbs reunites under the banner of a model of extended lexico-semantic unit
and which we represent in an adapted, schematic form in Table 1.
RELATION                           CONSTITUENT
COLLOCATION                        collocate: individual word-form or lemma/lexeme
COLLIGATION                        grammatical category
SEMANTIC PREFERENCE                lexical set/field: class of semantically related
                                   word-forms or lemmas
DISCOURSE PROSODY                  descriptor of speaker attitude and discourse function
STRENGTH OF ATTRACTION             percentage terms (statistics)
POSITION AND POSITIONAL MOBILITY   directional /vs/ non-directional patterns
DISTRIBUTION IN TEXT-TYPES         descriptor of language variety or register
Table 1. A Model of Extended Lexico-Semantic Unit
(adapted from Stubbs, 2001: 87-88)
In the linguist's view, "These semantic schemas can be modelled as
clusters of lexis (node and collocates), grammar (colligation), semantics (preferences
for words from particular lexical fields) and pragmatics (connotations or discourse
prosodies)" (Stubbs 2001: 96), thus narrowing the gap between the supporters of
corpus linguistics and its opponents.
In what follows, we shall test the appropriateness of Stubbs's model in
building up a lexico-syntactic profile for the verb see. The corpus analysed is a
sample of the Michigan Corpus of Academic Spoken English and contains 500
concordance lines out of 4172. The illustration furthers the analysis proposed in
Dima (2002) and is supported by self-evident opinions acknowledged in the literature
concerning the lexical, syntactic, semantic and pragmatic behaviour of physical
perception verbs.
Given that collocation means frequent co-occurrence and that collocates are
its basic constituents, we will consider that the verb see can be described
linguistically by a number of very interesting characteristics, starting from the node
word see occurring with its collocates, in a span of approximately 3:3, according to
the formula:
(1)    collocates (N-3)   see   (N+3) collocates
       <---------------- span ---------------->
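Formula (1) can be made concrete with a short sketch that counts the collocates falling within a 3:3 span of the node. The function name collocates is an assumption of this illustration, and the three input lines are taken from the concordance examples quoted in this section; the counts therefore describe only these three lines, not the 500-line sample.

```python
from collections import Counter

def collocates(tokens, node, span=3):
    """Count word-forms at positions N-span..N-1 and N+1..N+span around the node."""
    left, right = Counter(), Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            left.update(tokens[max(0, i - span):i])
            right.update(tokens[i + 1:i + 1 + span])
    return left, right

lines = [
    "you can see the coconut trees",
    "you can see the bending it",
    "he could see them in his telescope",
]

left, right = Counter(), Counter()
for line in lines:
    l, r = collocates(line.split(), "see", span=3)
    left += l
    right += r

print("left collocates: ", dict(left))
print("right collocates:", dict(right))
```

In the third line the word telescope stands at N+4, so it falls outside the span and is not counted as a collocate.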
The verb see is basically the prototype of non-agentive verbs of seeing
(Dima 2002), the one which engages the Experiencer in a pure physical act,
describing the direct relationship between the perceiver and the object of his/her
perception, syntactically represented by the surface structure NP1 V NP2, with the
Subject NP1 [+Human] and the Direct Object NP2 [+Human] / [Animate]. In our
corpus, this structure has the highest percentage: 35.4%, meaning 133
occurrences of the node see and its collocates to the right side of the span of (N+2).
The NP2 standing for the DO has a complex structure given by a rich morphological
representation of the head and its determiners.
(i) The Head
a) Nouns at Head total 134 occurrences in concordance lines and are distinguished
by the following semantic features: 13 [+Human]; 24 [+Abstract]; 95 [+Concrete].
The greatest number of nouns belongs to the sphere of concreteness and is supported
by the fact that the majority of contexts pertain to real situations, developed within
specialized academic lectures and experiments in the domains of art (music, theatre)
science (physics, mathematics, chemistry, anatomy and medicine) and environment
(ecology, agriculture, geography). Accordingly, the semantic preference
characteristic for see within the corpus of Michigan Academic Spoken English
analysed is mainly directed to highly specialized words such as square, hexagon,
octagon, proof belonging to the lexical field of mathematics; particles, magnets,
light, track belonging to the lexical field of physics; electrons, benzodiazepine,
blending, ambivalence belonging to the lexical field of chemistry; capillary space,
kidneys, inflammation, dose, therapeutic concentration belonging to the lexical field
of anatomy and medicine; forest, plantation, coconut, biodiversity preservation
belonging to the lexical field of ecology and environment, etc.
Nouns at Head having the feature [+Abstract] refer to results of
experimentation or theoretical argumentation: problem, pattern, similarity,
difference, etc.
Nouns at Head having the feature [+Human] stand for the listeners, i.e. the
students, or for key topics, e.g. Hume, Kelvin, the Chinese, etc.
b) Pronouns at Head are either personal, marked for Accusative, it having the
highest degree of occurrence (20 concordance lines) or indefinite: someone,
somebody, something but with a low rate of occurrence.
(ii) Determiners are best represented by: articles with the found in 35 concordance
lines and a in 18; adjectives in 26 concordance lines and modifying nouns belonging
to the semantic fields specified above and referring to: size and measure (little, less,
quantitative), colour (black), qualities (new, dramatic, quaint, unusual, various, etc.)
and varia (natural, monocultural, elementary, regular, experimental, sanitary);
demonstratives which refer either to distance, e.g. that (9 occurrences) or proximity/
nearness, e.g. this (6) these (5).
The verb see can also be subcategorized for clausal DOs introduced by that,
realizing the surface structure V that S and entering the semantic field of cognition.
In our corpus this characteristic feature is represented by a percentage
of approximately 13.2% of the 500 concordance lines analysed, a figure which
describes see as a factive predicate which presupposes the
truth of its clausal complement, a fact underlined by the contexts selected under (2):
(2) so I see that you're from Hartland Michigan (57)
    whuh! I can see that you have been that's (127)
    chemists can see that this is a (187)
    renal distress and you can see that the mice that were (190)
    you can see that after treatment there are (199)
    tile with the pieces to see that these things tile if (242)
    you can see that these particles get accelerated (318)
    stretches them to see that they too could be (438)
    so it's really nice to see that it's come back (456)
    on the continent you'll see that the Andes Mountains run (500)
Cornilescu (1986, 2003) states that physical perception verbs may lose
factivity in questions (cf. reported speech) and conditionals. This is borne out by
our corpus analysis, where wh-complementizers introduce complement clauses in
6.8% of the 500 concordance lines and if-complementizers introduce complement
clauses in 4.8% of them. Pragmatically and semantically interpreted in
terms of discourse prosody, these figures are descriptors which point to the manner
in which the feedback of communication is achieved in terms of the speaker-listener
interplay: asking for attention, anticipating the topic under discussion, describing,
making or testing experiments, etc.
(3) a. do you see what I'm saying? (3)
       and to see what would come from (31)
       get to see how parallel the distractors are (47)
       and see how I could combine them (52)
       enable us to see where the contours are so (277)
       let's see who was the third Hobbes (283)
    b. would be more fun to see if you could put it (53)
       let me see if I can do something (87)
       and see if something's happening with (114)
       and will see if that description holds (148)
       we want to see if this compound has the (196)
       I'm gonna see if I can find that (220)
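The sorting of concordance lines by the complementizer that follows see can be approximated automatically. The sketch below is only a rough illustration under stated assumptions: the patterns and the function classify are hypothetical, and a simple string match cannot tell the complementizer that from the demonstrative that (as in see that picture), so the output would still need manual checking.

```python
import re
from collections import Counter

# Hypothetical surface patterns for the word immediately following "see".
PATTERNS = [
    ("that-clause", re.compile(r"\bsee that\b")),
    ("if-clause", re.compile(r"\bsee if\b")),
    ("wh-clause", re.compile(r"\bsee (what|how|where|who|why|when|which)\b")),
]

def classify(line):
    """Return the label of the first pattern matching the line, else 'other'."""
    lowered = line.lower()
    for label, pattern in PATTERNS:
        if pattern.search(lowered):
            return label
    return "other"

lines = [
    "chemists can see that this is a",
    "do you see what I'm saying",
    "and see how I could combine them",
    "let me see if I can do something",
    "you can see the coconut trees",
]

profile = Counter(classify(line) for line in lines)
for label, count in profile.items():
    print(f"{label:12} {count}  ({100 * count / len(lines):.1f}%)")
```

Applied to a full sample, the same tally would yield percentages comparable in kind to those reported above, subject to the manual checks just mentioned.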
See-complementation in the domain of physical perception can only be
mentioned briefly in connection with our corpus analysis, since see occurs in Acc
+ Inf constructions in only 10 concordance lines and in Acc + Part constructions in only
18 concordance lines. The larger number of see-occurrences in Acc + Part
constructions emphasizes the orality descriptor and the fact that the majority of
messages refer to actions in full development at the moment of speaking
(experiments, demonstrations, illustrations, etc.).
(4) a. and you'll see people park their boat (146)
       I feel a lot of pleasure to see this event occur (440)
       I'm really sad to see them leave because (451)
       so incredibly rare usually you'll see email come across the (452)
    b. means of this drawing you'll see an electron moving (350)
       positron, a pair and you see it happening all over the (356)
       off and later on you see light coming off (372)
       but I could see this picture filling up (391)
       I see doctors trying to write (463)
A complete analysis of the complementation of see as a trigger of Acc + Inf /vs/
Nom + Inf and Acc + Part /vs/ Acc + Ing constructions is to be found in Cornilescu
(2003).
Moving to the left side of the concordance span, the findings confer a solid
pragmatic coordinate to the profile of see designed by its use in three canonical
contexts specific to the oral register.
(i) with the modal verbs can, could, met in 80 concordance lines, meaning
a percentage of 16%.
(5) you can see the coconut trees (263)
    we can see that quote the obligations (289)
    you can see the bending it (314)
    he could see them in his telescope (458)
(ii) in the imperative with let's, in 23 concordance lines, meaning 4.6%.
(6) alright let's see (48)
    well, let's see with German and Spanish (50)
    uh, let's see, so I've got one (241)
(iii) in the simple present I see, met in 33 concordance lines, meaning 6.6%.
(7) so I see (57)
    ohh now I see (81)
    okay I see (132)
In all these cases see acquires the function of a hedge performative, acting as
a kind of glue within the oral exchange (see Dima 2002).
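The percentages quoted for the three left-side contexts follow directly from the raw counts and the sample size of 500 concordance lines. A minimal sketch of the arithmetic is given below; the dictionary labels are ours, chosen only for the illustration.

```python
TOTAL_LINES = 500  # size of the concordance sample analysed

# Raw counts reported for the three canonical contexts (labels are illustrative).
counts = {
    "can/could + see": 80,
    "let's see": 23,
    "I see": 33,
}

# Share of each pattern, in percentage terms, rounded to one decimal place.
shares = {pattern: round(100 * n / TOTAL_LINES, 1) for pattern, n in counts.items()}

for pattern, share in shares.items():
    print(f"{pattern:16} {share}%")
```

Expressing each count as a share of the same total is what makes the patterns directly comparable, in line with the strength of attraction relation in Table 1.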
Still seen from the left, the lexico-syntactic profile of see would not be
complete if we did not take into consideration the first member of the surface
structure from which we started, i.e. NP1 V NP2 (see 4.1). The NP1 standing for the
Subject can have both an agentive and a non-agentive role. The agentive role is
characteristic of the occurrence of see in complement clauses of the type and
contexts presented above, while the non-agentive role is characteristic of all the
contexts in which there is no change in the positive value of the feature [+physical
perception]. The private nature of see having as subject the I of the first person
singular, standing for the all-powerful speaker, e.g. the university professor, is
counterbalanced by the overwhelming figure of 98 concordance lines, meaning 19.6%,
in which see has as subject the you of the listener, e.g. the student, who lets himself
be carefully guided by and delicately submitted to the speaker.
(8) a. and so you see a lot of sloppy staff (176)
    b. in Beijing one night you see (181)
We should not overlook the complexity of the discourse prosody and the
strength of attraction provided by the use of see in its to-infinitive form (82
concordance lines, meaning 16.4%) after: nouns such as performance,
preference, pieces, class, Strauss, etc.; miscellaneous verbs such as want, get, cause,
love, come, like; adjectives such as excited, delighted, easy, hard, etc.
(9) a. her Chinese language class to see the play (139)
b. aren't the regular performances to see nowadays (182)
c. what they loved to see (162)
d. people who go to see (163)
e. it's a little hard to see (237)
f. most people are unable to see (290)
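The percentages quoted for the individual patterns of see can be checked with a few lines of arithmetic. The sketch below is written in Python and rests on one assumption: that the concordance for see contains 500 lines in all, a total inferred here from the figures themselves (33 lines = 6.6%) rather than restated in this passage. The names TOTAL_LINES, pattern_counts and share are illustrative only.

```python
# Assumption: the concordance for "see" has 500 lines in total
# (inferred from 33 lines = 6.6%; the total is not restated in this passage).
TOTAL_LINES = 500

# Raw counts quoted in the text for three patterns of "see".
pattern_counts = {
    "I see (simple present)": 33,
    "you see": 98,
    "to see (to-infinitive)": 82,
}

def share(count, total=TOTAL_LINES):
    """Return the percentage of concordance lines, rounded to one decimal."""
    return round(100 * count / total, 1)

for pattern, count in pattern_counts.items():
    print(f"{pattern}: {count} lines = {share(count)}%")
```

Under that assumption the three counts give 6.6%, 19.6% and 16.4% respectively, which agrees with the figures reported in the analysis.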

The purpose of the illustration has been twofold. Firstly, it has aimed at
building a lexico-syntactic profile of the verb see by adapting Stubbs' model of
extended lexico-semantic units; secondly, it has been designed as a resourceful
application of corpus linguistics methods and terminology to the study of words in
context.
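The procedure behind such a profile (retrieve every occurrence of the node word, display it in context, then count what stands next to it) can be sketched in Python. This is only a minimal illustration, not the tool used for the analysis: the toy list of utterances simply reuses examples quoted in this section, the span of three words on each side is an assumption, and the function name concordance is hypothetical.

```python
from collections import Counter

# A toy "corpus": utterances quoted as examples in this section.
# (A stand-in for the real corpus, which is not reproduced here.)
utterances = [
    "so I see",
    "okay I see",
    "and so you see a lot of sloppy staff",
    "in Beijing one night you see",
    "what they loved to see",
    "people who go to see",
]

def concordance(lines, node, span=3):
    """Return (left context, node, right context) for every hit of `node`.

    `span` is the number of words kept on each side of the node
    (three by default, i.e. positions N-3 to N+3; an assumed setting).
    """
    hits = []
    for line in lines:
        tokens = line.lower().split()
        for i, token in enumerate(tokens):
            if token == node:
                left = tokens[max(0, i - span):i]
                right = tokens[i + 1:i + 1 + span]
                hits.append((left, token, right))
    return hits

hits = concordance(utterances, "see")

# Print the hits in keyword-in-context layout, with the node in the middle.
for left, node, right in hits:
    print(f"{' '.join(left):>25}  {node}  {' '.join(right)}")

# Count the word immediately to the left of the node (position N-1).
left_collocates = Counter(left[-1] for left, _, _ in hits if left)
print(left_collocates.most_common())
```

On a real corpus the same left-collocate count is what separates the I see, you see and to see patterns discussed above; the toy data contain two instances of each.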
3.4. Exercises
1. Explain in your own words the meaning of the words and phrases in italics:
a. He tossed her a few dollars. "Here," he said, "go buy yourself some finery. You
look like the bottom of an old teakettle."
b. A young and impressionable moth once set his heart on a certain star. He told his
mother about this and she counseled him to set his heart on a bridge lamp instead.
c. There was a glint in his eye now. He was determined to get the thing into high on
his next attempt; we had come about half a mile in the lower gears.
d. News of his miracles got around by word of mouth among the poorer classes of
town.
e. But he would not let her interfere with his designs, so she flew into a rage and left
him.
f. Then he came out for a breath of air before beginning work on a new design.
g. Since he did not possess a streamlined mind, he did not perceive that his little joke
had gone far enough.
h. The Browns had arrived in fairly good spirits to find themselves in a buzzing
group of young students.
2. Read and translate into Romanian. Use the dictionary.
At length the bear saw the error of his ways and began to reform. In the end he
became a famous teetotaler and a persistent temperance lecturer. He would tell
everybody that came to his house about the awful effects of drink, and he would
boast about how strong and well he had become since he gave up touching the stuff.
To demonstrate this, he would stand on his head and on his hands and he would turn
cartwheels in the house, kicking over the umbrella stand, knocking down the bridge
lamps, and ramming his elbows through the windows.
3. Read the following text and underline the nouns and the verbs.
a. Make an analysis of their capacity to express movement in the topical context.
b. Analyse the local contexts for the verb look.
I was looking at the high waves. The breakers always are parallel to the coast and
shape themselves to it except where the curve is sharp however the wind blows. They
are rolled out by the shallowing shore just as a piece of putty between the palms
whatever its shape runs into a long roll. The slant rock or crease one sees in them
shows the way of the wind. The regularity of the barrels surprised and charmed the
eye; the edge behind the comb or crest was smooth and bright as glass. It may be
noticed to be green behind and silver white in front: the silver marks where the air
begins, the pure white is foam, the green solid water. Then looked at to the right or
left they are scrolled over like mould boards or feathers or jibsails seen by the edge.
It is pretty to see the hollow of the barrels disappearing as the white comb on each
side runs along the wave gaining ground till the two meet at a pitch and crush and
overlap each other.
(Breakers on the Shore by Gerard Manley Hopkins)
4. Read the text and analyse the sentential context around the verbs.
But what appealed most to my wonder was the way they all swam. A dozen
sprawling, lace-like shapes would suddenly gather themselves into streamlines and
shoot upwards, jet-propelled by the marvelous siphon in their heads, like a display of
fairy water-rockets. At the top of their flight, they seemed to explode; their tails of
trailed tentacles burst outwards into shimmering points around their tiny bodies, and
they sank like drifting gossamer stars back to the sea-floor again.
The female octopus anchors her eggs to stalks of weeds and coral under water. It
seems to be a moot point whether she broods in their neighbourhood or not, but I
once saw what I took to be a mother out of exercise with five babies. She had a body
about the size of a tennis ball and tentacles perhaps a foot long. The length of the
small ones, streamlined for swimming, was not more than five inches over all. They
were cruising around a coral pinnacle in four feet of water. The big one led, the
babies followed six inches behind, in what seemed to be an ordered formation: they
were grouped, as it were, around the base of a cone whereof she was the forward-pointing apex.
(Octopi on the Move from A Pattern of Islands by Arthur Grimble)
5. Read the text and make comments upon the importance of the global context.
Use corpora to establish the scientific domain the text belongs to.
How fond nature is of spot-making! The wings of butterflies, the feathers of birds,
the surface of eggs, the leaves and petals of plants are constantly spotted; so, too,
fish, as trout. From the wings of the butterfly I looked involuntarily at the foxglove I
had just gathered; inside, the bells were thickly spotted, dots and dustings that might
have been transferred to a butterfly's wing. The spotted meadow-orchid; the brown
dots on the cowslips, brown, black, greenish, reddish dots and spots and dustings on
the eggs of the finches, the whitethroats and so many others; some of the spots seem
as if they had been splashed on and had run into short streaks, some mottled, some
gathered together at the end; all spots, dots, dustings of minute specks, mottling, and
irregular markings.
(The Open Air by Richard Jefferies)
6. Read the poem and explain the role played by onomatopoeic words in
rendering the sound of rain.
Rain-pour, pitter-pitter,
Rain-pour, splitter-splatter.
Spluttering on glass,
Splashing on slates,
Gushing down drains.
Liquid spears attacking flowers,
Swishing, swirling, silvery showers,
Dripping through leaves,
Dropping from branches,
Rattling on shutters.
Falling on rivers in resonant plops,
Dancing on puddles in translucent drops.
Scudding, skimming, fizzing- all blending,
Till one more deluge comes to an ending.
7. Read and translate into Romanian by using both monolingual and bilingual
dictionaries:
Always in our house one of life's principal sensual pleasures has been to smell at
fresh coffee. I would go to Sheffield with my mother, to Davey's provision store;
there take a morning coffee at a low table, and then accompany her round the various
counters for brawn, pressed-ham, crisp golden tubes of sugar fried in fat, and the
fresh-ground coffee which came in crackling green bags tied with thin string. When
we got home it was always my job to empty the coffee into our coffee tin, squeezing
the bag until not a grain was left. Then came the savouring as we passed the tin
around, sniffing carefully so as not to inhale the powder.
(from I Said the Sparrow by Paul West)
Final Test
Choose the correct answer:
1. The word is:
4p
a. a unit of expression
b. a unit of meaning
c. a linguistic unit
d. a lexeme
2. The phonological word is:
2p
a. a stretch of speech
b. a range of speech sounds
c. a cluster of sounds
d. a morpheme
3. The orthographic word is:
1p
a. a stretch of writing
b. a stretch of speech
c. a morpheme
d. a vowel
4. English orthography is chiefly based on:
2p
a. the morphological principle
b. the phonetic principle
c. the historical principle
d. spelling and pronunciation
5. Corpus linguistics studies:
1p
a. the corpus
b. language by means of computer assisted methods
c. speech production
d. the syntax of the word group
6. The modern corpus is characterized by:
1p
a. well-formedness
b. representativeness
c. non-finite size
d. liability
7. A collocation is:
2p
a. node-collocate pair
b. a group of words
c. a free combination of words
d. a verb hierarchy
8. Pronouns are:
1p
a. function words
b. content words
c. particles
d. prepositional phrases
9. Content words have the function of:
1p
a. being optional in a sentence
b. specifying what a text is about
c. indicating the subject relation
d. determining bondage
10. Function words:
1p
a. relate content words to each other
b. are open class words
c. define the sentence
d. express modality
11. Content words are:
1p
a. less important
b. major, full or lexical words
c. old-fashioned
d. non-marked
12. Function words are:
1p
a. interested
b. minor, empty, form words
c. invigorating
d. sort of genitives
13. The lexical density of a text represents:
1p
a. a fraction
b. a noun phrase with modifiers
c. the proportion of lexical words expressed as a percentage
d. a type of argument structure
14. The core vocabulary contains:
1p
a. the most frequent words in the language
b. only word-forms
c. synonyms
d. idioms
15. The most frequent word is (from a large general corpus):
1p
a. see
b. look
c. stare
d. gawk
16. A concordance is:
1p
a. a word/phrase and its surrounding context
b. a lemma
c. an attributive clause
d. a manifestation of diagrams
17. A lemma or lexeme is:
1p
a. a mathematical calculus
b. the set of different forms of a word, such as the inflected forms of a part of
speech
c. a degree
d. a tagger
18. Concordancers
1p
a. give information on the overall frequency of occurrence of words.
b. describe qualities of text-types.
c. are simple verses.
d. provide logical equivalents.
19. Explode a myth is:
1p
a. a collocation
b. a free word combination
c. an idiom
d. a set-phrase
20. She was angry ..... us:
1p
a. of
b. at
c. above
d. according
21. Which is correct?
1p
a. spick and pun
b. spick and span
c. span and spick
d. spick and spin
22. Which is most frequent?
1p
a. arrive at an agreement
b. come to an agreement
c. deal an agreement
d. agree an agreement
23. Which is acceptable?
1p
a. white tea
b. strong tea
c. feeble tea
d. thin tea
24. Which is formal?
1p
a. a bunch of flowers
b. a gathering of flowers
c. a bouquet of flowers
d. a pack of flowers
25. Make a beeline for is
1p
a. a free word combination
b. an idiom
c. a set phrase
d. a collocation
26. The set speak, speaks, spoke, spoken, speaking is:
1p
a. a lexical set
b. a word family
c. a lexeme
d. a semantic field
27. The adjective blind collocates with:
2p
a. block
b. river
c. date
d. alley
28. The noun garden collocates with:
2p
a. grey
b. city
c. book
d. flat
29. The term context refers to:
1p
a. the specific parts of an utterance or a text which the word is focused upon
b. a unit's relations with the other features with which it combines to form a
sequence
c. contextualization
d. morphological processes
30. Contextual analysis is performed on the basis of:
1p
a. contextual variants
b. variants of sound, morpheme
c. context
d. syntax
31. Local context indicates:
1p
a. the immediate linguistic environment in which a word appears
b. context of situation
c. the meaning of pronouns
d. passivization
32. Sentential context refers to:
1p
a. the neighbouring words
b. sentence variation
c. the implicit and explicit syntactic relation of a certain word with other
words far away in the sentence
d. pragmatic behaviour
33. Topical context refers to:
1p
a. words
b. quotations
c. the first sentence
d. the whole topic of the text
34. The global context helps in:
1p
a. building up sentences
b. acquiring information from the extralinguistic world
c. disambiguating word meanings
d. calculating lexical density
Grades (points: grade):
0-15: 5; 15-20: 6; 20-25: 7; 25-30: 8; 30-35: 9; 35-42: 10
References
Aarts, J., Meijs, W. (1984) Corpus Linguistics: recent developments in the use of
computer corpora in English Language Research, Amsterdam: Rodopi
Allan, K. (2001) Natural Language Semantics, Oxford: Blackwell Publishers
Bergenholtz, H., Nielsen, S. (2006) Subject field components as integrated parts of
LSP dictionaries. John Benjamins Publishing Company
Brown, K. et al. (2006) Encyclopaedia of Language and Linguistics, Second Edition,
Elsevier
Cornilescu, A. (1986) English Syntax, București
Cornilescu, A. (2003) Complementation in English, Editura Universității din București
Croitoru, E. (2002) The English Sentence Structure, Editura Fundației Universitare
"Dunărea de Jos", Galați
Crystal, D. (1991) Stylistic profiling, in K. Aijmer and B. Altenberg (eds.) English
Corpus Linguistics, London: Longman
Dash, N. S. (2008) Context and Contextual Word Meaning in SKASE Journal of
Theoretical Linguistics [online]. vol. 5, no. 2 [cit. 2008-12-18].
http://www.skase.sk/Volumes/JTL12/pdf_doc/DASH.pdf
Dima, G. (2002) Verbele sentiendi în limbile engleză și română, Ed. Fundației
Universitare "Dunărea de Jos" din Galați
Dima, G. (2005) Introducing Corpus Linguistics. A Lexico-Syntactic Profile for
'See', in Colocviile Filologice Gălățene, E.D.P., R.A., București
Dima, G. (2009) English Phonetics and Phonology (Limba engleza contemporana.
Fonetica si fonologie), IDD PIED, http://www.idd.ugal.ro/
Dima, G. (2011) Patterns of Weaving Words. The Lexical Field of Quantifiers. The
Case of Speck, ANALELE UNIVERSITATII DUNAREA DE JOS DIN
GALATI, FASCICULA XXIV, Lexic comun / Lexic specializat, General
Lexicon / Specialized Lexicon, Lexique commun / Lexique spécialisé, ANUL
IV, No. 5 / 2011, Editura Europlus, Galati
Fawcett, P. (1997/2003) Translation Theories Explained, Manchester: St. Jerome
Fillmore, Ch. et al. (1988) Regularity and Idiomaticity in Grammatical
Constructions: The Case of let alone, Language, 64, 3.
Firth, J.R., Papers in Linguistics 1934-1951
http://www.beaugrande.com/LINGTHERFirth.htm
Harris, R. (1991) Reading Saussure, Open Court, La Salle, Illinois
Kinneavy, J. L. (1980) A theory of discourse: The aims of discourse, New
York: Norton
Landau, S. (1987/2001) Dictionaries. The Art and Craft of Lexicography.
Cambridge: CUP
McEnery, T., Wilson, A. (1996) Corpus Linguistics, Edinburgh University Press
Pawley, A., Syder, F.H. (1983) Two Puzzles for Linguistic Theory. In J.C.Richard
Sinclair, J. (1996) The Search for Units of Meaning, Textus 9.
Sinclair, J. (1998) The Lexical Item in E. Weigand (ed.) Contrastive Lexical
Semantics, Amsterdam: Benjamins
Stubbs, M. (2001) Words and Phrases. Corpus Studies of Lexical Semantics,
Blackwell Publishers
Trosborg, Anna, (1997) Text Typology: Register, Genre and Text Type. Text
Typology and Translation: 3-23. John Benjamins
Viel, J. C. (2002) The vocabulary of English for scientific and technological
occupational purposes, ESP World , Issue 1, Vol. 1, May, ( http://esp
world.7p.com/Articles_1/vocabulary.html)
Werlich, E. (1976) A Text Grammar of English, Heidelberg: Quelle & Meyer
Widdowson, H.G. (1998) Linguistics, Oxford University Press
Zgusta, L. (1971) Manual of Lexicography, Prague: Academia / The Hague, Paris:
Mouton
Corpus Sources
a. Search engines and meta-search engines
Alta Vista: http://www.altavista.com
C4: http://www.c4.com
Dogpile: http://www.dogpile.com
Google: http://www.google.com
MetaCrawler: http://www.metacrawler.com
Northern Light: http://www.northernlight.com
ProFusion: http://www.profusion.com
b. Online Corpus Availability
The TEC Concordance Browser website
WebCorp tool (WebCorp website)
Multiconcord: http://www.copycatch.freeserve.co.uk/multiconc.htm
Paraconc: http://www.ruf.rice.edu/~barlow
Wordsmith Tools: http://www.oup.com/elt/global/isbn/6890
Texts available: http://web.bham.ac.uk/johnstf/timconc.htm
BNC
Lextutor
www.hti.umich.edu/m/micase/