
Purposes

[[Why is it useful to establish a mapping?
cfr. cogalex paper:
WordNet is one of the best-known lexical resources and it contains one of the most complete
verbal ontologies, not only in terms of lexical entries, but also for the number of relations among
verbs (hyponymy/hypernymy, troponymy, entailment). It is therefore very useful to investigate
how IMAGACT maps onto WordNet. A mapping between the two resources would lead to a reciprocal
enrichment in several respects: for instance, IMAGACT does not show semantic relations among
verbs, nor does it use definitions/glosses to define actions or action types, while WordNet does;
on the other hand, WordNet does not distinguish between primary and marked senses, often
confusing proper (concrete?) uses with metaphorical or idiomatic ones. Furthermore, WordNet
defines horizontal relations among senses (synsets) with glosses, while IMAGACT uses scenes
to represent the event type which different verbs can refer to in similar contexts (equivalent verb
classes). So in case of a perfect match between an action type and a synset, IMAGACT videos
would be enriched by WN glosses, and WN glosses could be more intuitively understood if
visually represented.]]
ImagAct: a multimodal ontology of action
The ImagAct ontology (www.imagact.it) focuses only on high-frequency action verbs
(approximately 600 lexical entries) of both Italian and English, which represent the basic verbal
lexicon of action in the two languages. It was derived with a bottom-up approach from spoken
corpora (for Italian: C-ORAL-ROM, LABLITA, LIP, CLIPS; for English: BNC-Spoken) and its
nodes consist of videos instead of lexical entries. Each short video represents in a very simple
way a particular type of action (e.g. a man taking a glass on a table) and is also provided with a
list of Italian and English verbs that can be used to describe that action (with example sentences
derived from the oral corpora). Action types were identified through parallel manual
annotation conducted on both Italian and English [chin?] data, so the ontology is inherently
interlinguistic. In the framework of the project, types can be seen as coalescences [addensamenti?]
on the pragmatic continuum of actions that are extensionally denoted by each
action verb; thus, they are psychologically (cognitively?), rather than simply lexically, grounded.1
This makes it possible to enrich the ontology with other languages, just by mapping data onto the
same ontology.
1 The non-one-to-one correspondence between the lexical level and the level of action types is
demonstrated by the fact that each verb can have more than one action type, and each type can be
denoted by more than one verb.
The ontology is accessible in different ways to users (for example, L2 learners), because the
interface is built around three sections: Dictionary, Gallery and Compare.
In the Gallery, the scenes are organized in nine macro-categories (facial expressions, actions
referring to the body, movement, modification of the object, deterioration of an object, force on an
object, change of location, setting relation among objects, actions in the intersubjective space),
so it is quite intuitive to find the type of action one is interested in. For example, we can look
at actions that cause the modification of an object.
Running through the 313 scenes grouped into this category, we find the action represented in
Fig. 2. The best example that accompanies the video (Marta suona il campanello) tells us that this
particular action is denoted by the Italian verb suonare.
If we now want to know which verbs describe this action in other languages, we enter the lemma
suonare in the Dictionary, where we find the two action types of the Italian verb, with their
translations. The first one indicates the action of playing an instrument, whereas the second one
indicates that of ringing the bell:
Suonare Type 1 (Scene: bbc50559)
Best Example: Fabio suona il pianoforte
English: to play
Chinese: tn
Spanish: tocar, sonar.
Suonare Type 2 (Scene: 4b8bcda1)
Best Example: Marta suona il campanello
English: to ring, to sound
Chinese: n
Spanish: tocar (timbre).
For Suonare Type 1, an alternative scene is given, representing a girl playing the drums (Maria
suona la batteria, 93c459f9). In Italian, both videos have been traced back to the same action
type (as the colours indicate). The alternative scene is given because in English, only in this
specific context, a more specific verb, to drum, can also be used beside to play. Italian, Spanish
and Chinese do not encode this distinction.
Finally, if we are interested in comparing the use of these verbs across languages, for example
between English to ring and Italian suonare, we enter the two lemmas in the Compare section:
the two verbs cover only one shared action type; only suonare can mean to play an instrument,
and only to ring can mean to telephone or to dial a number.
So, the ImagAct ontology can be accessed by lemma and by scene and, furthermore, the usages of
two verbs of two different languages can be compared.
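The Compare view can be read as plain set operations over the action types of two verbs. The following is only a minimal sketch of that idea: the scene IDs for suonare come from the Dictionary example, while the identifier "ring-telephone" and the function name compare_verbs are hypothetical placeholders, not ImagAct data or an ImagAct API.

```python
from typing import Dict, Set, Tuple

# Action types per verb, keyed by scene ID.
# "ring-telephone" is a HYPOTHETICAL placeholder: the text gives no scene ID
# for the telephone/dial type of "to ring".
ACTION_TYPES: Dict[str, Set[str]] = {
    "suonare": {"bbc50559", "4b8bcda1"},        # play an instrument, ring the bell
    "to ring": {"4b8bcda1", "ring-telephone"},  # ring the bell, telephone/dial
}


def compare_verbs(first: str, second: str,
                  types: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (shared types, types only of `first`, types only of `second`)."""
    a, b = types[first], types[second]
    return a & b, a - b, b - a


shared, only_suonare, only_ring = compare_verbs("suonare", "to ring", ACTION_TYPES)
print("shared:", shared)
print("only suonare:", only_suonare)
print("only to ring:", only_ring)
```

With this toy data the only shared type is the bell-ringing scene, which mirrors the outcome described for the Compare section.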
WordNets
WordNets are monolingual lexical databases that group words into sets of synonyms
(synsets) and make explicit the various semantic relations between them. Every synset thus
contains a group of synonymous words or collocations, and word senses are distinguished by
their appearance in different synsets, i.e. every word sense appears in only one synset. While
originally grounded in cognitive/psychological properties, wordnets are essentially based on the
lexical meanings of words. WordNets can also be seen and used as lexical ontologies in
computational systems (e.g. word sense disambiguation, machine translation, etc.): a synset
can be taken as denoting a concept (or a sense of a group of words).
It is evident that the two ontologies are built in very different ways. First of all, WordNet takes
into account the entire lexicon of a language, whereas ImagAct only considers the domain of action
verbs. Another great divergence is found in the purposes: ImagAct aims to list the different event
types (one or more) which we refer to when using action verbs, whereas WordNet aims to
describe all the different uses of a verb (also including idiomatic or metaphorical expressions).
But the two resources also share some commonalities. They are both multilingual, although in
very different ways: WordNets have now been produced for many languages and sometimes
these are connected to one another (see for example the EuroWordNet and GlobalWordNet
projects), whereas ImagAct's ontology is totally derived from interlinguistic comparison.
Defining a gold standard
3.4 The gold standard
To define a gold standard, we manually established a mapping between 358 Italian action types
(271 lemmas) and IWN synsets, comparing the scene and the best example associated with each
action type with the definition given for ItalWordNet synsets.
The action types considered are mostly derived from Italian activity verbs. Activity verbs are
processes (corresponding to Vendler's activities, characterised by being dynamic, durative
and atelic) that are built around a prototypical concept; thus they project a clear mental image
(Moneglia, 1995). In our sample, we mostly selected activity verbs because especially for these
verbs we expect to find, both in ImagAct's and in ItalWordNet's ontologies, a node that coincides
with the prototype of the action [this is a somewhat weak point, because we admit that we did not
choose a representative sample of the ImagAct corpus; will it be enough to justify the choice by
explaining that, to obtain a gold standard, it was important to have as many matches as possible,
and so we looked for them where they were most likely to be found? Or do we simply cut these 5
lines?]
These are the kinds of relation we found in mapping ImagAct action types on ItalWordNet
synsets:
1) Perfect match. For 235 action types, especially those referring to specific actions (most of
the true activity verbs), we found a perfect correspondence between one ImagAct action type and
one IWN sense:
Ex. 1, nuotare, to swim
IMAGACT Action Type 1:
Matteo nuota nell'acqua (Matthew swims in the water)
= IWN Sense 1:
muoversi sulla superficie dell'acqua eseguendo movimenti coordinati delle braccia e delle gambe
(to move on the surface of water moving arms and legs in a coordinated way)
In these cases, even when IWN has more than one candidate sense, it is easy to establish the
match simply by excluding the senses that clearly refer to idiomatic/metaphorical uses:
Ex. 2, bere, to drink
IMAGACT Action Type 1:
Simona beve (Simona drinks)
Simona beve l'acqua (Simona drinks the water)
Paolo beve un caffè (Paolo drinks a coffee)
= IWN Sense 1:
ingerire liquidi (to take in liquids)
IWN Sense 2:
ingerire abitualmente vino e/o altre bevande alcoliche (to drink regularly wine or alcoholic drinks)
IWN Sense 3:
credere ingenuamente e acriticamente (to believe sth. unquestioningly)
2) Imperfect match (T=S+S). In several cases, an action type subsumes more than one synset.
So we found an imperfect match between one action type and two or three synsets (five and
three times, respectively):
Ex. 3, urlare, to shout
IMAGACT Action Type 1:
Fabio urla (Fabio shouts)
= IWN Sense 2:
parlare a voce troppo alta e in modo sguaiato (to talk too loud)
IWN Sense 3:
parlare con tono di voce molto alto, udibile a distanza (to talk in a loud voice)
[IWN Sense 1:
emettere suoni potenti, si dice del verso del leone e di altre belve simili (to emit powerful
sounds, said of the call of the lion and of similar wild beasts).
IWN Sense 4:
rumoreggiare, detto soprattutto degli elementi naturali come mare e vento (to rumble, said
especially of natural elements such as sea and wind).]
3) Imperfect match (T+T=S). In many cases, a mapping was established between two, three or
four action types and a single synset (twenty-four, five and three cases, respectively).
Experiments towards an automatic mapping
Given the potential complementarity of the two resources, the possibility of establishing an
automatic mapping between the ImagAct ontology and WordNet resources has been explored.
In the following, we describe the experiment conducted in this direction, present an evaluation of
the method and finally make some general considerations about the outcomes.
Assumptions and hypothesis
In the ImagAct ontology, as we have seen, basic action types are represented by scenes
(usually short videos in which actors perform the action) and each
scene is associated with one or more verbs (or better, verb types). Scenes can thus be seen as
sets of (locally) equivalent verb types, and therefore considered as entities similar to WordNet
synsets. In parallel, ImagAct verb types can be seen as similar to WordNet word senses.
On the basis of these assumptions, our hypothesis is that we can automatically establish
correspondences between WordNet verb senses and ImagAct verb types by considering the
similarities of action type scenes and synsets on the basis of the verbs they contain.
The experiment conducted focuses on the mapping of the Italian WordNet (ItalWordNet) and the
Italian verbs associated with the action types in the ImagAct ontology.
As a first step, the goal is to automatically find perfect/imperfect matches. (??)
Method
Briefly, the method/algorithm implemented is the following:
1. From the ImagAct data, we derive the list of scenes with the associated verbs. Each verb is
furthermore annotated with the relation it bears to the scene: it can be prototypical (PROTO) or
an instance (INST), an instance always being a verb denoting a more general action than the one
represented in the scene. For each scene we thus consider the set S of verbs associated with the
scene, with the specific relation they bear to the scene as features.
2. For each verb of the scene, we search ItalWordNet for the list of its possible senses and for
each of the senses we take the corresponding synset ID. Thus, each verb of the scene will be
associated with a list of synset IDs.
3. For each synset ID, we retrieve the set of verb lemmas it contains in ItalWordNet (i.e. we
consider the original synset).
4. For each synset that contains a prototype verb of the scene, an extended set is created by
including the set of its verb lemmas and the set of verb lemmas of all its hypernyms.
5. Both sets, as in 3. and 4., are then considered together as the set (of sets) V.
6. A similarity measure (Jaccard index, Tversky ratio model) is then used to assign a score to
each set X in V relative to the scene S, which provides us with a ranking of possible matches.
7. The highest score is taken for selecting the best candidate(s) for matching. That is, the
synset(s) that receive the highest similarity score will be proposed as the best mapping(s) to the
ImagAct scene.
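The steps above can be sketched in a few lines of Python. This is an illustrative reconstruction under stated assumptions, not the implementation used in the experiments. The lemma sets of synsets 34031 and 32367 are taken from the false-positive example reported in these notes. The verb list of the toy scene, the hypernym synset "H1" and all function names are invented for the example. As a simplification, the PROTO/INST relation is only used to decide when to build the extended set, and scoring uses the Jaccard index.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index |a & b| / |a | b| (0.0 when both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def all_hypernyms(sid: str, hypernyms: Dict[str, List[str]]) -> Set[str]:
    """Collect every hypernym synset reachable from `sid` (transitively)."""
    seen: Set[str] = set()
    stack = list(hypernyms.get(sid, []))
    while stack:
        h = stack.pop()
        if h not in seen:
            seen.add(h)
            stack.extend(hypernyms.get(h, []))
    return seen


def rank_synsets(scene: Dict[str, str],
                 senses: Dict[str, List[str]],
                 lemmas: Dict[str, Set[str]],
                 hypernyms: Dict[str, List[str]]) -> List[Tuple[str, float]]:
    """Score each candidate synset against the scene; best candidate first."""
    scene_verbs = set(scene)                          # step 1: the set S
    best: Dict[str, float] = {}
    for verb, relation in scene.items():
        for sid in senses.get(verb, []):              # step 2: senses -> synset IDs
            candidate_sets = [set(lemmas[sid])]       # step 3: original synset
            if relation == "PROTO":                   # step 4: extended set
                extended = set(lemmas[sid])
                for hid in all_hypernyms(sid, hypernyms):
                    extended |= lemmas[hid]
                candidate_sets.append(extended)
            for x in candidate_sets:                  # steps 5-6: score each X in V
                best[sid] = max(best.get(sid, 0.0), jaccard(scene_verbs, x))
    # step 7: highest score first (ties broken by synset ID)
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


# Toy data. The scene's verb list and synset "H1" are HYPOTHETICAL.
scene = {"verniciare": "PROTO", "pitturare": "PROTO"}
senses = {"verniciare": ["34031", "32367"], "pitturare": ["32367"]}
lemmas = {
    "34031": {"verniciare"},
    "32367": {"colorare", "pitturare", "verniciare"},
    "H1": {"coprire"},
}
hypernyms = {"34031": ["H1"]}

ranking = rank_synsets(scene, senses, lemmas, hypernyms)
print(ranking)
```

On this toy scene, synset 32367 scores 2/3 and outranks 34031 (1/2), because it contains both scene verbs. This shows how two near-identical synsets can compete for the same scene.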
Evaluation
Discussion
Conclusions
(analysis of false positives: errors due to the differences between ImagAct and IWN...)
References
Moneglia, M. (1995), Prototypical vs. not-prototypical predicates. Paper read at the international
conference Linguistics by the end of the Twentieth Century: Achievements and Perspectives
(Moscow State University, February 1995); available at http://lablita.dit.unifi.it/preprint,
Collezione 1995.
Moneglia, M., Monachini, M., Panunzi, A., Frontini, F., Gagliardi, G. & Russo, I. (2012), Mapping
a corpus-induced ontology of action verbs on ItalWordNet. To appear in Christiane Fellbaum &
Piek Vossen (eds), Proceedings of the Global Wordnet Conference (January 9-13, 2012,
Matsue, Japan), Brno. [do we mention this earlier work?]
Rodríguez, M.A. and Egenhofer, M.J. (2003), Determining Semantic Similarity among Entity
Classes from Different Ontologies. IEEE Transactions on Knowledge and Data Engineering,
Vol. 15, No. 2, March/April 2003.
Vendler, Z. (1967), Linguistics in Philosophy. Ithaca, NY: Cornell University Press.
Log of the various experiments
Applying the algorithm without modifications:
Recall: 58%
Precision: 40%
Without considering scenes with only one element:
Recall: 68%
Precision: 46%
Augmenting (slightly) the score of the synset with the lowest-numbered sense (S1, S2, S3) in the
list of senses for each lemma:
Recall: 52%
Precision: 52%
Without considering scenes with only one element AND augmenting the score of the
lowest-numbered sense:
Recall: 54%
Precision: 56%
Case **: augmenting (heavily) the score of the synset with the lowest-numbered sense if the scene
it belongs to has only one element, else augmenting (slightly) the score of the synset
with the lowest-numbered sense.
Recall: 56%
Precision: 59%
As case **, with modifications (a.) and (b.) to account for reflexives.
Recall: 54%
Precision: 54%
As case **, but with ONLY modification (b.), which accounts for reflexives.
Recall: 57%
Precision: 61%
where the modifications to account for reflexives are:
(a.) for each word V of the scene, the synsets containing the reflexive version of V are also
considered as candidate synsets by the algorithm.
(b.) during evaluation and score assignment, a verb and its reflexive counterpart
are treated as the same thing.
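One way to picture modification (b.) is to normalise reflexive infinitives before comparing lemma sets. This is a naive, hypothetical sketch: the function names and the suffix rule are assumptions made for illustration, not the procedure actually used. The rule does not cover contracted verbs in -arre (e.g. trarsi), and it will also rewrite any non-verb word that happens to end in -rsi.

```python
from typing import Set


def normalize_reflexive(lemma: str) -> str:
    """Map an Italian reflexive infinitive to its non-reflexive form (naive rule)."""
    if not lemma.endswith("rsi"):
        return lemma                      # not a reflexive infinitive: unchanged
    stem = lemma[:-2]                     # drop "si": "lavarsi" -> "lavar"
    if stem.endswith(("or", "ur")):       # contracted: "disporsi" -> "disporre"
        return stem + "re"
    return stem + "e"                     # regular: "lavarsi" -> "lavare"


def normalize_set(lemmas: Set[str]) -> Set[str]:
    """Apply the normalisation to every lemma of a scene or synset."""
    return {normalize_reflexive(lemma) for lemma in lemmas}


print(normalize_reflexive("disporsi"))                 # disporre
print(normalize_set({"muoversi", "muovere"}))          # {'muovere'}
```

Under this assumption, a scene verb and the reflexive lemma of a synset count as the same feature when the similarity score is computed.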
With the new input file (10.10.2013 [1120 scenes]):
Recall: 59%
Precision: 64%
\[
\mathrm{tversky}(A, B) = \frac{|A \cap B|}{|A \cap B| + \alpha\,|A \setminus B| + (1 - \alpha)\,|B \setminus A|},
\qquad 0 \le \alpha \le 1
\]
Results (new file, as case ** + reflexives in evaluation):
alpha   recall   precision
0.7     60%      66%
0.8     59%      66%
0.75    60%      65%
0.35    60%      64%
0.15    60%      62%
0.88    60%      68%
0.5     59%      64%    // Jaccard-style case
0.9     60%      68%
1.0     61%      71%
0.95    61%      71%
With alpha close to 1 we obtain 61% recall and 71% precision!
What alpha represents

The similarity measure adopted is based on feature matching in a set-theoretical model, and not
in a geometrical model, as illustrated in Tversky (A. Tversky, Features of similarity.
Psychological Review, Vol. 84(4), Jul 1977, 327-352); it is therefore not guaranteed to be a
metric distance, and in particular it need not satisfy the symmetry axiom.
In our case we compare a scene A with a number of candidate synsets {B},
using as features the elements they are composed of.
Alpha is an importance, a weight between 0 and 1 assigned to the elements of the scene
that are not in the synset: the closer alpha is to 1, the more importance we give to the requirement
that the elements of the scene be matched by the synset; conversely, the closer alpha is to 0, the
more importance we give to the elements of the synset that are not in the scene. A generic alpha
between the extremes takes both effects into account in a continuous way.
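The effect of the weight can be checked on a small example. This is a minimal sketch of the ratio model as defined in these notes. The scene A below is invented for illustration, while B1 and B2 reuse the lemma sets of synsets 34031 and 32367 from the false-positive analysis. The function name tversky and its signature are assumptions.

```python
from typing import Set


def tversky(a: Set[str], b: Set[str], alpha: float) -> float:
    """Tversky ratio model: alpha weights a \\ b, (1 - alpha) weights b \\ a."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    common = len(a & b)
    denominator = common + alpha * len(a - b) + (1 - alpha) * len(b - a)
    return common / denominator if denominator else 0.0


A = {"verniciare", "pitturare"}               # HYPOTHETICAL scene verbs
B1 = {"verniciare"}                           # lemmas of synset 34031
B2 = {"colorare", "pitturare", "verniciare"}  # lemmas of synset 32367

for alpha in (0.0, 0.5, 1.0):
    print(alpha, round(tversky(A, B1, alpha), 3), round(tversky(A, B2, alpha), 3))

# The measure is not symmetric: swapping the arguments changes the score.
print(tversky(A, B1, 1.0), tversky(B1, A, 1.0))
```

With alpha = 1 only unmatched scene verbs are penalised, so B2 (which covers the whole scene) scores 1.0 and B1 scores 0.5. With alpha = 0 only extra synset lemmas are penalised, so the ranking flips. With alpha = 0.5 the formula coincides with the Dice coefficient, which ranks candidates in the same order as the Jaccard index.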
ANALYSIS OF FALSE POSITIVES
136 false positives
Errors:
The analysis of false positives has revealed two errors in the input file:
COLLOCARE T 2027: it is mapped on S2 (32262) but it should be S1 (32261): 20 cases
ATTACCARE T 529: S11 has become S1 (!), so we should rewrite 37979 (S1) with 37985
(S11): 4 cases
In all of these cases, the algorithm found the correct candidate.
False positives:
Total number of false positives (errors): 112
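1) The proposed mapping is considered totally acceptable in 24 cases (21.43%), although the
output differs from the gold standard.
Ex. from scene 64fa01f7, verb verniciare:
BE: L'imbianchino vernicia (The house painter paints)    Type ID: 1535
Expected synset: {34031} coprire con uno strato di vernice un muro, un mobile, un infisso, ecc.
(to cover a wall, a piece of furniture, a window or door frame, etc. with a coat of paint)
Actual result: {32367} tinteggiare con pitture e vernici colorate (to paint with coloured paints
and varnishes).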
Rephrasing:
We made a qualitative analysis of the 112 matches classified as false positives. We found that
in 24 cases (21.43%) the proposed mapping was totally acceptable, although different from the
gold standard. This happens when two synsets are almost identical both in meaning and in the
verbs they contain:
Ex. from scene 64fa01f7, verb verniciare:
BE: L'imbianchino vernicia (The house painter paints)    Type ID: 1535
Expected synset: {34031} (verniciare [1]) coprire con uno strato di vernice un muro, un mobile,
un infisso, ecc. (to cover a wall, a piece of furniture, a window or door frame, etc. with a coat
of paint)
Actual result: {32367} (colorare [3], pitturare [2], verniciare [2]) tinteggiare con pitture e vernici
colorate (to paint with coloured paints and varnishes)
Thus, the actual precision reached by the automatic mapping is [...].
2) The proposed mapping is partially unacceptable in 19 cases (16.96%), when the senses
described by the best example and the synset definition are similar but do not overlap.
3) The proposed mapping is totally unacceptable: 60
4) The proposed mapping is partially acceptable: there is a clear correlation between the type and
the synset, but the lemma chosen is not correct (reflexives): 9
(DISPORSI, row 58)
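The counts reported in this analysis can be cross-checked with a few lines of arithmetic. The reading that the 112 cases are the 136 reported false positives minus the 24 cases caused by input-file errors is an assumption, although the numbers are consistent with it. The category labels are shortened for the example.

```python
# Counts as reported in the analysis above.
reported_false_positives = 136
input_file_errors = 20 + 4            # COLLOCARE (20 cases) + ATTACCARE (4 cases)

# ASSUMPTION: the 112 analysed cases exclude the input-file errors.
remaining = reported_false_positives - input_file_errors

breakdown = {
    "totally acceptable": 24,
    "partially unacceptable": 19,
    "totally unacceptable": 60,
    "partially acceptable (reflexives)": 9,
}

# Share of each category among the remaining false positives, in percent.
shares = {label: round(100 * count / remaining, 2) for label, count in breakdown.items()}

print(remaining)                      # 112
print(sum(breakdown.values()))        # 112
for label, share in shares.items():
    print(f"{label}: {share}%")
```

The four categories add up to 112, and the first two shares reproduce the 21.43% and 16.96% quoted in the text. The remaining two come out at 53.57% and 8.04%.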
