Natural Language Understanding
NLP
Written Text
Speech
Speech Recognition
Speech Synthesis
Languages
Artificial
Smaller vocabulary
Simple syntactic structures
Non-ambiguous
Not tolerant of errors (e.g. a syntax error)
Natural
Who is Bram Stoker?
(Dracula)
Information Extraction
Subject: curriculum meeting
Date: January 15, 2012
To: Dan Jurafsky
Hi Dan, we've now scheduled the curriculum meeting.
It will be in Gates 159 tomorrow from 10:00-11:30.
-Chris
Event: Curriculum mtg
Date: Jan-16-2012
Start: 10:00am
End: 11:30am
Where: Gates 159
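The extraction step above (email in, event record out) can be sketched with a few hand-written patterns. This is only an illustration under stated assumptions: the function name `extract_event`, the output keys, and the location pattern are hypothetical, and a real extractor would need far more robust date and entity handling.

```python
import re
from datetime import datetime, timedelta

EMAIL = """Subject: curriculum meeting
Date: January 15, 2012
To: Dan Jurafsky
Hi Dan, we've now scheduled the curriculum meeting.
It will be in Gates 159 tomorrow from 10:00-11:30.
-Chris"""


def extract_event(email: str) -> dict:
    """Fill a flat event record from one email using hand-written patterns."""
    subject = re.search(r"^Subject:\s*(.+)$", email, re.M).group(1)
    sent_text = re.search(r"^Date:\s*(.+)$", email, re.M).group(1)
    sent = datetime.strptime(sent_text, "%B %d, %Y")
    # Resolve the relative expression "tomorrow" against the sending date.
    has_tomorrow = re.search(r"\btomorrow\b", email) is not None
    day = sent + timedelta(days=1) if has_tomorrow else sent
    # Time range written as HH:MM-HH:MM.
    start, end = re.search(r"\b(\d{1,2}:\d{2})-(\d{1,2}:\d{2})\b", email).groups()
    # Hypothetical location pattern: "in <Capitalized word> <room number>".
    where = re.search(r"\bin ([A-Z][a-z]+ \d+)\b", email).group(1)
    return {
        "event": subject,
        "date": day.strftime("%b-%d-%Y"),
        "start": start,
        "end": end,
        "where": where,
    }


event = extract_event(EMAIL)
print(event)
```

The key step is resolving "tomorrow" relative to the Date header, which is why the extracted date is one day later than the sending date.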
zoom
affordability
size and weight
flash
ease of use
since the camera is small and light, I won't need to carry around
those heavy, bulky professional cameras either!
the camera feels flimsy, is plastic and very light in weight you have
to be very delicate in the handling of this camera
Machine Translation
Fully automatic
Translation from Stanford's Phrasal:
mostly solved
Sentiment analysis
Spam detection
Let's go to Agra!
Buy V1AGRA
Coreference resolution
Paraphrase
ADJ
NOUN VERB
ADV
ORG
LOC
Summarization
Parsing
Economy
is good
Dialog
Party
May
27
add
2.
3.
Syntactic Ambiguity
At last, a computer that understands you like your mother
Semantic Ambiguity
At last, a computer that understands you like your mother
a female parent
a cask or vat used in vinegar-making
Acoustic Ambiguity
At last, a computer that understands you like your mother
Pragmatic Ambiguity
At last, a computer that understands you like your mother
1.
2.
3.
Information Retrieval
Speech Recognition
Predictive Text / Word Completion
Language Identification
Text Classification
Authorship Attribution
Stages of NL Understanding
Parsing (Syntax):
Semantic interpretation:
Lexical Semantics: What are the meanings of, and the semantic relations between, individual words?
Chair: person? Furniture?
Compositional Semantics: What is the meaning of phrases and sentences?
The chair's leg is broken
source: Luger (2005)
Stages of NL Understanding
Discourse Analysis
Pragmatics
World Knowledge
Examples
I start the AI class with the sentences:
syntactically OK?
Syntactic Parsing
1.
2.
English parts-of-speech
Syntax
How parts-of-speech are organised into larger
syntactic constituents
Main Constituents:
S: sentence
The boy is happy.
NP: noun phrase
the little boy from Paris, Sam Smith, I, ...
VP: verb phrase
eat an apple, sing, leave Paris in the night
PP: prepositional phrase
in the morning, about my ticket
AdjP: adjective phrase
really funny, rather clear
AdvP: adverb phrase
slowly, really slowly
A Parse Tree
a CFG consists of
sentence S
An Example
Lexicon:
N --> flights | trip | breeze | morning // noun
V --> is | prefer | like // verb
Adj --> direct | cheapest | first // adjective
Pro --> me | I | you | it // pronoun
PN --> Chicago | United | Los Angeles // proper noun
Det --> the | a | this // determiner
Prep --> from | to | in // preposition
Conj --> and | or | but // conjunction
Grammar:
S --> NP VP // I + prefer United
NP --> Pro | PN | Det N // I, Chicago, the morning
VP --> V | V NP | V NP PP // is, prefer + United, ...
PP --> Prep NP // to Chicago, to I ??
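To make the example grammar concrete, here is a minimal sketch of a top-down, backtracking parser over exactly the lexicon and rules listed above. The function names (`parse`, `expand`, `parses`) and the nested-tuple tree format are assumptions made for illustration. Tokens are split on whitespace, so the two-word entry "Los Angeles" would never match in this simplified version.

```python
LEXICON = {
    "N": {"flights", "trip", "breeze", "morning"},
    "V": {"is", "prefer", "like"},
    "Adj": {"direct", "cheapest", "first"},
    "Pro": {"me", "I", "you", "it"},
    "PN": {"Chicago", "United", "Los Angeles"},
    "Det": {"the", "a", "this"},
    "Prep": {"from", "to", "in"},
    "Conj": {"and", "or", "but"},
}

GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Pro"], ["PN"], ["Det", "N"]],
    "VP": [["V"], ["V", "NP"], ["V", "NP", "PP"]],
    "PP": [["Prep", "NP"]],
}


def parse(symbol, words, i):
    """Yield (tree, next_index) for every way symbol can cover words[i:next_index]."""
    if symbol in LEXICON:
        # Lexical category: consume one word if it belongs to the category.
        if i < len(words) and words[i] in LEXICON[symbol]:
            yield (symbol, words[i]), i + 1
        return
    for rhs in GRAMMAR[symbol]:
        yield from expand(symbol, rhs, words, i, [])


def expand(symbol, rhs, words, i, children):
    """Match the remaining right-hand-side symbols left to right."""
    if not rhs:
        yield (symbol, *children), i
        return
    for child, j in parse(rhs[0], words, i):
        yield from expand(symbol, rhs[1:], words, j, children + [child])


def parses(sentence):
    """Return all parse trees of category S that span the whole sentence."""
    words = sentence.split()
    return [tree for tree, end in parse("S", words, 0) if end == len(words)]


print(parses("I prefer United"))
print(len(parses("I prefer United to I")))
```

The second call shows why the grammar comment flags "to I ??": the rule NP --> Pro lets the grammar accept "I prefer United to I", so the grammar overgenerates.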
Parsing
parsing:
we need:
Parsing Strategies
Bottom-up parsing /
breadth first
1.
2.
3.
4.
5.
6.
7.
8.
9.
Top-down parsing / depth first
1. S
2. NP VP
3. PN VP
4. John VP
5. John V NP
6. John ate NP
7. John ate ART N
8. John ate the N
9. John ate the cat
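The top-down derivation of "John ate the cat" can be reproduced with a short depth-first search that always rewrites the leftmost non-terminal. This is a sketch, not the slides' algorithm: the rule table is inferred from the derivation steps themselves, and the function name `derive` is hypothetical.

```python
# Rules inferred from the derivation shown above (an assumption).
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["PN"], ["ART", "N"]],
    "VP": [["V", "NP"]],
    "PN": [["John"]],
    "V": [["ate"]],
    "ART": [["the"]],
    "N": [["cat"]],
}


def derive(form, words):
    """Depth-first, top-down search for a leftmost derivation of words.

    Returns the list of sentential forms from `form` down to `words`,
    or None if no derivation exists.
    """
    # Index of the leftmost non-terminal, or None if only words remain.
    idx = next((k for k, sym in enumerate(form) if sym in RULES), None)
    if idx is None:
        return [form] if form == words else None
    # Prune: words already produced must match the input, and since no rule
    # is empty, the form can never shrink back to the input length.
    if form[:idx] != words[:idx] or len(form) > len(words):
        return None
    for rhs in RULES[form[idx]]:
        rest = derive(form[:idx] + rhs + form[idx + 1:], words)
        if rest is not None:
            return [form] + rest
    return None  # every alternative failed: backtrack


steps = [" ".join(f) for f in derive(["S"], "John ate the cat".split())]
for number, step in enumerate(steps, start=1):
    print(f"{number}. {step}")
```

Note that the search first tries NP --> PN for the object and only falls back to NP --> ART N after "John ate John" fails to match the input; the returned derivation contains only the successful path.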
Depth-first vs Breadth-first
the cat eats the mouse.
NP-VP
Det-N-PP
the cat
Det-N
VP Aux-NP-VP
Prep-NP
Grammar:
(1) S --> NP VP
(2) S --> VP
(3) S --> Aux NP VP
(4) NP --> Det N PP
(5) NP --> Det N
(6) PP --> Prep N
Lexicon:
(10) Det --> the
(11) N --> cat
(12) VB --> eats
            Depth-first               Breadth-first
Top-down    Top-down, Depth-first     Top-down, Breadth-first
Bottom-up   Bottom-up, Depth-first    Bottom-up, Breadth-first
PP attachment:
The man saw the boy with the telescope.
Correct parse 1
Correct parse 2
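The two correct parses can be counted mechanically with a CKY-style chart. The sketch below assumes a small grammar in binary form written for this illustration (it is not the grammar from the slides), in which a PP may attach either to a noun phrase or to a verb phrase.

```python
from collections import defaultdict

# Hypothetical grammar: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # PP attaches to the noun phrase
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attaches to the verb phrase
    ("PP", "Prep", "NP"),
]
LEXICON = {
    "the": ["Det"],
    "man": ["N"],
    "boy": ["N"],
    "telescope": ["N"],
    "saw": ["V"],
    "with": ["Prep"],
}


def count_parses(words, start="S"):
    """Count the distinct parse trees of `words` rooted in `start`."""
    n = len(words)
    chart = defaultdict(int)  # (i, j, symbol) -> number of trees over words[i:j]
    for i, word in enumerate(words):
        for tag in LEXICON[word]:
            chart[i, i + 1, tag] += 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right in BINARY_RULES:
                    chart[i, j, parent] += chart[i, k, left] * chart[k, j, right]
    return chart[0, n, start]


sentence = "the man saw the boy with the telescope".split()
print(count_parses(sentence))
```

One tree attaches "with the telescope" to "the boy" (the boy has the telescope); the other attaches it to the verb phrase (the seeing is done with the telescope).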
Probabilistic Parsing
One morning I shot an elephant in my pyjamas. How he got
into my pyjamas, I don't know.
G. Marx, Animal Crackers, 1930.
Example of a PCFG
So for:
source: Manning and Schütze, Foundations of Statistical Natural Language Processing, MIT Press (1999)
P(t1) = 1 × 0.1 × 0.7 × 1 × 0.4 × 0.18 × 1 × 1 × 0.18 = 0.0009072
P(t2) = 1 × 0.1 × 0.3 × 0.7 × 1 × 1 × 0.18 × 1 × 0.18 = 0.0006804
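The probability of a parse tree under a PCFG is the product of the probabilities of the rules used in it. A minimal sketch, using only the factors listed above for t1 and t2 (the function names are hypothetical):

```python
import math

# Rule probabilities read off each tree, in the order listed above.
T1 = [1, 0.1, 0.7, 1, 0.4, 0.18, 1, 1, 0.18]
T2 = [1, 0.1, 0.3, 0.7, 1, 1, 0.18, 1, 0.18]


def tree_probability(rule_probs):
    """Multiply the probabilities of all rules used in one parse tree."""
    return math.prod(rule_probs)


def best_parse(candidates):
    """Return the name of the candidate tree with the highest probability."""
    return max(candidates, key=lambda name: tree_probability(candidates[name]))


print(tree_probability(T1))
print(tree_probability(T2))
print(best_parse({"t1": T1, "t2": T2}))
```

A probabilistic parser would prefer t1 here because its product is larger. In practice the sum of log probabilities is often used instead of the raw product to avoid numerical underflow on long sentences.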
source: Manning and Schütze, Foundations of Statistical Natural Language Processing, MIT Press (1999)
Semantic Interpretation
Map sentences to some representation of their meaning
Lexical Semantics
1.
Compositional Semantics
2.
Lexical Semantics
[Figure: correspondences between English leg, foot, paw and French jambe, patte, pied, étape, depending on context: animal, bird, human, chair, journey]
Input:
Output:
intuition:
sense of a word depends on the sense of surrounding words
ex: bass = fish, musical instrument, ...
Surrounding word: most likely sense
river: fish
violin: instrument
salmon: fish
play: instrument
player: instrument
striped: fish
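The intuition above (each surrounding word points to a sense) can be sketched as a simple vote over cue words. The cue table is taken from the word list above; the function name `disambiguate` and the majority-vote rule are assumptions made for illustration.

```python
from collections import Counter

# Cue words for "bass" and the sense each one suggests.
CUES = {
    "river": "fish",
    "violin": "instrument",
    "salmon": "fish",
    "play": "instrument",
    "player": "instrument",
    "striped": "fish",
}


def disambiguate(context_words):
    """Return the sense with the most supporting cue words, or None."""
    votes = Counter(CUES[word] for word in context_words if word in CUES)
    return votes.most_common(1)[0][0] if votes else None


print(disambiguate("he caught a striped bass in the river".split()))
print(disambiguate("the bass player tuned up".split()))
```

A vote like this ignores how strongly each cue is associated with a sense and how frequent each sense is overall, which is what a probabilistic classifier adds.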
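Goal: choose the most probable sense s* for a word given a vector V of surrounding words
Feature vector V contains:
Features: words [fishing, big, sound, player, fly, rod, ...]
Value: frequency of these words in a window before & after the target word [0, 0, 0, 2, 1, 0, ...]
Bayes decision rule:
$$s^* = \arg\max_{s_k} P(s_k \mid V)$$
Assuming the features are independent given the sense (the naive Bayes assumption), this becomes:
$$s^* = \arg\max_{s_k} P(s_k) \prod_{j=1}^{n} P(v_j \mid s_k)$$
where n is the number of features in V, and:
$$P(v_j \mid s_k) = \frac{\mathrm{count}(v_j, s_k)}{\sum_{t} \mathrm{count}(v_t, s_k)}$$
(number of occurrences of feature j over the total number of features appearing in windows of s_k)
$$P(s_k) = \frac{\mathrm{count}(s_k)}{\mathrm{count}(\mathit{word})}$$
(number of occurrences of sense k over the number of all occurrences of the ambiguous word)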
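The decision rule above can be sketched as a small naive Bayes classifier working in log space. Two things here are assumptions rather than part of the slides: the toy training examples for "bass" are invented for illustration, and add-one smoothing is used so that an unseen word does not produce a zero probability (and therefore log 0).

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """examples: list of (sense, context_words) pairs for one ambiguous word."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    for sense, context in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(context)
    return sense_counts, word_counts


def classify(context, sense_counts, word_counts):
    """Return argmax over senses of log P(s_k) + sum_j log P(v_j | s_k)."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total = sum(sense_counts.values())

    def score(sense):
        # Add-one smoothing (an assumption, not in the slides).
        denom = sum(word_counts[sense].values()) + len(vocab)
        s = math.log(sense_counts[sense] / total)
        for word in context:
            s += math.log((word_counts[sense][word] + 1) / denom)
        return s

    return max(sense_counts, key=score)


# Hypothetical training windows around the word "bass".
EXAMPLES = [
    ("fish", ["fishing", "rod", "river"]),
    ("fish", ["striped", "salmon", "river"]),
    ("instrument", ["player", "sound", "violin"]),
    ("instrument", ["play", "sound", "big"]),
]

sense_counts, word_counts = train(EXAMPLES)
print(classify(["fishing", "river"], sense_counts, word_counts))
print(classify(["player", "sound"], sense_counts, word_counts))
```

Summing logs instead of multiplying probabilities gives the same argmax and avoids underflow when the context window is large.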
Example
Training:
P(BANK1) = 5/7              P(BANK2) = 2/7
P(the|BANK1) = 5/30         P(the|BANK2) = 3/12
P(world|BANK1) = 1/30       P(world|BANK2) = 0/12
P(and|BANK1) = 1/30         P(and|BANK2) = 0/12
P(off|BANK1) = 0/30         P(off|BANK2) = 1/12
P(Potomac|BANK1) = 0/30     P(Potomac|BANK2) = 1/12
Score(BANK1) = log(5/7) + log(P(shoe|BANK1)) + log(P(on|BANK1)) + log(P(the|BANK1))
Score(BANK2) = log(2/7) + log(P(shoe|BANK2)) + log(P(on|BANK2)) + log(P(the|BANK2))
Compositional Semantics
The cat eats the mouse = The mouse is eaten by the cat.
Goal:
E.g.
Role: Definition; Example; Syntactic role
agent: instigator of the action; ...; subject, by-pp
instrument: force/tool causing event; ... window. / ... the window.; with-pp, subject
patient/theme: thing affected; ...; direct object, subject
experiencer: person involved in perception/state; ...; subject
destination: final location; I walked to school.; to-pp, into-pp
Some Difficulties
∀m (man(m) → ∃f (woman(f) ∧ loves(m, f)))
∃f (woman(f) ∧ ∀m (man(m) → loves(m, f)))
Discourse Analysis
In logic: A ∧ B ∧ C ≡ C ∧ B ∧ A
Not in NL:
John visited Paris. He bought Mary some expensive
perfume. Then he flew home. He went to Walmart. He
bought some underwear.
John visited Paris. Then he flew home. He went to
Walmart. He bought Mary some expensive perfume. He
bought some underwear.
Humans infer relations between sentences that may not
be explicitly stated in order to make a text coherent.
(*) Bill went to see his mother. The trunk is what makes
the bonsai, it gives it both its grace and power.
SEQUENCE
CONTRAST
CAUSE
RESULT
PURPOSE
Input:
Output:
Pragmatics