NLP Intro
Language is meant for communicating about the world.
Processing spoken language, using all the information needed above plus
additional knowledge about phonology as well as enough added information to
handle the further ambiguities that arise in speech.
The problem: English sentences are incomplete descriptions of the
information that they are intended to convey.
The problem: There are many ways to say the same thing:
Mary was born on October 11.
Mary’s birthday is October 11.
The good side: When you know a lot, facts imply each other. Language is
intended to be used by agents who know a lot.
Steps in NLP
1. Morphological Analysis:
• Individual words are analyzed into their components, and non-word tokens
such as punctuation are separated from the words.
• The analyzer tries to extract the root word from an inflected form of a word by
removing suffixes and prefixes. Ex: getting the root “push” from the inflected
forms pushes, pushed, pushing, etc.
• Appropriate syntactic categories such as noun, verb, adjective, etc. are assigned to all
words in the sentence.
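The first two bullets above can be sketched in a few lines of Python. This is only a rough illustration: the suffix list and the helper names `tokenize` and `strip_suffix` are assumptions made for this example, and a real morphological analyzer would rely on a lexicon and much richer rules.

```python
import re

# Hypothetical suffix list for illustration only.
SUFFIXES = ("ing", "ed", "es", "s")

def tokenize(sentence):
    """Split a sentence into word tokens and separate punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)

def strip_suffix(word):
    """Return a crude root by removing the first matching suffix."""
    lower = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words such as "is" stay intact.
        if lower.endswith(suffix) and len(lower) - len(suffix) >= 3:
            return lower[: -len(suffix)]
    return lower

tokens = tokenize("She pushes, pushed, and is pushing.")
roots = [strip_suffix(t) for t in tokens if t.isalpha()]
print(tokens)  # ['She', 'pushes', ',', 'pushed', ',', 'and', 'is', 'pushing', '.']
print(roots)   # ['she', 'push', 'push', 'and', 'is', 'push']
```

Plain suffix stripping is easy to fool (it would turn "red" into "red" but "bed" stays "bed" only because of the length guard), which is why practical systems check candidate roots against a dictionary.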
2. Syntactic Analysis:
• Uses the result of morphological analysis to build a structural description of
the sentence based on grammatical rules. This step is called parsing.
Ex: A syntactic analyzer would reject the sentence “Boy the go the to store”.
• Creating a parse tree is the first step towards understanding a sentence.
3. Semantic Analysis:
• The structures created by the syntactic analyzer are assigned meanings.
• It maps individual words to corresponding objects in the knowledge base
and combines the words with each other using semantic rules.
Ex: “Colorless green ideas sleep furiously” would be rejected as semantically
anomalous.
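One possible way to picture this check is with selectional restrictions: each verb lists the semantic features it requires of its subject, and the knowledge base lists the features of each noun. The tiny knowledge base, the feature names, and the function `is_semantically_valid` below are all assumptions made for this sketch, not part of any particular system.

```python
# Hypothetical knowledge base: each noun maps to a set of semantic features.
KNOWLEDGE_BASE = {
    "ideas": {"abstract"},
    "dogs": {"animate", "physical"},
}

# Hypothetical semantic rules: features a verb requires of its subject.
SUBJECT_REQUIREMENTS = {
    "sleep": {"animate"},
}

def is_semantically_valid(subject, verb):
    """Accept the pair only if the subject has every feature the verb requires."""
    features = KNOWLEDGE_BASE.get(subject, set())
    required = SUBJECT_REQUIREMENTS.get(verb, set())
    return required <= features  # subset test

print(is_semantically_valid("dogs", "sleep"))   # True
print(is_semantically_valid("ideas", "sleep"))  # False: ideas are not animate
```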
4. Discourse integration:
• The meaning of an individual sentence may depend on the sentences that
precede it and may influence the meanings of the sentences that follow it.
Ex: the word “it” in the sentence “John wanted it” depends on the prior
discourse context, and the sentence may influence a later sentence such as “He always had.”
5. Pragmatic Analysis:
• It refers to the intended meaning of sentences used in different contexts. The
context affects the interpretation of the sentence.
Ex: John saw Mike in the garden with a cat.
• The structure representing what was said is reinterpreted to determine what
was actually meant.
Ex: “Do you know what time it is?” should be interpreted as a request to
be told the time.
Syntactic Processing
• Syntactic Processing is the step in which a flat input sentence is converted
into a hierarchical structure that corresponds to the units of meaning in the
sentence.
• Almost all the systems that are actually used have two main components:
– A declarative representation, called a grammar, of the syntactic facts
about the language.
– A procedure, called a parser, that compares the grammar against input
sentences to produce parsed structures.
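The split between a declarative grammar and a parsing procedure can be sketched as follows. The grammar, the lexicon, and the function names `parse` and `accepts` are hypothetical choices for this illustration; the procedure shown is a simple top-down recognizer that only answers yes or no rather than building a tree.

```python
# Declarative grammar: each non-terminal maps to a list of alternatives.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "Noun"]],
    "VP": [["Verb", "PP"], ["Verb"]],
    "PP": [["Prep", "NP"]],
}

# Lexicon: the syntactic category assumed for each word.
LEXICON = {
    "the": "Det", "boy": "Noun", "store": "Noun",
    "went": "Verb", "go": "Verb", "to": "Prep",
}

def parse(symbol, words, pos):
    """Yield every position reachable after matching `symbol` from `pos`."""
    if symbol not in GRAMMAR:
        # Word category: it must match the category of the next word.
        if pos < len(words) and LEXICON.get(words[pos]) == symbol:
            yield pos + 1
        return
    for alternative in GRAMMAR[symbol]:
        positions = [pos]
        for part in alternative:
            positions = [end for start in positions
                         for end in parse(part, words, start)]
        yield from positions

def accepts(sentence):
    """True if the whole sentence can be derived from the start symbol S."""
    words = sentence.lower().split()
    return len(words) in parse("S", words, 0)

print(accepts("The boy went to the store"))  # True
print(accepts("Boy the go the to store"))    # False
```

Because the grammar is plain data, it can be changed without touching the parsing procedure, which is the point of keeping the two components separate.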
Grammars and Parsers
• In the context of NLP, parsing means analyzing a sentence syntactically to assign
syntactic tags (subject, verb, object, etc.), to provide constituent structure (noun
phrase, verb phrase, etc.), or to characterize the syntactic relations between two
words.
• Parsing techniques are divided into rule-based parsing and statistical
parsing.
Rule-based parsing
• In rule-based parsing, the syntactic structure of the language is provided in the form of
linguistic rules, which can be coded as production rules similar to context-free
rules.
• Production rules are defined using non-terminal and terminal symbols.
Statistical parsing
• Requires large corpora; linguistic knowledge is represented as statistical
parameters or probabilities.
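One common formulation of this idea is a probabilistic context-free grammar, in which each rule carries a probability estimated from a parsed corpus. The sketch below assumes a tiny, made-up list of observed rules and uses relative frequencies as the estimates; the names `observed_rules`, `rule_probs`, and `tree_probability` are illustrative only.

```python
from collections import Counter

# Hypothetical rule occurrences, as if read off a tiny hand-parsed corpus.
observed_rules = [
    ("NP", ("Det", "Noun")),
    ("NP", ("Det", "Noun")),
    ("NP", ("Det", "Adj", "Noun")),
    ("VP", ("Verb", "NP")),
]

rule_counts = Counter(observed_rules)
lhs_counts = Counter(lhs for lhs, _ in observed_rules)

# Relative-frequency estimate: P(A -> beta) = count(A -> beta) / count(A).
rule_probs = {
    rule: count / lhs_counts[rule[0]]
    for rule, count in rule_counts.items()
}

def tree_probability(rules_used):
    """Multiply the probabilities of the rules used to build one parse tree."""
    prob = 1.0
    for rule in rules_used:
        prob *= rule_probs.get(rule, 0.0)  # unseen rules get probability 0
    return prob

for (lhs, rhs), prob in rule_probs.items():
    print(lhs, "->", " ".join(rhs), round(prob, 3))
```

With such probabilities, a parser can rank competing parse trees of an ambiguous sentence and keep the most probable one.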
Grammars and Parsers
• Once the grammar rules are defined, a sentence is parsed using the
grammar, and a tree structure is built if the sentence is syntactically
correct. This tree is called a parse tree.
• Parsing can be done in two ways: top-down parsing and bottom-up
parsing.
• Bottom-up parsing: we start with the words in the sentence and apply
grammar rules in the backward direction until a single tree is produced
whose root matches the start symbol.
• Top-down parsing: we start with the start symbol and apply grammar rules in
the forward direction until the terminal symbols of the parse tree correspond to
the words in the sentence.
• The choice between these two approaches is similar to the choice between
forward and backward reasoning in other problem-solving tasks.
• The most important consideration is the branching factor. Is it greater going
backward or forward?
• Sometimes these two approaches are combined into a single method called
“bottom-up parsing with top-down filtering”.
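The bottom-up direction described above can be sketched as a naive shift-reduce recognizer: words are shifted onto a stack as categories, and grammar rules are applied backward whenever a rule's right-hand side appears on top of the stack. The toy grammar, the lexicon, and the function names are assumptions made for this example. The procedure is greedy and does no backtracking, so it is only adequate for small unambiguous grammars like this one.

```python
# Hypothetical toy grammar as (left-hand side, right-hand side) pairs.
RULES = [
    ("S", ["NP", "VP"]),
    ("NP", ["Det", "Adj", "Noun"]),
    ("NP", ["Det", "Noun"]),
    ("VP", ["Verb", "NP"]),
]

# Lexicon: the syntactic category assumed for each word.
LEXICON = {
    "the": "Det", "an": "Det", "cute": "Adj",
    "girl": "Noun", "apple": "Noun", "eat": "Verb",
}

def reduce_once(stack):
    """Apply the first rule whose right-hand side matches the top of the stack."""
    for lhs, rhs in RULES:
        if stack[-len(rhs):] == rhs:
            del stack[-len(rhs):]
            stack.append(lhs)   # rule applied in the backward direction
            return True
    return False

def bottom_up_accepts(sentence):
    """True if the words can be reduced to the single start symbol S."""
    stack = []
    for word in sentence.lower().split():
        category = LEXICON.get(word)
        if category is None:
            return False             # unknown word
        stack.append(category)       # shift
        while reduce_once(stack):    # reduce as long as some rule applies
            pass
    return stack == ["S"]

print(bottom_up_accepts("The cute girl eat an apple"))  # True
print(bottom_up_accepts("Girl the cute eat apple an"))  # False
```

A top-down parser would instead start from S and expand rules forward until the expansion matches the words; which direction is cheaper depends on the branching factor in each direction.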
• Consider a simple context-free grammar for the English language.
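[Figure: parse tree for the sentence “The cute girl eat an apple”. The root expands to NP VP; the first NP covers Det Adj Noun (“The cute girl”); the VP covers Verb (“eat”) followed by an NP of Det Noun (“an apple”). The figure is labelled with the two parsing directions, top-down parsing and bottom-up parsing.]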
• The first rule can be read as “A sentence is composed of a noun phrase
followed by a verb phrase”; the vertical bar means OR; ε represents the empty string.
• Symbols that are further expanded by rules are called non-terminal
symbols.
• Pure context-free grammars are not effective for describing natural
languages.
Causal chains:
There was a big snowstorm yesterday.
The schools were closed today.
Planning sequences:
Sally wanted a new car.
She decided to get a job.
Implicit presuppositions:
Did Joe fail CSE402?
Discourse and Pragmatic processing
• We focus on using the following kinds of
knowledge:
– The current focus of the dialogue
– A model of each participant’s current beliefs
– The goal-driven character of dialogue
– The rules of conversation shared by all
participants.