C Sudhakar
CEO
Web: www.raskeysoft.com
SMART ANALYTICS
TYPES OF ANALYTICS
Data analytics
Compete on Analytics
Text analytics
Video analytics
Social networking analytics
Web analytics
Speech analytics
TEXT ANALYTICS
Text analytics is the process of analyzing
unstructured text, extracting relevant
information, and transforming it into useful
business intelligence.
Text analysis can now tell us things
we did not already know and, perhaps more
importantly, had no way of knowing before.
Access to huge text data sets and improved
technical capability mean we can now mine the
text for patterns and trends that can be
incredibly useful in business.
TEXT CATEGORIZATION
Text categorization applies some structure to
the text, which can then be used for analysis
or querying.
Text categorization assigns a document to one or
more classes or categories according to its
subject or to other attributes such
as document type, author, or creation date.
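As a rough illustration of assigning a document to one or more categories, the sketch below uses simple keyword matching. The category names, keyword sets, and the `categorize` function are hypothetical and are not taken from these slides; a production system would more likely use a trained classifier.

```python
import re

# Hypothetical category vocabularies (illustrative only);
# a real system would usually learn these from labeled documents.
CATEGORY_KEYWORDS = {
    "finance": {"invoice", "payment", "budget", "revenue"},
    "support": {"error", "crash", "bug", "password"},
    "sales": {"quote", "pricing", "discount", "budget"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def categorize(text, min_hits=1):
    """Return every category whose keywords match at least min_hits distinct tokens."""
    tokens = set(tokenize(text))
    matched = []
    for category, keywords in CATEGORY_KEYWORDS.items():
        if len(tokens & keywords) >= min_hits:
            matched.append(category)
    return sorted(matched)


doc = "Please send the invoice and confirm the budget before the discount expires."
print(categorize(doc))
```

Because a document can match several keyword sets, the function returns a list, which mirrors the "one or more classes" idea above.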
TEXT CLUSTERING
CONCEPT EXTRACTION
Concept extraction identifies the concepts
expressed in a text.
Meaning varies with concept
SENTIMENT ANALYSIS
EXAMPLE
ANALYSIS
Sentences (2), (3) and (4) express three positive opinions, while
sentences (5) and (6) express negative opinions.
Then we also notice that the opinions all have some targets on which they are
expressed.
The opinion in sentence (2) is on iPhone as a whole,
the opinions in sentences (3) and (4) are on the touch screen and voice
quality features of iPhone respectively.
The opinion in sentence (6) is on the price of iPhone, but the opinion/emotion in
sentence (5) is on me, not iPhone.
This is an important point.
In an application, the user may be interested in opinions on certain targets, but
not on all (e.g., unlikely on me).
Finally, we may also notice the sources or holders of opinions.
The source or holder of the opinions in sentences (2), (3) and (4) is the author of
the review (I), but in sentences (5) and (6) it is my mother.
With this example in mind, we can define sentiment
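The analysis above describes each opinion by its sentence, target, holder, and orientation. A minimal sketch of that structure follows; the `Opinion` class and helper names are assumptions made for illustration, and the records simply restate what the analysis says about sentences (2) to (6).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    sentence: int     # sentence number in the review
    target: str       # what the opinion is about
    holder: str       # who holds the opinion
    orientation: str  # "positive" or "negative"


# Records restating the analysis of sentences (2) to (6).
OPINIONS = [
    Opinion(2, "iPhone", "author", "positive"),
    Opinion(3, "touch screen", "author", "positive"),
    Opinion(4, "voice quality", "author", "positive"),
    Opinion(5, "me", "mother", "negative"),
    Opinion(6, "price", "mother", "negative"),
]


def opinions_on(opinions, targets):
    """Keep only the opinions whose target is one the user cares about."""
    return [o for o in opinions if o.target in targets]


def count_by_orientation(opinions):
    """Count opinions per orientation."""
    counts = {}
    for o in opinions:
        counts[o.orientation] = counts.get(o.orientation, 0) + 1
    return counts


# Targets of interest in a product-review application: everything except "me".
product_targets = {"iPhone", "touch screen", "voice quality", "price"}
relevant = opinions_on(OPINIONS, product_targets)
print(count_by_orientation(relevant))
```

Filtering by target drops the opinion in sentence (5), which is about the reviewer rather than the product.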
TECHNICAL CHALLENGES
Object Identification
Feature grouping and synonym grouping
Opinion orientation classification
Integration
Identification of spam reviews/documents
CLASSIFICATION
Document-level sentiment analysis;
Sentence-level sentiment analysis;
Aspect-based sentiment analysis;
Comparative sentiment analysis; and,
Sentiment lexicon acquisition.
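To make the difference between document-level and sentence-level analysis concrete, here is a small lexicon-based sketch. The lexicon, the sample review text, and the function names are all invented for illustration; real sentiment lexicons are far larger, and lexicon counting is only one of several possible techniques.

```python
import re

# Tiny hypothetical sentiment lexicon (illustrative only).
LEXICON = {"nice": 1, "cool": 1, "clear": 1, "mad": -1, "expensive": -1}


def score(text):
    """Sum the lexicon polarity of every word in the text."""
    return sum(LEXICON.get(w, 0) for w in re.findall(r"[a-z]+", text.lower()))


def label(value):
    """Map a numeric score to an orientation label."""
    if value > 0:
        return "positive"
    if value < 0:
        return "negative"
    return "neutral"


def sentence_level(document):
    """Label each sentence separately."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    return [(s, label(score(s))) for s in sentences]


def document_level(document):
    """Label the document as a whole."""
    return label(score(document))


# Invented sample text, not taken from the slides.
review = "The screen is nice and clear. The price is too expensive."
print(document_level(review))
for sentence, orientation in sentence_level(review):
    print(orientation, "|", sentence)
```

The document-level label hides the negative second sentence, which is one reason finer-grained (sentence-level or aspect-based) analysis is listed separately.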
DOCUMENT SUMMARIZATION
As the name suggests, this text analytics
tool automatically summarizes
documents, retaining the most important
points of the original document.
Extraction
Abstraction
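The extraction approach can be sketched as picking the highest-scoring original sentences. The example below scores sentences by average word frequency, which is one simple heuristic chosen here for illustration and not necessarily the method used in this deck; the stopword list, sample text, and function names are assumptions. Abstraction, which generates new sentences, is not shown.

```python
import re
from collections import Counter

# Small hypothetical stopword list (illustrative only).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that", "for", "on", "with"}


def split_sentences(text):
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(text):
    """Lowercased alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=1):
    """Return the top-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(content_words(text))

    def sentence_score(sentence):
        words = content_words(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: sentence_score(sentences[i]),
                    reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)


# Invented sample text, not taken from the slides.
text = ("Text analytics turns text into business intelligence. "
        "Analysts mine text for patterns. "
        "Lunch was late today.")
print(summarize(text, max_sentences=1))
```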
SUMMARY
SMALL EXAMPLE IN AI
[Diagram labels: Lead Validity Intelligence; Positive Opportunity]
DOMAIN INTELLIGENCE
[Diagram labels: Document; Url & Name; Positive (Url / Name pattern); Negative (Url / Name pattern); Unsure (both positive and negative, or neither positive nor negative); Challenges; Dmoz / Jigsaw Data; Solution]
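One possible reading of the Positive / Negative / Unsure labels in this diagram is a pattern check on a document's URL and name. The sketch below follows that reading; the pattern lists, the `classify` function, and the example URLs are hypothetical and were chosen only for illustration.

```python
import re

# Hypothetical URL / name patterns (illustrative only).
POSITIVE_PATTERNS = [r"case[-_ ]?study", r"whitepaper", r"datasheet"]
NEGATIVE_PATTERNS = [r"careers?", r"press[-_ ]?release", r"login"]


def matches_any(patterns, text):
    """True if any pattern occurs in the text, ignoring case."""
    return any(re.search(p, text, flags=re.IGNORECASE) for p in patterns)


def classify(url, name):
    """Label a document from its URL and name.

    'unsure' covers both cases named in the diagram: both positive and
    negative patterns matched, or neither matched.
    """
    text = f"{url} {name}"
    positive = matches_any(POSITIVE_PATTERNS, text)
    negative = matches_any(NEGATIVE_PATTERNS, text)
    if positive and not negative:
        return "positive"
    if negative and not positive:
        return "negative"
    return "unsure"


print(classify("https://example.com/resources/whitepaper.pdf", "Storage whitepaper"))
print(classify("https://example.com/careers", "Open roles"))
print(classify("https://example.com/about", "About us"))
```

Documents labeled "unsure" would then need a further signal, such as the document's content, before a decision is made.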
EXTRACTION ENGINE
[Diagram labels: Document; Text, Xml and Metadata; Old Document; New Document; Parser (Tika and Pdf2Xml); Challenges; PdfMiner; Solutions]
CONTEXT INTELLIGENCE
[Diagram labels: Parser; Document Titles and Headers; Positive; Unsure; Negative; Challenges; Solutions]
KEYWORD INTELLIGENCE
[Diagram labels: Parser; Context Around Keyword; Paragraphs; Bullet Points; Tables; Challenges; Solutions]
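The idea of collecting the context around a keyword from paragraphs, bullet points, and tables can be sketched as follows. This assumes the parser has already produced plain text with blank lines between blocks; the block-type heuristics, function names, and sample text are hypothetical.

```python
import re


def block_kind(block):
    """Guess whether a text block is a bullet list, a table, or a paragraph."""
    lines = block.splitlines()
    if all(re.match(r"\s*([-*•]|\d+[.)])\s+", line) for line in lines):
        return "bullet points"
    if all("|" in line or "\t" in line for line in lines):
        return "table"
    return "paragraph"


def keyword_contexts(text, keyword):
    """Return (kind, block) for every blank-line-separated block containing the keyword."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    contexts = []
    for block in re.split(r"\n\s*\n", text.strip()):
        if pattern.search(block):
            contexts.append((block_kind(block), block))
    return contexts


# Invented sample text, not taken from the slides.
SAMPLE = """We plan to migrate our storage next quarter.

- evaluate storage vendors
- compare pricing

Vendor | Product
Acme | storage array

Unrelated closing paragraph."""

for kind, block in keyword_contexts(SAMPLE, "storage"):
    print(kind, "->", block.splitlines()[0])
```

Knowing the kind of block lets later steps apply different rules to each, for example reading a table row together with its header.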
INTENT ANALYSIS
[Diagram labels: Context Around Keyword; Paragraph (Direct Relation, Indirect Relation); Bullet Point (Header Analysis, Bullet Point Analysis); Table (Row Analysis, Header Analysis); Human Ambiguity; Machine Ambiguity; Solution; Stanford Mistakes; Solution]
OTHER CHALLENGES
Noisy Keywords
Noisy Domains
Duplicates
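[Chart, May and June: Lost Business, Wrong Context, Rejected by reviewer, Approved; percent axis from 0% to 40%. Data labels in the extraction, not attributable to individual series: 37%, 32%, 28%, 19%, 12%, 8%, 13%]
[Chart, False Positives: Lost Business, Low Budget, Wrong Industry, Too Early, Others]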
Challenges
Duplicates
Solution
[Chart: Identified L.B vs. Not identified L.B for Juniper (55/120), Google (30/150) and Tegile (19/55); percent axis from 0% to 80%. Data labels in the extraction, not attributable to individual bars: 65%, 57%, 43%, 35%, 21%]
Keyword Intelligence
THANK YOU