
International Journal on Recent and Innovation Trends in Computing and Communication

Volume: 2 Issue: 11

ISSN: 2321-8169
3595-3601


A Survey on Sentiment Mining

Ms. Sonali D. Ingale


Dept of CS & IT
Dr. Babasaheb Ambedkar Marathwada University
Aurangabad, India

Head & Prof of Dept of CS & IT

Dr. Babasaheb Ambedkar Marathwada University
Aurangabad, India

Abstract: In the past, before spending money on any product, people asked their family, friends and colleagues for their judgment and then
made a decision. Today, with the boom of the World Wide Web, an enormous amount of data is available on the internet, so when
purchasing a product customers decide by analyzing electronic text instead of asking people. With the growth of e-commerce, crowds of
people are encouraged to write their opinions about numerous products in the form of statements or comments on countless sites like
facebook, flipkart, snapdeal, amazon, bloggers, twitter, etc. These comments are the sentiments users express about the services, and they are
categorized as positive, negative or neutral. Different techniques are used for summarizing reviews, such as Information Retrieval, Text Mining,
Text Classification, Data Mining and Text Summarizing. Countless people write their sentiments on plenty of sites. These comments are written
in no particular order, which limits the usefulness of the information. Anyone who wants to find out how usable a product is
has to read all the sentiments manually and then classify them, which is a burdensome task in practice. Sentiment mining, also referred to as
sentiment analysis, plays a major role in data mining. This field helps to analyze and classify the opinions of users. In this paper we discuss
various techniques, applications and challenges of sentiment mining.
Keywords- Information Retrieval, Text Summarizing, Data Mining, Sentiment Mining, Sentiment Analysis



The electronic world has changed the way individuals express
their feelings. Major leading industries use written customer
reviews and comments for business intelligence [1]. Products
from a large number of retailers are available for
e-shopping. Shopping sites like amazon, flipkart,
snapdeal and myntra let customers write their opinions about
different features of a product [2]. This enormous corpus of
reviews plays a very important role in competitive
intelligence and gives direction to consumers as well as
retailers. Consumers use this information for smart
purchasing, and retailers use it to find
the pitfalls in their products and improve quality, to find
out the current requirements of the market and to adapt to
changes in the market [3].

Figure 1. Various Features of Smartphone

Before purchasing any goods, users find out how others have
responded to the product. For example, a particular smartphone
contains various features like MP3, Bluetooth, calendar, alarm,
browser, Wi-Fi, etc., as shown in Figure 1. Some aspects
arguably play a more important role than others; these
carry greater weightage for general users taking a decision
and for retailers in their future development plans [4]. For
example, among the aspects of a smartphone,
"Battery" and "Browsing speed" have more importance than
"alarm" and "calendar". Similarly, for the product
Laptop, features like "Processing speed" and "Graphics
card" have more weightage than the feature "button".
Giving weightage to the important product aspects will therefore improve
sentiment analysis and benefit users as well as
merchant sites [5]. Customers will pay more attention while
purchasing any product and retailers will improve their service.
In practice, reading all the comments and then
identifying the important aspects is a lengthy and
time-consuming process. For analyzing the enormous amount of data
we need an automatic approach which processes all the data,
finds the aspects of the product and then classifies them into like,
dislike and neutral sentiment [6].
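
The effect of aspect weightage described above can be illustrated with a small sketch. It is not a method taken from the cited works: the weights, the per-aspect scores in the range [-1, 1], the threshold and the function names are all assumptions chosen for illustration. The sketch takes a weighted average of aspect sentiments and maps the result to like, dislike or neutral.

```python
from typing import Dict


def weighted_sentiment(aspect_scores: Dict[str, float],
                       weights: Dict[str, float]) -> float:
    """Weighted average of per-aspect scores; unknown aspects get weight 1.0."""
    total_weight = sum(weights.get(a, 1.0) for a in aspect_scores)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(score * weights.get(aspect, 1.0)
                       for aspect, score in aspect_scores.items())
    return weighted_sum / total_weight


def label(score: float, threshold: float = 0.1) -> str:
    """Map a score to the three classes used in the text."""
    if score > threshold:
        return "like"
    if score < -threshold:
        return "dislike"
    return "neutral"


# Hypothetical weights: battery and browsing speed matter more than alarm/calendar.
WEIGHTS = {"battery": 3.0, "browsing speed": 3.0, "alarm": 1.0, "calendar": 1.0}

# Hypothetical review: negative on the important aspects, positive on the minor ones.
review = {"battery": -1.0, "browsing speed": -1.0, "alarm": 1.0, "calendar": 1.0}

overall = weighted_sentiment(review, WEIGHTS)
unweighted = weighted_sentiment(review, {})
print(label(overall), label(unweighted))
```

With equal weights the review averages out to neutral; the higher weightage on battery and browsing speed moves it to dislike.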
The term sentiment mining is also referred to as opinion mining;
it supports the process of sentiment analysis and the
classification of sentiments. For ease of reading, the
remainder of this paper is organized as follows: Section 2
presents the origins of data for sentiment mining. Section 3
explains sentiment analysis, Section 4 covers different
sentiment classification techniques, Section 5 consists of
applications of sentiment mining, and Section 6 is about the
challenges of sentiment mining. The last section concludes
our study and discusses future research areas.

IJRITCC | November 2014, Available @


In today's world, the electronic text (E-text) available for
sentiment analysis is growing on a large scale. User reviews play a very
important role in the growth of a company and in improving its services
[7]. The data is available from different sources like social
networks, datasets, online portals, blogs, news portals, etc.
1] Social Network:
Social networks are a leading platform for individuals to express
their feelings. A social network is a huge network in which millions of people
can write their feelings and share them [8]. Various social
networking sites are available
which contain millions of sentiments.
2] Dataset: Most sentiment analysis is done on datasets
available at different sites. A movie review dataset is available, and
a dataset is also available for multi-domain sentiment analysis, as are datasets like KDD Cup and AWS (Amazon Web Services).
3] Online Portals:
Different online portals are available for smart purchasing.
Before dealing with any product, a customer reads the opinions
about that product and then decides where to invest his
money [10]. Different sites are available, like www.kno, which encourage users to write their feelings about a
product. Different sites available which contain reviews about
4] Blogs: A web log is popularly known as a blog. A blog is the
personal web page of its user, who is allowed to write his
likes, dislikes and feelings, give links to other sites, etc. Blogs contain
different kinds of sentiments, about social issues, political issues,
recipes, product services, debates, etc. [12]. Twitter is a
leading microblogging site on which a user is allowed to update a status
within a fixed length limit, called a tweet. Many researchers are
showing interest in analyzing Twitter datasets. Recently
in America, the winning chances of Barack Obama
were predicted by using tweets.
5] News Portals: Different news portals, like www.aajtak.com,
contain news articles and ask people for their opinion, which
is helpful for finding out the impact of particular news on their lives,
their political views, etc. [13].


Sentiment mining is a part of data mining which processes
electronic text and tags the words into three categories:
positive, negative and neutral [14]. Different techniques are
used for sentiment analysis, classification and summarization.
The techniques used for sentiment summarization are
data mining, text classification, information retrieval and
text summarization [15]. Figure 2 shows the general structure
of sentiment analysis. Sentiment analysis can be performed at
various levels: Phrase Level, Aspect Level,
Sentence Level, Document Level and Natural Language
Processing. The level of sentiment analysis is selected
depending upon the nature of use.
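
As a minimal sketch of the three-way tagging described above, each word can be looked up in a polarity list and the whole text can take the majority polarity. The two word lists below are tiny hand-made assumptions for illustration, not a published lexicon, and the function names are hypothetical.

```python
import re
from typing import List, Tuple

# Illustrative word lists (assumptions, not taken from any cited lexicon).
POSITIVE = {"good", "excellent", "easy", "great"}
NEGATIVE = {"bad", "poor", "slow", "worst"}


def tag_words(text: str) -> List[Tuple[str, str]]:
    """Tag every word as positive, negative or neutral."""
    tagged = []
    for token in re.findall(r"[a-z']+", text.lower()):
        if token in POSITIVE:
            tagged.append((token, "positive"))
        elif token in NEGATIVE:
            tagged.append((token, "negative"))
        else:
            tagged.append((token, "neutral"))
    return tagged


def classify(text: str) -> str:
    """Classify a text by comparing counts of positive and negative words."""
    tags = [tag for _, tag in tag_words(text)]
    score = tags.count("positive") - tags.count("negative")
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(classify("The mobile has excellent battery backup"))
```

Counting words in this way ignores negation and context, which is one reason the levels of analysis discussed in this section exist.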

Figure 2: Structure of Sentiment Analysis

1] Phrase Level: It works with the phrases of a document.
Words which are close to each other are called phrases.
Phrase level classification is performed on the phrases which
contain opinion words. Long-range dependencies cannot be
found at this level. Its usability depends upon the situation [16]. If the
opinion words in a sentence are too far apart, phrase level analysis
is not useful [17].
2] Aspect Level: Aspect level analysis is also called feature
based classification. Instead of studying the structure of the
sentence, it deals directly with the aspects in the text [18]. For
example, "Nokia internet speed is good but its battery drains out
early" contains two aspects, internet speed and battery life, of
the product Nokia. The opinion on the Nokia internet speed is
positive, but the opinion on its battery life is negative, so internet speed
and battery life are the targets. Based on this analysis, a structured
summary of sentiments about products and their features can
be produced [19].

3] Sentence Level: It works by finding the polarity of each
sentence. It considers the subjectivity and objectivity of the
sentence. Subjectivity is related to domain; one sentence is an
opinion about a single domain. If the sentence is too complex,
sentence level classification fails [20].
4] Document Level: It works with documents which
contain E-text. It finds the polarity of the document; the presence of a
positive phrase does not mean that the user likes
everything, and vice versa for a negative phrase [21]. Document
level analysis classifies the whole document. Because the sentiment of a
single document is considered, it does not work well with
news portal data or blogs. It is used with both supervised and
unsupervised learning algorithms [22].
5] Natural Language Processing: Natural Language
Processing works with the grammar of the sentence; on the
basis of grammar it searches for the nouns, adjectives, verbs, etc. It is
best suited for aspect level classification [23]. Consider the
example "The mobile has excellent battery backup." A reader
can identify that battery backup is a feature, but a machine
has to rely on the grammatical structure: going by the
grammar, the noun term is the feature of
the product. The demerit of NLP is that it fails if the sentences
contain grammatically incorrect words, but this can be overcome
by preprocessing the sentences [24].


Sentiment analysis and classification use various
techniques, which are mainly divided into two approaches:
Machine Learning and the Lexicon based approach. Each
technique is further subdivided. They are explained below:

Figure 3: Various techniques used in sentiment analysis

1] Machine Learning: Machine learning methodology consists
of supervised, unsupervised and semi-supervised categories.
Each category is further subdivided as shown in Figure 3.
a] Supervised Learning: It predicts attribute classes on the
basis of a given set of training values. It uses a training and a
testing dataset. The training dataset is smaller and contains the same
attributes as the testing dataset. It is more efficient and accurate
[25]. It includes different methods like SVM (Support
Vector Machine), NB (Naive Bayes), ME (Maximum Entropy),
Decision Tree, etc. [26]. The table below lists different models
which use supervised learning.

Table 1: Different Models Supervised Learning Method

b] Unsupervised Learning: Unlike supervised learning,
unsupervised learning does not need training data for
classification. For data classification it uses clustering
algorithms like K-Means clustering, Hierarchical Clustering,
etc. A neural network may be used for determining threshold
values of words, which are then classified depending upon
The table below lists different models which use unsupervised
learning.

Table 2: Different Models Unsupervised Learning Method

c] Semi-supervised Learning: It is the combination of
supervised learning and unsupervised learning. It combines
the accuracy of supervised learning with the word stability and
readability of unsupervised learning [28].
2] Lexicon Based Techniques:
Opinion words are categorized into positive and negative
senses. There are also sentiment phrases and idioms; together
these are called a sentiment lexicon. Sentiment lexicon approaches are mainly
categorized into three types: manual, dictionary based and
corpus based [29]. The manual approach is more time-consuming
than the other two, automated approaches. In the dictionary based
category, a tiny set of opinion words is gathered and then
grown by looking up their similar words
and opposite words in a word finder [30]. If new words are found they are
appended to the list, and the iteration starts again until no new
word is found. It will not work for domain and context specific
words. The corpus based category searches for opinion words in a
domain specific approach. It is categorized into:
1: Semantic Approach: It is used to search for co-occurrence
sequences and seed opinion words.
2: Statistical Approach: This approach gives sentiment values
that rely directly on various standards for computing the
synonym words [31].
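
The dictionary based growth of a seed list described above can be sketched as follows. This is only an illustration under stated assumptions: THESAURUS is a tiny hypothetical stand-in for a real word finder, and expand_lexicon is a name chosen here, not one from the cited works. Similar words inherit the polarity of the word they were found from, opposite words receive the other polarity, and the loop repeats until a pass adds no new word.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical toy thesaurus standing in for a real word finder.
THESAURUS: Dict[str, Dict[str, Set[str]]] = {
    "good": {"synonyms": {"fine", "great"}, "antonyms": {"bad"}},
    "great": {"synonyms": {"excellent"}, "antonyms": set()},
    "bad": {"synonyms": {"poor"}, "antonyms": {"good"}},
    "poor": {"synonyms": {"awful"}, "antonyms": set()},
}


def expand_lexicon(pos_seeds: Iterable[str],
                   neg_seeds: Iterable[str],
                   thesaurus: Dict[str, Dict[str, Set[str]]]
                   ) -> Tuple[Set[str], Set[str]]:
    """Grow positive and negative word sets until no new word is found."""
    positive, negative = set(pos_seeds), set(neg_seeds)
    changed = True
    while changed:
        changed = False
        # Process each polarity in turn; 'other' is the opposite polarity set.
        for same, other in ((positive, negative), (negative, positive)):
            for word in list(same):
                entry = thesaurus.get(word, {})
                # Only words not yet assigned to either set count as new.
                new_same = entry.get("synonyms", set()) - same - other
                new_other = entry.get("antonyms", set()) - same - other
                if new_same or new_other:
                    same |= new_same
                    other |= new_other
                    changed = True
    return positive, negative


positive, negative = expand_lexicon({"good"}, set(), THESAURUS)
print(sorted(positive), sorted(negative))
```

Starting from the single seed "good", the toy thesaurus yields a positive set and a negative set; words the thesaurus does not list are never reached, which mirrors the limitation noted above for domain and context specific words.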


In the field of opinion mining there are different areas which
excite researchers [32]. Natural Language Processing groups
are showing interest in sentiment and opinion mining.
The major areas which benefit are listed below.
1. Marketing Strategies: Different merchant sites sell their
products online. While purchasing products, users are motivated to
write their views on the product; by using these customer
views on products, retailers can design their marketing strategies.
2. Prediction: By analyzing user reviews on different sites
like blogging sites and social networks, weather and
share market prediction becomes possible.
3. Recommendation: It is possible to classify a particular user's
likes and dislikes based on his positive and negative reviews.
Once the interests of an individual are found, a recommendation system can
easily be implemented without asking the user for his personal
4. Business Intelligence: By analyzing customer opinions,
major retailers decide on business policies which
are flexible for users.
5. Voice of People: The voice of the people explains user
expectations and preferences in detail. Sentiment mining helps to find
out the current issues of a community and the changes the
community needs.
6. Government: A government can find out the merits and
demerits of its current policies by analyzing the reviews of
different people.
7. Building Policy: Using sentiment mining it is possible to
find the drawbacks of current rules and to build
new policies by removing the pitfalls [30].
8. Junk Detection: Nowadays the World Wide Web is easily
available to every person and anyone can upload
anything to the internet, so junk messages have increased. By
using sentiment mining it is easy to classify authenticated and spam data [31].
9. Calamity Detection: By constantly keeping track of news
groups, social networks and different community forums it becomes
easy to identify a calamity. Once the calamity is detected, it
becomes easier to save people's lives [32].
10. Politics: By analyzing people's opinions about different
politicians it is possible to predict who will win an
election before the actual result is declared. Different voting
prediction applications are available like
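
Several of the applications above, junk detection in particular, come down to sorting text into classes. As a hedged illustration, the sketch below trains a small multinomial Naive Bayes classifier (one of the supervised methods named in this survey) with add-one smoothing. The four training comments, the labels "authentic" and "spam", and the class name NaiveBayesText are assumptions made for this example; a real system would need far more data and preprocessing.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesText:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # number of documents per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        n_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_label in self.class_counts.items():
            total = sum(self.word_counts[label].values())
            score = math.log(n_label / n_docs)   # log prior of the class
            for word in doc.lower().split():
                if word in self.vocab:           # words never seen in training are skipped
                    count = self.word_counts[label][word]
                    score += math.log((count + 1) / (total + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical toy training data.
TRAIN = [
    ("battery life is good", "authentic"),
    ("camera is good and screen is bright", "authentic"),
    ("win free prize now", "spam"),
    ("click here free offer now", "spam"),
]

model = NaiveBayesText().fit([text for text, _ in TRAIN],
                             [label for _, label in TRAIN])
print(model.predict("free prize offer"))
print(model.predict("screen and battery is good"))
```

Working in log space avoids numeric underflow when many word probabilities are multiplied, and add-one smoothing keeps a word that is absent from one class from zeroing out that class entirely.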




Opinion mining is an ongoing research field. It works with
electronic text, and when we work with E-text for sentiment
mining different problems arise [33]. These
problems are listed below.
[a] Consider one user's comment, "Keypad is easy to use", and
another user's comment, "Keyboard using is easy". Both
users review the same feature using different
synonymous words, which is a challenge in sentiment mining.
[b] It is necessary to identify the pragmatics of the customer, which
sometimes changes the overall opinion. Consider the
example given below:
Once I have watched the show of "DESTROY"
The ending destroy me completely
Here the first statement denotes positive sentiment and the second
negative sentiment, so this is a challenging task [35].
[c] Nouns are considered as features, but in some
cases verbs and adjectives may also act as features, so identifying
and classifying aspects may become complicated [36].
[d] NLP runs into pitfalls if the text contains co-references,
grammatical mistakes or ambiguity, which may affect
sentiment mining [37].
[e] User opinions, responses, features and comments are written in
different languages like Chinese, German, Marathi, Urdu, etc.
Identifying and classifying sentiments in each language is a
challenging task [38].
[f] Electronic text carries two types of text, authenticated and spam,
so different sentiment classification methodologies are used to
identify and remove spam text before the text is processed for
further analysis. This is done by identifying redundancy and
removing outliers [39].
[g] As electronic texting increases, people use shortcuts of
language while typing comments or reviews in free text format.
They may use alphanumeric abbreviations, for example
ni8 for night, 2marrow for tomorrow, fi9 for fine and so on.
Finding such slang words is therefore a challenging task for
sentiment mining [40].
[h] Recognizing the polarity of a sentence is a difficult task in
sentiment analysis, because some words behave differently in different
situations [41]. In "The camera resolution is less",
less carries negative polarity, but in "Price
of mobile is less", less works as a positive
adjective [42].
[i] Sentiment analysis is mainly domain dependent;
that is, one feature may work well on one product but will not
give results for another product [43].
[j] Different social networks, mainly Twitter, have sentiments
which change over time. These situations and changes
need to be detected [44].
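
Challenges [a] and [g] both concern different surface forms of the same word. One simple way to approach them, sketched here under stated assumptions, is a normalization step before analysis. The slang table uses only the abbreviations mentioned above; mapping "keyboard" to "keypad" is a hypothetical choice of canonical aspect name, and real text would need much larger, curated tables.

```python
import re
from typing import List

# Abbreviations taken from the examples in the text.
SLANG = {"ni8": "night", "2marrow": "tomorrow", "fi9": "fine"}

# Hypothetical aspect synonym table: both words name the same feature.
ASPECT_SYNONYMS = {"keyboard": "keypad"}


def normalize(text: str) -> List[str]:
    """Lowercase, tokenize, expand slang and map aspect synonyms to one name."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    normalized = []
    for token in tokens:
        token = SLANG.get(token, token)            # expand alphanumeric slang
        token = ASPECT_SYNONYMS.get(token, token)  # unify synonymous aspect names
        normalized.append(token)
    return normalized


print(normalize("Good ni8, see you 2marrow"))
print(normalize("Keyboard using is easy"))
```

After this step, "Keypad is easy to use" and "Keyboard using is easy" both mention the aspect keypad, so their opinions can be counted against the same feature.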


In this paper we present a survey of sentiment mining which
includes the different origins of data used in sentiment
mining, various methodologies like Sentiment Analysis and NLP
(Natural Language Processing), and the applications and
challenges of sentiment mining. Sentiment mining is an ongoing
research field of data mining which is used to separate
useful information from countless user reviews,
comments, feedback and responses on any statement, post or
service. Sentiment extraction is a useful tool to identify and
predict current and future trends, survey any product, and gauge
people's opinions on social issues and the effect of any event on users.
Large organizations like TCS, SAS and SAP are
using sentiment analysis for their business intelligence. Most
of the work in the field has been done on sentence level and document level
sentiment analysis, but many major problems and
challenges remain, like grammatical mistakes in English,
fake reviews and single words with different meanings. Recently
there has also been a change in how feelings are expressed: users write
comments in the form of smileys and images. Sentiment
mining will remain an ongoing research field for present and
future researchers.

This review has been partially supported by the Department of
Computer Science and IT, Dr. Babasaheb Ambedkar
Marathwada University, Aurangabad. The views expressed
here are those of the authors only.






Zheng-Jun Zha, Member, IEEE, Jianxing Yu, Jinhui

Tang, Member, IEEE,Meng Wang, Member, IEEE,
And Tat-Seng Chua,"Product Aspect Ranking And Its
Applications",IEEE Transactions On Knowledge And
Data Engineering, Vol. 26, No. 5, May 2014
Singh and Vivek Kumar, A clustering and opinion
mining approach to socio-political analysis of the
blogosphere. Computational Intelligence and
Computing Research (ICCIC), 2010 IEEE
M Fan, G WU Opinion Summarization of Customer
comments International conference on Applied
Physics and Industrial Engineering in 2012.
B. B. Khairullah Khan, Aurangzeb Khan, Sentence
based sentiment classification from online customer
reviews, ACM, 2010.
Ayesha Rashid, Naveed Anwer, Dr. Muddaser Iqbal,
Dr. Muhammad Sher, A Survey Paper: Areas,
Techniques and Challenges of Opinion Mining,
IJCSI International Journal of Computer Science
Issues, Vol. 10, Issue 6, No 2, November 2013
Liu, B. (2010), Sentiment Analysis and
Subjectivity. Appeared in Handbook of Natural

Language Processing, Indurkhya, N. & Damerau, F.J.

Bing Liu, Minqing Hu, Junsheng Cheng, "Opinion
Observer: Analyzing and Comparing Opinions on the
Web", WWW 2005, May 10-14, 2005, Chiba,
Japan, ACM 1-59593-046-9/05/0005.
Zhang, Z. and B. Varadarajan. Utility scoring of
product reviews. In Proceedings of ACM
International Conference on Information and
Knowledge Management (CIKM-2006), 2006.
Pang, B., Lee, L., Vaithyanathan, S.: Thumbs up?
Sentiment classification using Machine Learning
techniques. In: Proc. of CoRR (2002)
Nasukawa, T. and Yi, J. 2003. Sentiment analysis:
Capturing favorability using natural language
processing. Proceedings of the 2nd Intl. Conf. on
Knowledge Capture (K-CA2003).
Camelin, N., Damnati, G., Béchet, F. and De Mori,
R., Opinion Mining in a Telephone Survey Corpus,
International Conference on Spoken Language
Processing in 2006.
Mohammad S, Dunne C, Dorr B. Generating high-coverage semantic orientation lexicons from overtly
marked words and a thesaurus. In: Proceedings of the
conference on Empirical Methods in Natural
Language Processing (EMNLP09);2009.
Bing Liu, Sentiment Analysis and Opinion Mining,
Morgan and Claypool Publishers, May 2012, p. 18-19, 27-28, 44-45, 47, 90-101.
Kim S, Hovy E. Determining the sentiment of
opinions. In: Proceedings of international conference
on Computational Linguistics (COLING04); 2004.
J. Han and M. Kamber, Data Mining: Concepts and
Techniques, third edition Publication Date: July 6,
Harb, A., Plantié, M., Dray, G., Roche, M., Trousset,
F. and Poncelet, P., Web Opinion Mining: How to
extract opinions from blogs?, International
Conference on Soft Computing as Transdisciplinary
Science in 2008.
Walaa Medhat a, Ahmed Hassan, Hoda
applications:A survey, in Shams Engineering
Journal (2014)
Haseena Rahmath P,"Opinion Mining and Sentiment
International Journal of Application or Innovation in
Engineering & Management (IJAIEM), Volume 3,
Issue 5, May 2014
Boiy, E., Hens, P., Deschacht, K. & Moens, M.-F.
(2007), Automatic Sentiment Analysis in OnLine
Text. In Proceedings of the Conference on
Electronic Publishing (ELPUB-2007), p. 349-360.













M. Baroni and S. Vegnaduzzo Identifying subjective

adjectives through web-based mutual information. In
E. Buchberger, editor, In Proceedings of the
Conference for the Processing of Natural Language
and Speech KONVENS), pages 17--24, in 2004.
Liu, B. (2010), Sentiment Analysis and
Subjectivity. Appeared in Handbook of Natural
Language Processing, Indurkhya, N. & Damerau, F.J.
A. Ghose and P. G. Ipeirotis,Estimating the
helpfulness and economic impact of product reviews:
Mining text and reviewer characteristics, IEEE
Trans. Knowl. Data Eng., vol. 23, no. 10, pp. 1498-1512,
Sept. 2010.
N. M. Shelke, S. Deshpande and V. Thakre Survey
of Techniques for Opinion Mining International
Journal of Computer Applications (0975-8887)
Volume 57 No.13, November 2012.
Neha S. Joshi, Suhasini A. Itkat,"A Survey on
International Journal of Computer Science and
Information Technologies, Vol. 5 (4), 2014, 5422-5425
C.-p. W. C.C.Yang, Y.C. Wong, Classifying web
review opinions for consumer product analysis,
ICEC09, ACM, 2009.
Hu, M. & Liu, B. (2004), Mining and Summarizing
Customer Reviews. In Proceedings of ACM
SIGKDD Conference on Knowledge Discovery and
Data Mining 2004 (KDD-2004), p. 168-177.
Pang, B. & Lee, L. (2008), Opinion Mining and
Sentiment Analysis. In Foundations and Trends in
Information Retrieval 2 (1-2), p. 1-135.
Ayesha Rashid, Naveed Anwer, Dr. Muddaser Iqbal,
Dr. Muhammad Sher,"A Survey Paper: Areas,
Techniques and Challenges of Opinion Mining",
IJCSI International Journal of Computer Science
Issues, Vol. 10, Issue 6, No 2, November 2013
Daille, B. 1996. Study and Implementation of
Combined Techniques for Automatic Extraction of
Terminology. The Balancing Act: Combining
Symbolic and Statistical Approaches to Language.
MIT Press, Cambridge
Somasundaran S, Wiebe J. Recognizing stances in
online debates. In: Proceedings of the joint
conference of the 47th annual meeting of the ACL and
the 4th international joint conference on natural
language processing of the AFNLP; 2009. p. 226-34
Bo Pang and Lillian Lee,"Opinion Mining and
Sentiment Analysis"book.
Boiy, E., Hens, P., Deschacht, K. & Moens, M.-F.
(2007), Automatic Sentiment Analysis in OnLine

Text. In Proceedings of the Conference on

Electronic Publishing (ELPUB-2007), p. 349-360.
Sentiment Analysis with Topic-Based Mixture
Proceedings of the 52nd Annual Meeting of the
Association for Computational Linguistics (Short
Papers), pages 434-439, Baltimore, Maryland, USA,
June 23-25 2014.
M.Govindarajan, Romina M,"A Survey of
Classification Methods and Applications for
Sentiment Analysis ",The International Journal Of
Engineering And Science (IJES), Volume 2, Issue 12,
Pages 11-15, 2013, ISSN(e): 2319-1813,
ISSN(p): 2319-1805
Dave K., Lawrence, S. & Pennock, D.M. (2003),
Mining the Peanut Gallery: Opinion Extraction and
Semantic Classification of Product Reviews. In
Proceedings of the 12th International Conference on
World Wide Web, p. 519- 528.
MC Wu, YF Lo and SH Hsu, A fuzzy CBR
technique for generating product ideas Expert
Systems with Applications, 34 (1), pg. 530-540
January 2008.
Turney, Peter D. 2002. Thumbs Up or Thumbs
Unsupervised, Classification of Reviews. Proceedings
of the 40th Annual Meeting of the Association for
Computational Linguistics (ACL'02), Philadelphia,
Pennsylvania, USA, July 8-10, 2002. Pp 417-424.
NRC 44946.
Sowmya Kamath S, Anusha Bagalkotkar, Ashesh
Khandelwal, Shivam Pandey, Kumari Poornima,
Sentiment Analysis Based Approaches for
Understanding User Context in Web Content, 978-0-7695-4958-3/13, 2013 IEEE.
Bing Liu,"Sentiment Analysis and Opinion Mining",
Raisa Varghese, Jayasree M,"A SURVEY ON
MINING", and IJRET: International Journal of
Research in Engineering and Technology eISSN:
2319-1163 | pISSN: 2321-7308
Keisuke Mizumoto, Hidekazu Yanagimoto and
Michifumi Yoshioka, Sentiment Analysis of Stock
Market News With Semi-supervised Learning, IEEE
Computer Society, IEEE/ACIS 11th International
Conference on Computer and Information Science,
Segaran, T. (2007), Programming Collective
Intelligence. Sebastopol: O'Reilly Media, Inc.
T. Yao and L. Li A Kernel-based Sentiment
Classification Approach for Chinese Sentences


World Congress on Computer Science and

Information Engineering in March 31 2009-April 2
N, Anwer and A, Rashid Feature Based Opinion
Mining of Online Free Format Customer Reviews
Using Frequency Distribution and Bayesian
Statistics Networked Computing and Advanced
Information Management (NCM), 2010 Sixth
International Conference on 16-18 Aug. 2010.
