Web Site: www.ijettcs.org Email: editor@ijettcs.org, editorijettcs@gmail.com Volume 3, Issue 1, January February 2014 ISSN 2278-6856
Lovely Professional University, School of technology and sciences, NH-1, Punjab, India
Data Mining,
1. INTRODUCTION
Data mining is the extraction of useful knowledge from data, which can then be used by individuals, experts, organizations, etc. Online social networking sites are emerging on which individuals can write content, change information, and establish online relationships. The blogosphere is a main example of an online social networking site. A user can create a blog on which to publish their own writing. Through blog services, users can develop relationships with other users online. Anyone can create a blog, and a blog can help a business grow: if a blog becomes popular, its rating rises and its owner earns money, which in turn improves the business. A blog network is a social network that is formed through blogs and the relationships among them (e.g. facebook.com). Under this concept, the activities of other users can be measured, and influential users can be identified from them. In the business domain, finding the number of influenced users is valuable in financial terms. In earlier days, blogs operated within small user groups, and conversation centered on a single subject. At present, multiple groups of users can operate many blogs, each of which may be written by many authors. Many authors place their material on a particular blog, where multiple users can act on it through likes, dislikes, comments, etc. How a person responds after reading a blog post varies from person to person.
Blogs offer simple links to other blogs, web pages, etc. [7]. In a blog environment, authors can write their own content and later change or modify it. A blog is an online diary in which anyone can comment on or discuss any issue; topics range from, for example, music, products, politics, and celebrities to matters of purely personal interest [1]. A significant social part of the blogosphere is the blogroll [6]. Bookmarks point to many blogs; users have their own blogs and also make friends through blogs. Users can express their opinions on the web. The blogosphere is the collection of all blogs: any person can create a blog on any topic, and other users can then rate that blog by liking it. The number of likes indicates whether a particular blog contains valuable documents. Users can also comment on and share a blog, which raises its rating and makes it possible to identify influential users. An influential user has an impact on other, normal users, leading them to like the blog. The content of a blog must be relevant or interesting in order to influence users; otherwise no one will like it. The main objective of this paper is to identify the direct and indirect influential users by means of data mining techniques.
2. PREVIOUS WORK
Statistical analysis has been used to show the relation between blog content and user engagement. Many organizations understand the importance of social media in positioning, spreading, and advertising their goods and services [2]. Hundreds of thousands of people are active on social media, and organizations create their own blogs to support their products. A post-processing technique has also been applied to blog clusters, based on users' comments on blogs. Clusters based on ties between commentators differ considerably from content clusters. However, the quality of content clusters can be improved by using implicit ties between commentators that do not fit into a single cluster [4]. A new concept, QIM, was also introduced to calculate the influence score of bloggers [5]. QIM is a combination of two components: the interaction between bloggers and the readers who like the blog, and the level of information
3. CURRENT WORK
3.1 Scope of the study
The purpose of the CPU (Content Power User) approach is to establish a blog network that combines both bloggers and their actions. A blog network is constructed by capturing the interactions between users and blogs that carry particular impact, in the following steps:
(1) First, it is based on users' different behaviors and bookmarks. A user's action on another user's topic can be measured when that user is influenced by the other user's blog. The actions a user can perform on a blog, such as likes, comments, and dislikes, are therefore good features for measuring the CPU in a blog network, that is, for determining which users are influenced by which other users over the blog network.
(2) It proposes a scheme to analyze the content of a document. The document matters because it is the content of a blog that attracts users. A user is attracted to a specific blog only if its documents have some impact on that user. As a result, the content of a document can be assessed from direct and indirect user behavior.
(3) The content power of a user can be calculated by simply summing the valuable content that the user has published on the blog. The more publicity a document has, the more easily users are attracted to the specific blog.
(4) Finally, it identifies the CPUs in a blog network by selecting the top n users ranked by the content power of their documents.
3.2 Methodology Used
The research is based on a content mining technique in which data is collected from various sources. Applying data mining techniques to a blog yields useful results, such as identifying influential users, dormant (inactive) users, and active users. First, the users are classified based on some features and then clustered, which
Figure 1 Methodology
Doing this yields the influential users, who have an impact on the normal users of a blog and thereby increase the number of users who like documents in the blog. The method uses DCP, defined as the content power of a document. Similarly, UCP is defined as the content power of a user. AT represents the types of actions (i.e., comment, trackback, and scrap) that a user can perform in a blog network. When computing the content power of a document, different weights may be assigned depending on the type of action. Once a threshold value is provided for evaluation, the actual mining of high-utility and average-utility documents and users begins.
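The paper gives no explicit formula for DCP or UCP, so the following is only a minimal sketch of one plausible reading: DCP is taken as the weighted count of the actions (AT) a document receives, and UCP as the sum of the DCP values of a user's documents. The weight values, the threshold values, the "low" label, and all function and variable names are assumptions made for illustration and do not come from the paper.

```python
from collections import defaultdict

# Hypothetical weights per action type (AT); the paper only says that
# weights "may be assigned" and gives no values.
ACTION_WEIGHTS = {"comment": 1.0, "trackback": 2.0, "scrap": 3.0}


def document_content_power(actions, weights=ACTION_WEIGHTS):
    """DCP (assumed form): weighted count of actions a document received."""
    return sum(weights[action] for action in actions)


def user_content_power(docs_by_author, actions_by_doc, weights=ACTION_WEIGHTS):
    """UCP (assumed form): sum of the DCP of every document an author wrote."""
    ucp = defaultdict(float)
    for author, docs in docs_by_author.items():
        for doc in docs:
            ucp[author] += document_content_power(
                actions_by_doc.get(doc, []), weights
            )
    return dict(ucp)


def top_n_users(ucp, n):
    """Select the n users with the highest UCP (ties broken by name)."""
    return sorted(ucp, key=lambda user: (-ucp[user], user))[:n]


def utility_label(score, high_threshold, average_threshold):
    """Label a score against two thresholds; 'low' is an assumed third class."""
    if score >= high_threshold:
        return "high"
    if score >= average_threshold:
        return "average"
    return "low"


# Toy data, invented for this example.
docs_by_author = {"alice": ["d1", "d2"], "bob": ["d3"], "carol": ["d4"]}
actions_by_doc = {
    "d1": ["comment", "scrap"],
    "d2": ["trackback"],
    "d3": ["comment", "comment"],
    "d4": [],
}

ucp = user_content_power(docs_by_author, actions_by_doc)
cpus = top_n_users(ucp, n=2)
labels = {user: utility_label(score, 5.0, 2.0) for user, score in ucp.items()}
```

With this toy data, alice's two documents score 4 and 2, so she ranks above bob (2) and carol (0). Changing the weights or thresholds changes the ranking and labels, which is why those values would need to be chosen for the data set at hand.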
4. CONCLUSION
This paper combines two data mining techniques that have been proposed for identifying the influential users, and also the influential documents, in a blog network. The method used in this research is user friendly. Data mining means extracting, or mining, knowledge from huge amounts of data [3]. This research uses two very popular data mining techniques, classification and clustering, to classify the direct and indirect influential users based on some features. Through this method, a blog owner can easily identify which users are useful for their blog and which are not. On the basis of some threshold values, decision makers can efficiently obtain the utility users and make decisions. This research may be extended for users who add further features that improve blog quality and, in turn, improve business. Future work aims to
References
[1] Bharati M. Ramageri, Data Mining Techniques and Applications, Indian Journal of Computer Science and Engineering, Vol. 1, No. 4, 301-305.
[2] Apoorva Vikrant Kulkarni, Blog Content and User Engagement - An Insight Using Statistical Analysis, International Journal of Engineering and Technology (IJET), Vol. 5, No. 3, Jun-Jul 2013.
[3] Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, University of Illinois at Urbana-Champaign, 2nd edition.
[4] Tomas Kuzar, Slovak Blog Clustering Enhanced by Mining the Web Comments, IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, 2011.
[5] Eunyoung Moon, A Quality Method to Find Influencers Using Similarity-Based Approach in the Blogosphere, IEEE International Conference on Social Computing/IEEE International Conference on Privacy, Security, Risk and Trust, 2010.
[6] Malik Muhammad Saad Missen, Opinion Detection in Blogs: What is Still Missing?, International Conference on Advances in Social Networks Analysis and Mining, 2010.
[7] Hsiu-Ju Chen, Bloggers Social Presence Framing and Blog Visitors Responses, IEEE Computer Society, Eighth IEEE/ACIS International Conference on Computer and Information Science, 2009.
[8] Malik Muhammad Saad Missen, Sentence-Level Opinion-Topic Association for Opinion Detection in Blogs, International Conference on Advanced Information Networking and Applications Workshops, 2009.
AUTHOR
Azaim Khan received the B.Sc and M.Sc degrees in Computer Science and Information Technology from Rani Durgawati University and Makhanlal University, Jabalpur, in 2005 and 2008 respectively. She is pursuing an M.Tech in Computer Science and Engineering from Lovely Professional University, Phagwara in 2012.
Richa Sapra received the B.Tech and M.Tech in Information Technology and Computer Science from Guru Nanak Dev Engg. College, Ludhiana in 2007 and Lovely Professional University, Phagwara in 2012 respectively. Since 2012, she has been working at Lovely Professional University, Phagwara as an Assistant Professor.