Data Mining Techniques and Applications in Different Domains

Abstract

1. Introduction
Data mining is an analytic process designed to explore large volumes of data in search of consistent patterns. It is also described as knowledge discovery from databases: the automatic extraction of novel, useful and understandable patterns from a large collection of data, and a form of secondary analysis of large databases. The process consists of the following sequence of steps [1][2]:
- data cleaning, to remove noise and irrelevant data
- data integration, where multiple data sources are combined
- data selection, to retrieve from the database only the data relevant to the analysis
- data transformation, where data are transformed or consolidated into forms appropriate for mining
- data mining, the phase in which algorithms are applied to extract data patterns
- pattern evaluation, to identify the interesting patterns that represent new knowledge
- knowledge presentation, where visualization techniques are used to present the mined knowledge to the user
To ensure that the information extracted by the data mining algorithms is useful, additional activities are required, such as incorporating appropriate prior knowledge and properly interpreting the data mining results [1]. The knowledge discovery process includes data selection, data cleansing, enrichment, data transformation, data mining and representation of the discovered information [3]. The standard architecture of the complete process of knowledge discovery in databases is shown in Fig. 1. In contrast with the general architecture, however, we use a homogeneous set of data as input to the preprocessing step. Data management concerns the specific mechanisms and structures for data storage and access, data preprocessing ensures data quality, data mining tasks apply various algorithms to the data, and the post-processing step refines and evaluates the knowledge discovered by the data mining procedure [4].
[Figure: Databases, Data Cleaning, Data Integration, Data Warehouse, Preprocessing, Preprocessed Data, Data Mining, Mined Data, Pattern Evaluation, Discovered Knowledge]
Fig. 1. Knowledge discovery in databases
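The sequence of steps listed in the introduction can be sketched as a chain of small functions. This is only an illustrative sketch, not the architecture of Fig. 1: the record layout, the function names (`clean`, `integrate`, `select`, `transform`, `mine`, `run_pipeline`) and the threshold-based "mining" step are all assumptions made for the example.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[float]]  # hypothetical record layout


def clean(records: List[Record]) -> List[Record]:
    """Data cleaning: drop records that contain missing values."""
    return [r for r in records if all(v is not None for v in r.values())]


def integrate(*sources: List[Record]) -> List[Record]:
    """Data integration: combine several sources into one collection."""
    merged: List[Record] = []
    for source in sources:
        merged.extend(source)
    return merged


def select(records: List[Record], fields: List[str]) -> List[Record]:
    """Data selection: keep only the fields relevant to the analysis."""
    return [{f: r[f] for f in fields} for r in records]


def transform(records: List[Record], field: str) -> List[float]:
    """Data transformation: min-max scale one field into [0, 1]."""
    values = [r[field] for r in records]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant data
    return [(v - lo) / span for v in values]


def mine(values: List[float], threshold: float = 0.5) -> List[int]:
    """Data mining (toy stand-in): indices of values above a threshold."""
    return [i for i, v in enumerate(values) if v > threshold]


def run_pipeline(sources: List[List[Record]], fields: List[str], field: str) -> List[int]:
    # Order follows the step list: clean each source, integrate,
    # select, transform, then mine.
    records = integrate(*(clean(s) for s in sources))
    records = select(records, fields)
    return mine(transform(records, field))
```

Pattern evaluation and knowledge presentation are left out of the sketch; in practice they would consume the output of `mine`.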
The rest of the paper is organized as follows. Section 2 discusses the various data mining aspects, including video mining and the state-of-the-art techniques in moving object detection and tracking (Section 2.5). Section 3 describes some future directions for data mining on videos. Section 4 concludes the paper.
2. Thrust Areas in Data Mining
Data mining involves six common classes of tasks: anomaly detection, used for outlier, change and deviation detection; association rule learning, for dependency modeling; clustering, which discovers groups and structures in the data without using known structures; classification, which generalizes a known structure to apply it to new data; regression, which finds a function that models the data with the least error; and summarization, which gives a compact representation of the data set [5]. Based on these major classes of tasks, data mining techniques have different applications in different areas, such as text and multimedia mining, which are discussed in the later sections of this paper.
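As one concrete instance of these task classes, regression "with the least error" can be illustrated by an ordinary least-squares fit of a straight line. The sketch below assumes a single input variable and squared error as the error measure; the function name `fit_line` is introduced here only for illustration.

```python
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Return (slope, intercept) minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form solution of simple linear regression.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

For points lying exactly on y = 2x + 1, the fit recovers a slope of 2 and an intercept of 1 up to floating-point rounding.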
2.1 Speech Mining
Speech mining refers to automatic methods of analyzing speech to extract useful information. This information may include the topic being discussed, the identity and gender of the speaker, the emotions in the speech, and background noise or silence. The aim is to identify valid, novel, potentially useful and ultimately understandable patterns in speech data. A major challenge for speech recognition tools has been recognizing the speech of different users in different environments, and speech recognition and classification is a classic speech mining problem. There are two common methods for extracting useful information from speech. The first is Large Vocabulary Continuous Speech Recognition (LVCSR), or text-based indexing, in which speech is converted to text and words are then identified against a dictionary that can contain several hundred thousand entries. If a word or name is not in the dictionary, the LVCSR system chooses the most similar word it can find. The second method is phoneme based: the basic unit of operation is the phoneme. Instead of converting the speech into text, it works on phonemes, of which each language has a very limited number. The phoneme-based approach uses either grammar or keyword spotting for recognition. The applications of speech mining are widespread and include user identification in ATM card operations, railway IVR systems, many other civilian applications as a soft-biometric method, and telephone speech recognition systems [6].
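Phoneme-based keyword spotting can be pictured as searching a stream of recognized phonemes for the phoneme sequence of a keyword. The sketch below is a deliberately simplified, exact-match version: the lexicon, the phoneme symbols and the function name `spot_keyword` are hypothetical, and a real system would score acoustic similarity rather than require identical symbols.

```python
from typing import Dict, List

# Hypothetical pronunciation lexicon; the phoneme symbols are illustrative only.
LEXICON: Dict[str, List[str]] = {
    "ticket": ["t", "ih", "k", "ah", "t"],
    "train": ["t", "r", "ey", "n"],
}


def spot_keyword(phonemes: List[str], keyword: str) -> List[int]:
    """Return the start indices where the keyword's phonemes occur in the stream."""
    target = LEXICON[keyword]
    width = len(target)
    return [
        i
        for i in range(len(phonemes) - width + 1)
        if phonemes[i:i + width] == target
    ]
```

Because the search runs on phonemes rather than on dictionary words, adding a new keyword only requires a new lexicon entry, not a new recognition pass over the audio.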
2.2 Text Mining
Text mining is the process of collecting meaningful information from natural language text by analyzing the text to extract information for specific purposes. The unstructured nature of text makes it difficult to deal with algorithmically. The major steps in text mining are information extraction, topic tracking, keyword extraction, summarization, categorization, clustering, concept linkage, information visualization and query answering. Text mining applications are most often used in sectors such as publishing and media, telecommunications, energy and other service industries, information technology and the internet, banking, insurance and financial markets, political institutions and political analysis, public administration and legal documents, pharmaceutical and research companies, and healthcare [7].
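Of the steps listed above, keyword extraction is the easiest to illustrate. A minimal sketch, assuming that keywords are simply the most frequent words left after removing a small stop-word list; the stop-word list and the function name `extract_keywords` are assumptions for the example, and this is not the method surveyed in [7].

```python
import re
from collections import Counter
from typing import List, Tuple

# Small illustrative stop-word list; real systems use much larger ones.
STOP_WORDS = {"the", "is", "of", "and", "a", "to", "in", "from", "by"}


def extract_keywords(text: str, top_n: int = 3) -> List[Tuple[str, int]]:
    """Return the top_n most frequent non-stop-words with their counts."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(top_n)
```

Frequency alone favors common domain words; weighting schemes such as TF-IDF are a usual refinement once a collection of documents is available.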
2.3 Web Mining
Web mining discovers useful information or knowledge from the web hyperlink structure, page content and usage data. Web data is heterogeneous and semi-structured or unstructured, so web mining is not purely an application of traditional data mining, and many new mining methods and algorithms have been introduced for it. Based on the type of data, there are three types: web structure mining, web content mining and web usage mining. Web structure mining discovers useful knowledge from hyperlinks, which normally represent the structure of the web. Hyperlinks make it possible to discover important web pages, a key technology used by search engines, and to discover user communities that share common interests. Web content mining discovers knowledge from the contents of web pages. It enables automatic classification and clustering of web pages by topic, as well as extraction of knowledge from product descriptions, forum postings and similar content. Similarly, consumer sentiment can be mined from customer reviews and forum postings. Web usage mining discovers user access patterns from usage logs, which record every click made by each user [8].
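A well-known example of finding important pages from hyperlinks is PageRank, which ranks a page highly when highly ranked pages link to it. The sketch below is a plain power-iteration version on a small link graph; it is offered as one illustration of web structure mining, not as the specific method described in [8]. The damping factor of 0.85 and the iteration count are conventional choices, not values from the source.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]], damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Compute PageRank scores for a graph given as page -> linked pages."""
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            if targets:
                # Each page passes an equal share of its rank to its targets.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank
```

In a graph where pages `a` and `b` both link to `c`, and `c` links back to `a`, page `c` receives the highest score because it collects rank from two sources.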
2.4 Multimedia Mining
Multimedia mining is the discovery of knowledge from large amounts of different types of multimedia data. It involves the extraction of implicit knowledge, multimedia data relationships, or other patterns not explicitly stored in multimedia databases. Multimedia data consists of text, images, audio and video. Text mining has already been discussed in Section 2.2. Multimedia data is very large in volume, and so is the amount of information hidden in it. This data can be mined to extract the hidden information that is useful. Combinations of multimedia data can also be used for information extraction; spatial and spatio-temporal mining are examples. Multimedia mining therefore includes image, audio and video mining. Image processing techniques usually concentrate on finding abnormal patterns and on image retrieval, whereas image mining focuses on the discovery of abnormal patterns and on image searching. Image mining can also be used to discover relations within images. It is very difficult to relate multimedia mining to traditional mining. Audio mining techniques are similar to speech mining: the input can be either continuous speech or spoken words, and the process has already been discussed in Section 2.1. In this paper the discussion concentrates on video mining [9].
2.5. Video Mining
Video mining is the process of unsupervised discovery of patterns from video. Mining video data is a complicated process. Video is normally considered a collection of related still images with a time factor, but it is more than that, with an abundance of hidden information [10]. Researchers have proposed many methods for video mining applications in different contexts. Some of the major application areas are video mining in digital libraries, query and retrieval for videos, facial expression and behavior analysis of customers, traffic video surveillance, detection of crowd patterns for mob control, detection of traffic patterns for traffic management, moving object detection and tracking, and automated analysis of events and suspicious activity, such as suspicious people in large crowds or abandoned objects in public places. In this article the discussion focuses on the state of the art in the detection and tracking of moving objects.
2.5.1. Moving Object Detection and Tracking
Moving object detection and tracking is an important step in computer vision and video analysis, with applications that include motion-based recognition, automated surveillance, traffic monitoring and vehicle navigation. The key steps in the process are detection of the moving objects of interest, frame-to-frame tracking of the detected objects, and post-processing of the tracked objects to analyze their behavior. Object representation has a vital role in object detection and tracking. According to A. Yilmaz et al., different shape and appearance representations can be used. The major shape representations are points, primitive geometric shapes, object silhouettes and contours, articulated shape models and skeletal models.
The major appearance representations are probability densities of object appearance, templates, active appearance models and multiview appearance models. The common visual features that can be used for tracking are color, edges, optical flow and texture [11].
A. Yilmaz et al. summarized the popular object detection methods in the context of object tracking, classifying them as point detectors, segmentation, background modeling and supervised classifiers. The point detectors described are Moravec's detector [Moravec 1979], the Harris detector [Harris and Stephens 1988], the Scale Invariant Feature Transform [Lowe 2004] and the Affine Invariant Point Detector [Mikolajczyk and Schmid 2002]. The segmentation methods are mean-shift [Comaniciu and Meer 1999], graph-cut [Shi and Malik 2000] and active contours [Caselles et al. 1995]. In background modeling, the methods used are Mixture of Gaussians [Stauffer and Grimson 2000], Eigenbackground [Oliver et al. 2000], Wallflower [Toyama et al. 1999] and dynamic texture background [Monnet et al. 2003]. The supervised classifiers are Support Vector Machines [Papageorgiou et al. 1998], neural networks [Rowley et al. 1998] and adaptive boosting [Viola et al. 2003] [11]. Apart from these, Kevin Murphy et al. proposed a method for object detection using local and global features; this approach can be used for detecting rigid objects [12]. Another approach, developed by Liming Wang et al., combines recognition and segmentation: it uses top-down recognition with bottom-up segmentation, with a hypothesis generation step and a verification stage [13]. X. H. Fang et al. designed a novel algorithm that uses a pixel and its neighbors as an image vector. For full spatial information, a combination of color segmentation and a background model is used. This method is good for quick moving object detection [14].
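The idea behind background modeling can be shown with the simplest member of the family, a running-average background with a fixed difference threshold. This is a much cruder model than the Mixture of Gaussians or the other methods cited above, and is not the algorithm of [14]; the grayscale frame layout, the learning rate `alpha`, the threshold value and the function names are assumptions made for the sketch.

```python
from typing import List

Frame = List[List[float]]  # grayscale image as rows of pixel intensities


def update_background(background: Frame, frame: Frame, alpha: float = 0.05) -> Frame:
    """Blend the new frame into the background with learning rate alpha."""
    return [
        [(1.0 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def foreground_mask(background: Frame, frame: Frame,
                    threshold: float = 30.0) -> List[List[bool]]:
    """Mark pixels whose intensity differs from the background by more than threshold."""
    return [
        [abs(f - b) > threshold for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]
```

For each incoming frame, `foreground_mask` is computed first and `update_background` afterwards, so that a moving object is detected before it starts to blend into the model. A small `alpha` makes the background adapt slowly, which suits slow lighting changes but lets stopped objects linger as foreground.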
3. Future Directions
4. Conclusion
References
1. Mirela Danubianu, Tiberiu Socaciu, Does Data Mining Techniques Optimize the Personalized Therapy of Speech Disorders?, Journal of Applied Computer Science & Mathematics, no. 5 (3)/2009, Suceava, pp. 15-18
2. R. Wirth and J. Hipp, CRISP-DM: Towards a standard process model for data mining, in Proceedings of the 4th International Conference on the Practical Applications of Knowledge Discovery and Data Mining, Manchester, UK, 2000, pp. 29-39
3. Ramez Elmasri and Shamkant B. Navathe, Fundamentals of Database Systems, 3rd ed., Pearson Education Asia, 2002, page 885-869
4. Tao Li, Qi Li, Shenghuo Zhu, Mitsunori Ogihara, A Survey on Wavelet Applications in Data Mining, SIGKDD Explorations, Volume 4, Issue 2, pp. 49-68
5. Usama Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth, From Data Mining to Knowledge Discovery in Databases, American Association for Artificial Intelligence, 1996, pp. 37-54
6. Neal Leavitt, Let's Hear It for Audio Mining, Technology News, October 2002, pp. 23-25
7. Vishal Gupta, Gurpreet S. Lehal, A Survey of Text Mining Techniques and Applications, Journal of Emerging Technologies in Web Intelligence, Vol. 1, No. 1, August 2009, pp. 60-76
8. Bing Liu, Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, Springer-Verlag Berlin Heidelberg, 2007, pp. 6-7
9. Bhavani Thuraisingham, Managing and Mining Multimedia Databases, International Journal of Artificial Intelligence Tools, Vol. 13, No. 3 (2004), pp. 735-750
10. JungHwan Oh, JeongKyu Lee, Sae Hwang, Video Data Mining, Idea Group Inc., 2005
11. Alper Yilmaz, Omar Javed, Mubarak Shah, Object Tracking: A Survey, ACM Computing Surveys, Vol. 38, No. 4, Article 13, December 2006
12. Kevin Murphy, Antonio Torralba, Daniel Eaton, William Freeman, Object detection and localization using local and global features
13. Liming Wang, Jianbo Shi, Gang Song, I-fan Shen, Object Detection Combining Recognition and Segmentation
14. X. H. Fang, W. Xiong, B. J. Hu, L. T. Wang, A Moving Object Detection Algorithm Based on Color Information, International Symposium on Instrumentation Science and Technology, Journal of Physics: Conference Series 48 (2006), pp. 384-387
