
COMPUTER SCIENCE & ENGINEERING

JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
IV Year B.Tech. CSE - I Sem

L    T/P/D    C
4    1/-/-

(57048) DATA WAREHOUSING AND DATA MINING


UNIT I

Introduction: Fundamentals of Data Mining, Data Mining Functionalities, Classification of Data Mining Systems, Data Mining Task Primitives, Integration of a Data Mining System with a Database or a Data Warehouse System, Major Issues in Data Mining. Data Preprocessing: Need for Preprocessing the Data, Data Cleaning, Data Integration and Transformation, Data Reduction, Discretization and Concept Hierarchy Generation.

UNIT II

Data Warehouse and OLAP Technology for Data Mining: Data Warehouse, Multidimensional Data Model, Data Warehouse Architecture, Data Warehouse Implementation, Further Development of Data Cube Technology, From Data Warehousing to Data Mining. Data Cube Computation and Data Generalization: Efficient Methods for Data Cube Computation, Further Development of Data Cube and OLAP Technology, Attribute-Oriented Induction.

UNIT III

Mining Frequent Patterns, Associations and Correlations: Basic Concepts, Efficient and Scalable Frequent Itemset Mining Methods, Mining Various Kinds of Association Rules, From Association Mining to Correlation Analysis, Constraint-Based Association Mining.


UNIT IV

Classification and Prediction: Issues Regarding Classification and Prediction, Classification by Decision Tree Induction, Bayesian Classification, Rule-Based Classification, Classification by Backpropagation, Support Vector Machines, Associative Classification, Lazy Learners, Other Classification Methods, Prediction, Accuracy and Error Measures, Evaluating the Accuracy of a Classifier or a Predictor, Ensemble Methods.

UNIT V

Cluster Analysis Introduction: Types of Data in Cluster Analysis, A Categorization of Major Clustering Methods, Partitioning Methods, Hierarchical Methods, Density-Based Methods, Grid-Based Methods, Model-Based Clustering Methods, Clustering High-Dimensional Data, Constraint-Based Cluster Analysis, Outlier Analysis.

UNIT VI

Mining Streams, Time Series and Sequence Data: Mining Data Streams, Mining Time-Series Data, Mining Sequence Patterns in Transactional Databases, Mining Sequence Patterns in Biological Data, Graph Mining, Social Network Analysis and Multirelational Data Mining.

UNIT VII

Mining Object, Spatial, Multimedia, Text and Web Data: Multidimensional Analysis and Descriptive Mining of Complex Data Objects, Spatial Data Mining, Multimedia Data Mining, Text Mining, Mining the World Wide Web.

UNIT VIII

Applications and Trends in Data Mining: Data Mining Applications, Data Mining System Products and Research Prototypes, Additional Themes on Data Mining and Social Impacts of Data Mining.

TEXT BOOKS:

1. Data Mining - Concepts and Techniques, Jiawei Han & Micheline Kamber, Morgan Kaufmann Publishers, Elsevier, 2nd Edition, 2006.
2. Introduction to Data Mining - Pang-Ning Tan, Michael Steinbach and Vipin Kumar, Pearson Education.

REFERENCE BOOKS:

1. Data Mining Techniques - Arun K Pujari, 2nd edition, Universities Press.
2. Data Warehousing in the Real World - Sam Anahory & Dennis Murray, Pearson Edn Asia.
3. Insight into Data Mining, K.P. Soman, S. Diwakar, V. Ajay, PHI, 2008.
4. Data Warehousing Fundamentals - Paulraj Ponniah, Wiley Student Edition.
5. The Data Warehouse Life Cycle Tool Kit - Ralph Kimball, Wiley Student Edition.
6. Building the Data Warehouse by William H Inmon, John Wiley & Sons Inc, 2005.
7. Data Mining: Introductory and Advanced Topics - Margaret H Dunham, Pearson Education.
8. Data Mining, V. Pudi and P. Radha Krishna, Oxford University Press.
9. Data Mining: Methods and Techniques, A.B.M. Shawkat Ali and S.A. Wasimi, Cengage Learning.
10. Data Warehouse 2.0: The Architecture for the Next Generation of Data Warehousing, W.H. Inmon, D. Strauss, G. Neushloss, Elsevier, Distributed by SPD.

w.e.f. 2010-2011 academic year

JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY KAKINADA


IV Year B.Tech. Computer Science Engineering. I-Sem.

DATA WAREHOUSING AND DATA MINING

Unit-I: Introduction to Data Mining: What is data mining, motivating challenges, origins of data mining, data mining tasks, Types of Data: attributes and measurements, types of data sets, Data Quality (Tan)

Unit-II: Data preprocessing, Measures of Similarity and Dissimilarity: Basics, similarity and dissimilarity between simple attributes, dissimilarities between data objects, similarities between data objects, examples of proximity measures: similarity measures for binary data, Jaccard coefficient, Cosine similarity, Extended Jaccard coefficient, Correlation, Exploring Data: Data Set, Summary Statistics (Tan)

Unit-III: Data Warehouse: basic concepts, Data Warehousing Modeling: Data Cube and OLAP, Data Warehouse implementation: efficient data cube computation, partial materialization, indexing OLAP data, efficient processing of OLAP queries. (H & K)

Unit-IV: Classification: Basic Concepts, General approach to solving a classification problem, Decision Tree induction: working of decision tree, building a decision tree, methods for expressing attribute test conditions, measures for selecting the best split, Algorithm for decision tree induction. Model overfitting: due to presence of noise, due to lack of representative samples, evaluating the performance of classifier: holdout method, random subsampling, cross-validation, bootstrap. (Tan)

Unit-V: Classification-Alternative techniques: Bayesian Classifier: Bayes theorem, using Bayes theorem for classification, Naive Bayes classifier, Bayes error rate, Bayesian Belief Networks: Model representation, model building (Tan)

Unit-VI: Association Analysis: Problem Definition, Frequent Item-set generation, The Apriori principle, Frequent Item set generation in the Apriori algorithm, candidate generation and pruning, support counting (eluding support counting using a Hash tree), Rule generation, compact representation of frequent item sets, FP-Growth Algorithm. (Tan)


Unit-VII: Overview: types of clustering, Basic K-means, K-means additional issues, Bisecting k-means, k-means and different types of clusters, strengths and weaknesses, k-means as an optimization problem.

Unit-VIII: Agglomerative Hierarchical clustering, basic agglomerative hierarchical clustering algorithm, specific techniques, DBSCAN: Traditional density: center-based approach, strengths and weaknesses (Tan)

TEXT BOOKS:

1. Introduction to Data Mining: Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Pearson.
2. Data Mining, Concepts and Techniques, 3/e, Jiawei Han, Micheline Kamber, Elsevier.

REFERENCE BOOKS:

1. Introduction to Data Mining with Case Studies, 2nd ed: GK Gupta; PHI.
2. Data Mining: Introductory and Advanced Topics: Dunham, Sridhar, Pearson.
3. Data Warehousing, Data Mining & OLAP, Alex Berson, Stephen J Smith, TMH.
4. Data Mining Theory and Practice, Soman, Diwakar, Ajay, PHI, 2006.
