
Undergraduate Program (S1) in Informatics Engineering

Faculty of Information Technology

Universitas Kristen Maranatha

DM-MA/S1IF/FTI/UKM/2012 1
Course Material
 Database Review
 Data Warehouse
 Data Mining

What is Data Warehouse?
 Defined in many different ways, but not rigorously.
 A decision support database that is maintained separately from the organization's operational database.
 Supports information processing by providing a solid platform of consolidated, historical data for analysis.
 "A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management's decision-making process." (W. H. Inmon)

An example
 JC Penney store:
A high-level manager (Mr. X) wants to know:
 Which types (style, dress material, color, and size) of women's clothing sell fastest from October to December in the Northeast region?
Type       October     November    December     Total
Casual     1,500,000   1,700,000   3,500,000    6,700,000
Formal     2,300,000   2,500,000   2,200,000    7,000,000
Party      1,500,000   2,000,000   5,000,000    8,500,000
Total      5,300,000   6,200,000   10,700,000   22,200,000
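The report above is a cross-tabulation: sales summed per clothing type and month, with row and column totals. A minimal sketch of how such a summary could be computed is shown below. The figures are copied from the table, while the record layout and the function name `cross_tab` are assumptions made for illustration (the slides do not say how the report was produced, nor what unit the amounts are in).

```python
from collections import defaultdict

# Sales per (type, month), copied from the table above.
# The unit of the amounts is not stated in the source.
sales = [
    ("Casual", "October", 1_500_000), ("Casual", "November", 1_700_000),
    ("Casual", "December", 3_500_000),
    ("Formal", "October", 2_300_000), ("Formal", "November", 2_500_000),
    ("Formal", "December", 2_200_000),
    ("Party", "October", 1_500_000), ("Party", "November", 2_000_000),
    ("Party", "December", 5_000_000),
]


def cross_tab(rows):
    """Sum amounts per (type, month) and add row, column, and grand totals."""
    table = defaultdict(lambda: defaultdict(int))
    for kind, month, amount in rows:
        table[kind][month] += amount
        table[kind]["Total"] += amount
        table["Total"][month] += amount
        table["Total"]["Total"] += amount
    return table


report = cross_tab(sales)

# The type with the largest quarterly total answers Mr. X's question.
best = max((k for k in report if k != "Total"), key=lambda k: report[k]["Total"])
print(best, report[best]["Total"])
```

With these figures the largest quarterly total belongs to the Party type, driven mostly by December.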

Never Ending ...
– After two days, Mr. X wants the query redone for each week during Oct-Dec
– The next day, Mr. X wants it by each city in the Northeast region

 Mr. X is back again: is it possible to get it by each store in a city?

• By age group, if possible
• By income group, if data are available
• By payment method (check, cash, credit card, ...)

... The IS manager is tired ...
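Each of Mr. X's requests is the same aggregation regrouped along a different dimension (week, city, store, payment method). The sketch below illustrates that idea with one generic function instead of a new hand-written query per request. All records, field names, and the function `summarize` are hypothetical and only serve the illustration.

```python
from collections import defaultdict

# Hypothetical sales facts; field names and values are invented for illustration.
facts = [
    {"week": 40, "city": "Boston", "store": "B1", "payment": "cash", "amount": 120},
    {"week": 40, "city": "Boston", "store": "B2", "payment": "credit-card", "amount": 200},
    {"week": 41, "city": "Boston", "store": "B1", "payment": "check", "amount": 80},
    {"week": 41, "city": "Hartford", "store": "H1", "payment": "cash", "amount": 150},
    {"week": 41, "city": "Hartford", "store": "H1", "payment": "credit-card", "amount": 50},
]


def summarize(rows, *dimensions):
    """Total the amount for every combination of the requested dimensions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row["amount"]
    return dict(totals)


by_week = summarize(facts, "week")            # "redo it by week"
by_city = summarize(facts, "city")            # "now by city"
by_store = summarize(facts, "city", "store")  # "by each store in a city"
by_payment = summarize(facts, "payment")      # "by payment method"
print(by_week, by_city, by_store, by_payment, sep="\n")
```

A data warehouse generalizes this pattern: the facts are stored once, and each new question becomes a choice of dimensions rather than a new extraction job.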


Data Warehouse Architecture

Why Data Mining?
 The explosive growth of data: from terabytes to petabytes
 Data collection and data availability
 Automated data collection tools, database systems, Web, computerized society
 Major sources of abundant data
 Business: Web, e-commerce, transactions, stocks, …
 Science: remote sensing, bioinformatics, scientific simulation, …
 Society and everyone: news, digital cameras, …
 We are drowning in data, but starving for knowledge!
 "Necessity is the mother of invention": data mining is the automated analysis of massive data sets

Evolution of Database Technology
 1960s:
 Data collection, database creation, IMS and network DBMS
 1970s:
 Relational data model, relational DBMS implementation

 1980s:
 RDBMS, advanced data models (extended-relational, OO, deductive, etc.)
 Application-oriented DBMS (spatial, scientific, engineering, etc.)

 1990s:
 Data mining, data warehousing, multimedia databases, and Web databases
 2000s:
 Stream data management and mining
 Data mining and its applications
 Web technology (XML, data integration) and global information systems

Evolution of Sciences
 Before 1600, empirical science
 1600-1950s, theoretical science
 Each discipline has grown a theoretical component. Theoretical models often motivate
experiments and generalize our understanding.
 1950s-1990s, computational science
 Over the last 50 years, most disciplines have grown a third, computational branch (e.g.
empirical, theoretical, and computational ecology, or physics, or linguistics.)
 Computational Science traditionally meant simulation. It grew out of our inability to
find closed-form solutions for complex mathematical models.
 1990-now, data science
 The flood of data from new scientific instruments and simulations
 The ability to economically store and manage petabytes of data online
 The Internet and computing Grid that makes all these archives universally accessible
 Scientific info. management, acquisition, organization, query, and visualization tasks
scale almost linearly with data volumes. Data mining is a major new challenge!
 Jim Gray and Alex Szalay, The World Wide Telescope: An Archetype for Online Science, Comm. ACM, 45(11): 50-54, Nov. 2002

Knowledge Discovery (KDD) Process

 Data mining: the core of the knowledge discovery process

[Figure: KDD process, bottom to top] Databases -> Data Cleaning and Data Integration -> Data Warehouse -> Selection -> Task-relevant Data -> Data Mining -> Pattern Evaluation
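The stages of the KDD process (databases, cleaning and integration, selection of task-relevant data, data mining, pattern evaluation) can be sketched as a chain of small functions. This is only an illustrative toy: the two source "databases", every record in them, and all function names are hypothetical, and the mining step is reduced to simple frequency counting.

```python
from collections import Counter

# Two hypothetical source databases; all records are invented for illustration.
store_db = [
    {"customer": "c1", "item": "dress", "region": "Northeast"},
    {"customer": "c2", "item": None, "region": "Northeast"},  # dirty record
    {"customer": "c3", "item": "dress", "region": "South"},
]
web_db = [
    {"customer": "c4", "item": "dress", "region": "Northeast"},
    {"customer": "c5", "item": "scarf", "region": "Northeast"},
]


def clean(rows):
    """Data cleaning: drop records that contain missing values."""
    return [r for r in rows if all(v is not None for v in r.values())]


def integrate(*sources):
    """Data integration: merge several cleaned sources into one collection."""
    return [r for src in sources for r in src]


def select(rows, region):
    """Selection: keep only the task-relevant records."""
    return [r for r in rows if r["region"] == region]


def mine(rows):
    """Data mining (toy version): count how often each item occurs."""
    return Counter(r["item"] for r in rows)


def evaluate(patterns, min_count):
    """Pattern evaluation: keep only patterns that occur often enough."""
    return {item: n for item, n in patterns.items() if n >= min_count}


warehouse = integrate(clean(store_db), clean(web_db))
task_data = select(warehouse, "Northeast")
patterns = evaluate(mine(task_data), min_count=2)
print(patterns)
```

In practice each stage is far more involved, but the ordering is the point: mining runs on cleaned, integrated, and selected data, and its raw output is filtered before it counts as knowledge.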

What Is Data Mining?

 Data mining (knowledge discovery from data)
 Extraction of interesting (non-trivial, implicit, previously unknown, and potentially useful) patterns or knowledge from huge amounts of data
 Data mining: a misnomer?
 Alternative names
 Knowledge discovery (mining) in databases (KDD), knowledge extraction, data/pattern analysis, data archeology, data dredging, information harvesting, business intelligence, etc.
 Watch out: Is everything “data mining”?
 Simple search and query processing
 (Deductive) expert systems

Data Mining and Business Intelligence
[Figure: pyramid of layers; the potential to support business decisions increases toward the top. Roles are shown in parentheses.]
 Decision Making (End User)
 Data Presentation: Visualization Techniques (Business Analyst)
 Data Mining: Information Discovery (Data Analyst)
 Data Exploration: Statistical Summary, Querying, and Reporting
 Data Preprocessing/Integration, Data Warehouses (DBA)
 Data Sources: Paper, Files, Web documents, Scientific experiments, Database Systems

KDD Process: A Typical View from Machine Learning and Statistics

Input Data -> Data Pre-Processing -> Data Mining -> Post-Processing

 Data Pre-Processing: data integration, normalization, feature selection, dimension reduction
 Data Mining: pattern discovery, association & correlation, classification, clustering, outlier analysis, …
 Post-Processing: pattern evaluation, pattern selection, pattern interpretation, pattern visualization

 This is a view from typical machine learning and statistics communities
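To make the three phases concrete, the sketch below runs one representative step from each: min-max normalization (pre-processing), a one-dimensional two-cluster k-means (mining), and a short summary of the clusters (post-processing). The input values, the choice of k = 2, and the seeding at the smallest and largest points are assumptions for illustration, not something prescribed by the slides.

```python
# Hypothetical input: monthly spending of six customers (invented values).
spending = [20, 22, 24, 80, 84, 100]


def min_max(values):
    """Pre-processing: rescale values to [0, 1]. Assumes the values are not all equal."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def two_means(points, iterations=10):
    """Mining: 1-D k-means with k=2, seeded at the smallest and largest point."""
    centers = [min(points), max(points)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            nearest = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            groups[nearest].append(p)
        # Keep the old center if a cluster happens to be empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


normalized = min_max(spending)
centers, groups = two_means(normalized)

# Post-processing: turn the raw clusters into something a person can read.
summary = {"low": len(groups[0]), "high": len(groups[1])}
print(centers, summary)
```

The sample splits into a low-spending and a high-spending group of three customers each. Real pipelines would add feature selection, a principled choice of k, and an evaluation of cluster quality.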

Data mining applications
 Amazon.com uses data mining to provide customers with purchase suggestions:
Customers who bought this book also bought:
 Seven Methods for Transforming Corporate Data Into Business Intelligence by Vasant Dhar, Roger Stein
 Building Data Mining Applications for CRM by Alex Berson, et al.
 Data Preparation for Data Mining by Dorian Pyle
 Kellogg on Integrated Marketing by Dawn Iacobucci (Editor), et al.
 Multivariate Data Analysis (5th Edition) by Joseph F. Hair (Editor), et al.
Explore similar items
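One simple way to produce "customers who bought this also bought" suggestions is to count co-purchases. The sketch below shows that idea only; it is not a description of how Amazon.com actually computes its recommendations, and the purchase histories, item names, and the function `also_bought` are hypothetical.

```python
from collections import Counter

# Hypothetical purchase histories; item names are placeholders, not real order data.
orders = {
    "u1": {"book_a", "book_b", "book_c"},
    "u2": {"book_a", "book_b"},
    "u3": {"book_a", "book_c"},
    "u4": {"book_b", "book_d"},
}


def also_bought(item, histories, top_n=2):
    """Rank other items by how many buyers of `item` also bought them."""
    counts = Counter()
    for basket in histories.values():
        if item in basket:
            counts.update(basket - {item})
    # Sort by descending count, then by name so that ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, _ in ranked[:top_n]]


suggestions = also_bought("book_a", orders)
print(suggestions)
```

Raw co-purchase counts favor bestsellers, so production systems usually normalize by item popularity; that refinement is left out here.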

DM Applications: Retail
 Performing basket analysis
 Which items customers tend to purchase together.
 This knowledge can improve stocking, store layout strategies, and
promotions.
 Sales forecasting
 Examining time-based patterns helps retailers make stocking
decisions.
 If a customer purchases an item today, when are they likely to
purchase a complementary item?
 Database marketing
 Retailers can develop profiles of customers with certain behaviors, for example, those who purchase designer-label clothing or those who attend sales.
 This information can be used to focus cost-effective promotions.
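Basket analysis is commonly expressed with two measures: support (the fraction of baskets that contain both items) and confidence (of the baskets containing the first item, the fraction that also contain the second). The sketch below computes both for item pairs. The baskets, the 0.5 support threshold, and the function name `pair_rules` are assumptions chosen for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets (invented for illustration).
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def pair_rules(transactions, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) for frequent item pairs."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    found = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support >= min_support:
            # Each frequent pair yields one rule per direction.
            found.append((a, b, support, count / item_counts[a]))
            found.append((b, a, support, count / item_counts[b]))
    return sorted(found)


rules = pair_rules(baskets)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data every basket with butter also contains bread, so the rule butter -> bread has confidence 1.0, which is the kind of finding that could inform store layout or promotions.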

DM in CRM: Customer Life Cycle
 Customer Life Cycle
 The stages in the relationship between a customer and a
business
 Key stages in the customer lifecycle
 Prospects:
 people who are not yet customers but are in the target market
 Responders:
 prospects who show an interest in a product or service
 Active Customers:
 people who are currently using the product or service
 Former Customers:
 may be “bad” customers who did not pay their bills or who incurred
high costs
