A review of Data Warehousing and Data Mining
Aqeel Al-Jishi
Nick Farley
Masaki Osada
Data Warehouse
• The main repository of an organization's
historical data; its corporate memory
• It contains the raw material for management's
decision support system
Data Mining (Knowledge Discovery)
• The process of analyzing data from different
perspectives and summarizing it into useful
information
In the late 1980s to early 1990s, distinct computer
databases were created
These databases were designed to meet the data
analysis needs that Operational Systems were
failing to support
Operational Systems failed for many reasons:
• Long report generation time
• Inability to handle the load; not optimized for reporting
• Many organizations had multiple Operational Systems,
which made reporting a nightmare
• Custom applications were required for reporting,
which slowed reporting and increased costs
Make information easily accessible
Provide endless views and combinations
of data
Return query results with minimal wait
time
Be adaptive and resilient to change
Be designed with the correct users in mind
(business users and management)
Keep information secure but allow access
to insiders
Problems with Data Acquisition may arise
80% of the time spent building a data
warehouse will go to extracting,
cleaning, and loading data
Errors with data can be rampant:
• Incomplete Data (missing fields)
• Incorrect Data (wrong calculations)
• Readability Issues (strange formatting)
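
The three kinds of data errors listed above can be checked for during the cleaning step. The following is a minimal sketch, not a definitive implementation. The order rows, the field names, and the helpers `normalize` and `check_row` are all hypothetical and exist only for illustration.

```python
from typing import Optional

# Hypothetical raw rows as they might arrive from an operational system.
RAW_ROWS = [
    {"order_id": "1001", "quantity": "2", "unit_price": "9.50", "total": "19.00"},
    {"order_id": "1002", "quantity": "3", "unit_price": "4.00", "total": "15.00"},
    {"order_id": "1003", "quantity": "", "unit_price": "7.25", "total": "7.25"},
    {"order_id": "1004", "quantity": " 1 ", "unit_price": "$5.00", "total": "$5.00"},
]

REQUIRED = ("order_id", "quantity", "unit_price", "total")


def normalize(value: str) -> str:
    """Readability issues: strip whitespace and a leading currency symbol."""
    return value.strip().lstrip("$")


def check_row(row: dict) -> Optional[str]:
    """Return a problem description, or None if the row looks loadable."""
    cleaned = {key: normalize(row.get(key, "")) for key in REQUIRED}

    # Incomplete data: a required field is empty after normalization.
    missing = [key for key, value in cleaned.items() if value == ""]
    if missing:
        return "incomplete: missing " + ", ".join(missing)

    # Incorrect data: the stored total disagrees with quantity * unit_price.
    expected = int(cleaned["quantity"]) * float(cleaned["unit_price"])
    if abs(expected - float(cleaned["total"])) > 0.005:
        return "incorrect: total does not equal quantity * unit_price"

    return None


problems = {row["order_id"]: check_row(row) for row in RAW_ROWS}
for order_id, problem in problems.items():
    print(order_id, problem or "ok")
```

In this sketch, order 1002 is flagged as incorrect, order 1003 as incomplete, and order 1004 passes once its formatting has been normalized.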
- Bill Inmon
• The data warehouse is but one part of
the overall business intelligence
system
• An enterprise has one data warehouse,
and data marts source their
information from it
• Uses 3rd Normal Form to store
information in the database
Data Mart
• A data mart is a subset of data from the data
warehouse, typically used when the broad scope of
the data warehouse isn't needed
• Business departments commonly create, use, and
alter their own data marts
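
One way to picture a data mart as a subset of the warehouse is a view that exposes a single department's slice of a warehouse table. This is only a sketch using Python's built-in `sqlite3` module. The table `warehouse_sales`, the view `marketing_mart`, and the sample rows are hypothetical, and real data marts are often separate physical databases rather than views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical warehouse table covering every department.
CREATE TABLE warehouse_sales (
    sale_id    INTEGER PRIMARY KEY,
    department TEXT,
    region     TEXT,
    amount     REAL
);
INSERT INTO warehouse_sales VALUES
    (1, 'Marketing', 'East', 120.0),
    (2, 'Finance',   'East', 300.0),
    (3, 'Marketing', 'West',  80.0),
    (4, 'Finance',   'West', 150.0);

-- Data mart: only the Marketing slice of the warehouse table.
CREATE VIEW marketing_mart AS
    SELECT sale_id, region, amount
    FROM warehouse_sales
    WHERE department = 'Marketing';
""")

mart_rows = conn.execute(
    "SELECT sale_id, region, amount FROM marketing_mart ORDER BY sale_id"
).fetchall()
print(mart_rows)
```

The Marketing department queries `marketing_mart` and never sees the Finance rows, while the data itself still comes from the single warehouse table.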
Granularity
• The amount and level of data brought in to the data
warehouse during acquisition
Dimension Table
• A table with a single-part primary key
and descriptive attribute columns
• Describes the business entities of an
enterprise, represented as hierarchical,
categorical information such as time,
departments, locations, and products
Fact Table
• A table with numeric performance
measures (metrics) characterized by a
composite key
• The elements of the composite key
are foreign keys that reference the
dimension tables
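
The relationship between dimension tables and a fact table can be sketched as a small schema using Python's built-in `sqlite3` module. The table names (`dim_product`, `dim_date`, `fact_sales`), the columns, and the sample rows are assumptions made for illustration and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: single-part primary key plus descriptive attributes.
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT,
    category   TEXT
);
CREATE TABLE dim_date (
    date_id INTEGER PRIMARY KEY,
    year    INTEGER,
    month   INTEGER
);
-- Fact table: numeric measures, keyed by a composite of the foreign keys.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    units_sold INTEGER,
    revenue    REAL,
    PRIMARY KEY (product_id, date_id)
);
""")

conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(1, "Widget", "Hardware"), (2, "Gadget", "Hardware"), (3, "Manual", "Books")],
)
conn.executemany(
    "INSERT INTO dim_date VALUES (?, ?, ?)",
    [(20240101, 2024, 1), (20240201, 2024, 2)],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
    [
        (1, 20240101, 10, 100.0),
        (2, 20240101, 5, 75.0),
        (3, 20240101, 2, 30.0),
        (1, 20240201, 4, 40.0),
    ],
)

# Summarize a measure from the fact table by a dimension attribute.
revenue_by_category = conn.execute("""
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
print(revenue_by_category)
```

The measures live only in `fact_sales`, and the descriptive attributes used for grouping live only in the dimension tables. Note that SQLite parses the `REFERENCES` clauses but does not enforce them unless `PRAGMA foreign_keys = ON` is set on the connection.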
Data Mining
• Getting "Useful Information" out of a
large amount of "Data"
• Getting "Business Intelligence" out of a
large amount of "Information"
What is the difference between
"Business Intelligence" and
"Information"?
Heuristic in nature
• Capable of finding
users would never think of
is the key technology to find
patterns
Self-guiding