Foundations of Business Intelligence:
Databases and Information Management
The use of a traditional approach to file processing encourages each functional area in
a corporation to develop specialized applications and files. Each application requires a
unique data file that is likely to be a subset of the master file. These subsets of the
master file lead to data redundancy and inconsistency, processing inflexibility, and
wasted storage resources.
Database
A database is a collection of data organized to serve
many applications efficiently by centralizing the data and
controlling redundant data.
Rather than storing data in separate files for each
application, data appears to users as being stored in
only one location.
A single database services multiple applications.
For example, instead of a corporation storing employee
data in separate information systems and separate files
for personnel, payroll, and benefits, the corporation
could create a single common human resources
database.
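The human resources example can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table name, column names, and sample rows are hypothetical, and the three views stand in for the personnel, payroll, and benefits applications.

```python
import sqlite3

# Hypothetical schema: one shared employee table instead of three separate files.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " name TEXT, department TEXT, salary REAL, health_plan TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Ana Ruiz", "Sales", 52000.0, "Basic"),
        (2, "Li Wei", "Finance", 61000.0, "Premium"),
    ],
)

# Each application sees only the columns it needs, but reads the same rows.
conn.execute("CREATE VIEW personnel AS SELECT emp_id, name, department FROM employee")
conn.execute("CREATE VIEW payroll AS SELECT emp_id, name, salary FROM employee")
conn.execute("CREATE VIEW benefits AS SELECT emp_id, name, health_plan FROM employee")

# One update to the shared table is visible to every application.
conn.execute("UPDATE employee SET name = 'Ana Ruiz-Soto' WHERE emp_id = 1")

personnel_name = conn.execute("SELECT name FROM personnel WHERE emp_id = 1").fetchone()[0]
payroll_name = conn.execute("SELECT name FROM payroll WHERE emp_id = 1").fetchone()[0]
benefits_name = conn.execute("SELECT name FROM benefits WHERE emp_id = 1").fetchone()[0]
print(personnel_name, payroll_name, benefits_name)
```

Because the name is stored once, the three applications cannot disagree about it, which is the redundancy and inconsistency problem that separate files create.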
Data warehouse:
Stores current and historical data from many core operational
transaction systems
Consolidates and standardizes information for use across the
enterprise, but the data cannot be altered
The data warehouse system provides query, analysis, and
reporting tools
Data marts:
Subset of data warehouse
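A minimal sketch of the warehouse and data mart ideas, assuming two hypothetical operational systems (orders and billing) that record the same kind of fact in different formats. The field names, the Fact type, and the sample rows are invented for illustration.

```python
from typing import NamedTuple

class Fact(NamedTuple):
    """One standardized, read-only warehouse row (hypothetical layout)."""
    customer_id: str
    amount_usd: float
    date: str
    source: str

# Hypothetical rows from two operational systems with different conventions.
orders_system = [{"cust": "c1", "amount_usd": 120.0, "date": "2023-01-15"}]
billing_system = [{"customer_id": "C2", "total_cents": 4550, "day": "2024-03-02"}]

def from_orders(row):
    return Fact(row["cust"].upper(), row["amount_usd"], row["date"], "orders")

def from_billing(row):
    # Billing stores cents; the warehouse standardizes on dollars.
    return Fact(row["customer_id"].upper(), row["total_cents"] / 100, row["day"], "billing")

# Warehouse: consolidated current and historical data, kept in an immutable tuple.
warehouse = tuple(
    [from_orders(r) for r in orders_system] + [from_billing(r) for r in billing_system]
)

# Data mart: a focused subset of the warehouse, here only the 2024 facts.
mart_2024 = [fact for fact in warehouse if fact.date.startswith("2024")]
print(len(warehouse), mart_2024)
```

The tuple and NamedTuple choices mirror the point that warehouse data is available for reporting but is not altered by its users.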
Data mining:
More discovery driven than OLAP
Finds hidden patterns, relationships in large databases
and infers rules to predict future behavior
Types of information obtainable from data mining
Associations are occurrences linked to a single event.
For instance, a study of supermarket purchasing patterns
might reveal that, when corn chips are purchased, a cola
drink is purchased 65 percent of the time, but when there
is a promotion, cola is purchased 85 percent of the time.
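The percentages in the supermarket example correspond to what is usually called the confidence of an association: the share of baskets containing corn chips that also contain cola. The sketch below uses made-up baskets, constructed only so that the results match the 65 and 85 percent figures; the function name confidence is an assumption for illustration.

```python
def confidence(baskets, antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    both = sum(1 for b in with_antecedent if consequent in b)
    return both / len(with_antecedent)

# Hypothetical baskets without a promotion: 20 contain corn chips, 13 of those add cola.
regular = [{"corn chips", "cola"}] * 13 + [{"corn chips"}] * 7 + [{"bread"}] * 5

# Hypothetical baskets during a promotion: 20 contain corn chips, 17 of those add cola.
promo = [{"corn chips", "cola"}] * 17 + [{"corn chips"}] * 3 + [{"bread", "cola"}] * 5

regular_conf = confidence(regular, "corn chips", "cola")
promo_conf = confidence(promo, "corn chips", "cola")
print(f"{regular_conf:.0%} without promotion, {promo_conf:.0%} with promotion")
```

Baskets without corn chips do not affect the result, since the measure is conditional on the corn chip purchase.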
Text mining
Unstructured data, most in the form of text files, is believed
to account for over 80 percent of useful organizational
information and is one of the major sources of big data that
firms want to analyze.
E-mail, memos, call center transcripts, survey responses,
legal cases, patent descriptions, and service reports are all
valuable for finding patterns and trends that will help
employees make better business decisions.
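As a rough illustration of finding patterns in text, the sketch below counts recurring terms in a few invented call center transcripts. Real text mining tools do much more; the transcripts, the stopword list, and the function name term_counts are all assumptions.

```python
import re
from collections import Counter

# Hypothetical call center transcripts.
transcripts = [
    "The battery died after two days and the charger was missing.",
    "Battery drains fast. Customer asked for a refund.",
    "Refund issued because the battery would not charge.",
]

# Small hand-picked list of common words to ignore (an assumption).
STOPWORDS = {"the", "a", "and", "after", "was", "for", "because",
             "would", "not", "two", "asked"}

def term_counts(docs):
    """Count how often each non-stopword term appears across all documents."""
    counts = Counter()
    for doc in docs:
        words = re.findall(r"[a-z]+", doc.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts

counts = term_counts(transcripts)
top = counts.most_common(2)
print(top)
```

Here the most frequent terms point to a recurring battery complaint and to refund requests, the kind of trend a manager could act on.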
Web mining
The Web is another rich source of unstructured big data for
revealing patterns, trends, and insights into customer
behavior.
The discovery and analysis of useful patterns and
information from the World Wide Web is called Web mining.
Businesses might turn to Web mining to help them
understand customer behavior, evaluate the effectiveness of
a particular Web site, or quantify the success of a marketing
campaign.
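One simple form of Web mining is analyzing site usage records. The sketch below assumes a hypothetical, simplified log format (visitor, page path, campaign tag) and treats a visit to /checkout as a purchase; real server logs are richer and formatted differently.

```python
from collections import Counter

# Hypothetical log lines: "<visitor> <path> campaign=<name>".
log_lines = [
    "v1 /home campaign=spring",
    "v1 /checkout campaign=spring",
    "v2 /home campaign=spring",
    "v3 /home campaign=none",
    "v3 /product campaign=none",
]

def parse(line):
    """Split one log line into (visitor, path, campaign)."""
    visitor, path, tag = line.split()
    return visitor, path, tag.split("=", 1)[1]

page_views = Counter()   # site effectiveness: which pages get visited
visitors = {}            # campaign -> set of visitors it brought in
buyers = {}              # campaign -> set of visitors who reached checkout

for line in log_lines:
    visitor, path, campaign = parse(line)
    page_views[path] += 1
    visitors.setdefault(campaign, set()).add(visitor)
    if path == "/checkout":
        buyers.setdefault(campaign, set()).add(visitor)

# Campaign success: share of each campaign's visitors who reached checkout.
conversion = {c: len(buyers.get(c, set())) / len(v) for c, v in visitors.items()}
print(page_views.most_common(1), conversion)
```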
Data cleansing
Data cleansing, also known as data scrubbing, consists
of activities for detecting and correcting data in a
database that are incorrect, incomplete, improperly
formatted, or redundant.
Data cleansing not only corrects errors but also enforces
consistency among different sets of data that originated
in separate information systems.
Specialized data-cleansing software is available to
automatically survey data files, correct errors in the data,
and integrate the data in a consistent company-wide
format.
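The cleansing activities described above can be sketched in a few lines. This is a toy example, not how commercial data-cleansing software works: the records, the field names, and the rule that a valid phone number has exactly ten digits are assumptions made for illustration.

```python
import re

# Hypothetical customer records merged from separate systems.
raw_records = [
    {"name": "  jane DOE ", "phone": "(555) 123-4567", "email": "Jane.Doe@Example.com"},
    {"name": "Jane Doe", "phone": "555.123.4567", "email": "jane.doe@example.com"},
    {"name": "Raj Patel", "phone": "5559876", "email": ""},
]

def clean(record):
    """Put one record into a single consistent format."""
    name = " ".join(record["name"].split()).title()      # fix spacing and case
    digits = re.sub(r"\D", "", record["phone"])          # keep digits only
    # Assumed rule: only ten-digit numbers are valid; others are flagged as missing.
    phone = f"{digits[:3]}-{digits[3:6]}-{digits[6:]}" if len(digits) == 10 else None
    email = record["email"].strip().lower() or None      # empty string becomes missing
    return {"name": name, "phone": phone, "email": email}

def cleanse(records):
    """Clean every record and drop rows that become exact duplicates."""
    seen = set()
    cleaned = []
    for record in records:
        row = clean(record)
        key = (row["name"], row["phone"], row["email"])
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned

cleaned = cleanse(raw_records)
print(cleaned)
```

The two Jane Doe rows differ only in formatting, so they collapse into one record once formatted consistently, while the incomplete phone number and empty email are marked as missing rather than guessed.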