
E-Business and Business Intelligence

Data Warehouse

By Prof T.R. Vaidyanathan


A Data Warehouse is a database specifically structured for query and analysis.

A Data Warehouse typically contains data representing the business history of an organization.

Data warehousing is a concept. It is not a product that you can buy off the shelf.

It is a set of hardware and software components that can be used to better analyze the massive amounts of data that companies are accumulating, in order to make better business decisions.

What is a Data Warehouse?

[Figure: Data -> Data Warehouse -> Information]

A data warehouse is a Subject Oriented, Integrated, Time Variant and Non-Volatile collection of data used to support the strategic decision-making process for the enterprise.

Subject Oriented

[Figure: operational applications versus data warehouse subjects]
Operational applications: order processing, consumer loans, customer billing, accounts receivable, claims processing, savings accounts
Data warehouse subjects: sales, product, customer, account, claims, policy

Integrated

[Figure: Savings Account, Deposit Account and Loan Account are combined into one subject: Subject = Account]

Characteristics of Data Warehousing

• Subject oriented: The warehouse is organized around business subjects and is designed for analyzing data, e.g. to find the top 10 customers.
• Integrated: It must bring data from disparate sources into a consistent format. To be integrated, it must resolve naming conflicts and inconsistencies among units of measure.
• Non-volatile: Once data has been entered into the data warehouse, it is not changed.
• Time variant: To find business trends, management needs to analyze large amounts of historical data quickly, so the warehouse retains that history. This contrasts with OLTP, where performance requirements lead to the archival of historical data.
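
The "Integrated" characteristic can be illustrated with a minimal sketch. The two source systems, their field names (`acct_no`, `AccountID`) and their units (rupees versus paise) are hypothetical, chosen only to show a naming conflict and a unit-of-measure conflict being resolved into one consistent account format.

```python
# Hypothetical records from two source systems that describe the same
# subject (an account) with different field names and units of measure.
savings_rows = [{"acct_no": "S-101", "bal_rupees": 25000.0}]
loan_rows = [{"AccountID": "L-202", "balance_paise": 1250000}]


def integrate(savings, loans):
    """Map both sources onto one consistent 'account' record format."""
    accounts = []
    for row in savings:
        accounts.append({
            "account_id": row["acct_no"],          # resolve naming conflict
            "account_type": "savings",
            "balance_rupees": float(row["bal_rupees"]),
        })
    for row in loans:
        accounts.append({
            "account_id": row["AccountID"],        # resolve naming conflict
            "account_type": "loan",
            "balance_rupees": row["balance_paise"] / 100,  # paise -> rupees
        })
    return accounts


accounts = integrate(savings_rows, loan_rows)
for account in accounts:
    print(account)
```

After integration every record uses the same keys and the same unit, so a query on the Account subject does not need to know which source system a row came from.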
DIFFERENCE BETWEEN OPERATIONAL SYSTEM AND INFORMATIONAL SYSTEM

                  OPERATIONAL                 INFORMATIONAL
Data content      Current                     Archived, derived and summarized
Data structure    Optimized for transactions  Optimized for complex queries
Access frequency  High                        Medium and low
Access type       Read, update, delete        Read
Usage             Predictable, repetitive     Ad hoc, random
Response time     Sub-second                  Several seconds to minutes
Users             Large number                Relatively small group



Characteristics of a Typical Data Warehouse

• A database which is typically read-only.
• Repository of historical and current data.
• Stored in an organized format.
• Housed centrally and separate from the transaction processing systems.
• Used specifically for analytical purposes and designed to support management's decision-making process.
• Developed to accommodate random, ad hoc queries and to allow the users to drill down to minute levels of detail.
• DW is OLAP (Online Analytical Processing), not OLTP (Online Transaction Processing).
Architecture

• Architecture is a way of representing the overall structure of data, communication, processing and presentation that exists for end-user computing within the enterprise.
• Operational Database / External Database layer
• Information access layer
• Data access layer
• Data Directory layer (metadata)
• Process Management layer
• Application messaging layer
• Data warehousing layer
• Data staging layer
Benefits of Data Warehouse

• End users can obtain information readily, without delays in response time from the MIS.
• Intuitive and interactive ad hoc report creation by the user can stimulate creativity in the analytical thought process.
• Enables executives to have access to information.
• Identification of hidden business opportunities through user-defined investigative querying.
• Precision marketing.
• Rapid response to key events within an organization.
• Rapid response to market and technology trends.
• Increased revenue.
Utilities

• Immediate information delivery: DW decreases the length of time between the occurrence of a business event and the executive alert. Using DW, daily, weekly and monthly sales reports are available on a daily basis.
• Data integration from across and even outside the organization: DW typically combines data from multiple sources, such as a company's order entry and warranty systems. Thus, with a warehouse, it may be possible to track all interactions a company has with each customer, from that customer's first inquiry, through the terms of their purchase, all the way through any warranty or service interactions.
• Future vision from historical trends: Effective business analysis frequently includes trend and seasonality analysis. To facilitate this, DW contains data for multiple years.
• Tools for looking at data in new ways: Instead of paper reports, DW gives users tools for looking at data differently. These tools also allow users to manipulate their data. An interactive table that allows the user to drill down into detailed data with the click of a mouse can answer questions that might take months to answer in a traditional system.
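
Trend analysis over multiple years and drill-down into detail can be sketched in a few lines. The sales figures and the function names `total_by_year` and `drill_down` below are hypothetical and serve only to illustrate moving from a summary level (year) to a detail level (month).

```python
from collections import defaultdict

# Hypothetical multi-year sales facts: (year, month, amount).
sales = [
    (2021, 1, 100.0), (2021, 2, 150.0),
    (2022, 1, 120.0), (2022, 2, 180.0),
]


def total_by_year(rows):
    """Summary level: total sales per year, useful for spotting trends."""
    totals = defaultdict(float)
    for year, _month, amount in rows:
        totals[year] += amount
    return dict(totals)


def drill_down(rows, year):
    """Detail level: monthly totals for one selected year."""
    totals = defaultdict(float)
    for row_year, month, amount in rows:
        if row_year == year:
            totals[month] += amount
    return dict(totals)


yearly = total_by_year(sales)
monthly_2022 = drill_down(sales, 2022)
print(yearly)
print(monthly_2022)
```

An interactive table in a real tool performs the same kind of step when the user clicks on a year: it re-aggregates the same facts at a finer level of detail.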
Functions

• Extracting: Extracting data from disparate sources.
• Integrating: Putting together the extracted data into a consistent format.
• Filtering: Selecting only the relevant data from the OLTP or external data sources. E.g. the user may be interested in only the last five years' sales data.
• Standardizing: As the data will be moved from different OLTP databases or flat file systems to one target, the data needs to be standardized.
• Transforming: Data is extracted from OLTP databases and external data sources. Data transformation has to be carried out on the extracted data before it is loaded into the warehouse.
• Cleaning: Ensuring data quality and accuracy.
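
The functions listed above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not a real loading tool: the two sources, the field names (`cust`, `sale_date`, `amount`) and the cutoff date are all hypothetical.

```python
from datetime import date

# Hypothetical extracts from an OLTP database and a flat file.
oltp_rows = [
    {"cust": " alice ", "sale_date": "2023-03-01", "amount": "120.50"},
    {"cust": "BOB", "sale_date": "2010-07-15", "amount": "80"},
]
flat_file_rows = [
    {"cust": "Carol", "sale_date": "2022-11-30", "amount": ""},        # missing amount
    {"cust": "alice", "sale_date": "2023-03-01", "amount": "120.50"},  # duplicate
]


def extract():
    """Extracting and integrating: pull rows from both sources into one list."""
    return oltp_rows + flat_file_rows


def standardize(rows):
    """Standardizing and transforming: one name style, typed dates and amounts."""
    return [
        {
            "customer": r["cust"].strip().title(),
            "sale_date": date.fromisoformat(r["sale_date"]),
            "amount": float(r["amount"]) if r["amount"] else None,
        }
        for r in rows
    ]


def filter_since(rows, cutoff):
    """Filtering: keep only sales on or after the cutoff date."""
    return [r for r in rows if r["sale_date"] >= cutoff]


def clean(rows):
    """Cleaning: drop rows with a missing amount and exact duplicates."""
    seen, result = set(), []
    for r in rows:
        key = (r["customer"], r["sale_date"], r["amount"])
        if r["amount"] is None or key in seen:
            continue
        seen.add(key)
        result.append(r)
    return result


warehouse_rows = clean(filter_since(standardize(extract()), date(2019, 1, 1)))
print(warehouse_rows)
```

With this sample data only one row survives: the old sale is filtered out, the row without an amount is dropped, and the duplicate is removed once the customer names have been standardized.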
Benefits of Data Warehouse

[Figure: Today: data rich and information poor. Tomorrow: information rich.]

OLTP and OLAP

OLTP                                                                 OLAP
Primary source of information                                        Secondary source of information
Online system                                                        Offline system
Inserts, updates, deletes, selects                                   Selects only (read only)
Data updated online                                                  Data updated periodically
Different systems hold different types of data in different formats  All types of data are integrated into one system
Data organized by application                                        Data organized by dimensions of the business
Data typically not integrated                                        Data must be integrated
Different naming conventions                                         Standard naming conventions
Different file formats                                               Standard file formats
Different hardware platforms                                         One warehouse server (logical server)
Recent or current data                                               Historical data
No time key, no time series analysis                                 Time key, time series analysis
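
The last rows of the comparison (data organized by dimensions, a time key, time series analysis) can be illustrated with a small sketch using Python's built-in `sqlite3` module. The table names `dim_time`, `dim_product` and `fact_sales`, and the sample rows, are hypothetical; the layout is one common way to organize facts by dimension, not something prescribed by the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_time (time_key INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_sales (
    time_key INTEGER REFERENCES dim_time(time_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    amount REAL
);
INSERT INTO dim_time VALUES (1, 2022, 4), (2, 2023, 1);
INSERT INTO dim_product VALUES (1, 'Widget');
INSERT INTO fact_sales VALUES (1, 1, 100.0), (2, 1, 150.0), (2, 1, 50.0);
""")

# Time series analysis: every fact carries a time key, so sales can be
# grouped by any level of the time dimension (here year and quarter).
rows = conn.execute("""
    SELECT t.year, t.quarter, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_time AS t ON t.time_key = f.time_key
    GROUP BY t.year, t.quarter
    ORDER BY t.year, t.quarter
""").fetchall()

for year, quarter, total in rows:
    print(year, quarter, total)
```

Because the query is read-only and works on the time key, it matches the OLAP column: selects only, historical data, and analysis along a business dimension.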
Data Warehouse Environment

[Figure: Any Source (operational data, external data) -> Any Data (warehouse) -> Any Access (query tools, analytical tools, applications)]

OLAP Technologies

The two prominent architectures for OLAP systems are:

• Multidimensional OLAP (MD-OLAP)
• Relational OLAP (ROLAP)

MD-OLAP architectures use a multidimensional database to provide analyses; their main premise is that OLAP is best implemented by storing data multidimensionally. Tools available in the market are Business Objects, Cognos, Brio, Oracle Express, SQL OLAP.

ROLAP architectures access data directly from data warehouses; ROLAP architects believe that OLAP capabilities are best provided directly against the relational database. Tools available in the market are MicroStrategy, Informix, Whitelight.
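
The difference between the two architectures can be sketched on a toy data set. This is only an illustration under simplifying assumptions: the "cube" is a plain Python dictionary of pre-aggregated cells, the relational side uses the built-in `sqlite3` module, and the fact rows are hypothetical. None of it reflects how the commercial tools named above are implemented.

```python
import sqlite3
from collections import defaultdict

# Hypothetical facts: (region, year, amount).
facts = [("North", 2022, 100.0), ("North", 2023, 120.0), ("South", 2023, 80.0)]

# MD-OLAP style: store data multidimensionally. Each cell is addressed by
# its coordinates; None stands for "all values" along that dimension.
cube = defaultdict(float)
for region, year, amount in facts:
    cube[(region, year)] += amount   # base cell
    cube[(region, None)] += amount   # rolled up over year
    cube[(None, year)] += amount     # rolled up over region
    cube[(None, None)] += amount     # grand total

molap_answer = cube[(None, 2023)]    # a lookup, no aggregation at query time

# ROLAP style: aggregate directly against the relational table at query time.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", facts)
rolap_answer = conn.execute(
    "SELECT SUM(amount) FROM sales WHERE year = ?", (2023,)
).fetchone()[0]

print(molap_answer, rolap_answer)
```

Both approaches return the same total for 2023. The sketch shows the trade-off in miniature: the cube pays for aggregation when data is loaded, while the relational query pays for it each time a question is asked.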
Worst practices of BI
1) Assuming the average user has the know-how or time to use BI tools
2) Assuming Excel will become the default BI platform
3) Assuming DW will solve all information access and delivery requirements
4) Selecting a tool without a specific business need
