
DW & OLAP


The term "data warehouse" was coined by Bill Inmon in 1990. A data warehouse is a relational database that is designed for query and analysis rather than for transaction processing.

Data Warehouse

Characteristics

- Subject-oriented: Organized around subjects such as sales, product, customer.
- Integrated: A data warehouse is constructed by integrating multiple heterogeneous sources. Data preprocessing is applied to ensure consistency.
- Time-variant: Provides information from a historical perspective, e.g. the past 5-10 years.
- Non-volatile: Data once recorded cannot be updated. A data warehouse requires two operations in data accessing: initial loading of data and access of data.
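The time-variant and non-volatile characteristics can be illustrated with a minimal sketch. It assumes a hypothetical `WarehouseTable` class and `Snapshot` record (neither comes from the slides) that expose only the two operations named above, loading and access, and keep dated snapshots instead of overwriting values.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: a recorded snapshot cannot be modified
class Snapshot:
    as_of: date
    customer: str
    balance: float


class WarehouseTable:
    """Hypothetical append-only table: it offers loading and reading only."""

    def __init__(self):
        self._rows = []

    def load(self, rows):
        # Loading appends new snapshots; existing rows are never changed.
        self._rows.extend(rows)

    def history(self, customer):
        # Access returns every snapshot for the customer, oldest first.
        matches = [r for r in self._rows if r.customer == customer]
        return sorted(matches, key=lambda r: r.as_of)


table = WarehouseTable()
table.load([
    Snapshot(date(2020, 1, 31), "acme", 100.0),
    Snapshot(date(2021, 1, 31), "acme", 250.0),
])
table.load([Snapshot(date(2022, 1, 31), "acme", 175.0)])

acme_history = table.history("acme")
for snap in acme_history:
    print(snap.as_of, snap.balance)
```

Each later load adds a new dated row, so the historical perspective is preserved rather than replaced by the latest value.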

- Operational database layer: The source data for the data warehouse.
- Data access layer: The interface between the operational and informational access layers. Tools to extract, transform, and load data into the warehouse fall into this layer.
- Metadata layer: The data directory. This is usually more detailed than an operational system data directory.
- Informational access layer: The data accessed for reporting and analyzing, and the tools for reporting and analyzing data.
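As a rough illustration of what an extract, transform, load step in the data access layer does, the sketch below pulls rows from two made-up operational sources with different formats, converts them to one consistent shape, and loads them into an in-memory SQLite table. The source names, field layouts, and the `sales` table are all assumptions for illustration, not part of the slides.

```python
import sqlite3

# Extract: two hypothetical operational sources with inconsistent formats.
crm_rows = [{"cust": " Acme ", "amount_usd": "120.50", "day": "2023-01-05"}]
billing_rows = [("ACME", 7999, "05/01/2023")]  # amount in cents, DD/MM/YYYY


def transform_crm(row):
    # Normalize the customer name and convert the amount to a number.
    return (row["cust"].strip().lower(), float(row["amount_usd"]), row["day"])


def transform_billing(row):
    # Convert cents to currency units and the date to ISO format.
    name, cents, day = row
    dd, mm, yyyy = day.split("/")
    return (name.strip().lower(), cents / 100, f"{yyyy}-{mm}-{dd}")


def run_etl(conn):
    # Load: write the consistent rows into the warehouse table.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer TEXT, amount REAL, sale_date TEXT)"
    )
    rows = [transform_crm(r) for r in crm_rows]
    rows += [transform_billing(r) for r in billing_rows]
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = run_etl(conn)
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM sales GROUP BY customer"
).fetchall()
print(loaded, totals)
```

After the transform step both sources agree on customer naming, units, and date format, which is the kind of consistency the integration characteristic calls for.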

Data warehouse architecture

ETL Tools

Operational data
- Application oriented
- Detailed
- Accurate as of the moment of access
- Serves the clerical community
- Can be updated
- Run repetitively and nonreflectively
- Processing requirements understood before initial development
- Compatible with the software development life cycle
- Performance sensitive (immediate response required when entering a transaction)

DW
- Subject oriented
- Summarized, otherwise refined
- Represents values over time, snapshots
- Serves the managerial community
- Is not updated
- Run heuristically
- Processing requirements not completely understood before development
- Completely different life cycle
- Performance relaxed (immediacy not required)

Operational
- Accessed a unit at a time (limited number of data elements for a single record)
- Transaction driven
- Control of update a major concern in terms of ownership
- High availability
- Managed in its entirety
- Non-redundancy
- Static structure; variable contents
- Small amount of data used in a process

Data warehouse
- Accessed a set at a time (many records of many data elements)
- Analysis driven
- Control of update no issue
- Relaxed availability
- Managed by subsets
- Redundancy is a fact of life
- Flexible structure
- Large amount of data used in a process
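The contrast between unit-at-a-time and set-at-a-time access can be sketched with two queries against the same table. The `orders` table and its values are invented for illustration; a real operational system and warehouse would normally be separate databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders "
    "(order_id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "north", 2022, 100.0),
        (2, "north", 2023, 150.0),
        (3, "south", 2022, 80.0),
        (4, "south", 2023, 120.0),
    ],
)

# Operational style: transaction driven, touches a single record by key.
single_order = conn.execute(
    "SELECT amount FROM orders WHERE order_id = ?", (2,)
).fetchone()

# Warehouse style: analysis driven, reads many records in one pass.
by_region = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()

print(single_order, by_region)
```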

OLAP enables users to analyze different dimensions of multidimensional data. For example, it provides time series and trend analysis views.
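A small sketch of analyzing the same facts along different dimensions: the hypothetical `roll_up` helper below aggregates a made-up fact list by any chosen combination of year, region, and product. Rolling up by year alone gives a simple time series view.

```python
from collections import defaultdict

# Hypothetical facts: (year, region, product, sales).
facts = [
    (2022, "north", "widget", 100),
    (2022, "south", "widget", 80),
    (2023, "north", "widget", 150),
    (2023, "south", "gadget", 120),
]

DIMENSIONS = {"year": 0, "region": 1, "product": 2}


def roll_up(rows, *dims):
    """Sum the sales measure over every dimension not listed in dims."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[DIMENSIONS[d]] for d in dims)
        totals[key] += row[3]
    return dict(totals)


sales_by_year = roll_up(facts, "year")  # time series view
sales_by_year_region = roll_up(facts, "year", "region")  # finer-grained view

print(sales_by_year)
print(sales_by_year_region)
```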

OLAP

- Multidimensional OLAP: Stores data in a multidimensional array that is highly optimized.
- Relational OLAP: Stores the information in the form of rows and columns, serialized by address in a particular sequence. Base tables are created in the underlying database, and new tables created by users are aggregated to connect the relational and multidimensional data in a meaningful way.
- Hybrid OLAP: Aggregates the properties of both.
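The storage difference between the relational and multidimensional approaches can be sketched in a few lines. The data and the two lookup helpers are illustrative assumptions; real OLAP engines use far more elaborate storage and indexing.

```python
years = [2022, 2023]
regions = ["north", "south"]

# Relational style: facts kept as rows of (year, region, amount).
rows = [
    (2022, "north", 100),
    (2022, "south", 80),
    (2023, "north", 150),
    (2023, "south", 120),
]

# Multidimensional style: the same facts in a dense array indexed by
# position along each dimension (year index, then region index).
cube = [[0] * len(regions) for _ in years]
for year, region, amount in rows:
    cube[years.index(year)][regions.index(region)] = amount


def rolap_lookup(year, region):
    # Scans the rows and filters on the dimension values.
    return sum(a for y, r, a in rows if y == year and r == region)


def molap_lookup(year, region):
    # Jumps straight to the cell addressed by the dimension positions.
    return cube[years.index(year)][regions.index(region)]


print(rolap_lookup(2023, "north"), molap_lookup(2023, "north"))
```

Both lookups return the same value; the difference is that the array answers by position while the row form answers by filtering.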

OLAPs

Types of OLAP

A process that uses a variety of data analysis tools to discover patterns and relationships in data that may be used to make valid predictions.

Data Mining

The steps in data mining are to describe the data (summarize its statistical attributes), build a predictive model based on patterns determined from known results, and then test that model on results outside the original sample.
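The three steps (describe, build, test) can be walked through on a toy data set. Everything here is assumed for illustration: the debt-to-income figures are invented, and the "model" is just a midpoint threshold, which stands in for whatever technique a real project would use.

```python
from statistics import mean

# Hypothetical records: (debt_to_income_ratio, defaulted).
known = [
    (0.10, False), (0.15, False), (0.20, False),
    (0.45, True), (0.50, True), (0.55, True),
]
holdout = [(0.12, False), (0.48, True), (0.30, False), (0.60, True)]

# Step 1: describe the data by summarizing a statistical attribute.
mean_defaulted = mean(ratio for ratio, defaulted in known if defaulted)
mean_repaid = mean(ratio for ratio, defaulted in known if not defaulted)

# Step 2: build a predictive model from the known results.
# Here the model is simply the midpoint between the two group means.
threshold = (mean_defaulted + mean_repaid) / 2


def predict(ratio):
    return ratio > threshold


# Step 3: test the model on results outside the original sample.
hits = [predict(ratio) == defaulted for ratio, defaulted in holdout]
accuracy = sum(hits) / len(hits)

print(threshold, accuracy)
```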

An analyst might want to determine the factors that lead to loan defaults.
- Query and report tools describe what is in a database.
- OLAP goes further; it is used to answer why certain things are true.
- Data mining is different from OLAP: rather than verify hypothetical patterns, it uses the data itself to uncover such patterns.

Example

Data mining parameters include:


- Association: looking for patterns where one event is connected to another event.
- Sequence or path analysis: looking for patterns where one event leads to another later event.
- Classification: looking for new patterns (this may result in a change in the way the data is organized, which is acceptable).
- Clustering: finding and visually documenting groups of facts not previously known.
- Forecasting: discovering patterns in data that can lead to reasonable predictions about the future.
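For the association parameter, one common way to quantify "one event is connected to another" is with support and confidence. The slides do not name these measures, so treat this as one possible sketch; the basket data is made up.

```python
# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    containing = sum(1 for t in transactions if itemset <= t)
    return containing / len(transactions)


def confidence(antecedent, consequent):
    """Of the transactions with the antecedent, the share with the consequent."""
    return support(antecedent | consequent) / support(antecedent)


support_bread_milk = support({"bread", "milk"})
confidence_bread_to_milk = confidence({"bread"}, {"milk"})

print(support_bread_milk, confidence_bread_to_milk)
```

A high confidence for "bread implies milk" would suggest the two purchases are connected; whether it is high enough to act on is a judgment call outside this sketch.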

A broad category of technologies that allows for gathering, storing, accessing, and analyzing data to help business users make better decisions by analyzing business performance through data-driven insight.

Business Intelligence

It includes the activities of:
- decision support systems
- query and reporting
- online analytical processing (OLAP)
- statistical analysis, forecasting, and data mining

Closed loop model: Track, Analyze, Model, Decide, Act

BI Architecture: Information Architecture, Data Architecture, Technical Architecture, Product Architecture
