
Synopsis

Data warehouse tools

Sourav Dutta Dept : CSE 3rd yr. Roll : 17 St. Thomas College of Engg. & Tech.

What is a data warehouse?


A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management's decision-making process. It is a collection of corporate information, derived directly from operational systems and some external data sources, for data management and data analysis. Its specific purpose is to support business decisions, not business operations. Physically, it is a repository where relational data are specially organized to provide enterprise-wide, cleansed data in a standardized format.

Goal: To integrate enterprise-wide corporate data into a single repository from which users can easily run queries.

The Purpose of Data Warehousing :

The purpose is to realize the value of data: data and information are an asset that supports a company's decision-making process. A data warehouse (DW) is a database used for reporting. The data is offloaded from the operational systems for reporting. The data may pass through an operational data store for additional operations before it is used in the DW for reporting and analysis.

A data warehouse is a place where data is stored for archival, analysis and security purposes. Usually a data warehouse is either a single computer or many computers (servers) tied together to create one giant computer system. The data can be raw or formatted, and it can cover various topics, including an organization's sales, salaries, operational data, summaries of data including reports, copies of data, human resource data, inventory data, external data to provide simulations and analysis, etc.

Role of a data warehouse in a company :

Besides being a storehouse for large amounts of data, a data warehouse must have systems in place that make it easy to access the data and use it in day-to-day operations. A data warehouse is sometimes said to be a major role player in a decision support system (DSS). DSS is a technique used by organizations to come up with facts, trends or relationships that can help them make effective decisions or create effective strategies to accomplish their organizational goals.

Properties of a Data warehouse :


Subject-oriented: The warehouse is organized around the major subjects of the enterprise rather than the major application areas. This is reflected in the need to store decision-support data rather than application-oriented data.

Integrated: The source data come together from different enterprise-wide application systems. The source data is often inconsistent, so the integrated data source must be made consistent to present a unified view of the data to the users.

Time-variant: The source data in the warehouse is only accurate and valid at some point in time or over some time interval. The time-variance of the data warehouse is also shown in the extended time that the data is held, the implicit or explicit association of time with all data, and the fact that the data represents a series of snapshots.

Non-volatile: Data is not updated in real time but is refreshed from the operational systems on a regular basis. New data is always added as a supplement to the database, rather than as a replacement. The database continually absorbs this new data, incrementally integrating it with previous data.

Data warehouses have the following distinctive characteristics :

multidimensional conceptual view
generic dimensionality
unrestricted cross-dimensional operations
client-server architecture
multi-user support
accessibility
transparency
intuitive data manipulation
consistent and flexible reporting performance
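The time-variant and non-volatile properties described above can be illustrated with a small sketch. It assumes a hypothetical `stock_snapshot` table in an in-memory SQLite database; the table, the function names and the sample quantities are illustrative only and do not come from any particular warehouse product. Each refresh appends a dated snapshot and never updates or deletes earlier rows.

```python
import sqlite3


def refresh(warehouse, snapshot_date, balances):
    """Append one dated snapshot; earlier snapshots are left untouched."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS stock_snapshot "
        "(snapshot_date TEXT, item TEXT, quantity INTEGER)"
    )
    warehouse.executemany(
        "INSERT INTO stock_snapshot VALUES (?, ?, ?)",
        [(snapshot_date, item, qty) for item, qty in balances.items()],
    )
    warehouse.commit()


def history(warehouse, item):
    """Return the series of snapshots held for one item, oldest first."""
    return warehouse.execute(
        "SELECT snapshot_date, quantity FROM stock_snapshot "
        "WHERE item = ? ORDER BY snapshot_date",
        (item,),
    ).fetchall()


# Hypothetical month-end refreshes from an operational system.
warehouse = sqlite3.connect(":memory:")
refresh(warehouse, "2024-01-31", {"pen": 120, "book": 40})
refresh(warehouse, "2024-02-29", {"pen": 95, "book": 55})

pen_history = history(warehouse, "pen")
print(pen_history)
```

Because rows are only ever inserted, the January figure remains available after the February refresh, which is what lets the warehouse answer questions about how a value changed over time.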

A general application of a data warehouse :


Problem: ABC Pvt Ltd is a company with branches at Mumbai, Delhi, Chennai and Bangalore. The Sales Manager wants a quarterly sales report. Each branch has a separate operational system.

Solution:
Extract sales information from each database.
Store the information in a common repository (data warehouse) at a single site.
Extract the data needed for analysis from the operational databases.
Refresh the warehouse at regular intervals so that it contains up-to-date information for analysis.
The warehouse will contain data with a historical perspective.
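A minimal sketch of this solution is given below. It assumes that each branch's operational system can be read as a database with a `sales` table; here the four branch systems are simulated with in-memory SQLite databases. The table names, column names and sample amounts are hypothetical and serve only to show the flow from separate branch databases to one quarterly report.

```python
import sqlite3


def make_branch_db(rows):
    """Simulate one branch's operational database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (sale_date TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def load_warehouse(warehouse, branch_dbs):
    """Extract sales from every branch and store them in one common table."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact "
        "(branch TEXT, sale_date TEXT, amount REAL)"
    )
    for branch, conn in branch_dbs.items():
        rows = conn.execute("SELECT sale_date, amount FROM sales").fetchall()
        warehouse.executemany(
            "INSERT INTO sales_fact VALUES (?, ?, ?)",
            [(branch, sale_date, amount) for sale_date, amount in rows],
        )
    warehouse.commit()


def quarterly_report(warehouse):
    """Total sales per branch, year and quarter."""
    return warehouse.execute(
        """
        SELECT branch,
               CAST(strftime('%Y', sale_date) AS INTEGER) AS year,
               (CAST(strftime('%m', sale_date) AS INTEGER) + 2) / 3 AS quarter,
               SUM(amount) AS total
        FROM sales_fact
        GROUP BY branch, year, quarter
        ORDER BY branch, year, quarter
        """
    ).fetchall()


# Hypothetical sample data for the four branches.
branch_dbs = {
    "Mumbai": make_branch_db(
        [("2023-01-15", 100.0), ("2023-02-10", 50.0), ("2023-04-02", 70.0)]
    ),
    "Delhi": make_branch_db([("2023-03-30", 200.0)]),
    "Chennai": make_branch_db([("2023-05-20", 80.0)]),
    "Bangalore": make_branch_db([("2023-01-01", 30.0)]),
}

warehouse = sqlite3.connect(":memory:")
load_warehouse(warehouse, branch_dbs)
report = quarterly_report(warehouse)
for row in report:
    print(row)
```

In practice the branch systems would likely differ in schema and technology, so the extraction step would need one adapter per source; the sketch keeps them identical for brevity.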

Steps for building Data Warehouse :


Data Selection.
Data Preprocessing: fill missing values and remove inconsistency.
Data Transformation & Integration.
Loading Data to the warehouse.

Stocking the data warehouse with data is often the most time-consuming task needed to make data warehousing and business intelligence a success. In the overall scheme of things, Extract-Transform-Load (ETL) often requires about 70 percent of the total effort.
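The four steps above can be sketched as a small pipeline. This is an illustration under stated assumptions, not a description of any ETL product: the `RAW` records, the field names, the choice of the mean as the fill value, and the `sales_summary` table are all hypothetical.

```python
import sqlite3
from statistics import mean

# Hypothetical raw records from an operational source.
RAW = [
    {"branch": "mumbai ", "product": "Pen", "amount": 10.0, "internal_note": "x"},
    {"branch": "MUMBAI", "product": "pen", "amount": None, "internal_note": "y"},
    {"branch": "Delhi", "product": "Book", "amount": 20.0, "internal_note": "z"},
]


def select(records, fields):
    """Data selection: keep only the fields needed for analysis."""
    return [{f: r[f] for f in fields} for r in records]


def fill_missing(records, field):
    """Preprocessing: replace missing values with the mean of the known ones.

    Assumes at least one record has a known value for the field.
    """
    default = mean(r[field] for r in records if r[field] is not None)
    return [
        {**r, field: default if r[field] is None else r[field]} for r in records
    ]


def remove_inconsistency(records):
    """Preprocessing: normalize spelling so equal values compare equal."""
    return [
        {**r, "branch": r["branch"].strip().title(),
         "product": r["product"].strip().title()}
        for r in records
    ]


def transform(records):
    """Transformation and integration: total amount per branch and product."""
    totals = {}
    for r in records:
        key = (r["branch"], r["product"])
        totals[key] = totals.get(key, 0.0) + r["amount"]
    return [(branch, product, total)
            for (branch, product), total in sorted(totals.items())]


def load(conn, rows):
    """Loading: write the transformed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_summary "
        "(branch TEXT, product TEXT, total REAL)"
    )
    conn.executemany("INSERT INTO sales_summary VALUES (?, ?, ?)", rows)
    conn.commit()


selected = select(RAW, ["branch", "product", "amount"])
cleaned = remove_inconsistency(fill_missing(selected, "amount"))
rows = transform(cleaned)

conn = sqlite3.connect(":memory:")
load(conn, rows)
print(rows)
```

Even in this toy form, most of the code deals with selection, cleaning and transformation rather than loading, which is consistent with the observation that ETL takes the larger share of the effort.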

The must-have features of a data warehouse tool :


A data warehouse tool must support the administration and management of such a complex environment. For the various types of meta-data and the day-to-day operations of the data warehouse, the administration and management tools must be capable of supporting the following tasks:

monitoring data loading from multiple sources
data quality and integrity checks
managing and updating meta-data
monitoring database performance to ensure efficient query response times and resource utilization
auditing data warehouse usage to provide user chargeback information
replicating, subsetting, and distributing data
maintaining efficient data storage management
archiving and backing-up data
implementing recovery following failure
security management
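One of the tasks listed above, auditing data warehouse usage to provide chargeback information, could be approached roughly as follows. The sketch assumes a hypothetical `query_audit` table and a wrapper function through which all user queries pass; real tools typically expose their own audit facilities, so the names here are illustrative only.

```python
import sqlite3
import time


def setup_audit(conn):
    """Create the (hypothetical) audit table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS query_audit "
        "(user_name TEXT, query_text TEXT, rows_returned INTEGER, seconds REAL)"
    )


def audited_query(conn, user_name, sql, params=()):
    """Run a query and record who ran it, its size and its elapsed time."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    elapsed = time.perf_counter() - start
    conn.execute(
        "INSERT INTO query_audit VALUES (?, ?, ?, ?)",
        (user_name, sql, len(rows), elapsed),
    )
    conn.commit()
    return rows


def usage_by_user(conn):
    """Summarize usage per user: number of queries and total rows returned."""
    return conn.execute(
        "SELECT user_name, COUNT(*), SUM(rows_returned) "
        "FROM query_audit GROUP BY user_name ORDER BY user_name"
    ).fetchall()


# Hypothetical warehouse table and users.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (branch TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)", [("Mumbai", 100.0), ("Delhi", 200.0)]
)
setup_audit(conn)

audited_query(conn, "asha", "SELECT * FROM sales")
audited_query(conn, "asha", "SELECT * FROM sales WHERE branch = ?", ("Delhi",))
audited_query(conn, "ravi", "SELECT COUNT(*) FROM sales")

usage = usage_by_user(conn)
print(usage)
```

The recorded `seconds` column could also feed the performance-monitoring task, since slow queries show up in the same table.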

Choosing a suitable data-warehouse tool :


A data warehouse requires a method of adding data to the warehouse. An extraction, transform, and load (ETL) tool is typically used for this purpose. The data may need to be normalized or modified for consistency or to match the warehouse database structure. Loading the data is critical, as all the relationships and connections to other databases must be maintained to ensure the integrity of the database, so that it can be used with other data warehousing tools.

Every data warehouse contains a vast number of database tables. These tables are organized to work with each other in a logical way. The maintenance of these tables is essential to the continuing operation and accuracy of the data warehouse.

A data integrity function is standard in most data warehousing tools. These modules are often extremely complex to use, with multiple options and functions available. Data integrity tools check for consistency within the data and for accurate connections between databases. Poor data integrity will result in a data warehouse that provides inaccurate reports.
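The kind of check a data integrity function performs can be sketched as follows. The example assumes two hypothetical tables, `products` and `sales`, and looks for two problems: sales rows that point to a product that does not exist (a broken connection between tables) and sales rows with a missing or negative amount (an inconsistency within the data). Real integrity modules offer far more options; this only illustrates the idea.

```python
import sqlite3


def find_orphans(conn):
    """Sales rows whose product_id has no matching row in products."""
    return conn.execute(
        """
        SELECT s.sale_id, s.product_id
        FROM sales AS s
        LEFT JOIN products AS p ON p.product_id = s.product_id
        WHERE p.product_id IS NULL
        ORDER BY s.sale_id
        """
    ).fetchall()


def find_invalid_amounts(conn):
    """Sales rows whose amount is missing or negative."""
    return conn.execute(
        "SELECT sale_id FROM sales "
        "WHERE amount IS NULL OR amount < 0 ORDER BY sale_id"
    ).fetchall()


# Hypothetical warehouse tables with two deliberate defects.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales (sale_id INTEGER PRIMARY KEY,
                        product_id INTEGER, amount REAL);
    INSERT INTO products VALUES (1, 'Pen'), (2, 'Book');
    INSERT INTO sales VALUES (10, 1, 5.0), (11, 3, 7.5), (12, 2, -1.0);
    """
)

orphans = find_orphans(conn)
invalid = find_invalid_amounts(conn)
print("orphaned sales:", orphans)
print("invalid amounts:", invalid)
```

A report built on these tables without such checks would silently drop or miscount sale 11 and sale 12, which is how poor integrity turns into inaccurate reports.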

A query is simply a programmed question or report request. There is an entire business process surrounding the creation of a data warehouse query. This process requires in-depth knowledge and understanding of the business needs, as well as the data structures within the data warehouse.

Examples of Data Warehousing Tools :


Data Warehouse :
Computer Associates -- CA-Ingres
Hewlett-Packard -- Allbase/SQL
Microsoft -- SQL Server
Oracle -- Oracle7, Oracle Parallel Server
SAS Institute -- SAS
Sybase -- SQL Server, IQ, MPP
SQL Server 2000 DTS
Oracle 8i Warehouse Builder

Extraction and Transformation tools :
Microsoft -- Plato
Oracle -- Express
Prism Solutions -- Prism Warehouse Manager
SAS Institute -- SAS/EIS, OLAP++
SQL Server Analysis Services
Oracle Express Server

Reporting tools :
MS Excel Pivot Chart
VB Applications
Oracle -- Discoverer2000
Andyne Computing -- GQL

Advantages & Disadvantages :

Data warehouses are not the optimal environment for unstructured data. Because data must be extracted, transformed and loaded into the warehouse, there is an element of latency in data warehouse data.

The number one reason to implement a data warehouse is so that employees or end users can access it and use the data for reports, analysis and decision making. Using the data in a warehouse can help you locate trends, focus on relationships and understand more about the environment that your business operates in.

Implementing a data warehouse is a tough job, and managing it is tougher. The system resource requirement to build a data warehouse is also very high. Accessing and modifying a data warehouse incurs a high system overhead, so an extremely powerful machine is needed. You might also have a problem with current systems being incompatible with your data. It is also important to consider future equipment and software upgrades; these may also need to be compatible with your data. Finally, security might be a huge concern, especially if your data is accessible over an open network such as the internet. You do not want your data to be viewed by your competitor or, worse, hacked and destroyed.

Conclusion :

Implementing a data warehouse is not a one-time project, but a long-term commitment to continuously improving business intelligence practices. Data warehousing is a somewhat complicated field. There are many vendors advertising tools, but the cost and complexity of the products have kept them from being used by a large number of companies. Any company that is thinking of using a data warehouse must take the time to review and understand the technology. It can only be useful if you know how to use it. Once you understand and acquire the technology, it is possible to gain a powerful advantage over your competitors.
