This is the competitive advantage, and also the challenge. All of these are goals of Business Intelligence (BI).
Data from multiple sources (usually OLTP systems or flat files) is extracted and integrated into a common repository called a data warehouse.
To build the data warehouse, we need to integrate data from multiple data sources. These sources may be databases or flat files.
Even though these sources may hold similar kinds of data, there may be considerable differences in the following ways:
Different sources may use different attribute names to represent the same data element. We need to find the semantically equivalent attributes across the data sources so that we can represent all of them as a single attribute in the data warehouse.
Sometimes different data sources may use the same attribute name to represent semantically different data elements. We should resolve this before we load data into the data warehouse.
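The attribute-matching step described above can be sketched as a rename map applied to each incoming record; all of the source names, attribute names, and records below are hypothetical, chosen only for illustration:

```python
# Map (source system, source attribute) -> canonical warehouse attribute.
# Two sources use different names ("cust_nm", "customer") for the same element.
ATTRIBUTE_MAP = {
    ("crm", "cust_nm"): "customer_name",
    ("billing", "customer"): "customer_name",
    ("crm", "dob"): "birth_date",
}

def integrate(source, record):
    """Rename a source record's attributes to the warehouse's canonical names."""
    return {ATTRIBUTE_MAP.get((source, key), key): value
            for key, value in record.items()}

row = integrate("crm", {"cust_nm": "Ada", "dob": "1990-01-01"})
# row == {"customer_name": "Ada", "birth_date": "1990-01-01"}
```

A real integration layer would also reconcile data types, units, and encodings, but the core idea is the same: semantically equivalent attributes are funneled into one warehouse attribute.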
OLTP vs OLAP
OLTP System deals with operational data. Operational data are those data
involved in the operation of a particular system.
Operational Data
Operational data are usually of local relevance
Frequent Updates
Normalized Tables
Point Query
OLAP deals with historical or archival data: data that have been archived over a long period of time.
Example: if we collect the last 10 years' data about flight reservations, the data can give us much meaningful information, such as trends in reservations. This may yield useful insights such as the peak time of travel and what kinds of people travel in the various classes (Economy/Business).
How is the profit changing over the years across different regions?
Is it financially viable to continue the production unit at location X?
During the physical design process, you convert the data gathered during the logical design
phase into a description of the physical database structure. Physical design decisions are mainly
driven by query performance and database maintenance aspects. For example, choosing a
partitioning strategy that meets common query requirements enables Oracle Database to take
advantage of partition pruning, a way of narrowing a search before performing it.
Physical Design
During the logical design phase, you defined a model for your data warehouse consisting of
entities, attributes, and relationships. The entities are linked together using relationships.
Attributes are used to describe the entities. The unique identifier (UID) distinguishes between
one instance of an entity and another.
Figure 3-1 illustrates a graphical way of distinguishing between logical and physical designs.
During the physical design process, you translate the expected schemas into actual database
structures. At this time, you have to map:
Entities to tables
Relationships to foreign key constraints
Attributes to columns
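This entity-to-table mapping can be sketched with Python's built-in sqlite3 module; the region/sales schema below is a hypothetical example, not one from the text:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce referential integrity

# Entity -> table, attributes -> columns, unique identifier (UID) -> primary key
conn.execute("""CREATE TABLE region (
    region_id   INTEGER PRIMARY KEY,
    region_name TEXT NOT NULL)""")

# Relationship -> foreign key constraint linking the two tables
conn.execute("""CREATE TABLE sales (
    sale_id   INTEGER PRIMARY KEY,
    amount    REAL,
    region_id INTEGER REFERENCES region(region_id))""")
```

Each logical construct lands in exactly one physical structure: the entity becomes a table, each attribute a column, and the relationship a foreign key the database can enforce.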
Once you have converted your logical design to a physical one, you will need to create some or
all of the following structures:
Tablespaces
Tables and Partitioned Tables
Views
Integrity Constraints
Dimensions
Some of these structures require disk space. Others exist only in the data dictionary. Additionally,
the following structures may be created for performance improvement:
Indexes and Partitioned Indexes
Materialized Views
Tablespaces
A tablespace consists of one or more datafiles, which are physical structures within the operating
system you are using. A datafile is associated with only one tablespace. From a design
perspective, tablespaces are containers for physical design structures.
Tablespaces should separate objects whose characteristics differ. For example, tables should be separated from their indexes, and small tables should be separated from large tables. Tablespaces should also represent logical business units if possible. Because a tablespace is the coarsest granularity for backup and recovery or the transportable tablespaces mechanism, the logical business design affects availability and maintenance operations.
You can now use ultralarge data files, a significant improvement in very large databases.
Tables and Partitioned Tables
Tables are the basic unit of data storage. They are the container for the expected amount of raw data in your data warehouse.
Using partitioned tables instead of nonpartitioned ones addresses the key problem of supporting
very large data volumes by allowing you to decompose them into smaller and more manageable
pieces. The main design criterion for partitioning is manageability, though you will also see
performance benefits in most cases because of partition pruning or intelligent parallel processing.
For example, you might choose a partitioning strategy based on a sales transaction date and a
monthly granularity. If you have four years' worth of data, you can delete a month's data as it
becomes older than four years with a single, fast DDL statement and load new data while only
affecting 1/48th of the complete table. Business questions regarding the last quarter will only
affect three months, which is equivalent to three partitions, or 3/48ths of the total volume.
Partitioning large tables improves performance because each partitioned piece is more
manageable. Typically, you partition based on transaction dates in a data warehouse. For
example, each month, one month's worth of data can be assigned its own partition.
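The monthly partitioning scheme above can be sketched in plain Python as a toy model (it stands in for actual partition DDL, which varies by database):

```python
from collections import defaultdict
from datetime import date

# Partition key (year, month) -> the rows stored in that partition.
partitions = defaultdict(list)

def insert(row):
    d = row["txn_date"]
    partitions[(d.year, d.month)].append(row)

insert({"txn_date": date(2021, 3, 5), "amount": 10.0})
insert({"txn_date": date(2021, 3, 9), "amount": 4.0})
insert({"txn_date": date(2017, 1, 2), "amount": 7.0})

# Aging out a month touches only its partition, like a single fast DDL drop:
del partitions[(2017, 1)]

# Partition pruning: a query about March 2021 scans one partition, not the table.
march_total = sum(r["amount"] for r in partitions[(2021, 3)])  # 14.0
```

The point of the sketch is the access pattern: both the delete and the query are confined to the partitions whose keys match, which is exactly what makes partitioned tables manageable at scale.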
Table Compression
You can save disk space by compressing heap-organized tables. Partitioned tables are typical candidates for table compression.
To reduce disk use and memory use (specifically, the buffer cache), you can store tables and
partitioned tables in a compressed format inside the database. This often leads to a better scaleup
for read-only operations. Table compression can also speed up query execution. There is,
however, a cost in CPU overhead.
Table compression should be used with highly redundant data, such as tables with many foreign
keys. You should avoid compressing tables with much update or other DML activity. Although
compressed tables or partitions are updatable, there is some overhead in updating these tables,
and high update activity may work against compression by causing some space to be wasted.
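Why redundant data compresses so well can be demonstrated with a general-purpose compressor from the standard library; this is an analogy for the database's block-level compression, not the same algorithm:

```python
import zlib

# Simulate a highly redundant column: the same foreign-key-like values
# repeated over many rows (values are made up for the demonstration).
redundant = ("DE,ELECTRONICS,WAREHOUSE_7\n" * 10_000).encode()

compressed = zlib.compress(redundant)
# Repeated values shrink dramatically; unique, volatile data would not.
```

The same intuition explains the DML caveat in the text: updates scatter new, non-repeating values into compressed storage, eroding exactly the redundancy that made compression pay off.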
Views
A view is a tailored presentation of the data contained in one or more tables or other views. A
view takes the output of a query and treats it as a table. Views do not require any space in the
database.
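The "query treated as a table" idea can be shown with sqlite3; the sales schema is a hypothetical example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("EU", 100.0), ("EU", 50.0), ("US", 70.0)])

# A view stores only the query text, not result rows, so it needs no data space.
conn.execute("""CREATE VIEW region_sales AS
                SELECT region, SUM(amount) AS total
                FROM sales GROUP BY region""")

rows = conn.execute("SELECT * FROM region_sales ORDER BY region").fetchall()
# rows == [('EU', 150.0), ('US', 70.0)]
```

Because the view is re-evaluated on every access, it always reflects the current contents of `sales`; this is the key contrast with the materialized views discussed later.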
Integrity Constraints
Integrity constraints are used to enforce business rules associated with your database and to
prevent invalid information in the tables. Integrity constraints in data warehousing differ
from constraints in OLTP environments. In OLTP environments, they primarily prevent the
insertion of invalid data into a record; this is less of a problem in data warehousing
environments because accuracy has typically been ensured before loading. In data warehousing environments,
constraints are mostly used for query rewrite. NOT NULL constraints are particularly common in data
warehouses. Under some specific circumstances, constraints need space in the database. These
constraints are in the form of the underlying unique index.
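A small sqlite3 sketch (hypothetical product table) shows both points: the constraint rejecting invalid data, and a unique constraint being backed by an index that occupies space:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    sku        TEXT NOT NULL UNIQUE)""")  # UNIQUE is enforced via an index

conn.execute("INSERT INTO product VALUES (1, 'A-100')")
try:
    conn.execute("INSERT INTO product VALUES (2, NULL)")  # NOT NULL rejects this
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

In a warehouse, the load pipeline would normally catch such rows before they reach the table, which is why the text says constraints there serve query rewrite more than data policing.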
See Also:
Chapter 7, "Integrity Constraints"
Indexes and Partitioned Indexes
Indexes are optional structures associated with tables or clusters. In addition to the classical B-tree indexes, bitmap indexes are very common in data warehousing environments. Bitmap indexes are optimized index structures for set-oriented operations. Additionally, they are necessary for some optimized data access methods such as star transformations.
Indexes are just like tables in that you can partition them, although the partitioning strategy is not
dependent upon the table structure. Partitioning indexes makes it easier to manage the data
warehouse during refresh and improves query performance.
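A toy model of the bitmap-index idea, using Python integers as bit vectors (real bitmap indexes add compression and many other refinements):

```python
# One bit vector per distinct value of a low-cardinality column,
# here the travel class from the flight-reservation example.
rows = ["Economy", "Business", "Economy", "Economy", "Business"]

bitmaps = {}
for i, value in enumerate(rows):
    bitmaps[value] = bitmaps.get(value, 0) | (1 << i)  # set bit i

# Set-oriented predicates become cheap bitwise logic instead of row scans:
economy = bitmaps["Economy"]                      # bits 0, 2, 3 set
either = bitmaps["Economy"] | bitmaps["Business"] # OR combines predicates

matching_rows = [i for i in range(len(rows)) if (economy >> i) & 1]
# matching_rows == [0, 2, 3]
```

This is why bitmap indexes suit warehouse queries: combining several WHERE conditions reduces to ANDing and ORing bit vectors, which is fast regardless of how many rows qualify.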
See Also:
Chapter 6, " Indexes" and Chapter 15, " Maintaining the Data
Warehouse"
Materialized Views
Materialized views are query results that have been stored in advance so long-running
calculations are not necessary when you actually execute your SQL statements. From a physical
design point of view, materialized views resemble tables or partitioned tables and behave like
indexes in that they are used transparently and improve performance.
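SQLite has no materialized views, but the precompute-and-store idea can be emulated with an ordinary table built from the query (hypothetical sales schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("EU", 100.0), ("US", 70.0), ("EU", 50.0)])

# Emulated materialized view: the aggregation runs once and its rows are stored.
conn.execute("""CREATE TABLE mv_region_sales AS
                SELECT region, SUM(amount) AS total
                FROM sales GROUP BY region""")

# Later queries read the stored result instead of re-aggregating the base table.
total = conn.execute(
    "SELECT total FROM mv_region_sales WHERE region = 'EU'").fetchone()
# total == (150.0,)
```

Unlike a real materialized view, this emulation is not used transparently by the optimizer and must be refreshed by hand when `sales` changes, which is precisely the machinery databases with materialized views provide for you.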
Dimensions
Granularity is the extent to which a system is broken down into small parts, either the system
itself or its description or observation. It is the extent to which a larger entity is subdivided. For
example, a yard broken into inches has finer granularity than a yard broken into feet.
The terms granularity, coarse, and fine are relative, used when comparing systems or
descriptions of systems. An example of increasingly fine granularity: a list of nations in the
United Nations, a list of all states/provinces in those nations, a list of all counties in those states,
etc.
Decision Support System (DSS)
A decision support system (DSS) is a computer program application that analyzes business data and presents it so that users can make business decisions more easily. It is an "informational application" (to distinguish it from an "operational application" that collects the data in the course of normal business operation).
A decision support system may present information graphically and may include an expert
system or artificial intelligence (AI). It may be aimed at business executives or some other group
of knowledge workers.
Figure 1.13. The system development life cycle for the data warehouse environment is almost exactly the opposite of the classical SDLC.
DATA MODELING TECHNIQUES FOR A DATA WAREHOUSE
Chapter 6. Data Modeling for a Data Warehouse
6.1 Why Data Modeling Is Important
    Visualization of the business world
    The essence of the data warehouse architecture
    Different approaches of data modeling
6.2 Data Modeling Techniques
6.3 ER Modeling
6.3.1 Basic Concepts
6.3.1.1 Entity
6.3.1.2 Relationship
6.3.1.3 Attributes
6.3.1.4 Other Concepts
6.3.2 Advanced Topics in ER Modeling
6.3.2.1 Supertype and Subtype
6.3.2.2 Constraints
6.3.2.3 Derived Attributes and Derivation Functions
6.4 Dimensional Modeling
6.4.1 Basic Concepts
6.4.1.1 Fact
6.4.1.2 Dimension
    Dimension Members
    Dimension Hierarchies
6.4.1.3 Measure
6.4.2 Visualization of a Dimensional Model
6.4.3 Basic Operations for OLAP
6.4.3.1 Drill Down and Roll Up
6.4.3.2 Slice and Dice
6.4.4 Star and Snowflake Models
6.4.4.1 Star Model
6.4.4.2 Snowflake Model
6.4.5 Data Consolidation
6.5 ER Modeling and Dimensional Modeling
In general, we can say that a DSS is a computerized system for helping make decisions. A
decision is a choice between alternatives based on estimates of the values of those alternatives.
Supporting a decision means helping people working alone or in a group gather intelligence,
generate alternatives and make choices. Supporting the choice making process involves
supporting the estimation, the evaluation and/or the comparison of alternatives. In practice,
references to DSS are usually references to computer applications that perform such a supporting role.
Online Transaction Processing: OLTP also refers to computer processing in which the computer responds immediately to user requests. An automatic teller machine (ATM) at a bank is an example of transaction processing.
A DSS (decision support system) helps top executives make decisions. It is generally based on historical data. An OLTP (online transaction processing) system is the system where day-to-day transactions are handled. It is based on current data.
OLAP:
1) Stores the historical data of an organization.
2) Data is used for BI / business strategic decisions.
OLTP:
1) Real-time transactional data.
2) Data is used to keep track of transaction details.
Horizon
OLTP databases store live operational information. An invoice, for example, once paid, is possibly moved to some sort of backup store, perhaps upon period closing. On the other side, 5-10 year horizons are usual for strategic analysis to identify trends. Extending the life of operational data would not be enough (besides possibly impacting performance).
Even keeping that data indexed and online for years, you would surely face compatibility problems. It is quite improbable that your current invoice fields and references are the same as those of 10 years ago!
But neither performance nor compatibility is the biggest concern under a large horizon. The real problem is business dynamics. Today business constantly changes, and the traditional entity-relationship approach is too vulnerable to change. I will explore this point further in the next post with a practical example.
Refresh
OLTP requires instant updates. When you withdraw money from an ATM, your balance must be updated immediately. OLAP has no such requirement: nobody needs instant information to make a strategic business decision.
This allows OLAP data to be refreshed daily, which leaves extra time and resources for cleansing and accruing data. If, for example, an invoice was canceled, we wouldn't like to see its value first inflating sales figures and later being reverted.
More time and more resources would also allow better indexing to address huge tables covering
the extended horizon.
This is possibly the most evident difference between the two approaches. OLTP perfectly fits traditional entity-relationship or object-oriented models. We usually refer to information as attributes related to entities, objects, or classes, such as product price, invoice amount, or client name. The mapping can be a simple, one-argument function of the entity.
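The one-argument mapping can be sketched in a few lines of Python; the `Product` class and `price` attribute are illustrative stand-ins for any entity and attribute:

```python
# In an entity-relationship or object model, an attribute behaves as a
# one-argument function of its entity: value = attribute(entity).
class Product:
    def __init__(self, name, price):
        self.name = name
        self.price = price

price = lambda product: product.price  # price: Product -> float

p = Product("widget", 9.99)
value = price(p)  # 9.99
```

Dimensional models break this simple picture: a measure such as sales amount is a function of several arguments at once (product, region, time), which is what the fact/dimension structure captures.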
A dimensional database needs to be designed to support queries that retrieve a large number of records and that summarize data in different ways. A dimensional database tends to be subject oriented and aims to answer questions such as: What products are selling well? At what time of year do certain products sell best? In what regions are sales weakest?
If you attempt to use a database that is designed for OLTP as your data warehouse, query
performance will be very slow and it will be difficult to perform analysis on the data.
The following table summarizes the key differences between OLTP and OLAP
databases:
Many of the problems that businesses attempt to solve are multidimensional in nature. For
example, SQL queries that create summaries of product sales by region, region sales by product,
and so on, might require hours of processing on an OLTP database. However, a dimensional
database could process the same queries in a fraction of the time.
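The "summaries by region, by product, and so on" pattern can be sketched in plain Python; the sales rows below are made-up sample data:

```python
from collections import defaultdict

sales = [
    {"region": "EU", "product": "A", "amount": 10.0},
    {"region": "EU", "product": "B", "amount": 5.0},
    {"region": "US", "product": "A", "amount": 7.0},
]

def summarize(dimension):
    """Total the sales measure along one dimension of the data."""
    totals = defaultdict(float)
    for row in sales:
        totals[row[dimension]] += row["amount"]
    return dict(totals)

by_region = summarize("region")    # {'EU': 15.0, 'US': 7.0}
by_product = summarize("product")  # {'A': 17.0, 'B': 5.0}
```

A dimensional schema makes every such rollup a grouping along a dimension table; an OLTP schema would instead force joins across many normalized tables for each summary.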
Besides the characteristic schema design differences between OLTP and OLAP databases, the
query optimizer typically should be tuned differently for these two types of tasks. For example,
in OLTP operations, the OPTCOMPIND setting (as specified by the environment variable or by
the configuration parameter of that name) should typically be set to zero, to favor nested-loop
joins. OLAP operations, in contrast, tend to be more efficient with an OPTCOMPIND setting of
2 to favor hash-join query plans. For more information, see the OPTCOMPIND environment
variable and the OPTCOMPIND configuration parameter. See the IBM Informix Performance
Guide for additional information about OPTCOMPIND, join methods, and the query optimizer.
IBM Informix also supports the SET ENVIRONMENT OPTCOMPIND statement to change
OPTCOMPIND setting dynamically during sessions in which both OLTP and OLAP operations
are required. See the IBM Informix Guide to SQL: Syntax for more information about the SET
ENVIRONMENT statement of SQL.
Informix is designed to help businesses better leverage their existing information assets as they
move into an on-demand business environment. In this type of environment, mission-critical
database management applications typically require combination systems. The applications need
both online transaction processing (OLTP), and batch and decision support systems (DSS),
including online analytical processing (OLAP).
Archive
An archive is a collection of computer files that have been packaged together for
backup, to transport to some other location, for saving away from the computer so
that more hard disk storage can be made available, or for some other purpose. An
archive can include a simple list of files or files organized under a directory or
catalog structure (depending on how a particular program supports archiving).
Archiving a conversation will hide it from your messages view, while deleting a
conversation from Messages permanently removes the entire conversation and its
history.
To archive a conversation, simply click the "x" next to the conversation. The conversation's history will be preserved, and you will still be able to find it later. If the same person sends you a new message later, the archived conversation will reappear, and the new message will be added to it.
If you click "Delete All" at the bottom of the page, the full conversation history will
be permanently cleared from your messages. You can also check the boxes next to
individual messages and click "Delete Selected" to permanently delete parts of the
conversation.
A program file may require a data file to work; a data file holds information that a program file may use. For example, a program file might be a shortcut that you click on to run a program such as Notepad, while a data file is a *.doc or *.txt file, which contains data.
A program file may only contain binary operation codes, addresses and embedded
data as permitted by the designers of the computer processor that is going to be
executing the program. A data file can be in any format as determined by the
programmers.
Well, a program file contains code that can be translated by a compiler and run as a
program on a computer.
A program file refers to sets of instructions that are put together to build up the
application you will be using. Several files are linked between each other to see the
application that pops in your screen.
A data file is basically the application's input, as it provides what is needed for your program to work and to convert single pieces of data into understandable information.
Answer: Files with ".log" and ".txt" extensions are both plain text files. This means they can
both be viewed with a standard text editor like Notepad for Windows or TextEdit for Mac OS X.
The difference between the two file types is that .LOG files are typically generated
automatically, while .TXT files are created by the user. For example, when a software installer is
run, it may create a log file that contains a log of files that were installed. Log files typically have
one entry per line, which includes information such as the filename, the action (created, moved,
deleted, etc.), and the location of the file.
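Parsing such a one-entry-per-line log is straightforward; the comma-separated layout and the sample line below are hypothetical, since real installers each use their own format:

```python
# Hypothetical installer-log entry: filename, action, location on one line.
line = "setup.dll,created,C:\\Program Files\\App"

# Split on the first two commas only, so commas in the path would survive.
filename, action, location = line.split(",", 2)
# filename == 'setup.dll', action == 'created'
```

Splitting with a maximum count is the detail that matters here: the trailing field (the location) may itself contain the delimiter, so it must be kept whole.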
Abstract (summary)
The terms précis or synopsis are used in some publications to refer to the same thing that other publications might call an "abstract". In management reports, an executive summary usually contains more information (and often more sensitive information) than the abstract does.