
BIO

PRESENTATION

T9
11/17/2005 11:15 AM

INTELLIGENCE TESTING:
TECHNIQUES FOR VALIDATING A
DATA WAREHOUSE SYSTEM
Geoff Horne
iSQA

International Conference On
Software Testing Analysis & Review
November 14-18, 2005
Anaheim, CA USA
Geoff Horne
Geoff Horne is the Managing Director of iSQA which is based in New Zealand and
enjoys an international clientele. He has over 25 years' experience in IT, including
software development, sales and marketing, and IT and project management. In
1994, almost by accident, he found himself involved in testing a complex fault
management system which led to further testing and QA assignments covering a
wide range of applications and tools. iSQA was subsequently founded to bring a full
range of testing consultancy services to the IT industry.

With iSQA, Geoff specialises in testing strategies, methodologies, planning and
project management, taking on selected assignments personally. He has also written
a variety of white papers on the subject of software testing and has been a regular
speaker at the STAR testing conferences. Geoff is married with four children and in his
spare time (which there is not a lot of) enjoys writing and recording original
contemporary music.
Independent Software Quality Assurance Ltd

www.isqa.com

Intelligence Testing:
Techniques for Validating a
Data Warehousing System

Geoff Horne - iSQA



So what is testing?

“Examination of a program by executing it to verify that it satisfies
specified requirements or to verify differences between expected
and actual results.”

- Dave Lutzker, Innovative Technologies



Its value?
“The value of testing equals the sum of all of the changes in
our confidence about the system behaviours. If we test it
10x we have a confidence. If we test it 100x, we have a
higher level of confidence.”

and the result?

“The output of testing is information about the behaviour of
the system. The result is NOT a quality product - you
cannot test quality into a product.”
- Dave Lutzker, Innovative Technologies

The Cost of Correcting an Incident - by phase

Requirements $10

Design $50 (5x)

Coding (incl. unit testing) $100 (10x)

Testing (system, functional) $300 (30x)

Acceptance Testing $400 (40x)

Production $??? (nnx)

The earlier an incident is found the better!

Source: Barry Boehm



Software Testing - Process Components


Unit Testing (Development): Testing a single program or subsidiary component of a program for compliance to program/component specifications when executed in isolation

Integration Testing (Testing/QA): Testing of pre-tested programs/components, integrated together to create sub-systems

System Testing (Testing/QA): Testing of the entire system for compliance to the software’s functional specification. Did the product get built right?

Acceptance Testing (Business): Testing of the system for compliance to the business requirements specification. Did the right product get built?

Source: IEEE Standard 829



The Testing Plane


Where do you want your testing to be?

Scale, from adhoc testing to formal process:

Resource: low to high
Time: low to high
Cost: low to high
Quality: low to high
Risk: high to low

Adhoc testing:
• No scripts
• No plan
• Based solely on personnel knowledge
• Little or no management
• Random process

Formal process:
• Full test scripts
• Can use low skilled personnel for test execution
• Carefully planned & managed

Source: iSTEP Testing Methodology



Software Testing - Overview Process: The V-Model

User Business Requirements     <->   User Acceptance Testing
Software Specification         <->   System Testing
Software Architecture          <->   Integration Testing
Detail Design Specification    <->   Unit Testing

Source: IEEE Standard 829



What is a Data Warehouse?

“A data warehouse is a copy of transaction data
specifically structured for querying and analysis.”

Ralph Kimball
Kimball Group

Author of:
The Data Warehouse Toolkit
The Data Warehouse ETL Toolkit
The Data Warehouse Lifecycle Toolkit

Data Warehouse Components:

Source: OLTP systems, RDBMS etc.

ETL: Extract, Transform and Load tools eg. Oracle Warehouse Builder, Ascential DataStage

Data Warehouse: Data Warehouse RDBMS

Business Intelligence: OLAP, Reporting, BI Systems eg. Hyperion, Cognos, Business Objects

Metadata: Additional data required not available anywhere else

How do they hang together?



Testing Phases:

• From Source to Data Warehouse
• From Data Warehouse to BI Users

Transformation rules:

• Specify source table elements from all data sources including metadata

• Specify Data Warehouse destination table elements:
  • Dimensions – reference data, keys etc.
  • Facts – data assets

• Specify how the source table elements map onto the destination table elements

• Form the basis of unit test cases



Transformation rules (simple example):

Source_Database_1   Dest_Database_DWH   Transformation Rules
-----------------   -----------------   --------------------
SD1_Table_1         DWH_Dim
SD1_T1_Attr_1       DD1_T1_Attr_1       = SD1_T1_Attr_1
SD1_T1_Attr_2       DD1_T1_Attr_2       = SD1_T1_Attr_2
SD1_T1_Attr_3       DD1_T1_Attr_3       = SD1_T1_Attr_3 + SD1_T1_Attr_4
SD1_T1_Attr_4

SD1_Table_2         DWH_Fact
SD1_T2_Attr_1       DD1_T2_Attr_1       = (SD1_T2_Attr_1 * SD1_T2_Attr_3)/52
SD1_T2_Attr_2       DD1_T2_Attr_2       = SD1_T2_Attr_3 + " " + SD1_T2_Attr_4
SD1_T2_Attr_3       DD1_T2_Attr_3       = DD1_T1_Attr_3/SD1_T2_Attr_4
SD1_T2_Attr_4
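
The DWH_Dim rules in this example can be written as executable mappings, which is one way to turn each transformation rule into a unit test case. The following is a minimal Python sketch, assuming source rows arrive as dictionaries with numeric attribute values. The `DIM_RULES` table and the `transform` helper are hypothetical names chosen for illustration and do not belong to any ETL tool.

```python
# Hypothetical encoding of the DWH_Dim transformation rules from the example.
# Each destination attribute maps to a function of one source row (a dict).
DIM_RULES = {
    "DD1_T1_Attr_1": lambda src: src["SD1_T1_Attr_1"],
    "DD1_T1_Attr_2": lambda src: src["SD1_T1_Attr_2"],
    "DD1_T1_Attr_3": lambda src: src["SD1_T1_Attr_3"] + src["SD1_T1_Attr_4"],
}


def transform(src_row, rules):
    """Apply every rule to one source row and return the destination row."""
    return {dest: rule(src_row) for dest, rule in rules.items()}


# Illustrative source row and the destination values worked out by hand.
source_row = {
    "SD1_T1_Attr_1": 7,
    "SD1_T1_Attr_2": 3,
    "SD1_T1_Attr_3": 10,
    "SD1_T1_Attr_4": 5,
}
expected = {"DD1_T1_Attr_1": 7, "DD1_T1_Attr_2": 3, "DD1_T1_Attr_3": 15}

# Unit test case: one assertion per transformation rule.
actual = transform(source_row, DIM_RULES)
for dest_attr, expected_value in expected.items():
    assert actual[dest_attr] == expected_value, dest_attr
```

In a real test the `actual` row would be read from the warehouse after the ETL run, and only the expected values would be derived from the rule table.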

From Source to Data Warehouse – Unit Testing:

• Know your transformation rules!

• Test cases should cover each transformation rule and include positive and negative situations

• Row counts: Source = DWH (Destination) + Rejected

• Correctly access all required data including metadata

• Cross reference DWH Dimensions to source tables

• All computations are correct especially those based on business rules

• Database queries, expected vs actual results
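
The row-count and expected-versus-actual checks above can be sketched with an in-memory SQLite database standing in for the source, warehouse and reject stores. The table names (`src_orders`, `dwh_orders`, `rejected_orders`) and the rule that rows with a missing amount are rejected are assumptions made for this illustration. The reconciliation assumes that every source row is either loaded into the warehouse or written to the reject store.

```python
import sqlite3


def count_rows(conn, table):
    """Return the number of rows in a table (table names are fixed in this sketch)."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def reconcile_counts(conn):
    """Return (source, loaded, rejected) row counts for the hypothetical tables."""
    return (
        count_rows(conn, "src_orders"),
        count_rows(conn, "dwh_orders"),
        count_rows(conn, "rejected_orders"),
    )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE src_orders (id INTEGER, amount REAL);
    CREATE TABLE dwh_orders (id INTEGER, amount REAL);
    CREATE TABLE rejected_orders (id INTEGER, amount REAL);
    INSERT INTO src_orders VALUES (1, 10.0), (2, NULL), (3, 5.5);
    -- Stand-in for the ETL step: load valid rows, reject the rest.
    INSERT INTO dwh_orders SELECT * FROM src_orders WHERE amount IS NOT NULL;
    INSERT INTO rejected_orders SELECT * FROM src_orders WHERE amount IS NULL;
    """
)

# Row-count reconciliation: every source row is either loaded or rejected.
source, loaded, rejected = reconcile_counts(conn)
counts_balance = source == loaded + rejected

# Expected vs actual: compare a warehouse query against a hand-computed result.
expected_rows = [(1, 10.0), (3, 5.5)]
actual_rows = conn.execute("SELECT id, amount FROM dwh_orders ORDER BY id").fetchall()
results_match = actual_rows == expected_rows
```

A negative test case would seed the source with deliberately invalid rows and confirm that they appear in the reject store and not in the warehouse.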



From Source to Data Warehouse – Unit Testing (cont):

• Rejects are correctly handled and conform to business rules

• Slowly changing Dimensions eg. address, marital status

• Correctness of surrogate keys eg. time zones, currencies in Fact tables

• Opportunities for automation

• Summary - dual drive:
  • Source table driven – data ends up in the right place
  • DWH (Destination) table driven – contains the right result
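
One way to check surrogate keys is to confirm that every key stored in a Fact row resolves to exactly one Dimension row. The following is a minimal sketch in plain Python, assuming Fact and Dimension rows are available as dictionaries. The column names `currency_key`, `code` and `amount`, and both helper functions, are hypothetical.

```python
def find_orphan_keys(fact_rows, dim_rows, key):
    """Return the sorted surrogate keys used by facts but missing from the dimension."""
    dim_keys = {row[key] for row in dim_rows}
    return sorted({row[key] for row in fact_rows} - dim_keys)


def find_duplicate_keys(dim_rows, key):
    """Return the sorted surrogate keys that appear more than once in the dimension."""
    seen, duplicates = set(), set()
    for row in dim_rows:
        if row[key] in seen:
            duplicates.add(row[key])
        seen.add(row[key])
    return sorted(duplicates)


# Illustrative data: a currency dimension and a fact table that references it.
currency_dim = [
    {"currency_key": 1, "code": "NZD"},
    {"currency_key": 2, "code": "USD"},
]
sales_fact = [
    {"currency_key": 1, "amount": 120.0},
    {"currency_key": 2, "amount": 80.0},
    {"currency_key": 9, "amount": 15.0},  # deliberately broken reference
]

orphans = find_orphan_keys(sales_fact, currency_dim, "currency_key")
duplicates = find_duplicate_keys(currency_dim, "currency_key")
```

Here `orphans` lists the fact keys with no matching dimension row, and `duplicates` lists dimension keys that are not unique. A passing test expects both lists to be empty.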

From Source to Data Warehouse – Integration Testing:

Once all extract, transformation and load unit tests have been
successfully executed, the ETL process needs to be executed from end to end:

• Job sequences and dependencies
• Errors in one job that impact subsequent jobs
• Error log generation
• Restarting the ETL process in case of failure:
  • Does it have to be started over?
  • Can it start from where it failed?
  • Restores required?
  • Auto/manual?
• Impact of failure on subsequent jobs
• Processing of rejected records
• Reprocessing of already processed records
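
The restart questions above can be explored with a small job runner that records which jobs have completed, so that a rerun resumes from the point of failure. This is a sketch under stated assumptions, not a model of any particular ETL scheduler: `run_jobs`, the job names and the simulated failure are all hypothetical.

```python
def run_jobs(jobs, completed=None):
    """Run (name, callable) jobs in order, skipping names already completed.

    Returns (completed_names, error_log). Execution stops at the first failure
    so that later, dependent jobs never run on bad input.
    """
    completed = list(completed or [])
    error_log = []
    for name, job in jobs:
        if name in completed:
            continue  # restart case: this job succeeded in an earlier run
        try:
            job()
        except Exception as exc:
            error_log.append(f"{name}: {exc}")
            break
        completed.append(name)
    return completed, error_log


calls = []
load_attempts = {"count": 0}


def extract():
    calls.append("extract")


def transform():
    calls.append("transform")


def flaky_load():
    """Fails on the first attempt and succeeds on the second."""
    load_attempts["count"] += 1
    if load_attempts["count"] == 1:
        raise RuntimeError("target table locked")
    calls.append("load")


jobs = [("extract", extract), ("transform", transform), ("load", flaky_load)]

# First run fails in the load job; the second run resumes from that job only.
first_done, first_errors = run_jobs(jobs)
second_done, second_errors = run_jobs(jobs, completed=first_done)
```

An integration test built this way can assert that the error log names the failed job, that no later job ran after the failure, and that already completed jobs were not reprocessed on restart.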

From Data Warehouse to Business Intelligence Mechanisms – System Testing:

• Reports and presentations correctly execute
• Fields are set up as per specifications
• Displayed information is correct (reference back to Source)
• Drill down facilities are operational and correct
• Column headings are relevant and correct
• Columns correctly aligned
• Units are displayed correctly
• Graphs and associated data items
• Aggregations and summaries
• Totals and subtotals
• Exception reporting
• Alerts
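
Checking totals and subtotals by referring back to the source can be automated by recomputing the aggregates independently and comparing them with the figures a report displays. The following is a minimal sketch, assuming the report has been captured as a dictionary with `subtotals` and `total` entries. That structure, the `region` and `sales` fields and the tolerance value are assumptions for illustration, not the output format of any BI tool.

```python
from collections import defaultdict


def expected_subtotals(source_rows, group_key, value_key):
    """Recompute subtotals per group directly from source rows."""
    totals = defaultdict(float)
    for row in source_rows:
        totals[row[group_key]] += row[value_key]
    return dict(totals)


def check_report(report, source_rows, group_key, value_key, tolerance=0.005):
    """Return a list of discrepancies between a report and the source data."""
    problems = []
    expected = expected_subtotals(source_rows, group_key, value_key)
    for group in sorted(set(expected) | set(report["subtotals"])):
        want = expected.get(group)
        got = report["subtotals"].get(group)
        if want is None or got is None or abs(want - got) > tolerance:
            problems.append(f"subtotal {group}: expected {want}, got {got}")
    grand_total = sum(expected.values())
    if abs(grand_total - report["total"]) > tolerance:
        problems.append(f"total: expected {grand_total}, got {report['total']}")
    return problems


# Illustrative source rows and two captured reports, one with a wrong total.
source_rows = [
    {"region": "North", "sales": 100.0},
    {"region": "North", "sales": 50.0},
    {"region": "South", "sales": 25.0},
]
good_report = {"subtotals": {"North": 150.0, "South": 25.0}, "total": 175.0}
bad_report = {"subtotals": {"North": 150.0, "South": 25.0}, "total": 170.0}

good_problems = check_report(good_report, source_rows, "region", "sales")
bad_problems = check_report(bad_report, source_rows, "region", "sales")
```

The same comparison also flags a group that appears in the report but not in the source, or the reverse, which covers drill-down rows that have gone missing.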

From Data Warehouse to Business Intelligence Mechanisms – Acceptance Testing (Users):

• Frequency
• Reports provide information required:
  • Too high level, insufficient detail?
  • Too low level, excessive detail?
• Aggregations and summaries
• Graphs
• Other graphical representations
• Formats
• Subtotals and totals
• Meaningful – information, not just data

From Source to Business Intelligence Mechanisms – End-to-End:



• DWH process executes from end-to-end uninterrupted
• Within required processing window including reprocessing of rejects?
• Standard reports
• Complex reports
• High data volumes – across historical time periods
• Low data volumes - lowest permissible time period
• First time creation
• Concurrent user access
• DWH database constrained
• Frequency of updates

Data Warehouse Testing – Iterative Process

BI Acceptance Testing

BI System Testing

ETL Integration Testing

ETL Unit Testing



• Get each one right before commencing the next
• If a prior step fails again, go back and get it right first
• Track the number of errors etc. arising through each step; the count should decrease with each iteration
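
Tracking error counts per iteration lends itself to a simple trend check. The sketch below flags any iteration where the count rose, and treats a phase as ready to hand over only when its latest iteration found no errors. That zero-error exit criterion, the helper names and the sample counts are assumptions made for illustration.

```python
def regressions(error_counts):
    """Return (phase, iteration_index) pairs where errors rose versus the prior iteration."""
    flagged = []
    for phase, counts in error_counts.items():
        for i in range(1, len(counts)):
            if counts[i] > counts[i - 1]:
                flagged.append((phase, i))
    return flagged


def ready_for_next_phase(counts):
    """Treat a phase as 'right' only when its latest iteration found no errors."""
    return bool(counts) and counts[-1] == 0


# Illustrative error counts per test iteration, listed oldest first.
error_counts = {
    "ETL Unit Testing": [14, 6, 1, 0],
    "ETL Integration Testing": [9, 11, 3],
}

flagged = regressions(error_counts)
unit_ready = ready_for_next_phase(error_counts["ETL Unit Testing"])
integration_ready = ready_for_next_phase(error_counts["ETL Integration Testing"])
```

A rise in the count for a phase is a prompt to go back to the prior step before continuing.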

Data Warehouse Testing – Continually Changing Source Systems

• By nature, DWH data and source systems are continually updated, so DWH testing must allow for both
• New Source data/schema/application = retesting/regression testing
• DWH systems are always high maintenance
• Will always find new issues
• Opportunities for automation
• Package test suites modularly for ease of repeatability
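
One opportunity for automation is detecting source schema changes and selecting the modular test suites to rerun. The following is a minimal sketch, assuming schema snapshots are held as nested dictionaries of table, column and type. The snapshot format, the suite names and both helper functions are hypothetical.

```python
def schema_changes(old, new):
    """Compare two {table: {column: type}} snapshots and list what changed."""
    changes = []
    for table in sorted(set(old) | set(new)):
        old_cols, new_cols = old.get(table, {}), new.get(table, {})
        for column in sorted(set(old_cols) | set(new_cols)):
            before, after = old_cols.get(column), new_cols.get(column)
            if before is None:
                changes.append((table, column, "added"))
            elif after is None:
                changes.append((table, column, "removed"))
            elif before != after:
                changes.append((table, column, "type changed"))
    return changes


def suites_to_rerun(changes, suites_by_table):
    """Pick the regression suites registered against any changed source table."""
    tables = {table for table, _, _ in changes}
    return sorted({s for t in tables for s in suites_by_table.get(t, [])})


# Illustrative snapshots taken before and after a source system release.
old_schema = {
    "customer": {"id": "int", "address": "text"},
    "orders": {"id": "int", "amount": "real"},
}
new_schema = {
    "customer": {"id": "int", "address": "text", "marital_status": "text"},
    "orders": {"id": "int", "amount": "text"},
}
suites_by_table = {
    "customer": ["dim_customer_unit"],
    "orders": ["fact_orders_unit", "etl_integration"],
}

changes = schema_changes(old_schema, new_schema)
rerun = suites_to_rerun(changes, suites_by_table)
```

Keeping the mapping from source tables to test suites alongside the suites themselves is what makes the regression run repeatable when the next change arrives.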