Christopher Reed is solution consultant at Infogix Inc., leading solution consulting efforts. He works with Fortune 500 companies to assist in the creation of information control solutions throughout the enterprise. In addition to his work at Infogix, Reed was an architectural consultant at Unisys, where he consulted with customers on deploying mission-critical applications.

Yaping Wang, CISA, PMP, is product consultant at Infogix, where she leads client service projects that provide assessment, advisory, implementation and other services in automated information control domains.

Angsuman Dutta is unit leader of the customer acquisition support team at Infogix. Since 2001, he has assisted numerous industry-leading enterprises in their implementation of automated information controls.

Would one buy a house when the stability of the foundation is uncertain? Would one make a payment if the accuracy of the bill is in question? If the answer is no, then why would any organization settle for making business decisions based on inaccurate and inconsistent data warehouse information? A number of studies [1, 2, 3] show that much of the data warehouse information available to business users is not accurate, complete or timely. Despite significant investment in data warehouse technologies and efforts to ensure quality, the trustworthiness of data warehouse information remains questionable at best [4, 5]. Current approaches to restore trust in data warehouse information are often heroic efforts of the individuals responsible for the data warehouse and include:
• Manual or semiautomated balancing, tracking and reconciliation to prove accuracy
• Ad hoc queries of data sources to support “audit needs”
• Extensive research and remediation to identify, diagnose and correct issues
These approaches provide short-term respites but are not sustainable in the long run. The increased labor cost for manual processes and the high processing cost for reruns when errors are identified late in the process increase ongoing operational costs. The cumbersome and costly processes for supporting audit needs also create organizational stress. Frequently, a large number of data warehouse projects are abandoned because of the high costs of efforts to ensure information quality [6].
While standardized tools, such as those for extraction, transformation and loading (ETL) and data quality processes, solve part of the problem, there is an urgent need for adopting a systematic approach for establishing trust in data warehouse information. The proposed approach outlines a framework for ensuring the integrity of data warehouse information by using end-to-end information controls.

Root Causes of Information Quality Issues
While several factors contribute to information quality issues, the following are the major causes of information errors within data warehouses:
• Changes in the source systems: Changes in the source systems often require code changes in the ETL process. For example, the ETL process corresponding to the credit risk data warehouse in a particular financial institution has approximately 25 releases each quarter. Even with appropriate quality assurance processes, there is always room for error. The following list outlines the types of potential errors that can occur because of changes in the ETL processes:
  – Extraction logic excludes certain types of data that were not tested.
  – Transformation logic may aggregate two different types of data (e.g., car loan and boat loan) into a single category (e.g., car loan). In some cases, transformation logic may exclude certain types of data, resulting in incomplete records in the data warehouse.
  – Similar issues are also observed with the loading process.
• Process failures: Current processes may fail due to system errors or transformation errors, resulting in incomplete data loading. System errors may include abends due to the unavailability of the source system/extract or the incorrect format of the source information. Transformation errors may result from incorrect formats.
• Changes/updates in the reference data: Outdated, incomplete or incorrect reference data will lead to errors in the data warehouse information. For example, errors in the sales commission rate table may result in erroneous calculation of the commission amount.
• Data quality issues with the source system: The source system’s data may be incomplete or
[Figure: information control points X1 through X6 placed across Data, Staging/ETL, Data Warehouse, App #1 and App #2]
5. Control X5, assurance that the data balance with downstream applications or data marts: Ensure that the data warehouse information can be balanced and reconciled with the downstream processes.
6. Control X6, validation between parallel systems and the data warehouse: Data warehouse information can also reside in other systems. For example, loan information resides both in the GL and the credit risk data warehouse. It is important to reconcile the information in the parallel system with the data warehouse information. In the absence of such a control, the loan information in the financial reports, generated from the GL system, may become out of sync with the loan information used for estimating the capital requirements for Basel II.

Conclusion
With the accelerating changes in the source systems to support business needs, increasing reliance on data warehouse information for critical business operations and decisions, and an expanding (and ever-changing) array of regulations and compliance requirements, the use of automated information controls is no longer an option; it is the only way to ensure information accuracy within the data warehouse and across the enterprise. Successful organizations expand the scope of information controls beyond the data warehouse by developing a companywide program for ensuring enterprise information quality. With an appropriate selection of tools and frameworks for information controls, organizations can achieve the elusive goal of having higher-quality enterprise information assets.

Endnotes
1. English, Larry; Improving Data Warehouse and Business Information Quality, Wiley and Sons, USA, 2000
2. Eckerson, Wayne W.; Data Quality and the Bottom Line, TDWI research series, USA, 2001
3. Friedman, Ted; Data Quality “Firewall” Enhances Value of the Data Warehouse, Gartner Report, USA, 2004
4. Violino, Bob; “Do You Trust Your Information?,” The Information Agenda, 23 October 2008
5. Computer Sciences Corp., Technology Issues for Financial Executives, USA, 2007
6. Gupta, Sanjeev; “Why Do Data Warehouse Projects Fail?,” Information Management, 16 July 2009
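
Illustrative sketch: The reconciliation described under Control X6 (comparing loan information in a parallel system such as the GL with the data warehouse) can be outlined in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' method or any vendor's product: the function name reconcile_balances, the loan identifiers, the amounts and the zero tolerance are all hypothetical, and a real control would read from the actual systems rather than from in-memory dictionaries.

```python
from decimal import Decimal


def reconcile_balances(gl_balances, dw_balances, tolerance=Decimal("0.00")):
    """Compare per-loan balances from two systems and list every exception.

    Both arguments are hypothetical dicts mapping loan ID -> Decimal balance.
    Returns a list of (loan_id, reason, gl_amount, dw_amount) tuples.
    """
    exceptions = []
    # Walk the union of loan IDs so records missing on either side are caught.
    for loan_id in sorted(set(gl_balances) | set(dw_balances)):
        gl_amount = gl_balances.get(loan_id)
        dw_amount = dw_balances.get(loan_id)
        if gl_amount is None:
            exceptions.append((loan_id, "missing in GL", None, dw_amount))
        elif dw_amount is None:
            exceptions.append((loan_id, "missing in data warehouse", gl_amount, None))
        elif abs(gl_amount - dw_amount) > tolerance:
            exceptions.append((loan_id, "amount mismatch", gl_amount, dw_amount))
    return exceptions


# Hypothetical sample data, invented for illustration only.
gl = {
    "L-001": Decimal("25000.00"),
    "L-002": Decimal("8300.50"),
    "L-003": Decimal("1200.00"),
}
dw = {
    "L-001": Decimal("25000.00"),
    "L-002": Decimal("8300.00"),
    "L-004": Decimal("560.00"),
}

for exception in reconcile_balances(gl, dw):
    print(exception)
```

With this sample data the sketch reports three exceptions: an amount mismatch for L-002, a loan present only in the GL (L-003) and a loan present only in the data warehouse (L-004). Decimal is used instead of float so that currency amounts compare exactly; the same comparison could serve as a starting point for the downstream balancing described under Control X5.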
4 ISACA JOURNAL VOLUME 4, 2010