Saad Khan
Data Governance: A handy book for laymen.
Contents
Acknowledgement
INTRODUCTION
1. Data Silos
Organization
Centralized Approach
Data Assets
Issue
BICC
Approach
Identify
Quickwin
Refine
Analysis
Responsibilities
Members
Schedule
Responsibilities
Members
Schedule
References
Acknowledgement
Data is becoming ever more critical to organizational decision making, and the need to manage
and control data and data-related processes is receiving more attention. This creates an
opportunity to develop and practice a more comprehensive approach and methodology.
There are many standards in the domain of data governance, including those from DAMA, DGI and
others, that define sets of practices to follow.
This book treats the domain of data governance generically. The methodology, approach and
structure it defines are designed to be customized for different organizational needs. The
policies and processes listed serve as examples, and other processes and policies can be added
as required. The maturity matrix, which is a direct contribution of Christopher Bradly (VP of
DAMA), makes it possible to measure the level of readiness for a DG program in any
organization. The methodology outlined in this book is the result of continuous effort and
improvement drawn from the experience of implementing DG programs in several organizations.
The methodology is purely iterative in nature, as DG programs require. The governance KPIs
and measures are a set of KPIs that can be used directly to measure the effectiveness of
the initiated DG program.
The book will help an organization eliminate silo-based data development, consider data
development at the enterprise level, maintain a common business vocabulary across
departments, and reform itself into an information-centric culture. The organizational bodies
and their responsibilities are also highlighted.
INTRODUCTION
“Organizations that rely on data to make decisions consider data to be a vital asset for the
organization”.
Customer names, addresses, emails, product names and order details: everything that an
enterprise collects from a customer or for a customer serves as data for the organization. Data
in any organization can be categorized in different ways; for example, there can be customer
data, product data, order data, supplier data, employee data, financial data, operations data
and so on. Aggregating and segregating data is a well-known way to serve the analytical needs of
the business; for example, financial data is a collective sum of many different attributes of
the organizational data. As a brief example, how can we answer “What yearly profit does a
clothing company generate by selling clothes online?” To answer this question we need
data, and the data comes from different areas of the organization. It involves product
data, order data, in certain cases logistics data, in case of returns the returns data, and a
certain business formula to calculate the profit.
Profit in this example becomes a KPI (Key Performance Indicator) or a measure for the business.
It must have a definition or a description. It must have a business formula. It must have an
owner who is accountable for the accuracy of the term Profit. It must be derived from data
that resides somewhere in the organization, and that data must be produced, collected,
transformed and published by certain people, according to certain conditions, for the business,
using certain technology.
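The Profit example above can be sketched in a few lines of Python. This is only an illustration: the record layout, the field names, the sample figures and the formula itself (margin per unit times quantity, minus logistics cost, with returned orders excluded) are assumptions, not a formula given in this book. In practice the formula is whatever the accountable owner of Profit defines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_id: str
    product_id: str
    quantity: int
    unit_price: float      # selling price per unit
    logistics_cost: float  # shipping cost for the whole order
    year: int


# Hypothetical product data: cost per unit for each product.
PRODUCT_COST = {"shirt": 8.0, "jeans": 20.0}


def yearly_profit(orders, returned_order_ids, year):
    """Assumed business formula: for every non-returned order in the year,
    quantity * (unit_price - unit_cost) - logistics_cost."""
    total = 0.0
    for order in orders:
        if order.year != year or order.order_id in returned_order_ids:
            continue
        margin = order.unit_price - PRODUCT_COST[order.product_id]
        total += order.quantity * margin - order.logistics_cost
    return total


orders = [
    Order("o1", "shirt", 2, 15.0, 3.0, 2023),
    Order("o2", "jeans", 1, 45.0, 4.0, 2023),
    Order("o3", "shirt", 1, 15.0, 3.0, 2023),  # this order was returned
    Order("o4", "jeans", 1, 45.0, 4.0, 2022),  # belongs to another year
]
profit_2023 = yearly_profit(orders, returned_order_ids={"o3"}, year=2023)
print(profit_2023)
```

Even in this small sketch the answer depends on product data, order data, logistics data and returns data at once, which is why the definition and ownership of the formula matter.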
The consistency or accuracy of the term Profit and the quality of the data involved in producing
Profit is a collective responsibility of the people involved in collecting, storing, retrieving,
transforming and using this data. Quality of data becomes more important than the data itself.
If the data is of poor quality, not only does it fail to serve its purpose, it may also be a
factor in making a wrong decision.
Secondly, the term Profit can be used differently by different departments or people, so who
will make sure what exactly Profit means for the organization? And who will decide who is
responsible for what (this is certainly not the HR job description) in terms of data being
managed, owned, accurate and fit for purpose?
Master Data Management, Metadata Management, Data Quality, Data Strategy, Data
Warehouse, Data Integration, Enterprise data or information model, Business Intelligence,
Database Operations, Data Security, Reference Data and Data Architecture are all involved in
governing enterprise data.
Data Governance will answer the What, Who and How for managing, owning, producing and
using the data (asset) of the organization. What covers the assets of the organization. Assets can
be of different types namely business assets, data assets, technology assets and governance
assets (this we will discuss in detail). Who involves the Roles and Responsibilities of the people
taking part in data activities. How involves the Processes the Who(s) will follow in order to
govern a certain asset.
Data Governance can serve as a platform that enables People to collaborate with each other in a
pre-defined workflow or Process using Technology.
“You cannot improve what you cannot manage; you cannot manage what you cannot measure;
you cannot measure what you cannot define; you cannot define what you do not govern.”
— Daniel J. Paolini
1. Data Silos:
The ability of data systems (sales force) to support business processes (sales process) degrades
over time, because ever-changing business requirements introduce bad-quality data with every
change. Moreover, these data systems are usually developed “bottom-up” to meet
the business need of a particular business unit / department, and are commonly funded by that
business unit / department. Although data is exchanged through these systems between two or
more departments, the exchange happens without mutually agreed definitions of the data and
terms used. This is typically the biggest factor encouraging data silos
(departmental-level data marts) and discouraging cooperation between departments and
enterprise data warehousing. The result is quality issues that are costly to
resolve, as the effort and resources involved add to the timeline of decision making and affect
the accuracy of the decision to be made.
NOTE: The term Profit described in the introduction section must be part of the enterprise
business glossary and must be defined in one of the business or logical models, to enforce the
use of Profit as an enterprise entity rather than a departmental one. Of course, the definition
of Profit may differ from department to department or from person to person, but it remains a
consolidated enterprise entity. Each department may have a term similar to Profit but with a
different description, definition, business formula and name; the name might be a synonym for Profit.
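A minimal sketch of the idea in this note: one enterprise glossary entry for Profit, with departmental names resolved to it as synonyms. The class names, the owner title and the departmental synonyms ("Net Margin", "Earnings") are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class GlossaryTerm:
    name: str
    definition: str
    owner: str  # the accountable owner of the term
    # department -> local name used for the same enterprise term
    synonyms: dict = field(default_factory=dict)


class BusinessGlossary:
    """Enterprise glossary that resolves departmental names to one term."""

    def __init__(self):
        self._terms = {}
        self._aliases = {}

    def add(self, term):
        key = term.name.lower()
        self._terms[key] = term
        for local_name in term.synonyms.values():
            self._aliases[local_name.lower()] = key

    def resolve(self, name):
        """Return the enterprise term for a name or synonym, else None."""
        key = name.lower()
        key = self._aliases.get(key, key)
        return self._terms.get(key)


glossary = BusinessGlossary()
profit = GlossaryTerm(
    name="Profit",
    definition="Enterprise-level profit as defined by its owner.",
    owner="Head of Finance",
    synonyms={"Sales": "Net Margin", "Online Store": "Earnings"},
)
glossary.add(profit)
print(glossary.resolve("net margin").name)
```

Whatever a department calls the term locally, a lookup lands on the single enterprise entry and its accountable owner.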
Metadata Management
Cataloguing information about data objects is referred to as Metadata Management. Information
is stored in an organization across disparate, heterogeneous systems. These systems are owned,
managed and accessed by different users and departments that often lack communication
with each other. Metadata management and documentation is part of data governance
initiatives that help an organization take measures regarding the quality and reliability of
data in planning and decision-making processes. Metadata management ensures that the final
reports and analyses come from the right data sources, are complete, and are of good quality.
The purpose is to establish the foundation for managing and documenting metadata across the
entire enterprise, covering all the terminologies and related data attributes used across the
enterprise for different goals. Metadata management is responsible for collecting and
maintaining a metadata repository that would be utilized by every department.
1. Business Metadata
Business Metadata is usually business driven; it is the information about the glossary,
KPIs, business rules etc. For example, the term Profit has a definition, a description, a
type, a frequency of use, the data elements involved, the systems involved, an
accountable owner etc. All of these are the business metadata for the term Profit.
Business metadata enables business users at large to answer the following
questions:
2. Technical Metadata
Technical metadata consists of the technical description of the data assets, including
information about schemas, tables, columns, physical attributes of data mappings, and
specifications of transformation jobs. It defines the semantics of a physical data
asset and its effect on management and business planning decisions.
3. Operational Metadata
The Business Intelligence layer serves the analytical needs of the business. Each department may
have multiple BI applications or dashboards providing multiple analytics. The Business
Intelligence Office, in coordination with business users and data warehousing resources, builds a platform
where self-service analytics is made possible for ad-hoc reports and operational and predictive
analytics.
4. Self-Service Analytics
Organizations need more self-service analytical and reporting capabilities for their end users,
creating a necessity for a common understanding of data across the organization.
Data Governance helps organizations achieve their organizational strategies, which require the
underlying assets to be governed and managed. Although DG does not result in direct revenue
generation for the enterprise, it helps the enterprise to strategize all those factors that
do result in direct revenue.
More than 50% of a Business Analyst's time is spent finding the right data. DG helps reduce
the time needed to locate the right data source, maximizing the productivity
of a BA and supporting correct decisions.
Data breaches and data security issues that fall under certain regulations usually
result in heavy fines for the enterprise. DG provides a comprehensive way to protect against
data breaches and address data security concerns.
DG improves the operational efficiency of an enterprise by providing a structured approach to
bridge business and IT through common goals, and by assigning accountabilities for establishing
and maintaining data quality.
departments. But the basic structure of any organization is made up of these business units or
departments that contribute to generating revenue for the organization. The departments have
business processes to follow, which require technology to help the business minimize manual
effort. Thus these departments use transactional systems like ERP, Salesforce, CRM, LMS to
record business and customer transactions, though in a large organization these systems are not
directly used by the business to perform analytics or to produce reports for management.
Organizations that are driven by data for decision making utilize the power of a data
warehouse for analytical purposes. The data warehouse is utilized in the form of separate,
department-specific data marts; in other words, these data marts are built to fulfill the
requirements of a specific department, requirements that are essential to run business
operations. Thus the foundation of data in silos is formed. Although these data marts are
created to fulfill the needs of the business and are driven by business people, the use of
business vocabulary differs from department to department, so a common
understanding of the business is missing from the very essence of a data mart. Many
business taxonomies are still similar between departments, but the difference in vocabulary
usually results in delays and wrong decisions.
Business Intelligence tools and dashboards provide self-service capabilities for the business,
but the inconsistent vocabulary remains. Data is integrated from several systems, causing
inconsistency because the same data is integrated several times with different business
transformations, making it harder to trace back to the real source. Also, data integrators
pull and push the data as required by the business unit rather than as it should be utilized
for the enterprise. The entity Customer is an enterprise entity rather than a
departmental entity: a bank may have credit customers, debit customers and loan customers,
but they are all Customers of the bank itself. Similarly, Profit / Revenue will be different for
credit customers, debit customers and loan customers, but for the enterprise it is a collective
sum.
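The bank example can be sketched as follows. Three departmental data marts each hold their own view of customers, and an enterprise view merges them into one Customer entity with a collective profit. The mart contents, customer ids and figures are hypothetical.

```python
from collections import defaultdict

# Hypothetical departmental data marts: (customer_id, profit) rows.
credit_mart = [("c1", 120.0), ("c2", 80.0)]
debit_mart = [("c1", 30.0), ("c3", 15.0)]
loan_mart = [("c2", 200.0)]


def enterprise_view(**marts):
    """Merge departmental rows into one record per enterprise customer.

    Returns {customer_id: {"segments": set of mart names, "profit": float}}.
    """
    customers = defaultdict(lambda: {"segments": set(), "profit": 0.0})
    for segment, rows in marts.items():
        for customer_id, profit in rows:
            customers[customer_id]["segments"].add(segment)
            customers[customer_id]["profit"] += profit
    return dict(customers)


view = enterprise_view(credit=credit_mart, debit=debit_mart, loan=loan_mart)
total_profit = sum(record["profit"] for record in view.values())
print(len(view), total_profit)
```

Customer "c1" appears in two marts but is counted once at the enterprise level, which a silo-by-silo count would get wrong.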
To resolve the impact of data silos in an organization, the Enterprise Data Management
Office (EDMO) plays a vital role. It resolves the effect of data being used in silos by
maintaining an Enterprise Data Model, Business Information Model or Subject Areas to be utilized
as the first layer into which data is loaded. This helps maintain the term Customer or
Profit / Revenue at the enterprise level and gives the Data Integrators the opportunity to push
and transform the customer data into the customer subject area with the defined data and
quality rules.
The EDMO subsequently enhances the functions of all sub-departments for Data Integration,
Data Quality and Data Security by classifying data attributes for access and sharing. In
addition, since a golden source helps maintain the integrity of data that might change in one
system and have unintended effects on another, EDMO helps enable Master &
Reference Data Management.
EDMO is responsible for providing models for data and information management; these models
differ between the enterprise and departmental levels. This model-based data development
helps eliminate redundancy and maximize the accuracy and consistency of the data used at
both enterprise and department level.
Data Governance helps to control the management of data through agreed, predefined,
best-in-class practices by involving and assigning accountabilities to each of the data offices,
including EDMO, DQM, DI and MDM, as well as business stakeholders and IT people.
DG establishes a three- or four-tier structure comprising the Executive Sponsor, Data
Governance Board, Data Governance Council and Data Stewardship Council. Each tier has a defined
responsibility and is accountable for actions that will impact the governance program as a
whole. Each tier has members drawn from EDMO, DI, DQM, BICC, business and technology.
The framework is divided into several components that are essential in governing any
situation that may arise in the management of data.
Organization:
It includes the Offices of Data Management and Data Governance along with business and IT
stakeholders, forming an organization structure to manage data issues. The organization
structure describes the responsibilities of each participating office. The terms of reference
for each office are set, along with a method to collaborate with other participating members.
The frequency and objectives of the meetings of working groups (data integrators, data quality
personnel, enterprise architects and so on) are documented and followed.
All the above roles exist in the business and perform duties other than DG itself. For
example, the Data Owner is usually the person heading a particular department; he is
held accountable for the activities involved in managing the data of that department. A Data
Steward is usually the manager of a particular department; he is held responsible for the
activities involved in managing the data of his department. The relation between accountable
and responsible is this: the person held responsible is the DOER, and the accountable person will be
questioned if things are not done in the defined manner. In a real-world example, the head of
the department, who is accountable for the quality of data used by his department, will not
usually deal with data quality issues directly. Rather, the manager of the department, who is
responsible for making sure the data is of the correct quality, will do whatever is required to
make sure the data quality target is met, involving the data people who act as Data Custodians.
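The distinction between accountable and responsible can be sketched as a small routing rule for a data quality issue. The names below and the rule that the owner is only brought in when an issue is overdue are assumptions made for illustration; the text does not prescribe an escalation rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataDomainRoles:
    domain: str
    owner: str         # accountable: questioned if things go wrong
    steward: str       # responsible: the doer
    custodians: tuple  # data people who carry out the fixes


def route_issue(roles, overdue):
    """Assign the issue to the steward; involve the owner only if overdue.

    The overdue-based escalation is an illustrative assumption.
    """
    return {
        "assigned_to": roles.steward,
        "support": list(roles.custodians),
        "escalated_to": roles.owner if overdue else None,
    }


sales_roles = DataDomainRoles(
    domain="Customer",
    owner="Head of Sales",
    steward="Sales Manager",
    custodians=("Database team",),
)
print(route_issue(sales_roles, overdue=False))
```

The day-to-day work always lands on the steward and the custodians, while the owner appears only when the outcome itself is in question.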
Following are the typical roles involved in a DG Program. The exact set depends on the strategic
objective of the program, but the roles mentioned below are typically used to solve DG-related
issues.
Senior Manager of Data Management (An individual with accountability for Business / Function)
Accountable for the data strategy and vision within their Business / Function
Accountable for the execution against strategy and vision
Accountable for the governance structure within B/F including Policy adherence
Accountable for representing the Business at the SDDA and SDSB
Accountable for the communication of data management strategy to the Business they represent
Responsible for the identification of key data
Responsible for data Governance, BDA and data stewardship activities
Responsible for Data Program Execution within programs and projects
Responsible for Data Quality programs
Responsible for Relationship Management
Responsible for the creation and monitoring of data standards with input from data producers,
consumers, and data stewards
Remediation
Business System Owner (An individual who is accountable for the physical system)
Accountable for the Business Management of the system
Accountable for the planning, use and budget of system
Responsible for ensuring SLAs are in place and managed
Responsible for working with the IT Owner to ensure Policy and Requirements are implemented
Below is the list of policies and processes that are typically used in a DG Program.
List of Policies
The following list is an example of policies used in the context of DG. It is not exhaustive,
and policies can be added to or deleted from the list.
The purpose of this policy is to govern the collection of all data processed and controlled
by the organization and its subsidiaries.
List of Processes
A process is a series of steps and decisions involved in the way work is completed. A DG process
involves a series of defined steps to be taken by a number of different roles in a particular
structure in order to govern the management of data. Every process is carefully drafted to
handle a particular situation. Every situation involves different roles; not every role is
involved in every process, only the required ones. Roles are involved in a process because
they are held accountable for mitigating the risks associated with data and its management.
The roles depend on one another to make a correct decision to overcome a situation.
Processes are designed to be proactive or reactive, depending on the
situation they are designed to handle.
A process includes a process flow defined step by step, roles responsible to complete the tasks
required to successfully end the process, a description of all the terminologies that may be used
within that process, any association with the policy (if any) and the type of issue that will be
resolved using the process.
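The parts of a process listed above (step-by-step flow, responsible roles, terminology, associated policy and issue type) can be captured in a small data structure. This is a sketch under stated assumptions: the class names are hypothetical, and the three steps shown for the Change Data Request process are invented for illustration, not taken from this book.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ProcessStep:
    order: int
    action: str
    role: str  # role responsible for completing this step


@dataclass
class GovernanceProcess:
    name: str
    issue_type: str                  # type of issue the process resolves
    steps: list                      # the step-by-step process flow
    terminology: dict = field(default_factory=dict)
    policy: Optional[str] = None     # associated policy, if any

    def roles_involved(self):
        """Only the roles that actually appear in the flow."""
        return sorted({step.role for step in self.steps})

    def next_step(self, completed_orders):
        """First step, in flow order, that has not been completed yet."""
        for step in sorted(self.steps, key=lambda s: s.order):
            if step.order not in completed_orders:
                return step
        return None


# Hypothetical flow for illustration only.
change_request = GovernanceProcess(
    name="Change Data Request",
    issue_type="data change",
    steps=[
        ProcessStep(1, "Raise the change request", "Data Steward"),
        ProcessStep(2, "Assess the impact", "Data Custodian"),
        ProcessStep(3, "Approve or reject", "Data Owner"),
    ],
    terminology={"CDE": "Critical Data Element"},
)
print(change_request.roles_involved())
print(change_request.next_step({1}).action)
```

Recording the flow this way makes it easy to see which roles a process depends on and who holds the next action at any point.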
Below is a list of processes that can be used in a governance program depending on the DG
Strategy.
List of Processes
The following list is not exhaustive, but it fulfills the basic needs of a governance program.
More processes can certainly be added depending upon the requirements and objectives of a DG
Program.
The objective of the Monitor Data Cleansing process is to address identified gaps
and needs for high quality data as the enterprise may leverage the results from the
Data Quality Metrics reports.
The objective of the Data Governance Fast Track Escalation process is to efficiently
solve any severe and important data issue in an accelerated manner.
The objective of the Change Data Request process is to track data related changes
that occur within the organization.
Proactively identify potential Data Quality issues before they occur and have a
negative business impact.
Data must be of high quality with quality targets defined by Data Stewards and Data
Owners.
The objective of the Critical Data Element Identification process is to govern the
identification and documentation of Critical Data Elements (CDEs), and the
underlying criteria (rules) for the Critical Data Elements.
The objective of this process is to ensure that proper data access is granted to
protect all enterprise data from unauthorized access.
Data Governance programs are initiated to achieve certain objectives, and these objectives
differ from organization to organization. In the end, the proper initiation of a DG program
depends on the outcome an organization expects, which directly influences the DG Strategy.
Organizations differ in their businesses as well as in their management of data. Some
organizations are more mature in certain ways than others. Organizations that have the
leadership of a CDO usually consider data as something to be managed at the enterprise level;
these organizations have developed data management skills and are endorsing these practices
pervasively. That is, they have data management processes in place, but they lack governance of
these processes, which results in misunderstanding of business vocabulary and unclear
ownership of these business-critical issues.
The planning of a DG program considers all the different aspects of data to be managed,
including highlighting missing roles, considering the maturity of the organization in data
management, and the requirement for data governance measures. If an organization is missing
offices of data management, including data quality, data integration, and enterprise business
and logical models, the objective of the DG Program will be to enable these offices by
establishing them and by defining the terms of reference under which they contribute to the
governance bodies. If an organization has some of these offices, the objective of the DG Program
is to establish the rest, create a communication plan for collaboration between these offices,
and define the tasks that they have to fulfill.
Apart from these governance offices and bodies, the governance program also includes the
assets that are to be governed. Once these assets are categorized and linked together, the
traceability and lineage of the assets, along with their lifecycle, are captured to govern each
and every stage of the data and information life cycle. These asset types are described in
detail in the Asset Reference Model section later in the book.
The data governance program measures the maturity level of an organization for models and
taxonomies, for the data governance council, for data ownership and data stewardship, and for
other aspects, in order to finalize the objectives of the program. The maturity matrix is
described in the section below.
Maturity Matrix
A Maturity Matrix provides an understanding of the maturity level of an organization in terms of
different aspects of data governance. It can be defined for a particular component of
governance or can evaluate the overall program.
To evaluate maturity in using models and common taxonomies, the Model and Taxonomy
maturity matrix can be used. Similar matrices exist for Policies and Principles, as well as an
Overall Data Governance Maturity Matrix.
Below is the list of maturity matrices. These are examples that have been used by many DAMA
followers; they are the direct contribution of Christopher Michael Bradly, who is one of the
founders of DAMA.
Data Governance approaches fall into three broad categories, as follows,
each with its own pros and cons. The vision and strategy of an organization determine the type
of approach to be chosen.
1. Centralized Approach
Centralized Approach
It is much like a top-down approach, where the highest authority is responsible for making
decisions and guiding the rest to follow. Usually the person leading the data governance program
is a Chief Data Officer, Data Governance Program Director or Chief Information Officer; to
simplify, let us call this role the Lead Data Governance.
Pros
1. A centralized approach makes it easier to focus on policies and guidelines
Cons
1. Increases bureaucracy because of its linear structure
Decentralized Approach
Almost the opposite of a centralized approach: there is no central governance or single
authority acting as the data governance program owner; rather, ownership is based on committee
decisions. The committee consists of a group of people from different management domains, as
shown below.
Pros
1. Relatively easy to accomplish
3. Tasks and actions are divided among the proper resources, with delegated ownership of issue
resolution.
Cons
1. Reaching an agreement tends to take longer
2. Due to the limited availability of the required resources, it is harder to get them to
commit to and participate in scheduled meetings.
Pros
1. Allows top-down decision making with bottom-up inputs.
4. Provides ability to focus on specific data sets at the business unit level and their
relationship with the enterprise data
5. Full autonomy to develop standards, policies, procedures for the business level
Cons
1. A highly skilled DG lead position is required full-time – not an easy find
3. Decisions made at the group level will be pushed up to the upper levels for approval
4. Difficult to find balance between enterprise priorities and those of the individual
business units
5. Oversight over the autonomy of the business units can be challenging and relies a lot on
self-reporting
7. Metadata management is not simple to address, as it can differ widely from one unit to
another.
A data governance structure defines the layers (committees) required to complement the agreed
DG Approach. In a data governance structure the governing, managing and performing groups
are identified and the roles required to fulfil the governance needs are addressed. These roles
are then associated with one or more of these committees to be held accountable and
responsible for the actions to be taken.
1. What is to be governed?
The What part covers the assets to be governed; this includes assets that
are distributed across the organization and feed into the decision-making process.
Considering data to be an important asset is the essence of the What part of the
operating model. The Asset Reference Model is described in detail below.
2. How is it to be governed?
How helps the organization develop the processes essential to managing
certain strategic objectives of the data governance program. These
processes may include asset-level change processes and enterprise-level processes,
depending on the type of requirement.
Who assembles the roles and responsibilities of the governing bodies, including the
office of enterprise data management, data quality management and others
participating in the DG Program, along with IT and business units.
Asset Types
An Asset is the capital building block in the Data Governance Center. An asset captures the
authoritative lifecycle metadata, in terms of attributes and relations with other assets, for one
of the following five classes of assets:
An Asset Type formally defines the semantics of an asset in terms of attribute types and relation
types that can be instantiated for it. In other words, it serves as a template. All Asset
Types are specializations of one of five core asset types, or ‘asset classes’, as illustrated below.
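As a rough illustration of the template idea, the sketch below models an asset type as a set of
attribute types and relation types that specializes one of five core asset classes. The class and
field names (AssetType, Asset, attribute_types and so on) are hypothetical and are not taken from
any particular governance tool. The five core classes are assumed to be the ones named by the
headings in this chapter, and the inheritance of attribute types from a parent type is also an
assumption.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumption: the five core asset classes follow the headings used in this chapter.
CORE_ASSET_CLASSES = ("Business Asset", "Data Asset", "Technology Asset",
                      "Governance Asset", "Issue")


@dataclass
class AssetType:
    """A template: which attribute and relation types an asset may instantiate."""
    name: str
    parent: Optional["AssetType"] = None
    attribute_types: set = field(default_factory=set)
    relation_types: set = field(default_factory=set)

    def core_class(self) -> str:
        """Walk up the specialization chain to the core asset class."""
        node = self
        while node.parent is not None:
            node = node.parent
        if node.name not in CORE_ASSET_CLASSES:
            raise ValueError(f"{self.name} does not specialize a core asset class")
        return node.name

    def allowed_attributes(self) -> set:
        """Attribute types defined here plus those inherited from parents (assumed)."""
        inherited = self.parent.allowed_attributes() if self.parent else set()
        return inherited | self.attribute_types


@dataclass
class Asset:
    """An instance of an asset type, holding attribute values."""
    name: str
    asset_type: AssetType
    attributes: dict = field(default_factory=dict)

    def set_attribute(self, attribute_type: str, value: str) -> None:
        # Only attribute types declared by the template may be instantiated.
        if attribute_type not in self.asset_type.allowed_attributes():
            raise KeyError(f"{attribute_type!r} is not defined for {self.asset_type.name}")
        self.attributes[attribute_type] = value


# Illustrative data only.
business_asset = AssetType("Business Asset", attribute_types={"Definition"})
business_term = AssetType("Business Term", parent=business_asset,
                          attribute_types={"Example"})
customer = Asset("Customer", business_term)
customer.set_attribute("Definition", "A party that purchases goods or services.")
```

Here Business Term is a specialization of the Business Asset class, so the Customer asset can carry
a Definition attribute, while an attribute type that the template does not declare is rejected.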
Business Assets
A type of asset that is exclusively used and governed by the business user community. Its
instance assets, and all instances instantiating its subtypes, pertain to the business organization.
Business assets typically include business concepts like Business Term, Business Process, Line of
Business, etc. that help to build the semantics of any organization with insufficient details to
build an actual business application.
1. Business Dimension:
A set of reference information that categorizes and describes a Business Term in such a way that
it provides context and meaningful answers to business questions. Examples: Business Process,
Line of Business, Region, Business Capability.
2. Business Process:
A set of activities and tasks that, once completed, produce a specific result and added value to
the business. Examples: Campaign Management, Talent Recruitment & Staffing.
3. Data Category:
Also known as Data Domain or Subject area, this is a container of all the business definitions
that encompass associated terminology and definitions that an organization is trying to govern.
Examples: Master Data (Customer, Product), Reference Data.
4. Line of Business:
Also known as Business Unit or Business Area, the Line of Business is a logical element or
segment of an organization that serves a particular Business need. Examples: Asset
Management, Retail, E-com, Investment Management.
5. Business Term:
A word or phrase that is used to describe something or to express a concept in a particular kind
of language or branch of business. Examples: Customer, Person Purchase Count, Loan Amount.
6. Acronym:
An abbreviation that is used as a word. It is formed from the initial letters of a Business Term.
Examples: ERP, EDW, EAD.
7. Dashboard:
A dashboard is a data visualization that displays the current status of metrics and key
performance indicators (KPIs) for an enterprise. Dashboards consolidate and arrange
numbers, metrics and performance scorecards on a single screen.
8. Measure:
An asset type on which calculations (e.g. sum, count, average, minimum, maximum) can be
made. Examples: Net Sales, Top Customers, On-hand Inventory.
9. KPI:
10. Report:
A document containing information that is organized in a narrative, graphic, or tabular form. The
document is prepared on an ad hoc, periodic, recurring, regular or as required basis. Reports
may refer to specific periods, events, occurrences or subjects. They may be tailored for a specific
role and display information targeted to a single point of view or a business unit. Examples:
Monthly Financial Statement, Quarterly Marketing Expense Report.
Data Assets
A type of asset that represents details of organizational data in two layers. One layer is
independent of any particular technology and serves communication with non-technical
stakeholders. The other takes the implementation system into account and serves
communication with technical stakeholders. Examples: Data Element, Table.
1. Code Set:
An enumerated list of valid code values for a specific topic, where the code set is the
whole and the code values are parts of that whole. It is a data asset that defines the set
of permissible values to be used by other data assets. Examples: Product Code Set,
Person Gender Code, ISO 3166 Country Code.
2. Code Value:
A valid form of representation for an asset, shortened or covert. Examples: In the Person
Gender Code, "male", "female", "not known" and "not specified" are represented by the
valid code values "1", "2", "0" and "9". "US" is part of the "ISO 3166 code set" and refers
to The United States of America.
3. Crosswalk:
Mapping between two or more Code Sets.
4. Data Element:
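The Code Set, Code Value and Crosswalk definitions above can be illustrated with a small sketch.
The Person Gender Code values come from the example in the text. The second code set
(HR_GENDER_CODE), the crosswalk between the two, and the function names are hypothetical and
invented purely for illustration.

```python
# Code Set: the whole; each key is a Code Value, mapped to its meaning.
# Values taken from the Person Gender Code example in the text.
PERSON_GENDER_CODE = {
    "0": "not known",
    "1": "male",
    "2": "female",
    "9": "not specified",
}

# Hypothetical second code set, invented for illustration only.
HR_GENDER_CODE = {
    "U": "not known",
    "M": "male",
    "F": "female",
    "N": "not specified",
}

# Crosswalk: a mapping between the code values of two code sets.
GENDER_CROSSWALK = {"0": "U", "1": "M", "2": "F", "9": "N"}


def is_valid(code_value: str, code_set: dict) -> bool:
    """A code value is permissible only if it belongs to the code set."""
    return code_value in code_set


def translate(code_value: str, crosswalk: dict, target: dict) -> str:
    """Map a code value from the source code set to the target code set."""
    if code_value not in crosswalk:
        raise KeyError(f"no crosswalk entry for {code_value!r}")
    mapped = crosswalk[code_value]
    if mapped not in target:
        raise ValueError(f"{mapped!r} is not a valid target code value")
    return mapped
```

A data asset that references the code set would accept "1" or "9" and reject "3", and the
crosswalk lets a value recorded in one system be read consistently in another.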
Technology Assets
A piece of information technology (hardware, software, database, software platform) that helps
an organization to run a business application. Examples: Database, File.
1. Database:
A collection of data that is systematically organized or structured in order to make it
easy to create, update and query the information. Examples: Ora_DGC_V45,
SalesDB2020.
2. Directory:
An organizational structure that contains files and/or other directories. Examples:
C:/example
3. File:
A collection of data that is treated by a computer as a unit, for the purposes of input and
output. Examples: businessGlossary.xls, dataDictionary05220.csv, datacatalogv25.txt.
4. System:
Executable software that you can buy commercially off the shelf (COTS), or build
internally, to automate one or more business functions that help run a business
smoothly and efficiently. Examples: CRM, ERP, EDW.
Governance Assets
A type of asset that is used to monitor and maximize the performance or utilization of
other Business and Data assets while minimizing risk factors, in alignment with
Organizational/Business goals.
Represents a criterion relevant for assessing quality and categorizes different aspects of
how data quality is measured. Examples: Accuracy, Completeness, Consistency.
An arrangement between data producers and consumers with terms and conditions,
including provisions concerning access and dissemination, to 'pool' a set of data for
specific purposes. Example: Sales growth information that is available only to the Risk
team, and only for generating internal reports.
3. Issue Category:
An asset that is used to categorize related issues based on particular criteria, which
helps to assign the right people to resolve them. Examples: Incomplete Definition,
Inconsistent Data.
4. Policy:
A statement of intent that is implemented by a set of rules. Policies are usually set by a
data governance council. Examples: Personal Information must be adequately
protected, Customer Data Deduplication Policy.
5. Standard:
A specific low-level mandatory action or rule that helps to enforce and support a policy.
Example: All personal information has to be encrypted with a specific encryption type.
6. Rule:
Defines or constrains an aspect of business (data) and always resolves to either true or
false. Example: Where a Business Rule may state that 'A driver must be of eligible age', a
Data Quality Rule Spec says 'age must be 21'.
7. Business Rule:
A specification that defines which conditions have to be met to measure the quality
level of a data element for its intended use. Examples: SSN must be a 9-digit unique
identification number for 100% of US personal accounts for Tax processing. Country of
origin code should be in the taxing country code set for the Accounting Business area.
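To make the idea of a rule that "always resolves to either true or false" concrete, here is a
minimal sketch of the SSN example above. It assumes SSNs are stored as plain 9-digit strings
without dashes and that account records are simple dictionaries. The function names and the
sample accounts are hypothetical.

```python
import re
from collections import Counter

# Assumption: an SSN is stored as exactly 9 digits, with no dashes or spaces.
SSN_PATTERN = re.compile(r"\d{9}")


def ssn_rule_pass_rate(accounts: list) -> float:
    """Share of accounts whose SSN has 9 digits and is not shared with another account."""
    if not accounts:
        return 1.0  # nothing to check, so nothing violates the rule
    counts = Counter(a["ssn"] for a in accounts)
    passed = sum(
        1 for a in accounts
        if SSN_PATTERN.fullmatch(a["ssn"]) and counts[a["ssn"]] == 1
    )
    return passed / len(accounts)


def rule_holds(pass_rate: float, threshold: float = 1.0) -> bool:
    """Resolve the rule to true or false; the default threshold means 100% of accounts."""
    return pass_rate >= threshold


# Made-up sample data for illustration.
accounts = [
    {"id": 1, "ssn": "123456789"},
    {"id": 2, "ssn": "987654321"},
    {"id": 3, "ssn": "12345678"},   # too short
    {"id": 4, "ssn": "987654321"},  # duplicate of account 2
]
```

Separating the measured pass rate from the true/false decision keeps the quality level visible
even when the rule fails, which is useful when the result feeds a data quality report.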
Issue
The parent asset type of all issues.
Issue assets are further categorized into, but not limited to, the following asset types:
Data Issue:
The initial asset reference model provides the needed assets and their relationships with other
asset types. This helps define the different types of assets and assign their ownership to the
responsible department or individual. The business assets are contributed and taken care of by
business unit staff, including data owners and data stewards. The data assets are contributed
and operationalized by IT staff, and similarly for governance and technology assets. Each
established office or individual takes part in controlling the assets by using the defined HOW
processes.
3. Other Data-related
While the Data Governance Office acts as the facilitator, enabler and implementer of data
governance, the responsibility for enterprise data governance resides with executive
management. Enterprise Data Management follows from Enterprise Data Governance.
Master Reference Data processes maintained by the Data Integration Office. The DAO is led by
the Enterprise Data Architect (EDA).
BICC
The Business Intelligence Competency Center (BICC) is the unit responsible for determining the
information needs of knowledge workers and other data consumers, developing appropriate data
reporting structures to meet those needs and providing them to the Data Integration Office, and
implementing self-service reporting, analysis and visualization (business intelligence) tools so
that they are available throughout the organization. The BICC is responsible for both the EDW
and BI offices. The roles and responsibilities of individual members are stated in the Roles and
Responsibilities section above.
These metrics will allow the organization to measure the effectiveness of its Data Governance
meetings with the Data Governance Board, Data Governance Council and Data Stewards Council.
# | Metric | Description | Calculation | Frequency
1 | Total Number of DG Meetings | Total number of DG Meetings held for all networks and steering committee | = sum of DG Board meetings + DG Council meetings + DG Stewardship meetings + all one-off meetings | Quarterly
2 | Total Number of one-off meetings | Total number of official DG one-off Meetings | = count of one-off meetings for all networks in a period of time | Quarterly
6 | Attendance at DG Meetings | Total number of participants at the DG Meetings | = sum of participants at all meetings (including one-off meetings) | Quarterly
7 | Average attendance at DG Meetings | Average attendance at DG Meetings at a point in time | = total attendance at DG Meetings / number of meetings | Quarterly
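The meeting metrics above reduce to simple counts and sums over a log of meetings. The sketch
below shows one way to compute them for a quarter, assuming each meeting is recorded with the
body that held it, its attendance and a one-off flag. The Meeting record, the function name and
the sample figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Meeting:
    body: str              # e.g. "Board", "Council", "Stewardship"
    attendees: int         # number of participants who attended
    one_off: bool = False  # True for meetings outside the regular schedule


def meeting_metrics(meetings: list) -> dict:
    """Compute the quarterly meeting metrics from a list of Meeting records."""
    total = len(meetings)                            # metric 1: all meetings, one-off included
    one_off = sum(1 for m in meetings if m.one_off)  # metric 2
    attendance = sum(m.attendees for m in meetings)  # metric 6
    average = attendance / total if total else 0.0   # metric 7, guarded against no meetings
    return {
        "total_meetings": total,
        "one_off_meetings": one_off,
        "total_attendance": attendance,
        "average_attendance": average,
    }


# Made-up sample data for one quarter.
quarter = [
    Meeting("Board", 8),
    Meeting("Council", 12),
    Meeting("Stewardship", 15),
    Meeting("Council", 5, one_off=True),
]
metrics = meeting_metrics(quarter)
```

With this sample the quarter has four meetings, one of them one-off, a total attendance of 40 and
an average attendance of 10 per meeting.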
These metrics will assess the effectiveness of the Data Governance Board, Data Governance
Council and Data Stewards Council and determine the level of participation.
# | Metric | Description | Calculation | Frequency
3 | Total Attendance at Data Governance Events | Total number of participants at the events | = sum of participants at the events | Quarterly
These metrics will allow the Enterprise Data Governance organization to assess how effective
they are in training the participants of the Data Governance Operating Model.
# | Metric | Description | Calculation | Frequency
1 | Number of training curricula created | Total number of training curricula created spanning all Data Governance areas | = count of all active training curricula | Quarterly
2 | Average number of training courses offered | Average number of trainings offered at a point in time | = total number of trainings offered / number of measurements | Quarterly
Data governance programs are time consuming. They require dedication, commitment and repeated
refinement before the organization adopts the changes needed to execute and successfully
implement the data governance initiatives. DG programs are iterative in nature and therefore
difficult to implement with methodologies like waterfall; they call for an iterative, agile
approach.
The methodology below is designed specifically for DG projects so that they can be implemented
with a higher success rate, taking into account change management and adoption by the
organization.
Strategy Phase
The methodology begins with a strategy phase. In this phase, the basic objective of the DG
Program is identified and any known barriers are considered. This phase drives the vision of
the program for the enterprise. The stakeholders and sponsors are clearly identified, an as-is
assessment is conducted, and an initial data governance board is identified or formed.
Approach
Once the strategy and objectives are identified in the strategy phase, the approach for the data
governance program is finalized, the To-Be structure is proposed, and the organization structure
is developed. The roles and responsibilities are drafted, an initial asset reference
model is finalized and a plan for change management is developed.
Identify
In the identify phase, a starting point is validated: a business unit or function is chosen to
start with. Since the approach, structure and roles were already defined in the approach phase,
they are applied to the agreed business unit. The development of policies and processes is
initiated and an initial data governance council is identified or formed.
Quickwin
This phase emphasizes the implementation of the agreed assets and processes for the selected
business unit or report.
This is again an iterative phase: the assets can be implemented a few at a time. A business unit
can begin with its most important assets, for example business terms or KPIs, and in later
iterations of the quickwin phase more assets can be added until the entire asset reference model
is covered.
Refine
Once the quickwin phase has completed its iterations, it is time to make some noise: create
enterprise-wide awareness and prepare the other departments to participate fully in the
program. The plan for change management is put into effect. More DG trainings are scheduled
and participants are asked to take part in the trainings and sessions. If the structure has to
be modified, these modifications are highlighted, justified, planned and documented in the
refine phase for later approval and implementation.
Analysis
The analysis phase concludes the change management program. It highlights the lessons
learned from the previous iteration and any need for further modification of the requirements.
The analysis phase is followed by either the approach phase or the strategy phase of the next
iteration. If more objectives are being added to the DG program, the strategy phase follows and
the new objectives are strategized there. Otherwise, the highlighted gaps or new
requirements from the stakeholders (if any) are embedded in the approach and the new
iteration begins.
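The flow between the six phases described above can be summarized as a small state machine. The
sketch below is only an illustration of that flow as described in this chapter. The names Phase
and next_phase and the two keyword flags are hypothetical.

```python
from enum import Enum


class Phase(Enum):
    STRATEGY = "Strategy"
    APPROACH = "Approach"
    IDENTIFY = "Identify"
    QUICKWIN = "Quickwin"
    REFINE = "Refine"
    ANALYSIS = "Analysis"


# The order in which the phases run within one iteration.
_ORDER = [Phase.STRATEGY, Phase.APPROACH, Phase.IDENTIFY,
          Phase.QUICKWIN, Phase.REFINE, Phase.ANALYSIS]


def next_phase(current: Phase, *, more_quickwin_iterations: bool = False,
               new_objectives: bool = False) -> Phase:
    """Return the phase that follows `current`."""
    # Quickwin is itself iterative: repeat it while more assets remain to be added.
    if current is Phase.QUICKWIN and more_quickwin_iterations:
        return Phase.QUICKWIN
    # After analysis, new objectives send the program back to strategy;
    # otherwise gaps and new requirements are folded into the approach.
    if current is Phase.ANALYSIS:
        return Phase.STRATEGY if new_objectives else Phase.APPROACH
    return _ORDER[_ORDER.index(current) + 1]
```

For example, next_phase(Phase.ANALYSIS, new_objectives=True) returns Phase.STRATEGY, while the
default call returns Phase.APPROACH and starts the next iteration from there.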
Below is the sample of artifacts associated with each phase, its outcome and reason.
Responsibilities
Serve as the ultimate authority in defining Data Governance scope and strategic vision
Review and approve the budget for the new initiatives and the continuation of the
program
Review and evaluate overall Data Governance effectiveness by reviewing and publishing
Data Governance KPIs
Review and approve the Data Governance Policies and Processes compiled by
Stewardship Leads
Escalation requests
Members
Roles Member Name
Schedule
On premises meeting + Conference Call
Responsibilities
Mediate and arbitrate the resolution of escalated issues from the Data Stewards Council
Set and communicate the Data Governance strategy across the organization
Review and evaluate overall Data Governance effectiveness through Data Governance
KPIs
Review and approve the Data Governance processes created by the Stewardship Leads
Escalation requests
Members
Roles Member Name
Schedule
On premises meeting + Conference Call
Weekly or Bi-weekly.