Executive Summary
What is data migration?
Data migration is the process of moving data from one environment to another, typically from legacy source systems into a new system known as the target system or target application. It differs from ETL, where data moves between existing environments, although ETL processes are sometimes used to support a migration program. A variety of tools and techniques usually support the migration to make it efficient. Migration is also an opportunity to evaluate and improve data quality, which can have a critical impact on the business once the new application goes into production.
Data migration is frequent in the IT business and consumes considerable budget and resources. The frequency of migration projects, combined with the resources each one consumes, means they account for a significant share of any company's IT budget. As data storage architectures become larger and more complex, migrations are also becoming more complex and risky, so organizations should manage these projects carefully. A typical example is a change from one database platform to another, such as from SQL Server to an Oracle database.
Semantics risk
Semantics risk arises when the migration itself runs correctly, but the units of some fields are not considered, leading to inconsistent data. The units in the new system may differ from those in the source application, and because of these differing notations, wrong data can reach the target system. For example, suppose a field called amount in the source database is migrated to base amount in the target application. In the source application the field holds Indian rupees, but in the target the base amount field holds US dollars. If 10 INR (Indian Rupees) is migrated as 10 US dollars, that is a mistake, because 10 INR is not 10 US dollars. Risks of this type are known as semantics risks.
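The INR-to-USD example above can be sketched as a small unit-aware check. The conversion rate and function names here are illustrative, not from the original project:

```python
# Hypothetical sketch of a field-level unit check during migration.
INR_PER_USD = 83.0  # assumed exchange rate, for illustration only

def migrate_amount(amount_inr: float) -> float:
    """Correct mapping: converts the source 'amount' (INR) to the target 'base amount' (USD)."""
    return round(amount_inr / INR_PER_USD, 2)

def naive_migrate(amount_inr: float) -> float:
    """Buggy mapping: copies the number as-is, silently changing its unit."""
    return amount_inr

source_value = 10.0                     # 10 INR in the source system
bad = naive_migrate(source_value)       # 10 INR becomes "10 USD" -- the semantics error
good = migrate_amount(source_value)     # the unit is actually converted
print(f"naive: {bad} USD, converted: {good} USD")
```

A semantics-focused test would flag any field whose value survives migration unchanged even though its unit changed.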
Orchestration risk
Orchestration risk concerns the order in which the processes of the migration project are executed. A migration consists of multiple processes that must be performed in a planned sequence, and since there are dependencies between different business objects, the order in which they are migrated is very important.
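The ordering problem can be illustrated with a topological sort over business-object dependencies; the object names here are invented for the sketch:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependencies: each business object maps to the set of
# objects that must be migrated before it.
dependencies = {
    "accounts":  {"customers"},
    "invoices":  {"accounts", "products"},
    "customers": set(),
    "products":  set(),
}

# static_order() yields every object only after all of its prerequisites.
order = list(TopologicalSorter(dependencies).static_order())
print(order)  # one valid migration order
```

Running the migration in this order guarantees, for instance, that customers exist before the accounts that reference them.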
Interference risk
This type of risk appears when stakeholders keep using the source application during the transition period. For example, if one stakeholder accesses a particular table and locks it, any other person who tries to access that table will be unable to do so. Situations like this give rise to interference risk.
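A minimal, self-contained illustration of such interference, using SQLite with invented table and file names: one session (the migration job) holds a write lock while a stakeholder session tries to write:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "legacy.db")

migrator = sqlite3.connect(path, isolation_level=None)  # autocommit; transactions managed explicitly
migrator.execute("CREATE TABLE accounts (id INTEGER, amount REAL)")
migrator.execute("BEGIN IMMEDIATE")   # the migration job takes the write lock
migrator.execute("INSERT INTO accounts VALUES (1, 10.0)")

stakeholder = sqlite3.connect(path, timeout=0.1)  # a concurrent user session
blocked = False
try:
    stakeholder.execute("INSERT INTO accounts VALUES (2, 20.0)")
except sqlite3.OperationalError:      # 'database is locked'
    blocked = True

migrator.execute("COMMIT")            # releasing the lock ends the interference
print("stakeholder blocked during migration:", blocked)
```

This is exactly why the paper recommends managing interference organizationally, e.g. by scheduling the migration in an agreed downtime window.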
Mapping testing techniques to the risks in a data migration
Different testing techniques are used to mitigate the risks present in the data migration process. A technique that mitigates one risk may not resolve the issues arising from another, so it is important to know the different mitigation techniques and how they map to each risk. The following table shows the risk types and the testing techniques best suited to mitigate them.
Risk                                        Testing technique
Completeness risk                           Reconciliation test
Semantics risk                              Appearance test
Orchestration risk                          Migration run test
Interference risk                           None (managed organizationally)
Target application parameterization risk    Processability test
Reconciliation test
Reconciliation is the only test that covers almost all the data present in the system. Testing the mapping between source and target fields is very important, because the mapping between source and target attributes sometimes changes. A reconciliation test fundamentally checks for objects missing from the target database and for extra objects appearing in it. Once these objects are identified, fresh code is deployed to resolve the mismatches.
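At its core, the missing/extra-object check is a pair of set differences. A minimal sketch, using invented document IDs:

```python
# Reconciliation sketch: compare object identifiers between the source
# and target systems. The IDs below are made up for illustration.
source_ids = {"DOC-001", "DOC-002", "DOC-003", "DOC-004"}
target_ids = {"DOC-001", "DOC-002", "DOC-004", "DOC-999"}

missing_in_target = source_ids - target_ids   # objects the migration dropped
extra_in_target   = target_ids - source_ids   # objects that should not exist

print("missing:", sorted(missing_in_target))
print("extra:  ", sorted(extra_in_target))
```

In practice the ID sets would come from SQL queries against the two databases rather than literals.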
Appearance test
In an appearance test, testers manually compare objects present in the source and target applications by looking at the main screens of the applications. For example, if a field in the source application holds an amount in US dollars, testers can manually check whether the same field in the target application uses the same unit or a different one. Similarly, the source application may restrict a field to two decimal places while the target application enforces no such constraint. A Quality Assurance (QA) team member can notice this by testing at the Graphical User Interface (GUI) level alone, as it is easily identified there.
Processability test
A processability test ensures that the target application interacts successfully with the newly migrated data. It processes the migrated data, which helps identify inconsistencies and incompatibilities between that data and the parameterization of the target application. The migration process writes data straight into the database tables, and the database schema is not concerned with the units of the fields. An appearance test would not fail on such a check, since nothing can be inferred from the UI; but when someone processes the data, the application crashes. Errors of this type can only be found when the data is sampled and the test cases are written to cover such mismatches.
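As a sketch of a processability check: the two-decimal-place rule and the field values below are hypothetical, echoing the appearance-test example. The migration wrote raw values straight into the table; only processing them through application logic reveals the bad row:

```python
from decimal import Decimal

def process_payment(amount: Decimal) -> Decimal:
    """Hypothetical target-application rule: amounts carry at most 2 decimal places."""
    if -amount.as_tuple().exponent > 2:
        raise ValueError(f"amount {amount} has more than 2 decimal places")
    return amount * 2  # stand-in for the real business processing

# Rows written directly to the table by the migration, bypassing validation.
migrated_rows = [Decimal("10.50"), Decimal("3.14159")]

failures = []
for row in migrated_rows:
    try:
        process_payment(row)
    except ValueError as err:
        failures.append(str(err))

print(failures)  # the processability test surfaces the incompatible row
```

The schema accepted both rows happily; only the processing step, i.e. the processability test, catches the mismatch.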
Integration test
Integration tests are used when an application is not standalone but is interlinked with other applications. If one application changes, its impact on the others must be tested; hence all functionalities of the target application, with the migrated data, must be tested in the context of its interlinked applications.
Migration run test
In this test, the execution time of the overall data migration process is measured, since it directly influences the downtime of the application, i.e. the interruption of business operations. Stakeholders can then see which business objects migrate quickly and which take a long time, and plan the migration process accordingly so that the impact on the application is minimal.
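One way to gather these per-object timings, sketched with made-up business objects and simulated work in place of a real migration job:

```python
import time

def migrate(obj: str, rows: int) -> None:
    """Stand-in for the real migration work; sleeps proportionally to row count."""
    time.sleep(rows / 1_000_000)

# Hypothetical business objects and row counts.
workload = [("customers", 50_000), ("invoices", 400_000)]

durations = {}
for obj, rows in workload:
    start = time.perf_counter()
    migrate(obj, rows)
    durations[obj] = time.perf_counter() - start

# Report objects from fastest to slowest, for downtime planning.
for obj, secs in sorted(durations.items(), key=lambda kv: kv[1]):
    print(f"{obj}: {secs:.3f}s")
```

The resulting table of durations is exactly what stakeholders need to decide which objects to migrate first and how long the cutover window must be.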
A partial migration run test migrates a few business objects in a shorter span of time and also speeds up the overall migration process; development finishes sooner and can be tested in parallel. As a consequence, however, there is a higher risk of discrepancies in the migrated data: because only a small set of data is used in the migration, critical data is often missed, and the application crashes more frequently.
How to mitigate the interference risk?
Interference risk is an operational risk that cannot be addressed by testing; it must be managed at the organizational level.
How to mitigate the target application parameterization risk?
Mitigating the parameterization risk requires determining whether all data could be migrated, whether some data was not accepted by the new application, or whether the data was migrated completely but the target application crashes or misbehaves somewhere. Parameterization risk can therefore be covered by combining the testing techniques used to address the completeness, semantics, and corruption risks.
Analyze the business requirements, i.e. the staging area requirements and the data integrity check requirements
Identify success and failure conditions, and also the application interface requirements
Performance and tuning testing of the migration process, recording the execution time and checking it against the acceptance criteria
Post-production support to eliminate any issues that may occur when the system goes live
Organizations must look for expertise in the migration area so that better guidance is available for the migration process. A data migration project involves specialized skills, tools, and resources, but sometimes the resources identified for the project do not have the essential knowledge to carry out the migration program.
All stakeholders must be informed of the migration project in advance so that they are aware of the time period of the migration, how long the old system will be unavailable, and the benefits of migrating the legacy system to the new application.
Verify the working condition of the old systems and address any issues found during migration.
Ensure that a backup of the old environment or system is taken, so that if the migration fails, the data can be reloaded and migrated again.
During migration
Communicate with all end users and other stakeholders while the migration is in progress, so that they do not raise incident/service tickets for issues occurring in the existing application.
The environments or platforms used for migration should not be changed during the migration.
In the planning phase itself, estimates for proper backup of the environments should be made, so that the backup is taken properly and is not affected by any system failure.
After migration
All failed items should be reviewed and inspected to determine why they failed to migrate, and then migrated again.
All stakeholders should be informed of the expected time when the new system will come into use.
The original data and its backup should be deleted only after the migration has been confirmed successful.
The major objective was to test whether all the documents belonging to one particular category were migrated properly to the new system. Since a huge number of documents had to be migrated, it was not feasible to check each and every document (Completeness Risk).
A big challenge was testing the metadata of the documents migrated into the new system. Metadata comprises the attributes associated with a document, such as document ID, owner, and created date, without which the document serves no purpose. To ensure correct data migration, proper metadata mapping was required for integration with the new system (Semantics Risk).
It also had to be ensured that no other data or document was affected or broken as a regression in the old legacy system, since the old system is huge and only a part of it was migrated in this project (Data Corruption Risk).
Finally, the overall volume of data migrated was very large, so testing the whole migration process effectively was itself a major challenge.
The first major step was to check whether all the documents in the old system were migrated to the new system. To mitigate the completeness risk and ensure better coverage, the test was applied across different document types and names, so that all the different combinations of documents were covered.
The second important check was to test the metadata associated with the documents migrated into the new system. The mapping between the source and target attributes/columns/fields was provided by the business. We used the Data Testing Framework (DTF), a proprietary tool of L&T Infotech, through which Structured Query Language (SQL) queries were built against both the old and new databases and their results compared for discrepancies. With this tool, the metadata of a large number of documents could be compared in a single test case, giving greater coverage with reduced testing effort.
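DTF itself is proprietary, but the query-comparison idea it implements can be sketched generically. The schemas, rows, and metadata values below are invented for illustration, using SQLite in place of the real old and new databases:

```python
import sqlite3

# Two in-memory databases stand in for the old and new systems.
old_db = sqlite3.connect(":memory:")
new_db = sqlite3.connect(":memory:")
for db in (old_db, new_db):
    db.execute("CREATE TABLE docs (doc_id TEXT, owner TEXT, created TEXT)")

old_db.executemany("INSERT INTO docs VALUES (?,?,?)",
                   [("D1", "alice", "2020-01-01"), ("D2", "bob", "2020-02-01")])
new_db.executemany("INSERT INTO docs VALUES (?,?,?)",
                   [("D1", "alice", "2020-01-01"), ("D2", "carol", "2020-02-01")])

# Run the same metadata query on both sides and diff the ordered results.
query = "SELECT doc_id, owner, created FROM docs ORDER BY doc_id"
old_rows = old_db.execute(query).fetchall()
new_rows = new_db.execute(query).fetchall()

discrepancies = [(o, n) for o, n in zip(old_rows, new_rows) if o != n]
print(discrepancies)  # D2's owner changed during migration
```

One such paired query can validate the metadata of every document in a table, which is what made the single-test-case comparison of many documents possible.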
Integration tests were performed to ensure that the new data was not corrupting other documents, functionalities, or data sources. In this way, testing efficiently covered all the possible risks.
Of the defects found, three fell into the category of others, i.e. related to performance and security issues.
Conclusion
Creating an effective data migration testing strategy is critical to reducing risk and delivering a successful migration, and appropriate steps were taken in this project to ensure it. The project presented major challenges through the large volumes of data to be migrated, and the selective migration of those volumes brought its own challenges, such as integration testing and the impact on other functionalities. Proper planning, the use of the right mitigation techniques, and extensive use of automated tools (DTF) helped us cope with all the challenges and deliver a thoroughly high-quality product.