
Database Administration:

The Complete Guide to Practices and Procedures

Chapter 17
Disaster Planning

Agenda
The Need for Planning
General Disaster Recovery Guidelines
Backing Up the Database for Disaster Recovery
Disaster Prevention
Questions

What is a Disaster?
Sungard Recovery Services defines a
disaster as any unplanned, extended loss
of critical business applications due to
lack of computer processing capabilities
for more than a 48-hour period.
An alternative definition: any event that
has a small chance of transpiring, a high
level of uncertainty, and a potentially
devastating outcome.

The Need for Planning


A disaster does not have to have global
consequences in order for it to be a disaster for
you.
Understanding how a disaster might impact your business
is the sole purpose of disaster recovery planning.
Recognize likely situations:
If your business is on a coast, the likelihood of
hurricanes, floods, and tornadoes increases.
If your business is located in the north, blizzards and
severe cold weather will pose more of a risk.
California businesses are more apt to worry about
earthquakes.

The fact that your organization has not yet
experienced a disaster, or is not in a
high-risk area, does not absolve you
from the need for contingency
planning.
Database disaster recovery must be
an integral component of your
overall business recovery plan.

Risk and Recovery
[Figure: importance of the data (critical, non-critical) versus volatility of the data (static, dynamic); layers shown: business operations, IT operations, DBMS operations.]

Business Needs Dictate Priorities
Very critical applications.
The most-critical applications in your organization will require current
data upon recovery.

Business-critical applications.
Important to your organization and should be the next group to
recover after the very critical applications.

Critical applications.
Differentiated from business-critical applications by their immediacy or
data currency needs. This group of applications, though important,
need not be available immediately.

Required applications.
Not critical but must be backed up such that they can be recovered at
the remote site if needed.

Noncritical applications.
Need not be supported in the event of a disaster.
Very few applications fall into this category.
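
The priority groups above can be sketched as a simple ordering rule. This is a minimal illustration only: the `Tier` names mirror the five groups on this slide, while the function name and the sample application names are hypothetical.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Recovery priority groups; lower value means recover sooner."""
    VERY_CRITICAL = 1
    BUSINESS_CRITICAL = 2
    CRITICAL = 3
    REQUIRED = 4
    NONCRITICAL = 5


def recovery_order(apps):
    """Return application names in recovery order.

    `apps` maps an application name to its Tier. Noncritical applications
    are left out because they need not be supported in a disaster.
    """
    supported = [(tier, name) for name, tier in apps.items()
                 if tier != Tier.NONCRITICAL]
    # Sort by tier first, then by name so the result is deterministic.
    return [name for _tier, name in sorted(supported)]
```

Calling `recovery_order` with a mapping of application names to tiers yields the sequence in which to bring applications back at the remote site. The real ordering would come from the business, not from the DBA alone.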

General Disaster Recovery Guidelines
Minimize downtime and loss of data.
Planning for disaster recovery is an enterprise-wide task.
The DBMS and database recovery is just one component of
an overall disaster recovery plan.
When your organization creates a disaster recovery plan, it
needs to look at all of its business functions and
operational activities:
Customer interfaces
Phone centers
Networks
Applications
Every company function that can be impacted by a disaster.

However, this lesson addresses only DBMS and database-related recovery issues.

The Remote Site


An off-site location at which to set up operations.
Must be located far enough away from
primary site so as not to be impacted
by the disaster.
You may need multiple remote sites.

Options:
Dual data centers
Backup data center
Recovery service provider

The Written Plan


The disaster recovery plan needs to be in
writing.
Should be distributed to all key
personnel.
The disaster recovery plan is a living
document.
It needs to be updated as the business and IT
environment changes.
Whenever the plan changes, be sure to
destroy all of the outdated copies of the plan
and replace them with the new plan.

The Benefits of a Written Plan
It causes you to formulate the explicit actions to be
taken in the event of a disaster.
It makes you order these actions into specific
sequential steps.
It forces you to be specific about the tools to be used
and the exact backup information required.
It documents the location where all the required
information is stored and how it is to be made
available at the recovery site.
It provides a blueprint for others to follow, in case
those who are most familiar with the plan are not
available.

Required Components of the Written Plan
Off-site location.
The address of the remote location(s), along with the phone number, fax number, and
address of the contact at each remote site. Additional useful details could include a list of
nearby hotels, options for travel to the recovery site, details of how expenses will be handled,
and other pertinent information.

Personnel.
The name and contact information for each member of the recovery team. Be sure to include
the work, home, and mobile phone numbers for each team member.

Authorizations.
The security authorizations necessary for the recovery operations and the personnel to whom
they've been granted.

Recovery procedures and scripts for all system software, applications, and data.
The complete step-by-step procedures for the recovery of each piece of system software,
every application, and every database object, and the order in which they should be restored.
Part of this section should be a listing of all the installation tapes for system software as well
as the tapes for all maintenance that has been applied. Options for database recovery
procedures will be covered later in this chapter.

Reports.
List the reports you will need at the recovery site to ensure a complete recovery. The reports
should list each backup tape, its contents, when it was produced, when it was sent from the
primary location, and when it arrived at the remote site. As an additional component, include
a description of the naming conventions for the remote site backup files.
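
The backup report described above could be modeled along these lines. This is a sketch under stated assumptions: the field names, the `TapeRecord` class, and both helper functions are hypothetical, chosen only to match the items the report should list (tape, contents, when produced, when sent, when arrived).

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TapeRecord:
    tape_id: str             # label on the backup tape
    contents: str            # database object(s) held on the tape
    produced: date           # when the backup was created
    sent: date               # when it left the primary location
    arrived: Optional[date]  # when it reached the remote site; None if not yet there


def not_yet_arrived(records):
    """Return the IDs of tapes that have no arrival date recorded."""
    return [r.tape_id for r in records if r.arrived is None]


def format_report(records):
    """Render one line per tape, oldest backup first."""
    lines = []
    for r in sorted(records, key=lambda r: (r.produced, r.tape_id)):
        arrived = r.arrived.isoformat() if r.arrived else "NOT ARRIVED"
        lines.append(
            f"{r.tape_id}  {r.contents}  produced={r.produced.isoformat()}  "
            f"sent={r.sent.isoformat()}  arrived={arrived}"
        )
    return "\n".join(lines)
```

A report like this is only useful if a copy is available at the recovery site, so it would normally travel with the tapes or be stored with the written plan.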

Testing Your Disaster Plans


Test the plan at least once a year and after the
following events:
Significant change in daily operations
Change in system hardware configuration
Upgrade of the DBMS (or related system software)
Loss (or hire) of personnel responsible for the recovery
Move of primary data center to a new location
Change in daily backup procedures
Addition of major new applications or significant
upgrades of existing critical applications
Major increase in the amount of data or the number of
daily transactions
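
The testing rule above (at least yearly, plus after any listed event) can be expressed as a small check. This is an illustrative sketch: the event labels in `TRIGGER_EVENTS` are hypothetical shorthand for the bullets on this slide, and the 365-day interval is an assumed reading of "at least once a year".

```python
from datetime import date, timedelta

# Hypothetical labels, one per triggering event listed on the slide.
TRIGGER_EVENTS = {
    "operations_change",        # significant change in daily operations
    "hardware_change",          # change in system hardware configuration
    "dbms_upgrade",             # upgrade of the DBMS or related system software
    "personnel_change",         # loss or hire of recovery personnel
    "data_center_move",         # primary data center moved
    "backup_procedure_change",  # change in daily backup procedures
    "application_change",       # major new or significantly upgraded applications
    "volume_increase",          # major increase in data or daily transactions
}

MAX_INTERVAL = timedelta(days=365)  # assumed meaning of "at least once a year"


def dr_test_due(last_test, today, events_since_last_test):
    """Return True if a disaster recovery test should be run now."""
    if today - last_test >= MAX_INTERVAL:
        return True
    return any(event in TRIGGER_EVENTS for event in events_since_last_test)
```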

Testing Goals
A disaster recovery test can discover weaknesses and errors in the
plan.
A valid disaster recovery test need not end in a successful recovery,
although that is the desired result. A disaster recovery test that
reveals weaknesses in the plan serves a useful purpose.
Afterward, be sure to address all problems encountered during the test.

Use tests to assure the readiness of your personnel.


The best way to prepare for a disaster is to practice disaster
recovery.
The process of actually implementing the plan forces you to
confront the many messy details that need to be addressed during
the recovery process.
Testing can help to identify issues that would never spring to mind
without a test.
Testing also helps you to become familiar with the tools and
procedures you will use during an actual disaster recovery.

Scheduling a Test?
A scheduled test of the disaster recovery plan is
not a good idea.
A disaster recovery test should work more like a
pop quiz that doesn't give you the opportunity to
prepare.
One day your boss should come to work and
announce that the building was just destroyed.
Who should be called?
Is everyone available?
How can you get the right people to the remote site for
recovery?
Can you get your hands on the disaster recovery plan?

Personnel
Choosing the right team is essential.
From the perspective of the DBMS, the team must be capable of:
installing and configuring the DBMS system software
assuring the integration of the DBMS with other system
software components
recovering individual databases
testing the integrity of the databases
recovering related data that may not be stored in a
database
installing and configuring application software
testing the applications
taking care of the numerous details along the way.

Backing Up the Database for Disaster Recovery
There are multiple techniques that
can be deployed.
Tape Backups
Storage Management Backups
Other Approaches

Tape Backups
You can use techniques similar to those
used to create local backup files.
Multiple outputs from image copy backups:
Local
Remote

Usually not a good idea to back up indexes
for disaster recovery purposes.
Create a report of all backups
created/required.

Ship Backups & Logs Off-Site
[Figure: timeline showing backups and log(s), with these annotations:]
Image copy backups are taken and one is sent off-site to the remote location.
Database modifications are applied and logged at the local site.
A disaster occurs, taking down the local site.

Recovery Using Tape Backups
Recovery at the remote site is performed for each database object
one by one.
Indexes are rebuilt or recreated after the data is recovered.

Additional preparation may be required, depending on the DBMS
and operating system(s) in use.
The system catalog will need to be recovered at the remote site
regardless of the DBMS in order to recover any database objects
at all.
Other DBMS-related files may need to be recovered after the DBMS is
installed, as well.

Keep at least three backup tapes at your remote site for each
database object.
This provides a cushion in case one or more of the image copy tapes is
damaged.
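
The three-tape guideline above lends itself to an automated check against the remote site's tape inventory. This is a minimal sketch: the function name and the shape of the inventory (pairs of tape ID and object name) are assumptions, not something any particular DBMS provides.

```python
from collections import Counter

MIN_TAPES = 3  # the guideline: at least three backup tapes per database object


def under_protected(objects, remote_tapes):
    """Return {object_name: tape_count} for objects below the minimum.

    `objects` is an iterable of database object names.
    `remote_tapes` is an iterable of (tape_id, object_name) pairs
    describing what is currently held at the remote site.
    """
    counts = Counter(obj for _tape_id, obj in remote_tapes)
    # Counter returns 0 for objects that have no tape at all.
    return {obj: counts[obj] for obj in objects if counts[obj] < MIN_TAPES}
```

An empty result means every listed object meets the guideline. Any entry returned identifies an object whose recovery would depend on too few tapes.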

Be sure to consult the documentation provided by the DBMS
vendor for the particulars of remote site recovery for that DBMS.

Storage Management Backups
1. Stop the DBMS to create a system-wide
point of stability for recovery.
2. Copy all of the database objects, using
storage management software to dump
complete disk volumes to tape.
3. When all of the disk volumes containing
database objects have been successfully
copied, restart the DBMS.
4. Copy the backup tapes and send them to
the remote site.
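
The four steps above can be sketched as an orchestration routine. Everything here is hypothetical: the five callables stand in for whatever your DBMS and storage management software actually provide, and no real product's commands are implied.

```python
def storage_management_backup(volumes, stop_dbms, start_dbms,
                              dump_volume, copy_tape, ship_offsite):
    """Run a volume-level backup for disaster recovery.

    All five callables are placeholders supplied by the caller:
      stop_dbms()        -> stops the DBMS (step 1)
      dump_volume(v)     -> dumps disk volume v to tape, returns a tape ID (step 2)
      start_dbms()       -> restarts the DBMS (step 3)
      copy_tape(t)       -> duplicates tape t, returns the copy's ID (step 4)
      ship_offsite(list) -> sends the copies to the remote site (step 4)
    """
    stop_dbms()  # system-wide point of stability
    try:
        tapes = [dump_volume(v) for v in volumes]
    finally:
        # Design choice in this sketch: restart even if a dump fails,
        # so a backup error does not extend the outage.
        start_dbms()
    copies = [copy_tape(t) for t in tapes]
    ship_offsite(copies)
    return copies
```

If any volume dump raises an error, the DBMS is still restarted and nothing is shipped, because a partial set of volumes would not represent a consistent point of stability.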

Recovery Using Storage Management System Backups
Recovery at the remote site is
performed one complete disk volume at
a time using the storage
management software.
The biggest problem with this
approach is the requirement to stop
the DBMS.
Most organizations cannot tolerate
such an outage, due to e-business or
global 24/7 requirements.

Other Approaches
WAN for delivery of backups to the
remote site.
Remote mirroring of data to the
alternate site over the network.
Standby Database

Guidelines
Adhere to the written plan.
The DBA must be part of a multidiscipline
team for disaster recovery.
Pay attention to the order of recovery.
Understand data latency.
Remember vital data.
Beware of compression and
encryption.
Post-recovery image copies.

Disaster Prevention
Establish procedures and policies to prevent problems
in the first place.
Although you cannot prevent an earthquake or flood,
you can implement policies to help prevent man-made
disasters.
For example, use surge protectors to prevent power surges
from destroying computing equipment and have backup
generators on standby in case of electrical outages.

Document and disseminate procedures to end users
teaching them how to deal with error messages.
Guidelines can help avoid errors and man-made disasters.

Disaster and Contingency Planning Web Sites
http://www.thebci.org
http://www.globalcontinuity.com
http://www.sungard.com

Questions
