Human wrongs
Before considering whether there are any real differences between systems offering HA or DR
capabilities today, let us look at the causes of application delivery failure. The figure below is
illuminating in many ways, showing that human-generated failures are very well represented as the
primary cause of service interruption.
Effective use of change management processes and tools, coupled with higher levels of
automation, can help reduce the instances of human error considerably.
Genuine system problems, such as network failure, physical component failure or power
outages, are much less likely to be at the heart of an interruption to service availability. The days
when hardware failure was the usual problem to be fixed are behind us as reliability, availability
and serviceability features have migrated into commodity servers.
We can look deeper into the question of whether HA and DR mean different things today than
they did even in the recent past. Until recently, the term HA was applied only to systems that
needed to function with strict limits on any interruption to service delivery, at least as perceived
by the end users or customers of the service.
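As an illustration of what a strict limit on interruption can mean in practice, such limits are often expressed as an availability percentage over a period. The short sketch below converts a percentage target into a downtime allowance. The function name and the example targets are assumptions chosen for illustration and do not come from the figures discussed in this paper.

```python
def downtime_budget_minutes(availability_pct: float, period_days: float = 365.0) -> float:
    """Return the maximum downtime, in minutes, that still meets the target.

    availability_pct is a hypothetical service target such as 99.9.
    period_days is the measurement period (assumed to be one 365-day year).
    """
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100.0)


if __name__ == "__main__":
    # Illustrative targets only; real targets depend on the service agreement.
    for target in (99.0, 99.9, 99.99):
        minutes = downtime_budget_minutes(target)
        print(f"{target}% availability allows about {minutes:.1f} minutes of downtime per year")
```

Seen this way, each additional "nine" in the target cuts the permitted downtime by a factor of ten, which is one reason the HA label was traditionally reserved for a small set of systems.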
DR, on the other hand, was the phrase applied to the process of getting a service up and running
again with users back working, following any form of system or network failure or any other form
of service interruption. In extreme circumstances, DR was also commonly applied to describe
how to respond to the complete loss of a service or an entire system, or potentially recovering
from the loss of a building, computer room or data centre.
Today, it is clear that the accepted usage and understanding of these terms have changed
significantly. This is most notable when considering the language commonly used around virtual
servers. It is quite common to see the two terms used almost interchangeably with little
differentiation. Some vendors are prone to describe the now relatively well-established ability to
spin up a new virtual machine rapidly following a service degradation or interruption as HA.
It remains to be seen if the distinction between the two terms will remain meaningful in the
coming years, as virtualised systems and associated management tools become ever more
widely deployed. In some ways, the distinction between HA and DR may even prove to be helpful
to IT managers, as the chart above highlights just how difficult it has been in the recent past to
secure funding to protect systems against failure.
In many ways, the reluctance to provide funding for HA mirrors the difficulty of getting approval to
implement better systems management tools generally. It is to be hoped that, as genuine HA
systems become more affordable, organisations recognise the value of good management tools
to help ensure higher levels of service delivery and the undoubted business benefits delivered by
IT as a result.