Reliability
Performance
Schedule of Operations
Availability
High Availability
Continuous Operations
Continuous Availability
Fault Tolerance
80% or more of unplanned downtime is the result of people and processes, NOT hardware or O/S failures:
Application failures
Software failures, errors in configurations
Scheduling errors
Operator errors
Out-of-space conditions
Batch prevented OLTP from being available on time
Data corruption
Unexpected or unplanned volumes
To address the 80%, invest money and time in:
Staffing, training
Change management
Problem management
Job scheduling, restart procedures
The Road to High Availability: some technology stuff
Minimize SPOF (Single Point Of Failure):
Environmental, facilities, network
Web load balancers, redundant dispatchers
RAID: level 5/0/1, mirroring, striping
ECC data protection
On-site spares, hot-swappable parts
HA solutions, clustering, automatic failover
Database replication, cloning
Oracle Parallel Server (OPS)

Understand the application architecture and constraints.
Understand all application dependencies and interrelationships to needed components.
Reduce batch interference.
Confront the backup problem: hot backup strategies, cloning, SANs.
Manage other planned changes.
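As a rough illustration of why removing a single point of failure helps, the sketch below uses the standard textbook rule for combining availabilities, assuming components fail independently: a chain of required components multiplies their availabilities, while redundant copies fail only if every copy fails. The tier names and the 0.99 figures are hypothetical, not taken from this presentation.

```python
from math import prod

def series_availability(components):
    """Availability of a chain where every component must be up."""
    return prod(components)

def redundant_availability(copies):
    """Availability of redundant copies where any one working copy is enough."""
    return 1 - prod(1 - a for a in copies)

# Hypothetical per-tier availabilities, for illustration only.
web, app, db = 0.99, 0.99, 0.99

# Every tier is a single point of failure.
single_chain = series_availability([web, app, db])

# Duplicate only the database tier; web and app stay single.
with_db_pair = series_availability([web, app, redundant_availability([db, db])])

print(f"single chain: {single_chain:.4f}")
print(f"with DB pair: {with_db_pair:.4f}")
```

Under these assumptions the remaining single tiers dominate the result, which is one reason to map every dependency before spending on redundancy for one component.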
Set schedule and availability expectations early. Keep some functions up 24x7, not all.
Continuous availability costs about 3.5 times as much as a standard application (GartnerGroup).
Applications are interrelated and integrated with others more than ever.
Shared infrastructure elements are more common.
Managing a maintenance window for each application can be exceedingly complex.
A common maintenance window for infrastructure activity can be beneficial: it saves negotiating time and sets expectations.
Step 1
Define the Problem
A problem well defined is a problem 80% solved.
For each application area, determine what the problem/goal is with the correct user representative(s).
Determine the schedule goal.
Separately, determine the availability goal.
Step 2
Categorize
Categorize the applications into groups. For example:
Business Support Systems
Operational Support Systems
Self Service/E-Commerce
Management Support Systems

Business Support Systems
Mon-Fri: 6:00 a.m. to 10:00 p.m. EST
Sat: 6:00 a.m. to 6:00 p.m. EST
Operational Support Systems
Round-the-clock operations, such as physical plant, security, hospitals
Near 24x7 schedule
Occasional Sunday morning maintenance
Monthly cold backups
Batch, backups non-disruptive to users
Accessible about 8700 hours/year
The most extended schedule
Self Service/E-Commerce
Near 24x7 schedule
Can tolerate 1-2 hours down per night
Accessible from 148 to 156 hours per week
Management Support Systems
Systems used by management for such activities as reporting and queries
Same schedule as Business Support Systems
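The category schedules can be compared on a common scale by converting each one to hours and to a share of calendar time. The sketch below is only arithmetic on the figures quoted above. It reads the Business Support schedule as Mon-Fri 6:00 a.m. to 10:00 p.m. and Sat 6:00 a.m. to 6:00 p.m., and assumes a 365-day year. The variable names are illustrative.

```python
HOURS_PER_WEEK = 7 * 24    # 168
HOURS_PER_YEAR = 365 * 24  # 8760, assuming a non-leap year

# Business Support Systems (Management Support Systems share this schedule):
# Mon-Fri 6:00 a.m. to 10:00 p.m., Sat 6:00 a.m. to 6:00 p.m.
business_support_weekly = 5 * (22 - 6) + 1 * (18 - 6)
business_support_share = business_support_weekly / HOURS_PER_WEEK

# Operational Support Systems: about 8700 accessible hours per year.
operational_share = 8700 / HOURS_PER_YEAR

# Self Service/E-Commerce: 148 to 156 accessible hours per week.
self_service_share = (148 / HOURS_PER_WEEK, 156 / HOURS_PER_WEEK)
self_service_down = (HOURS_PER_WEEK - 156, HOURS_PER_WEEK - 148)

print(f"Business Support: {business_support_weekly} h/week ({business_support_share:.1%})")
print(f"Operational Support: {operational_share:.1%} of the year")
print(f"Self Service: {self_service_share[0]:.1%} to {self_service_share[1]:.1%} of the week")
print(f"Self Service downtime: {self_service_down[0]} to {self_service_down[1]} h/week")
```

These are schedule figures (when the application is meant to be up), not measured availability, which is determined separately.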
Step 3
Know the Applications
Understand each application's architecture, constraints, release tolerance, flexibility to change.
In-house vs. purchased.
Know the application's dependencies on other applications and components.
Architecture diagrams, data flows are key.

Know the Baseline
What is your current SOP with respect to technology? Procedures? Testing?
What is your current availability?
What can you expect with existing budget?
If you haven't already, at least start measuring something.
Identify root causes of unplanned downtime.
What are infrastructure constraints on expanding schedule?

Know the Costs
What improvements can you make from existing budget? Training, testing, Q/A, etc.
Invest in the right areas for you to expand schedule and availability.
Know costs to expand schedule beyond baseline to meet goals.
Know costs to increase availability beyond baseline to meet goals.

The Business Case
Develop a consistent approach to weigh the business benefits vs. the cost.
Maintain focus on the business problem/goal.
The Steering Committee or business owner(s) of the applications need to determine the business need.
It's difficult to cost and plan for applications individually; categorizing may help.
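For the baseline questions above, one minimal way to "start measuring something" is to log each unplanned outage with its root cause and compare the total against scheduled hours. The sketch below assumes availability is defined as scheduled time minus unplanned downtime, divided by scheduled time. The outage log, the cause labels, and the 400 scheduled hours are hypothetical sample data.

```python
from collections import defaultdict

def availability(scheduled_hours, unplanned_down_hours):
    """Fraction of scheduled time the application was actually usable."""
    return (scheduled_hours - unplanned_down_hours) / scheduled_hours

# Hypothetical outage log for one month: (root cause, hours down within schedule).
outages = [
    ("operator error", 1.5),
    ("out of space", 2.0),
    ("batch overrun", 3.0),
    ("hardware", 0.5),
    ("operator error", 1.0),
]

scheduled_hours = 400.0  # assumed scheduled hours for the month

# Total the downtime per root cause.
down_by_cause = defaultdict(float)
for cause, hours in outages:
    down_by_cause[cause] += hours

total_down = sum(down_by_cause.values())
baseline = availability(scheduled_hours, total_down)

print(f"baseline availability: {baseline:.2%}")
# List causes from largest to smallest share of downtime.
for cause, hours in sorted(down_by_cause.items(), key=lambda kv: -kv[1]):
    print(f"{cause:15s} {hours:4.1f} h ({hours / total_down:.0%} of downtime)")
```

Ranking downtime by cause in this way shows where training, testing, or procedure changes are likely to pay off first.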
Define the resources, people, budget, etc.
Define ownership.
Develop and document a typical plan, with goals, activities, responsibilities, dates, etc.
Make it part of existing project plans.