
Storage Systems and Business Continuity Overview

Alan McSweeney
Objectives

• To provide information on SAN storage options
• To provide details on business continuity and disaster recovery options


Agenda

• Types of Storage
• Enabling Greater Resource Utilisation Through Storage System Virtualisation
• Business Continuity and Disaster Recovery
• System Center Operations Manager (SCOM)
• Managing Disk Based Backup Through Storage Virtualisation Single Instance Storage (Deduplication)
• Enabling Greater Data Management Through Storage System SnapShots
• Enabling Greater Application Resilience Through SnapShot Technologies
• Enabling Greater Data Resilience Through Storage System Mirroring
• Easing the Pain of Development Through SnapShot Cloning
• Rapid Microsoft Exchange Recovery Through Storage Systems Technologies
• Rapid Microsoft SQL Recovery Through Storage Systems Technologies
• Rapid Recovery of Oracle DB Through Storage Systems Technologies
• Server Virtualisation and Storage
• Storage Management and Business Continuity/Disaster Recovery
• Storage Management and WAN


Types of Storage

• DAS
• NAS
• SAN


Direct Attached Storage (DAS)

• Directly attached to a server
• Internal or external
• Cannot be shared with other servers


Network Attached Storage (NAS)

• Storage devices connected to an Ethernet network
• Can be shared among servers and users
• Usually used in place of dedicated file servers
• Not for database use (in the Microsoft world)


Storage Area Network (SAN)

• Hosts attach via Fibre Channel Host Bus Adaptors
• Connect to the storage system via Fibre Channel switches
• Each host sees its pre-assigned storage as dedicated free space
• Desktops access storage on their local server as normal




What Differentiates NAS and SAN?

Storage Protocols

• File Level — NAS
  − Windows-style file system shares (with no Windows server required)
  − \\ServerName\ShareName
• Block Level — SAN
  − Host sees provisioned disk as its own drives and formats it accordingly, e.g. NTFS, EXT3
  − F:\Directory Structure


File Level

• CIFS
  − Common Internet File System
  − Predominantly Windows environments
• NFS
  − Network File System
  − Non-Windows environments
    • Unix, Linux, NetWare, VMware


Block Level

• Fibre Channel
  − Uses Fibre Channel switches
  − FC-AL
  − 1Gb, 2Gb, 4Gb
• iSCSI
  − Uses Ethernet switches
  − 1Gb
  − 10Gb


Storage Options — Advantages and Disadvantages


DAS - Pros

• Inexpensive
  − Use of large capacity SCSI and SATA drives
  − No added expense for controllers
• Performance
  − Dedicated disk array with various cache options
• Skill Levels
  − No new skills required to manage storage


DAS - Cons

• Captive Storage
  − Storage can only be used by one server
• Performance
  − Disk arrays may be limited in the number of drives that can be used
  − Backups can be slow and inconsistent
• Expense
  − Can be expensive in terms of wasted disk space


NAS - Pros

• Can replace file servers and introduce enterprise resilience
  − Windows, Unix
• Easily expandable
  − From 36GB to over 0.5PB
• Cost Effective
  − A single appliance replaces multiple servers
• Ease of Backup
  − Can back up all shares from the NAS appliance


NAS - Cons

• Expense
  − Can be expensive relative to the cost of a single server
• Performance
  − Depends on the protocol
• Database Support
  − No support for MS SQL or MS Exchange
• Skill Levels
  − May require new skill sets


SAN - Pros

• High Performance
  − IO/s
  − Disk utilisation
• Resilience
  − SnapShots
  − Mirroring
  − Replication
• Scalability
  − Scales to PB


SAN - Cons

• Costs
  − Initial capital cost
  − Running costs
  − Maintenance
• Skill Sets
  − New skill sets will be required
• Compatibility
  − Most vendors require 'fork-lift' upgrades
• Business Risk
  − Lose the SAN and you lose data from many servers
  − Maximum resilience is a must

Which Storage Solution is Right for Me?


NAS or SAN?

• Depends on application requirements
• Depends on user requirements
• Depends on available skills and budget


Why Not Both NAS and SAN?

• Most organisations will benefit from both NAS and SAN
• NAS for file serving and low-end applications
• SAN for greater application performance: OLTP, Exchange, SQL, Oracle
• Can be expensive
  − Use multiprotocol storage systems


Multiprotocol Storage

[Diagram: Windows and UNIX servers sharing one storage system; iSCSI and CIFS/NFS run over a GbE switch, FCP over an FC fabric]


Multiprotocol Storage Systems

• No physical boundaries between NAS and SAN
• NAS protocols for file serving
• SAN protocols for application performance
• Bring enterprise functionality to the NAS environment
  − NAS data is no less important than SAN data
• Greater return on investment


SAN Basics

• The SAN infrastructure (also called the "fabric") comprises the hardware, cabling and software components that allow data to move into and within the SAN
  − Server network cards (Fibre Channel HBAs or Ethernet NICs) and switches
• A disk array is a centralised storage pool for servers
• Data from multiple servers is stored in dedicated areas called logical unit numbers (LUNs)
• Data can be protected against loss in the event of multiple disk failures using RAID


What is RAID?

• Redundant Array of Inexpensive Disks
• Allows for single or multiple drive failure
• Can increase read and write performance
  − Depending on the environment
• Can have an adverse effect on performance
  − Depending on the environment
• Dependent on the RAID controller


Multiple RAID Levels

• RAID 0
  − No fault tolerance
• RAID 1
  − Hardware mirror
• RAID 4
  − Single dedicated parity drive
• RAID 5
  − Distributed parity
• RAID 6 (as it should be)
  − As RAID 4 but with two parity drives and separate parity calculations; also known as RAID Diagonal Parity (RAID-DP)


RAID 6 Overview (RAID-DP)

• Description
  − Diagonal-parity RAID: two parity drives per RAID group
• Benefits
  − 2,000~4,000x the data protection of RAID 4 or 5
  − Protects against 3 modes of double disk failure:
    • Concurrent failure of any 2 disks (very rare)
    • 2 simultaneous disk uncorrectable errors (also very rare)
    • A failed disk plus an uncorrectable error (most likely)
  − Comparable operational cost to RAID 4
    • Equivalent performance for nearly all workloads
    • Equally low parity capacity overhead
  − Less system impact during RAID reconstruction


Why is RAID-DP Needed?

• 'Traditional' single-parity RAID groups no longer provide enough protection
  − Reasonably sized RAID groups (e.g. 8 drives) are exposed to data loss during reconstruction
    • Larger disk drives
    • Disk drive uncorrectable (hard) error rates
• RAID 1 is too costly for widespread use
  − Mirroring doubles the cost of storage
  − Not affordable for all data


Six Disk "RAID-6" Array

D   D   D   D   P   DP


Simple RAID 4 Parity

D   D   D   D   P   DP
3   1   2   3   9


Add "Diagonal Parity"

D   D   D   D   P   DP
3   1   2   3   9   7
1   1   2   1   5   12
2   3   1   2   8   12
1   1   3   2   7   11


Fail One Drive

[Figure: the array above with one data drive failed; its column of values is lost]


Fail Second Drive

[Figure: a second data drive fails; two columns are now lost, which is beyond what row parity alone can rebuild]


Recalculate from Diagonal Parity

[Figure: a diagonal that crosses only one of the failed drives is rebuilt from the diagonal parity (DP) column]


Recalculate from Row Parity

[Figure: with one cell of a row recovered, the row's remaining missing cell is rebuilt from the row parity (P) column; recovery then alternates between diagonal and row parity]


The rest of the block … diagonals everywhere

D   D   D   D   P   DP
3   1   2   3   9   7
1   1   2   1   5   12
2   3   1   2   8   12
1   1   3   2   7   11
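The walkthrough above can be made concrete with a small sketch. The following Python toy model is ours, not vendor code: it keeps the slides' integer sums in place of the XOR applied to real 4KB blocks, numbers diagonals as (column − row) mod 5 so the stored diagonal-parity values come out to the 7, 12, 12, 11 in the table, and rebuilds any two failed data drives by alternating between diagonal and row parity, exactly as in the "Fail Second Drive" sequence.

# Toy RAID-DP: 4 data columns, row parity (P) and diagonal parity (DP).
# Integer sums stand in for XOR over 4KB blocks; values match the slides.
NDATA, NDIAG = 4, 5                 # diagonals span the data columns plus P

data = [[3, 1, 2, 3],
        [1, 1, 2, 1],
        [2, 3, 1, 2],
        [1, 1, 3, 2]]
row_p = [sum(r) for r in data]      # P column: 9, 5, 8, 7

def diag(r, c):
    return (c - r) % NDIAG          # diagonal NDIAG-1 is never stored

def cell(grid, r, c):
    return grid[r][c] if c < NDATA else row_p[r]   # diagonals include P

diag_p = [0] * (NDIAG - 1)          # DP column: 7, 12, 12, 11
for r in range(len(data)):
    for c in range(NDATA + 1):
        if diag(r, c) < NDIAG - 1:
            diag_p[diag(r, c)] += cell(data, r, c)

def recover(lost_a, lost_b):
    """Rebuild two simultaneously failed data drives from P and DP."""
    grid = [[v if c not in (lost_a, lost_b) else None
             for c, v in enumerate(row)] for row in data]
    missing = {(r, c) for r in range(len(grid)) for c in (lost_a, lost_b)}
    while missing:
        for r, c in sorted(missing):
            other = (r, lost_b if c == lost_a else lost_a)
            if other not in missing:
                # Row parity finishes a row once only one cell is missing
                grid[r][c] = row_p[r] - sum(grid[r][k]
                                            for k in range(NDATA) if k != c)
            else:
                d = diag(r, c)
                if d == NDIAG - 1:
                    continue        # this cell's diagonal is the unstored one
                peers = [(r2, c2) for r2 in range(len(grid))
                         for c2 in range(NDATA + 1)
                         if diag(r2, c2) == d and (r2, c2) != (r, c)]
                if any(p in missing for p in peers):
                    continue        # diagonal still blocked by the other drive
                grid[r][c] = diag_p[d] - sum(cell(grid, r2, c2)
                                             for r2, c2 in peers)
            missing.remove((r, c))
            break                   # restart the scan after each recovery
        else:
            raise RuntimeError("stuck")   # cannot happen for two data drives
    return grid

assert recover(0, 1) == data        # both failed drives fully rebuilt

Running recover() for any pair of data columns reproduces the original values, which is the practical content of the "protects against concurrent failure of any 2 disks" claim earlier.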


Business Continuity and Disaster Recovery


Specific Business Continuity and Disaster Recovery Requirements

• RTO — Recovery Time Objective
  − How quickly critical services should be restored
• RPO — Recovery Point Objective
  − From what point before system loss data should be available
  − How much data loss can be accommodated

[Timeline: (1) last system backup/copy, (2) system loss, (3) systems restored; RPO spans 1 to 2, RTO spans 2 to 3]

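A quick worked example (the timestamps are hypothetical, chosen only to illustrate the two measures):

# If the last good copy was taken at 02:00, the system is lost at 14:00 and
# service is restored at 18:00, the data exposure (RPO achieved) is 12 hours
# and the downtime (RTO achieved) is 4 hours.
from datetime import datetime

last_backup = datetime(2009, 11, 26, 2, 0)    # 1: last system backup/copy
system_loss = datetime(2009, 11, 26, 14, 0)   # 2: system loss
restored    = datetime(2009, 11, 26, 18, 0)   # 3: systems restored

print("RPO achieved:", system_loss - last_backup)   # 12:00:00
print("RTO achieved:", restored - system_loss)      # 4:00:00
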
Options and Issues

• Virtualised infrastructure
  − Virtualise secondary and/or primary server infrastructure
• Data replication software
  − DoubleTake
  − WANSync
• Hardware replication


Possible Core Architecture 1


Possible Core Architecture 2

1. Core server infrastructure virtualised for resilience and fault tolerance
2. Centralised server management and backup
3. SAN for primary data storage
4. Backup to disk for speed
5. Tape backup to LTO3 autoloader for high capacity
6. Two-way data replication


Data Backup and Recovery

1. Servers backed up to low cost disk - fast backup and reduced backup window
2. Disk backup copied to tape - tape backup to LTO3 autoloader for high capacity and reduced manual intervention
3. Move tapes offsite


Resilience

• Virtual infrastructure in a VMware HA (High Availability) cluster
• Fault tolerant primary infrastructure
• Failing virtual servers automatically restarted
• Dynamic reallocation of resources


Disaster Recovery

• Failing servers can be recovered on the other site
• Virtualised infrastructure allows critical servers to run without the need for matching physical servers
• Virtualisation makes recovery easier: it removes any hardware dependencies
Data Replication Options

• Option 1 — Direct server replication
  − Each server replicates to a backup server in the other site
• Option 2 — Consolidated virtual server backup and replication of server images for recovery
  − Copies of virtual servers replicated to the other site for recovery
• Option 3 — Data replication
  − Replication of SAN data to the other site
• Option 4 — Backup data replication
  − Replication of backup data to the other site
• Each option has advantages and disadvantages


Option 1 — Direct Server Replication

• Install replication software (DoubleTake, RepliStor, WANSync) on each server for replication
• Continuous replication of changed data
• Need active servers to receive replicated data
• Active servers can be virtual to reduce resource requirements
• Replication software cost of €3,500 per server
• Failing servers can be restored
• Minimal data loss


Option 2 — Consolidated Virtual Server Backup

• Use the VCB (VMware Consolidated Backup) feature of VMware to capture images of virtual machines
• Replicate image copies
• Recovery to the last image copy
• Low bandwidth requirements


Option 3 — SAN Hardware Replication

• SAN replication at the hardware level
• Very high bandwidth requirements: over 1Gbps each way
• Not all SANs support hardware replication
• Very fast recovery
• Can be an expensive option


Option 4 — Replication of Backup Data

• Scripted replication of disk backup data
• Recovery to the last backup
• Low bandwidth requirements
• Low cost option


Business Focus on Disaster Recovery

• Every year one out of 500 data centres will experience a severe disaster
• 43% of companies experiencing disasters never re-open, and 29% close within two years
• 93% of businesses that lost their data centre for 10 days went bankrupt within one year
• 81% of CEOs indicated their company plans would not be able to cope with a catastrophic event


Components of Effective DR

[Diagram: four interlocking components — a DR recovery facility; primary infrastructure designed for resilience and recoverability; processes and procedures; an operational disaster recovery and business continuity plan]


Components of Effective DR

• DR Recovery Facility — this will be the second McNamara site
• Primary Infrastructure Designed for Recoverability — this will consist of virtualised infrastructure and backup and recovery tools
• Processes and Procedures — a set of housekeeping tasks that are followed to ensure recovery is possible
• Operational Disaster Recovery and Business Continuity Plan — a tested plan to achieve recovery at the DR site


Server Virtualisation and Disaster Recovery

• Server virtualisation assists recovery from disaster
• Changing disaster recovery requirements
  − Higher standards are required
  − More reliability is expected
  − Faster pace of business generates more critical change
  − Intense competitive environment requires high service levels


Challenges of Testing Recovery

• Hardware bottlenecks
  − Need a separate target recovery server for each of the primary servers under test
  − If doing "bare metal" restores, need to locate target recovery hardware exactly matching the primary server configurations
• Lengthy process with manual interventions
  − Configure hardware and partition drives
  − Install Windows and adjust Registry entries
  − Install the backup agent
  − Only then recover automatically with the backup server
• Personnel not trained
  − Complex processes and limited equipment availability make it difficult to train personnel


Successful Disaster Recovery

• Ensure successful recovery
  − Diligent use of a reliable backup tool
  − Regular testing of recovery procedures
• Meet the TTR/RTO (Time To Recover/Recovery Time Objective) targets
  − Target recovery hardware available
  − Alternate site available
  − Processes documented and automated
• Put a personnel plan in place
  − Primary and backup DR coordinators designated and trained
  − Dry runs conducted regularly


Why Virtual Infrastructure for DR?

• Hardware independence
  − Flexibility to restore to any hardware
• Hardware consolidation / pooling / oversubscription
  − Test recovery of all systems to one physical server
• Speed up recovery
  − Use pre-configured templates with pre-installed OS and backup agent
• Single-step simplified capture and recovery
  − Different purposes, same procedures: staging, deployment, disaster recovery
  − One-step system and application recovery
  − No additional licensing requirements for bare metal restore tools
  − More trained personnel available


Disaster Recovery at Lower Cost

• Hardware / system / application independence
  − No need to worry about the exact hardware configuration
  − Flexibility to restore to any hardware
  − Application-independent capture and recovery processes
• Less hardware required at the "hot" failover site
• Support for all capture / replication technologies
  − Tape / media
  − Disk-based backup
  − Synchronous or asynchronous data replication


Simplified Processes for Recovery

• Restore system and application data in one step
• Single-step simplified capture and recovery
  − One-step system and application recovery
  − No Windows Registry issues
  − Easy-to-automate recovery
• No need for 3rd party 'bare metal' restore tools
  − Reduce learning and ramp-up
  − Reduce software licensing expense
• Use the same methodology through the application lifecycle
  − Staging / deployment / DR
• Test once — recover anything
  − Application-independent recovery means simplified testing


Virtual Hardware for Real Recovery

• Follow the usual procedure for data backup
• For recovery
  − Find ONE physical server
  − Install VMware ESX Server
  − Copy from a template library a virtual machine with the appropriate Windows OS service packs and the backup agent pre-installed
  − Register and start the VM, edit IP addresses
  − Restore from tape into the VM using the backup server


Compare Recovery Steps

Physical to Physical (repeat for each box):
1. Find hardware
2. Configure hardware / partition drives etc.
3. Install operating system
4. Install backup agent
5. Adjust Registry entries, permissions, accounts
6. "Single-step automatic recovery" from backup server

Physical to Virtual with templates:
• Do once: find hardware, install VMware
• Repeat for each box: "single-step automatic recovery" from backup server


Customer Options for Recovery

1. Physical to Physical
2. Physical to Virtual
3. Virtual to Virtual


Disaster Recovery with SAN Replication

• Speed up recovery in solutions based on storage replication
  − No need to upgrade secondary site server hardware in lock-step with the primary site
  − Easy to automate and no need for bare metal recovery tools


SAN Replication Issues

• Hardware
  − Synchronous — data is written simultaneously to both SANs. The write operation is not complete until both individual writes are complete. This requires a communications link between the sites operating at at least 1Gbps.
  − Asynchronous — data is not written in real time to the backup unit. Data is buffered and written in blocks. This requires a communications link between the sites operating at at least 2Mbps.
• Software
  − CommVault QiNetix ContinuousDataReplicator
  − DoubleTake
  − RepliStor
  − WANSync

The sketch after this list illustrates where the write acknowledgement sits in each mode.
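A minimal Python sketch of the distinction (our illustration, not vendor software): the only structural difference is whether the caller waits for the remote copy before the write is acknowledged.

# Toy model: synchronous replication acknowledges a write only once it is on
# both SANs (the caller absorbs the WAN round trip); asynchronous replication
# acknowledges after the local write and ships buffered blocks later, trading
# a bounded amount of potential data loss for local-speed writes.
local_san, remote_san, buffer = [], [], []

def sync_write(block):
    local_san.append(block)
    remote_san.append(block)        # caller waits for the remote ack here
    return "ack"

def async_write(block):
    local_san.append(block)         # completes at local disk speed
    buffer.append(block)            # at risk until the buffer drains
    return "ack"

def drain_buffer():                 # background replication of batched blocks
    while buffer:
        remote_san.append(buffer.pop(0))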


Virtualisation Resource Allocation and Configuration Analysis

• How much resource should be left free to cater for server failure?

[Diagram: an HA cluster of Server 1 and Server 2 running VM1–VM8, showing limit threshold, reservation threshold and actual usage per VM]


Virtualisation Resource Allocation and Configuration Analysis

• Critical (or all) virtual servers will be restarted on the other physical server(s)

[Diagram: after Server 1 fails, Server 2 hosts VM1–VM8]


VMware Platforms and Options

• VMware Infrastructure 3 Starter: NAS or local storage
  − No HA, DRS or VCB
  − Restrictions: 4 processors, 8GB RAM
• VMware Infrastructure 3 Standard
  − HA, DRS and VCB available as separate options
• VMware Infrastructure 3 Enterprise
  − Includes Virtual SMP, VMFS, VMotion, HA, DRS, Consolidated Backup
• VirtualCenter


VMware Sample Costs

Product | Rough Cost | Annual Subscription & Support | Year 1 Total | Year 2
VMware Infrastructure 3 Starter for 2 processors | €781.25 | €697.27 | €1,478.52 | €697.27
VMware Infrastructure 3 Standard for 2 processors | €2,929.69 | €615.23 | €3,544.92 | €615.23
VMware Infrastructure 3 Enterprise for 2 processors | €4,492.19 | €943.36 | €5,435.55 | €943.36
VMware VirtualCenter Management Server 2 | €3,906.25 | €625.00 | €4,531.25 | €625.00
VMware Enterprise for two 2-processor servers and VirtualCenter | €12,890.63 | €2,511.72 | €15,402.34 | €2,511.72
VMware Enterprise for four 2-processor servers and VirtualCenter | €21,875.00 | €4,398.44 | €26,273.44 | €4,398.44
VMware Enterprise for four 4-processor servers and VirtualCenter | €39,843.75 | €8,171.88 | €48,015.63 | €8,171.88


Sample Configurations

• Two ESX servers, VirtualCenter, backup to disk, tape backup
• Two ESX servers, VirtualCenter, backup to disk, tape backup, virtualised DR facility with replication
• Very large scale implementation


Two ESX Servers, VirtualCenter, Backup to Disk, Tape Backup

1. Two servers running ESX Server provide resilience in the event of server failure
2. SAN to store data
3. VirtualCenter to administer and manage the virtual infrastructure
4. Backup to disk using low cost disk
5. Tape backup unit


Two ESX Servers, VirtualCenter, Backup to Disk, Tape Backup

1. Primary SAN data copied to inexpensive disk: fast backup
2. Disk backup copied to tape/autoloader


Two ESX Servers, VirtualCenter, Backup to Disk, Tape Backup, Virtualised DR Facility with Replication

1. Two servers running ESX Server provide resilience in the event of server failure
2. SAN to store data
3. VirtualCenter to administer and manage the virtual infrastructure
4. Backup to disk using low cost disk
5. Tape backup unit
6. Link for data replication
7. Backup virtual infrastructure for recovery

Two ESX Servers, VirtualCenter, Backup to Disk, Tape Backup, Virtualised DR Facility with Replication

1. Primary SAN data copied to inexpensive disk: fast backup
2. Disk backup copied to tape/autoloader
3. Disk to disk copy to the DR location
4. Move tapes to the backup location




Very Large Scale Implementation

Cost Benefit Analysis

• Tangible savings
  − Server purchases
  − Operational costs
  − Administration costs
  − Power, HVAC
  − Deferred cost
• Intangible savings
  − Faster server provisioning
  − Better utilisation
  − Reduced floorspace
  − Improved business continuity and disaster recovery


Server Operation Assumptions

Server Environmental Details
Server Watts/Hour: 600
UPS Watts/Hour: 25
Server BTU/Hour: 2,000
Server Operational Hours: 8,760
kWh Cost: €0.10
Total kWh/Server/Year: 7,227
Total Electricity Cost (Server, UPS, HVAC): €722.70
Maintenance/Server: €350.00
Operation Costs Per Server/Year: €1,072.70

Server Tasks (Per Server) | Hours Before Virtualisation | Hours After Virtualisation
New Server Deployment | 16 | 2
Build / Installs | 40 | 10
Change / Upgrade | 12 | 3
Configuration Changes | 2 | 0.1
Problem Resolution | 2 | 0.1
Rebuilding Test Servers | 2 | 0.1
Installing Software | 2 | 0.1
Rebooting System | 2 | 0.1
Testing | 10 | 0.5
Recovery | 8 | 1
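The electricity figures can be reproduced as follows. One inference is ours: the table's 7,227 kWh over 8,760 hours implies a total draw of 825W per server, so we assume roughly 200W of HVAC load on top of the stated 625W of server and UPS draw (consistent with removing 2,000 BTU/hour at a coefficient of performance near 3).

# Reproducing the slide's per-server electricity and operation costs.
SERVER_W, UPS_W, HVAC_W = 600, 25, 200   # HVAC_W is our inferred figure
HOURS_PER_YEAR = 8760
KWH_COST = 0.10                          # EUR per kWh
MAINTENANCE = 350.00                     # EUR per server per year

kwh_per_year = (SERVER_W + UPS_W + HVAC_W) * HOURS_PER_YEAR / 1000
electricity = kwh_per_year * KWH_COST
total_opex = electricity + MAINTENANCE

print(round(kwh_per_year))               # 7227 kWh/server/year
print(f"EUR {electricity:.2f}")          # EUR 722.70 electricity
print(f"EUR {total_opex:.2f}")           # EUR 1072.70 operation cost/year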


Sample Project Costs and Savings 1

• 16 servers to be virtualised
• Avoid 4 new servers a year

Virtualisation Project | Initial | Year 1 | Year 2 | Year 3 | Total
Software | €21,900.00 | €6,100.00 | €6,100.00 | €6,100.00 | €6,100.00
Hardware | €16,000.00 | | | |
Procurement | €800.00 | | | |
Project Costs | €25,000.00 | | | |
Server Operation | | €3,489.40 | €3,489.40 | €3,489.40 |
Maintenance and Support | | €12,000.00 | €12,000.00 | €12,000.00 |
Server Administration | | €573.73 | €573.73 | €573.73 |
Total | €63,700.00 | €22,163.13 | €22,163.13 | €22,163.13 | €130,189.38

Existing Servers | Initial | Year 1 | Year 2 | Year 3 | Total
New Server Purchases | | €32,000.00 | €32,000.00 | €32,000.00 |
Procurement | | €1,600.00 | €1,600.00 | €1,600.00 |
Server Operation | | €22,798.00 | €22,798.00 | €22,798.00 |
Server Administration | | €27,055.69 | €27,055.69 | €27,055.69 |
Total | | €83,453.69 | €83,453.69 | €83,453.69 | €250,361.06

Saving: €120,171.68
Return on Investment: 39 months

Sample Project Costs and Savings 2

• 32 servers to be virtualised
• Avoid 6 new servers a year

Virtualisation Project | Initial | Year 1 | Year 2 | Year 3 | Total
Software | €29,900.00 | €8,300.00 | €8,300.00 | €8,300.00 | €8,300.00
Hardware | €32,000.00 | | | |
Procurement | €1,600.00 | | | |
Project Costs | €50,000.00 | | | |
Server Operation | | €6,978.80 | €6,978.80 | €6,978.80 |
Maintenance and Support | | €20,000.00 | €20,000.00 | €20,000.00 |
Server Administration | | €1,147.45 | €1,147.45 | €1,147.45 |
Total | €113,500.00 | €36,426.25 | €36,426.25 | €36,426.25 | €222,778.75

Existing Servers | Initial | Year 1 | Year 2 | Year 3 | Total
New Server Purchases | | €48,000.00 | €48,000.00 | €48,000.00 |
Procurement | | €2,400.00 | €2,400.00 | €2,400.00 |
Server Operation | | €43,450.60 | €43,450.60 | €43,450.60 |
Server Administration | | €54,111.37 | €54,111.37 | €54,111.37 |
Total | | €147,961.97 | €147,961.97 | €147,961.97 | €443,885.92

Saving: €221,107.16
Return on Investment: 36 months

Sample Project Costs and Savings 3

• 64 servers to be virtualised
• Avoid 8 new servers a year

Virtualisation Project | Initial | Year 1 | Year 2 | Year 3 | Total
Software | €45,900.00 | €12,700.00 | €12,700.00 | €12,700.00 | €12,700.00
Hardware | €64,000.00 | | | |
Procurement | €3,200.00 | | | |
Project Costs | €75,000.00 | | | |
Server Operation | | €13,957.60 | €13,957.60 | €13,957.60 |
Maintenance and Support | | €25,000.00 | €25,000.00 | €25,000.00 |
Server Administration | | €2,294.90 | €2,294.90 | €2,294.90 |
Total | €188,100.00 | €53,952.50 | €53,952.50 | €53,952.50 | €349,957.51

Existing Servers | Initial | Year 1 | Year 2 | Year 3 | Total
New Server Purchases | | €64,000.00 | €64,000.00 | €64,000.00 |
Procurement | | €3,200.00 | €3,200.00 | €3,200.00 |
Server Operation | | €82,610.40 | €82,610.40 | €82,610.40 |
Server Administration | | €108,222.75 | €108,222.75 | €108,222.75 |
Total | | €258,033.15 | €258,033.15 | €258,033.15 | €774,099.44

Saving: €424,141.93
Return on Investment: 30 months

SAN Options and Vendors


SAN Vendors

• Dell/EMC
  − AXnnn — iSCSI
  − NSxxx — IP
  − CXnnn — Fibre Channel
  − DMX
  − Centera
• IBM
  − DS series
  − N series — multi-protocol
• HP
  − MSA
  − EVA
  − XP


System Center Operations Manager (SCOM)


SCOM Configuration


SCOM Components

• SCOM Database — a Microsoft SQL Server database that stores configuration information and the operations data produced by the monitoring process.
• SCOM Management Server — a computer responsible for monitoring and managing other computers. It consists of the Data Access Server, the SCOM Server and the SCOM Agent components, and is an essential part of a management group.
• Data Access Server (DAS) — a COM+ application that manages access to the SCOM database.
• SCOM Server — a component that manages the SCOM Agents that monitor computers in a MOM environment.
• SCOM Agent — a component that monitors and collects data from a managed computer.
• SCOM Reporting Database — a SQL Server database that collects and stores the operations data contained in the SCOM Database.
• User Interfaces — the Administrator console and Operator console installed by default when you install SCOM.
• Management Pack — an extension that provides for the monitoring of a given service/application.
SCOM Deployment Options

• Agentless Monitoring
  − SCOM monitors servers without agents. This is aimed at IT environments where agents cannot be installed on a few exception nodes. Agentless monitoring is limited to status monitoring only.
• Agent Support
  − Agents are installed on servers. SCOM lets you manage the applications running on them.
• Server Discovery Wizard
  − Allows server lists to be imported from Active Directory, from a file, or from a typed list. The list can be filtered using LDAP queries, as well as name- and domain-name-based wildcards.


Architecture


SCOM Rule: Unit of Instruction/Policy

• Event Rules
  − Collection rules
  − Filtering rules
  − Missing event rules
  − Consolidation rules
• Alert Rules
  − Duplicate alert suppression
• Performance Rules
  − Measuring
  − Threshold

A rule consists of:
• Provider — NT event log, Perfmon data, WMI, SNMP, log files, Syslog, managed code, file transfer
• Criteria — e.g. where source = DCOM and Event ID = 1006
• Response — alert, script, SNMP trap, pager, e-mail, task
• Knowledge — product knowledge, links to vendor, company knowledge, links to centralised company knowledge
SCOM Database

• The SCOM database is the single authoritative source of all configuration in a management group
  − Rules, overrides
  − Scripts
  − Computer attributes
  − Views
  − SCOM Server and Agent configurations
  − Nested computer groups
  − Extensible schema for classes, attributes and associations


UI Consoles

• Operator Console
  − To create and display view instances and update alerts
  − User-customisable views
  − Views can be organised in a folder hierarchy
  − Context-sensitive tasks
  − Multipane view
• Administrator Console
  − One MMC snap-in per management group
  − Rules node: to author, view, modify and export/import rules
  − Config node: to configure SCOM
• Web Console


SCOM Console Views

• State View — provides a real-time, consolidated look at the health of the computers within the managed environment by server role, such as Active Directory domain controllers, highlighting the systems that require attention.
• Diagram View — gives a variety of topological views where the servers and relationships shown are defined by management packs. The Diagram View allows you to see the status of the servers, access other views, and launch context-sensitive actions, helping you navigate quickly to the root of the problem.
• Alerts View — provides a list of issues requiring action and the current state and severity of each alert. It indicates whether the alerts have been acknowledged, escalated, or resolved, and whether a Service Level Agreement has been breached.
• Performance View — allows you to select and display one or more performance metrics from multiple systems over a period of time.
• Events View — provides a list of events that have occurred on managed servers, a description of each event, and the source of the problem.
• Computers and Groups View — allows you to see the groups to which a computer belongs, the processing rule groups with which it is associated, as well as the attributes of the computer.


SCOM and SQL


The SCOM Administrator Console


SCOM Management Packs

• SCOM management packs provide built-in, product-specific operations knowledge for a wide variety of server applications
• Management packs contain rules for monitoring an array of server health indicators and creating alerts when problems are detected or reasonable thresholds are exceeded
• Monitoring capability is extended by knowledge base content, prescriptive guidance, and actionable tasks that can be associated directly with the relevant alerts included in the management packs
• Administrators can then act to prevent or correct situations, such as degraded performance or service interruption, maintaining service availability with greater ease and reliability


SCOM 2005 Management Packs

• Standard Management Packs
  − Exchange 2000 and 2003 Server
  − Internet Information Services
  − SCOM 2005 and SCOM 2000 Transition
  − Security (MBSA)
  − SQL Server 2000
  − Windows Active Directory
  − Windows Server Cluster
  − Windows DNS
  − Windows Server (2000, 2003, NT4)
• Tier 2 Management Packs
  − Windows Update Services
  − Virtual Server 2005
  − Web Services
  − Application Center 2000
  − Terminal Services
  − DHCP
  − Remote File Systems
  − Print Server


Management Packs

• Management packs are imported via the SCOM Server
• Discovery finds computers in need of a given management pack
• SCOM deploys the appropriate management packs
  − No need to touch managed nodes to install management packs
• Rules implement all SCOM monitoring behaviour
  − Watch for indicators of problems
  − Verify key elements of functionality
• Management packs provide a definition of server health


Management Pack Features

• Alerts — call attention to critical events that require administrator intervention
  − Product knowledge provides guidance for administrators to resolve outstanding alerts
• Views — provide targeted drill-down details about server health
  − Performance plots, collections of specific events/alerts, groups of servers, topology, etc.
• State Monitoring — at-a-glance view of the state of servers and applications by server role
  − Detail to component level
• Tasks — enable administrators to investigate and repair issues from the SCOM console
  − Context-sensitive diagnostics and remediation
• Reports — historical data analytics
  − Assess operations performance and capacity planning


Alert Handling and Viewing

• When a new alert is identified it appears in the Alert pane with a resolution state of "New"
• Highlighting the alert shows its details in the Alert Detail pane
• Clicking the "Properties" tab in the Alert Detail pane gives the description (and other details) of the alert
• The alert can be classified as:
  − False negative
  − Hardware issue
  − Non-hardware issue


Alert Handling


SCOM VMware Management Pack Integration


SCOM and nworks Management Pack

• The nworks collector is referred to as VEM (Virtual Enterprise Monitor)
• The VEM server can be a virtual server to reduce cost


Enabling Greater Resource Utilisation Through Storage System Virtualisation


What is "Storage Virtualisation"?

• Abstracted physical storage
• Storage pools created from physical blocks of storage
• Virtual disks created from the storage pool
• Physical devices and capacity distribution are transparent to servers and applications


Why Is Storage Virtualisation so Critical?


Opposing Forces on Volume Size

Smaller gives control:
• Different classes of data
• Different management requirements
• Tools work on volumes (snapshots, etc.)

Bigger gives efficiency:
• Disks growing
• ATA growing faster
• More disks for performance
• RAID-DP


The Problem: Volumes Tied to Disks

What we've got today:
• Small volumes are impractical
• Large volumes are hard to manage

What we'd like:
• Manage volumes separately from physical disks
• Volumes for data; aggregates for disks


Virtualisation Improves Utilisation

[Diagram: 14 × 72GB disks (1TB raw capacity) carved into fixed RAID groups — Logical Drive 1 = 2 disks, Logical Drive 2 = 8 disks, Logical Drive 3 = 3 disks, plus 1 hot spare. Vol0 (140GB), Database (370GB) and Home Directories (40GB) leave 550GB of wasted space]
The Solution: Flexible Volumes (FlexVol)

• The aggregate (storage pool) contains the physical storage
• Flexible volumes are no longer tied to physical storage
• Multiple flexible volumes per aggregate
• Storage space can be easily reallocated


Storage Pools and Flexible Volumes — How Do They Work?

• Create RAID groups
• Create the storage pool
• Create and populate each flexible volume
• No pre-allocation of blocks to a specific volume
• The storage system allocates space from the pool as data is written

A toy model of this allocation behaviour follows.
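This is a minimal sketch of the soft-allocation idea under stated assumptions (class name, block counts and volume names are ours): a volume's size is a limit, not a reservation, and physical blocks leave the shared pool only when data is actually written.

# Toy aggregate with flexible volumes: nothing is reserved up front, limits
# may oversubscribe the pool, and free space is shared by every volume.
class Aggregate:
    def __init__(self, total_blocks):
        self.free = total_blocks          # shared pool of physical blocks
        self.volumes = {}

    def create_flexvol(self, name, limit_blocks):
        # Soft allocation: the size is a limit, not a reservation
        self.volumes[name] = {"limit": limit_blocks, "used": 0}

    def write(self, name, nblocks):
        vol = self.volumes[name]
        if vol["used"] + nblocks > vol["limit"]:
            raise IOError(name + ": volume limit reached")
        if nblocks > self.free:
            raise IOError("aggregate out of space")
        vol["used"] += nblocks            # blocks drawn from the shared pool
        self.free -= nblocks

    def resize(self, name, new_limit):
        self.volumes[name]["limit"] = new_limit   # instant grow/shrink

aggr = Aggregate(1_000_000)               # e.g. ~4TB of 4KB blocks
aggr.create_flexvol("database", 600_000)
aggr.create_flexvol("home", 600_000)      # limits oversubscribe the pool
aggr.write("database", 100_000)           # pool shrinks only on writes
print(aggr.free)                          # 900000
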
Flexible Volumes Improve Utilisation

[Diagram: the same 14 × 72GB disks (1TB raw capacity) form a single aggregate with two parity disks and one hot spare. Logical drives of 144GB, 576GB and 216GB are soft-allocated; Vol0, Database and Home Dirs actually use 400GB, leaving 600GB of free space]
Flexible Volume Data Management Benefits

• Distinct containers (volumes) for distinct datasets
• Flexible volumes resize to meet space requirements; a simple command adjusts size (grow/shrink)
• Soft allocation of volumes and LUNs
• Free space flows among all flexible volumes in a storage pool; space is reallocated without any overhead
• Flexible volumes can be:
  − SnapManaged independently
  − Backed up independently
  − Restored without affecting other flexible volumes


Compare Benefits

Space Allocation
• Flexible volumes: flexible and dynamic; volumes can be grown and shrunk
• Legacy SAN: preallocated and static; space is preallocated during configuration and can't be shrunk

Management
• Flexible volumes: simple
• Legacy SAN: complex

Spindle Sharing
• Flexible volumes: automatic sharing of spindles among all volumes, including newly added disks
• Legacy SAN: new spindles are only used when volumes are expanded; optimal configuration is a daunting task (sliced, striped, etc.)


Compare Benefits

Granularity
• Flexible volumes: can be grown and shrunk in small increments (1MB) without performance or management impact
• Legacy SAN: more granularity comes at the expense of performance or management

Disruption
• Flexible volumes: growing and shrinking are nondisruptive and instantaneous
• Legacy SAN: shrinking is not possible; growth involves reshuffling of data, often with downtime and data copying

Rapid Replication
• Flexible volumes: FlexClone™ is immediate, with no performance implications and large space savings for similar volumes
• Legacy SAN: business continuance volumes involve physical replication of the data, with no space savings


Flexible Volumes: Enabling Thin Provisioning

[Diagram: 10TB of LUNs soft-allocated at the application level, on 2TB of FlexVols soft-allocated at the container level, backed by 1TB of physical storage]

• Container level: flexible provisioning, better utilisation
• Application level: higher granularity, containment of application over-allocation
• Separates physical allocation from the space visible to users
• Increases control of space allocation

Managing Complexity Through Storage Virtualisation


Unified Management

• Storage management and administration is very vendor specific
• Most vendors require different skills for different storage systems
• Hardware is not cross compatible


The Unified Storage Architecture Advantage

• Platforms: HP, EMC, Dell and IBM offer incompatible silos; storage virtualisation offers a compatible family
• Software and processes: incompatible software and different processes, versus unified software and the same processes
• Experts and integration services: lots of experts and integration services, versus reduced training and service requirements
Virtual Storage Environment / EMC — Comparison

• Storage virtualisation: architectural simplicity. One platform family with multiple concurrent protocols and integrated management, DR, BC, ILM, D2D and more; virtual gateways also front HP, IBM, HDS and Sun arrays
• The EMC effect: complexity. The product line (AX150/S, CX3-10/20/40/80, DMX series, Celerra NS350/NS40/NS80, NS40G/NS80G gateways, NSX, Centera, iSCSI-only AX150i and CX300i) spans 8 dissimilar operating systems (FLARE, FLARE OE, Dart, RHEL, Enginuity, CentraStar, MS Windows) and 8 dissimilar management GUIs, with dissimilar DR and BC tooling, limited iSCSI support and ILM required
• An external server with MS Windows and CLARalert is required to support CX dial/email home support (compare to AutoSupport)


Managing Disk Based Backup Through Storage Virtualisation Single Instance Storage (Deduplication)


Backup Integration

[Diagram: backup and recovery software coordinates snapshot-and-restore on primary storage and a disk-based secondary target. Local snapshot copies (9AM, 12PM, 3PM) give short-term, instant recovery; block-level backups of changed blocks flow to secondary storage for mid- to long-term retention, with client drag-and-drop restores]


Advanced Single Instance Storage

• User1 presentation.ppt and User2 presentation.ppt: identical files of 20 × 4K blocks each, so all blocks are identical and stored once
• User3 presentation.ppt: edited copy, 10 × 4K blocks affected, only 8 new 4K blocks written
• User4 job-cv.doc: a different file
• Data written to disk: 38 blocks with ASIS; 75 blocks without ASIS
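A minimal sketch of the idea (ours, with invented file sizes): blocks are content-addressed by hash, so a block that is already in the store costs nothing to "write" again.

# Block-level single-instance storage: fixed 4KB blocks are stored once,
# keyed by content hash; files are just lists of block references.
import hashlib

BLOCK = 4096
store = {}                                  # hash -> block (the only copies)

def write_file(data):
    """Store unique blocks, return the file's list of block references."""
    refs = []
    for i in range(0, len(data), BLOCK):
        block = data[i:i + BLOCK]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)     # duplicate blocks cost nothing
        refs.append(digest)
    return refs

file_a = b"".join(bytes([i]) * BLOCK for i in range(20))  # 20 distinct blocks
user1 = write_file(file_a)                  # 20 blocks written to disk
user2 = write_file(file_a)                  # identical file: 0 new blocks
edited = file_a[:10 * BLOCK] + b"".join(bytes([200 + i]) * BLOCK
                                        for i in range(10))
user3 = write_file(edited)                  # only the changed blocks are new
print(len(store))                           # 30 stored for 60 logical blocks
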
Enabling Greater Data Management Through Storage System SnapShots


Snapshots Defined

• A snapshot is a reference to a complete point-in-time image of the volume's file system, "frozen" as read-only
• Taken automatically on a schedule, or manually
• Readily accessible via "special" subdirectories
• Multiple snapshots can exist concurrently for each file system, with no performance degradation
• Snapshots replace a large portion of the "oops!" reasons that backups are normally relied upon for:
  − Accidental data deletion
  − Accidental data corruption
• Snapshots use minimal disk space (~1% per snap)


Snapshot Internals — As They Should Be

• A client modifies data at the end of FILE.DAT; the data resides in block C on disk
• The system writes the modified data block to a new location on disk (C′)
• The active file system version of FILE.DAT is now composed of disk blocks A, B and C′
• The snapshot version of FILE.DAT is still composed of blocks A, B and C
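In code, the scheme above reduces to pointer bookkeeping. A toy sketch of it (ours; the block names follow the figure):

# Write-anywhere snapshot behaviour: creating a snapshot copies only the
# block map; a later modification goes to a NEW block, leaving the
# snapshot's view of FILE.DAT untouched.
blocks = {"A": "data-a", "B": "data-b", "C": "data-c"}      # on-disk blocks
active = {"FILE.DAT": ["A", "B", "C"]}                      # active file system

snapshot = {f: ptrs.copy() for f, ptrs in active.items()}   # near-instant

blocks["C'"] = "data-c-modified"        # changed data lands in a new block
active["FILE.DAT"][2] = "C'"

print(active["FILE.DAT"])               # ['A', 'B', "C'"]
print(snapshot["FILE.DAT"])             # ['A', 'B', 'C'] - frozen view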


Snapshot-Based Data Recovery

• The user is offered the most recent previous version (and up to 255 older versions)
• The user may drag any of these read-only files back into active service


Snapshots are State-of-the-Art Data Protection

• Snapshots should be near instantaneous
  − Creating a point-in-time snapshot copy requires copying a simple data structure, not copying the entire data volume
• Additional storage is expended incrementally
  − Only for changed blocks
  − Only as data changes, not at snapshot creation time
• Avoids the significant costs associated with the I/O bandwidth, downtime and CPU cycles dedicated to copying and managing entire volumes


Not All Snapshots Are Equal

Questions to ask regarding storage system data copy techniques:
• What is the disk storage requirement to maintain online data copies?
• Will a planned, unplanned or "dirty" system shutdown lose existing data copies?
• What is the overall performance impact with snapshots enabled?
• How many data copies can be maintained online?
• Is the reserve area fixed? Can this "save area" be re-sized on the fly?
• Are data copies automatically deleted once the save area is full?
• What is the answer to file system recovery? Is there a SnapRestore-like capability?
• Are snapshots a chargeable item? How much? What is the pricing model?
• Is this snapshot method supported across the vendor's entire product line?


Enabling Greater Application Resilience Through SnapShot Technologies


SnapRestore Recovery

[Figure: a snapshot restore flips the active file system back to the snapshot's block map; blocks 1′…N′ written after the snapshot are marked as free after the restore]
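Continuing the toy model from the snapshot sketch above: a restore is a pointer flip, not a data copy, which is why it is near-instant regardless of volume size.

# SnapRestore in the toy model: revert the active block map to the
# snapshot's; block C' is then referenced by nothing and becomes free space.
active["FILE.DAT"] = snapshot["FILE.DAT"].copy()    # back to ['A', 'B', 'C']

in_use = {b for vol in (active, snapshot)
          for ptrs in vol.values() for b in ptrs}
print(set(blocks) - in_use)                         # {"C'"} - reclaimable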


Database Recovery

[Timeline: hourly snapshots 1–9 taken from 9am to 5pm; corruption occurs at 15:22 and is repaired by a snapshot restore from the 15:00 snapshot]


Enabling Greater Data Resilience Through Storage System Mirroring


Storage Mirroring

• Synchronous
• Semi-synchronous
• Asynchronous


Storage Mirroring Defined

• Replicates a file system on one storage system to a read-only copy on another storage system (or within the same storage system)
• Based on snapshot technology: only changed blocks are copied once the initial mirror is established
• Asynchronous or synchronous operation
• Runs over IP or FC
• Data is accessible read-only at the remote site
• Replication is volume based


SnapMirror Function

• Step 1 — Baseline: a baseline copy of the source volume(s) is transferred over the LAN/WAN to the target, with immediate write acknowledgement on the source (SAN or NAS attached hosts)
• Step 2 — Updates: periodic updates transfer only the changed blocks, again with immediate write acknowledgement on the source


Storage Mirroring Internals

• Snap A is taken on the source volume and the baseline transfer copies it to the target volume
• The source file system continues to change during the transfer; when the baseline completes, the target file system is consistent, a mirror of the Snapshot A file system, and Snap A is the common snapshot on both sides
• When the Snap B incremental transfer completes, the target volume is consistent with the Snapshot B file system
• When the Snap C incremental transfer completes, the target volume is consistent with the Snap C file system; each completed transfer establishes a new common snapshot

A sketch of this incremental scheme follows.
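This is a minimal sketch of snapshot-based incremental mirroring (our illustration; block addresses and contents are invented): after the baseline, each update ships only blocks that differ from the snapshot both sides share, and that snapshot then becomes the new reference point.

# Incremental mirroring: diff against the common snapshot, ship the delta.
def changed_blocks(common, current):
    """Blocks added or modified since the common snapshot."""
    return {addr: data for addr, data in current.items()
            if common.get(addr) != data}

source = {0: "a", 1: "b", 2: "c"}     # block address -> contents
target = dict(source)                 # Step 1: baseline transfer (Snap A)
common = dict(source)                 # snapshot shared by both sides

source[1] = "B"                       # source keeps changing...
source[3] = "d"

update = changed_blocks(common, source)   # Step 2: incremental update
target.update(update)                     # only 2 blocks cross the wire
common = dict(source)                     # the update is the new common snap

assert target == source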


Storage Mirroring Applications

• Data replication for local read access at remote sites
  − Slow access to corporate data is eliminated
  − Offload tape backup CPU cycles to the mirror
• Isolate testing from the production volume
  − ERP testing, offline reporting
• Cascading mirrors
  − Replicated mirrors on a larger scale
• Disaster recovery
  − Replication to a "hot site" for mirror failover and eventual recovery
Data Replication for Warm Backup/Offload

• For corporations with a warm backup site, or a need to offload backups from production servers
• For generating queries and reports on near-production data

[Diagram: production sites mirror over the MAN/WAN to a backup site with a tape library]
Isolate Testing from Production

• The target can temporarily be made read-write for application testing, etc.
  − The source continues to run online
  − Resync forward after re-establishing the mirror relationship (resync backward works similarly in the opposite direction)

[Diagram: the production-to-backup/test mirror is broken, both sides run read-write, then a SnapMirror resync from the Snap C incremental re-establishes the relationship]
Cascading Mirrors

• Allows a target volume to be a source for other targets
• Each target operates on an independent schedule
• Replicate data to up to 30 destinations

[Diagram: source NS (read + write) feeding a chain of target NS systems (read only), linked by SnapMirror]


Cascading Replication — Example

• Replicate to multiple locations (up to 30) across the continent
  − Send data only once across the expensive WAN
  − Reduces resource utilisation on the source NS

[Diagram: Office 1 replicates across the WAN; Offices 2–5 cascade from it]


Disaster Recovery

• For any corporation that cannot afford the downtime of a full restore from tape (days)
• Data centric environments
• Reduces mean time to recovery when a disaster occurs

[Diagram: the production site mirrors over the LAN/WAN to the disaster recovery site; on failure, clients are redirected to the DR site, and the mirror is resynced backwards after source restoration]
Easing the Pain of Development Through SnapShot Cloning


Cloning SnapShots

• Write-enables snapshots
• Enables multiple, instant data set clones with no storage overhead
• Provides dramatic improvement for application test and development environments
• Renders alternative methods archaic


Cloned SnapShot Volumes: Ideal for Managing Production Data Sets

• Error containment
  − Bug fixing
• Platform upgrades
  − ERP
  − CRM
• Multiple simulations against a large data set


Volume Cloning: How It Works

1. Start with a volume
2. Create a Snapshot copy
3. Create a clone (a new volume based on the Snapshot copy)
4. Modify the original volume
5. Modify the cloned volume

Result: independent volume copies, efficiently stored. The data written to disk is the Snapshot copy plus the changed blocks of each volume.
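In the same spirit as the earlier snapshot sketches, a toy FlexClone-style model (ours; block names and values are invented): the clone starts as pure pointer sharing, and each side pays only for the blocks it changes.

# A clone is a writable volume built on a snapshot's block map: creating it
# copies pointers, not data; divergence is paid for block by block.
blocks = {f"b{i}": f"data{i}" for i in range(4)}    # shared physical blocks
vol1 = {"file": ["b0", "b1", "b2", "b3"]}           # original volume
snap = {f: p.copy() for f, p in vol1.items()}       # snapshot of vol1
clone = {f: p.copy() for f, p in snap.items()}      # instant, zero-copy clone

blocks["b2'"] = "data2-new"; vol1["file"][2] = "b2'"    # change the original
blocks["b3*"] = "data3-alt"; clone["file"][3] = "b3*"   # change the clone

# 6 physical blocks now back 12 logical blocks across vol1, snap and clone
print(len(blocks), sum(len(p) for v in (vol1, snap, clone)
                       for p in v.values()))
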
Volume Splitting

• Split a clone from its parent when most data is no longer shared
• Shared blocks are replicated in the background

Result: easily create a new permanent volume for forking project data.
The Pain of Development

[Diagram: a 1.4TB storage solution holding six full 200GB copies — Prod, Sand Box, Pre-Prod, QA, Dev and Test volumes — with only 200GB free]

• Creating full copies of the volume requires processor time and physical storage
Clones Remove the Pain

[Diagram: the same 1.4TB solution with the 200GB Prod volume plus Test, QA, Dev, Pre-Prod and Sand Box clones — 1TB free]

• Create clones of the volume: no additional space required
• Start working on the Prod volume and the cloned volumes
• Only changed blocks get written to disk


Ideally…

[Diagram: the primary production array mirrors to a secondary array; clones are created from the read-only mirrored volume]

• Create clones from the read-only mirrored volume
• Removes development workload from production storage


Rapid Microsoft Exchange Recovery Through Storage Systems Technologies


Why Use Storage Systems for Exchange Data?

Just a few off the top…

• Snapshot copies ("snapshots")
• Data and snapshot management, replication
• Flexible and easy, dynamic provisioning
• Performance
• iSCSI: cost effective and gaining on Fibre Channel
• Excellent high-end FCP, clustering and MPIO options
• Tight integration with the Windows OS (incl. MSCS) and with Exchange 5.5, 2000, 2003 and 2007 Server (SME, VSS on Windows 2003, etc.)


Required Storage Software for Exchange

• SnapShot management
  − Rapid online backups and restores: integrates with the Exchange backup API, runs ESEFILE verification, automates log replay
  − Intuitive GUI and wizards for configuration, backup and restore
• Server based connection manager
  − Dynamic disk and volume expansion
  − Supports both Ethernet and Fibre Channel environments
  − Supports MSCS and NS Series CFO for high availability
• Single mailbox recovery software
  − Restores a single message, mailbox or folder from a Snapshot™ backup to a live Exchange server or a .pst file


Effective SnapShot Management with Exchange

• Manages the entire snapshot backup process
• Backs up and restores Exchange storage groups
• Backups may be scheduled
• Each backup is a "full" Exchange backup and is verified using Microsoft-provided software, which is integrated into the storage system


SnapShot Management with Exchange Overview

• Interacts with Exchange using the Exchange backup APIs
• Interacts with VSS
  − SnapShot management is the VSS requestor
  − Exchange is the VSS writer
  − The storage system is the VSS hardware provider
• Provides point-in-time and up-to-the-minute recovery using snapshots and Exchange database transaction logs


SnapShot Mirroring

• Automatic mirroring of Exchange data to a remote site
• Volume based mirroring
• Occurs immediately following an Exchange backup and is initiated by the Exchange server
• Can replicate over LAN or WAN
• Only blocks changed since the previous mirror are replicated
• The rate of replication can be throttled to minimise impact on the network


Single Mailbox Recovery

• Allows restores of individual items from Exchange backups in minutes, compared to hours or days
• Single mailbox recovery is the feature most requested by Exchange customers


Single Mailbox Restore (Exchange)

• PowerControls software
  − Quickly access Exchange data already stored in the online snapshot backups
  − Select any data, down to a single message
  − Restore the data to one of two locations:
    • An offline mail file (.PST personal storage file) which can be opened in MS Outlook
    • A live Exchange server, copying data directly into the user's mailbox and making it instantly available


Exchange Single Mailbox Restore (SMBR)


Current Alternatives: Inadequate

• Perform daily brick level backups
  − Pros
    • Allows quicker recovery of a single mailbox
  − Cons
    • Backs up each mailbox separately; one message sent to 100 people is copied 100 times
    • Very time and disk intensive
    • Impractical to back up frequently
    • Brick level backup software is expensive
• Have a dedicated recovery server infrastructure
  − Pros
    • Reduces the time to recover a single mailbox by eliminating the need to set up a recovery server each time
    • Eliminates brick level backups
  − Cons
    • Still very time and labour intensive (many hours)
    • Requires additional hardware investment


SMBR and SnapShot Management

• At the primary data centre, snapshots back up Exchange in seconds
• SMBR restores individual mailboxes from snapshots in minutes

Time to restore: minutes.

SMBR: Features

• Reads the contents of the Exchange Information Store without an Exchange server
• Extracts mail items at any granularity from an offline copy of the Exchange Information Store (E5.5, E2K and E2K3)
  − Folder
  − Single mailbox
  − Single message
  − Single attachment
• Restores single mail items to a production Exchange server, an alternate server, or an Outlook PST file
• Advanced search and retrieval
  − Search subject or message body by keyword, user or date


SMBR: Benefits

• Dramatically reduces the time required for single mailbox and single message recovery
  − From hours or days to just minutes
  − Simplifies the task most dreaded by Exchange administrators
• Eliminates the need for expensive, cumbersome and disk-intensive daily brick level backups
• Eliminates the need for a recovery server infrastructure
• Allows easy search and discovery of email messages and attachments


Rapid Microsoft SQL Recovery Through Storage Systems Technologies


SnapShot Management with SQL Server

• Application consistent data management


SnapShot Management with SQL Server

• Provides integrated data management for SQL Server 2000 and SQL Server 2005 databases
  − Automated, fast and space-efficient backups using snapshots
  − Automated, fast and granular restore and recovery using SnapShot restore technologies
  − Integrated with storage system mirroring for database replication
• Provides tight integration with Microsoft technologies such as MSCS and Volume Mount Points


SnapShot Management with SQL Server —
Required Features
Features Benefits
Rapid hot backup and restore times • Maximizes SQL database availability and helps meet
stringent SLAs
• Helps organizations recover from accidental user induced
errors or application misbehavior
• Minimizes SQL downtime and thus reduces cost
• Increases the ability of SQL Servers to handle large numbers
of databases and/or higher workloads
Hot backups to Snapshot copies • No performance degradation during backups

Configuration, Backup, and Restore • Ease of use


wizards with standard Windows GUIs • Virtually no training costs
• Cost savings
MSCS Support • High availability and enhanced reliability of SQL Server
environment
Clustered Failover • Further enhances availability of SQL Server
Storage Mirroring Integration • Increases SQL Server's availability — can replicate the
database to a secondary storage system for faster recovery in
case of a disaster

November 26, 2009 202


SnapShot Management with SQL Server —
Required Features
Features Benefits
Online disk addition (storage expansion) • Increases SQL Server's availability — additional
storage can be added without bringing the SQL Server down

Volume Mount Point Support • Support for Volume Mount Points in order to eliminate the
limitation with drive letters
Native x64 support • Supports 64-bit natively on AMD64/EM64T

November 26, 2009 203


SnapShot Management for SQL Server (SMSQL)

DBA:
• Ability to back up DB faster with fewer resources and without any
storage knowledge
• Reduces Mean Time to Recovery on failure
− Quick Restores
− More frequent backups → fewer logs to replay → faster recovery (see the sketch below)

Storage Admin:
• Ability to back up and restore DB without any DB knowledge
• Space, time & infrastructure efficient backups, restores and clones
• Increased productivity and storage utilization
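
A minimal sketch of the backup-frequency effect noted in the DBA list above; all rates are illustrative assumptions, not measurements:

    # Illustrative arithmetic: recovery time = snapshot restore + log replay,
    # so more frequent backups mean fewer logs to replay.
    log_generation_mb_per_hour = 500     # assumed log churn
    log_replay_mb_per_minute = 100       # assumed replay rate
    snapshot_restore_minutes = 2         # near-instant snapshot restore

    def recovery_minutes(hours_since_last_backup: float) -> float:
        logs_mb = log_generation_mb_per_hour * hours_since_last_backup
        return snapshot_restore_minutes + logs_mb / log_replay_mb_per_minute

    print(recovery_minutes(24))   # daily backups  -> 122.0 minutes
    print(recovery_minutes(1))    # hourly backups -> 7.0 minutes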

November 26, 2009 204


Technical Details — Consolidated SQL Server
Storage
Primary Data Center (iSCSI or FCP):
1. Consolidate SQL Server storage on storage system
2. Add disks and expand volumes on the fly without SQL Server downtime
3. Cluster for higher availability

Benefits:
• Simplified, centralized management
• Shared storage for improved utilization
• Better system availability

November 26, 2009 205


Technical Details — Simplified Backup » More
Frequent Backups
Primary Data Center (SQL Server, iSCSI or FCP):
1. SnapManager automates data management for SQL Server
2. Snapshots for near-instantaneous backups
3. Backup multiple databases simultaneously

Benefits:
• Eliminate backup windows
• Automation reduces manual errors
• More frequent backups reduce data loss
• No performance degradation

Time to backup: seconds
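
As a rough sketch of what this automation amounts to (Python; StorageArray and create_snapshot are hypothetical stand-ins for a vendor snapshot API, not a real SDK):

    import datetime

    class StorageArray:
        """Hypothetical client for a storage system exposing volume snapshots."""
        def create_snapshot(self, volume: str, name: str) -> None:
            print(f"created snapshot {name} on {volume}")   # placeholder action

    def backup_sql_volumes(array: StorageArray, volumes: list[str]) -> None:
        stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
        for volume in volumes:
            # A snapshot completes in seconds regardless of database size,
            # so several databases can share one backup window.
            array.create_snapshot(volume, f"sqlbackup-{stamp}")

    backup_sql_volumes(StorageArray(), ["sql_data", "sql_logs"])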

November 26, 2009 206


Technical Details — Rapid Restores » Less
Downtime
Primary Data Center (SQL Server, standby server, iSCSI or FCP):
1. Near-instant restore from online snapshot
2. Automated log replay for current image
3. Restore single or multiple databases
4. Rapid failover to standby server

Benefits:
• Fast and accurate restoration of SQL Server
• Reduce downtime from outages
• Automation saves administrative time
Time to restore: minutes
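
The restore side can be sketched the same way (Python; revert_to_snapshot and replay_logs are hypothetical placeholders for the snapshot restore and log replay steps, not a real API):

    def revert_to_snapshot(volume: str, snapshot: str) -> None:
        # Near-instant and independent of database size.
        print(f"reverted {volume} to {snapshot}")

    def replay_logs(database: str, since_snapshot: str) -> None:
        # Rolls the database forward to the current image.
        print(f"replayed transaction logs for {database} after {since_snapshot}")

    def restore_database(database: str, volume: str, snapshot: str) -> None:
        revert_to_snapshot(volume, snapshot)   # step 1: restore from snapshot
        replay_logs(database, snapshot)        # step 2: automated log replay

    restore_database("sales_db", "sql_data", "sqlbackup-20091126-0200")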

November 26, 2009 207


Technical Details — Simple & Robust Disaster
Recovery
Primary Data Center and DR Site (iSCSI or FCP, connected over an IP network):
1. Storage Mirroring replicates SQL Server data to the remote location,
over existing IP networks
2. Failover to the DR site after failure
3. Rebuild the primary site from the DR site

Benefits:
• Ensures business continuance
• Minimizes length of outages
• Cost effective — efficient use of existing IP network

November 26, 2009 208


Technical Details — Volume Mount Point (VMP)
Support
• Drive letter limitations in SMSQL
− Only 26 available drive letters in a system.
− A minimum of 2 LUNs is required for database migration
• A limitation for customers who have hundreds of databases
• The customer might not want multiple databases on one or two LUNs
• A single database might also span multiple LUNs
− LUN restore is performed on the whole disk
• To support individual database restore, each database requires its own LUN
and drive letter (see the sketch below)
− Verification will fail on the local server if free drive letters are exhausted
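
A small Python sketch of the arithmetic and of how mount points sidestep it (database names and paths are illustrative only):

    # One LUN and drive letter per database exhausts the alphabet quickly.
    databases = [f"db{i:03d}" for i in range(1, 201)]    # 200 databases
    print(len(databases) > 26)                           # True: letters exhausted

    # With volume mount points, every LUN hangs off a single drive letter.
    mount_points = [rf"G:\mnt\{db}" for db in databases]
    print(mount_points[0], mount_points[-1])             # G:\mnt\db001 G:\mnt\db200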

November 26, 2009 209


Technical Details — VMP Storing Database Files

• All SQL SnapShot related files can reside on a mounted
volume, just as on a standard volume:
− SQL user databases
− SQL system databases
− SQL Server transaction log file
− SnapInfo directory
• The Configuration wizard can be used to migrate database
files to a mounted volume, just as to a standard volume
− The rules for migrating databases to a standard volume
also apply to volume mount points

November 26, 2009 210


Technical Details — VMP Rules For Mount Point
Root
• Database files cannot reside on a LUN which is the root of
a mount point:
− After LUN restore, all the mount points residing in the LUN will
be overwritten.
− For example, db1 resides on G:\mnt1
• Take backup of the database db1 with SMSQL
• Now create a mount point G:\mnt1\mnt2
• Create a second database db2 in G:\mnt1\mnt2
• On restoring the backup set for db1, taken before, G:\mnt1\mnt2 will be
overwritten and db2 will become inaccessible (see the check below)
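
The example above can be checked mechanically; a hedged Python sketch (paths are illustrative):

    from pathlib import PureWindowsPath

    lun_root = PureWindowsPath(r"G:\mnt1")                 # LUN hosting db1
    nested_mounts = [PureWindowsPath(r"G:\mnt1\mnt2")]     # mount point inside it

    def restore_is_safe(lun: PureWindowsPath, mounts: list) -> bool:
        # Any mount point under the LUN root is overwritten by a LUN-level
        # restore, taking the databases on it (db2 here) offline.
        return not any(lun in mount.parents for mount in mounts)

    print(restore_is_safe(lun_root, nested_mounts))        # False: db2 would be lost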

November 26, 2009 211


Technical Details — VMP Rules

• Mounted volumes should not be treated differently from standard volumes
• The configuration rule for multiple databases on one or two LUNs also
applies to volume mount points
• Backup, restore and other SQL SnapShot operations behave the same on
mounted and standard volumes; the only difference is the longer path
for a mounted volume

November 26, 2009 212


Technical Details — Backup of Read-Only
Databases
• Storage System SQL SnapShots now allow backup of
read-only databases
• In the previous release, read-only databases were not
displayed in the list of databases in the Configuration
Wizard
• Now all read-only databases are listed in the Configuration
Wizard, just like normal databases

November 26, 2009 213


Technical Details — Resource Database
Management
• Each instance of SQL Server has one and only one
associated mssqlsystemresource.mdf file
− Instances do not share this file
• The location of the Resource database depends on the location of the
master database
− If you move the master database, you should also move the
Resource database to the same location

November 26, 2009 214


Technical Details — Resource Database
Management
• SMSQL migrates Resource database along with master
database
− Resource database will not be listed in the Configuration Wizard
− Internally SMSQL migrates it while it migrates master database
− It will be migrated to the same location as master database
• This is supported only for SQL Server 2005
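
A one-function sketch of the placement rule (Python; paths are illustrative, not SMSQL's actual interface):

    from pathlib import PureWindowsPath

    def resource_db_target(new_master_mdf: str) -> PureWindowsPath:
        # The Resource database must sit alongside master, so the target is
        # simply mssqlsystemresource.mdf in master's new directory.
        return PureWindowsPath(new_master_mdf).parent / "mssqlsystemresource.mdf"

    print(resource_db_target(r"G:\mnt\sqlsys\master.mdf"))
    # -> G:\mnt\sqlsys\mssqlsystemresource.mdf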

November 26, 2009 215


SnapShot Management with SQL Server — Summary

• SnapShot Management with SQL Server:


− Helps consolidate SQL Server on highly scalable and reliable
storage
− Efficient, predictable, reliable backup, restore and recovery for
SQL Server databases
− Allows dynamic provisioning of storage for databases
− Allows DBAs to efficiently perform database backup, restore,
recovery, clone operations with minimum storage knowledge
− Facilitates Disaster Recovery and Archiving

November 26, 2009 216


Rapid Recovery of Oracle DB Through Storage
Systems Technologies

November 26, 2009 217


Oracle Enterprise Manager Grid Control

Manage Storage System from Oracle Enterprise Manager 10g Grid Control:
• Monitor trends and threshold alerts
• Monitor key statistics
• Monitor utilization

• Ships with Oracle Enterprise Manager
• Developed, maintained and licensed separately by Oracle
November 26, 2009 218
Oracle ASM

Before ASM: Tables → Tablespace → Files → File names → File system →
Logical volumes → Disks
With ASM: Tables → Tablespace → Automatic Storage Management → Disk group

(Both stacks sit on networked storage: SAN, NAS, DAS)

November 26, 2009 219


Compatible Storage Adds Value to Oracle ASM
(Columns: Oracle ASM | Compatible Storage | Oracle ASM + Compatible Storage)

Data Resilience
• Protect against Single Disk Failure: Yes | Yes | Yes
• Protect against Double Disk Failure: No | Yes | Yes
• Passive Block corruption detection: Yes | Yes | Yes
• Active Block corruption detection: Yes | Yes | Yes
• Lost disk write detection: No | Yes | Yes

Performance
• Stripe data across ASM Disks: Yes | No | Yes
• Balance I/O across ASM Disks: Yes | No | Yes
• Stripe data across Physical Disks: No | Yes | Yes
• Balance I/O across Physical Disks: No | Yes | Yes
• I/O prioritization: No | Yes | Yes

Storage Utilization
• Free space management across physical disks: No | Yes | Yes
• Thin provisioning of ASM Disks: No | Yes | Yes
• Space efficient Cloning: No | Yes | Yes

Data Protection
• Storage Snapshot based Backups: No | Yes | Yes
• Storage Snapshot based Restores: No | Yes | Yes

November 26, 2009 220


Integrated Data Management Approach
Go from centralized management… …to integrated application-, server- and
storage-based management, with integration and automation, data sets and
policies:

X High cost of management → + Administrator productivity
X Long process lead times → + Storage flexibility
X Rigid structures → + Efficiency
X Low productivity → + Response time
November 26, 2009 221
SnapShot Management with Oracle Overview

• Provides easy-to-use GUI
• Integrates with the host application (Oracle 9i, Oracle 10g)
• Automates complex manual effort
− Backup/restores
− Cloning
• Tight integration with
− RMAN
− Automatic Storage Management (ASM)
− SnapDrive

(FCP, iSCSI and NFS* connectivity to the storage systems)

November 26, 2009 222


SnapShot Management with Oracle

• Database cloning
− Ability to clone consistent copies of online databases
− GUI support for cloning
− Added support for context sensitive cloning
• Increased footprint of platforms and protocols
− Support for additional flavors of Unix
• SuSE 9, RHEL3/4 U3+, Solaris 9/10
− 32-bit and 64-bit
− NFS, iSCSI and FCP for various Unix platforms
− HP-UX and AIX (NFS)
− (Refer to compatibility matrix for specific details)
• Product hardening
− Increased product stability and usability
− Improved performance by utilizing snapshot vs. safecopy
− Increased performance when dealing with a high number of archive logs

November 26, 2009 223


SnapShot Management with Oracle

• Database cloning to remote hosts


− Ability to clone consistent copies to remote hosts
− Previously clones were assigned to the host (with SMO) that
initiated the cloning request
• Increased footprint of platforms and protocols
− HP-UX and AIX support across NFS, iSCSI and FC

November 26, 2009 224


Database Backup and Recovery

Challenges
• DBA's time spent on non-value-add backup/restore tasks
• Cold backups lead to lower
SLAs
• Separate backups on each
platform
• Time-to-recover from tape
becomes prohibitive

November 26, 2009 225


Backup and Recovery with
Snapshot and SnapShot Restore
• Significant time savings
• Stay online
• Reduce system and storage overhead
• Consolidated backups
• Backup more often

300GB database example (chart: time in hours, 0–8):
− Time to backup: to tape (60GB/hr best case) vs. Snapshot™
− Time to recover: from tape plus redo logs vs. SnapRestore® plus redo logs
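
The tape figure in the chart follows directly from the numbers stated (a quick Python check):

    # 300GB database, tape at 60GB/hr best case.
    db_gb, tape_gb_per_hour = 300, 60
    print(db_gb / tape_gb_per_hour)   # 5.0 hours to tape, vs. seconds for a snapshot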
November 26, 2009 226
SnapShot Management with Oracle
Automates Backup and Recovery
Primary Data Center (DB server and storage system):
• Backups in seconds
• Snapshot copies verified
• Near instantaneous restores
• Dramatically shortened recovery with automated log replays
• Automated recovery tasks

Benefits:
• Extremely fast and efficient
• No performance degradation
• Accurate data restore and recovery
• Reduce downtime from outages
• Automation reduces errors and saves time

Time to backup: seconds. Time to restore: minutes

November 26, 2009 227


Database Cloning and the
Application Development Process

• Full or partial database copies required for:
− App and DB development
− Maintenance (OS, DB upgrade)
− Test and QA
− Training and demos
− Reporting and DW ETL
• Ability to do this quickly, correctly, and efficiently directly
impacts application development and deployment

(Diagram: PROD and SECONDARY (DR) feeding DEV, MAINT, TEST/QA and RPT/ETL copies)

November 26, 2009 228


Traditional Approaches to Cloning

• Copy
− Offline
− Online (using a mirror or standby database, snapshots, and log-based
consistent recovery)
• Redirected restore
− From disk- or tape-based backups
• Challenges
− Limited storage resources
− Long lead-time requirements

(Diagram: production and its mirrored copy feeding Dev 1…Dev N and Test 1…Test N)

November 26, 2009 229


Database Maintenance with Flexible Volume
Clones

Benefits
• Instantaneous copies
• Low resource overhead
• Easily make copies of a production database without impacting the database
− Use clones to test migrations, apply bug fixes, upgrades, and patches

(Diagram: production DB and its mirrored copy feeding Dev 1…Dev N and
Test 1…Test N clones)

November 26, 2009 230


New Database Development Methodology
• Mirror PROD for initial copy (DR)
− Mirror from and to storage system
• Clone database replicas as needed
• Create Snapshot copies of replicas for instant SnapShot Restore of
working databases

Develop ● Test ● Deploy
(Diagram: PROD mirrored to Test/Dev/DR clones)


November 26, 2009 231
Traditional Approach: Application Development
and Testing
Production database 100GB
Mirror copy 100GB
Development copies 300GB
Testing copies 300GB
Total: 800GB

• 8x actual storage requirement
• Time consuming
• Resource overhead

(Diagram: production and mirror feeding three full dev copies and three
full test copies)

November 26, 2009 232


SAN Approach: Application Development and
Testing
Production database 100GB
Mirror copy 100GB
Development copies 30GB
Testing copies 30GB
Total: 260GB

• Over 67% reduction in storage required
• Near instantaneous copies
• Negligible overhead
• Ability to have many more test and dev copies

(Diagram: production and mirror feeding three cloned dev copies and three
cloned test copies)

more clones = higher productivity

Assumption: up to 10% change in data in the test and dev environments
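
The totals follow from that 10% assumption (a quick Python check):

    prod_gb, mirror_gb, copies = 100, 100, 3
    traditional_gb = prod_gb + mirror_gb + 2 * copies * prod_gb      # full copies
    clone_gb = prod_gb + mirror_gb + 2 * copies * 0.10 * prod_gb     # clones
    print(traditional_gb, clone_gb)                                  # 800 260.0
    print(f"{1 - clone_gb / traditional_gb:.1%} reduction")          # 67.5% reduction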


November 26, 2009 233
Oracle Applications Lifecycle
(Lifecycle diagram: Plan → Install → Implement → Deploy → Tune & Maintain →
Patch → Upgrade, with pain points and solutions at each stage)

Pain points → Solutions:
• Need a reliable backup, restore and DR solution → Automate backups,
restore and DR with SMO, SnapMirror and ReplicatorX
• Plan: configure systems, forecast storage accurately, provision and
maximize utilisation → Re-organize with FlexVol
• Upgrade testing requires duplicate data, a lengthy and expensive
process → Create clones with FlexClone (fast and space-efficient);
automate with SMO
• Need a reliable backup and recovery solution → Backup and recovery with
Snapshots, SnapShot Restore, Storage Mirroring and ReplicatorX
• Mirroring production data to test and dev systems is a lengthy process →
Mirror production data to test and dev with Snapshots and SnapShot Restore

November 26, 2009 234
Server Virtualisation and Storage

November 26, 2009 235


Server Virtualisation Components

• Shared storage required for operation


• HA (High Availability)
• VMotion — move virtual servers seamlessly between
physical servers

November 26, 2009 236


More Information

Alan McSweeney
alan@alanmcsweeney.com

November 26, 2009 237
