
IBM Spectrum Scale

Backup & Archive


Solution & Architecture

Ash Mate
WW Senior Solutions Architect
mate@us.ibm.com

Spectrum Scale / ESS Solution & Architecture

© 2016 IBM Corporation


Agenda

• What is IBM Spectrum Scale?


• Spectrum Scale Deployment Models
• Backup & Archive of Spectrum Scale
– Solution Architecture
– Benefits
– Case Studies
• Backup & Archive to Spectrum Scale
– Solution Architecture
– Benefits
– Case Studies



What is IBM Spectrum Scale?



What is IBM Spectrum Scale? (contd)

[Diagram: users and applications – client workstations and compute farms – access a single name space through POSIX, NFS, a Map Reduce connector, and OpenStack interfaces (Cinder, Swift, Glance). Powered by Spectrum Scale, the name space spans Site A, Site B, Site C and off-premise locations, across flash, disk, tape and shared-nothing cluster storage.]

Spectrum Scale is a scalable parallel filesystem that helps customers manage and optimize large amounts of data across multiple storage environments and data tiers.
Spectrum Scale Deployment Models



Simple Cluster Model Overview
Single Cluster/Single Filesystem*

 All Network Shared Disk (NSD) servers export NSDs to all the clients in active-active mode
 The client does real-time parallel I/O to all the NSD servers and storage volumes/NSDs
 GPFS stripes files across NSD servers and NSDs in units of the file-system block-size
 The NSD client communicates with all the servers; all clients can access all data in parallel
 File-system load is spread evenly across all the servers and NSDs. No hot spots
 No single-server bottleneck
 Can share access to data with NFS, SMB, S3 and Swift
 Access via file & object protocols for clients without the GPFS client
 Easy to scale while keeping the architecture balanced; optimizes heterogeneous storage resources

[Diagram: application nodes and cNFS export nodes (NFS, CIFS, Swift, S3) speak the GPFS protocol over a TCP/IP or InfiniBand network to NSD servers – optionally inside an ESS – backed by heterogeneous storage. Scalable in capacity & performance.]

*or can be multiple filesystems if desired
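The block-size striping described above can be pictured with a toy round-robin model. The file size, block size and NSD count below are invented for illustration only; real block placement is managed by GPFS itself.

```shell
# Toy model: a 10 MiB file written in 1 MiB file-system blocks is
# striped round-robin across 4 NSDs, so every server carries an
# equal share of the I/O and no hot spot forms.
file_mb=10; block_mb=1; nsds=4
blocks=$((file_mb / block_mb))
for b in $(seq 0 $((blocks - 1))); do
  echo "block $b -> NSD $((b % nsds))"
done > stripe.map
cat stripe.map
```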


Backup & Archive of Spectrum Scale
• Spectrum Scale as the source
• Use Case: Backup or Archive data stored on Spectrum Scale to other backup and
archival products.
– Built-in mirroring and snapshot capabilities, and the mmbackup utility
– Spectrum Protect (TSM)
– Spectrum Archive (LTFS EE)
– Enterprise Content Manager (ECM)
– Third Party Backup Software



High-level Architecture

[Diagram: Spectrum Scale NSD clients and NSD servers, with Spectrum Protect for Space Management / Backup Archive Client on cluster nodes, a Spectrum Protect server behind them, and Spectrum Archive Enterprise Edition inside the cluster.]

• Customer application can run on NSD client or NSD server nodes
– Customer application on NSD client – supported platforms: AIX™, xLinux, pLinux, zLinux, Windows®
– Customer application on NSD server – supported platforms: Spectrum Protect: AIX™, xLinux, zLinux (4Q15); Spectrum Archive: xLinux
• Spectrum Scale NSD server – supported platforms: AIX™, xLinux, pLinux, zLinux, Windows®
• Spectrum Protect for Space Management / Spectrum Protect Backup Archive Client
– Supported platforms: AIX™, xLinux, zLinux (4Q15) for Space Management; AIX™, xLinux, pLinux, zLinux, HP, Sol, Windows® for the Backup Archive Client
– Function: Backup, Restore; Migration, Recall; SOBAR
• Spectrum Protect server – supported storage medium: disk, optical, tape library, object storage
• Spectrum Archive Enterprise Edition – supported platform: xLinux; supported storage medium: LTFS-compatible tape library; function: Migration, Recall


Concepts
• Snapshot
• mmbackup
• Active Archiving/DMAPI
– TSM/HSM
– LTFS
• Scale Out Backup And Restore (SOBAR)



Snapshot
• Capture file system content at a point in time
• Snapshots are read-only
• Uses copy-on-write – does not consume space unless data changes
• Intermediate online backup capability
– Allows easy recovery from common problems such as accidental deletion of a file, and comparison with older versions of a file
• Backup or mirror programs can use a snapshot to obtain a consistent copy of the file system
• The policy engine can be restricted to run on a snapshot instead of the live/active file system
• Multiple snapshots of a fileset or file system are allowed (up to 256 snapshots)
– File system level snapshots are called global snapshots
• NOTE: a snapshot is not a backup solution since it does not make a copy of all data
Snapshot (contd)
• Creating a snapshot
– Can be expensive on a busy system
– Command: mmcrsnapshot Device SnapshotName [-j Fileset]

# mmcrsnapshot d13 g_snap1
Writing dirty data to disk.
Quiescing all file system operations.
Writing dirty data to disk again.
Snapshot g_snap1 created with id 5.

• Restore operation:
1) Use copy to restore files.
2) Use mmrestorefs to restore a global or independent fileset-level snapshot.
– NOTE: mmrestorefs of a global snapshot requires the file system to be unmounted.
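For single-file recovery, the copy-based restore above can be done straight from the snapshot directory. This is a sketch only: it assumes the default GPFS snapshot directory name (.snapshots), a mounted file system, and example mount point, snapshot and path names.

```shell
# Sketch (requires a mounted GPFS file system; names are examples):
# snapshots are exposed read-only under the .snapshots directory,
# so an accidentally deleted file can be restored with a plain copy.
cp -p /gpfs/d13/.snapshots/g_snap1/projects/report.txt \
      /gpfs/d13/projects/report.txt
```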


Backup Of Large Spectrum Scale File Systems

Spectrum Protect backup (mmbackup): the backup archive client is typically installed on several cluster nodes; the mmbackup tool coordinates processing between the Spectrum Scale cluster and the Spectrum Protect server; restore is done via GUI or CLI.

Function:
• Massively parallel filesystem backup processing
• Spectrum Scale mmbackup creates a local shadow of the Spectrum Protect DB and uses the policy engine to identify files for backup
• The Spectrum Protect backup archive client is used under the hood to back up files to the Spectrum Protect server
• Spectrum Protect restore (CLI or GUI) can be used to restore files

Challenges:
• Usage of ACLs & EAs might lead to increased backup traffic
• If HSM is used, inline backup might lead to unexpected tape mounts
• Administrative operations on the Spectrum Protect server might not be observed by mmbackup (e.g. file space deletion)
• Limited handling of include rules using management class binding

Recommendations:
• If HSM is used, use option MIGREQUIRESBACKUP=YES
• Prevent renames of directories close to the file system root
• Prevent ACL & EA changes if possible
Backup Of Large Spectrum Scale File Systems

Backup cycle: initiate mmbackup → evaluate environment → optional: query Spectrum Protect server → perform file system scan → calculate backup activities → expire deleted files → backup new and changed files → analyse result and finish backup run.

• After start, mmbackup evaluates the cluster environment and verifies product versions and settings
• Optionally, the Spectrum Protect server is queried for existing backup information; otherwise the existing shadow DB is used for processing
• The policy engine is used to generate a list of files currently eligible for backup activities
• The existing shadow DB and the scan result are compared to calculate file lists for the required backup activities
• Expire all files deleted from the file system since the last backup run
• Incrementally back up all files with changed metadata since the last backup run
• Selectively back up all files with changed data since the last backup run
• While backup activities are ongoing, update the shadow DB inline
• Analyse the backup results from all used cluster nodes and finish the backup cycle by selectively backing up the current shadow DB
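The "calculate backup activities" step above boils down to a set difference between the shadow DB and the fresh policy scan. The sketch below imitates that idea with plain sort/comm on invented path-plus-mtime lists; the real shadow DB format is internal to mmbackup, so the file names and fields here are illustrative only.

```shell
# Previous shadow list and current policy-scan result (path + mtime).
printf '%s\n' '/fs/a 100' '/fs/b 200' '/fs/c 300' | sort > shadow.list
printf '%s\n' '/fs/a 100' '/fs/b 250' '/fs/d 400' | sort > scan.list

cut -d' ' -f1 shadow.list > shadow.paths
cut -d' ' -f1 scan.list  > scan.paths

# Paths only in the old shadow list -> deleted -> expire candidates.
comm -23 shadow.paths scan.paths > expire.list
# Entries new or changed in the current scan -> backup candidates.
comm -13 shadow.list scan.list | cut -d' ' -f1 | sort -u > backup.list

cat expire.list   # /fs/c (deleted since last run)
cat backup.list   # /fs/b (changed) and /fs/d (new)
```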
Backing up Spectrum Scale File System
• IBM Spectrum Scale (GPFS) mmbackup is a utility that combines the powerful GPFS policy engine and Spectrum Protect (TSM) backup clients to back up a GPFS file system to a TSM server.
– Spectrum Protect backup clients are used as data movers
– The Spectrum Protect server is the back-end storage
– Policy engine for candidate selection and workload distribution
– Fully integrated with Spectrum Protect functionality; restore is done by the Spectrum Protect backup client CLI
– Maintains its own database – no need to query the Spectrum Protect server (which is an expensive operation)
– Multi-threaded, and multiple nodes can participate in a backup job

• Progressive incremental or "incremental forever"

• Expiration: Free up space on Spectrum Protect server for deleted files



Command
• Backup operation:
- Backup from an active file system or a snapshot (in the future, from a fileset)

mmbackup Device [-t {full|incremental}]


[-N {Node[,Node...] | NodeFile | NodeClass}]
[-g GlobalWorkDirectory] [-s LocalWorkDirectory]
[-S SnapshotName] [-f] [-q] [-v] [-d]
[-a IscanThreads] [-n DirThreadLevel]
[-m ExecThreads | [[--expire-threads ExpireThreads] [--backup-threads BackupThreads]]]
[-B MaxFiles | [[--max-backup-count MaxBackupCount] [--max-expire-count MaxExpireCount]]]
[--max-backup-size MaxBackupSize] [--quote | --noquote]
[--rebuild] [--tsm-servers TSMServer[,TSMServer...]]
[--tsm-errorlog TSMErrorLogFile] [-L n] [-P PolicyFile]

• Restore operation:
- There is no mmbackup restore command, since restores are done by the Spectrum Protect client command (dsmc restore)
- Can restore a file/directory or the whole file system
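Putting the synopsis above together, a typical snapshot-based incremental backup and a later restore might look like the sketch below. It can only run on a GPFS cluster with a configured Spectrum Protect client; the device, node, server and path names are examples.

```shell
# Back up from a consistent point in time: create a global snapshot,
# run mmbackup against it on two nodes, then remove the snapshot.
mmcrsnapshot fs1 backup_snap
mmbackup fs1 -t incremental -S backup_snap -N node1,node2 \
    --tsm-servers TSMSERVER1
mmdelsnapshot fs1 backup_snap

# Restore is done with the Spectrum Protect backup-archive client,
# not with mmbackup:
dsmc restore -subdir=yes "/gpfs/fs1/projects/" "/gpfs/fs1/projects/"
```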


Backup and Restore overview

[Diagram: users and applications write data into a Spectrum Scale cluster; mmbackup coordinates an incremental job across TSM backup clients on the GPFS nodes, which send data over the LAN to the Spectrum Protect server (to disks first, then to a tape library after the backup completes); restore commands are issued from the TSM B/A clients.]

• Backup data goes from GPFS to the TSM server; on restore, data goes from the TSM server to GPFS
• Data goes to disks in the TSM server to reduce the backup window time
• Data is then moved to tapes after backup has completed
• Restore is done directly from tapes
• The admin executes commands on the TSM backup clients to restore data
• Backup is done in off-peak hours to minimize the impact


Tape Tier and Active Archiving
• IBM Spectrum Scale supports DMAPI (Data Management API), which can be used by data management applications such as Spectrum Protect & Spectrum Archive to provide a tape tier / active archiving for a Spectrum Scale file system

• Use DMAPI and callbacks with:

• Spectrum Protect (TSM/HSM):
• Client-server model
• Requires a separate TSM server

• Spectrum Archive (LTFS EE):
• Integrated with the Spectrum Scale cluster
• Does not require a separate server
• Based on the Linear Tape File System (open source)
• Allows export and import of tapes


Spectrum Scale + Spectrum Protect

[Diagram: users and applications perform file operations (i.e. read/write) against the Spectrum Scale cluster; TSM/HSM clients on the GPFS nodes migrate data over the LAN to the Spectrum Protect server and its tape library when storage pool thresholds are reached; recalls are caused by users accessing files.]

• Migration: data goes from GPFS to the TSM server; recall: data goes from the TSM server to GPFS
• Migration is based on storage pool thresholds (e.g. migration due to low online storage) and can be done directly to tape
• Data is recalled from tape due to file access
• Multiple TSM/HSM clients can move data to the TSM server
• Migration and recalls are distributed and performed by the TSM/HSM clients


Tape Tier and Active Archiving (contd.)
• Migration: move data to tape
– Uses a GPFS policy to find candidates and migrate them
– Threshold-based migration
• Utilizes policy rules (mmchpolicy) and user exits to accomplish monitoring of the space

mmaddcallback CallbackIdentifier --command CommandPathname
--event Event[,Event...] [--priority Value]
[--async | --sync [--timeout Seconds] [--onerror Action]]
[-N {Node[,Node...] | NodeFile | NodeClass}]
[--parms ParameterString ...]

--event: lowDiskSpace, noDiskSpace

– Examples are located at: /usr/lpp/mmfs/samples/ilm

– Process:
• Set up LTFS EE or TSM
• Create a policy with a threshold; use the mmchpolicy command to install the new policy
• Set up the callback
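A minimal wiring of the process above, assuming a policy file named migrate_policy.rules and the stock mmstartpolicy helper shipped with GPFS (the callback identifier and file names are examples; the samples directory above contains complete versions):

```shell
# Install a policy that contains a THRESHOLD(...) migration rule.
mmchpolicy fs1 migrate_policy.rules

# Register a callback so the policy engine runs when a pool fills up.
mmaddcallback MIGRATE_ON_THRESHOLD \
    --command /usr/lpp/mmfs/bin/mmstartpolicy \
    --event lowDiskSpace,noDiskSpace \
    --parms "%eventName %fsName"
```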


Tape Tier and Active Archiving (contd.)
• Recall: Move data from tape to file system.
– On-Demand recall via access to a file
– Based on command to recall a file
– Policy based recall
• Reconciliation
– Recover space for deleted files from tapes



Spectrum Scale + Spectrum Archive

[Diagram: LTFS EE runs on separate GPFS nodes with no external offline storage server.]

• No external offline storage server
• LTFS EE connects to tape via LTFS LE+
• The tape library can have multiple pools (3 in the example)
• Multiple nodes can connect to the tape library – scalability for performance
Scale Out Backup and Restore (SOBAR)

Scale Out Backup and Restore (SOBAR) is a specialized mechanism for protecting IBM Spectrum Scale™ file systems against disaster; it applies only to file systems that are managed by Spectrum Protect – Tivoli® Storage Manager (TSM) Hierarchical Storage Management (HSM).

Backup Process:
• Backup configuration data
• Pre-migrate files to HSM so there is a copy of the data in TSM

Restore Process:
• In case of disaster, recreate the cluster and file system (using mmrestoreconfig)
• Use the image restore process to restore the inode space (directory structure and file stubs)
• Then use the normal HSM process to recall data (data will be recalled on demand)


Disaster Recovery using SOBAR

[Diagram: the Spectrum Protect for Space Management client AND backup archive client are typically installed on several cluster nodes; the SOBAR toolset is used for processing between the Spectrum Scale cluster and the Spectrum Protect server (image backup/restore, migration, transparent and manual recall).]

Function:
• Backup:
– Spectrum Protect HSM is used to premigrate files
– The SOBAR toolset is used to generate a filesystem metadata image
– The Spectrum Protect backup archive client is used to back up the image files
• Restore:
– The Spectrum Protect backup archive client is used to restore the image files
– The SOBAR toolset is used to recreate the file system structure
– Spectrum Protect HSM is used to pre-fetch files and allow direct access by applying transparent recall

Challenges:
• All files to be included have to be premigrated or migrated
• The cluster configuration has to be backed up separately

Recommendations:
• Frequently applied policy rules should ensure that newly created files are premigrated immediately
• Integrate the SOBAR backup into your business process to prevent file changes shortly before image capturing
• Prepare a pre-fetching importance list for recovery processing
SOBAR Commands

Backup of configuration information

mmbackupconfig Device -o OutputFile

Restore configuration information

mmrestoreconfig Device -i InputFile [-I {yes | test}]


[-Q {yes | no | only}] [-W NewDeviceName]
or
mmrestoreconfig Device -i InputFile --image-restore
[-I {yes | test}] [-W NewDeviceName]
or
mmrestoreconfig Device -i InputFile -F QueryResultFile
or
mmrestoreconfig Device -i InputFile -I continue



SOBAR Commands (contd)
Backup of the metadata space (inodes space)

mmimgbackup Device [-g GlobalWorkDirectory]


[-L n] [-N {Node[,Node...] | NodeFile | NodeClass}]
[-S SnapshotName] [--image ImageSetName] [--notsm | --tsm]
[--tsm-server servername] [POLICY-OPTIONS]

Restore filesystem metadata space.

mmimgrestore Device ImagePath [-g GlobalWorkDirectory]


[-L n] [-N {Node[,Node...] | NodeFile | NodeClass}]
[--image ImageSetName] [POLICY-OPTIONS]

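Combined, the commands above form the SOBAR cycle described earlier. A hedged end-to-end sketch follows; the device, server and path names are examples, and it requires a cluster with Spectrum Protect for Space Management configured:

```shell
# --- Backup side ---
mmbackupconfig fs1 -o /backup/fs1.config     # cluster/fs configuration
mmimgbackup fs1 --tsm                        # metadata image to TSM
# (file data must already be premigrated/migrated by the HSM policy)

# --- Restore side, after a disaster ---
mmrestoreconfig fs1 -i /backup/fs1.config --image-restore
mmimgrestore fs1 /backup/images              # rebuild inode space/stubs
# Data comes back transparently on access, or pre-fetch with dsmrecall.
```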


Client Examples
IBM Spectrum Scale and Spectrum Protect together

Regional Hospital Network – consolidate primary and backup storage for scalable performance:
• High availability: no single point of failure
• 4x more capacity and lower storage costs, compared to Data Domain
• Faster backups and restores: IBM ESS with 40Gb/s network
• Secure: encryption for primary and backup data

Global Large Enterprise – consolidate enterprise backup:
• High availability: fast failover
• Reduced backup infrastructure costs by consolidating over 9 PB of data
• Easier for backup administrators to manage storage
• Supports mixed workloads: virtual servers, SAP and other business data

IBM Internal and Partner Use Only
Backup & Archive to Spectrum Scale
• Spectrum Scale as the destination
• Use Case: Spectrum Protect uses Spectrum Scale / ESS for storing data being
backed up or archived



Meet Bob, IT Manager
• How do I store more data on a flat budget?

• How do I win the backup window race every night?

• How do I stop buying expensive EMC appliances to meet the data growth?

• How can I make my Backup administrators more efficient?

Storage Operational Challenges

[Chart: storage environment challenges, % rating 4 or 5 on a 1–5 scale – backup and disaster recovery; scalability; tiering data; provisioning/flexibility; budget constraints; leveraging underutilized storage; meeting SLAs; managing heterogeneous o/s; software tools to manage storage; leveraging de-duplication. Source: 2014 STG NDB Study, n = 1,206]

• Backup and disaster recovery was named the biggest challenge, even though it isn't central to top-line revenue generation or day-to-day customer satisfaction
• Scalability and tiering data point to the importance of having data available and making the best use of it for the business
• Business operations and applications are dynamic; despite years of progress in storage efficiency and non-stop data growth, underutilized storage remains a top concern
• Meeting SLAs ranks surprisingly low, pushed down by other urgent issues
• Back-office departments and IT people are never applauded for 'keeping the lights on', but in storage that's just where a lot of time still goes
Market Facts

• IBM is a leader in the enterprise backup software and integrated appliances Magic Quadrant
Published date: 06/15/2015 Source: Gartner Magic Quadrant for Enterprise Backup Software and Integrated Appliances

• By 2016, less than 30% of all big data is expected to be backed up


Published date: 06/16/2014 Source: Gartner Magic Quadrant for Enterprise Backup Software and Integrated Appliances

• By 2017, 70% of organizations are expected to have replaced their remote-office tape backup with a disk-based
backup solution that incorporates replication, up from 30% today
Published date: 06/16/2014 Source: Gartner

• By 2018, the number of organizations abandoning tape for backup is expected to double, and archiving to tape
should increase by 25%
Published date: 06/16/2014 Source: Gartner Magic Quadrant for Enterprise Backup Software and Integrated Appliances

Be selective about what you back up:
• Archive to low-cost storage
• Back up to fast storage


IBM Spectrum Protect + IBM Spectrum Scale Solution

• Easier to grow as your data grows
• Lower cost of backup infrastructure
• Easier to use than the competition

 Virtually unlimited scaling
 Add turnkey building blocks
 High performance storage with parallel data access
 Add storage with no impact to applications or users

[Diagram: Spectrum Protect servers share a Spectrum Scale file system that grows from 200 TB to 1 PB to 10 PB.]
We focused our innovation efforts on solving your problems
• Simplify
– Manageability at scale with common graphical user interface
– Storage provisioning is transparent to Spectrum Protect

• Reduce costs
– Lower infrastructure costs to achieve backup window and recovery objectives
– IBM Spectrum Protect’s built in enterprise class data dedup for no additional charge
– Lower admin efforts with simplified provisioning of storage for Spectrum Protect
– Higher storage utilization by leveraging a shared file system
– Build your infrastructure your way using low cost commodity storage
– Real time recovery for a longer retention period per dollar

• Improved availability through our high performance shared file system


– Scalable performance lets you finish your backups during their SLA windows
– Flexible restore helps to get business back in business during an event
– Automated failover for backup and restore options
– Highly available storage that can span datacenters

Spectrum Protect on Spectrum Scale - Overview

• Multiple Spectrum Protect (TSM) instances store their DB and storage pools in a Spectrum Scale file system (GPFS)
– Spectrum Scale provides a global name space for all Spectrum Protect instances
– Instances share all file system resources
• Spectrum Protect instances run on cluster nodes, accessing the file system and disks directly
• The Spectrum Scale file system balances the workload and capacity for all TSM instances on disk
• Spectrum Scale storage for Spectrum Protect provides a standardized, scalable and easy to use storage infrastructure for the multiple instances

[Diagram: Spectrum Protect clients reach several TSM instances over a TCP/IP network; the instances share one Spectrum Scale file system on Spectrum Scale storage, attached via a storage network.]
Our solution offers fast, seamless and virtually limitless scaling

• Without Spectrum Scale:
– Each backup server has its own isolated file system
– Each Protect server and its dedicated LUN are tightly coupled
– Storage islands appear with underutilized capacity
– Capacity and performance management is challenging
– Scaling and performance may impact apps and users

• With Spectrum Scale:
– Scale capacity seamlessly and transparently to apps or users under the shared file system global namespace
– Build your infrastructure using commodity storage, i.e. no vendor lock-in
– Central administration of all storage

[Diagram: backup clients feed multiple Spectrum Protect instances that share a Spectrum Scale file system on common storage.]
IBM Spectrum Protect on IBM Spectrum Scale Architecture

[Diagram: applications (files, DB, mail, ERP) back up via TSM clients to TSM servers running on GPFS servers; the GPFS file systems hold the TSM DB and storage pools on GPFS storage, with tape behind.]

 All TSM servers store DB and storage pools in GPFS file systems
– The file system for databases provides low latency
– The file system for storage pools provides high sequential performance
– GPFS can do both
 Running multiple TSM instances on one GPFS cluster provides a standardized, scalable and easy to use storage infrastructure for the TSM backup environment
 The GPFS cluster provides a single file system and on-demand resource sharing for all TSM instances


Value proposition for Spectrum Protect on Spectrum Scale
• Optimized storage utilization – all Spectrum Protect servers use the same storage

• Operational efficiency with one storage system for all Spectrum Protect servers

• Scalable in multiple dimensions:


– Capacity: concurrently add more storage to Spectrum Scale file system
– Performance: concurrently add more Spectrum Scale / Spectrum Protect server or faster storage

• High performance with intelligent striping across all disk devices

• High availability in clustered file system

• Disaster protection with TSM or GPFS replication or GPFS native RAID (GNR)

• Cost efficient by utilizing standard infrastructure components



Elastic Storage Server overview

• Spectrum Scale appliance (pre-packaged)
– Graphical user interface
– 3 years maintenance and support
• Based on GPFS Native RAID (declustered)
– Predictable performance
– Low impact during rebuild
– 2- and 3-fault tolerance configurable
– End-to-end checksums
• Provides a GPFS file system
– Applications are configured on extra nodes
• Different models
– GS: small and fast (2 – 125 TB)
– GL: large and scaling (150 – 1530 TB)

[Diagram: protocol/application nodes (GPFS NSD clients) for file serving, databases, backup, archive and apps connect over LAN/IB to a global name space served by the Elastic Storage Server (NSD server), whose native RAID software manages JBODs.]


Overview of Spectrum Protect with ESS

 IBM Elastic Storage Server provides:


 Scalable, extremely high-performance while being a low-cost storage platform
 Declustered GPFS Native RAID with options for 3 or 4 way mirroring, or double or triple parity RAID
 Data and redundancy information distributed across all disks in the JBOD
 Extremely fast rebuild times

 Benefits of integrating with Spectrum Protect:


 Simplified storage configuration
 Ethernet storage attachment (10GbE or 40GbE)
 Global namespace sharable by more than one Spectrum Protect server
Spectrum Protect server performance on ESS – outside view

“IBM's TSM backup product goes super-fast when backing up to the


Elastic Storage parallel file system….”

“Of course, you need fast network links as well and backup/archive
software that can use the links and back-end storage, as TSM can. If you
have these then your backup and archive, and subsequent restores, could
move data around like a dragster roaring down a speed strip.”

The Register http://www.theregister.co.uk/2014/12/22/dragster_backup_with_parallel_target_system/

More information about the tests on Developer Works



TSM Blueprint: TSM with Elastic Storage Server - Available!

• Support for IBM Elastic Storage Server


– Configuration instructions for large TSM server with Elastic Storage Server
– Configuration script support for automating Spectrum Protect server setup with ESS
– Initially published for Linux x86_64

• See https://ibm.biz/TivoliStorageManagerBlueprints
A Perfect Match: Spectrum Protect and Elastic Storage

• Spectrum Scale and Spectrum Protect together provide high-performance and low-latency characteristics
• Easily scale the system performance- and capacity-wise by adding more ESS to the cluster

[Diagram: Spectrum Protect clients send production data to one or more Spectrum Protect servers on Spectrum Scale file systems, backed by one or more Elastic Storage Servers, with sequential-access tape behind.]

Optional:
• In environments where very large numbers of very small objects are stored, consider placing the TSM DB and logs on an IBM FlashSystem
• Consider adding physical tape for archiving or offsite vaulting

http://escc.mainz.de.ibm.com | gaschler@de.ibm.com


Key values for Spectrum Protect on ESS
• Superior Performance
– No additional overhead with TSM server running on GPFS client
– ESS performance scales almost linearly
– No impact during disk rebuild with GPFS Native RAID (GNR) on ESS

• Lower Cost
– No extra storage required for TSM DB
– Use of standard infrastructure components

• Excellent Data Protection


– With GNR, TSM backup and node replication
– Superior data protection with native RAID options

• Flexible Scalability
– Multiple TSM servers can share a single file system and storage
– Add more ESS building blocks as capacity and performance demands grow

• Ease of use with graphical user interface and TSM Operations Center
– TSM Operations Center provides advanced monitoring and reporting
A Smarter Storage Approach
The IBM Integrated Storage Portfolio

Thank you!
For more information:
Website: http://www-03.ibm.com/systems/storage/spectrum/index.html



Backup



Spectrum Scale Parallel Architecture

[Diagram: Spectrum Scale NSD clients store files in blocks across Spectrum Scale NSD servers in a Spectrum Scale file system.]

 All NSD servers export NSDs to all the clients in active-active mode
 Spectrum Scale stripes files across NSD servers and NSDs in units of the file-system block-size
 The NSD client communicates with all the servers
 File-system load is spread evenly across all the servers and NSDs. No hot spots
 Easy to scale file-system capacity and performance while keeping the architecture balanced
 The client does real-time parallel I/O to all the NSD servers and storage volumes/NSDs
Spectrum Scale – The Complete Data Management Solution for Enterprise Environments

[Diagram: a single worldwide name space spans cache sites connected via AFM to a Spectrum Scale cluster with FPO, POSIX, NFS, SMB/CIFS and OpenStack access, over ESS, flash, fast and slow disk, and TSM/LTFS/HPSS tiers.]

Single file-system technology to serve home, high-speed scratch, and analytics workloads, as well as be globally available to a worldwide user community.


Customer Requirements Versus Product Functions

Protection type required for the Spectrum Scale file system, and the recommended Spectrum Storage solution:

• Backup: use Spectrum Protect in combination with Spectrum Scale mmbackup.
– Reason: Spectrum Archive Enterprise Edition does not provide classical backup features like versioning. Furthermore, Spectrum Protect backup can't be combined with Spectrum Archive on the same file system.

• Backup + HSM*: use the Spectrum Protect Backup Archive Client in combination with Spectrum Scale mmbackup and Spectrum Protect for Space Management.
– Reason: Spectrum Protect for Space Management and the backup archive client provide a close integration. Spectrum Protect backup can't be combined with Spectrum Archive on the same file system.

• HSM*: see the comparison of functions on the next slides.
– Reason: both products, Spectrum Archive Enterprise Edition and Spectrum Protect for Space Management, integrate with Spectrum Scale and can be recommended. Each product has its own strengths.
– General hint for HSM-only environments: if Spectrum Protect is already in use and the customer has skills in this area, Spectrum Protect for Space Management can be integrated into the environment easily. If Spectrum Protect is not an option for the customer, Spectrum Archive Enterprise Edition is the best approach.

* HSM refers to the capability to migrate files from disk to tape
Functionality Compared – Backend Storage System And Platforms

• Backend Storage Type
– Spectrum Protect for Space Management: backend storage provided by the Spectrum Protect server, with a wide range of storage medium types supported (disk, tape, optical, object)
– Spectrum Archive Enterprise Edition: IBM tape drives and libraries
• Backend Storage Data Format
– Spectrum Protect: data is stored in a proprietary format; tape cartridges containing data can be used only in combination with the Spectrum Protect server, and vice versa
– Spectrum Archive: data is stored in the open LTFS format; single cartridges can be used directly with Spectrum Archive SE or LE (export and import function)
• Supported Tape Systems
– Spectrum Protect: multi-vendor support, including LTO, IBM TS1100, Oracle StorageTek, DLT and virtual tape libraries
– Spectrum Archive: IBM LTO and TS1100 tape drives with IBM TS3500, TS4500 and TS3310 libraries
• Tape Library Sharing
– Spectrum Protect: yes, multiple TSM servers can share the same tape libraries and tape drives, but not tape cartridges
– Spectrum Archive: all Spectrum Archive nodes share one tape library and all tape cartridges; each node requires dedicated drives (IBM plans to support sharing of 2 tape libraries in 4Q15)
• Backend Storage Device Collocation
– Spectrum Protect: data can be collocated at the filespace level to implement dedicated storage volume usage
– Spectrum Archive: can be collocated at the file system, directory and file name level
• Backend Storage Metadata
– Spectrum Protect: the Spectrum Protect server uses DB2 instances for metadata
– Spectrum Archive: metadata is stored on the tape cartridge and in the file system (Spectrum Scale)
• Platforms: see slide "High-level Architecture"
Functionality Compared – Data Transfer And Scalability

File Migration
• Spectrum Protect for Space Management: Premigration and migration of single files, and of multiple (small) files in one transaction; tape-optimized migration.
• Spectrum Archive EE: Premigration and migration of single files in one transaction; tape-optimized migration.

File Recall
• Spectrum Protect for Space Management: Normal recall (full file); streaming recall (for streaming applications, e.g. media players); partial recall (for partial-access applications, e.g. databases); tape-optimized recall; cluster-wide recall distribution.
• Spectrum Archive EE: Normal recall (full file); tape-optimized recall; cluster-wide recall distribution.

Scaling Migrate and Recall Throughput
• Spectrum Protect for Space Management: Add Space Management nodes, Spectrum Protect servers and tape resources.
• Spectrum Archive EE: Add Spectrum Archive EE nodes and tape resources.

Linear Scalability
• Spectrum Protect for Space Management: Limited by the number of Spectrum Protect servers and the LAN connections to the Spectrum Protect server.
• Spectrum Archive EE: Scales by adding tape drives and Spectrum Archive EE nodes.
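In both products the migrate path compared above is driven from the command line. A minimal sketch follows (file names, the list file and the pool name are illustrative assumptions; `dsmmigrate` and `ltfsee` exist only on configured HSM or Spectrum Archive EE cluster nodes, so each call is guarded):

```shell
# Sketch only: dsmmigrate (Spectrum Protect for Space Management) and
# ltfsee (Spectrum Archive EE) are present only on configured cluster
# nodes. File, list and pool names below are illustrative assumptions.
if command -v dsmmigrate >/dev/null 2>&1; then
  dsmmigrate /gpfs/fs1/projects/coldfile.dat   # migrate one file to the Spectrum Protect server
else
  echo "dsmmigrate not available on this host"
fi
if command -v ltfsee >/dev/null 2>&1; then
  ltfsee migrate ./mig_list.txt -p pool1       # tape-optimized migration of a file list
else
  echo "ltfsee not available on this host"
fi
```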



Functionality Compared – Protection And DR

File System Backup
• Spectrum Protect for Space Management: Close integration with the Spectrum Protect backup-archive client (mmbackup).
• Spectrum Archive EE: No support for file system backup.

Creating Multiple Copies
• Spectrum Protect for Space Management: Using copy storage pools and the node replication feature of the Spectrum Protect server.
• Spectrum Archive EE: By migrating data to more than one tape cartridge pool (up to 3 copies).

Preservation of Attributes
• Spectrum Protect for Space Management: POSIX attributes and full ACL/EA support are preserved in the Spectrum Protect server.
• Spectrum Archive EE: POSIX attributes are preserved on tape.

Frontend Disaster Recovery (GPFS)
• Spectrum Protect for Space Management: Restore of files from backup (when available); recreation of deleted stub files; recovery of the full file system with SOBAR.
• Spectrum Archive EE: Recreation (rebuild) of deleted stub files, which requires reading all tapes.

Backend Disaster Recovery (TSM or LTFS Tapes)
• Spectrum Protect for Space Management: DB2 is the central metadata store and can be restored from backup; copy pools can be used to recover from volume failures; switch to the replication node of the Spectrum Protect server.
• Spectrum Archive EE: The content of damaged tapes can be repaired if multiple copies were created during migration.

High Availability
• Spectrum Protect for Space Management: Automated failover of the HSM service on node failure in a multi-node cluster; automated recovery of the local HSM service on processing failures.
• Spectrum Archive EE: Manual failover of Spectrum Archive EE services in a multi-node cluster.
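The mmbackup integration noted in the table is invoked per file system. A minimal sketch (the file-system path and the Spectrum Protect server name "TSMSRV1" are illustrative assumptions; mmbackup exists only on a Spectrum Scale cluster node, so the call is guarded):

```shell
# Sketch only: run an incremental backup of /gpfs/fs1 through the
# Spectrum Protect backup-archive client. Path and server name are
# illustrative assumptions, not a tested configuration.
if command -v mmbackup >/dev/null 2>&1; then
  mmbackup /gpfs/fs1 -t incremental --tsm-servers TSMSRV1
else
  echo "mmbackup not available on this host"
fi
```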
Case Studies
• Backup of Spectrum Scale / ESS



Use Case: Data Protection / Disaster Recovery

• Internal data and metadata mirroring options
– Protects against data corruption
• Snapshots – at folder (fileset) granularity or of the full file system
• Multi-site replication
– Synchronous, asynchronous (Active File Management) or "sharing"
• Continuous health monitoring
– Failures are detected and the appropriate recovery action is taken automatically
– Extensive recovery capabilities
• Fault tolerance
– Continued access to data even if cluster nodes or storage fail
• Encryption and Secure Erase to protect sensitive information
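The snapshot capability listed above is exposed through mm-commands. A minimal sketch (file-system, fileset and snapshot names are illustrative assumptions, and the colon syntax for fileset snapshots follows GPFS 4.x releases; the command exists only on a cluster node, so the calls are guarded):

```shell
# Sketch only: create a full file-system snapshot, then a fileset-level
# ("folder"-granularity) snapshot. The names fs1, snap1 and projects
# are illustrative assumptions.
if command -v mmcrsnapshot >/dev/null 2>&1; then
  mmcrsnapshot fs1 snap1            # snapshot of the whole file system
  mmcrsnapshot fs1 projects:snap1   # snapshot of the "projects" fileset
else
  echo "mmcrsnapshot not available on this host"
fi
```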

Synchronous Replication
• A single file system sync'd across 2 locations; data is always sync'd across 2 or 3 failure domains over a TCP/IP or InfiniBand network.
• All I/O remains active during a domain failure; I/O is automatically routed to the surviving storage domain.
• Seamless file-system recovery.

Multi-cluster (Remote Mount)
• Allows multiple application groups to share portions of the data, or all of it.
• A single copy of the data (e.g. Filesystem 1 and Filesystem 2) is shared across multiple sites via the NSD protocol over a TCP/IP or InfiniBand network.
• I/O performance during remote file access is determined by the available bandwidth.
Use Case: Backup/Archive with Spectrum Archive
(LTFS Enterprise Edition)
Data flow: Data ingestion or creation → Data processing → Access → Archival

• LTFS EE enables IBM tape libraries to replace tier 2/3 disk storage in Spectrum Scale-based tiered disk environments
• Storage virtualization with transparent tiering to tape
– LTFS EE creates a "nearline" access tier 2/3 storage with tape at 1/5 the cost
– Helps reduce storage expense for data that does not need the access performance of primary disk

Benefits:
• Near-instant access for users and applications
• Improved operations
• Significant cost savings over more traditional storage
• Rapid implementation and integration
• Improved editing workflow
• Retention of more raw footage
Backup
• Simple backup solution to the file system
• Easy data access and restore
• Leverage simplicity and tape TCO

Tiered Storage
• Policy-based placement and migration
• Use tape for infrequently accessed files
• Leverage simplicity and tape TCO

Archiving
• Archive large volumes of data / files
• Retain data for long periods of time
• Leverage simplicity, tape TCO and a standardized format
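The policy-based placement and migration called out above is driven by the Spectrum Scale ILM policy engine. A sketch of what a rule set tiering cold files to a Spectrum Archive EE tape pool might look like (the pool names, executable path, 30-day threshold and size filter are illustrative assumptions, not a tested configuration):

```
/* Define the Spectrum Archive EE tape tier as an external pool.
   The ltfsee binary path and tape pool name are illustrative. */
RULE EXTERNAL POOL 'ltfs'
  EXEC '/opt/ibm/ltfsee/bin/ltfsee'
  OPTS '-p pool1'

/* Migrate files not accessed for 30 days from the disk pool to tape,
   keeping small files on disk. */
RULE 'cold_to_tape' MIGRATE
  FROM POOL 'system'
  TO POOL 'ltfs'
  WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 30
    AND FILE_SIZE > 1048576
```

Such a policy would typically be applied on a schedule with mmapplypolicy; recalls remain transparent to applications reading the stub files.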

John F Kennedy Center for the Performing Arts


 Business need:
The Kennedy Center holds 2,000 performances annually that are streamed live online. All performances are preserved, protected and made accessible. In addition, the Millennium Stage On-Demand archive consists of 3,500 performances available on demand. The mission is to preserve these assets by digitizing them, protecting them and providing rapid access to them.
 Solution:
Deployed Spectrum Scale and IBM Linear Tape File System Enterprise Edition with IBM LTO 6 tape to provide a transparent architecture, rapid access and high availability. User data files appear as a normal file system to OS applications, allowing seamless incorporation into existing environments.



Customer Deployment: Financial Services Customer
Client
• Credit information and information management services to over 45,000 businesses and 500 million
consumers in 33 countries worldwide
• Application: High-Volume data processing applications

Challenge
• Evaluate and re-architect an outdated GPFS environment.
• Provide enhanced business continuity using a second datacenter
• Provide an infrastructure for rapidly developed applications and real time analytics

Solution
• Integrated solution comprising Spectrum Scale ESS systems, Spectrum Archive EE, ProtecTIER, TS4500 Tape Libraries and Professional Services

Key Client Benefits
• Increased storage utilization through integration of SSDs, disks and tape
• Higher level of business continuity through Spectrum Scale's built-in DR capabilities and ESS's data integrity features
• Simplified management due to a homogeneous, scalable, 'building block'-based architecture
• Increased performance


Customer Deployment: Financial Services Customer
Architecture
• 6x ESS GL6 at two sites in a DR configuration supporting Batch Credit Services’ data
processing applications
• 2x ESS GL6 and 2x GS2 with Spectrum Archive EE and 2x TS4500 Tape libraries at two
sites for mainframe data conversion

IBM Confidential
Case Studies
• Backup to Spectrum Scale / ESS



University of Oklahoma

Overview
Petascale Storage for Research Computing

Industry: Education
Client: University of Oklahoma Supercomputing Center for Education and Research (OSCER)
Products: GPFS, TSM, IBM disk and tape storage and servers

Problem: Growing data needs for research computing while meeting the data management requirements of the National Science Foundation

Solution:
• For high-capacity disk storage, the IBM System Storage DCS9900 was selected, which is scalable up to 1.7 PB.
• For longer-term data storage, OU chose the System Storage TS3500 Tape Library, with an initial capacity of up to 4.3 PB, expandable to over 60 PB.
• To run these storage systems, six IBM System x3650 class servers were selected, running IBM General Parallel File System (GPFS™) on the disk system and IBM Tivoli Storage Manager on the tape library to automatically move or copy data to tape.

Benefit: Neeman says one of the main reasons they chose IBM was the cost-effectiveness of the tape solution.

White paper available

University of Colorado

Overview
PetaLibrary for Research Computing

Industry: Education
Client: University of Colorado Research Computing Dept
Products: GPFS, TSM, IBM disk and tape storage and servers

Problem: Growing data needs for research computing while meeting the data management requirements of the National Science Foundation

Solution:
• For high-capacity disk storage, the IBM System Storage DCS3700 was selected
• For longer-term data storage, CU chose the System Storage TS3500 Tape Library
• IBM General Parallel File System (GPFS™) runs on the disk system and IBM Tivoli Storage Manager on the tape library to automatically move or copy data to tape

Profile
The PetaLibrary is a National Science Foundation-subsidized service for the storage, archival, and sharing of research data. It is available for a modest fee to any US-based researcher affiliated with the University of Colorado Boulder.

Benefit: A scalable, high-performance solution that meets the needs of a broad set of customers

Video available

Recent ESS Win for Fast Backup Pool Use Case
IASIS Healthcare (not yet publicly referenceable)
IASIS Healthcare chose the IBM Elastic Storage Server to meet their needs for a TSM fast backup pool. IASIS chose ESS for the following reasons:

1. Multi-use capability – No need to spend premium dollars on a backup appliance that can serve only a single function. With ESS, IASIS is leveraging it not only as a backup target and for replication, but for several other use cases. One example is an image store for an ambulatory EHR that stores millions of images and needs them served up in a timely manner.

2. Scalability – The existing backup appliances do not scale effectively, and IASIS had felt the pain of a "rip and replace." The ability of the ESS to scale while adding performance was a great differentiator.

3. Performance – IASIS was having challenges around backup windows and restores. The ESS, utilizing 40GbE, will allow IASIS to shrink their backup window and hasten their restore window within their RTO, also allowing them to meet their RPO.

4. Cost – The Data Domain and other proposed backup appliances (ExaGrid, Sepaton) are expensive. The cost for the Data Domain solution was around $3700/TB; the cost for the ESS solution was around $700/TB.

5. Encryption – The customer desired encryption capability.

The Solution
• 2x GL4 Elastic Storage Server systems, with approximately 600 TB of usable space
• GPFS Advanced Edition (includes encryption)



Key Learning Observations/Considerations from IASIS

1. Deduplication ratios played an important part in the price comparison
2. Replacing VTL-based solutions with fast ESS storage may require enabling Spectrum Protect deduplication for optimum $/TB
3. Sell the fact that, for Disaster Recovery, Spectrum Protect is active/active and so can provide lower RTO (vs. VTL cold standby)
4. Spectrum Protect client-side deduplication is advised if also performing replication, especially if available BW is limited. Server-side deduplication can also be used but requires additional, careful architecting.
5. A Spectrum Protect Server upgrade to v7.1.1 is preferred, since that release contains enhancements for Spectrum Protect Node Replication
6. Ensure the client is willing to be flexible and change their environment in order to get the performance and cost benefits
7. Determine if there is an ISV involved and if there are any constraints or lag in software currency
8. Highly recommend doing a Technical Delivery Assessment (TDA) to ensure a successful deployment
9. If done right and with training, this is a great opportunity for Business Partners to sell services!
10. Also consider using IBM Lab Based Services, who have the capability and experience to help deploy this solution
11. Reference the Spectrum Protect "Blueprint", which now supports ESS
• https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Storage%20Manager/page/NEW%20-%20Tivoli%20Storage%20Manager%20Blueprint%20-%20%20Improve%20the%20time-to-value%20of%20your%20deployments

IBM and IBM Business Partner Use Only © 2016 IBM Corporation
