
XIV Education

XIV System Architecture

2009 IBM Corporation

Overview
Phase 10 Features and Capabilities
Gen II Systems Hardware Design and Layout
XIV Software Framework
XIV Systems Management

The XIV Storage System architecture incorporates a variety of features designed to uniformly distribute data across key internal resources. This unique data distribution method fundamentally differentiates the XIV Storage System from conventional storage subsystems, thereby effecting numerous availability, performance, and management benefits across both physical and logical elements of the system.

Session I: Phase 10 Features and Capabilities


System Components
Architectural Design
Grid Architecture
Storage Virtualization and Logical Parallelism
Logical System Concepts
Usable Storage Capacity
Storage Pool Concepts
Capacity Allocation and Thin Provisioning

The XIV Storage System architecture is designed to deliver performance, scalability, and ease of management while harnessing the high capacity and cost benefits of SATA drives. The system employs off-the-shelf components, in contrast to traditional offerings that rely on more expensive, proprietary designs.

System Components

The XIV Storage System is composed of the following components:
Host Interface Modules: six modules, each containing 12 SATA disk drives
Data Modules: nine modules, each containing 12 SATA disk drives
A UPS module complex made up of three redundant UPS units
Two Ethernet switches and an Ethernet Switch Redundant Power Supply (RPS)
A Maintenance Module
An Automatic Transfer Switch (ATS) for external power supply redundancy
A modem, connected to the Maintenance Module, for externally servicing the system
All the modules in the system are linked through an internal redundant Gigabit Ethernet network that enables maximum bandwidth utilization and is resilient to any single component failure. The system and all of its components come pre-assembled and wired in a lockable rack.

System Components (Cont.)

Hardware elements
The primary components of the XIV Storage System are known as modules. Modules provide processing, cache, and host interfaces and are based on standard Intel/Linux systems. They are redundantly connected to one another via an internal switched Ethernet fabric. All of the modules work together concurrently as elements of a grid architecture, and therefore the system harnesses the powerful parallelism inherent in a distributed computing environment.
Data Modules
At a conceptual level, the Data Modules function as the elementary building blocks of the system, providing physical capacity, processing power, and caching, in addition to advanced system-managed services that comprise the system's internal operating environment. The equivalence of hardware across Data Modules and their ability to share and manage system software and services are key elements of the physical architecture, as shown in the slide.
Interface Modules
Fundamentally, Interface Modules are equivalent to Data Modules in all aspects, with the following exceptions:
1. In addition to disk, cache, and processing resources, Interface Modules include both Fibre Channel and iSCSI interfaces for host system connectivity as well as remote mirroring. The slide conceptually illustrates the placement of Interface Modules within the topology of the IBM XIV Storage System architecture.
2. The system services and software functionality associated with managing external I/O reside exclusively on the Interface Modules.
Ethernet switches
The XIV Storage System contains a redundant switched Ethernet fabric that carries both data and metadata traffic between the modules. Traffic can flow in the following ways:
Between two Interface Modules
Between an Interface Module and a Data Module
Between two Data Modules

Architectural Design
Massive Parallelism
Workload balancing
Self-Healing
True virtualization
Thin provisioning

[Diagram: XIV grid architecture, with Interface Modules connected through redundant Ethernet switching to Data Modules]

Massive parallelism
The system architecture ensures full exploitation of all system components. Any I/O activity involving a specific logical volume in the system is always inherently handled by all spindles. The system harnesses all storage capacity and all internal bandwidth, and it takes advantage of all available processing power. This is equally true for host-initiated I/O activity as it is for system-initiated activity such as rebuild processes and snapshot generation. All disks, CPUs, switches, and other components of the system contribute at all times.
Workload balancing
The workload is evenly distributed over all hardware components at all times. All disks and modules are utilized equally, regardless of access patterns. Even though applications may access certain volumes, or certain parts of a volume, more frequently than others, the load on the disks and modules remains balanced. Pseudo-random distribution ensures consistent load balancing even after adding, deleting, or resizing volumes, as well as adding or removing hardware. This balancing of all data across all system components eliminates the possibility of a hot spot being created.
Self-healing
Protection against double disk failure is provided by an efficient rebuild process that brings the system back to full redundancy in minutes. In addition, the XIV Storage System extends the self-healing concept, restoring redundancy even after failures in components other than disks.
True virtualization
Unlike other system architectures, storage virtualization is inherent to the basic principles of the XIV Storage System design. Physical drives and their locations are completely hidden from the user. This dramatically simplifies storage configuration, letting the system lay out the user's volume in the optimal way. The automatic layout maximizes the system's performance by leveraging all system resources for each volume, regardless of the user's access patterns.
Thin provisioning
The system enables thin provisioning: the capability to allocate storage to applications on a just-in-time, as-needed basis, allowing significant cost savings compared to traditional provisioning techniques. The savings are achieved by defining a logical capacity that is larger than the physical capacity. This capability allows users to improve storage utilization rates, thereby significantly reducing capital and operational expenses by allocating capacity based on total space consumed rather than total space allocated.

Grid Architecture
Relative effect of the loss of a given computing resource is minimized
All modules are able to participate equally in handling the total workload
Modules consist of standard off-the-shelf components
Computing resources can be dynamically changed, in both capacity and performance
By scaling out
By scaling up

IBM XIV Storage System grid overview
The XIV grid design entails the following characteristics: Both Interface Modules and Data Modules work together in a distributed computing sense. In other words, although Interface Modules have additional interface ports and assume some unique functions, they otherwise contribute to system operations equally with Data Modules. The modules communicate with each other via the internal, redundant Ethernet network. The software services and distributed computing algorithms running within the modules collectively manage all aspects of the operating environment.
Design principles
The XIV Storage System grid architecture, by virtue of its distributed topology and standard Intel/Linux building-block components, ensures that the following design principles are possible:
Performance: The relative effect of the loss of a given computing resource, or module, is minimized.
Performance: All modules are able to participate equally in handling the total workload. This is true regardless of access patterns. The system architecture enables excellent load balancing, even if certain applications access certain volumes, or certain parts within a volume, more frequently.
Openness: Modules consist of standard off-the-shelf components. Because components are not specifically engineered for the subsystem, the resources and time required to adopt newer hardware technologies are minimized. This, coupled with the efficient integration of computing resources into the grid architecture, enables the subsystem to rapidly adopt the newest hardware technologies available without the need to deploy a whole new subsystem.
Upgradability and scalability: Computing resources can be dynamically changed: scaled out, by adding new modules to accommodate both new capacity and new performance demands, or even by tying together groups of modules; or scaled up, by upgrading modules.

Storage Virtualization and Logical Parallelism


Easier volume management
Consistent performance and scalability
High availability and data integrity
Flexible snapshots
Data migration efficiency
Pseudo-random algorithm
Modular software design

Easier volume management
Logical volume placement is driven by the distribution algorithms, freeing the storage administrator from planning and maintaining volume layout. The data distribution algorithms manage all of the data in the system collectively, without deference to specific logical volume definitions. Any interaction with a specific logical volume, whether host or subsystem driven, is inherently handled by all resources; it harnesses all storage capacity, all internal bandwidth, and all processing power currently available in the subsystem. Logical volumes are not exclusively associated with a subset of physical resources, nor is there a permanent static relationship between logical volumes and specific physical resources. Logical volumes can be dynamically resized. Logical volumes can be thinly provisioned.
Consistent performance and scalability
Hardware resources are always utilized equitably because all logical volumes always span all physical resources, and are therefore able to reap the performance potential of the full subsystem. Virtualization algorithms automatically redistribute logical volume data and workload when new hardware is added, keeping the system balanced while preserving transparency to the attached hosts. Conversely, equilibrium and transparency are maintained during the phase-out of old or defective hardware resources. There are no pockets of capacity, orphaned spaces, or resources that are inaccessible due to array mapping constraints or data placement.
Maximized availability and data integrity
The full virtualization scheme enables the IBM XIV Storage System to manage and maintain data redundancy as hardware changes: In the event of a hardware failure, or when hardware is phased out, data is automatically, efficiently, and rapidly rebuilt across all the drives and modules in the system, thereby preserving host transparency, equilibrium, and data redundancy at all times while virtually eliminating any performance penalty associated with conventional RAID rebuild activities. When new hardware is added to the system, data is transparently redistributed across all resources to restore equilibrium to the system.

Storage Virtualization and Logical Parallelism


Easier volume management
Consistent performance and scalability
High availability and data integrity
Flexible snapshots
Data migration efficiency
Pseudo-random algorithm
Modular software design

Flexible snapshots
Full storage virtualization incorporates snapshots that are differential in nature; only updated data consumes physical capacity. Many concurrent snapshots are possible (up to 16,000 volumes and snapshots can be defined). Note that this is possible because a snapshot uses physical space only after a change has occurred on the source. Multiple snapshots of a single master volume can exist independently of each other. Snapshots can be cascaded, in effect creating snapshots of snapshots. Snapshot creation and deletion do not require data to be copied, and hence occur immediately. As updates occur to master volumes, the system's virtualized logical structure enables it to elegantly and efficiently preserve the original point-in-time data associated with any and all dependent snapshots by simply redirecting the update to a new physical location on disk. This process, referred to as redirect on write, occurs transparently from the host perspective by virtue of the virtualized remapping of the updated data, and minimizes any performance impact associated with preserving snapshots, regardless of the number of snapshots defined for a given master volume. Because they use redirect on write and do not necessitate data movement, the size of a snapshot is independent of the source volume size.
Data migration efficiency
XIV supports thin provisioning. When migrating from systems that support only full (thick) provisioning, only the space that actually contains data is consumed on the XIV, so migrated volumes benefit from thin provisioning.

Logical System Concepts Distribution Algorithm


Each volume is spread across all drives
Data is cut into 1MB partitions and stored on the disks
XIV's distribution algorithm automatically distributes partitions across all disks in the system pseudo-randomly
XIV disks behave like connected vessels, as the distribution algorithm aims for constant disk equilibrium.
Thus, XIV's overall disk spindle usage approaches 100% in all usage scenarios.

[Diagram: partitions of a volume distributed across Interface Modules and Data Modules through the redundant switching fabric]

Logical System Concepts Distribution Algorithm (Cont.)


Data distribution only changes when the system changes
Equilibrium is kept when new hardware is added Equilibrium is kept when old hardware is removed Equilibrium is kept after a hardware failure
[Diagram: data redistribution across Module 1, Module 2, and Module 3 as the system changes]

Logical System Concepts - Partitions


Data distribution only changes when the system changes
Equilibrium is kept when new hardware is added Equilibrium is kept when old hardware is removed Equilibrium is kept after a hardware failure

[Diagram: hardware upgrade, with data redistributed across Module 1 through Module 4]

Logical constructs
The XIV Storage System logical architecture incorporates constructs that underlie the storage virtualization and distribution of data integral to its design. The logical structure of the subsystem ensures there is optimum granularity in the mapping of logical elements to both modules and individual physical disks, thereby guaranteeing an ideal distribution of data across all physical resources.
Partitions
The fundamental building block of logical volumes is known as a partition. Partitions have the following characteristics:
All partitions are 1MB (1024KB) in size.
A partition contains either a primary copy or a secondary copy of data.
Each partition is mapped to a single physical disk. This mapping is dynamically managed by the system via a proprietary pseudo-random distribution algorithm in order to preserve data redundancy and equilibrium. The storage administrator has no control over, or knowledge of, the specific mapping of partitions to drives.
Secondary partitions are always placed on a physical disk that does not contain the primary partition. In addition, secondary partitions are placed in a module that does not contain the corresponding primary partition.
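The placement rule can be illustrated with a small sketch. This is a minimal, illustrative model only, not the XIV algorithm itself: it assigns each 1MB partition a primary disk pseudo-randomly, then chooses a secondary disk from a different module, mirroring the constraint described above.

```python
import random

MODULES = 15
DISKS_PER_MODULE = 12


def place_partition(partition_id: int, seed: int = 0):
    """Illustrative placement of one 1MB partition: primary copy on any disk,
    secondary copy on a disk that lives in a different module."""
    rng = random.Random(partition_id * 1_000_003 + seed)   # deterministic per partition
    primary = rng.randrange(MODULES * DISKS_PER_MODULE)
    primary_module = primary // DISKS_PER_MODULE

    # Candidate disks exclude the entire module that holds the primary copy.
    candidates = [d for d in range(MODULES * DISKS_PER_MODULE)
                  if d // DISKS_PER_MODULE != primary_module]
    secondary = rng.choice(candidates)
    return primary, secondary


primary, secondary = place_partition(42)
assert primary // DISKS_PER_MODULE != secondary // DISKS_PER_MODULE
print(f"partition 42 -> primary disk {primary}, secondary disk {secondary}")
```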


Logical System Concepts Partitions (Cont.)


Data distribution only changes when the system changes
Equilibrium is kept when new hardware is added

Equilibrium is kept when old hardware is removed
Equilibrium is kept after a hardware failure
The fact that distribution is full and automatic makes sure all spindles join the effort of data re-distribution after a configuration change. Tremendous performance gains are seen in recovery/optimization times thanks to this fact.
[Diagram: hardware failure, with data redistributed across Module 1 through Module 4]

Logical System Concepts Slices


XIV data distribution architecture uses Slices for partition copies
Slices are spread across all disk drives in the system
Each slice has two copies: a primary copy and a secondary copy
There are 16,384 slices (times 2 copies, for a total of 32,768)
Each disk holds approximately 182 slices [(16,384 x 2)/180]
A primary slice and its secondary slice will never reside on the same module

[Diagram: primary and secondary slice copies placed on different modules (Module 1 and Module 2)]

14
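The per-disk figure quoted above can be checked with a couple of lines (a sketch using the full-rack numbers from this slide):

```python
SLICES = 16_384           # logical slices in the system
COPIES = 2                # primary + secondary copy of each slice
DISKS_FULL_RACK = 180     # 15 modules x 12 drives

slice_copies = SLICES * COPIES              # 32,768 slice copies in total
per_disk = slice_copies / DISKS_FULL_RACK   # roughly 182 slice copies per disk
print(slice_copies, round(per_disk, 1))     # 32768 182.0
```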

Logical System Concepts Slices (Cont.)


The minimum size of a volume is 17GB: 16,384 partitions x 1MB (2^20 = 1,048,576 bytes) is approximately 17GB (decimal)
When creating a volume, it must span across all drives in the system
Each partition is numbered with a logical partition number specific to its volume; the numbering is always modulo 16,384
When creating a volume larger than 17GB, the system allocates several 17GB chunks
In this example, a 51GB volume is built from 3 x 17GB chunks

[Diagram: logical partition numbers assigned across the disks of Module 1 and Module 2]

15

Logical System Concepts Logical Volumes


Every logical volume is comprised of 1MB pieces of data
Interface Modules will not hold partition copies from other Interface Modules
This is called FC Proof
Implemented because Interface Modules are more prone to problems than Data Modules
The physical capacity associated with a logical volume is always a multiple of 17GB
The maximum number of volumes that can be concurrently defined on the system is 4,605
The same address space is used for both volumes and snapshots, and permits up to 16,377 addresses

Logical volumes are administratively managed within the context of Storage Pools
16

Logical volumes
The XIV Storage System presents logical volumes to hosts in the same manner as conventional subsystems; however, both the granularity of logical volumes and the mapping of logical volumes to physical disks are fundamentally different. As discussed previously, every logical volume is comprised of 1MB (1024KB) pieces of data known as partitions. The physical capacity associated with a logical volume is always a multiple of 17GB (decimal). Therefore, while it is possible to present a block-designated (discussed in module 3) logical volume to a host that is not a multiple of 17GB, the maximum physical space that is allocated for the volume will always be the sum of the minimum number of 17GB increments needed to meet the block-designated capacity. Note that the initial physical capacity actually allocated by the system upon volume creation may be less than this amount.
The maximum number of volumes that can be concurrently defined on the system is limited by:
1. The logical address space limit: The logical address range of the system permits up to 16,377 volumes, although this constraint is purely logical and therefore should not normally be a practical consideration. Note that the same address space is used for both volumes and snapshots.
2. The limit imposed by the logical and physical topology of the system for the minimum volume size: The physical capacity of the system, based on 180 drives with 1TB of capacity per drive and assuming the minimum volume size of 17GB, limits the maximum volume count to 4,605 volumes. Again, since volumes and snapshots share the same address space, a system with active snapshots can have more than 4,605 addresses assigned collectively to both volumes and snapshots.
Logical volumes are administratively managed within the context of Storage Pools. Since the concept of Storage Pools is administrative in nature, they are not part of the logical hierarchy inherent to the system's operational environment.
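As a rough illustration of the sizing rules above, here is a short sketch using the figures quoted in the notes; it is not the system's internal accounting, and the simple division at the end only approximates the documented 4,605-volume limit:

```python
ALLOC_INCREMENT_GB = 17          # physical allocation granularity (decimal GB)

def allocated_size_gb(requested_gb: float) -> int:
    """Round a requested volume size up to the next 17GB increment."""
    increments = -(-requested_gb // ALLOC_INCREMENT_GB)    # ceiling division
    return int(increments) * ALLOC_INCREMENT_GB

print(allocated_size_gb(10))     # 17: a 10GB block-defined volume still reserves 17GB
print(allocated_size_gb(51))     # 51: exactly 3 x 17GB

usable_gb = 79_113               # net usable capacity of a full rack (see later slide)
print(usable_gb // ALLOC_INCREMENT_GB)   # 4653, close to the documented 4,605 limit
```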


Logical System Concepts Volume Layout


The Partition Table maps between a logical partition number and the physical location on the disk
The distribution algorithms seek to preserve the statistical equality of access among all physical disks
Each volume is allocated across at least 17GB (decimal) of capacity that is distributed evenly across all disks
Each disk has its data mirrored across all other disks, excluding the disks in the same module

The storage system administrator does not plan the layout of volumes on the modules
There are no unusable pockets of capacity known as orphaned spaces

Upon component failure a new Goal Distribution is created


All disks participate in enforcing the new Goal Distribution, and the system therefore rapidly returns to full redundancy

17

Logical volume layout on physical disks
The XIV Storage System facilitates the distribution of logical volumes over disks and modules by means of a dynamic relationship between primary data copies, secondary data copies, and physical disks. This virtualization of resources in the XIV Storage System is governed by a pseudo-random algorithm.
Partition table
Mapping between a logical partition number and the physical location on disk is maintained in a Partition Table. The Partition Table maintains the relationship between the partitions that comprise a logical volume and their physical locations on disk.
Volume layout
At a high level, the data distribution scheme is an amalgam of mirroring and striping. While it is tempting to think of this in the context of RAID 1+0 (10) or 0+1, the low-level virtualization implementation precludes the usage of traditional RAID algorithms in the architecture. This is because conventional RAID implementations cannot incorporate dynamic, intelligent, and automatic management of data placement based on knowledge of the volume layout, nor is it feasible for a traditional RAID system to span all drives in a subsystem, due to the vastly unacceptable rebuild times that would result.
Partitions are distributed on all disks using what is defined as a pseudo-random distribution function. The distribution algorithms seek to preserve the statistical equality of access among all physical disks under all conceivable real-world aggregate workload conditions and associated volume access patterns. Essentially, while not truly random in nature, the distribution algorithms in combination with the system architecture preclude the occurrence of the phenomenon traditionally known as hot spots.
The XIV Storage System contains 180 disks, and each volume is allocated across at least 17GB (decimal) of capacity that is distributed evenly across all disks. Each logically adjacent partition on a volume is distributed across a different disk; partitions are not combined into groups before they are spread across the disks. The pseudo-random distribution ensures that logically adjacent partitions are never striped sequentially across physically adjacent disks. Each disk has its data mirrored across all other disks, excluding the disks in the same module; each disk holds approximately 1% of the data of any other disk in the other modules.


Logical System Concepts Snapshots


A snapshot represents a point-in-time copy of a Volume
Snapshots are governed by almost all of the principles that apply to Volumes
Snapshots incorporate dependent relationships with their source volumes
Can be either logical volumes or other snapshots

A given partition of a primary volume and its snapshot are stored on the same disk
A write to this partition is redirected within the module, minimizing latency and utilization between modules
As updates occur to master volumes, the system's virtualized logical structure enables it to preserve the original point-in-time data
18

Snapshots
A snapshot represents a point-in-time copy of a Volume. Snapshots are governed by almost all of the principles that apply to Volumes. Unlike Volumes, snapshots incorporate dependent relationships with their source volumes, which can be either logical volumes or other snapshots. Because they are not independent entities, a given snapshot does not necessarily wholly consist of partitions that are unique to that snapshot. Conversely, a snapshot image will not share all of its partitions with its source volume if updates to the source occur after the snapshot was created.
Volumes and snapshots
Volumes and snapshots are mapped using the same distribution scheme. A given partition of a primary volume and its snapshot are stored on the same disk drive. As a result, a write to this partition is redirected within the module, minimizing the latency and utilization associated with additional interactions between modules. As updates occur to master volumes, the system's virtualized logical structure enables it to elegantly and efficiently preserve the original point-in-time data associated with any and all dependent snapshots by simply redirecting the update to a new physical location on the disk. This process, referred to as redirect on write, occurs transparently from the host perspective by virtue of the virtualized remapping of the updated data, and minimizes any performance impact associated with preserving snapshots, regardless of the number of snapshots defined for a given master volume.
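A minimal sketch of the redirect-on-write idea just described, assuming a toy partition table that maps logical partition numbers to physical locations (illustrative only, not the XIV implementation):

```python
class Volume:
    """Toy model: partition_map maps a logical partition number to a
    physical location (represented here by a plain integer id)."""

    def __init__(self, partition_map):
        self.partition_map = dict(partition_map)

    def snapshot(self):
        # Snapshot creation copies only pointers (metadata); no data moves.
        return Volume(self.partition_map)

    def write(self, logical_partition, allocate_physical):
        # Redirect on write: the master volume's pointer moves to a newly
        # allocated physical partition; existing snapshots keep the old one.
        self.partition_map[logical_partition] = allocate_physical()


free_ids = iter(range(1000, 2000))
allocate = lambda: next(free_ids)

vol = Volume({0: 10, 1: 11, 2: 12})
snap = vol.snapshot()                 # instantaneous, pointer copy only
vol.write(1, allocate)                # master now points at physical partition 1000
assert snap.partition_map[1] == 11    # snapshot still sees the original data
```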


Logical System Concepts Snapshots

Snapshot creation/deletion is instantaneous


Host Logical View / XIV Physical View
[Diagram: a host Volume and Snapshot mapped to 1MB chunks spread across the Data Modules]
As each server host writes data, it is placed randomly across the system in 1MB chunks
Each server has pointers in memory to the disks that hold the data
Restoring a Volume from a snapshot copy simply repoints the pointers to the original volume; it is a memory-only operation

Usable Storage Capacity


XIV Storage System reserves physical disk capacity for:
Global spare capacity
Metadata, including statistics and traces
Mirrored copies of data

The global reserved space includes sufficient space to withstand the failure of a full module in addition to three disks
The system reserves roughly 4% of physical capacity for statistics and traces, as well as the distribution and partition tables
Net usable capacity is reduced by a factor of 50% to account for data mirroring
Usable capacity = [1,000GB x 0.96 x (180 - (12 + 3))] / 2 ≈ 79,113GB
20

The XIV Storage System reserves physical disk capacity for:
Global spare capacity
Metadata, including statistics and traces
Mirrored copies of data
Global spare capacity
The dynamically balanced distribution of data across all physical resources by definition obviates the dedicated spare drives that are necessary with conventional RAID technologies. Instead, the XIV Storage System reserves capacity on each disk in order to provide adequate space for the redistribution and recreation of redundant data in the event of a hardware failure. The global reserved space includes sufficient space to withstand the failure of a full module in addition to three disks, enabling the system to execute a new Goal Distribution, discussed earlier, and return to full redundancy even after multiple hardware failures. Since the reserved spare capacity does not reside on dedicated disks, space for hot spares is reserved as a percentage of each individual drive's overall capacity.
Metadata and system reserve
The system reserves roughly 4% of physical capacity for statistics and traces, as well as the distribution and partition tables.
Net usable capacity
The calculation of the net usable capacity of the system consists of the total disk count, less the number of disks reserved for sparing, multiplied by the amount of capacity on each disk that is dedicated to data, and finally reduced by a factor of 50% to account for data mirroring.
Note: The calculation of the usable space is as follows:
Usable capacity = [drive space x (% utilized for data) x (total drives - hot spare reserve)] / 2
Usable capacity = [1,000GB x 0.96 x (180 - (12 + 3))] / 2 ≈ 79,113GB (decimal)
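The capacity arithmetic can be checked with a few lines. This is a sketch: the 0.96 factor is the approximate 4% reserve figure from the notes, which is why the result lands near, rather than exactly on, the quoted 79,113GB; the same spare reserve of one module plus three disks is assumed for the partial rack.

```python
def usable_capacity_gb(total_drives: int, drive_gb: int = 1000,
                       data_fraction: float = 0.96,
                       spare_drives: int = 12 + 3) -> float:
    """Usable capacity per the notes: drive size x data fraction x
    (drives minus hot-spare reserve), halved for mirroring."""
    return drive_gb * data_fraction * (total_drives - spare_drives) / 2

print(usable_capacity_gb(180))   # 79200.0, near the quoted ~79,113GB for a full rack
print(usable_capacity_gb(72))    # 27360.0, about 27TB for the 6-module partial rack
```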


Storage Pool Concepts


Storage Pools form the basis for controlling the usage of storage space
Manipulation of Storage Pools consists exclusively of metadata transactions
A logical volume is defined within the context of one and only one Storage Pool
A Consistency Group is a group of volumes that can be snapshotted at the same point in time
Storage Pool relationships:
A logical volume may have multiple independent snapshots. This logical volume is also known as a master volume
A master volume and all of its associated snapshots are always a part of only one Storage Pool
A volume may only be part of a single Consistency Group
All volumes of a Consistency Group must belong to the same Storage Pool
21

While the hardware resources within the XIV Storage System are virtualized in a global sense, the available capacity in the system can be administratively portioned into separate and independent Storage Pools. The concept of Storage Pools is purely administrative, in that they are not a layer of the functional hierarchical logical structure employed by the system operating environment. Instead, the flexibility of Storage Pool relationships from an administrative standpoint derives from the granular virtualization within the system. Essentially, Storage Pools function as a means to effectively manage a related group of logical volumes and their snapshots.
Improved management of storage space
Storage Pools form the basis for controlling the usage of storage space by specific applications, groups of applications, or departments, enabling isolated management of relationships within the associated group of logical volumes and snapshots while imposing a capacity quota. A logical volume is defined within the context of one and only one Storage Pool, and because a volume is equally distributed among all system disk resources, it follows that all Storage Pools also span all system resources. As a consequence of the system virtualization, there are no limitations on the size of Storage Pools or on the associations between logical volumes and Storage Pools. In fact, manipulation of Storage Pools consists exclusively of metadata transactions and does not impose any data copying from one disk or module to another. Hence, changes are completed instantly and without any system overhead or performance degradation.
Consistency Groups
A Consistency Group is a group of volumes that can be snapshotted at the same point in time, thus ensuring a consistent image of all volumes within the group at that time. The concept of a Consistency Group is ubiquitous among storage subsystems because there are many circumstances in which it is necessary to perform concurrent operations collectively across a set of volumes, so that the result of the operation preserves the consistency among volumes. For example, effective storage management activities for applications that span multiple volumes, or for creating point-in-time backups, would not be possible without first employing Consistency Groups.
Storage Pool relationships
Storage Pools facilitate administration of relationships between logical volumes, snapshots, and Consistency Groups. The following principles govern the relationships between logical entities within the Storage Pool:
A logical volume may have multiple independent snapshots. This logical volume is also known as a master volume.
A master volume and all of its associated snapshots are always a part of only one Storage Pool.
A volume may only be part of a single Consistency Group.
All volumes of a Consistency Group must belong to the same Storage Pool.


Storage Pool Concepts (Cont.)


Storage Pool size can vary from 17GB to full system capacity
Snapshot reserve capacity is defined within each regular Storage Pool and is maintained separately from logical volume capacity
Snapshots are structured as logical volumes; however, a Storage Pool's snapshot reserve capacity is granular at the partition level (1MB)
Snapshots will only be automatically deleted when there is inadequate physical capacity available in the Storage Pool
Space allocated for a Storage Pool can be dynamically changed
The designation of a Storage Pool as a regular pool or a thinly provisioned pool can be dynamically changed
The storage administrator can relocate logical volumes between Storage Pools without any limitations
22

Storage Pools have the following characteristics:
The size of a Storage Pool can range from as small as possible (17GB, the minimum size that can be assigned to a logical volume) to as large as possible (the entirety of the available space in the system) without any limitation imposed by the system (this is not true for hosts, however).
Snapshot reserve capacity is defined within each non-thinly provisioned, or regular, Storage Pool and is effectively maintained separately from logical, or master, volume capacity. The same principles apply for thinly provisioned Storage Pools, with the exception that space is not guaranteed to be available for snapshots due to the potential for hard space depletion.
Snapshots are structured in the same manner as logical volumes (also known as master volumes); however, a Storage Pool's snapshot reserve capacity is granular at the partition level (1MB). In effect, snapshots collectively can be thought of as being thinly provisioned within each increment of 17GB of capacity defined in the snapshot reserve space. As discussed in the example above, snapshots will only be automatically deleted when there is inadequate physical capacity available within the context of each Storage Pool independently. This process is managed by a snapshot deletion priority scheme. Therefore, when a Storage Pool's size is exhausted, only the snapshots that reside in the affected Storage Pool are deleted.
The space allocated for a Storage Pool can be dynamically changed by the storage administrator: The Storage Pool can always be increased in size, limited only by the unallocated space on the system. The Storage Pool can always be decreased in size, limited only by the space consumed by the volumes and snapshots defined within that Storage Pool.
The designation of a Storage Pool as a regular pool or a thinly provisioned pool can be dynamically changed, even for existing Storage Pools.
The storage administrator can relocate logical volumes between Storage Pools without any limitations, provided there is sufficient free space in the target Storage Pool. If necessary, the target Storage Pool capacity can be dynamically increased prior to volume relocation, assuming there is sufficient unallocated capacity available in the system. When a logical volume is relocated to a target Storage Pool, sufficient space must be available for all of its snapshots to reside in the target Storage Pool as well.
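A toy sketch of the pool-level rules above, under simplifying assumptions (sizes tracked in GB; relocation only checks free space for the volume plus its snapshots; the class and field names are illustrative, not XIV interfaces):

```python
class StoragePool:
    """Toy accounting model for a Storage Pool (illustrative only)."""

    def __init__(self, name, size_gb):
        self.name = name
        self.size_gb = size_gb
        self.volumes = {}          # volume name -> (volume GB, snapshots GB)

    def used_gb(self):
        return sum(v + s for v, s in self.volumes.values())

    def free_gb(self):
        return self.size_gb - self.used_gb()

    def resize(self, new_size_gb):
        # A pool can always grow; it can shrink only down to its consumed space.
        if new_size_gb < self.used_gb():
            raise ValueError("pool cannot shrink below consumed capacity")
        self.size_gb = new_size_gb

    def relocate_volume(self, name, target):
        # The volume and all of its snapshots must fit in the target pool.
        vol_gb, snap_gb = self.volumes[name]
        if target.free_gb() < vol_gb + snap_gb:
            raise ValueError("insufficient free space in target pool")
        target.volumes[name] = self.volumes.pop(name)


pool_a = StoragePool("ERP", size_gb=170)
pool_b = StoragePool("Test", size_gb=51)
pool_a.volumes["db01"] = (34, 17)       # 34GB volume with 17GB of snapshots
pool_a.relocate_volume("db01", pool_b)  # fits: 51GB free in the target pool
```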


Capacity Allocation and Thin Provisioning


Soft volume size
The size of the logical volume that is observed by the host
Soft volume size is specified in one of two ways, depending on units:
In terms of GB: the system allocates the soft volume size as the minimum number of discrete 17GB increments
In terms of blocks: the capacity is indicated as a discrete number of 512-byte blocks
The system still consumes capacity in 17GB increments; however, the precise size in blocks is reported to the host

Hard volume size


The physical space allocated to the volume following host writes to the volume
The upper limit is determined by the soft size assigned to the volume
Allocated to volumes by the system in increments of 17GB due to the underlying logical and physical architecture
Increasing the soft volume size does not affect the hard volume size
23

The XIV Storage System virtualization empowers storage administrators to thinly provision resources, vastly improving aggregate capacity utilization and greatly simplifying resource allocation. Thin provisioning is a central theme of the virtualized design of the system, because it uncouples the virtual, or apparent, allocation of a resource from the underlying hardware allocation.
Hard and soft volume sizes
The physical capacity assigned to traditional, or fat, volumes is equivalent to the logical capacity presented to hosts. With the XIV Storage System, this does not need to be the case. All logical volumes by definition have the potential to be thinly provisioned as a consequence of the XIV Storage System's virtualized architecture, and therefore provide the most efficient capacity utilization possible. For a given logical volume, there are effectively two associated sizes; the physical capacity allocated for the volume is not static, but increases as host writes fill the volume.
Soft volume size
This is the size of the logical volume that is observed by the host, as defined upon volume creation or as a result of a resizing command. The storage administrator specifies the soft volume size in the same manner regardless of whether the Storage Pool itself is thinly provisioned. The soft volume size is specified in one of two ways, depending on units:
1. In terms of GB: The system will allocate the soft volume size as the minimum number of discrete 17GB increments needed to meet the requested volume size.
2. In terms of blocks: The capacity is indicated as a discrete number of 512-byte blocks. The system will still allocate the soft volume size consumed within the Storage Pool as the minimum number of discrete 17GB increments needed to meet the requested size (specified in 512-byte blocks); however, the size that is reported to hosts is equivalent to the precise number of blocks defined.
Incidentally, the snapshot reserve capacity associated with each Storage Pool is a soft capacity limit and is specified by the storage administrator, though it effectively limits the hard capacity consumed collectively by snapshots as well.
Hard volume size
The volume's allocated hard space reflects the physical space allocated to the volume following host writes to the volume, and is discretely and dynamically provisioned by the system (not the storage administrator). The upper limit of this provisioning is determined by the soft size assigned to the volume. The volume's consumed hard space is not necessarily equal to the hard volume allocated capacity, because the hard space allocation occurs in increments of 17GB while actual space is consumed at the granularity of 1MB partitions. Therefore, the actual physical space consumed by a volume within a Storage Pool is transient, because a volume's consumed hard space reflects the total amount of data that has been previously written by host applications:
Hard capacity is allocated to volumes by the system in increments of 17GB due to the underlying logical and physical architecture; there is no greater degree of granularity than 17GB, even if only a few partitions are initially written beyond each 17GB boundary.
Application write access patterns determine the rate at which the allocated hard volume capacity is consumed, and subsequently the rate at which the system allocates additional increments of 17GB, up to the limit defined by the soft size for the volume.
As a result, the storage administrator has no direct control over the hard capacity allocated to the volume by the system at any given point in time. During volume creation, or when a volume has been formatted, there is zero physical capacity assigned to the volume. As application writes accumulate to new areas of the volume, the physical capacity allocated to the volume grows in increments of 17GB and may ultimately reach the full soft volume size. Increasing the soft volume size does not affect the hard volume size.
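A small sketch of the soft/hard bookkeeping just described, as an illustrative model only (the block size and 17GB increment are the values quoted above; the class and method names are made up):

```python
BLOCK = 512                      # bytes per block
INCREMENT = 17 * 10**9           # hard/soft allocation granularity, 17GB decimal

class ThinVolume:
    """Toy model: soft size fixed at creation, hard size grows on writes."""

    def __init__(self, soft_blocks):
        self.soft_bytes = soft_blocks * BLOCK          # exact size reported to host
        # Soft space consumed in the pool is rounded up to 17GB increments.
        self.soft_allocated = -(-self.soft_bytes // INCREMENT) * INCREMENT
        self.hard_allocated = 0                        # grows as writes arrive

    def write(self, offset_bytes, length_bytes):
        end = offset_bytes + length_bytes
        if end > self.soft_bytes:
            raise ValueError("write beyond reported volume size")
        # Hard capacity is granted in whole 17GB increments, never beyond soft size.
        needed = min(-(-end // INCREMENT) * INCREMENT, self.soft_allocated)
        self.hard_allocated = max(self.hard_allocated, needed)


vol = ThinVolume(soft_blocks=100 * 10**9 // BLOCK)     # roughly a 100GB volume
vol.write(0, 1 * 10**9)                                # first 1GB written
print(vol.hard_allocated // 10**9)                     # 17 -> one increment allocated
```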


Capacity Allocation and Thin Provisioning (Cont.)


[Diagram: logical view versus physical view of a thinly provisioned Storage Pool, showing Vol-1 (size = 10GB, block definition) and Vol-2 allocated soft space, the snapshot reserve, unused space, the pool allocated soft and hard space, and each volume's allocated and consumed hard space, all in 17GB increments]
Even for block-defined volumes, the system allocates capacity in increments of 17GB; the block definition allows hosts to see the precise number of blocks
When a new thinly provisioned pool is created, no physical space is actually allocated
The consumed hard space grows as host writes accumulate to new areas of the volume
The snapshot consumed hard space grows as snapshot writes accumulate to new areas within the allocated snapshot reserve soft space

24

Storage Pool level thin provisioning
While volumes are effectively thinly provisioned automatically by the system, Storage Pools can be defined by the storage administrator (when using the GUI) as either regular or thinly provisioned. Note that when using the XCLI, there is no specific parameter to indicate thin provisioning for a Storage Pool; you indirectly and implicitly create a Storage Pool as thinly provisioned by specifying a pool soft size greater than its hard size.
With a regular pool, the host-apparent capacity is guaranteed to be equal to the physical capacity reserved for the pool. The total physical capacity allocated to the constituent individual volumes and collective snapshots at any given time within a regular (non-thinly provisioned) pool will reflect the current usage by hosts, because the capacity is dynamically consumed as required. However, the remaining unallocated space within the pool remains reserved for the pool and cannot be used by other Storage Pools. Therefore, the pool will not achieve full utilization unless the constituent volumes are fully utilized, but conversely there is no chance of exceeding the physical capacity that is available within the pool, as is possible with a thinly provisioned pool.
In contrast, a thinly provisioned Storage Pool is not fully backed by hard capacity, meaning the entirety of the logical space within the pool cannot be physically provisioned unless the pool is first transformed into a regular pool. However, benefits may be realized when physical space consumption is less than the logical space assigned, because the amount of logical capacity assigned to the pool that is not covered by physical capacity is available for use by other Storage Pools.
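Following the XCLI note above, a pool's provisioning mode can be inferred from its two sizes; a trivial sketch (the function name is illustrative):

```python
def is_thinly_provisioned(pool_soft_gb: int, pool_hard_gb: int) -> bool:
    """A pool is implicitly thinly provisioned when its soft (logical) size
    exceeds its hard (physical) size, as described in the notes above."""
    return pool_soft_gb > pool_hard_gb

print(is_thinly_provisioned(pool_soft_gb=1020, pool_hard_gb=510))  # True
print(is_thinly_provisioned(pool_soft_gb=510, pool_hard_gb=510))   # False (regular pool)
```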


Session II: Gen II Systems Hardware Design and Layout


XIV Storage System Model 2810-A14
Full Rack Systems
Partially Populated Rack Systems

The Rack, ATS and UPS modules
Data Modules
Interface Module
SATA Disk Drives
The Patch Panel
Interconnection and Switches
Maintenance Module
25

This chapter describes the hardware architecture of the XIV Storage System. The physical structures that make up the XIV Storage System are presented, such as the system rack, Interface, Data and Management modules, disks, switches and power distribution devices.


XIV Storage System Model 2810-A14

26

The XIV Storage System seen in this slide is designed to be a scalable enterprise storage system based upon a grid array of hardware components. The architecture offers the highest performance through maximized utilization of all disks and a true distributed cache implementation, coupled with more effective bandwidth. It also offers superior reliability through its distributed architecture, redundant components, self-monitoring, and self-healing.


Full Rack Systems

27

Hardware characteristics
The IBM 2810-A14 is a new generation of IBM high-performance, high-availability, and high-capacity enterprise disk storage subsystem. This slide summarizes the main hardware characteristics.
All XIV hardware components come pre-installed in a standard APC AR3100 rack. At the bottom of the rack, a UPS module complex, made up of three redundant UPS units, is installed and provides power to the Data Modules, Interface Modules, and switches.
A fully populated rack contains 15 modules, of which 6 are combined Data and Interface Modules equipped with connectivity adapters (FC and Ethernet). Each module includes twelve 1TB SATA disk drives. This translates into a total raw capacity of 180TB for the complete system.
Two 48-port 1Gbps Ethernet switches form the basis of an internal redundant Gigabit Ethernet network that links all the modules in the system. The switches are installed in the middle of the rack, between the Interface Modules. The connections between the modules and switches, and also all internal power connections in the rack, are realized by a redundant set of cables. For power connections, standard power cables and plugs are used. Additionally, standard Ethernet cables are used for interconnection between the modules and switches.
All 15 modules (6 Interface Modules and 9 Data Modules) have redundant connections through the two 48-port 1Gbps Ethernet switches. This grid network ensures communication between all modules even if one of the switches or a cable connection fails. Furthermore, this grid network provides the capabilities for parallelism and execution of the data distribution algorithm that contribute to the excellent performance of the XIV Storage System.


Partially Populated Rack Systems


[Diagram: partially populated rack with 3 Interface Modules and 3 Data Modules, 72 disk drives, approximately 27TB usable capacity, and 48GB of memory]

28

Hardware characteristics
The IBM 2810-A14 Partially Populated Rack provides a solution for mid-size and large enterprises that need to begin working with XIV storage at a lower capacity. This slide summarizes the main hardware characteristics.
All XIV hardware components come pre-installed in a standard APC AR3100 rack. At the bottom of the rack, a UPS module complex, made up of three redundant UPS units, is installed and provides power to the Data Modules, Interface Modules, and switches.
A partially populated rack contains 6 modules, of which 3 are combined Data and Interface Modules equipped with connectivity adapters (FC and Ethernet). Each module includes twelve 1TB SATA disk drives. This translates into a total raw capacity of 72TB for the complete system.
Two 48-port 1Gbps Ethernet switches form the basis of an internal redundant Gigabit Ethernet network that links all the modules in the system. The switches are installed in the middle of the rack, between the Interface Modules. The connections between the modules and switches, and also all internal power connections in the rack, are realized by a redundant set of cables. For power connections, standard power cables and plugs are used. Additionally, standard Ethernet cables are used for interconnection between the modules and switches.
All 6 modules (3 Interface Modules and 3 Data Modules) have redundant connections through the two 48-port 1Gbps Ethernet switches. This grid network ensures communication between all modules even if one of the switches or a cable connection fails. Furthermore, this grid network provides the capabilities for parallelism and execution of the data distribution algorithm that contribute to the excellent performance of the XIV Storage System.


Partially Populated Rack Systems (Cont.)


Total Modules | Usable Capacity (TB) | Interface Modules | Data Modules | Disk Drives | Fibre Channel Ports | iSCSI Ports | Memory (GB) | Plant/Field Orderable
6  | 27 | 3 | 3 | 72  | 8  | 0 | 48  | Plant
9  | 43 | 6 | 3 | 108 | 16 | 4 | 72  | Field
10 | 50 | 6 | 4 | 120 | 16 | 4 | 80  | Field
11 | 54 | 6 | 5 | 132 | 20 | 6 | 88  | Field
12 | 61 | 6 | 6 | 144 | 20 | 6 | 96  | Field
13 | 66 | 6 | 7 | 156 | 24 | 6 | 104 | Field
14 | 73 | 6 | 8 | 168 | 24 | 6 | 112 | Field
15 | 79 | 6 | 9 | 180 | 24 | 6 | 120 | Both

29

Additional capacity configurations The XIV Storage System Model A14 is now available in a six module configuration consisting of three interface modules (feature number 1100) and three data modules (feature number 1105). This configuration is designed to support the same capabilities and functions as the current 15 module XIV Storage System with the IBM XIV Storage System Software V10. It has all of the same auxiliary components and ships in the same physical rack as the 15 module system. The six module configuration is field-upgradeable with additional interface modules and data modules to achieve configurations with a total of nine, ten, eleven, twelve, thirteen, fourteen, or fifteen modules. The resulting configuration can subsequently continue to be upgraded with one or more additional modules, up to the maximum of fifteen modules.


The Rack, ATS and UPS modules


In case of extended external power failure or outage, the UPS module complex maintains battery power long enough to allow a safe and ordered shutdown

The Automatic Transfer System (ATS) supplies power to all three UPSs and Maintenance module

30

The rack
The IBM XIV hardware components are installed in a 19-inch NetShelter SX 42U rack (APC AR3100) from APC. The rack is 1070mm deep to accommodate the deeper modules and to provide more space for cables and connectors. Adequate space is provided to house all components and to properly route all cables. The rack door and side panels are locked with a key to prevent unauthorized access to the installed components.
The UPS module complex
The Uninterruptible Power Supply (UPS) module complex consists of three UPS units. Each unit maintains an internal power supply in the event of a temporary failure of the external power supply. In case of an extended external power failure or outage, the UPS module complex maintains battery power long enough to allow a safe and ordered shutdown of the XIV Storage System. The complex can sustain the failure of one UPS unit while protecting against external power disturbances. The three UPS modules are located at the bottom of the rack. Each of the modules has an output of 6 kVA to supply power to all other components in the rack and is 3U in height.
The design allows proactive detection of temporary power problems and can correct them before the system goes down. In the case of a complete power outage, integrated batteries continue to supply power to the entire system. Depending on the load of the IBM XIV, the batteries are designed to continue system operation for 3.3 to 11.9 minutes; this gives enough time to gracefully power off the system.
Automatic Transfer System (ATS)
The Automatic Transfer System (ATS) supplies power to all three Uninterruptible Power Supplies (UPS) and to the Maintenance Module. Two separate external main power sources supply power to the ATS. In case of power problems or a failing UPS, the ATS rebalances the power load between the power components; the operational components take over the load from the failing power source or power supply. This rearrangement of the internal power load is performed by the ATS in a seamless way, and system operation continues without any application impact.


The UPS Behavior


All components' power connections in the box are distributed across the 3 UPSs
All three UPSs run self-test procedures to validate the batteries' operational state
Self-test schedule is cycled on the UPSs with a 5-day interval between them (marked WRONG on the slide; see the corrected cycle below)
An operational system is one where at least two UPSs are running on utility power and with at least a 70% battery charge level
A single UPS failure will not impact power distribution
If two or more UPSs are in a failed state, the system will wait for a 30-second grace period before initiating a graceful system shutdown
If one UPS is in a failed state, the next self-test instance will be skipped to avoid the chance of a second UPS failure
UPS self-test procedures are controlled by the system microcode
The self-test cycle is once every 14 days, with a 9-hour interval between each UPS
31

The XIV Storage System uses its memory DIMMs for cache purposes. When a system has a problem with power distribution, it is imperative that a fail-safe power distribution scheme is in place to avoid data loss.
Power distribution rules
All system components use power connections that are distributed evenly across the 3 UPSs. A fully operational system is one where at least two UPSs are running on utility power and with at least a 70% battery charge level. In order to allow the system a graceful shutdown (the process in which the system commits all remaining I/Os in cache to disk and properly shuts down all system components), the XIV system needs at least two UPSs with a minimum 70% charge level.
In the case of a single UPS failure, whether from a self-test failure or from a physical problem with the UPS itself, power distribution to the system is not impacted and the system continues to function normally. If two or more UPSs are in a failed state, the system waits for an additional 30-second grace period before determining that the system is indeed experiencing a major problem (to avoid taking the system down in cases of short power spikes) and then issues a graceful shutdown. If one UPS is in a failed state, for whatever reason, the next UPS self-test instance is skipped to avoid the chance of a second UPS failure, which would cause the box to issue a graceful shutdown.
The UPS self-test procedures are controlled by the system microcode and can be configured, if needed, using developer-level commands. The default self-test interval is 5 days between each UPS.
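A simplified sketch of the shutdown policy described above (illustrative only; the thresholds are the ones quoted in the notes):

```python
def power_state_ok(ups_on_utility: int, min_charge_pct: float) -> bool:
    """Fully operational: at least two UPSs on utility power, each with
    at least a 70% battery charge (per the notes above)."""
    return ups_on_utility >= 2 and min_charge_pct >= 70.0


def should_shut_down(failed_ups: int, grace_elapsed_s: float) -> bool:
    """Two or more failed UPSs trigger a graceful shutdown, but only after
    a 30-second grace period to ride through short power spikes."""
    return failed_ups >= 2 and grace_elapsed_s >= 30.0


def skip_next_self_test(failed_ups: int) -> bool:
    """With one UPS already failed, the next self-test is skipped so a
    test-induced second failure cannot force a shutdown."""
    return failed_ups == 1


print(power_state_ok(ups_on_utility=3, min_charge_pct=85))   # True
print(should_shut_down(failed_ups=2, grace_elapsed_s=31))    # True
print(skip_next_self_test(failed_ups=1))                     # True
```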


Data Module
System fans x 10

Motherboard

SAS Expander Card

CF Card
System PSUs x 2
PCI Slots (ETH)

32

The hardware of the Interface Modules and the Data Modules is a Xyratex 1235E-X1. The module is 87.9 mm (2U) tall, 483 mm wide, and 707 mm deep. The weight depends on configuration and type (Data Module or Interface Module) and is a maximum of 30 kg. The fully populated rack hosts 9 Data Modules (Modules 1-3 and Modules 10-15). There is no difference in the hardware between Data Modules and Interface Modules, except for the additional host adapters and GigE adapters in the Interface Modules.
The main components of the module, besides the 12 disk drives, are:
System Planar
Processor
Memory / Cache
Enclosure Management Card
Cooling devices (fans)
Memory Flash Card
Redundant Power Supplies
In addition, each Data Module contains four redundant Gigabit Ethernet ports. These ports, together with the two switches, form the internal network, which is the communication path for data and metadata between all modules. One dual GigE adapter is integrated on the System Planar (ports 1 and 2); the remaining two ports (3 and 4) are on an additional dual GigE adapter installed in a PCIe slot.


Data Module (Cont.)

33

Back view picture of a Data module and the CF card with the Addonics adapter.


Data Module (Cont.)


The same system planar with a built-in SAS controller is used in both Data and Interface modules
Each module has 1 Intel Xeon Quad Core CPU
8GB of fully buffered DIMM memory modules
10 fans for cooling of disks, CPU, and board
An enclosure management card to issue alarms in case of problems with the module
1GB Compact Flash card
This card is the boot device of the module and contains the software and module configuration files
Due to the configuration files, the Compact Flash card is not interchangeable between modules

34

System Planar
The System Planar used in the Data Modules and the Interface Modules is a standard ATX board from Intel. This high-performance server board with a built-in SAS adapter supports:
64-bit quad-core Intel Xeon processor to improve performance and headroom, and provide scalability and system redundancy with multiple virtual applications
Eight fully buffered 533/667 MHz DIMMs to increase capacity and performance
Dual Gb Ethernet with Intel I/O Acceleration Technology to improve application and network responsiveness by moving data to and from applications faster
Four PCI Express slots to provide the I/O bandwidth needed by servers
SAS adapter
Processor
The processor is a Xeon Quad Core processor. This 64-bit processor has the following characteristics:
2.33 GHz clock
12 MB cache
1.33 GHz Front Side Bus

Memory / Cache
Every module has 8 GB of memory installed (8 x 1GB FBDIMM). Fully Buffered DIMM memory technology increases the reliability, speed, and density of memory for use with Xeon Quad Core processor platforms. This processor and memory configuration can provide up to 3 times higher memory throughput, enable increased capacity and speed to balance the capabilities of quad-core processors, perform reads and writes simultaneously, and eliminate the previous read-to-write blocking latency. Part of the memory is used as module system memory, while the rest is used as cache memory for caching previously read data, pre-fetching data from disk, and delayed destaging of previously written data.


Interface Module
System fans x 10

Motherboard

SAS Expander Card

CF Card
System PSUs x 2
PCI Slots (FC & ETH)

35

Interface Module (Cont.)

36

Interface Module
The Interface Module is similar to the Data Module. The only differences are:
Each Interface Module contains iSCSI and Fibre Channel ports, through which hosts can attach to the XIV Storage System. These ports can also be used to establish Remote Mirror links with another, remote XIV Storage System.
There are two 4-port GigE PCIe adapters installed for additional internal network connections and also for the iSCSI ports.
There are six Interface Modules (modules 4-9) available in the rack. All Fibre Channel ports, iSCSI ports, and Ethernet ports used for external connections are internally connected to a patch panel where the external cables are actually hooked up.
Fibre Channel connectivity
There are 4 FC ports (two 2-port adapters) available in each Interface Module, for a total of 24 FCP ports. They support 4 Gbps (gigabits per second) full-duplex data transfer over shortwave fibre links, using 50 micron multimode cable. The cable needs to be terminated on one end by a Lucent Connector (LC). In each module the ports are allocated as follows:
Ports 1 and 2 are allocated for host connectivity
Ports 3 and 4 are allocated for remote connectivity
4Gb FC PCI Express adapter
Fibre Channel connections to the Interface Modules are realized by two 2-port 4Gb FC PCI Express adapters per Interface Module, from LSI Corporation, for faster connectivity and improved data protection. This Fibre Channel host bus adapter (HBA) is based on LSI's FC949E controller and features full-duplex capable FC ports that automatically detect connection speed and can each independently operate at 1, 2, or 4Gbps. The ability to operate at slower speeds ensures that these adapters remain fully compatible with legacy equipment. New end-to-end error detection (CRC) for improved data integrity during reads and writes is also supported.

36

Interface Module (Cont.)

37

iSCSI connectivity
There are six iSCSI service ports (two per Interface Module) available for iSCSI over IP/Ethernet services. These ports are located in Interface Modules 7, 8 and 9 and support 1 Gbps Ethernet host connections. They should be connected, through the patch panel, to the user's IP network and provide connectivity to the iSCSI hosts. iSCSI connections can be operated with different functionalities:
As an iSCSI target: serving hosts through the iSCSI protocol
As an iSCSI initiator for remote mirroring, when connected to another iSCSI port
As an iSCSI initiator for data migration, when connected to a third-party iSCSI storage system
For CLI and GUI access over the iSCSI ports
iSCSI ports can be defined for different uses:
Each iSCSI port can be defined as an IP interface
Groups of Ethernet iSCSI ports on the same module can be defined as a single link aggregation group (IEEE standard 802.3ad)
Ports defined as a link aggregation group must be connected to the same Ethernet switch, and a parallel link aggregation group must be defined on that Ethernet switch. Although a single port is defined as a link aggregation group of one, IBM XIV support can override this configuration if such a setup is not operable with the customer's Ethernet switches.
For each iSCSI IP interface the following configuration options are definable (see the sketch below):
IP address (mandatory)
Network mask (mandatory)
Default gateway (optional)
MTU (default: 1,536; maximum: 8,192)
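As a minimal sketch of how these options map onto an XCLI command (the ipinterface_create command name, its parameter names, and the component-id syntax are assumptions to be verified against the XCLI Reference Guide; the addresses and interface name are purely illustrative):

  xcli -c xiv01 ipinterface_create ipinterface=itf_m7_p1 address=10.0.0.71 netmask=255.255.255.0 gateway=10.0.0.1 mtu=1536 module=1:Module:7 ports=1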

37

SATA Disk Drives

38

The SATA disk drives used in the IBM XIV are 1 TB, 7200 rpm hard drives designed for high-capacity storage in enterprise environments. All IBM XIV disks are installed in the front of the modules, twelve disks per module. Each SATA disk is installed in a disk tray, which connects the disk to the backplane and carries the disk indicators on the front. If a disk fails, it can easily be replaced from the front of the rack. The complete disk tray is a single FRU, held in position by a mechanical latching handle.
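As a simple capacity check (assuming a fully populated 15-module rack), this gives 15 x 12 = 180 drives, or 180 TB of raw capacity before data distribution, sparing and metadata overhead are taken into account.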

38

SATA Disk Drives (Cont.)


Performance features
3 Gb/s SAS interface supporting key features in the SATA specification
32 MB cache buffer for enhanced data transfer performance
Rotation Vibration Safeguard (RVS) prevents performance degradation

Reliability features
Advanced magnetic recording heads and media
Self-Protection Throttling (SPT) monitors I/O
Thermal Fly-height Control (TFC) provides a better soft error rate
Fluid Dynamic Bearing (FDB) motor improves acoustics and positional accuracy
R/W heads are placed on the load/unload ramp to protect user data when power is removed
39

The IBM XIV was engineered with substantial protection against data corruption and data loss, and does not rely solely on its sophisticated distribution and reconstruction methods. Several features implemented in the disk drive itself also increase reliability and performance. The highlights are:
Performance features and benefits
SAS interface: The disk drive features a 3 Gb/s SAS interface supporting key features of the Serial ATA specification, including NCQ (Native Command Queuing), staggered spin-up and hot-swap capability.
32 MB cache buffer: The internal 32 MB cache buffer enhances data transfer performance.
Rotation Vibration Safeguard (RVS): In multi-drive environments, rotational vibration, which results from the vibration of neighboring drives in a system, can degrade hard drive performance. To maintain high performance, the disk drive incorporates industry-leading enhanced Rotation Vibration Safeguard (RVS) technology, providing up to 50% improvement over the previous generation against performance degradation.
Reliability features and benefits
Advanced magnetic recording heads and media: Excellent soft error rate for improved reliability and performance.
Self-Protection Throttling (SPT): SPT monitors and manages I/O to maximize reliability and performance.
Thermal Fly-height Control (TFC): TFC provides a better soft error rate for improved reliability and performance.
Fluid Dynamic Bearing (FDB) motor: The FDB motor improves acoustics and positional accuracy.
Load/unload ramp: The R/W heads are parked outside the data area to protect user data when power is removed.

39

The Patch Panel


4 FC ports on each Interface Module (4, 5, 6, 7, 8, 9)
2 iSCSI ports on each Interface Module (7, 8, 9)
3 connections for GUI and/or XCLI from the customer network (4, 5, 6)
2 ports for VPN connectivity (4, 6)
2 service ports (4, 5)
1 maintenance module connection
2 reserved ports

40

The patch panel is located at the rear of the rack. Interface Modules are connected to the patch panel using 50 micron cables. All external connections should be made through the patch panel. In addition to the host and network connections, further ports are available on the patch panel for service connections.

40

Interconnection and Switches


Internal module communication is based on two redundant 48-port Gigabit Ethernet switches
The switches are linked to each other by interswitch connections

The switches use an RPS unit to eliminate the switch power supply as a single point of failure

41

Internal Ethernet switches
The internal network is based on two redundant 48-port Gigabit Ethernet switches (Dell PowerConnect 6248). Each module (Data or Interface) is directly attached to each of the switches with multiple connections, and the switches are also linked to each other. This network topology enables maximal bandwidth utilization, since the switches are used in an active-active configuration, while remaining tolerant to any individual failure in network components such as a port, link, or switch. If one switch fails, the bandwidth of the remaining connections is sufficient to prevent a noticeable performance impact and still keeps enough parallelism in the system. The Dell PowerConnect 6248 is a Gigabit Ethernet Layer 3 switch with 48 copper ports and 4 combo ports (SFP or 10/100/1000), robust stacking and 10-Gigabit Ethernet uplink capability. The switches are powered by Dell RPS-600 redundant power supplies to eliminate the switch power supply as a single point of failure.

41

Interconnection and Switches (Cont.)

42

Module - USB to serial
The Module USB-to-serial connections are used by internal processes to keep communication with the modules alive in the event that the network connection is not operational. Modules are linked together with these USB-to-serial cables in groups of three. This emergency link is needed for inter-module communication by internal processes and is used by maintenance only to repair internal network communication issues. The USB-to-serial connection always links a group of three modules:
USB of Module 1 is connected to Serial of Module 3
USB of Module 3 is connected to Serial of Module 2
USB of Module 2 is connected to Serial of Module 1
This connection sequence is repeated for modules 4-6, 7-9, 10-12, and 13-15 (see the expanded mapping below).
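Applying the stated sequence to the remaining groups gives the following full cabling map (derived from the pattern above rather than from a cabling chart, so treat it as illustrative):

  Group 1-3:   USB 1 -> Serial 3,   USB 3 -> Serial 2,   USB 2 -> Serial 1
  Group 4-6:   USB 4 -> Serial 6,   USB 6 -> Serial 5,   USB 5 -> Serial 4
  Group 7-9:   USB 7 -> Serial 9,   USB 9 -> Serial 8,   USB 8 -> Serial 7
  Group 10-12: USB 10 -> Serial 12, USB 12 -> Serial 11, USB 11 -> Serial 10
  Group 13-15: USB 13 -> Serial 15, USB 15 -> Serial 14, USB 14 -> Serial 13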

42

Maintenance Module
Used by IBM XIV support to maintain and repair the system. When needed, XIV support can connect remotely
Through a modem connection attached to the maintenance module

The maintenance module is a generic 1U server. It is powered directly through the ATS
This is the only component in the system that is not redundant

The maintenance module is not part of the XIV storage architecture


If it is down, it does not affect the system

The maintenance module is connected through Ethernet connections to modules 5 and 6


43

The Maintenance Module and the modem, installed in the middle of the rack, are used by IBM XIV Support and the SSR/CE to maintain and repair the machine. When there is a software or hardware problem that needs the attention of the IBM XIV Support Center, a remote connection is required and used to analyze and possibly repair the faulty system. The connection can be established either via a VPN (virtual private network) broadband connection or via a phone line and modem.
Modem: The modem installed in the rack is needed and used for remote support. It enables the IBM XIV Support Center specialists and, if necessary, higher levels of support to connect to the XIV Storage System. Problem analysis and repair actions without a remote connection are complicated and time consuming.
Maintenance Module: A 1U remote support server is also required for full functionality and supportability of the IBM XIV. This device has fairly generic requirements, as it is only used by support personnel to gain remote access to the system via VPN or modem. The current choice for this device is a SuperMicro 1U server with an average commodity-level configuration.

43

Session III: XIV Software Framework


Basic Terminology
Communication Infrastructure
Single Module Frameworks
System Nodes
File Systems on the Module

44

Basic Terminology
Module - a physical component
Regular module (contains disks)
Interface module (disks and also SCSI interfaces)
Power module
Switching module

Node - an XIV software component that runs on several modules
Singleton Node - an XIV software component that at any given time runs on a single module

45

Basic Terminology
There are several types of components that make up the XIV Storage System:
Module - the physical components that are used to build the system:
Regular module - contains only disks.
Interface module - contains disks and host connectivity interfaces for FC and iSCSI.
Power module - the UPSs of the system.
Switching module - the switches used to interconnect all the different modules.
Node - a part of the XIV software that runs on several modules.
Singleton Node - a part of the XIV software that at any given time runs on only a single module.

45

Communication infrastructure
NetPatrol
Guarantees network connectivity between every two modules

MCL
Provides a transactional layer between any two nodes
Each node has a unique ID, through which it resolves to the node type and the module the node runs on

RPC
Exported MicroCode functions are called via RPC
Transported over MCL

XIV Configuration
Each module holds a copy of the XIV system configuration with the current status of all XIV modules and nodes
Transported over MCL
46

Communication Infrastructure
The XIV Storage architecture includes several communication components that allow the system to provide its capabilities and maintain reliability.
NetPatrol guarantees network connectivity between every two modules in the system.
MCL - Management Control Layer
Each node has a unique identifier; the ID resolves to the type of the node and the module the node runs on.
Each process may have several MCL queues to send/receive transactions on its own.
Handles timeouts and retries.
Resolves singleton roles to node IDs and is aware of singleton elections.
Uses a text-based, forward-compatible protocol.
RPC - Remote Procedure Call
Any exported MicroCode function may be called via RPC; there are no limitations.
Marshalling/unmarshalling is generic and relies on auto-generated information from an XML file.
Supports both sync/async client and server calls.
Marshalls to a compact binary form and a forward-compatible XML form.
Easy to migrate to transactional transports: MCL, SCSI.
XIV Configuration
Implemented over MCL.
The XIV Configuration is loaded into the memory of each Data and Interface Module.
It holds the current status of all modules and nodes in the system.
Each change in the status triggers a set of operations to be handled by the system to maintain its full redundancy and proper operational state.

46

Single Module Frameworks - Basic features


Modules are symmetric and have exactly the same data
All configuration is saved in a single XML file
The only difference between module file systems is the module_id file
On replacement modules, the ID is assigned during the component testing phase, prior to moving them to the operational state

Tight integration with cluster hardware


Firmware management
Hardware configuration
Hardware monitoring

47

Single Module Frameworks
On the XIV Storage System all modules are symmetric and hold exactly the same data. All the configuration data of a module is saved in a single XML file. Each module has a unique ID, which helps the system define the module's purpose and the services that should run on it; the only file that differs between modules is the module_id file. In case of a module replacement, the new module's FRU ID is zeroed out and is assigned during the component test phase. The MicroCode maximizes the use of module capabilities to achieve a high level of availability and reliability. The MicroCode is tightly integrated with:
Firmware management
Hardware configuration
Hardware monitoring

47

System Nodes
Platform Node all modules (process: platform_node)
The Platform Node manages:
Installation/upgrade of module software
Hardware configuration
Running services and nodes, and handling keepalive messages
Auto-generating service-specific configuration files
Sending heartbeats to the Management Node
Handling configuration changes for the module

Interface Node modules 4-9 (process: i_node)


Implements the necessary protocols to serve as a SCSI target for the FC and iSCSI transport

48

DESCRIPTION OF ALL NODES
This section covers all Nodes in the system:
Platform Node (process: platform_node)
The Platform Node runs on all Modules and manages the software and hardware of a Module. The Platform Node manages installation/upgrade of Module software, configures all configurable hardware (WWNs, IP addresses, etc.), runs all Services and Nodes upon startup, auto-generates Service-specific and UNIX configuration files (/etc/ssh/sshd_config or /etc/hosts, for example), handles the keepalive messages of all Nodes on a Module, sends heartbeats to the Management Node and handles Configuration changes for that Module. The Platform Node is normally the only process executed by xinit upon normal startup (it is hardcoded in xinit), and it spawns all Services and Nodes in accordance with the Configuration for the Module it is running on.
Interface Node (process: i_node)
The Interface Node implements the necessary protocols to serve as a SCSI Target for the FC and iSCSI transports. For iSCSI communication it relies on an external process called iscsi_host_session to set up iSCSI sessions.

48

System Nodes (Cont.)


Cache Node all modules (process: cte)
The storage backend of the XIV Storage System
Each is a holder of partitions against which I/Os are performed

Gateway Node modules 4-9 (process: gw_node)


In charge of serving as the SCSI initiator for XDRP mirroring and data migration

Admin Node modules 4-9 (process: aserver)


Listens on port 7777 (using STunnel from 7778) and receives XML describing commands, then passes them to the Administrator
The Administrator parses and validates the XMLs and translates them into an XIV RPC call to be executed by the Management Node

49

Cache Node (process: cte)
The Cache Node runs on all Modules with storage disks (at the time of this writing, all Gen2 Modules). It is the 'storage backend' of the XIV storage array. Each Cache Node is the primary or secondary holder of zero or more data chunks called Partitions, against which I/Os are performed. A Cache Node services reads/writes from/to Partitions and decides which Partitions to keep in memory.
Gateway Node (process: gw_node)
The Gateway Node runs on all Modules with an external data port (iSCSI/FC) and is in charge of serving as the SCSI Initiator for XDRP mirroring and Data Migration. Amongst other things, the Gateway Node writes blocks to the Target of an XDRP Volume for which a Secondary Volume is defined, reads new blocks from data-migrated Volumes, and recovers bad blocks from the Secondary Volume of a Primary Volume for which XDRP is defined and a Media Error has occurred.
Admin Node (process: aserver)
The Admin Node runs on all Modules with an external management port and is in charge of processing and executing XCLI commands (the exterior API of the system). The Admin Node listens on port 7777 and uses STunnel to redirect commands coming from 7778. It receives XML describing commands over TCP and passes them to what is called "the Administrator". The Administrator (administrator.py) is a passive node written in Python which parses and validates the XMLs and translates them into an XIV RPC call to be executed by the Management Node (formerly called the Manager). The Admin Node spawns Administrators as necessary, and it may kill old ones and respawn new ones.

49

System Nodes (Cont.)


Management Node singleton 1|2|3 (process: manager)
In charge of managing system data redundancy by manipulating data rebuilds and distributions
Operation state changes (On, Maintenance, Shutting down)
Processing XCLI commands as they are received from the Administrator in the form of an XIV RPC call

Cluster Node singleton 1|2|3 (process: cluster_hw)


In charge of managing hardware which doesn't belong to any particular module (UPS, switch)

Event Node singleton 1|2|3 (process: event_node)


Processes event rules and acts upon need (SMTP, SNMP, SMS)
Adds newly created events to the relevant part of the configuration
50

Management Node (process: manager)
The Management Node is a Singleton whose primary responsibility is managing system data redundancy by means of manipulating the Distribution. In simpler terms, the Management Node decides which Partition should reside on which Cache Nodes and manages the Rebuild and Redistribution processes. In addition, the Management Node is in charge of Operation State Changes (shutting down, Maintenance Mode, etc.) and of processing XCLI commands as they are received from the Administrator in the form of an XIV RPC call.
Cluster Node (process: cluster_hw)
The Cluster Node is a Singleton Node in charge of managing hardware which doesn't belong to any particular Module. For example, the Cluster Node monitors the UPSs and switches, updating their status in the Configuration as necessary.
Event Node (process: event_node)
The Event Node is a Singleton Node which runs on a Module with an external management port. Its first duty is to process Event rules and act upon them for every Event that is created (for example, rules may dictate sending an SMTP email or an SNMP trap). In addition, it adds newly created Events to the relevant part of the Configuration, so that Event Saver Nodes (see below) will store them.

50

System Nodes (Cont.)


SCSI Reservation Node - singleton 1|2|3 (process: isync_node)
Receives SCSI II & III commands (Reserve, Release, Register) for fast processing by the system
A SCSI Reservation table is maintained on each Interface Node (for redundancy), based on updates coming from the isync_node

Equip Master Node all modules (process: equip_master)


Lets foreign modules that are currently being equipped (test phase) download the XIV software

Event Saver Node all modules (process: event_saver)


Receives all events created on the system and saves them to the VISDOR virtual disk

HW Monitoring Node all modules (process: hw_mon_node)


Monitors all available hardware on the module
51

SCSI Reservation Node (process: isync_node)
The SCSI Reservation Node is a Singleton Node that handles incoming SCSI II and III commands from the various hosts (Reserve, Release, Register). All incoming commands are routed through the i_nodes to the isync_node, which then updates all the i_nodes on its decision. All the i_nodes hold a copy of the SCSI Reservation table for redundancy.
Equip Master Node (process: equip_master)
The Equip Master Node runs on all Modules and lets foreign Modules which are currently being equipped (during the 'test' phase) download XIV software from the Module the Equip Master Node is currently running on. It is a bit like an XIV RPC based file server.
Event Saver Node (process: event_saver)
The Event Saver Node runs on all Modules with storage disks (at the time of this writing, all Gen2 Modules). It receives all Events created on the system and saves them to permanent media (at the time of this writing, to a partition of the VISDOR virtual disk).
Hardware Monitor Node (process: hw_mon_node)
The Hardware Monitor runs on all Modules and checks all hardware in the module that the system knows how to monitor. Monitored hardware includes the HBAs (if any), disks, enclosure (PSUs, fans, IPMI module, heat levels), Ethernet interfaces and the SAS controller. The Hardware Monitor does not configure hardware or act upon its findings; that is the job of other Nodes.

51

File Systems on the Module


The XIV Storage System is based on the IBM-MCP Linux distribution
Configuration for traditional Unix binaries is generated automatically
The Compact Flash file systems are mounted as read-only
Persistent data is stored on a special logical volume (VISDOR), triple-mirrored on the HDDs and taking 2.5% of the system
VISDOR holds event logs and traces for the system
ISV holds statistics data, for Interface Modules only

52

XIV File Systems
The Gen 2 XIV Storage System is based on the IBM-MCP Linux distribution. All configurations for traditional Unix binaries are generated automatically. Inside each module there is a PCI Addonics card that hosts a CF card. That CF card holds the basic MCP file system as read-only. To accommodate some Unix-related metadata, the system creates a small RAM-based file system. Persistent data, such as event logs and traces, is stored on a special logical volume called VISDOR. VISDOR is triple-mirrored across all the disk drives in the system and takes about 2.5% of the overall system capacity (see the illustrative calculation below). VISDOR is usually mounted over /dev/sdb. Everything under /local resides in the VISDOR. ISV is an XIV hidden volume that is mapped to Interface Modules only, usually mounted as /dev/sdc. It can hold approximately a year's worth of statistics for general data and a month's worth when sampling a specific Host/Volume.
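As a rough sense of scale (assuming a fully populated rack of 15 modules x 12 drives x 1 TB = 180 TB raw), reserving 2.5% for VISDOR corresponds to roughly 4.5 TB of triple-mirrored space; the exact figure depends on the configuration and is given here only as an illustration.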

52

Session IV: XIV Systems Management


Managing XIV Systems
XIV Management Architecture
The XIV Graphic User Interface (GUI)
Managing Multiple Systems
The XIV Command Line Interface (XCLI)
Session XCLI
Lab 1: Working with the XIV Management Tools

53

Managing XIV Systems


XIV Systems management can be done through both the GUI and CLI commands
The XIV Storage Manager can be installed on:
Microsoft Windows - GUI and CLI
MacOS SystemX - GUI and CLI
Linux - GUI and CLI
SUN Solaris - CLI
IBM AIX - CLI
HP/UX - CLI

More info on XIV Storage Management


http://www.ibm.com/systems/storage/disk/xiv/index.html

54

The XIV Storage System software supports the functions of the XIV Storage System and provides the functional capabilities of the system. It is preloaded on each module (Data and Interface Modules) within the XIV Storage System. The functions and nature of this software are equivalent to what is usually referred to as microcode or firmware on other storage systems. The XIV Storage Management software is used to communicate with the XIV Storage System software, which in turn interacts with the XIV Storage hardware. The XIV Storage Manager can be installed on a Linux, SUN Solaris, IBM AIX, HP/UX, Microsoft Windows or MacOS based management workstation, which then acts as the management console for the XIV Storage System. The Storage Manager software is provided at time of installation, or optionally downloadable from the following web site: http://www.ibm.com/systems/storage/disk/xiv/index.html
For detailed information about XIV Storage Management software compatibility, refer to the XIV interoperability matrix or the System Storage Interoperability Center (SSIC) at: http://www.ibm.com/systems/support/storage/config/ssic/index.jsp
The IBM XIV Storage Manager includes a user-friendly and intuitive Graphical User Interface (GUI) application, as well as an Extended Command Line Interface (XCLI) component offering a comprehensive set of commands to configure and monitor the system.
Graphical User Interface (GUI): A simple and intuitive GUI allows a user to perform most administrative and technical operations (depending upon the user role) quickly and easily, with minimal training and knowledge. The main motivation behind the XIV management and GUI design is the desire to keep the complexities of the system and its internal workings completely hidden from the user. The most important operational tasks, such as overall configuration changes, volume creation or deletion, snapshot definitions, and many more, are achieved with a few clicks. This chapter contains descriptions and illustrations of tasks performed by a storage administrator when using the XIV graphical user interface (GUI) to interact with the system.
Extended Command Line Interface (XCLI): The XIV Extended Command Line Interface (XCLI) is a powerful text-based command line tool that enables an administrator to issue simple commands to configure, manage or maintain the system, including the definitions required to connect with hosts and applications. The XCLI can be used in a shell environment to interactively configure the system, or as part of a script to perform lengthy and more complex tasks.

54

XIV Management Architecture


Management workstation is used to:
Execute commands through the XCLI interface
Control the XIV Storage System through the GUI
Configure email notification messages and SNMP traps upon occurrence of specific events or alerts

System management is achieved through 3 different IP addresses
Both GUI and XCLI management run over TCP port 7778, with all traffic encrypted using SSL & TLS

55

The basic storage system configuration sequence followed in this chapter goes from the initial installation steps, through disk space definition and management, up to allocating or mapping this usable capacity to application hosts. Additional configuration or advanced management tasks are cross-referenced to the specific training parts where they are discussed in more detail. After the installation and customization of the XIV Management Software on a management workstation, a physical Ethernet connection must be established to the XIV Storage System itself. The management workstation is used to:
Execute commands through the XCLI interface
Control the XIV Storage System through the GUI
Send e-mail notification messages and SNMP traps upon occurrence of specific events or alerts
To ensure management redundancy in case of an Interface Module failure, the XIV Storage System management functionality is accessible from three different IP addresses. Each of the three IP addresses is linked to a different (hardware) Interface Module. The various IP addresses are transparent to the user, and management functions can be performed through any of them. These addresses can also be used simultaneously for access by multiple management clients. Users only need to configure the GUI or CLI with the set of IP addresses that are defined for the specific system. The XIV Storage System management connectivity allows users to manage the system from both the CLI and GUI interfaces. Accordingly, the CLI and GUI can be configured to manage the system through IP interfaces. Both CLI and GUI management run over TCP port 7778, with all traffic encrypted through the Secure Sockets Layer (SSLv3) or Transport Layer Security (TLS 1.0) protocol.
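To confirm from the management workstation that a given management IP is reachable and answering on the encrypted management port, a generic TLS client can be used (a hedged example; the IP address is illustrative and openssl must be installed on the workstation):

  openssl s_client -connect 10.1.1.104:7778

A successful handshake that prints the system's certificate indicates the management port is up; the GUI and XCLI use this same port.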

55

The XIV Graphic User Interface (GUI)


All GUI connections require user authentication
The GUI will show different options based on the user level

Within the GUI you can add multiple systems
Only one management IP is required to connect
If the default IP address has not been changed yet, check Use Predefined IP
56

Launching the Management Software GUI
Upon launching the XIV GUI application, a login window prompts you for a user name and its corresponding password before granting access to the XIV Storage System. The default admin user comes with storage administrator (storageadmin) rights. The XIV Storage System offers role-based user access management. To connect to an XIV Storage System, you must initially add the system to make it visible in the GUI by specifying its IP addresses. To add the system, proceed as follows:
1. Make sure that the management workstation is set up to have access to the LAN subnet where the XIV Storage System resides. Verify the connection by pinging the IP address of the XIV Storage System. If this is the first time you start the GUI on this management workstation and no XIV Storage System has previously been defined to the GUI, the Add System Management dialog window is automatically displayed. If the default IP address of the XIV Storage System was not changed, check the Use Predefined IP checkbox. This populates the IP/DNS Address1 field with the default IP address. Click the Add System button to add the system to the GUI. If the default IP address has already been changed to a customer-specified IP address (or set of IP addresses, for redundancy), you must enter those addresses in the IP/DNS Address fields. Click the Add System button to add the system to the GUI.
2. You are now returned to the main XIV Management window. Wait until the system is displayed and shows as enabled. Under normal circumstances the system will show a status of Full Redundancy, displayed in a green label box.
3. Move the mouse cursor over the image of the XIV Storage System and left-click to open the XIV Storage System Management main window.

56

The XIV Graphic User Interface (GUI) (Cont.)


[Slide: GUI main window layout: menu bar, toolbar, user indicator, function icons, main display, status bar indicators]

57


The XIV Storage Management GUI is mostly self-explanatory, with a well-organized structure and simple navigation. The main window is divided into the following areas:
Function icons: Located on the left side of the main window is a set of vertically stacked icons that are used to navigate between the functions of the GUI, according to the icon selected. Moving the mouse cursor over an icon brings up a corresponding option menu. The different menu options available from the function icons are presented in this slide.
Main display: It occupies the major part of the window and provides a graphical representation of the XIV Storage System. Moving the mouse cursor over the graphical representation of a specific hardware component (module, disk, UPS unit) brings up a status callout. When a specific function is selected, the main display shows a tabular representation of that function.
Menu bar: It is used for configuring the system and as an alternative to the function icons for accessing the different functions of the XIV Storage System.
Toolbar: It is used to access a range of specific actions linked to the individual functions of the system.
Status bar indicators: These are located at the bottom of the window and indicate the overall operational levels of the XIV Storage System. The first indicator, on the left, shows the amount of soft or hard storage capacity currently allocated to Storage Pools and provides alerts when certain capacity thresholds are reached. As the physical, or hard, capacity consumed by volumes within a Storage Pool passes certain thresholds, the color of this meter indicates that additional hard capacity may need to be added to one or more Storage Pools. The second indicator, in the middle, displays the number of I/O operations per second (IOPS). The third indicator, on the far right, shows the general system status, and will for example indicate when a redistribution is underway.

57

The XIV Graphic User Interface (GUI) (Cont.)


Systems
Groups the managed systems

Volumes
Manage storage volumes and their snapshots. Define, delete, edit

Monitor
Define the general system connectivity and monitor the overall system

Hosts and Clusters


Manage hosts: define, edit, delete, rename and link the host servers

Pools
Configure the features provided by the XIV storage

Remote Management
Define the communication topology between a local and a remote system

Access Management
Access control system that specifies defined user roles to control access
58

Managing Multiple Systems

59

The main screen of the GUI allows you to see the status of all your systems (based on the systems you added to the GUI). The All Systems screen provides the following information on your systems:
System name
MicroCode version running on it
Capacity used
Graphic view of the system with warning and error indicators, if there are any
Current IOPS
Redundancy status

59

The XIV Command Line Interface (XCLI)


Managing XIV Storage Systems can be done completely through CLI commands
XCLI commands and arguments are sensitive to the user level
XCLI is very simple and easy to use and is often used when operations need to be automated through scripting
There are three ways to invoke XCLI functions:
Invoking the XCLI in order to define configurations - a configuration is a mapping between a user-defined name and a list of three IP addresses, to simplify future use of the XCLI
Invoking the XCLI to execute a command - the most basic use of the XCLI
Invoking the XCLI for general purpose functions - used, for example, to print the XCLI's help text
60

Login to the system with XCLI
After installation of the XIV Storage Management software, you can find the XCLI executable file in the specified installation directory. The usage of the XCLI is very simple. XCLI commands and arguments are sensitive to the user level that runs them. This means that a user with admin permissions will be able to use and see more options than a user with viewer permissions (discussed later in this course). There are three ways of invoking the XCLI functions (see the sketch below):
Invoking the XCLI in order to define configurations. In these invocations, the XCLI utility is used to define configurations. A configuration is a mapping between a user-defined name and a list of three IP addresses. Such a configuration can be referenced later in order to execute a command without having to specify the system IP addresses. These configurations are stored on the local host running the XCLI utility and must be defined again for each host.
Invoking the XCLI to execute a command. This is the most basic and important type of invocation. Whenever invoking an XCLI command you must also provide either the system's IP addresses or a configuration name.
Invoking the XCLI for general purpose functions. Such invocations can be used to get the XCLI's software version or to print the XCLI's help text.
The command to execute is generally specified along with parameters and their values. A script can be defined to specify the name and path to a commands file (the listed commands will be executed in User Mode only). For complete and detailed documentation of the IBM XIV Storage Manager XCLI, refer to the XCLI Reference Guide.
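The first two invocation styles look roughly as follows (a sketch built from the flags used elsewhere in this course; the IP addresses and the configuration name xiv01 are illustrative):

  xcli -a xiv01 -m 10.1.1.4 -m 10.1.1.5 -m 10.1.1.6        (define a configuration named xiv01)
  xcli -u admin -p SecretPass -m 10.1.1.4 version_get       (execute a command with explicit IP and credentials)
  xcli -u admin -p SecretPass -c xiv01 version_get          (execute the same command via the saved configuration)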

60

The XIV Command Line Interface (XCLI) (Cont.)


An XCLI command needs 4 major elements:
User name, Password, Management IP, XCLI command
  xcli -u admin -p SecretPass -m 10.1.1.104 version_get
  xcli -u <user> -p <pass> -m <IP1> [-m <IP2> [-m <IP3>]] command

Using -x in XCLI will show detailed output in XML format
Environment variables can be configured for the user name and password
XIV_XCLIUSER=user
XIV_XCLIPASSWORD=password

A configuration file can be used to streamline management


Create an empty XML file (e.g. c:\xivsys.xml)
Set the environment variable XCLI_CONFIG_FILE (e.g. XCLI_CONFIG_FILE=c:\xivsys.xml)
Populate the config file: xcli -f c:\xivsys.xml -a xiv01 -m 1.1.1.4 -m 1.1.1.5 -m 1.1.1.6
Run XCLI commands: xcli -c xiv01 version_get
61

As part of XIV's high-availability features, each system is assigned three IP addresses. When executing a command, the XCLI utility is provided with these three IP addresses and tries each of them sequentially until communication with one of them is successful. To issue a command against a specific XIV Storage System, you also need to supply its username and password, which can be specified with the following parameters:
-u user or -user sets the user name that will be used to execute the command.
-p password or -password is the CLI password that must be specified in order to execute a command on the system.
-m IP1 [-m IP2 [-m IP3]] defines the IP addresses of the XIV (Nextra) system.
Example: xcli -u admin -p SecretPass -m 10.1.1.4 version_get
The default IP address for the XIV Storage System is 14.10.202.250.
Managing the XIV Storage System via the XCLI always requires that you specify these same parameters. To avoid repetitive typing, you can instead define and use specific environment variables (see the example below). The XCLI utility requires the user and password options. If user and password are not specified, the environment variables XIV_XCLIUSER and XIV_XCLIPASSWORD are used. If neither command options nor environment variables are specified, commands are run with the user defined in config_set default_user=XXXX. This allows a smooth migration to the IBM XIV Software System for customers that do not have defined users. The configurations are stored in a file under the user's home directory. A different file can be specified with the -f or -file switch (applicable to configuration creation, configuration deletion, listing configurations and command execution). Alternatively, the environment variable XCLI_CONFIG_FILE, if defined, determines the file's name and path.
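On a Windows management workstation the environment variables can be set in a cmd window, for example (a sketch using the credentials and configuration name from this course; the values are illustrative and, set this way, last only for the current cmd session):

  set XIV_XCLIUSER=admin
  set XIV_XCLIPASSWORD=SecretPass
  set XCLI_CONFIG_FILE=c:\xivsys.xml
  xcli -c xiv01 version_get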

61

Session XCLI
Provides a shell-like behavior for XCLI commands
Installed with the management utilities
Used as the default management CLI
Supports the configured environment variables
Supports Tab auto-completion for commands and arguments
Can be initiated by right-clicking on a specific system in the GUI
Will use the GUI's credentials for the session

62

Using the XCLI to run commands is very useful, but it lacks shell-like abilities, such as command auto-completion and showing options when pressing the Tab key. The installation of the XIV Storage Management tool installs both the GUI and XCLI interfaces. It also installs a Session XCLI utility that provides you with a shell-like behavior for XCLI commands. The Session XCLI supports the XIV_XCLIUSER and XIV_XCLIPASSWORD environment variables and, if they are configured, it will not ask for a user name and password, but only for a name or an IP address of a system.

62

Session XCLI (Cont.)


Getting help for XCLI commands is fairly easy:
help command=<CommandName> - a short description of the provided command
help command=<CommandName> format=full - a complete outline of the command options, arguments and fields
help category=<CategoryName> - a list of all the available commands within a certain category
help search=<SearchString> - all the commands that have that search string in them

63

Getting help with XCLI commands
There are many XCLI commands that allow you to fully control the XIV Storage System. You can use the different help command options to get acquainted with the various XCLI commands. The XCLI uses the help command to get information on commands, available arguments, possible fields and parameters. The help command has several useful options (examples follow below):
help command=<CommandName> - the output includes a short description of the provided command.
help command=<CommandName> format=full - the output includes a complete outline of the command options, arguments and fields.
help category=<CategoryName> - the output shows a list of all the available commands within a certain category.
help search=<SearchString> - lists all the commands that have that search string in them.
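For instance, from a Session XCLI prompt the following invocations could be used (a sketch; vol_list and the volume category name are assumed examples, so confirm them first with help search):

  help search=vol
  help category=volume
  help command=vol_list
  help command=vol_list format=full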

63

Lab 1: Working with the XIV Management Tools

Installing the XIV management utilities
Working with the XIV tools
Creating a configuration file for XIV

64

Lab assignment:
1. Start the VirtualBox console
2. Create a snapshot of the Windows and the Linux environments called Clean Install
3. Start the Windows virtual environment with VirtualBox and log in using the password: password
4. Rename the Windows computer name to serverX (where X is your assigned student number) and reboot the system
5. Go to the C:\XIV folder and install the XIV GUI package, accepting the defaults
6. Log into the GUI using the default admin account
7. GUI - Add the system assigned to you by the instructor
8. GUI - Add the system assigned to the other group in your class as well
9. GUI - From the All Systems view in the GUI, right-click your assigned system and initiate a new Session XCLI window
10. XCLI - Search for all the commands that are volume related
    1. Many of those commands are also referred to with the vol prefix
11. XCLI - Print out the full description of the version_get command
12. XCLI - Run the version_get command
13. XCLI - Display the output of the version_get command in XML
14. Open a cmd window
15. Use xcli.exe to display the output of version_get for your assigned system
16. Define the user and password environment variables in your Windows VirtualBox
    1. User name: adminX
    2. Password: password
17. Create a new XIV configuration file and define both XIV systems in it
    1. Configure all three IP addresses for both systems
    2. Call your assigned XIV system: xiv-first
    3. Call the other XIV system: xiv-second
    4. Make sure you define the environment variable for the configuration file
18. Open a new Session XCLI from the link on the desktop
    1. Provide an IP address for your assigned system
    2. Note that you are not required to provide user credentials
19. Close the Session XCLI
20. From the cmd window, run the version_get command using xcli.exe on your assigned XIV system while using the name: xiv-first
21. Delete the configuration file and all three environment variables from the system
22. Shut down the Windows environment

64

Module Summary
Phase 10 Features and Capabilities
Gen II Systems Hardware Design and Layout
XIV Software Framework
XIV Systems Management

65
