Executive Summary
Storage I/O Control (SIOC) provides storage I/O performance isolation for virtual machines, thus enabling VMware® vSphere™
(“vSphere”) administrators to comfortably run important workloads in a highly consolidated virtualized storage environment. It
protects all virtual machines from undue negative performance impact due to misbehaving I/O-heavy virtual machines, often known
as the “noisy neighbor” problem. Furthermore, the service level of critical virtual machines can be protected by SIOC by giving them
preferential I/O resource allocation during periods of congestion. SIOC achieves these benefits by extending the constructs of shares
and limits, used extensively for CPU and memory, to manage the allocation of storage I/O resources. SIOC improves upon the previous
host-level I/O scheduler by detecting and responding to congestion occurring at the array, and enforcing share-based allocation of I/O
resources across all virtual machines and hosts accessing a datastore.
With SIOC, vSphere administrators can mitigate the performance loss of critical workloads due to high congestion and storage latency
during peak load periods. The use of SIOC will produce better and more predictable performance behavior for workloads during
periods of congestion. Benefits of leveraging SIOC:
• Provides performance protection by enforcing proportional fairness of access to shared storage
• Detects and manages bottlenecks at the array
• Maximizes your storage investments by enabling higher levels of virtual-machine consolidation across your shared datastores
The purpose of this paper is to explain the basic mechanics of how SIOC, a new feature in vSphere 4.1, works and to discuss
considerations for deploying it in your VMware virtualized environments.
Storage I/O Control Technical Overview
and Considerations for Deployment
Figure 1. I/O Shares for Two Virtual Machines on a Single ESX Server (Host-Level Disk Scheduler)
When the I/O shares for the virtual disks (VMDKs) of each of those virtual machines are set to different values, it is the local scheduler that prioritizes the I/O traffic, but only when the local HBA becomes congested.
This host-level capability existed in ESX Server for several years prior to vSphere 4.1. It is this local host-level disk scheduler that also enforces the limits set for a given virtual-machine disk. If a limit is set for a given VMDK, the local disk scheduler will control the I/O so that it does not exceed the defined number of I/Os per second.
vSphere 4.1 has added two key capabilities: (1) the enforcement of I/O prioritization across all ESX servers that share a common
datastore, and (2) detection of array-side bottlenecks. These are accomplished by way of a datastore-wide distributed disk scheduler
that uses I/O shares per virtual machine to determine whether device queues need to be throttled back on a given ESX server to allow
a higher-priority workload to get better performance. The datastore-wide disk scheduler totals up the disk shares for all the VMDKs
that a virtual machine has on the given datastore. The scheduler then calculates what percentage of the shares the virtual machine has
compared to the total number of shares of all the virtual machines running on the datastore. This share percentage is displayed in the Virtual Machines tab for each datastore, as seen in Figure 2.
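The share-percentage calculation described above can be sketched in a few lines. This is an illustration only, not the ESX implementation; the VM names and share values are hypothetical:

```python
# Sketch: a datastore-wide scheduler totals the per-VM disk shares on a
# datastore and converts each VM's total into a percentage of all shares.
# VM names and share values below are hypothetical illustrations.

def share_percentages(vm_shares):
    """vm_shares maps VM name -> total shares of its VMDKs on the datastore."""
    total = sum(vm_shares.values())
    return {vm: 100.0 * s / total for vm, s in vm_shares.items()}

percentages = share_percentages({"VM A": 1500, "VM B": 500, "VM C": 500})
print(percentages)  # {'VM A': 60.0, 'VM B': 20.0, 'VM C': 20.0}
```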
As described before, SIOC engages only after a certain device-level latency is detected on the datastore. Once engaged, it begins to
assign fewer I/O queue slots to virtual machines with lower shares and more I/O queue slots to virtual machines with higher shares.
It throttles back the I/O for the lower-priority virtual machines, those with fewer shares, in exchange for the higher-priority virtual
machines getting more access to issue I/O traffic. However, it is important to understand that the maximum number of I/O queue
slots that can be used by the virtual machines on a given host cannot exceed the maximum device-queue depth for the device queue
of that ESX host. The ESX maximum queue depth varies by HBA model; the maximum value is typically in the range of 32 to 128. The lowest value to which SIOC can reduce the device queue depth is 4. Figure 3a shows that, without SIOC, a virtual machine with
a lower number of shares, “VM C,” may get a larger percentage of the available storage-array device-queue slots and thus greater
storage array performance, while a virtual machine with higher I/O shares, “VM A,” gets fewer than its fair share and reduced storage
array performance. However, with SIOC engaged on that datastore, as in Figure 3b, the result will be that the lower-priority virtual
machine that is by itself on a separate host will be assigned a reduced number of I/O queue slots. That will result in fewer storage array
queue slots being used and a reduction in average device latency. The reduction in average device latency provides VM A and VM B
higher storage performance, as now the same number of I/Os that they previously were issuing complete faster due to the reduced
latency for each of those I/Os.
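The queue-depth bounds described above can be sketched as a simple clamp. This is an illustrative sketch under the stated bounds (a floor of 4 and a ceiling at the HBA's maximum device-queue depth), not the actual ESX scheduler logic; the example values are hypothetical:

```python
# Sketch: clamping a host's throttled device-queue depth to the bounds
# described in the text. SIOC never reduces the depth below 4 and the
# host can never exceed its HBA's maximum device-queue depth
# (typically 32-128). Values used below are illustrative.

SIOC_MIN_QUEUE_DEPTH = 4

def clamp_queue_depth(target_depth, hba_max_depth):
    """Bound a target queue depth to [SIOC_MIN_QUEUE_DEPTH, hba_max_depth]."""
    return max(SIOC_MIN_QUEUE_DEPTH, min(target_depth, hba_max_depth))

print(clamp_queue_depth(2, 64))    # 4  (SIOC floor applies)
print(clamp_queue_depth(200, 64))  # 64 (HBA maximum applies)
print(clamp_queue_depth(24, 64))   # 24 (within bounds, unchanged)
```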
For instance, assume that VM A was using 18 I/O slots as shown in Figure 3a. Without SIOC, the storage array latency could be
unbounded and the I/O workloads being performed by the lower priority VM C could cause a high storage device latency of, say,
40ms. In this example, VM A would have 18 I/Os @ 40ms worth of storage performance. Once enabled, SIOC controls the latency at
the configured congestion threshold, say 30ms. SIOC determines the number of storage array queue slots that can be used while
still maintaining an average device latency below the SIOC congestion threshold. Although SIOC does not directly manage the
storage array queue, it is able to indirectly control the storage array device queue by managing the ESX device queues that feed into
it. As shown in Figure 3b, SIOC has determined that 30 host-side storage queue slots can be used while still maintaining the desired
average device latency. SIOC then distributes those storage array queue slots to the various virtual machine workloads according to
their priorities. The net effect in this example is that VM C is throttled back to use only its correct relative share of the storage array.
VM A, entitled to 60 percent of the queue slots (1500/2500 = 60 percent), is still able to issue the same 18 I/Os, but at a reduced 30ms latency. SIOC provides VM A greater storage performance by controlling VM C and ensuring that it uses only its appropriate allocation of total storage resources. By throttling the ESX device-queue depths in proportion to the priorities of
the virtual machines that are using them, SIOC is able to control storage congestion at the storage array and distribute storage array
performance appropriately.
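The worked example above can be reproduced with a small proportional-distribution sketch. VM A's 1500 shares out of a 2500-share total come from the text; the 500/500 split between VM B and VM C is an assumed illustration:

```python
# Sketch: distributing the 30 usable host-side queue slots from the
# example in proportion to each VM's shares. Only VM A's 1500 shares
# and the 2500-share total are from the text; the VM B / VM C split
# is an assumed illustration.

def distribute_slots(total_slots, shares):
    """Allocate total_slots among VMs proportionally to their shares."""
    total_shares = sum(shares.values())
    return {vm: round(total_slots * s / total_shares) for vm, s in shares.items()}

slots = distribute_slots(30, {"VM A": 1500, "VM B": 500, "VM C": 500})
print(slots)  # {'VM A': 18, 'VM B': 6, 'VM C': 6}
```

With 60 percent of 30 slots, VM A keeps its 18 I/O slots, matching the example in the text.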
SIOC provides isolation and prioritized distribution of storage resources even when vSphere administrators have not manually
set individual disk-share priorities on each VMDK per virtual machine. SIOC protects virtual machines that are running on higher
consolidated ESX servers. In Figures 4a and 4b, all virtual machine disks have the default (1000), and therefore equal, disk shares. Without SIOC, VM A and VM B are penalized and not provided equal access to storage resources simply because they are running together on the same ESX server and sharing the same ESX device queue, whereas VM C, running on a less consolidated ESX host, is given unfair preference to storage resources. Even administrators who do not wish to individually set VMDK disk shares can benefit from
this feature. SIOC provides these vSphere administrators the ability to enable storage isolation for all virtual machines accessing a
datastore by simply checking a single check box at the datastore level. This new storage management capability offered by SIOC
allows vSphere administrators the ability to run higher consolidated virtual environments by preventing imbalances of storage
resource allocation during times of storage contention.
In these examples, SIOC is able to fully manage the storage array queue by throttling the ESX host device queues. This is possible
because all the workloads impacting the storage array queue are coming from the ESX hosts and are under SIOC’s control. However,
SIOC is able to provide storage workload isolation/prioritization even in scenarios in which external workloads, not under SIOC’s
control, are competing with those that it controls. In this scenario, SIOC will first automatically detect this situation, and then will
increase the number of device-queue slots it makes available to the virtual machine workloads so that they can compete more fairly
for total storage resources against external workloads. Using this approach, SIOC is able to maintain a balance between workload
isolation/prioritization and storage I/O throughput even when it cannot directly control or influence the external workload. This behavior
continues as long as the external workload persists and SIOC resumes normal operation once it stops detecting the external workload.
SIOC can be used on any FC, iSCSI, or locally attached block storage device that is supported with vSphere 4.1. Review the vSphere
4.1 Hardware Compatibility List (http://www.vmware.com/go/hcl) for the entire list of supported storage devices. SIOC is supported
with FC and iSCSI storage devices that have automated tiered storage capabilities. However, when using SIOC with automated tiered
storage, the SIOC Congestion Threshold must be set appropriately to make sure the storage device’s automated tiered storage
capabilities are not impacted by SIOC.
At this time, SIOC is not supported with NFS storage devices or with Raw Device Mapping (RDM) virtual disks. SIOC is also not
supported with datastores that have multiple extents or are being managed by multiple vCenter Management Servers.
For complete step-by-step instructions on how to enable SIOC, or change the default latency threshold for a datastore or other
limitations, consult the documentation or see “Managing Storage I/O Resources” (Chapter 4) in the vSphere 4.1 Resource Management
Guide (http://www.vmware.com/pdf/vsphere4/r41/vsp_41_resource_mgmt.pdf).
As part of vSphere 4.1, I/O per second (IOPS) limits on a per-VMDK level can be set to further manage and prioritize virtual machine
workloads. Limits (expressed in terms of IOPS) are implemented at the local-disk scheduler level and are always enforced regardless of
whether or not SIOC is enabled.
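Conceptually, a per-VMDK IOPS limit can be enforced with a token bucket, as the following illustrative sketch shows. The class, names, and rates here are hypothetical, not the actual local-disk-scheduler implementation:

```python
import time

# Sketch: enforcing a per-VMDK IOPS limit with a token bucket. Tokens
# (permitted I/Os) replenish at the configured IOPS rate; an I/O may be
# issued only when at least one whole token is available. This is a
# hypothetical illustration, not the ESX local disk scheduler.

class IopsLimiter:
    def __init__(self, iops_limit):
        self.rate = iops_limit           # tokens replenished per second
        self.tokens = float(iops_limit)  # start with a full bucket
        self.last = time.monotonic()

    def try_issue_io(self):
        now = time.monotonic()
        # Replenish tokens for the elapsed time, capped at the bucket size.
        self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True   # I/O may be issued now
        return False      # I/O must be queued until tokens replenish

limiter = IopsLimiter(iops_limit=100)
print(limiter.try_issue_io())  # True
```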
When the congestion threshold is set lower, the more aggressive throttling needed to maintain that lower latency might reduce the overall storage throughput. When the congestion threshold is set higher, SIOC will not engage and begin prioritizing resources among virtual machines until that higher latency is reached. When using a higher SIOC congestion threshold, SIOC does not need to throttle storage workloads as much in order to maintain the storage latency below the threshold. This may allow for higher overall storage throughput.
The default congestion threshold has been set to minimize the impact of throttling on storage throughput while still providing
reasonably low storage latency and isolation for high-priority virtual machines. In most cases it is not necessary to modify the storage
congestion threshold from its default value. However, a user may decide to modify the value depending on the type and speed of their
storage device, the characteristics of the workloads in their virtual environment, and their storage-management preference between
workload isolation/prioritization and workload throughput. Because various storage devices have different latency characteristics,
users may need to modify the congestion threshold depending on their storage type. See Table 1 to determine the recommended
range of values for your storage-device type.
Table 1. Recommended congestion threshold by type of storage media backing the datastore (lower values favor isolation; higher values favor throughput)

Type of storage media backing the datastore              Recommended threshold
Auto-tiered storage (full LUN auto-tiering)              Use the vendor-recommended value or, if the storage vendor does not provide one, use the threshold value recommended for the slowest tier of storage in the array.
Auto-tiered storage (block-level/sub-LUN auto-tiering)   Use the vendor-recommended value or, if the storage vendor does not provide one, combine the recommended ranges of the fastest and slowest media types in the array.
The congestion threshold may also need to be adjusted when using automated tiered storage devices. These are systems that contain
two or more types of storage media and automatically and transparently migrate data between the storage types in order to optimize
I/O performance. These systems typically try to keep the most frequently accessed or “hot” data on faster storage such as SSD, and
less frequently accessed or “cold” data on slower media such as SAS or FC disks. This means that the type of storage media backing a
particular LUN can change over time.
For full LUN auto-tiering storage devices, in which the entire LUN is migrated between different storage tiers, use the recommended
value or range for the slowest tier of storage in the device. For example, in a full LUN auto-tiering storage device that contains SSD and
Fibre Channel disks, use the congestion threshold value that is recommended for Fibre Channel.
With sub-LUN or block-level auto-tiering storage, in which individual storage blocks inside a LUN are migrated between storage tiers,
combine the recommended congestion threshold values/ranges for each storage type in the auto-tiering storage devices. For example,
in a sub-LUN / block-level auto-tiering storage device that contains an SSD storage tier and a Fibre Channel storage tier, use an SIOC
congestion threshold value in the range of 10–30ms. The exact SIOC congestion-threshold value to use is based on your individual
storage-device characteristics and your preference of isolation (using a smaller SIOC congestion-threshold value) or throughput
(using a larger SIOC congestion-threshold value). For example, in the SSD-FC scenario, the more SSD storage you have in the array,
the more your storage device characteristics will match that of the SSD storage type and thus the closer your threshold should be
to the SSD recommended value of 10ms, the low end of the combined SSD-FC range. Customers can use the midpoint of the range
as a conservative congestion threshold value that provides a balance between the preference for isolation and the preference for
throughput. In the SSD-FC example in which there was a range of 10–30ms, the conservative congestion threshold value would be 20ms.
When modifying the SIOC congestion threshold, keep in mind that the SIOC latency is a normalized latency metric calculated and
normalized for I/O size and aggregate number of IOPS across all the storage workloads accessing the datastore. SIOC uses a normalized
latency to take into consideration that not all storage workloads are the same. Some storage workloads may issue larger I/O operations
that would naturally result in longer device latencies to service these larger I/O requests. Normalizing the storage-workload latencies
allows SIOC to compare and prioritize workloads more accurately by bringing them all into a common measurement. Because the
SIOC value is normalized, the actual observed latency as seen from the guest OS inside the virtual machine or from an individual ESX
host may be different than the calculated SIOC-normalized latency per datastore.
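As one plausible illustration, per-host latencies could be aggregated into a single datastore-wide figure by weighting each host's observed latency by its IOPS. The actual SIOC normalization also accounts for I/O size and is not specified here, so this sketch is purely illustrative:

```python
# Sketch: an IOPS-weighted average of per-host latencies, one plausible
# way to bring different workloads "into a common measurement." This is
# NOT the published SIOC formula (which also normalizes for I/O size);
# all numbers below are hypothetical.

def datastore_latency(host_stats):
    """host_stats: list of (avg_latency_ms, iops) observed per ESX host."""
    total_iops = sum(iops for _, iops in host_stats)
    if total_iops == 0:
        return 0.0
    return sum(lat * iops for lat, iops in host_stats) / total_iops

# Two busy hosts and one mostly idle host (hypothetical numbers):
print(datastore_latency([(40.0, 900), (35.0, 900), (5.0, 200)]))  # 34.25
```

Note how the lightly loaded host barely moves the aggregate: the result (34.25ms) sits close to the latency the busy hosts see, which is why a guest's observed latency can differ from the datastore-wide normalized value.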
SIOC detects the moment when external workloads, not under SIOC’s control, may be impacting the virtual environment’s storage
resources. When SIOC detects an external workload, it will trigger a “Non-VI workload detected” informational alert in vCenter. In
most cases, this alert is purely informational and requires no action on the part of the vSphere administrator. However, the alert may
be an indicator of an incorrectly configured SIOC environment. vSphere administrators should verify that they are running a supported
SIOC configuration and that all datastores that utilize the same disk spindles have SIOC enabled with identical SIOC congestion-
threshold values. The alert might also be triggered by some backup products and other administrative workloads that bypass the ESX
host and directly access the datastore in order to accomplish their tasks. SIOC is supported in these configurations and the alert can
be safely ignored for these products. Refer to VMware KB article 1020651 for more details on the “Non-VI workload detected” alert.
Detects and manages bottlenecks at the array only when congestion exists
SIOC detects a bottleneck at the datastore level, and manages I/O queue slot distribution across the ESX servers that share a datastore.
SIOC expands the I/O resource control beyond the bounds of a single ESX server to work across all ESX servers that share a datastore.
When SIOC is enabled on a datastore and no congestion exists at the device level, it will not be engaged in managing I/O resources
and will have no effect on I/O latency or throughput. In an optimized and well-configured environment, SIOC may only engage
at certain peak periods during the day. During these times of congestion and in the presence of external or non–SIOC controlled
workloads, SIOC strikes a balance between aggregate throughput and enforcement of virtual machine I/O shares.
SIOC helps vSphere administrators understand when more I/O throughput (device capacity) is needed. If SIOC is engaged for
significant periods of time during the day, it raises the question if there is a need for a change in the storage configuration. In this case,
an administrator might consider either adding more I/O capacity or using VMware Storage vMotion to migrate I/O intensive virtual
machines to an alternate datastore.
Conclusion
SIOC offers I/O prioritization to virtual machines accessing shared storage resources. It allows vSphere administrators to give high-priority virtual machines better storage performance and lower latency than lower-priority virtual machines receive. It monitors datastore latency and engages when a preset congestion threshold has been exceeded. SIOC gives
vSphere administrators a new means to manage their VMware virtualized environments by allowing quality of service to be expressed
for storage workloads. As such, SIOC is a big step forward in the journey toward automated, policy-based management of shared
storage resources.
SIOC provides the means to better control a consolidated shared-storage resource by providing datastore-wide I/O prioritization,
helping to manage traffic on a shared and congested datastore. With the introduction of SIOC in vSphere 4.1, vSphere administrators now have a new tool to help them increase consolidation density with peace of mind, knowing that during periods of peak I/O activity, prioritization and proportional fairness will be enforced across all the virtual machines accessing that shared resource.
VMware, Inc. 3401 Hillview Avenue Palo Alto CA 94304 USA Tel 877-486-9273 Fax 650-427-5001 www.vmware.com
Copyright © 2010 VMware, Inc. All rights reserved. This product is protected by U.S. and international copyright and intellectual property laws. VMware products are covered by one or more patents listed at
http://www.vmware.com/go/patents. VMware is a registered trademark or trademark of VMware, Inc., in the United States and/or other jurisdictions. All other marks and names mentioned herein might be
trademarks of their respective companies. Item No: VMW_10Q3_WP_vSphere_4_1_SIOC_p12_A_R3