Thursday, December 10, 2015

SIOC - Storage I/O Control

Introduction
Adaptive Queuing and Storage I/O Control are vSphere features that handle storage performance and I/O congestion issues related to QUEUE FULL conditions and I/O latency. Both features work by throttling the device queue depth when I/O congestion is detected. In this paper we will cover the Storage I/O Control (SIOC) feature in detail and also compare SIOC with Adaptive Queuing.
vSphere has features to allocate and manage shared resources based on parameters like shares, limits and reservations. These features work effectively for CPU and memory, as those resources are allocated within the same server. The same algorithm does not work for storage, because shared storage exists outside the physical server and is shared across multiple servers. Therefore, vSphere introduced additional features like Adaptive Queuing and Storage I/O Control to effectively manage shared access to external storage devices.

SIOC
As mentioned earlier, vSphere already has a feature to allocate disk shares per virtual machine. For example, you can define what percentage of disk shares should be allocated to each virtual machine. Defining disk shares per virtual machine helps to control disk share allocation across virtual machines within the same ESXi server, which effectively handles any disk-level resource contention that occurs within a single server. However, this approach does not work for managing disk share allocation across physical servers. Since the same storage array and LUNs will be shared across multiple physical servers, it is necessary to implement a resource sharing and control mechanism that can work across servers.

The Storage I/O Control feature, introduced in vSphere 4.1, addresses the shared storage management issues mentioned above. SIOC considers the disk share allocation for virtual machines distributed across servers and provides more granular control for disk share management. That is, Storage I/O Control takes into account not only the disk share allocation per server but also the disk share allocation at the datastore level, and thereby provides better disk resource allocation.

Disk Share Usage

vSphere has a disk share allocation feature which can be used to prioritize disk resource sharing between multiple virtual machines running on the same ESXi server. Disk resources like device queue depth will be allocated based on the percentage of shares assigned. For example, assume there are three virtual machines running on an ESXi host, with a share allocation of Normal (1000) for two VMs and High (2000) for one VM. ESXi will sum up the total number of shares allocated to all VMs, which is 4000 in this case (2000+1000+1000), and then allocate 50% of disk resources to the virtual machine with High shares and 25% each to the other two virtual machines. So if a device queue depth of 32 is available to this ESXi host, 16 slots will be assigned to the VM with High shares and 8 each to the VMs with Normal shares.
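This arithmetic can be sketched in a few lines of Python. This is only an illustration of the proportional split described above, assuming queue depth slots are divided strictly in proportion to shares; the actual ESXi scheduler is more sophisticated.

    # Illustrative sketch: split a host's device queue depth across VMs
    # in strict proportion to their disk shares (simplified model).
    def allocate_queue_depth(vm_shares, device_queue_depth):
        total_shares = sum(vm_shares.values())
        return {vm: device_queue_depth * shares // total_shares
                for vm, shares in vm_shares.items()}

    # The example above: one High (2000) VM, two Normal (1000) VMs, depth 32.
    print(allocate_queue_depth({"VM-1": 2000, "VM-2": 1000, "VM-3": 1000}, 32))
    # -> {'VM-1': 16, 'VM-2': 8, 'VM-3': 8}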

Note that the scope of the disk share definition is limited to the ESXi host and does not have any effect on virtual machines running on other ESXi servers.

Enable Disk Share

To configure disk shares in the Web Client, select Edit Settings for the virtual machine; under the Virtual Hardware tab you will find the option to define the disk share value.


The default value for disk shares is Normal (1000), and you can select a share value of High (2000), Low (500) or define a custom value.

Disk Share Allocation without SIOC
How does disk share allocation work when only the disk share value is set, without enabling the SIOC feature?

Assume there are three ESXi servers in a datacenter accessing the same datastore, with six virtual machines running. Three virtual machines (VM-A, VM-B, VM-C) are deployed on host ESXi-01, two virtual machines (VM-D, VM-E) on host ESXi-02 and one virtual machine (VM-F) on host ESXi-03, all using the same datastore.


When SIOC is not enabled, equal disk resources will be allocated to all ESXi servers, without considering the number of virtual machines running on each ESXi server or their allocated share values. The disk resources allocated to each ESXi server will then be distributed within that server based on the share values allocated to its virtual machines.

Assume that disk shares are allocated for the six virtual machines as listed in the table below.

Host Name    VM Name    Shares    Disk Share Value
ESXi-01      VM-A       High      2000
             VM-B       Normal    1000
             VM-C       Normal    1000
ESXi-02      VM-D       Custom    1500
             VM-E       Low       500
ESXi-03      VM-F       Normal    1000

Assume that the total device queue depth available to the three ESXi servers is 84. When SIOC is not enabled, the device queue depth allocation per ESXi server and virtual machine will be as shown in the diagram below.


When SIOC is not enabled, all three ESXi servers get the same device queue depth allocation of 28. This allocation is done without considering the disk share allocation per virtual machine. From this diagram you can see that VM-A, which has a High (2000) disk share allocation, is only getting a device queue depth of 14, even though every other virtual machine in this datacenter has fewer disk shares defined than VM-A. Virtual machine VM-F, which has its disk shares defined as Normal (1000), is getting a larger device queue depth allocation of 28. Similarly, VM-D, which has a Custom disk share of 1500, is also getting more queue depth (21) than VM-A.

Since the disk share allocation across ESXi servers is not considered for the per-server queue depth allocation, critical virtual machines can experience I/O contention because of high resource allocation to low-priority virtual machines.
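The numbers above can be reproduced with a short sketch. As before, this is a simplified model, assuming the total device queue depth is split equally per host and then proportionally by shares within each host:

    # Simplified model of allocation WITHOUT SIOC: equal split per host,
    # then proportional to shares within each host.
    def split_without_sioc(hosts, total_depth):
        per_host = total_depth // len(hosts)   # every host gets the same
        result = {}
        for host, vm_shares in hosts.items():
            host_total = sum(vm_shares.values())
            result[host] = {vm: per_host * s // host_total
                            for vm, s in vm_shares.items()}
        return result

    hosts = {
        "ESXi-01": {"VM-A": 2000, "VM-B": 1000, "VM-C": 1000},
        "ESXi-02": {"VM-D": 1500, "VM-E": 500},
        "ESXi-03": {"VM-F": 1000},
    }
    print(split_without_sioc(hosts, 84))
    # ESXi-01: VM-A 14, VM-B 7, VM-C 7 | ESXi-02: VM-D 21, VM-E 7 | ESXi-03: VM-F 28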

Disk Share Allocation with SIOC

SIOC helps to overcome this limitation and ensures that high-priority virtual machines with larger disk share allocations always get more disk resources. SIOC achieves this by looking at the disk share allocation for virtual machines across all ESXi servers and taking the total shares allocated for a device into consideration.

For example, in the scenario mentioned above, the total number of disk shares allocated across the three ESXi servers is 7000.
Host Name    VM Name    Shares    Disk Share Value    Total Disk Share per Server
ESXi-01      VM-A       High      2000                4000
             VM-B       Normal    1000
             VM-C       Normal    1000
ESXi-02      VM-D       Custom    1500                2000
             VM-E       Low       500
ESXi-03      VM-F       Normal    1000                1000
Total Disk Share across three ESXi Servers            7000
With SIOC enabled, the total available device queue depth will be distributed per ESXi server using the formula below.

Device Queue Depth for Server-A = (Total Device Queue Depth / Total Disk Share Allocation across all Servers) * Total Disk Share Allocation for Server-A
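A minimal sketch of this formula, applied to the example numbers (84 total queue depth, 7000 total shares):

    # SIOC-style per-host allocation: proportional to each host's total shares.
    def sioc_host_depth(host_shares, total_depth):
        total_shares = sum(host_shares.values())
        return {host: total_depth * shares // total_shares
                for host, shares in host_shares.items()}

    print(sioc_host_depth({"ESXi-01": 4000, "ESXi-02": 2000, "ESXi-03": 1000}, 84))
    # -> {'ESXi-01': 48, 'ESXi-02': 24, 'ESXi-03': 12}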

In the diagram below, you can see that disk resources are allocated to each virtual machine based on its disk share allocation.


Note that with SIOC, disk resources are not distributed equally to all servers; they are distributed based on the total disk shares defined per server. In this example, server ESXi-01 gets a total device queue depth of 48, compared to a device queue depth of 24 for ESXi-02 and 12 for ESXi-03.

You can also see that VM-A, with the highest disk share allocation of 2000 shares, gets the maximum queue depth of 24 compared to all other virtual machines with fewer disk shares defined. This helps to ensure that the critical virtual machine with the highest disk share allocation always gets the highest disk resource allocation.
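Combining both steps for VM-A confirms the number in the diagram; again a simplified proportional model:

    # VM-A's allocation with SIOC: host entitlement first, then the
    # share-proportional split within the host (simplified model).
    host_depth = 84 * 4000 // 7000          # ESXi-01 entitlement = 48
    vm_a_depth = host_depth * 2000 // 4000  # VM-A within ESXi-01 = 24
    print(host_depth, vm_a_depth)           # -> 48 24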

Enable SIOC feature
You can enable the SIOC feature in the Web Client by selecting the Storage tab and going to the Settings for the respective datastore. Under the Settings tab, the Storage I/O Control feature is disabled by default. Select the Edit button for Datastore Capabilities, and in the pop-up window you can enable SIOC.



In recent versions of vSphere, when you enable SIOC you have two options for setting the congestion threshold: you can either select Percentage of peak throughput (default 90%) or Manual (default 30 ms).

How SIOC Works
Note that the SIOC feature will kick in and start throttling queue depth only when the datastore latency reaches the defined Congestion Threshold value. As mentioned above, the Congestion Threshold can be defined as a percentage of peak throughput or manually.

Below are the recommendations for setting the manual congestion threshold value for different types of storage devices (Ref: http://www.vmware.com/files/pdf/techpaper/VMW-vSphere41-SIOC.pdf).


The option to set the Percentage of peak throughput value was introduced in vSphere 5.1. The definition of this parameter as per the VMware documentation is: “The percentage of peak throughput value indicates the estimated latency threshold when the datastore is using that percentage of its estimated peak throughput.” Defining the Congestion Threshold as a percentage of peak throughput lets vSphere automatically determine the latency and throughput by injecting I/O.

The SIOC feature keeps track of the I/O latency for a datastore and compares this value with the congestion threshold defined for that datastore. When the I/O latency for a datastore exceeds the congestion threshold value, SIOC kicks in and starts controlling device queue depth allocation based on the virtual machine disk share values. Once datastore I/O latency exceeds the congestion threshold, SIOC limits the amount of I/O an ESXi server can issue by throttling the device queue.
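Conceptually, the behavior resembles the control loop below. This is only a hedged sketch of the idea, not VMware's actual algorithm; the step sizes, limits and the entitled_depth parameter are illustrative assumptions:

    # Conceptual SIOC-style control loop (illustrative only, not VMware's
    # implementation): compare observed datastore latency with the
    # congestion threshold and adjust the host's device queue depth.
    def adjust_queue_depth(current_depth, observed_latency_ms,
                           threshold_ms=30,     # manual threshold (default 30 ms)
                           entitled_depth=48,   # share-based entitlement (assumed)
                           max_depth=64, min_depth=4):
        if observed_latency_ms > threshold_ms:
            # Congestion: throttle down toward the share-based entitlement.
            return max(min_depth, min(current_depth - 4, entitled_depth))
        # No congestion: slowly grow the queue depth back.
        return min(max_depth, current_depth + 1)

    # Example: a 45 ms latency spike pushes a host back to its entitlement.
    print(adjust_queue_depth(64, 45))  # -> 48
    print(adjust_queue_depth(48, 12))  # -> 49 (recovering)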

SIOC Recommendations
For critical virtual machines to get more disk resources during I/O congestion, you need to define the disk shares for your virtual machines carefully. This is achieved by configuring high disk share values for critical virtual machines.

Since SIOC only kicks in after the congestion threshold defined for the datastore is reached, it is important to set the correct congestion threshold while enabling the SIOC feature. The problem with setting the Congestion Threshold manually is that if the value is set too low, SIOC may start throttling I/O too early and impact storage performance. At the same time, if you set a very high value for this parameter, your virtual machine storage performance will suffer from latency issues before SIOC ever intervenes.

Conclusion
Without the Storage I/O Control feature enabled for a datastore, disk resources like device queue depth are allocated to all servers equally, without considering the number of virtual machines or the total disk share allocation per server. Disk share values defined without SIOC are only used for calculating the disk share allocation within the local server, not at the datastore level.

SIOC considers the disk share allocation for virtual machines distributed across all servers at the datastore level and provides more granular control for disk share management. SIOC helps to ensure that the critical virtual machines with the highest disk share allocations always get more disk resources.

PS: Initially I wanted to publish this as a whitepaper and not a blog post; hence you will find the template of this post different from my other blog posts. As always, keep sharing your feedback on this topic, and you can write to me at pt.sudhish@gmail.com in case you have any queries or need additional information on this topic.





Adaptive Queuing Vs. SIOC

Both the Adaptive Queuing and SIOC features work by modifying the device queue depth during I/O congestion. When both features use the same control knob to throttle I/O during congestion, why do we need two different features?

From a vSphere administrator's perspective, SIOC has more features and can be enabled easily at the datastore level compared to Adaptive Queuing. SIOC also offers the option of defining the congestion threshold value as a Percentage of peak throughput, which automatically determines the latency and throughput by injecting I/O.

However, SIOC is a vSphere-only feature and will not work effectively in a heterogeneous environment with other workloads. For example, consider a datacenter in which the same storage is shared across vSphere hosts as well as other server platforms. If you enable SIOC for a datastore that is shared with non-vSphere servers, I/O throttling will happen for all vSphere hosts after the congestion threshold is exceeded. Since the non-vSphere hosts do not have any I/O throttling enabled, they can start using the slots freed up by the vSphere hosts, and the vSphere hosts will experience latency issues. This is where the Adaptive Queuing feature helps.

In a heterogeneous environment with vSphere and non-vSphere hosts, you can enable Adaptive Queuing on the vSphere hosts and equivalent queue-throttling behavior on the non-vSphere hosts, so that every host backs off when the array signals congestion.
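Adaptive Queuing reacts to congestion signals from the array itself (SCSI QUEUE FULL / BUSY status) rather than to observed latency, which is why every host sharing the LUN can apply the same back-off independently. Below is a conceptual sketch of that behavior, with illustrative step sizes and limits, not VMware's actual implementation:

    # Conceptual adaptive-queuing sketch (illustrative only): halve the
    # LUN queue depth on a QUEUE FULL / BUSY status from the array, and
    # grow it back one slot at a time while I/Os complete successfully.
    def adaptive_queue_depth(current_depth, got_queue_full,
                             max_depth=64, min_depth=1):
        if got_queue_full:
            # Multiplicative decrease on congestion signalled by the array.
            return max(min_depth, current_depth // 2)
        # Additive increase while the array accepts I/O without complaint.
        return min(max_depth, current_depth + 1)

    # Example: a QUEUE FULL halves the depth; steady success recovers it.
    print(adaptive_queue_depth(64, True))   # -> 32
    print(adaptive_queue_depth(32, False))  # -> 33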